
The magic of JavaScript's apply()

posted: 23 Jul '12 14:10 tags: JavaScript, apply(), context, PHP, Stack-Overflow

I recently answered a Stack Overflow question re: how to call a function in PHP with a dynamic number of arguments. The answer is call_user_func_array().

I mentioned in passing that the JS equivalent was apply(), and that apply() could also fake execution context. I was asked to elaborate, so here I am.

Meet apply()

apply() is the clever gem of intermediate-advanced JavaScript, and is one of my favourite features of the language.

Inherited by every function/method, apply() has two uses, allowing you to:

  • set the this context in which a particular function/method executes
  • pass an array of arguments to a function that ordinarily expects each argument to be passed separately

Both of these are hugely useful and allow you to accomplish things that would otherwise require cumbersome loops or outright code re-writes.

Below I'll look at four varying use-cases for apply().

Use case 1: library APIs

Whilst 'faking' the context of a function may seem somewhat whorish at first, it can be hugely useful where libraries are concerned. Why, we need look no further than our ubiquitous friend jQuery. Consider this:

var long_paras = $('p').filter(function() {
    return $(this).text().length > 5;
});

There, I select all paragraphs with text longer than 5 characters. (The filter() method, if you don't know, takes a callback function and applies it iteratively to each matched element; the element is removed from the stack if the function returns false).

But have you ever wondered why, conveniently, this inside the callback magically points to the element? There's certainly no native reason this should be so. The reason is, internally, jQuery does something like this: (Warning - guesswork ahead, but the point remains).

jQuery.prototype.filter = function(func) {
    //for each element on the stack
    for (var i=0, len=this.elements.length; i<len; i++)
        //apply the callback function in context of element
        if (!func.apply(this.elements[i]))
            //if func returns falsy, remove element from stack
            delete this.elements[i];
};

As you can see, the callback is not merely invoked, it is applied, with each element (in turn) set as the this context. Clever, eh? If this wasn't possible, jQuery would have to pass the element as an argument to your callback instead - decidedly less graceful. The callback is all about the element, so it's sensible this points to it, just as it does natively in, say, event callbacks bound with addEventListener().
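To make that concrete without needing jQuery or a DOM, here's a minimal, runnable sketch of the same idea. TinyCollection is a made-up name for illustration only; it is not how jQuery is actually implemented, just a small stand-in that applies a callback with each item set as this.

```javascript
// Hypothetical mini-collection, for illustration only (not jQuery's real code).
function TinyCollection(items) {
    this.items = items;
}

TinyCollection.prototype.filter = function (func) {
    var kept = [];
    for (var i = 0; i < this.items.length; i++) {
        // Invoke the callback with the current item as `this`;
        // the array handed to apply() becomes the callback's arguments.
        if (func.apply(this.items[i], [i])) {
            kept.push(this.items[i]);
        }
    }
    return new TinyCollection(kept);
};

var words = new TinyCollection([{ text: 'hi' }, { text: 'hello there' }]);
var longWords = words.filter(function (index) {
    return this.text.length > 5; // `this` is the item, thanks to apply()
});
console.log(longWords.items.length); // 1
```

Unlike my guess above, this version builds a new list of kept items rather than deleting from the original, which avoids leaving holes in the array.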

Use case 2: array as arguments

This is easily demonstrated with some built-in methods, for example Math.min(). This method, you won't be staggered to learn, finds the minimum value from within a list of numbers passed as separate arguments.

Math.min(2, 5, 7); //2

That's fine, but the chances of you ever wanting to use this method in the above way are pretty slim. Almost always, you'd have an array of numbers, probably gathered or computed dynamically, and want to find the lowest one from within that.

Other languages allow this directly; PHP's min(), for instance, happily accepts an array. In JS, you'll need apply() magic:

//gather up some random numbers into an array
var random_nums = [], num_nums = 50;
for (var i=0; i<num_nums; i++)
    random_nums.push(Math.floor(Math.random() * 10000));

//find lowest, with some apply() fudgery
Math.min.apply(null, random_nums);

Since our numbers there are generated dynamically, the first approach is not an option. Because apply() takes an array and passes it to a function as though each value of that array was a separate argument being passed, this is our solution.

Note that I pass null as the first argument, since Math.min() is a static method that does not care about context - it merely deals with what it is passed as arguments.
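Here's a small deterministic version you can check by eye. The temperatures array is just sample data I've made up; the last line shows the edge case of an empty array, which behaves the same as calling Math.min() with no arguments at all.

```javascript
var temperatures = [12, 7, 19, 3, 15];

// Each array element is passed as if it were a separate argument.
var lowest = Math.min.apply(null, temperatures);
var highest = Math.max.apply(null, temperatures);

// With an empty array, no arguments are passed at all.
var emptyResult = Math.min.apply(null, []);

console.log(lowest, highest, emptyResult); // 3 19 Infinity
```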

Let's look at an example where context is relevant.

Use case 3: an approach to inheritance

JavaScript lacks a formal notion of classes and, therefore, classical inheritance. What it does have, and what is at the centre of the language, is prototypal inheritance. Using this, there are various approaches to simulating traditional inheritance.

One of them (though not necessarily the best one) utilises apply().

function Parent() {}
Parent.prototype.sayHi = function() { alert('hi'); };
function Child() { Parent.apply(this, arguments); }
Child.prototype = new Parent();

As I say, this is just one approach to inheritance simulation. Like all approaches, it has advantages and disadvantages. It would be off-topic to discuss them here, but there's more info on this here. Ultimately, the approach to inheritance used comes down to taste and requirement a lot of the time.
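The Parent above takes no arguments, so the point of Parent.apply(this, arguments) is easy to miss. Here's a sketch in which the parent constructor does real work; Animal and Puppy are names I've invented for the example, and resetting the constructor property is an optional tidy-up rather than part of the technique.

```javascript
function Animal(name, sound) {
    this.name = name;
    this.sound = sound;
}
Animal.prototype.speak = function () {
    return this.name + ' says ' + this.sound;
};

function Puppy() {
    // Run the parent constructor against the new Puppy instance,
    // forwarding whatever arguments Puppy itself received.
    Animal.apply(this, arguments);
}
Puppy.prototype = new Animal();
Puppy.prototype.constructor = Puppy; // optional tidy-up

var rex = new Puppy('Rex', 'woof');
console.log(rex.speak()); // Rex says woof
```

Because the parent constructor ran with this set to the new instance, name and sound end up as the instance's own properties rather than being shared via the prototype.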

Use case 4: Converting function arguments to array

You might think this sounds like the second use-case I demonstrated, but I'm driving at something rather different here.

Inside any function, a local variable, arguments, is implicitly created. It is an array-like object (but NOT an array) of the arguments passed to that function.

function someFunc() { console.log(arguments); }
someFunc('apple', 'pear');

There, arguments looks and feels like a normal, indexed array - in fact it's an object, with keys named 0 and 1, plus a length property.

A common need is to treat arguments as an array - and, happily, apply() can help us with this.

(Well, it's part of the help. The following technique also owes a lot to the fact that Array's slice method is not particularly fussy about needing a real array - an array-like object will do fine.)

function someFunc() {
    arguments.push('banana'); //ERROR - arguments has no push()
    var args_as_array = [].slice.apply(arguments);
    args_as_array.push('banana'); //OK
}
someFunc('apple', 'pear');

The arguments.push() call fails because push is a method of Array, not Object (and since it throws, you'd need to remove that line for the rest to run). We must first 'coerce' the arguments object to an array, as the [].slice.apply(arguments) line does. Normally, slice() is concerned with the extraction and return of certain elements of an array, but since we don't pass it any arguments denoting this, it simply gives us back a copy of the array in its entirety.
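A typical real-world use is a function with one named parameter followed by any number of extras. In this sketch (joinWith is a hypothetical helper, not a built-in), the second argument to apply() is itself the argument list for slice, so [1] means "slice from index 1 onwards".

```javascript
function joinWith(separator) {
    // Equivalent to arguments.slice(1), if arguments were a real array.
    var rest = Array.prototype.slice.apply(arguments, [1]);
    return rest.join(separator); // join() exists because rest is a true Array
}

var joined = joinWith('-', 'apple', 'pear', 'banana');
console.log(joined); // apple-pear-banana
```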

apply() vs. call() vs. bind()

These three tend to get lumped together because they're all concerned with setting the this context in which the stipulated function runs. Let's quickly look over some of the differences.

call() works just like apply() except arguments must be passed individually, not as an array. So it has only one benefit - context setting. With this in mind, you'll probably find yourself using apply() far more than call().
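Side by side, the difference is purely in how the arguments are supplied. The greet function and person object below are throwaway examples of mine.

```javascript
function greet(greeting, punctuation) {
    return greeting + ', ' + this.name + punctuation;
}

var person = { name: 'Ada' };

// call(): context first, then each argument separately.
var viaCall = greet.call(person, 'Hello', '!');

// apply(): context first, then all arguments in one array.
var viaApply = greet.apply(person, ['Hello', '!']);

console.log(viaCall === viaApply); // true - both give "Hello, Ada!"
```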

bind(), on the other hand, is concerned with the creation of new functions that are permanently bound to a particular context. Contrast this with apply(), which does not create functions - it merely invokes them.

So if you wanted to make a permanent copy of a function or method, but which runs in a different context from its original, use bind().

This is best demonstrated with reference to a common error among early JavaScript developers. Even armed with the knowledge that complex types are copied by reference, not value, they find the following does not behave as they expect:

var obj = {
    property: "hello",
    method: function() { alert(this.property); }
};

//make a shortcut to the object's method
var alias = obj.method;

obj.method(); //alerts hello, as expected
alias(); //alerts undefined - context not preserved

We make a reference to the method - but the context is not stored. Therefore, by the time the alias is invoked, it's running in the context of the global scope, not obj.

To make the alias whilst also preserving context, bind() is needed:

var alias = obj.method.bind(obj);
alias(); //alerts hello - context preserved

(Sidenote: another capability of bind() is that you can specify default values for function arguments, a la partial application, but again that's off-topic for this post. For more on this very useful method, check out the excellent MDN documentation on it.)
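For the curious, here's a very brief taste of that partial application, using two small functions I've made up. Any arguments given to bind() after the context are locked in as the leading arguments of the new function.

```javascript
function multiply(a, b) {
    return a * b;
}

// No context needed, so pass null; 2 is pre-filled as `a`.
var timesTwo = multiply.bind(null, 2);
console.log(timesTwo(21)); // 42

var counter = {
    step: 5,
    add: function (base, times) { return base + this.step * times; }
};

// Context AND first argument fixed in one go.
var addFromTen = counter.add.bind(counter, 10);
console.log(addFromTen(3)); // 25
```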

And finally... apply() with instantiation

One of the minor drawbacks of apply() is that it cannot natively be used for instantiation. You might try something like this:

function Dog(name, colour) {}
var fido = new Dog.apply(null, ['Fido', 'brown']);

What we meant was to run apply() on the instantiation of the class. What actually happens, though, is that JavaScript reads this as new (Dog.apply)(...): it tries to instantiate the apply method itself, which is not a constructor, so a TypeError is thrown. In short, a right mess, and not what we wanted.
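You can see the failure for yourself by wrapping the offending line in a try/catch. Hound is just a stand-in constructor for this check; the exact error message varies by engine, so the sketch only inspects the error type.

```javascript
function Hound(name, colour) {
    this.name = name;
    this.colour = colour;
}

var caught = null;
try {
    // Parsed as: new (Hound.apply)(null, [...])
    var rover = new Hound.apply(null, ['Rover', 'brown']);
} catch (err) {
    caught = err;
}

console.log(caught instanceof TypeError); // true
```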

Fortunately there's a couple of workarounds for this. I favour this one, demonstrated in this Stack Overflow answer:

//set up our class and a method
function Dog(name, colour) {
    this.name = name;
    this.colour = colour;
}
Dog.prototype.sayHi = function() {
    alert(this.name+' says hi!');
};

//make a new version of apply() that supports instantiation
function newApply(classs, args) {
    function F() { return classs.apply(this, args); }
    F.prototype = classs.prototype;
    return new F();
}

//instantiate and test
var fido = newApply(Dog, ['Fido', 'brown']);
fido.sayHi(); //Fido says hi!

A bonus of that snippet is it demonstrates an alternate approach to simulating classical inheritance (I mentioned there were a few ways). This approach is known as the temporary constructor approach, and utilises a blank function as a 'proxy' between the parent and child classes. As I mentioned, I won't go into these here - but there's more info on them in this article.

Conclusion

So there you have it. I told you apply() was clever. A lot of what we take for granted in library APIs depends on its magic. At first the concept can seem quite shocking - faking context? Isn't that bad design? - but as I've hopefully shown, there are good cases for this. And there's more - I showed only a handful.

(Icon attribution: Oxygen Icons)


REGEX round-up: two issues...

posted: 02 Jul '12 19:27 tags: REGEX, Stack-Overflow, parsing, JavaScript, PHP

I've recently become something of an addict to answering questions on Stack Overflow. As something of a gleeful REGEX sadist, this is one of the areas I answer on.

Two questions recently highlighted, once again, some of the limitations of REGEX.

1. Capturing repeating sub-groups

One concerns captured sub-groups. If you're not sure what this means, take a look at this:

"hello, there".match(/hello, (there)/);

The result of that is an array of the following structure:

[
    "hello, there", //the whole match is always the first key
    "there" //...followed by any sub-matches
]

As the comment says, in most REGEX implementations if not all, the first element of the matches array is the entire match. But I also asked for a sub-match, (there), so that gets sent in as the second array element. And so on, for as many sub-matches as I specified.

This is fine, but gets tricky where you want to repeat a sub-match and capture each repetition of it separately. Consider this:

"foo bar bar bar".match(/foo (bar ?)+/);

That pattern asks for "foo" followed by "bar" one or more times (since I use the "once or more" operator, +, after my sub-group). It returns this:

["foo bar bar bar", "bar"]

You can see the instruction to allow repetition of the sub-group has been honoured insofar as the pattern, in its entirety, matches. But you will also notice that it captured only one of the sub-group instances, not three.

If you're finding this a bit of a headache (and you either love REGEX or find it melts your brain), the bottom line is this:

Your matches array will contain only as many results as there are sub-groups explicitly defined in your pattern (plus the entire match as the first element).

At least in JavaScript and PHP, anyway. I have seen fleeting references to functionality in other languages that can capture repeated - i.e. implied instances of - sub-groups, but I don't know for sure.

Partial workaround

As confirmed by the priceless Regular-Expressions.info, the above pattern is a common pitfall, but I include it just to illustrate the point.

A partial workaround is to capture the repeated sub-group in yet another sub-group. So:

"foo bar bar bar".match(/foo ((bar ?)+)/);

That returns this:

["foo bar bar bar", "bar bar bar", "bar"]

...because this time I made a point of capturing not only the sub-group to be repeated, but also the cumulative result OF that repetition.

But the problem remains that we have not captured each instance separately. When I say "problem", it's not an error of REGEX; it wasn't designed to do this. It's just an inability.

It's not a show-stopping problem. If you really want "bar" to be represented three times separately in an array, it doesn't take much to first do the REGEX then split the second array element (in our case) by the space delimiter.
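In JavaScript, that match-then-split step might look like the following sketch. The trim() call is my own precaution against a trailing space in the captured text; the last line shows an alternative that sidesteps sub-groups entirely by using a global match on the repeated part.

```javascript
var result = "foo bar bar bar".match(/foo ((bar ?)+)/);

// result[1] is the cumulative capture, "bar bar bar"; split it on spaces.
var repetitions = result[1].trim().split(' ');
console.log(repetitions); // ["bar", "bar", "bar"]

// Alternative: a global match returns every occurrence, but no sub-groups.
var allBars = "foo bar bar bar".match(/bar/g);
console.log(allBars.length); // 3
```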

2. REGEX is not a parser

Another question that I answered the other day considered the use of REGEX to parse a proprietary tag format.

This should set alarm bells ringing immediately. REGEX is not a parsing tool.

In any case the user wanted to parse the various tags from the following string. The complication is in the fact that tags may contain nested sub-tags.

Hi [{tag:there}], [{tag:how are you?}]. [{I'm good}], thanks.

A similar question came up today, asking the same thing, but with HTML. (Incidentally, REGEX is always a bad choice for parsing XML-based languages, given the DOM route).

The problem with both is this: REGEX cannot hope to reliably know which opening tags correspond to which closing tags.

Take the following HTML (based on the question I mentioned above):

<table>
    <tr>
        <td>Nottingham Forest</td>
    </tr>
    <tr>
        <td>Notts County</td>
    </tr>
    <tr>
        <td>Mansfield Town</td>
    </tr>
</table>

Let's say you want to capture the row that contains the string "County":

/<tr\b[^>]*?>[\s\S]*?County[\s\S]*?<\/tr>/gi

Tested against our HTML, this will be the result:

<tr>
    <td>Nottingham Forest</td>
</tr>
<tr>
    <td>Notts County</td>
</tr>

This is logical when you think how REGEX works. It begins at the start of the string, and is asked to find an opening <tr> tag. It is then told to allow for zero or more characters of any kind ([\s\S]*?) up to and including the string "County".

(Side note: if you're wondering, I use [\s\S] rather than . because the latter, though a wildcard character, does not in fact match line-break characters, which is a big deal here as our string is multi-line. The only way to match truly anything is with the former, which literally says "get me all space and non-space characters", i.e. everything.)

This is precisely the problem. In telling it to allow any characters of any kind before the string "County", we actually cover from the start of the string (i.e. the first row) into the second.

In other words, our match will run from the beginning of the first row to the end of the requested row - never just the requested row.

Ideally we'd stop that first [\s\S] from matching another opening <tr> tag.
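One way to approximate that, for this simple un-nested table only, is a negative lookahead that refuses to step over another opening <tr. This is a sketch of my own rather than the answer given on Stack Overflow, and it doesn't change the verdict that REGEX is the wrong tool for parsing HTML in general.

```javascript
var html = [
    '<table>',
    '    <tr>',
    '        <td>Nottingham Forest</td>',
    '    </tr>',
    '    <tr>',
    '        <td>Notts County</td>',
    '    </tr>',
    '    <tr>',
    '        <td>Mansfield Town</td>',
    '    </tr>',
    '</table>'
].join('\n');

// The original pattern: overshoots, starting from the first row.
var naive = /<tr\b[^>]*?>[\s\S]*?County[\s\S]*?<\/tr>/i;

// Each "any character" step first checks it is not at the start of a new <tr.
var tempered = /<tr\b[^>]*>(?:(?!<tr\b)[\s\S])*?County[\s\S]*?<\/tr>/i;

var naiveMatch = html.match(naive)[0];
var temperedMatch = html.match(tempered)[0];

console.log(naiveMatch.indexOf('Forest') !== -1);    // true - too much captured
console.log(temperedMatch.indexOf('Forest') !== -1); // false - just the County row
```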

A partial workaround

Let's return to the first example, with the proprietary tag format. The aim, of course, is to extract each of the three tags. This question was a PHP one, so I came up with this horror.

$str = "Hi [{tag:there}], [{tag:how are you?}]. [{I'm good}], thanks.";
$matches = array();
function replace_cb($this_match) {
    global $matches;
    $this_match = $this_match[0];
    //restore any placeholders left by earlier (inner) matches
    foreach($matches as $index => $match)
        $this_match = str_replace('**'.($index + 1).'**', $match, $this_match);
    array_push($matches, $this_match);
    //swap this tag for a numbered placeholder
    return '**'.count($matches).'**';
}
while(preg_match('/\[\{[^\[]*?\}\]/', $str))
    $str = preg_replace_callback('/\[\{[^\[]*?\}\]/', 'replace_cb', $str);

That actually works. The concept is to first locate and extract all tags that do not contain nested sub-tags, and then work outwards. I'll save you the laborious, line-by-line explanation, but run against an input that includes a nested tag, print_r($matches) gives something like this:

Array
(
    [0] => [{tagname:content}]
    [1] => [{tag2: more data here}]
    [2] => [{tag1:xnkudfdhkfujhkdjki diidfo now nested tag [{tag2: more data here}] kj udf}]
)
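Since most of this blog is about JavaScript, here's the same inside-out idea sketched in JS. extractTags is a hypothetical name and the test string is one I've invented; the pattern is the one from the PHP above. Innermost tags are swapped for numbered placeholders on each pass, and placeholders are expanded again when an enclosing tag is captured.

```javascript
function extractTags(str) {
    var matches = [];
    var pattern = /\[\{[^\[]*?\}\]/g; // a tag containing no further "["
    var previous;
    do {
        previous = str;
        str = str.replace(pattern, function (tag) {
            // Put back any earlier (inner) tags hiding behind placeholders.
            matches.forEach(function (found, index) {
                tag = tag.split('**' + (index + 1) + '**').join(found);
            });
            matches.push(tag);
            return '**' + matches.length + '**';
        });
    } while (str !== previous); // stop once a pass changes nothing
    return matches;
}

var tags = extractTags('a [{x:1}] b [{outer: pre [{inner}] post}]');
console.log(tags.length); // 3
console.log(tags[2]);     // [{outer: pre [{inner}] post}]
```

Like the PHP original, this assumes the text never legitimately contains the **n** placeholder syntax.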


JavaScript getters and setters: varying approaches

posted: 13 Mar '12 17:50 tags: JavaScript, ECMA5, object, responsive UI

Last week I posted an introductory article on ECMAScript 5 object properties, and the mini-revolution that I think they constitute. (The post made the coveted JavaScript Weekly - thanks, guys.)

One of the key features of them is the ability to define getter/setter callbacks on them.

Getters and setters are a means of providing an arm's-length way of getting or setting certain data, whilst keeping other data private, and are common to most languages. In JavaScript, setters are also a good way of ensuring your UI stays up to date as your data changes; I'll show an example implementation of that further down.

A new approach to getters and setters

The new approach looks like this, and can be used only on properties created via the new Object.create() and Object.definePropert[y/ies]() methods.

var dog = {}, name;
Object.defineProperty(dog, 'name', {
    get: function() { return name; },
    set: function(newName) { name = newName; }
});
dog.name = 'Fido';
alert(dog.name); //Fido

You'll note that this approach requires the help of a 'tracker variable' (in our case name) via which the getter/setter reference the property's value. This is to avoid maximum recursion errors that the following would cause:

...
    get: function() { return this.name; }, //MR error
...

That happens because we set a getter, via which any attempt to read the property is routed. Therefore, having the getter reference this.name is effectively asking the getter to call itself - endlessly. Likewise for a setter, if it tried to assign to this.name.
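If you'd like to watch it blow up safely, here's a small sketch (the cat object is just an example of mine). Which error type you get depends on the engine - V8 throws a RangeError, for instance - so the code only checks that something was thrown.

```javascript
var cat = {};
Object.defineProperty(cat, 'name', {
    // Reading this.name re-enters this very getter, again and again.
    get: function () { return this.name; }
});

var recursionError = null;
try {
    cat.name; // triggers the getter
} catch (err) {
    recursionError = err;
}

console.log(recursionError !== null); // true - the engine gave up recursing
```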

Since each property needs its own tracker, and you don't want lots of variables flying around, it's a good idea to use a closure when declaring several properties.

var dog = {}, props = {name: 'Fido', type: 'spaniel', age: 4};
for (var prop in props)
    (function() {
        var propVal = props[prop];
        Object.defineProperty(dog, prop, {
            get: function() { return propVal; },
            set: function(newVal) { propVal = newVal; }
        });
    })();
alert(dog.name+' is a '+dog.type); //Fido is a spaniel
dog.name = 'Rex';
alert(dog.name+' is age '+dog.age); //Rex is age 4

There, we declare what properties we want on our object, and some start values. The loop sets each property, and tracks its value via a private propVal variable in its closure.

One of the things I like about this new approach is you no longer have to call the getters/setters explicitly (as you did with previous implementations - see below) - they fire simply by talking to the property.

Admittedly this has its proponents and its opponents; those in favour say getters/setters should fire simply by calling/assigning to the property - not calling some special methods to do that. Those against normally point out that someone new to the code might be surprised to find that talking to a property in fact fires a function.

My take is that, as long as this is part of the spec, and your code is well documented, there can be few complaints with using the new implementation.

Other ways of doing getters/setters

In any case, I much prefer them to the implementation we got in JavaScript 1.5.

var dog = {
    type: 'Labrador',
    get foo() { return this.type; },
    set foo(newType) { this.type = newType; }
};
alert(dog.type); //Labrador
dog.foo = 'Rottweiler';
alert(dog.foo); //Rottweiler

I've never been in love with this approach, chiefly because you don't deal directly with the property but with a proxy that represents its getter/setter callbacks - in the above example foo. The new approach does away with this; you call/assign to the property just as you would if there were no getters/setters in play, and the getter/setter callbacks kick in automatically - they are not referenced explicitly.

That said, one good point about this separation of property value from getter/setter is that the getter/setter can safely reference the property via this without the risk of recursion error, as befalls the new approach.

The older way

There's also the deprecated __defineGetter__() and __defineSetter__() technique.

var dog = {
    type: 'Labrador'
};
dog.__defineGetter__('get', function() { return this.type; });
alert(dog.get); //Labrador

Once again you have to name your setters/getters. By far the most notable point about this approach, though, is you can assign getters/setters after assigning the property - not a super common desire, but useful any time you don't want to or can't alter the prototype. The other two implementations don't allow you to do this, at least without a lot of reworking.

A final point about these latter implementations is that they don't hijack control of your property like the new implementation does. That is, if a developer ignores them and manipulates the property directly, they can. This is not good news; if you defined getters/setters, you probably want them to run, not be bypassed.

var dog = {
    name: 'Henry',
    set foo(newName) { alert('Hi from the setter!'); this.name = newName; }
};
dog.name = 'Rex'; //setter bypassed; its alert doesn't fire

Setters and a responsive UI

As I mentioned in the intro, another role of setters in JavaScript can be to keep your UI up to date as your data changes. Frameworks like Backbone JS sell themselves heavily on this concept.

As the intro to the Backbone documentation points out, medium-large JavaScript applications can easily get bogged down with jQuery selectors and other means trying to keep your views in-sync with your data.

A setter can help here. Here's something I cooked up:

Object.UIify = function(obj) {
    for(var property in obj) {
        var orig = obj[property];
        (function() {
            var propVal;
            Object.defineProperty(obj, property, {
                get: function() { return propVal; },
                set: (function(target) {
                    return function(newVal) {
                        propVal = newVal;
                        $(target).text(propVal);
                    };
                })(orig.target)
            });
        })();
        obj[property] = orig.val;
    }
    return obj;
};

$(function() {
    var dog = Object.UIify({name: {val: 'Fido', target: '#name'}, type: {val: 'Labrador', target: '#type'}});
    dog.name = 'Bert';
    dog.type = 'Rottweiler';
});

And here's some example HTML:

<p id='dog'>Hi - my name's <span id='name'></span> and I'm a <span id='type'></span>!</p>

I'll go into the details of what my method does in a further post. Essentially, though, what's happening is we pass an object to the UIify() method where each property is a sub-object containing its starting value (val) and a CSS/jQuery selector pointing to the UI element that should be updated as and when the value changes (target).

UIify() then returns an object using the new ECMA5 getters/setters. Whenever a property of the object is overwritten, the corresponding UI element denoted by the target we specified is updated. In my case, the targets were simply elements with IDs, but it could of course be more complex targets - it's just CSS/jQuery selector syntax.
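If you want to play with the idea without jQuery or a page to update, here's a stripped-down sketch of the same pattern. The observe function and its onChange callback are names I've made up for this example; instead of writing to the DOM, the setter simply reports each change to the callback.

```javascript
function observe(source, onChange) {
    var observed = {};
    Object.keys(source).forEach(function (key) {
        var value = source[key]; // per-property tracker, private to this closure
        Object.defineProperty(observed, key, {
            enumerable: true,
            get: function () { return value; },
            set: function (newValue) {
                value = newValue;
                onChange(key, newValue); // the "update the UI" hook
            }
        });
    });
    return observed;
}

var log = [];
var pet = observe({ name: 'Fido', type: 'Labrador' }, function (key, value) {
    log.push(key + '=' + value);
});

pet.name = 'Bert';
pet.type = 'Poodle';
console.log(log); // ["name=Bert", "type=Poodle"]
```

Unlike UIify(), this version doesn't fire the callback for the starting values; whether it should is a design choice.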

---------

So there you have it, three approaches through the ages. Next time up I'll be looking more at the new Object functionality in ECMA5.

(p.s. for further reading, be sure to check out the extensive MDN article on working with objects, which talks a lot about getter/setter techniques.)


ECMAScript 5: a revolution in object properties

posted: 29 Feb '12 20:03 tags: JavaScript, ECMA5, object

Over the coming weeks I'm going to focus on discussing the mini revolution that ECMAScript 5 brought, and the implications in particular for objects and their properties.

ECMA5's final draft was published at the end of 2009, but it was only really when IE9 launched in early 2011 - and, with it, impressive compatibility for ECMA5 - that it became a genuinely usable prospect. Now in 2012, it is being used more and more as browser vendors support it and its power becomes apparent. (Full ECMA5 compatibility table).

JavaScript has always been a bit of an untyped, unruly free-for-all. ECMAScript 5 remedies that somewhat by giving you much greater control over what, if anything, can happen to object properties once created - and it's this I'll be looking at in this first post.

A new approach to object properties

In fact the whole idea of an object property has changed; it's no longer a case of it simply being a name-value pairing - you can now dictate all sorts of configuration and behaviour for the property. The new configurations available to each property are:

  • value - the property's value (obviously)
  • writable - whether the property can be overwritten
  • configurable - whether the property can be deleted or have its configuration properties changed
  • enumerable - whether it will show up in a for-in loop
  • get - a function to fire when the property's value is read
  • set - a function to fire when the property's value is set

Collectively, these new configuration properties are called a property's descriptor. What's vital to understand, though, is that some are incompatible with others.

Two flavours of objects

The extensive MDN article on ECMAScript 5 properties suggests thinking of object properties in two flavours:

  • data descriptors - a property that has a value. In its descriptor you can set value and writable but NOT get or set
  • accessor descriptors - a property described not by a value but by a pair of getter-setter functions. In its descriptor you can set get and set but NOT value or writable.

Note that enumerable and configurable are usable on both types of property. I'm struggling to understand why someone thought the ability to set a value and a setter function, for example, were incompatible desires. If I find out, I'll let you know.
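The incompatibility is enforced, not merely advised. As a quick sketch, trying to mix the two flavours in a single descriptor throws straight away:

```javascript
var mixError = null;
try {
    Object.defineProperty({}, 'prop', {
        value: 'some val',                        // data descriptor field...
        get: function () { return 'other val'; }  // ...mixed with an accessor field
    });
} catch (err) {
    mixError = err;
}

console.log(mixError instanceof TypeError); // true
```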

New methods

To harness this new power, you need to define properties in one of three ways - all stored as new static methods of the base Object object:

  • defineProperty()
  • defineProperties()
  • create()

The first two work identically except the latter allows you to set multiple properties in one go. As for Object.create(), I'll be covering that separately in a forthcoming post.
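As a brief sketch of the multiple-property form (the book object and its values are made up for the example), Object.defineProperties() takes the target object plus a map of property names to descriptors:

```javascript
var book = {};
Object.defineProperties(book, {
    title: { value: 'Dune', writable: false, enumerable: true },
    pages: { value: 412, writable: true, enumerable: true }
});

book.pages = 500; // allowed: pages is writable

console.log(book.title, book.pages); // Dune 500
console.log(Object.keys(book));      // ["title", "pages"]
```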

Object.defineProperty() is arguably the most important part of this new ECMAScript spec; as John Resig points out in his post on the new features, practically every other new feature relies on this method.

Object.defineProperty() accepts three arguments:

  • the object you wish to add a property to
  • the name of the property you wish to add
  • a descriptor object to configure the property (see descriptor properties above)

Let's see it in action.

var obj1 = {};
Object.defineProperty(obj1, 'newProp', {value: 'new value', writable: false});
obj1.newProp = 'changed value';
console.log(obj1.newProp); //new value - not overwritten

See how the overwrite failed? No error or warning is thrown - it simply fails silently. In ECMA5's new 'strict mode', though, it does throw an exception. (Thanks to Michiel van Eerd for pointing this out.)
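Here's a sketch of the strict-mode behaviour. The 'use strict' directive is scoped to one function (tryOverwrite is a name of my own choosing) so it doesn't affect any other code on the page.

```javascript
function tryOverwrite() {
    'use strict';
    var locked = {};
    Object.defineProperty(locked, 'newProp', { value: 'new value', writable: false });
    try {
        locked.newProp = 'changed value'; // throws in strict mode
        return 'no error';
    } catch (err) {
        return err instanceof TypeError ? 'TypeError' : 'other error';
    }
}

var outcome = tryOverwrite();
console.log(outcome); // TypeError
```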

There we set a data descriptor. Let's set an accessor descriptor instead.

var obj = {}, newPropVal;
Object.defineProperty(obj, 'newProp', {
    get: function() { return newPropVal; },
    set: function(newVal) { newPropVal = newVal; }
});
obj.newProp = 'new val';
console.log(obj.newProp); //new val

You might be wondering what on earth is going on with that newPropVal variable. I'll come to that in my next post which will look at getters and setters in detail. Note also how, with our setter, the new value is forwarded to it as its only argument, as you'd expect.

The fact that these properties can be set only via these methods means you cannot create them by hand or in JSON files. So you can't do:

var obj = {prop: {value: 'some val', writable: false}}; //etc
obj.prop = 'overwritten'; //works; it's not write-protected

ECMA 5 properties don't replace old-style ones

An important thing to understand early on is that this new form of 'uber' property is not the default. If you define properties in the old way, they will behave like before.

var obj = {prop: 'val'};
obj.prop = 'new val'; //overwritten - no problemo

Reporting back

Note that these new configuration properties are, once set, not available as properties of the value itself; rather, they are remembered in the ECMAScript engine. So you can't do this (using the obj1 example above):

console.log(obj1.newProp.writable); //undefined; newProp is just a string

Instead, you'll be needing Object.getOwnPropertyDescriptor. This takes two arguments - the object in question and the property you want to know about. It returns a descriptor object reflecting what you set above, with anything you left out of a defineProperty() call defaulting to false, so something like:

{value: 'new value', writable: false, enumerable: false, configurable: false}
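To see the contrast between the two kinds of property, here's a sketch that inspects one property made with defineProperty() and one made the old-fashioned way (sample, locked and plain are just names for this example). Note how the defaults differ: anything left out of a defineProperty() descriptor is false, whereas a plainly assigned property gets true across the board.

```javascript
var sample = {};
Object.defineProperty(sample, 'locked', { value: 'new value', writable: false });
sample.plain = 'val';

var lockedDesc = Object.getOwnPropertyDescriptor(sample, 'locked');
var plainDesc = Object.getOwnPropertyDescriptor(sample, 'plain');

console.log(lockedDesc);
// {value: 'new value', writable: false, enumerable: false, configurable: false}
console.log(plainDesc);
// {value: 'val', writable: true, enumerable: true, configurable: true}
```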

More to come..

So there you go - a very exciting mini revolution, as I said. This new breed of intelligent object property really is at the heart of arguably the most major shake-up to the language for a long time. Next week I'll continue this theme - stay posted!
