Writing Efficient JavaScript: Chapter 7 - Even Faster Websites

by Nicholas C. Zakas

Today’s web applications are powered by a large amount of JavaScript code. Whereas early web sites used JavaScript to perform simple tasks, the language is now used to run the entire user interface in many places. The result can be thousands of lines of JavaScript code to execute every time a user interaction takes place. Performance, therefore, is not just about how long it takes for the page to load, but also about how it responds as it’s being used. The best way to ensure a fast, enjoyable user interface is to write JavaScript as efficiently as possible for all browsers.[22]

This excerpt is from Even Faster Web Sites.

Souders and eight expert contributors provide best practices and pragmatic advice for improving your site's performance in three critical categories: JavaScript, in the network, and in the browser.

This chapter covers some of the hidden performance issues in JavaScript and how to address them. Some changes concern small code structure issues while others may require revisiting your algorithm. The important thing to remember is that there is no silver bullet when trying to improve performance; no one thing will work in 100% of the cases. Only when various techniques are combined can you realize the largest performance improvement.

Managing Scope

When JavaScript code is being executed, an execution context is created. The execution context (also sometimes called the scope) defines the environment in which code is to be executed. A global execution context is created upon page load, and additional execution contexts are created as functions are executed, ultimately creating an execution context stack where the topmost context is the active one.

Each execution context has a scope chain associated with it, which is used for identifier resolution. The scope chain contains one or more variable objects that define in-scope identifiers for the execution context. The global execution context has only one variable object in its scope chain, and this object defines all of the global variables and functions available in JavaScript. When a function is created (but not executed), its internal [[Scope]] property is assigned to contain the scope chain of the execution context in which it was created (internal properties cannot be accessed through JavaScript, so you cannot access this property directly). Later, when execution flows into a function, an activation object is created and initialized with values for this, arguments, named arguments, and any variables local to the function. The activation object appears first in the execution context’s scope chain and is followed by the objects contained in the function’s [[Scope]] property.

During code execution, identifiers such as variable and function names are resolved by searching the scope chain of the execution context. Identifier resolution begins at the front of the scope chain and proceeds toward the back. Consider the following code:

function add(num1, num2){
    return num1 + num2;
}

var result = add(5, 10);

When this code is executed, the add function has a [[Scope]] property that contains only the global variable object. As execution flows into the add function, a new execution context is created, and an activation object containing this, arguments, num1, and num2 is placed into the scope chain. Figure 7.1, “Relationship of execution context and scope chain” illustrates the behind-the-scenes object relationships that occur while the add function is being executed.

Figure 7.1. Relationship of execution context and scope chain

Inside the add function, the identifiers num1 and num2 need to be resolved when the function is executing. This resolution is performed by inspecting each object in the scope chain until the specific identifier is found. The search begins at the first object in the scope chain, which is the activation object containing the local variables for the function. If the identifier isn’t found there, the next object in the scope chain is inspected for the identifier. When the identifier is found, the search stops. In the case of this example, the identifiers num1 and num2 exist in the local activation object and so the search never goes on to the global object.

Understanding scopes and scope chain management in JavaScript is important because identifier resolution performance is directly related to the number of objects to search in the scope chain. The farther up the scope chain an identifier exists, the longer the search goes on and the longer it takes to access that variable; if scopes aren’t managed properly, they can negatively affect the execution time of your script.
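
As a rough sketch of identifier depth, the hypothetical functions below (outer, inner, and the variable names are illustrative and not part of the original text) touch identifiers at three different positions in the scope chain. It assumes the code runs as an ordinary browser script, where top-level var declarations end up on the global variable object:

```javascript
var globalOffset = 100;                 // back of the scope chain (global variable object)

function outer() {
    var outerValue = 10;                // second object in inner's scope chain

    function inner() {
        var localValue = 1;             // first object: inner's own activation object
        // localValue resolves after inspecting one object, outerValue after two,
        // and globalOffset only after the whole chain has been searched.
        return localValue + outerValue + globalOffset;
    }

    return inner();
}

var depthResult = outer();
```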

Use Local Variables

Local variables are, by far, the fastest identifiers both to read from and write to in JavaScript. Because they exist in the activation object of the executing function, identifier resolution involves inspecting a single object in the scope chain. The amount of time necessary to read the value of a variable increases with each step along the scope chain, so the greater the identifier depth, the slower the access is going to be. This effect can be seen in every browser except Google Chrome using the V8 JavaScript engine and Safari 4+ using the Nitro JavaScript engine, both of which are so fast that the identifier depth has little effect on access speed.

To determine the exact performance impact of identifier depth, I ran an experiment involving 200,000 variable operations. I alternated between reads and writes, accessing the variables from different identifier depths. The page I used for this experiment is located at http://www.nczonline.net/experiments/javascript/performance/identifier-depth/.

Figure 7.2, “Variable read time compared to identifier depth” illustrates the amount of time it takes to read from a variable based on its scope chain depth, and Figure 7.3, “Variable write time compared to identifier depth” illustrates the amount of time it takes to write to a variable based on its scope chain depth (a depth of 1 signifies a local identifier).

Figure 7.2. Variable read time compared to identifier depth

Figure 7.3. Variable write time compared to identifier depth

As these figures clearly indicate, identifiers are accessed significantly faster when they are closer to the front of the scope chain. You can take advantage of this knowledge by using local variables whenever possible. A good rule of thumb is to store any out-of-scope variable in a local variable whenever it’s used more than once within the function. For example:

function createChildFor(elementId){
    var element = document.getElementById(elementId),
        newElement = document.createElement("div");

    element.appendChild(newElement);
}

This function has two references to the global variable document. Since document is being used more than once, it should be stored in a local variable for faster reference, such as here:

function createChildFor(elementId){
    var doc = document,  //store in a local variable
        element = doc.getElementById(elementId),
        newElement = doc.createElement("div");

    element.appendChild(newElement);
}

The rewritten version of the function stores document in a local variable called doc. Since doc exists in the first part of the scope chain, it can be resolved faster than document. Keep in mind that the global variable object is always the last object in the scope chain, and so global identifier resolution is always the most expensive.

Note

A very common mistake that leads to performance issues is to omit the var keyword when assigning a variable’s value for the first time. Assignment to an undeclared variable automatically results in a global variable being created.

Scope Chain Augmentation

The scope chain for a given execution context typically remains unchanged during code execution. There are, however, two statements that temporarily augment the scope chain of an execution context. The first is the with statement, which is designed to allow easy access to object properties by making them appear as local variables. For example:

var person = {
    name: "Nicholas",
    age: 30
};

function displayInfo(){
    var count = 5;
    with(person){
        alert(name + " is " + age);
        alert("Count is " + count);
    }
}

displayInfo();

In this code, the person object is passed into a with block. This allows you to access the name and age properties as though they were locally defined. What actually happens, though, is that a new variable object is pushed to the front of the execution context’s scope chain. This variable object contains all of the properties of the specified object (in this case, person) so that they can be accessed without using dot notation. Figure 7.4, “Scope chain augmentation using the with statement” shows how the scope chain for displayInfo is augmented while the with statement is being executed.

Figure 7.4. Scope chain augmentation using the with statement

Though it seems very convenient when an object’s properties are being used repeatedly, this extra object in the scope chain hurts local identifier resolution. While code within a with statement is being executed, the local function variables now exist in the second object in the scope chain instead of the first, automatically slowing down identifier access. In the previous example, the count variable now takes longer to access because it’s not in the first object of the scope chain. Once the with statement finishes executing, the scope chain is restored to its previous state. Due to this major downside, it’s recommended to avoid using the with statement.
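
One way to keep most of the convenience without augmenting the scope chain is to hold the object in a short local variable and use dot notation. A minimal sketch follows; displayInfoWithoutWith is a hypothetical name, and the function returns strings instead of calling alert so that it can run outside a browser:

```javascript
var person = {
    name: "Nicholas",
    age: 30
};

function displayInfoWithoutWith() {
    var count = 5,
        p = person;                     // one global lookup, then local access

    // count stays in the first object of the scope chain because
    // no with block has pushed another object in front of it
    return [
        p.name + " is " + p.age,
        "Count is " + count
    ];
}

var infoLines = displayInfoWithoutWith();
```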

The second statement that augments the scope chain is the catch clause of a try-catch block. The catch clause behaves in a manner similar to the with statement where it adds a variable object to the front of the scope chain while it executes the code in the block. That variable object contains an entry for the named exception object specified by catch. However, the catch clause is executed only when an error occurs during execution of the try clause, making it somewhat less problematic than the with statement, though you should take care not to execute too much code within the catch clause to minimize the performance impact.
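
One way to keep the catch clause small is to hand the error straight to a function, so that the real work runs in that function’s own execution context rather than under the augmented scope chain. A sketch, in which methodThatMightFail, handleError, and errorLog are hypothetical placeholders:

```javascript
var errorLog = [];

function methodThatMightFail() {
    throw new Error("something went wrong");
}

function handleError(ex) {
    // runs with a normal scope chain: its activation object comes first
    errorLog.push(ex.message);
}

try {
    methodThatMightFail();
} catch (ex) {
    handleError(ex);                    // single statement inside catch
}
```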

Minding scope chain depth is an easy way to get performance improvements with a small amount of work. Avoid unnecessarily augmenting the scope chain and inadvertently slowing down execution.

Efficient Data Access

Where data is stored in a script contributes directly to the amount of time it takes to execute. In general, there are four places from which data can be accessed in a script:

  • Literal value

  • Variable

  • Array item

  • Object property

Reading data always incurs a performance cost, and that cost depends on which of these four locations the data is stored in.

In most browsers, the cost of reading a value from a literal versus a local variable is so small as to be negligible; you should feel free to mix and match literals and local variables without worrying about a performance penalty. The real difference comes when you move to reading data from an array or object. Accessing values from one of these data structures requires a lookup of the location in which the data is stored, either by index (for array) or by property name (for objects).

To test the data access times based on data location, I created an experiment that reads values from each of these locations 200,000 times. You can find the experiment online at http://www.nczonline.net/experiments/javascript/performance/data-access/. The result of running this experiment on multiple browsers is that there is almost an even split across browsers as to which is faster: Internet Explorer, Opera, and Firefox 3 all access array items faster than object properties; Chrome, Safari, Firefox 2, and Firefox 3.1+ access object properties faster than array items (see Figure 7.5, “Data access time across browsers”).

Figure 7.5. Data access time across browsers

The important lesson to take from this information is to always store frequently accessed values in a local variable. Consider the following code:

function process(data){
    if (data.count > 0){
        for (var i=0; i < data.count; i++){
            processData(data.item[i]);
        }
    }
}

This snippet accesses the value of data.count multiple times. At first glance, it looks like this value is used twice: once in the if statement and once in the for loop. In reality, though, whenever data.count is greater than 0 it is accessed data.count plus 2 times in this function: once in the if statement and data.count plus 1 times in the loop, since the control statement (i < data.count) is evaluated before each iteration and once more when the loop ends. The function will run faster if this value is stored in a local variable and then accessed from there:

function process(data){
    var count = data.count;
    if (count > 0){
        for (var i=0; i < count; i++){
            processData(data.item[i]);
        }
    }
}

The rewritten version of this function accesses data.count only once, at the beginning in order to store it in a local variable. The local variable count is used in its place elsewhere in the function, limiting the number of times an object property must be accessed to retrieve this value. This function will run faster than the previous function because the number of object property lookups has been reduced.

The effect of data access is exaggerated as the value’s data structure depth increases. For example, data.count is faster to access than data.item.count, which is faster to access than data.item.subitem.count. When dealing with properties, the number of times a dot is used (for property lookup) directly relates to the amount of time it takes to access that value. Figure 7.6, “Access times for object properties by depth” shows the relative data access times by property depth across browsers. The tests for this research are part of the data access experiment located at http://www.nczonline.net/experiments/javascript/performance/data-access/.

Figure 7.6. Access times for object properties by depth

A good approach to take when dealing with data access is to store in a local variable any object property or array item that is used more than once in a function.
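
For deeper structures, the same idea applies to the intermediate object as well. The sketch below assumes a hypothetical data shape (data.item.subitem with count and values properties); none of these names come from the experiment above:

```javascript
function sumSubitemValues(data) {
    var subitem = data.item.subitem,    // two property lookups, done once
        count = subitem.count,
        values = subitem.values,
        total = 0;

    for (var i = 0; i < count; i++) {
        total += values[i];             // only local variables and an array index
    }
    return total;
}

var sampleData = {
    item: { subitem: { count: 3, values: [1, 2, 3] } }
};
var subitemTotal = sumSubitemValues(sampleData);
```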

Note

For most browsers, there is virtually no difference between using dot notation for object property access (data.count) and bracket notation (data["count"]). The one exception is Safari, where bracket notation is significantly slower than dot notation. This holds true even for Safari 4 and later using the Nitro JavaScript engine.

Using local variables is especially important when dealing with HTMLCollection objects (those returned from DOM methods such as getElementsByTagName and properties such as element.children; element.childNodes returns a NodeList that is live in the same way). Each HTMLCollection object is actually a live query being run against the DOM document every time a property is accessed. For example:

var divs = document.getElementsByTagName("div");
for (var i=0; i < divs.length; i++){  //Avoid!
    var div = divs[i];
    process(div);
}

The first line of this code creates a query that returns every <div> element on the page and stores that query in divs. Each time divs has a property accessed either by name or by index, the DOM actually reexecutes that query against the entire page; in this code, it occurs each time divs.length or divs[i] is accessed. These property lookups take longer than the average non-DOM object property or array item lookup. It’s therefore important to store such values in local variables whenever possible to avoid the requerying penalty associated with HTMLCollection objects. For example:

var divs = document.getElementsByTagName("div");
for (var i=0, len=divs.length; i < len; i++){  //Better
    var div = divs[i];
    process(div);
}

This example stores the length of the divs HTMLCollection in a local variable, limiting the number of times the object is accessed directly. In the previous version of this code, divs was accessed twice per iteration: once to retrieve the object in the given position, and once to check the length. This new version eliminates direct length-checking with each iteration.

Note

Generally speaking, interacting with DOM objects is always more expensive than interacting with non-DOM objects. Due to DOM behavior, property lookups typically take longer than non-DOM property lookups. The HTMLCollection object is the worst-performing object in the DOM. If you need to repeatedly access members of an HTMLCollection, it is more efficient to copy them into an array first.
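
Copying a collection into an array can be sketched as follows. The helper name toArray is hypothetical, and the array-like object at the end stands in for a real HTMLCollection so that the example runs without a DOM:

```javascript
function toArray(collection) {
    var len = collection.length,        // read length once
        result = new Array(len);

    for (var i = 0; i < len; i++) {
        result[i] = collection[i];
    }
    return result;                      // a static snapshot, no longer live
}

// stand-in for something like document.getElementsByTagName("div")
var fakeCollection = { length: 2, 0: "first", 1: "second" };
var items = toArray(fakeCollection);
```

The resulting array does not update when the document changes, so this approach suits cases where a snapshot is acceptable.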

Flow Control

Next to data access, flow control is perhaps the most important aspect of JavaScript relating to performance. JavaScript, as with most programming languages, has a number of flow control statements that determine which part of the code should be executed next. There’s a series of conditional and loop statements that enable developers to precisely control how execution flows from one part of the code to another. Choosing the right option at each point can dramatically affect how fast your script runs.

Fast Conditionals

The classic question of whether to use a switch statement or a series of if and else statements is not unique to JavaScript and has spurred discussions in nearly every programming language that has these constructs. The real issue is not between individual statements, of course, but rather relates to the speed with which each is able to handle a range of conditional statements. The details of this section are based on tests that you can run at http://www.nczonline.net/experiments/javascript/performance/conditional-branching/.

The if statement

Discussions usually begin surrounding complex if statements such as this:

if (value == 0){
    return result0;
} else if (value == 1){
    return result1;
} else if (value == 2){
    return result2;
} else if (value == 3){
    return result3;
} else if (value == 4){
    return result4;
} else if (value == 5){
    return result5;
} else if (value == 6){
    return result6;
} else if (value == 7){
    return result7;
} else if (value == 8){
    return result8;
} else if (value == 9){
    return result9;
} else {
    return result10;
}

Typically, this type of construct is frowned upon. The major problem is that the deeper into the statement the execution flows, the more conditions have to be evaluated. It will take longer to complete the execution when value is 9 than if value is 0 because every other condition must be evaluated beforehand. As the overall number of conditions increases, so does the performance hit for going deep into the conditions. While having a large number of if conditions isn’t advisable, there are steps you can take to increase the overall performance.

The first step is to arrange the conditions in decreasing order of frequency. Since exiting after the first condition is the fastest operation, you want to make sure that happens as often as possible. Suppose the most common case in the previous example is that value will equal 5 and the second most common is that value will equal 9. In that case, you know five conditions will be evaluated before getting to the most common case and nine before getting to the second most common case; this is incredibly inefficient. Even though the increasing numeric order of the conditions makes it easier to read, it should actually be rewritten as follows:

if (value == 5){
    return result5;
} else if (value == 9){
    return result9;
} else if (value == 0){
    return result0;
} else if (value == 1){
    return result1;
} else if (value == 2){
    return result2;
} else if (value == 3){
    return result3;
} else if (value == 4){
    return result4;
} else if (value == 6){
    return result6;
} else if (value == 7){
    return result7;
} else if (value == 8){
    return result8;
} else {
    return result10;
}

Now the two most common conditions appear at the top of the if statement, ensuring optimal performance for these cases.

Another way to optimize if statements is to organize the conditions into a series of branches, following a binary search algorithm to find the valid condition. This is advisable in the case where a large number of conditions are possible and no one or two will occur with a high enough rate to simply order according to frequency. The goal is to minimize the number of conditions to be evaluated for as many of the conditions as possible. If all of the conditions for value in the example will occur with the same relative frequency, the if statements can be rewritten as follows:

if (value < 6){
    if (value < 3){
        if (value == 0){
            return result0;
        } else if (value == 1){
            return result1;
        } else {
            return result2;
        }
    } else {
        if (value == 3){
            return result3;
        } else if (value == 4){
            return result4;
        } else {
            return result5;
        }
    }
} else {
    if (value < 8){
        if (value == 6){
            return result6;
        } else {
            return result7;
        }
    } else {
        if (value == 8){
            return result8;
        } else if (value == 9){
            return result9;
        } else {
            return result10;
        }
    }
}

This code ensures that there will never be any more than four conditions evaluated, assuming value is always an integer from 0 through 10 (unlike the flat chain, a negative or fractional value would land in one of the inner else branches rather than returning result10). Instead of evaluating each condition to find the right value, the conditions are separated first into a series of ranges before identifying the actual value. The overall performance of this example is improved because the cases where eight and nine conditions need to be evaluated have been removed. The maximum number of condition evaluations is now four, creating an average savings of about 30% off the execution time of the previous version. Keep in mind, also, that an else statement has no condition to evaluate. However, the problem remains that each additional condition ends up taking more time to execute, affecting not only the performance but also the maintainability of this code. This is where the switch statement comes in.

The switch statement

The switch statement simplifies both the appearance and the performance of multiple conditions. You can rewrite the previous example using a switch statement as follows:

switch(value){
    case 0:
        return result0;
    case 1:
        return result1;
    case 2:
        return result2;
    case 3:
        return result3;
    case 4:
        return result4;
    case 5:
        return result5;
    case 6:
        return result6;
    case 7:
        return result7;
    case 8:
        return result8;
    case 9:
        return result9;
    default:
        return result10;
}

This code clearly indicates the conditions as well as the return values in an arguably more readable form. The switch statement has the added benefit of allowing fall-through conditions, which allow you to specify the same result for a number of different values without creating complex nested conditions. The switch statement is often cited in other programming languages as the hands-down better option for evaluating multiple conditions. This isn’t because of the nature of the switch statement, but rather because of how compilers are able to optimize switch statements for faster evaluation. Since most JavaScript engines don’t have such optimizations, performance of the switch statement is mixed.
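
As a brief illustration of fall-through, the hypothetical function below (sizeLabel and its labels are made up for this sketch) maps several values to one result by stacking case labels:

```javascript
function sizeLabel(value) {
    switch (value) {
        case 0:
        case 1:
        case 2:
            return "small";             // 0, 1, and 2 share this result
        case 3:
        case 4:
            return "medium";            // 3 and 4 share this result
        default:
            return "large";
    }
}
```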

Firefox handles switch statements very well, with each condition’s evaluation executing in roughly the same amount of time regardless of the order in which they are defined. That means the case of value equal to 0 will take roughly the same amount of time to execute as when value is 9. Other browsers, however, aren’t nearly as good. Internet Explorer, Opera, Safari, and Chrome all show noticeable increases in the execution time as you get deeper into the switch statement. Those increases, however, are smaller than the increases experienced with each additional condition of an if statement. You can therefore improve the performance of switch statements by ordering the conditions in decreasing rate of frequency (the same as if statement optimization).

In JavaScript, if statements are generally faster than switch statements when there are just one or two conditions to be evaluated. When there are more than two conditions, and the conditions are simple (not ranges), the switch statement tends to be faster. This is because the amount of time it takes to execute a single condition in a switch statement is often less than it takes to execute a single condition in an if statement, making the switch statement optimal only when there are a larger number of conditions.

Another option: Array lookup

There are more than two solutions for dealing with conditionals in JavaScript. Alongside the if statement and the switch statement is a third approach: looking up values in arrays. The example for this section maps a given number to a specific result, which is exactly what arrays are for. Instead of writing a large if statement or switch statement, you can use the following code:

//define the array of results
var results = [result0, result1, result2, result3, result4, result5, result6,
               result7, result8, result9, result10];

//return the correct result
return results[value];

Instead of using conditional statements, all of the results are stored in an array whose index maps to the value variable. Retrieving the appropriate result is simply a matter of array value lookup. Although array lookup times also increase the deeper into the array you go, the incremental increase is so small that it is irrelevant relative to the increases in each condition evaluation for if and switch statements. This makes array lookup ideal whenever there are a large number of conditions to be met, and the conditions can be represented by discrete values such as numbers or strings (for strings, you can use an Object to store the results rather than an Array).
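
For string keys, the same idea might look like the sketch below. The status names and messages are made up for illustration, and the hasOwnProperty check is an added safeguard (not something the text above requires) so that inherited names such as toString do not match:

```javascript
var statusMessages = {
    ok: "Request succeeded",
    missing: "Resource not found",
    denied: "Access denied"
};

function messageFor(status) {
    // only keys defined directly on statusMessages count as matches
    if (Object.prototype.hasOwnProperty.call(statusMessages, status)) {
        return statusMessages[status];
    }
    return "Unknown status";
}
```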

It’s not practical to use array lookup for small numbers of results because array lookup is often slower than evaluating a small number of conditions. Array lookups can be very helpful when there are a large number of ranges because they eliminate the need to test both upper and lower bounds; you can simply fill in that range of indexes in the array with the appropriate value and do a straight array lookup.
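
Range lookup can be sketched by filling every index in a range with the same value. The score bands below are hypothetical, and the example assumes score is always an integer from 0 through 10; anything else yields undefined:

```javascript
// index = score from 0 to 10; each range is filled with its label
var gradeByScore = [
    "F", "F", "F", "F", "F",            // 0-4
    "D", "D",                           // 5-6
    "C", "C",                           // 7-8
    "B",                                // 9
    "A"                                 // 10
];

function gradeFor(score) {
    return gradeByScore[score];         // no upper or lower bound tests
}
```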

The fastest conditionals

The three techniques presented here—the if statement, the switch statement, and array lookup—each have their uses in optimizing code execution:

  • Use the if statement when:

    • There are no more than two discrete values for which to test.

    • There are a large number of values that can be easily separated into ranges.

  • Use the switch statement when:

    • There are more than two but fewer than 10 discrete values for which to test.

    • There are no ranges for conditions because the values are nonlinear.

  • Use array lookup when:

    • There are 10 or more values for which to test.

    • The results of the conditions are single values rather than a number of actions to be taken.

Fast Loops

As mentioned in Chapter 1, Understanding Ajax Performance, loops are a frequent source of performance issues in JavaScript, and the way you write loops drastically changes their execution time. Once again, JavaScript developers don’t get to rely on compiler optimizations that make loops faster regardless of the initial code, so it’s important to understand the various ways to write loops and how they affect performance.

Simple loop performance boosts

There are four different types of loops in JavaScript. In this section, we will discuss three of them: the for loop, the do-while loop, and the while loop. (The fourth type is a for-in loop that is used to iterate over object properties, but I won’t cover it here because its purpose is so specialized.) The various loop types are coded as follows:

//unoptimized code
var values = [1,2,3,4,5];

//for loop
for (var i=0; i < values.length; i++){
    process(values[i]);
}

//do-while loop
var j=0;
do {
    process(values[j++]);
} while (j < values.length);

//while loop
var k=0;
while (k < values.length){
    process(values[k++]);
}

Each of the loops in this example achieves the same result: all items in the values array are passed into the process function. These are the most common constructs used for iterating over a number of values in an array. Each of these loops runs in about the same amount of time because they’re doing roughly the same amount of work. There are, however, ways to improve the performance.

Perhaps the most glaring issue in each loop is the constant comparison of the iterator variable against the array length. As mentioned earlier in this chapter, property lookup is a much more expensive operation than local variable access. This code is retrieving the value of values.length every time the loop executes to see whether the terminal condition has been reached. This is incredibly inefficient given that the length of the array won’t change while the loop is being executed. Using a local variable instead of a property lookup can speed up the loops:

var values = [1,2,3,4,5];

var length = values.length;

//for loop
for (var i=0; i < length; i++){
    process(values[i]);
}

//do-while loop
var j=0;
do {
    process(values[j++]);
} while (j < length);

//while loop
var k=0;
while (k < length){
    process(values[k++]);
}

Each loop now uses the local variable length as its comparison point instead of values.length, eliminating a property lookup each time through the loop. This technique is especially important when dealing with HTMLCollection objects because, as mentioned previously, every property access on such an object is actually a query against the DOM for all nodes matching some criteria. That makes a property lookup on an HTMLCollection very expensive and, when included in the terminal condition of a loop, adds significant execution time to the overall loop.

Another simple way to improve the performance of a loop is to decrement the iterator toward 0 rather than incrementing toward the total length. Making this simple change can result in savings of up to 50% off the original execution time, depending on the complexity of each iteration. For example:

var values = [1,2,3,4,5];
var length = values.length;

//for loop
for (var i=length; i--;){
    process(values[i]);
}

//do-while loop
var j=length;
do {
    process(values[--j]);
} while (j);

//while loop
var k=length;
while (k--){
    process(values[k]);
}

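Each of these loops is now even faster by virtue of changing the terminal condition to a comparison against 0 (note that the terminal condition evaluates to false once the iterator variable equals 0, which ends the loop). Keep in mind that these versions process the array from the last item to the first, and that the do-while version assumes the array contains at least one item. The performance of each type of loop is comparable, so you needn’t worry about choosing among the three variations for speed purposes.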

Note

Be careful when using the native indexOf method for arrays. This method can take significantly longer to iterate over each array item than using a regular loop. If speed is your primary concern, use one of the three loop types mentioned in this section.
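
A plain loop that does the same job as indexOf might look like the sketch below. The function name findIndexOf is hypothetical; it uses strict equality, as the native method does, and relative timings vary by engine, so it is worth measuring before committing to either form:

```javascript
function findIndexOf(values, target) {
    for (var i = 0, len = values.length; i < len; i++) {
        if (values[i] === target) {
            return i;                   // first match, searching from the front
        }
    }
    return -1;                          // same "not found" value as indexOf
}
```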

Avoid the for-in loop

Another variation of the for loop is the for-in loop, whose purpose is to iterate over the enumerable properties of a JavaScript object. Typical usage is as follows:

for (var prop in object){
    if (object.hasOwnProperty(prop)){  //to filter out prototype properties
        process(object[prop]);
    }
}

This code iterates over the properties in a given object, using the hasOwnProperty method to ensure that only instance properties are processed.

Because the for-in loop has a specific purpose, there is little you can do to change its performance. The terminal condition cannot be altered, and the order of the properties to iterate over cannot be changed. Further, a for-in loop is typically much slower than any of the other loops because it requires resolving every enumerable property on a particular object. That, in turn, means the object’s prototype and entire prototype chain must be examined to extract these properties. Traversing the prototype chain, just like traversing the scope chain, takes time and slows down the performance of the entire loop.
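To make the prototype chain point concrete, here is a small sketch. The Person constructor and its species property are assumptions made up for illustration. It shows that for-in reaches an enumerable property defined on the prototype, and that hasOwnProperty filters it out.

```javascript
//hypothetical constructor used only for this illustration
function Person(name){
    this.name = name;                    //instance property
}
Person.prototype.species = "human";      //inherited, enumerable property

var person = new Person("Nicholas");
var all = [];
var own = [];

for (var prop in person){
    all.push(prop);                      //inherited properties are included
    if (person.hasOwnProperty(prop)){
        own.push(prop);                  //instance properties only
    }
}
```

After the loop, all contains both name and species, while own contains only name.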

If you know the specific properties you’re interested in, it’s much faster to create a standard loop (for, do-while, or while) and iterate over an array of names, such as:

//known properties to iterate over
var props = ["name", "age", "title"];

//while loop
var i=props.length;
while (i--){
    process(object[props[i]]);
}

This loop runs much faster than the for-in loop, and not simply because of the small number of properties in the props array. Even increasing the number of properties over which to iterate would yield significantly better performance than the for-in loop. The loop in this example takes advantage of all the normal loop performance enhancements and still allows iteration over a known set of object properties.

Naturally, this approach works only when you know the object properties to iterate over; when dealing with unknown properties, as with JSON objects, a for-in loop may still be necessary.

Unrolling loops

It is a common practice in several programming languages to unroll small loops to improve performance. The basis of this practice is that limiting the number of iterations can mitigate the performance overhead of a loop. The implementation of such a solution is typically called unrolling the loop, which means making each iteration do the work of multiple iterations. Consider the following loop:

var i=values.length;
while (i--){
    process(values[i]);
}

If there are only five items in the values array, it is actually faster to remove the loop and do the work on each value individually:

//unrolled loop
process(values[0]);
process(values[1]);
process(values[2]);
process(values[3]);
process(values[4]);

Of course, this approach is arguably less maintainable, as it takes more code to write and any change to the number of items in the values array requires changes to the code. Further, the performance gains for such a small number of statements aren’t worth the maintenance overhead. This technique can be quite useful, however, when you’re dealing with a large number of values and a potentially large number of iterations.

Tom Duff, a computer programmer working for Lucasfilm at the time, first proposed a construct for unrolling loops in the C programming language. This pattern became known as Duff’s Device and was later converted to JavaScript by Jeff Greenberg, who also published one of the first comprehensive studies on JavaScript performance optimization (which is still available at http://home.earthlink.net/~kendrasg/info/js_opt/). Greenberg’s Duff’s Device implementation is as follows:

var iterations = Math.ceil(values.length / 8);
var startAt = values.length % 8;
var i = 0;

do {
    switch(startAt){
        case 0: process(values[i++]);
        case 7: process(values[i++]);
        case 6: process(values[i++]);
        case 5: process(values[i++]);
        case 4: process(values[i++]);
        case 3: process(values[i++]);
        case 2: process(values[i++]);
        case 1: process(values[i++]);
    }
    startAt = 0;
} while (--iterations > 0);

The idea behind Duff’s Device is that each trip through the loop does the work of between one and eight iterations of a normal loop. This is done by first determining the number of iterations by dividing the total number of array values by eight and rounding up. Duff found that eight was an optimal number to use for this processing (it’s not arbitrary). Since not all array lengths are evenly divisible by eight, you must also use the modulus operator to calculate how many items are left over. The startAt variable, therefore, contains the number of additional items to be processed. This variable is used only the first time through the loop, to do the extra work, and then is set back to zero so that each subsequent trip through the loop results in a full eight items being processed. Duff’s Device runs faster than a normal loop over a large number of iterations, but it can be made even faster.
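To trace the arithmetic, the sketch below wraps the same switch-based pattern in a function so it can be run against a 10-item array. The function name duff and the early return for an empty array are assumptions added for this illustration; they are not part of Greenberg's code.

```javascript
//hypothetical wrapper around the switch-based pattern
function duff(values, process){
    var iterations = Math.ceil(values.length / 8);
    var startAt = values.length % 8;
    var i = 0;

    if (iterations === 0){       //assumption: nothing to do for an empty array
        return;
    }

    do {
        switch(startAt){         //cases fall through on purpose
            case 0: process(values[i++]);
            case 7: process(values[i++]);
            case 6: process(values[i++]);
            case 5: process(values[i++]);
            case 4: process(values[i++]);
            case 3: process(values[i++]);
            case 2: process(values[i++]);
            case 1: process(values[i++]);
        }
        startAt = 0;
    } while (--iterations > 0);
}

var seen = [];
duff([1,2,3,4,5,6,7,8,9,10], function(value){
    seen.push(value);
});
```

With 10 items, iterations is 2 and startAt is 2, so the first trip through the loop enters at case 2 and processes two items, and the second trip processes the remaining eight.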

The book Speed Up Your Site (New Riders) introduced a version of Duff’s Device in JavaScript that moves the processing of the extra array items outside the main loop, allowing the switch statement to be removed and resulting in an even faster way of processing a large number of items:

var iterations = Math.floor(values.length / 8);
var leftover = values.length % 8;
var i = 0;

if (leftover > 0){
    do {
        process(values[i++]);
    } while (--leftover > 0);
}

//guard so that arrays with fewer than eight items skip the unrolled loop
if (iterations > 0){
    do {
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
    } while (--iterations > 0);
}

This code executes faster over a large number of array items primarily due to the removal of the switch statement from the main loop. As discussed earlier in this chapter, conditionals do have performance overhead; removing that overhead from the algorithm speeds up the processing. The separation of processing into two discrete loops allows this augmentation.

Duff’s Device, and the modified version presented here, is useful primarily with large arrays. For small arrays, the performance gain is minimal compared to standard loops. Therefore, you should attempt to use Duff’s Device only if you notice a performance bottleneck relating to a loop that must process a large number of items.

String Optimization

String manipulation is a very common occurrence in JavaScript. There are a number of different ways to manipulate strings, whether it be using built-in string methods and operators or intermixing the use of regular expressions and arrays, and each task brings with it specific performance considerations. The exact technique to use for optimal performance is tied directly to the type of manipulation being performed.

String Concatenation

Traditionally, string concatenation has been one of the poorest-performing aspects of JavaScript. Typically, string concatenation is done using the plus operator (+), such as in the following:

var text = "Hello";
text += " ";
text += "World!";

Early browsers had no optimization for such operations. Since strings are immutable, that meant creating intermediate strings to contain the concatenation result. This constant creation and destruction of strings behind the scenes led to very poor string concatenation performance.

Having discovered this, developers turned to the JavaScript Array object for help. One of the Array object’s methods is join, which concatenates all items in the array and inserts a given string between the items. Instead of using the plus operator, each string is added to an array and the join method is called when all items have been added. For example:

var buffer = [],
    i = 0;
buffer[i++] = "Hello";
buffer[i++] = " ";
buffer[i++] = "World!";

var text = buffer.join("");

In this code, each string is added into the buffer array. The join method is called after all strings are in the array, returning the concatenated string and storing it in the variable text. Adding the items directly into the appropriate index is slightly faster than calling push for each value. This technique proved to be much faster in early browsers than using the plus operator because no intermediate strings are being created and destroyed. However, browser string optimizations have changed the string concatenation picture.

Firefox was the first browser to optimize string concatenation. Beginning with version 1.0, the array technique is actually slower than using the plus operator in all cases. Other browsers have also optimized string concatenation, so Safari, Opera, Chrome, and Internet Explorer 8 also show better performance using the plus operator. Internet Explorer prior to version 8 didn’t have such an optimization, and so the array technique is always faster than the plus operator.

This doesn’t necessarily mean browser detection should be used whenever string concatenation is necessary. There are two factors to consider when determining the most appropriate way to concatenate strings: the size of the strings being concatenated and the number of concatenations.

All browsers can comfortably complete the task in less than one millisecond using the plus operator when the size of the strings is relatively small (20 characters or less) and the number of concatenations is also relatively small (1,000 concatenations or less). There is no reason to consider anything other than using the plus operator if this is your situation.

As you increase the number of concatenations for small strings, or the size of the strings with a small number of concatenations, the performance gets significantly worse in Internet Explorer through version 7. Also, as the size of the strings increases, the performance difference between using the plus operator and the array technique decreases in Firefox. As the number of concatenations increases, the difference between the two techniques decreases in Safari as well. The only browsers where the plus operator remains consistently and significantly faster with varying string size and concatenation numbers are Chrome and Opera.

With all of the performance variance across browsers, the technique to use is heavily dependent on the use case as well as on the browsers you’re targeting. If your users largely use Internet Explorer 6 or 7, it may be worth using the array technique all the time because that will affect the largest number of people. The performance decrease of the array technique in other browsers is typically much less than the performance increase gained in Internet Explorer, so try to balance your users’ experience based on their browsers rather than trying to target specific situations and browser versions. In most common cases, however, using the plus operator is preferred.
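Because both techniques produce the same string, one way to keep the choice flexible is to hide it behind a pair of interchangeable functions. The following is a minimal sketch under that assumption; the names concatPlus and concatJoin are hypothetical, and which one is faster depends on the browser, as described above.

```javascript
//hypothetical helper: concatenation with the plus operator
function concatPlus(parts){
    var text = "";
    for (var i=0, len=parts.length; i < len; i++){
        text += parts[i];
    }
    return text;
}

//hypothetical helper: concatenation with the array technique
function concatJoin(parts){
    var buffer = [],
        n = 0;
    for (var i=0, len=parts.length; i < len; i++){
        buffer[n++] = parts[i];      //direct index assignment instead of push
    }
    return buffer.join("");
}
```

Calling code can then switch between the two in a single place if measurements on the target browsers call for it.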

Trimming Strings

One of the most glaring omissions of JavaScript strings is the lack of a native trim method used to remove leading and trailing whitespace. The most common implementation of a trim function is as follows:

function trim(text){
    return text.replace(/^\s+|\s+$/g, "");
}

This implementation uses a regular expression that matches one or more whitespace characters at the beginning or end of the string. The string’s replace method is used to replace any matches with an empty string. This implementation, however, has a performance issue based in the regular expression.

The performance impact comes from two aspects of the regular expression: the pipe operator, indicating that there are two patterns to match, and the g flag, indicating that the pattern should be applied globally. With this in mind, you can rewrite the function to be a bit faster by breaking up the regular expression into two and getting rid of the g flag:

function trim(text){
    return text.replace(/^\s+/, "").replace(/\s+$/, "");
}

Breaking the single replace method into two calls allows each regular expression to become much simpler and, therefore, faster. This method is faster than the original, but you can optimize it even further.

Steven Levithan, after performing research on the fastest way to execute string trimming in JavaScript, arrived at the following function:

function trim(text){
    text = text.replace(/^\s+/, "");
    for (var i = text.length - 1; i >= 0; i--) {
        if (/\S/.test(text.charAt(i))) {
            text = text.substring(0, i + 1);
            break;
        }
    }
    return text;
}

This trim function consistently performs better than other variations. The source of the speed is keeping the regular expressions as simple as possible. The first line removes leading whitespace and then the for loop is used to strip trailing whitespace. The loop walks backward from the end of the string and tests each character against a very simple, single-character regular expression that matches nonwhitespace characters. As soon as a nonwhitespace character is found, everything after it is cut off using substring and the loop exits. The resulting function performs faster than the previous versions across all browsers. For Levithan’s complete analysis, see his post at http://blog.stevenlevithan.com/archives/faster-trim-javascript.

As with string concatenation, the speed of string trimming matters only if it is performed with enough frequency during execution. The second trim function in this section performs fine for smaller strings over the course of a few calls; the third trim function is significantly faster when used on longer strings.

Note

The next version of the ECMAScript specification, code-named ECMAScript 3.1, defines a native trim method for strings; it is likely that this native version will be faster than any of the functions in this section. When available, the native function should be used.
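One way to follow this advice is to check for the native method at call time and fall back to a script-based version only when it is missing. The sketch below is an assumption about how that might be wired together, not code from Levithan's post; the fallback simply reuses the loop-based approach shown earlier.

```javascript
function trim(text){
    //prefer the native method when the engine provides one
    if (typeof String.prototype.trim == "function"){
        return text.trim();
    }

    //fallback: strip leading whitespace, then scan backward for trailing whitespace
    text = text.replace(/^\s+/, "");
    for (var i = text.length - 1; i >= 0; i--) {
        if (/\S/.test(text.charAt(i))) {
            text = text.substring(0, i + 1);
            break;
        }
    }
    return text;
}
```

The feature test runs on every call here for simplicity; it could also be performed once when the script loads.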

Avoid Long-Running Scripts

One of the critical performance issues with JavaScript is that code execution freezes a web page. Because JavaScript is a single-threaded language, only one script can be run at a time per window or tab. This means that all user interaction is necessarily halted while JavaScript code is being executed. This is an important feature of browsers since JavaScript may change the underlying page structure during its execution, with the possibility of nullifying or altering the response to user interaction.

If JavaScript code isn’t carefully crafted, it’s possible to freeze the web page for an extended period of time and ultimately cause the browser to stop responding. Most browsers will detect long-running scripts and notify the user of a problem with a dialog box asking whether the script should be allowed to proceed.

Exactly what causes the browser to display the long-running script dialog varies depending on the vendor:

  • Internet Explorer monitors the number of statements that have been executed by a script. When a maximum number of statements have been executed, 5 million by default, the long-running script dialog is displayed (as shown in Figure 7.7, “Internet Explorer 7 long-running script dialog”).

  • Firefox monitors the amount of time a script is taking to execute. When a script takes longer than a preset amount of time, 10 seconds by default, the long-running script dialog is displayed.

  • Safari also uses the execution time to determine whether a script is long-running. The default timeout is set to five seconds, after which the long-running script dialog is displayed.

  • Chrome as of version 1.0 has no set limit on how long JavaScript is allowed to run. The process will crash when it has run out of memory.

  • Opera is the only browser that doesn’t protect against long-running scripts. Scripts are allowed to continue until execution is complete.

Figure 7.7. Internet Explorer 7 long-running script dialog


If you ever see the long-running script dialog, it’s an indication that the JavaScript code needs to be refactored. Generally speaking, no single continuous script execution should take longer than 100 milliseconds; anything longer than that and the web page will almost certainly appear to be running slowly to the user. Brendan Eich, the creator of JavaScript, is also quoted as saying, “[JavaScript] that executes in whole seconds is probably doing something wrong....”

The most common reasons why a script takes too long to execute include:

Too much DOM interaction

DOM manipulation is more expensive than any other JavaScript process. Minimizing DOM interactions significantly cuts the JavaScript runtime. Most browsers update the display only after the entire script has finished executing, which slows the perceived responsiveness of the web page to the user.

Loops that do too much

Any loop that either runs too many times or performs too many operations with each iteration can cause long-running script issues. It helps to separate out functionality whenever possible. The effect is worsened when loops are used to perform DOM manipulations, sometimes causing the browser to completely freeze without ever showing the long-running script dialog.

Too much recursion

JavaScript engines put a limit on the amount of recursion that scripts can use. Rewriting the code to avoid recursion helps ameliorate the issue.

Sometimes simple code refactoring, keeping these issues in mind, can prevent runaway scripts. There may, however, be times when complex processes must necessarily be executed for the web application to function correctly. In that case, the code must be restructured to yield periodically, as explained in the next section.
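As a small illustration of the recursion point, the sketch below shows the same calculation written both ways. The functions sumRecursive and sumIterative are hypothetical examples chosen for brevity. The recursive version adds a call to the stack for every step, so a large enough input can exceed the engine's recursion limit, while the iterative version uses a loop and a single call.

```javascript
//recursive form: one stack frame per step
function sumRecursive(n){
    return n === 0 ? 0 : n + sumRecursive(n - 1);
}

//iterative form: same result without growing the call stack
function sumIterative(n){
    var total = 0;
    while (n > 0){
        total += n--;
    }
    return total;
}
```

For small inputs the two return the same value; the iterative form keeps working as the input grows.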

Yielding Using Timers

The single-threaded nature of JavaScript means that only one script can be executed in a window or tab at any given point in time. No user interactions can be processed during this time and so it’s necessary to introduce breaks in long-executing JavaScript code. On simple web pages, the breaks occur naturally as the user interacts with the page. In complex web applications, it’s often necessary to insert the breaks yourself. The easiest way to do this is to use a timer.

Timers are created using the setTimeout function, passing in the function to execute as well as a delay (in milliseconds) before the function should be executed. When the delay has passed, the code to execute is placed into a queue. The JavaScript engine uses this queue to determine what to do next. When a script finishes executing, the JavaScript engine yields to allow other browser tasks to catch up. The web page display is typically updated during this time in relation to changes made via the script. Once the display has been updated, the JavaScript engine checks for more scripts to run on the queue. If another script is waiting, it is executed and the process repeats; if there are no more scripts to execute, the JavaScript engine remains idle until another script appears in the queue.

When you create a timer, you’re actually scheduling some code to be inserted into the JavaScript engine’s queue to be executed later. That insertion happens after the amount of time specified when calling setTimeout. In essence, timers push code execution off into the future, where all long-running script limits are reset. Consider the following code:

window.onload = function(){

    //Page Load

    //create first timer
    setTimeout(function(){

        //Delayed Script 1

        setTimeout(function(){

            //Delayed Script 2

        }, 100);

        //Delayed Script 1, continued

    }, 100);

};

In this example, a script is run when the page loads. That script calls setTimeout to create the first timer. When that timer executes, it calls setTimeout again to create a second timer. The second delayed script cannot start running, however, until the first has finished executing and the browser has updated the display. Figure 7.8, “JavaScript code execution with timers” shows the timeline for this code execution, indicating that no two scripts are run at the same time.
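The ordering can be reproduced with a toy model of the queue. This is an assumption made purely for illustration and is not how any browser is implemented: fakeSetTimeout ignores the delay and simply appends the function to an array, and runQueue runs each queued function to completion before taking the next one.

```javascript
var queue = [];
var log = [];

//toy stand-in for setTimeout: the delay is ignored in this model
function fakeSetTimeout(fn){
    queue.push(fn);
}

//toy stand-in for the engine: run queued scripts one at a time
function runQueue(){
    while (queue.length > 0){
        var task = queue.shift();
        task();                        //runs to completion before the next task starts
    }
}

//same shape as the nested-timer example
fakeSetTimeout(function(){
    log.push("script 1 start");
    fakeSetTimeout(function(){
        log.push("script 2");
    });
    log.push("script 1 end");
});
log.push("page load");

runQueue();
```

The log ends up as page load, script 1 start, script 1 end, script 2: the second delayed script is queued while the first is running but cannot begin until the first has finished.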

Timers are the de facto standard for splitting up JavaScript code execution in browsers. Whenever a script is taking too long to complete, look to delay parts of the execution until later.

Note that very small timer delays can also cause the browser to become unresponsive. It’s recommended to never use a delay of zero milliseconds, as this isn’t enough time for all browsers to properly update their display. In general, delays between 50 and 100 milliseconds are appropriate and allow browsers enough idle time to perform necessary display updates.

Figure 7.8. JavaScript code execution with timers


Timer Patterns for Yielding

Array processing is one of the most frequent causes of long-running scripts. Typically, this is because processing must be done on each member of the array, and so the execution time increases directly in proportion to the number of items in the array. If the array processing doesn’t have to be executed synchronously, it is a good candidate for splitting up using timers.

In my book, Professional JavaScript for Web Developers, Second Edition (Wrox), I describe a simple function that can be used to split up the processing of arrays using timers:

function chunk(array, process, context){
    setTimeout(function(){
        var item = array.shift();
        process.call(context, item);

        if (array.length > 0){
            setTimeout(arguments.callee, 100);
        }
    }, 100);
}

The chunk function accepts three arguments: an array of data to process, a function with which to process each item, and an optional context argument in which the processing function should be executed (by default, all functions passed into setTimeout are run in the global context, so this is equal to window). Processing of the items is done using timers, and so the code execution yields after each item has been processed. The next item to process is always at the front of the array and is removed before being processed. Afterward, a check is made to determine whether there are any further values left to process. If so, a new timer is created and the function is called again via arguments.callee. Note that the chunk function uses the passed-in array as a “to do” list of items to process, so the array is empty once execution is complete; pass in a copy if you need to keep the original. You can use the function as follows:

var names = ["Nicholas", "Steve", "Doug", "Bill", "Ben", "Dion"],
    todo = names.concat();  //clone the array

chunk(todo, function(item){
    console.log(item);
});

The code in this simple example outputs each name in the names array to the console (available in Firefox with Firebug installed, Internet Explorer 8+, Safari 2+, and all versions of Chrome). The processing function is very short but could easily be replaced with something more complex. The chunk function is best used with long arrays where each item requires significant processing.
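When each item is cheap to process, yielding after every single item can make the overall job take much longer than necessary. One possible variation, sketched here as an assumption rather than as the function from the book, processes items until a time budget is used up and only then yields. The name timedChunk, the 50-millisecond budget, the 25-millisecond delay, and the optional schedule argument (which defaults to setTimeout and exists so the function can be exercised without real timers) are all hypothetical choices.

```javascript
//hypothetical variation on chunk(): process items in batches limited by time
function timedChunk(array, process, context, schedule){
    schedule = schedule || setTimeout;     //assumption: scheduler can be swapped for testing

    if (array.length === 0){               //nothing to process
        return;
    }

    schedule(function step(){
        var start = +new Date();

        //keep going until the array is empty or about 50ms have elapsed
        do {
            process.call(context, array.shift());
        } while (array.length > 0 && (+new Date() - start) < 50);

        if (array.length > 0){
            schedule(step, 25);            //yield, then continue with the next batch
        }
    }, 25);
}
```

The named function expression step takes the place of arguments.callee, which is not allowed in strict mode. As with chunk, the array passed in is consumed, so pass a copy if the original is still needed.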

Another popular pattern is to perform small, sequential parts of a larger operation using timers. Julien Lecomte presented this pattern in his blog post, “Running CPU Intensive JavaScript Computations in a Web Browser”, in which he showed how sorting of a large data set could be achieved using an inefficient algorithm (bubble sort) without incurring a long-running script issue. The following is an adaptation of Lecomte’s code:

function sort(array, onComplete){

    var pos = 0;

    (function(){

        var j, value;

        //start at the last valid index and swap within the array passed in
        for (j=array.length - 1; j > pos; j--){
            if (array[j] < array[j-1]){
                value = array[j];
                array[j] = array[j-1];
                array[j-1] = value;
            }
        }

        pos++;

        if (pos < array.length){
            setTimeout(arguments.callee,10);
        } else {
            onComplete();
        }

    })();

}

The sort function splits up each traversal through the array for sorting, allowing the browser to continue functioning while this processing occurs. The inner anonymous function is called immediately to do the first traversal and then is called subsequently via a timer by passing arguments.callee into setTimeout. When the array is finally sorted, the onComplete function is called to notify the developer that the data is ready to be used. The function can be used as follows:

sort(values, function(){
    alert("Done!");
});

When sorting an array with a large number of items, the difference in browser responsiveness is immediately apparent.

Summary

The speed with which JavaScript executes is very dependent on how it is written. In this chapter, you learned several ways to speed up JavaScript code execution:

  • Managing your scope is important, since out-of-scope variables take longer to access than local variables. Try to avoid constructs that artificially augment the scope chain, such as the with statement and the catch clause of a try-catch statement. If an out-of-scope value is being used more than once, store it in a local variable to minimize the performance penalty.

  • The way you store and access data can greatly impact the performance of your script. Literal values and local variables are always the fastest; you incur a performance penalty for accessing array items and object properties. If an array item or object property is used more than once, store it in a local variable to speed up access to the value.

  • Flow control is also an important determinant of script execution speed. There are three ways to handle conditionals: the if statement, the switch statement, and array lookup. The if statement is best used with a small number of discrete values or a range of values; the switch statement is best used when there are between 3 and 10 discrete values to test for; array lookup is most efficient for a larger number of discrete values.

  • Loops are frequently found to be bottlenecks in JavaScript. To make a loop the most efficient, reverse the order in which you process the items so that the control condition compares the iterator to zero. This is far faster than comparing a value to a nonzero number and significantly speeds up array processing. If there are a large number of required iterations, you may also want to consider using Duff’s Device to speed up execution.

  • Be careful when using HTMLCollection objects. Each time a property is accessed on one of these objects, it requires a query of the DOM for matching nodes. This is an expensive operation that can be avoided by accessing HTMLCollection properties only when necessary and storing frequently used values (such as the length property) in local variables.

  • Common string operations may have unintended performance implications. String concatenation is much slower in Internet Explorer 7 and earlier than in other browsers, but it’s not worth worrying about unless you’re dealing with more than 1,000 concatenations. You can optimize string concatenation in those versions of Internet Explorer by using an array to store the individual strings and then calling join() to merge them together. Trimming strings may also be expensive, depending on the size of the string. Make sure to use the most efficient algorithm if trimming is a large part of your script.

  • Browsers have limits on how long JavaScript can run, capping either the number of statements or the amount of time the JavaScript engine is allowed to run. You can circumvent these limits, and prevent the browser from displaying a warning about the long-running script, by using timers to split up the amount of work.



[22] All of the research in this chapter is based on experiments run on Firefox versions 3.0 and 3.1 beta 2, Google Chrome 1.0, Internet Explorer versions 7 and 8 beta 2, Safari versions 3.0–3.2, and Opera version 9.62. When the version numbers aren’t mentioned, all tested versions of the browser are relevant.
