Today’s web applications are powered by a large amount of JavaScript code. Whereas early web sites used JavaScript to perform simple tasks, the language is now used to run the entire user interface in many places. The result can be thousands of lines of JavaScript code to execute every time a user interaction takes place. Performance, therefore, is not just about how long it takes for the page to load, but also about how it responds as it’s being used. The best way to ensure a fast, enjoyable user interface is to write JavaScript as efficiently as possible for all browsers.[22]
This excerpt is from Even Faster Web Sites.
Souders and eight expert contributors provide best practices and pragmatic advice for improving your site's performance in three critical categories: JavaScript, in the network, and in the browser.
This chapter covers some of the hidden performance issues in JavaScript and how to address them. Some changes concern small code structure issues while others may require revisiting your algorithm. The important thing to remember is that there is no silver bullet when trying to improve performance; no one thing will work in 100% of the cases. Only when various techniques are combined can you realize the largest performance improvement.
When JavaScript code is being executed, an execution context is created. The execution context (also sometimes called the scope) defines the environment in which code is to be executed. A global execution context is created upon page load, and additional execution contexts are created as functions are executed, ultimately creating an execution context stack where the topmost context is the active one.
Each execution context has a scope chain associated with it, which
is used for identifier resolution. The scope chain contains one or more
variable objects that define in-scope identifiers for the execution
context. The global execution context has only one variable object in its
scope chain, and this object defines all of the global variables and functions available in JavaScript. When
a function is created (but not executed), its internal
[[Scope]] property is assigned to
contain the scope chain of the execution context in which it was created
(internal properties cannot be accessed through JavaScript, so you cannot
access this property directly). Later, when execution flows into a
function, an activation object is created and initialized with values for
this, arguments, named arguments, and any variables local to the
function. The activation object appears first in the execution context’s
scope chain and is followed by the objects contained in the function’s
[[Scope]] property.
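As a rough illustration of a function keeping the scope chain it was created in, consider the sketch below. The names makeCounter and counter are hypothetical and not from the original text. The returned function can still resolve count after makeCounter has finished, because its [[Scope]] property refers to the objects that were in scope when it was created.

```javascript
// Hypothetical example: all names here are illustrative assumptions.
function makeCounter() {
    var count = 0; // stored in makeCounter's activation object

    // This function is created while makeCounter is executing, so its
    // [[Scope]] includes makeCounter's activation object.
    return function () {
        count += 1; // found one step past this function's own activation object
        return count;
    };
}

var counter = makeCounter();
var first = counter();  // count is resolved through the saved scope chain
var second = counter(); // the same count variable is found again
```

Each call to counter creates a new activation object, but the objects behind it in the scope chain are the ones captured when the function was created.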
During code execution, identifiers such as variable and function names are resolved by searching the scope chain of the execution context. Identifier resolution begins at the front of the scope chain and proceeds toward the back. Consider the following code:
function add(num1, num2){
    return num1 + num2;
}
var result = add(5, 10);
When this code is executed, the add function has a [[Scope]] property that contains only the global
variable object. As execution flows into the add function, a new execution context is
created, and an activation object containing this, arguments, num1, and num2 is placed into the scope chain. Figure 7.1, “Relationship of execution context and scope chain” illustrates the
behind-the-scenes object relationships that occur while the add function is being executed.
Inside the add function, the
identifiers num1 and num2 need to be resolved when the function is
executing. This resolution is performed by inspecting each object in the
scope chain until the specific identifier is found. The search begins at
the first object in the scope chain, which is the activation object
containing the local variables for the function. If the identifier isn’t
found there, the next object in the scope chain is inspected for the
identifier. When the identifier is found, the search stops. In the case of
this example, the identifiers num1 and
num2 exist in the local activation
object and so the search never goes on to the global object.
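The search order described above can be sketched with nested functions. This is only an illustration under assumed names (outermost, enclosing, local, outer, inner); it shows where each identifier is found, not how long the lookup takes.

```javascript
// Hypothetical names throughout; a sketch of identifier depth, not a benchmark.
var outermost = "depth 3"; // declared in the outermost scope

function outer() {
    var enclosing = "depth 2"; // stored in outer's activation object

    function inner() {
        var local = "depth 1"; // stored in inner's own activation object
        // Each identifier is resolved by walking the scope chain front to back:
        // local is found in the first object, enclosing in the second,
        // and outermost only after both of those have been searched.
        return [local, enclosing, outermost];
    }

    return inner();
}

var depths = outer();
```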
Understanding scopes and scope chain management in JavaScript is important because identifier resolution performance is directly related to the number of objects to search in the scope chain. The farther up the scope chain an identifier exists, the longer the search goes on and the longer it takes to access that variable; if scopes aren’t managed properly, they can negatively affect the execution time of your script.
Local variables are, by far, the fastest identifiers both to read from and write to in JavaScript. Because they exist in the activation object of the executing function, identifier resolution involves inspecting a single object in the scope chain. The amount of time necessary to read the value of a variable increases with each step along the scope chain, so the greater the identifier depth, the slower the access is going to be. This effect can be seen in every browser except Google Chrome using the V8 engine and Safari 4+ using the Nitro JavaScript engine, both of which are so fast that the identifier depth has little effect on access speed.
To determine the exact performance impact of identifier depth, I ran an experiment involving 200,000 variable operations. I alternated between reads and writes, accessing the variables from different identifier depths. The page I used for this experiment is located at http://www.nczonline.net/experiments/javascript/performance/identifier-depth/.
Figure 7.2, “Variable read time compared to identifier depth” illustrates the amount of time it takes to read from a variable based on scope chain depth, and Figure 7.3, “Variable write time compared to identifier depth” illustrates the amount of time it takes to write to an identifier based on its scope chain depth (a depth of 1 signifies a local identifier).
As these figures clearly indicate, identifiers are accessed significantly faster when they sit closer to the front of the scope chain. You can take advantage of this knowledge by using local variables whenever possible. A good rule of thumb is to store any out-of-scope variable in a local variable whenever it’s used more than once within the function. For example:
function createChildFor(elementId){
    var element = document.getElementById(elementId),
        newElement = document.createElement("div");
    element.appendChild(newElement);
}
This function has two references to the global variable document. Since
document is being used more than
once, it should be stored in a local variable for faster reference, such
as here:
function createChildFor(elementId){
    var doc = document, //store in a local variable
        element = doc.getElementById(elementId),
        newElement = doc.createElement("div");
    element.appendChild(newElement);
}
The rewritten version of the function stores document in a local variable called doc. Since doc exists in the first part of the scope
chain, it can be resolved faster than document. Keep in mind that the global
variable object is always the last object in the scope chain, and so
global identifier resolution is always the most expensive.
The scope chain for a given execution context typically remains unchanged
during code execution. There are, however, two statements that
temporarily augment the scope chain of an execution context. The first
is the with statement, which is designed to allow easy access to object
properties by making them appear as local variables. For example:
var person = {
    name: "Nicholas",
    age: 30
};
function displayInfo(){
    var count = 5;
    with(person){
        alert(name + " is " + age);
        alert("Count is " + count);
    }
}
displayInfo();
In this code, the person object
is passed into a with block. This
allows you to access the name and
age properties as though they were
locally defined. What actually happens, though, is that a new variable
object is pushed to the front of the execution context’s scope chain.
This variable object contains all of the properties of the specified
object (in this case, person) so that
they can be accessed without using dot notation. Figure 7.4, “Scope chain augmentation using the with statement” shows how the
scope chain for displayInfo is
augmented while the with statement is
being executed.
Though it seems very convenient when an object’s properties are
being used repeatedly, this extra object in the scope chain hurts local
identifier resolution. While code within a with statement is being executed, the local
function variables now exist in the second object in the scope chain instead
of the first, automatically slowing down identifier access. In the
previous example, the count variable
now takes longer to access because it’s not in the first object of the
scope chain. Once the with statement
finishes executing, the scope chain is restored to its previous state.
Due to this major downside, it’s recommended to avoid
using the with statement.
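One possible alternative, sketched here under assumed names (describePerson, p), is to store the object in a short local variable and use ordinary property access. The scope chain is left untouched, so count stays in the first object of the chain. The function returns its messages instead of calling alert so that the sketch runs outside a browser.

```javascript
// Hypothetical rewrite of the earlier example without the with statement.
var person = {
    name: "Nicholas",
    age: 30
};

function describePerson() {
    var count = 5,
        p = person; // local reference to the object; no scope chain augmentation

    return [
        p.name + " is " + p.age, // explicit property access through the local variable
        "Count is " + count      // count is still resolved in the first scope chain object
    ];
}

var messages = describePerson();
```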
The second statement that augments the scope chain is
the catch clause of a
try-catch block. The catch clause behaves in a manner similar to
the with statement where it adds a
variable object to the front of the scope chain while it executes the
code in the block. That variable object contains an entry for the named
exception object specified by catch.
However, the catch clause is executed
only when an error occurs during execution of the try clause, making it somewhat less
problematic than the with statement,
though you should take care not to execute too much code within the
catch clause to minimize the
performance impact.
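A minimal sketch of keeping the catch clause small is to hand the error to a separate function right away. The names parseSettings, handleError, and handledMessages are assumptions made for this illustration. Only one call and one return run while the scope chain is augmented; the handler itself executes in its own execution context.

```javascript
// Hypothetical helper names; the point is how little code sits inside catch.
var handledMessages = [];

function handleError(error) {
    // Runs in its own execution context, not inside the augmented scope chain.
    handledMessages.push(error.message);
}

function parseSettings(text) {
    try {
        return JSON.parse(text);
    } catch (ex) {
        handleError(ex); // delegate immediately
        return null;
    }
}

var good = parseSettings('{"theme":"dark"}');
var bad = parseSettings("{not valid json");
```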
Minding scope chain depth is an easy way to get performance improvements with a small amount of work. Avoid unnecessarily augmenting the scope chain and inadvertently slowing down execution.
Where data is stored in a script contributes directly to the amount of time it takes to execute. In general, there are four places from which data can be accessed in a script:
Literal value
Variable
Array item
Object property
Reading data always incurs a performance cost, and that cost depends on which of these four locations the data is stored in.
In most browsers, the cost of reading a value from a literal versus a local variable is so small as to be negligible; you should feel free to mix and match literals and local variables without worrying about a performance penalty. The real difference comes when you move to reading data from an array or object. Accessing values from one of these data structures requires a lookup of the location in which the data is stored, either by index (for array) or by property name (for objects).
To test the data access times based on data location, I created an experiment that reads values from each of these locations 200,000 times. You can find the experiment online at http://www.nczonline.net/experiments/javascript/performance/data-access/. The result of running this experiment on multiple browsers is that there is almost an even split across browsers as to which is faster: Internet Explorer, Opera, and Firefox 3 all access array items faster than object properties; Chrome, Safari, Firefox 2, and Firefox 3.1+ access object properties faster than array items (see Figure 7.5, “Data access time across browsers”).
The important lesson to take from this information is to always store frequently accessed values in a local variable. Consider the following code:
function process(data){
    if (data.count > 0){
        for (var i=0; i < data.count; i++){
            processData(data.item[i]);
        }
    }
}
This snippet accesses the value of data.count multiple times. At first glance, it
looks like this value is used twice: once in the if statement and once in the for loop. In reality, though, data.count is accessed data.count plus 2 times in this function: once in the if statement and data.count plus 1 times in the for loop, since
the control statement (i < data.count) is executed each time through the loop, including the final check that ends it. The function
will run faster if this value is stored in a local variable and then
accessed from there:
function process(data){
    var count = data.count;
    if (count > 0){
        for (var i=0; i < count; i++){
            processData(data.item[i]);
        }
    }
}
The rewritten version of this function accesses data.count only once, at the beginning in order
to store it in a local variable. The local variable count is used in its place elsewhere in the function, limiting the
number of times an object property must be accessed to retrieve this
value. This function will run faster than the previous function because
the number of object property lookups has been reduced.
The effect of data access is exaggerated as the value’s data
structure depth increases. For example, data.count is faster to access than data.item.count, which is faster to access than
data.item.subitem.count. When dealing
with properties, the number of times a dot is used (for property lookup)
directly relates to the amount of time it takes to access that value.
Figure 7.6, “Access times for object properties by depth” shows the
relative data access times by property depth across browsers. The tests
for this research are part of the data access experiment located at http://www.nczonline.net/experiments/javascript/performance/data-access/.
A good approach to take when dealing with data access is to store in a local variable any object property or array item that is used more than once in a function.
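The same idea applies to deeper structures. The sketch below assumes a hypothetical data shape (order.cart.items) and function name (totalPrices); it resolves the nested path once and then works only with local variables inside the loop.

```javascript
// Hypothetical data shape and function name, for illustration only.
function totalPrices(order) {
    var items = order.cart.items, // two property lookups, performed once
        len = items.length,       // one more lookup, also performed once
        total = 0;

    for (var i = 0; i < len; i++) {
        total += items[i].price;  // only local variables are resolved per iteration
    }
    return total;
}

var sampleOrder = { cart: { items: [{ price: 5 }, { price: 10 }, { price: 2.5 }] } };
var sampleTotal = totalPrices(sampleOrder);
```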
For most browsers, there is virtually no difference between using
dot notation for object property access (data.count) and bracket notation (data["count"]). The one exception is Safari,
where bracket notation is significantly slower than dot notation. This
holds true even for Safari 4 and later using the Nitro JavaScript
engine.
Using local variables is especially important when dealing with
HTMLCollection objects (those returned
from DOM methods such as getElementsByTagName and
properties such as element.childNodes).
Each HTMLCollection object is
actually a live query being run against the DOM document every time a
property is accessed. For example:
var divs = document.getElementsByTagName("div");
for (var i=0; i < divs.length; i++){ //Avoid!
    var div = divs[i];
    process(div);
}
The first line of this code creates a query that returns
every <div> element on
the page and stores that query in divs.
Each time divs has a property accessed
either by name or by index, the DOM actually reexecutes that query against
the entire page; in this code, it occurs each time divs.length or divs[i] is accessed. These property lookups take
longer than the average non-DOM object property or array item lookup. It’s
therefore important to store such values in local variables whenever
possible to avoid the requerying penalty associated with HTMLCollection objects. For example:
var divs = document.getElementsByTagName("div");
for (var i=0, len=divs.length; i < len; i++){ //Better
    var div = divs[i];
    process(div);
}
This example stores the length of the divs
HTMLCollection in a local variable, limiting the number of times
the object is accessed directly. In the previous version of this code,
divs was accessed twice per iteration:
once to retrieve the object in the given position, and once to check the
length. This new version eliminates direct length-checking with each
iteration.
Generally speaking, interacting with DOM objects is more
expensive than interacting with non-DOM objects. Due to DOM behavior,
DOM property lookups typically take longer than non-DOM property lookups.
The HTMLCollection object is the
worst-performing object in the DOM. If you need to repeatedly access
members of an HTMLCollection, it is more efficient to
copy them into an array first.
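Copying a collection into an array might look like the following sketch. The function name toArray is an assumption, and a plain array-like object stands in for a real HTMLCollection so the code can run outside a browser; in a page you would pass the result of a call such as document.getElementsByTagName("div").

```javascript
// Hypothetical helper: copies any array-like object into a true array.
function toArray(collection) {
    var result = [];
    // Read length once, then touch each member of the collection exactly once.
    for (var i = 0, len = collection.length; i < len; i++) {
        result[i] = collection[i];
    }
    return result;
}

// Stand-in for an HTMLCollection (an assumption made for this sketch).
var fakeCollection = { 0: "first", 1: "second", 2: "third", length: 3 };
var items = toArray(fakeCollection);
```

Keep in mind that the array is a snapshot: unlike the live collection, it does not change when the document changes.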
Next to data access, flow control is perhaps the most important aspect of JavaScript relating to performance. JavaScript, as with most programming languages, has a number of flow control statements that determine which part of the code should be executed next. There’s a series of conditional and loop statements that enable developers to precisely control how execution flows from one part of the code to another. Choosing the right option at each point can dramatically affect how fast your script runs.
The classic question of whether to use a switch statement or a series of if and else
statements is not unique to JavaScript and has spurred discussions in
nearly every programming language that has these constructs. The real
issue is not between individual statements, of course, but rather
relates to the speed with which each is able to handle a range of
conditional statements. The details of this section are based on tests
that you can run at http://www.nczonline.net/experiments/javascript/performance/conditional-branching/.
Discussions usually begin surrounding complex if statements such as this:
if (value == 0){
    return result0;
} else if (value == 1){
    return result1;
} else if (value == 2){
    return result2;
} else if (value == 3){
    return result3;
} else if (value == 4){
    return result4;
} else if (value == 5){
    return result5;
} else if (value == 6){
    return result6;
} else if (value == 7){
    return result7;
} else if (value == 8){
    return result8;
} else if (value == 9){
    return result9;
} else {
    return result10;
}
Typically, this type of construct is frowned upon. The major
problem is that the deeper into the statement the execution flows, the
more conditions have to be evaluated. It will take longer to complete
the execution when value is 9 than
if value is 0 because every other
condition must be evaluated beforehand. As the overall number of
conditions increases, so does the performance hit for going deep into
the conditions. While having a large number of if conditions isn’t advisable, there are
steps you can take to increase the overall performance.
The first step is to arrange the conditions in decreasing order
of frequency. Since exiting after the first condition is the fastest
operation, you want to make sure that happens as often as possible.
Suppose the most common case in the previous example is that value will equal 5 and the second most
common is that value will equal 9.
In that case, you know five conditions will be evaluated before
getting to the most common case and nine before getting to the second
most common case; this is incredibly inefficient. Even though the
increasing numeric order of the conditions makes it easier to read, it
should actually be rewritten as follows:
if (value == 5){
    return result5;
} else if (value == 9){
    return result9;
} else if (value == 0){
    return result0;
} else if (value == 1){
    return result1;
} else if (value == 2){
    return result2;
} else if (value == 3){
    return result3;
} else if (value == 4){
    return result4;
} else if (value == 6){
    return result6;
} else if (value == 7){
    return result7;
} else if (value == 8){
    return result8;
} else {
    return result10;
}
Now the two most common conditions appear at the top of the
if statement, ensuring optimal
performance for these cases.
Another way to optimize if
statements is to organize the conditions into a series of branches,
following a binary search algorithm to find the valid condition. This
is advisable in the case where a large number of conditions are
possible and no one or two will occur with a high enough rate to
simply order according to frequency. The goal is to minimize the
number of conditions to be evaluated for as many of the conditions as
possible. If all of the conditions for value in the example will occur with the
same relative frequency, the if
statements can be rewritten as follows:
if (value < 6){
    if (value < 3){
        if (value == 0){
            return result0;
        } else if (value == 1){
            return result1;
        } else {
            return result2;
        }
    } else {
        if (value == 3){
            return result3;
        } else if (value == 4){
            return result4;
        } else {
            return result5;
        }
    }
} else {
    if (value < 8){
        if (value == 6){
            return result6;
        } else {
            return result7;
        }
    } else {
        if (value == 8){
            return result8;
        } else if (value == 9){
            return result9;
        } else {
            return result10;
        }
    }
}
This code ensures that there will never be any more than four
conditions evaluated. Instead of evaluating each condition to find the
right value, the conditions are separated first into a series of
ranges before identifying the actual value. The overall performance of
this example is improved because the cases where eight and nine
conditions need to be evaluated have been removed. The maximum number
of condition evaluations is now four, creating an average savings of
about 30% off the execution time of the previous version. Keep in
mind, also, that an else statement
has no condition to evaluate. However, the problem remains that each
additional condition ends up taking more time to execute, affecting
not only the performance but also the maintainability of this code.
This is where the switch statement
comes in.
The switch statement
simplifies both the appearance and the performance of
multiple conditions. You can rewrite the previous example using a
switch statement as follows:
switch(value){
    case 0:
        return result0;
    case 1:
        return result1;
    case 2:
        return result2;
    case 3:
        return result3;
    case 4:
        return result4;
    case 5:
        return result5;
    case 6:
        return result6;
    case 7:
        return result7;
    case 8:
        return result8;
    case 9:
        return result9;
    default:
        return result10;
}
This code clearly indicates the conditions as well as the return
values in an arguably more readable form. The switch statement has the added benefit of
allowing fall-through conditions, which allow you to specify the same
result for a number of different values without creating complex
nested conditions. The switch
statement is often cited in other programming languages as the
hands-down better option for evaluating multiple conditions. This
isn’t because of the nature of the switch statement, but rather because of how
compilers are able to optimize switch statements for faster evaluation.
Since most JavaScript engines don’t have such optimizations,
performance of the switch statement
is mixed.
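The fall-through behavior mentioned above can be sketched as follows. The function name classify and the labels it returns are hypothetical; the point is that several case labels share one result without any nested conditions.

```javascript
// Hypothetical example of fall-through: several values share one result.
function classify(value) {
    switch (value) {
        case 0:
        case 1:
        case 2:
            return "low";    // reached for 0, 1, and 2
        case 3:
        case 4:
            return "middle"; // reached for 3 and 4
        default:
            return "high";   // everything else
    }
}

var labels = [classify(1), classify(4), classify(9)];
```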
Firefox handles switch
statements very well, with each condition’s evaluation executing in
roughly the same amount of time regardless of the order in which they
are defined. That means the case of value equal to 0 will take roughly the same
amount of time to execute as when value is 9. Other browsers, however, aren’t
nearly as good. Internet Explorer, Opera, Safari, and Chrome all show
noticeable increases in the execution time as you get deeper into the
switch statement. Those increases,
however, are smaller than the increases experienced with each
additional condition of an if
statement. You can therefore improve the performance of switch statements by ordering the conditions
in decreasing rate of frequency (the same as if statement optimization).
In JavaScript, if statements
are generally faster than switch statements when there are just
one or two conditions to be evaluated. When there
are more than two conditions, and the conditions are simple (not
ranges), the switch statement tends
to be faster. This is because the amount of time it takes to execute a
single condition in a switch
statement is often less than it takes to execute a single condition in
an if statement, making the
switch statement optimal only when
there are a larger number of conditions.
There are more than two solutions for dealing with conditionals in
JavaScript. Alongside the if
statement and the switch statement
is a third approach: looking up values in arrays. The example for this
section maps a given number to a specific result, which is exactly
what arrays are for. Instead of writing a large if statement or switch statement, you can use the following
code:
//define the array of results
var results = [result0, result1, result2, result3, result4, result5, result6, result7, result8, result9, result10];
//return the correct result
return results[value];
Instead of using conditional statements, all of the results are
stored in an array whose index maps to the value variable. Retrieving the appropriate
result is simply a matter of array value lookup. Although array lookup
times also increase the deeper into the array you go, the incremental
increase is so small that it is irrelevant relative to the increases
in each condition evaluation for if
and switch statements. This makes
array lookup ideal whenever there are a large number of conditions to
be met, and the conditions can be represented by discrete values such
as numbers or strings (for strings, you can use an Object to store the results rather than an
Array).
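For string conditions, a lookup object might be sketched like this. The names colorCodes and lookupColor, the sample values, and the choice of a fallback result are all assumptions for illustration. The hasOwnProperty check keeps keys inherited from Object.prototype (such as "toString") from being treated as matches.

```javascript
// Hypothetical lookup table keyed by string instead of by numeric index.
var colorCodes = {
    red: "#ff0000",
    green: "#00ff00",
    blue: "#0000ff"
};

function lookupColor(name) {
    // Only accept keys defined on the table itself, not inherited ones.
    if (Object.prototype.hasOwnProperty.call(colorCodes, name)) {
        return colorCodes[name];
    }
    return "#000000"; // assumed default, playing the role of a final else branch
}

var known = lookupColor("green");
var unknown = lookupColor("toString");
```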
It’s not practical to use array lookup for small numbers of results because array lookup is often slower than evaluating a small number of conditions. Array lookups can be very helpful when there are a large number of ranges because they eliminate the need to test both upper and lower bounds; you can simply fill in that range of indexes in the array with the appropriate value and do a straight array lookup.
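Filling ranges of indexes could be sketched as below. The grading scale, the names grades, fillRange, and gradeFor, and the assumption that scores are integers from 0 to 100 are all hypothetical. The table is built once, and each later lookup replaces a pair of upper and lower bound tests.

```javascript
// Hypothetical range table: every integer score from 0 to 100 maps to a grade.
var grades = [];

function fillRange(from, to, value) {
    for (var i = from; i <= to; i++) {
        grades[i] = value;
    }
}

// Built once, up front.
fillRange(0, 59, "F");
fillRange(60, 69, "D");
fillRange(70, 79, "C");
fillRange(80, 89, "B");
fillRange(90, 100, "A");

function gradeFor(score) {
    return grades[score]; // a single array lookup, no bounds comparisons
}

var sampleGrades = [gradeFor(59), gradeFor(60), gradeFor(85), gradeFor(100)];
```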
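The three techniques presented here (the if statement, the switch statement, and array lookup) each have
their uses in optimizing code execution. As described above, the if statement suits one or two conditions, the switch statement suits more than two simple conditions that aren’t ranges, and array lookup suits a large number of conditions that map to discrete values.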
As mentioned in Chapter 1, Understanding Ajax Performance, loops are a frequent source of performance issues in JavaScript, and the way you write a loop can drastically change its execution time. Once again, JavaScript developers don’t get to rely on compiler optimizations that make loops faster regardless of the initial code, so it’s important to understand the various ways to write loops and how they affect performance.
There are four different types of loops in JavaScript. In this
section, we will discuss three of them: the for loop,
the do-while loop, and the while loop. (The
fourth type is a for-in loop that is used to iterate over object
properties; I cover it separately below because its purpose is so
specialized.) The various loop types are coded as follows:
//unoptimized code
var values = [1,2,3,4,5];
//for loop
for (var i=0; i < values.length; i++){
    process(values[i]);
}
//do-while loop
var j=0;
do {
    process(values[j++]);
} while (j < values.length);
//while loop
var k=0;
while (k < values.length){
    process(values[k++]);
}
Each of the loops in this example achieves the same result: all
items in the values array are
passed into the process function.
These are the most common constructs used for iterating over a number
of values in an array. Each of these loops runs in about the same
amount of time because they’re doing roughly the same amount of work.
There are, however, ways to improve the performance.
Perhaps the most glaring issue in each loop is the constant
comparison of the iterator variable against the array length. As
mentioned earlier in this chapter, property lookup is a much more
expensive operation than local variable access. This code is
retrieving the value of values.length every time the loop executes
to see whether the terminal condition has been reached. This is
incredibly inefficient given that the length of the array won’t change
while the loop is being executed. Using a local variable instead of a
property lookup can speed up the loops:
var values = [1,2,3,4,5];
var length = values.length;
//for loop
for (var i=0; i < length; i++){
    process(values[i]);
}
//do-while loop
var j=0;
do {
    process(values[j++]);
} while (j < length);
//while loop
var k=0;
while (k < length){
    process(values[k++]);
}
Each loop now uses the local variable length as its comparison point instead of
values.length,
eliminating a property lookup each time through the loop. This
technique is especially important when dealing with HTMLCollection
objects because, as mentioned previously, every property access on
such an object is actually a query against the DOM for all nodes
matching some criteria. That makes a property lookup on an HTMLCollection very expensive and, when
included in the terminal condition of a loop, adds significant
execution time to the overall loop.
Another simple way to improve the performance of a loop is to decrement the iterator toward 0 rather than incrementing toward the total length. Making this simple change can result in savings of up to 50% off the original execution time, depending on the complexity of each iteration. For example:
var values = [1,2,3,4,5];
var length = values.length;
//for loop
for (var i=length; i--;){
    process(values[i]);
}
//do-while loop
var j=length;
do {
    process(values[--j]);
} while (j);
//while loop
var k=length;
while (k--){
    process(values[k]);
}
Each of these loops is now even faster by virtue of changing the
terminal condition to a comparison against 0 (note that the terminal
condition evaluates to false once
the iterator variable equals 0). The performance of each type of loop
is comparable, so you needn’t worry about choosing among the three
variations for speed purposes.
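One caveat worth sketching: the reversed do-while form assumes the array has at least one item. With an empty array, --j produces -1, which is truthy, so the condition never reaches 0 and the loop does not stop. The wrapper below, with the hypothetical name processInReverse, adds a guard for that case.

```javascript
// Hypothetical wrapper around the reversed do-while loop with an empty-array guard.
function processInReverse(values, process) {
    var j = values.length;

    if (j > 0) { // without this check, an empty array would never hit the 0 exit condition
        do {
            process(values[--j]);
        } while (j);
    }
}

var seen = [];
function record(value) {
    seen.push(value);
}

processInReverse([1, 2, 3], record); // items are visited from last to first
processInReverse([], record);        // no calls are made
```

The reversed for and while forms test their condition before the first iteration, so they handle an empty array without a guard.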
Another variation of the for
loop is the for-in loop, whose purpose is to iterate over
the enumerable properties of a JavaScript object. Typical usage is as
follows:
for (var prop in object){
    if (object.hasOwnProperty(prop)){ //to filter out prototype properties
        process(object[prop]);
    }
}
This code iterates over the properties in a given object,
using the hasOwnProperty
method to ensure that only instance properties are processed.
Because the for-in loop has a specific purpose, there is
little you can do to change its performance. The terminal condition
cannot be altered, and the order of the properties to iterate over
cannot be changed. Further, a for-in
loop is typically much slower than any of the other loops because it
requires resolving every enumerable property on a particular object.
That, in turn, means the object’s prototype and entire prototype chain
must be examined to extract these properties. Traversing the prototype
chain, just like traversing the scope chain, takes time and slows down
the performance of the entire loop.
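To make the prototype chain's involvement concrete, the following sketch uses a hypothetical Person constructor with one instance property and one enumerable prototype property. The for-in loop visits both, and hasOwnProperty separates them.

```javascript
// Hypothetical constructor used only to show what for-in enumerates.
function Person(name) {
    this.name = name;              // instance property
}
Person.prototype.species = "human"; // enumerable property on the prototype

var someone = new Person("Nicholas");
var allProps = [];
var ownProps = [];

for (var prop in someone) {
    allProps.push(prop);            // includes properties found on the prototype chain
    if (someone.hasOwnProperty(prop)) {
        ownProps.push(prop);        // instance properties only
    }
}
```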
If you know the specific properties you’re interested in, it’s
much faster to create a standard loop (for, do-while,
or while) and iterate over an array
of names, such as:
//known properties to iterate over
var props = ["name", "age", "title"];
//while loop
var i=props.length;
while (i--){
    process(object[props[i]]);
}
This loop runs much faster than the for-in
loop, and not simply because of the small number of properties in the
props array. Even increasing the
number of properties over which to iterate would yield significantly
better performance than the for-in
loop. The loop in this example takes advantage of all the normal loop
performance enhancements and still allows iteration over a known set
of object properties.
Naturally, this approach works only when you know the object
properties to iterate over; when dealing with unknown properties, as
with JSON objects, a for-in
loop may still be necessary.
It is a common practice in several programming languages to unroll small loops to improve performance. The basis of this practice is that limiting the number of iterations can mitigate the performance overhead of a loop. The implementation of such a solution is typically called unrolling the loop, which means making each iteration do the work of multiple iterations. Consider the following loop:
var i=values.length;
while (i--){
    process(values[i]);
}
If there are only five items in the values array, it is actually faster to
remove the loop and do the work on each value individually:
//unrolled loop
process(values[0]);
process(values[1]);
process(values[2]);
process(values[3]);
process(values[4]);
Of course, this approach is arguably less maintainable, as it
takes more code to write and any change to the number of items in the
values array requires changes to
the code. Further, the performance gains for such a small number of
statements aren’t worth the maintenance overhead. This technique can
be quite useful, however, when you’re dealing with a large number of
values and a potentially large number of iterations.
Tom Duff, a computer programmer working for Lucasfilm at the time, first proposed a construct for unrolling loops in the C programming language. This pattern became known as Duff’s Device and was later converted to JavaScript by Jeff Greenberg, who also published one of the first comprehensive studies on JavaScript performance optimization (which is still available at http://home.earthlink.net/~kendrasg/info/js_opt/). Greenberg’s Duff’s Device implementation is as follows:
var iterations = Math.ceil(values.length / 8);
var startAt = values.length % 8;
var i = 0;
do {
    switch(startAt){
        case 0: process(values[i++]);
        case 7: process(values[i++]);
        case 6: process(values[i++]);
        case 5: process(values[i++]);
        case 4: process(values[i++]);
        case 3: process(values[i++]);
        case 2: process(values[i++]);
        case 1: process(values[i++]);
    }
    startAt = 0;
} while (--iterations > 0);
The idea behind Duff’s Device is that each trip through the loop
does the work of between one and eight iterations of a normal loop.
This is done by first determining the number of trips through the loop by dividing
the total number of array values by eight and rounding up. Duff found that eight was
an optimal number to use for this processing (it’s not arbitrary).
Since not all array lengths are evenly divisible by eight, you
must also calculate how many items are left over by using the
modulus operator. The startAt
variable, therefore, contains the number of additional items to be
processed. This variable is used only the first time through the loop,
to do the extra work, and then is set back to zero so that each
subsequent trip through the loop results in a full eight items being
processed. Duff’s Device runs faster than a normal loop over a large
number of iterations, but it can be made even faster.
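To make the arithmetic concrete, the following sketch wraps the same construct in a function and records how many items each trip through the loop handles. The names duffProcess and batches are hypothetical, and the early return for empty arrays is an addition: as written above, the do-while loop would run once even when values is empty.

```javascript
// Sketch only: Duff's Device wrapped in a function (duffProcess is a hypothetical name).
function duffProcess(values, process) {
    var batches = [];                 // number of items handled on each trip
    if (values.length === 0) {
        return batches;               // added guard: do-while would otherwise run once
    }
    var iterations = Math.ceil(values.length / 8);
    var startAt = values.length % 8;
    var i = 0;
    var before;
    do {
        before = i;
        switch (startAt) {            // cases fall through on purpose
            case 0: process(values[i++]);
            case 7: process(values[i++]);
            case 6: process(values[i++]);
            case 5: process(values[i++]);
            case 4: process(values[i++]);
            case 3: process(values[i++]);
            case 2: process(values[i++]);
            case 1: process(values[i++]);
        }
        startAt = 0;
        batches.push(i - before);
    } while (--iterations > 0);
    return batches;
}

var seen = [];
var input = [];
for (var n = 0; n < 20; n++) {
    input.push(n);
}
var batches = duffProcess(input, function (value) {
    seen.push(value);
});
// 20 items: Math.ceil(20 / 8) = 3 trips, and 20 % 8 = 4 items on the first trip
console.log(batches); // [ 4, 8, 8 ]
```

The first trip enters the switch at case 4 and falls through to case 1, so it handles the four leftover items; the remaining two trips each handle a full eight.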
The book Speed Up Your Site (New Riders)
introduced a version of Duff’s Device in JavaScript that moves the
processing of the extra array items outside the main loop, allowing
the switch statement to
be removed and resulting in an even faster way of processing a large
number of items:
var iterations = Math.floor(values.length / 8);
var leftover = values.length % 8;
var i = 0;
if (leftover > 0){
    do {
        process(values[i++]);
    } while (--leftover > 0);
}
//a while loop, so the body is skipped when there are fewer than eight items
while (iterations-- > 0){
    process(values[i++]);
    process(values[i++]);
    process(values[i++]);
    process(values[i++]);
    process(values[i++]);
    process(values[i++]);
    process(values[i++]);
    process(values[i++]);
}
This code executes faster over a large number of array items
primarily due to the removal of
the switch statement from the main
loop. As discussed earlier in this chapter, conditionals do have
performance overhead; removing that overhead from the algorithm speeds
up the processing. The separation of processing into two discrete
loops allows this augmentation.
Duff’s Device, and the modified version presented here, is useful primarily with large arrays. For small arrays, the performance gain is minimal compared to standard loops. Therefore, you should attempt to use Duff’s Device only if you notice a performance bottleneck relating to a loop that must process a large number of items.
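Before adopting either version, it is worth measuring whether the loop is actually a bottleneck. The following is a minimal timing sketch, not a rigorous benchmark: timeLoop, plainLoop, and unrolledLoop are hypothetical names, Date.now() offers only millisecond resolution, and the results will differ from engine to engine.

```javascript
// A plain reverse loop, as used earlier in the chapter.
function plainLoop(values, process) {
    var i = values.length;
    while (i--) {
        process(values[i]);
    }
}

// The unrolled variant: leftovers first, then eight items per trip.
function unrolledLoop(values, process) {
    var iterations = Math.floor(values.length / 8);
    var leftover = values.length % 8;
    var i = 0;
    while (leftover-- > 0) {
        process(values[i++]);
    }
    while (iterations-- > 0) {
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
        process(values[i++]);
    }
}

// Runs one loop function over the values and reports elapsed time and a checksum.
function timeLoop(loopFn, values) {
    var total = 0;
    var start = Date.now();
    loopFn(values, function (value) {
        total += value;
    });
    return { elapsed: Date.now() - start, total: total };
}

var sample = [];
for (var k = 0; k < 100003; k++) {
    sample.push(k % 7);
}
var plainResult = timeLoop(plainLoop, sample);
var unrolledResult = timeLoop(unrolledLoop, sample);
console.log("plain: " + plainResult.elapsed + "ms, unrolled: " + unrolledResult.elapsed + "ms");
```

Comparing the two total values is a cheap check that both loops visited every item; only then is the timing comparison meaningful.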
String manipulation is a very common occurrence in JavaScript. There are multiple ways to manipulate strings, whether by using built-in string methods and operators or by intermixing regular expressions and arrays, and each task brings with it specific performance considerations. The exact technique to use for optimal performance is tied directly to the type of manipulation being performed.
Traditionally, string concatenation has been one of the
poorest-performing aspects of JavaScript. Typically, string
concatenation is done using the plus operator (+), such
as in the following:
var text = "Hello";
text += " ";
text += "World!";
Early browsers had no optimization for such operations. Since strings are immutable, that meant creating intermediate strings to contain the concatenation result. This constant creation and destruction of strings behind the scenes led to very poor string concatenation performance.
Having discovered this, developers turned to the JavaScript Array object
for help. One of the Array object’s
methods is join, which concatenates
all items in the array and inserts a given string between the items.
Instead of using the plus operator, each string is added to an array and
the join method is called
when all items have been added. For example:
var buffer = [],
    i = 0;
buffer[i++] = "Hello";
buffer[i++] = " ";
buffer[i++] = "World!";
var text = buffer.join("");
In this code, each string is added into the buffer array. The join method is called after all strings are in
the array, returning the concatenated string and storing it in the
variable text. Adding the items
directly into the appropriate index is slightly faster than calling
push for each value. This technique
proved to be much faster in early browsers than using the plus operator
because no intermediate strings are being created and destroyed.
However, browser string optimizations have changed the string
concatenation picture.
Firefox was the first browser to optimize string concatenation. Beginning with Firefox 1.0, the array technique is actually slower than using the plus operator in all cases. Other browsers have also optimized string concatenation, so Safari, Opera, Chrome, and Internet Explorer 8 also show better performance using the plus operator. Internet Explorer prior to version 8 has no such optimization, so in those versions the array technique is always faster than the plus operator.
This doesn’t necessarily mean browser detection should be used whenever string concatenation is necessary. There are two factors to consider when determining the most appropriate way to concatenate strings: the size of the strings being concatenated and the number of concatenations.
All browsers can comfortably complete the task in less than one millisecond using the plus operator when the size of the strings is relatively small (20 characters or less) and the number of concatenations is also relatively small (1,000 concatenations or less). There is no reason to consider anything other than using the plus operator if this is your situation.
As you increase the number of concatenations for small strings, or the size of the strings with a small number of concatenations, the performance gets significantly worse in Internet Explorer through version 7. Also, as the size of the strings increases, the performance difference between using the plus operator and the array technique decreases in Firefox. As the number of concatenations increases, the difference between the two techniques decreases in Safari as well. The only browsers where the plus operator remains consistently and significantly faster with varying string size and concatenation numbers are Chrome and Opera.
With all of the performance variance across browsers, the technique to use is heavily dependent on the use case as well as on the browsers you’re targeting. If your users largely use Internet Explorer 6 or 7, it may be worth using the array technique all the time because that will affect the largest number of people. The performance decrease of the array technique in other browsers is typically much less than the performance increase gained in Internet Explorer, so try to balance your users’ experience based on their browsers rather than trying to target specific situations and browser versions. In most common cases, however, using the plus operator is preferred.
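One way to keep that decision in a single place is to hide the concatenation strategy behind a small helper, so the rest of the code does not depend on which technique is in use. The sketch below is an illustration only: createBuffer and its useArrayJoin flag are hypothetical names, and how you set the flag (for example, from your own measurements of your audience's browsers) is left open.

```javascript
// Hypothetical helper: builds a string with either the plus operator or array join.
function createBuffer(useArrayJoin) {
    var parts = [];
    var count = 0;
    var text = "";
    return {
        append: function (str) {
            if (useArrayJoin) {
                parts[count++] = str;   // array technique: store now, join later
            } else {
                text += str;            // plus-operator technique
            }
            return this;                // allow chained calls
        },
        toString: function () {
            return useArrayJoin ? parts.join("") : text;
        }
    };
}

var viaPlus = createBuffer(false).append("Hello").append(" ").append("World!").toString();
var viaJoin = createBuffer(true).append("Hello").append(" ").append("World!").toString();
console.log(viaPlus === viaJoin); // true
```

The extra function call on every append has a cost of its own, so a wrapper like this is only worth considering where concatenation counts are large enough for the underlying technique to matter.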
One of the most glaring omissions of JavaScript strings is the lack of a
native trim method used to remove leading and trailing whitespace. The
most common implementation of a trim function is as
follows:
function trim(text){
    return text.replace(/^\s+|\s+$/g, "");
}
This implementation uses a regular expression that matches one or
more whitespace characters at the beginning or end of the string. The
string’s replace method is used
to replace any matches with an empty string. This implementation,
however, has a performance issue based in the regular expression.
The performance impact comes from two aspects of the regular
expression: the pipe operator, indicating that there are two patterns to
match, and the g flag, indicating
that the pattern should be applied globally. Taking this into mind, you
can rewrite the function to be a bit faster by breaking up the regular
expression into two and getting rid of the g flag:
function trim(text){
    return text.replace(/^\s+/, "").replace(/\s+$/, "");
}
Breaking the single replace
method into two calls allows each regular expression to become much
simpler and, therefore, faster. This method is faster than the original,
but you can optimize it even further.
Steven Levithan, after performing research on the fastest way to execute string trimming in JavaScript, arrived at the following function:
function trim(text){
    text = text.replace(/^\s+/, "");
    for (var i = text.length - 1; i >= 0; i--) {
        if (/\S/.test(text.charAt(i))) {
            text = text.substring(0, i + 1);
            break;
        }
    }
    return text;
}
This trim function consistently
performs better than other variations. The source of the speed is
keeping the regular expressions as simple as possible. The first line
removes leading whitespace and then the for loop is used to strip trailing whitespace.
The loop uses a very simple, single-character regular expression that
matches nonwhitespace characters. Working backward from the end of the
string, the loop stops at the first nonwhitespace character it finds,
truncates the string just after that character, and exits. The resulting
function performs faster than the previous versions across all browsers.
For Levithan’s complete analysis, see his post at http://blog.stevenlevithan.com/archives/faster-trim-javascript.
As with string concatenation, the speed of string trimming matters
only if it is performed with enough frequency during execution. The
second trim function in this section
performs fine for smaller strings over the course of a few calls; the
third trim function is significantly
faster when used on longer strings.
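Engines that implement ECMAScript 5 or later provide a native String.prototype.trim method. A minimal sketch, assuming you want to prefer the native method where it exists and fall back to the second function from this section otherwise (safeTrim is a hypothetical name):

```javascript
// Prefer the native method when the engine has one; otherwise use two simple regexes.
var safeTrim = (typeof String.prototype.trim === "function")
    ? function (text) {
        return text.trim();
    }
    : function (text) {
        return text.replace(/^\s+/, "").replace(/\s+$/, "");
    };

console.log("[" + safeTrim("  \t Hello World! \n ") + "]"); // [Hello World!]
```

The feature test runs once, when safeTrim is defined, so individual calls do not pay for the check.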
One of the critical performance issues with JavaScript is that code execution freezes a web page. Because JavaScript is a single-threaded language, only one script can be run at a time per window or tab. This means that all user interaction is necessarily halted while JavaScript code is being executed. This is an important feature of browsers since JavaScript may change the underlying page structure during its execution, with the possibility of nullifying or altering the response to user interaction.
If JavaScript code isn’t carefully crafted, it’s possible to freeze the web page for an extended period of time and ultimately cause the browser to stop responding. Most browsers will detect long-running scripts and notify the user of a problem with a dialog box asking whether the script should be allowed to proceed.
Exactly what causes the browser to display the long-running script dialog varies depending on the vendor:
Internet Explorer monitors the number of statements that have been executed by a script. When a maximum number of statements have been executed, 5 million by default, the long-running script dialog is displayed (as shown in Figure 7.7, “Internet Explorer 7 long-running script dialog”).
Firefox monitors the amount of time a script is taking to execute. When a script takes longer than a preset amount of time, 10 seconds by default, the long-running script dialog is displayed.
Safari also uses the execution time to determine whether a script is long-running. The default timeout is set to five seconds, after which the long-running script dialog is displayed.
Chrome as of version 1.0 has no set limit on how long JavaScript is allowed to run. The process will crash when it has run out of memory.
Opera is the only browser that doesn’t protect against long-running scripts. Scripts are allowed to continue until execution is complete.
If you ever see the long-running script dialog, it’s an indication that the JavaScript code needs to be refactored. Generally speaking, no single continuous script execution should take longer than 100 milliseconds; anything longer than that and the web page will almost certainly appear to be running slowly to the user. Brendan Eich, the creator of JavaScript, is also quoted as saying, “[JavaScript] that executes in whole seconds is probably doing something wrong....”
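A rough way to find code that exceeds this budget is to time a call and flag anything slower than a threshold. The sketch below assumes a hypothetical helper named timed, with a default budget of 100 milliseconds taken from the guideline above. Date.now() is coarse, so treat this as a diagnostic aid and not as a precise profiler.

```javascript
// Hypothetical helper: runs fn, measures wall-clock time, and compares it to a budget.
function timed(label, fn, budget) {
    var limit = (typeof budget === "number") ? budget : 100; // milliseconds
    var start = Date.now();
    var result = fn();
    var elapsed = Date.now() - start;
    return {
        label: label,
        result: result,
        elapsed: elapsed,
        overBudget: elapsed > limit
    };
}

var report = timed("sum", function () {
    var total = 0;
    for (var i = 0; i < 1000; i++) {
        total += i;
    }
    return total;
});
console.log(report.label + " took " + report.elapsed + "ms, over budget: " + report.overBudget);
```

Any call that comes back with overBudget set to true is a candidate for the refactoring and timer techniques described in the rest of this chapter.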
The most common reasons why a script takes too long to execute include:
Too much DOM interaction. DOM manipulation is more expensive than any other JavaScript process. Minimizing DOM interactions significantly cuts the JavaScript runtime. Most browsers update the DOM only after the entire script has finished executing, which slows the perceived responsiveness of the web page to the user.
Too many loops. Any loop that either runs too many times or performs too many operations with each iteration can cause long-running script issues. It helps to separate out functionality whenever possible. The effect is worsened when loops are used to perform DOM manipulations, sometimes causing the browser to completely freeze without ever showing the long-running script dialog.
Too much recursion. JavaScript engines put a limit on the amount of recursion that scripts can use. Rewriting the code to avoid recursion helps ameliorate the issue.
Sometimes simple code refactoring, keeping these issues in mind, can prevent runaway scripts. There may, however, be times when complex processes must necessarily be executed for the web application to function correctly. In that case, the code must be restructured to yield periodically, as explained in the next section.
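As an illustration of the point about recursion, a recursive traversal can usually be rewritten with an explicit stack, so that its depth no longer depends on the engine's call stack limit. The following sketch counts the leaf values in nested arrays; countLeaves and the sample data are hypothetical.

```javascript
// Counts non-array values in arbitrarily nested arrays without recursion.
function countLeaves(root) {
    var stack = [root];   // explicit stack replaces the call stack
    var count = 0;
    var node, i;
    while (stack.length > 0) {
        node = stack.pop();
        if (node instanceof Array) {
            for (i = 0; i < node.length; i++) {
                stack.push(node[i]);
            }
        } else {
            count++;
        }
    }
    return count;
}

console.log(countLeaves([1, [2, 3, [4]], 5])); // 5

// Build a value nested 100,000 arrays deep; a recursive version would be
// likely to exceed the engine's recursion limit on input like this.
var deep = 0;
for (var d = 0; d < 100000; d++) {
    deep = [deep];
}
console.log(countLeaves(deep)); // 1
```

The explicit stack grows on the heap, so the only practical limit is available memory.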
The single-threaded nature of JavaScript means that only one script can be executed in a window or tab at any given point in time. No user interactions can be processed during this time and so it’s necessary to introduce breaks in long-executing JavaScript code. On simple web pages, the breaks occur naturally as the user interacts with the page. In complex web applications, it’s often necessary to insert the breaks yourself. The easiest way to do this is to use a timer.
Timers are created using the setTimeout function,
passing in the function to execute as well as a delay (in milliseconds)
before the function should be executed. When the delay has passed, the
code to execute is placed into a queue. The JavaScript engine uses this
queue to determine what to do next. When a script finishes executing,
the JavaScript engine yields to allow other browser tasks to catch up.
The web page display is typically updated during this time in relation
to changes made via the script. Once the display has been updated, the
JavaScript engine checks for more scripts to run on the queue. If
another script is waiting, it is executed and the process repeats; if
there are no more scripts to execute, the JavaScript engine remains idle
until another script appears in
the queue.
When you create a timer, you’re actually scheduling some code to
be inserted into the JavaScript engine’s queue to be executed later.
That insertion happens after the amount of time specified when calling
setTimeout. In essence, timers push
code execution off into the future, where all long-running script limits
are reset. Consider the following code:
window.onload = function(){
    //Page Load
    //create first timer
    setTimeout(function(){
        //Delayed Script 1
        setTimeout(function(){
            //Delayed Script 2
        }, 100);
        //Delayed Script 1, continued
    }, 100);
};
In this example, a script is run when the page loads. That script
calls setTimeout to create the first
timer. When that timer executes, it calls setTimeout again to create a second timer. The
second delayed script cannot start running, however, until the first has
finished executing and the browser has updated the display. Figure 7.8, “JavaScript code execution with timers” shows the timeline
for this code execution, indicating that no two scripts are run at the
same time.
Timers are the de facto standard for splitting up JavaScript code execution in browsers. Whenever a script is taking too long to complete, look to delay parts of the execution until later.
Note that very small timer delays can also cause the browser to become unresponsive. It’s recommended to never use a delay of zero milliseconds, as this isn’t enough time for all browsers to properly update their display. In general, delays between 50 and 100 milliseconds are appropriate and allow browsers enough idle time to perform necessary display updates.
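The ordering described in this section can be observed directly. In the following sketch (the log array and the delay values are illustrative), the timer's delay expires while the first script is still busy, yet the delayed code cannot start until that script has finished.

```javascript
var log = [];

// Schedule code to run after 50 ms.
setTimeout(function () {
    log.push("delayed script");
    console.log(log.join(" -> "));
    // page script start -> page script end -> delayed script
}, 50);

log.push("page script start");

// Keep the engine busy for roughly 100 ms, longer than the timer delay.
var busyUntil = Date.now() + 100;
while (Date.now() < busyUntil) {
    // intentionally empty: simulates a long-running script
}

log.push("page script end");
```

The busy loop is there only to make the effect visible; in real code it is exactly the kind of work that should be broken up.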
Array processing is one of the most frequent causes of long-running scripts. Typically, this is because processing must be done on each member of the array, and so the execution time increases directly in proportion to the number of items in the array. If the array processing doesn’t have to be executed synchronously, it is a good candidate for splitting up using timers.
In my book, Professional JavaScript for Web Developers, Second Edition (Wrox), I describe a simple function that can be used to split up the processing of arrays using timers:
function chunk(array, process, context){
    setTimeout(function(){
        var item = array.shift();
        process.call(context, item);
        if (array.length > 0){
            setTimeout(arguments.callee, 100);
        }
    }, 100);
}
The chunk function accepts
three arguments: an array of data to process, a function with which to
process each item, and an optional context argument in which the
processing function should be executed (by default, all functions passed
into setTimeout are run in the global
context, so this is equal to window). Processing of the items is done using
timers, and so the code execution yields after each item has been
processed. The next item to process is always at the front of the array
and is removed before being processed. Afterward, a check is made to
determine whether there are any further values left to process. If so, a
new timer is created and the function is called again via arguments.callee. Note
that the chunk function uses the
passed-in array as a “to do” list of items to process, so the array is
empty once execution is complete. You can use the function as follows:
var names = ["Nicholas", "Steve", "Doug", "Bill", "Ben", "Dion"],
    todo = names.concat(); //clone the array
chunk(todo, function(item){
    console.log(item);
});
The code in this simple example outputs each name in the names array to the console (available in Firefox with Firebug
installed, Internet Explorer 8+, Safari 2+, and all versions of Chrome).
The processing function is very short but could easily be replaced with
something more complex. The chunk
function is best used with long arrays where each item requires
significant processing.
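Note that arguments.callee is not available in strict mode code. A variation that avoids it, sketched here under the assumption that you also want to be told when the work is done, gives the timer function a name and reschedules it by that name. The name chunkStrict and the onComplete parameter are hypothetical additions and are not part of the function described above.

```javascript
// Variation on chunk: a named function expression replaces arguments.callee.
function chunkStrict(array, process, context, onComplete) {
    "use strict";
    setTimeout(function next() {
        var item = array.shift();
        process.call(context, item);
        if (array.length > 0) {
            setTimeout(next, 100);          // reschedule by name
        } else if (typeof onComplete === "function") {
            onComplete();                   // hypothetical completion callback
        }
    }, 100);
}

var pending = ["Nicholas", "Steve", "Doug"];
var processed = [];
chunkStrict(pending.concat(), function (item) {
    processed.push(item);
}, null, function () {
    console.log(processed.join(", ")); // Nicholas, Steve, Doug
});
```

As with chunk, the array passed in is consumed, which is why the example passes a clone.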
Another popular pattern is to perform small, sequential parts of a larger operation using timers. Julien Lecomte presented this pattern in his blog post, “Running CPU Intensive JavaScript Computations in a Web Browser”, in which he showed how sorting of a large data set could be achieved using an inefficient algorithm (bubble sort) without incurring a long-running script issue. The following is an adaptation of Lecomte’s code:
function sort(array, onComplete){
    var pos = 0;
    (function(){
        var j, value;
        //start at the last valid index and bubble the smallest value down to pos
        for (j=array.length-1; j > pos; j--){
            if (array[j] < array[j-1]){
                value = array[j];
                array[j] = array[j-1];
                array[j-1] = value;
            }
        }
        pos++;
        if (pos < array.length){
            setTimeout(arguments.callee,10);
        } else {
            onComplete();
        }
    })();
}
The sort function splits up each traversal through the array for sorting,
allowing the browser to continue functioning while this processing
occurs. The inner anonymous function is called immediately to do the
first traversal and then is called subsequently via a timer by passing
arguments.callee into setTimeout. When the array is finally sorted, the onComplete function is called to notify the developer that the data is ready
to be used. The function can be used as follows:
sort(values, function(){
    alert("Done!");
});
When sorting an array with a large number of items, the difference in browser responsiveness is immediately apparent.
The speed with which JavaScript executes is very dependent on how it is written. In this chapter, you learned several ways to speed up JavaScript code execution:
Managing your scope is important, since out-of-scope variables
take longer to access than local variables. Try to avoid constructs
that artificially augment the scope chain, such as the with statement and the catch clause of a try-catch
statement. If an out-of-scope value is being used more than once,
store it in a local variable to minimize the performance
penalty.
The way you store and access data can greatly impact the performance of your script. Literal values and local variables are always the fastest; you incur a performance penalty for accessing array items and object properties. If an array item or object property is used more than once, store it in a local variable to speed up access to the value.
Flow control is also an important determinant of script
execution speed. There are three ways to handle conditionals: the
if statement, the switch statement, and array lookup. The
if statement is best used with a
small number of discrete values or a range of values; the switch statement is best used when there are
between 3 and 10 discrete values to test for; array lookup is most
efficient for a larger number of discrete values.
Loops are frequently found to be bottlenecks in JavaScript. To make a loop the most efficient, reverse the order in which you process the items so that the control condition compares the iterator to zero. This is far faster than comparing a value to a nonzero number and significantly speeds up array processing. If there are a large number of required iterations, you may also want to consider using Duff’s Device to speed up execution.
Be careful when using HTMLCollection objects. Each time a property
is accessed on one of these objects, it requires a query of the DOM
for matching nodes. This is an expensive operation that can be avoided
by accessing HTMLCollection
properties only when necessary and storing frequently used values
(such as the length property) in
local variables.
Common string operations may have unintended performance
implications. String concatenation is much slower in Internet Explorer
than in other browsers, but it’s not worth worrying about unless
you’re dealing with more than 1,000 concatenations. You can optimize
string concatenation in Internet Explorer by using an array to store
the individual strings and then calling join() to merge them together. Trimming
strings may also be expensive, depending on the size of the string.
Make sure to use the most optimal algorithm if trimming is a large
part of your script.
Browsers have limits on how long JavaScript can run, capping either the number of statements or the amount of time the JavaScript engine is allowed to run. You can circumvent these limits, and prevent the browser from displaying a warning about the long-running script, by using timers to split up the amount of work.
[22] All of the research in this chapter is based on experiments run on Firefox versions 3.0 and 3.1 beta 2, Google Chrome 1.0, Internet Explorer versions 7 and 8 beta 2, Safari versions 3.0–3.2, and Opera version 9.62. When the version numbers aren’t mentioned, all tested versions of the browser are relevant.
Copyright © 2009 O'Reilly Media, Inc.