Performance Tools: Appendix - Even Faster Web Sites

by Steve Souders

This excerpt is from Even Faster Web Sites.

Souders and eight expert contributors provide best practices and pragmatic advice for improving your site's performance in three critical categories: JavaScript, in the network, and in the browser.


Like all good engineers, web developers need to build up a set of tools to do a high-quality job. This appendix describes the tools I recommend for analyzing and improving web site performance. The tools are grouped into the following sections:

the section called “Packet Sniffers”

When I sit down to analyze a web site, I start by looking at the HTTP requests for the web page in question. This makes it possible to identify the slow parts of the page. A packet sniffer that is handy and easy to use is the first tool to add to your kit. The tools included in this category are HttpWatch, Firebug Net Panel, AOL Pagetest, VRTA, IBM Page Detailer, Web Inspector Resources Panel, Fiddler, Charles, and Wireshark.

the section called “Web Development Tools”

Page performance isn’t just about load time—JavaScript, CSS, and DOM structure play a significant role, especially for Web 2.0 applications. Web development tools provide inspectors, profilers, and debuggers for analyzing web page behavior. This section includes Firebug, Web Inspector, and IE Developer Toolbar.

the section called “Performance Analyzers”

Performance analyzers evaluate a given web page against a set of performance best practices. As I will explain, they vary a great deal in what they measure. This section describes YSlow, AOL Pagetest, VRTA, and neXpert.

the section called “Miscellaneous”

This section includes a grab bag of tools I use regularly: Hammerhead, Smush.it, Cuzillion, and UA Profiler.

Packet Sniffers

Every web developer working on performance needs to look at how their web pages load, including all the resources within each page. This is done using a packet sniffer. The packet sniffers listed in this section range from tools that give a higher-level view of network traffic, such as HttpWatch, to lower-level tools that expose each packet sent over the network, such as Wireshark. In most of my web performance analysis, I use the higher-level network monitors; they are typically easier to configure and have a user interface that is more visual. In some situations, such as debugging chunked encoding, I drop down into one of the lower-level packet sniffers in order to see the contents and timing of each packet sent over the wire.

HttpWatch

HttpWatch is my preferred packet sniffer. HttpWatch depicts network traffic in a graphical way, as shown in Figure A.1, “HttpWatch”. Most of the HTTP waterfall charts in this book were captured using HttpWatch. This graphical display makes it easy to spot performance delays.

HttpWatch is built by Simtec. You can try out the free download, but it works on only a few major sites, such as Google and Yahoo!. You have to pay for the full-featured version, but it’s money well spent. HttpWatch runs on Microsoft Windows with Internet Explorer and Firefox.

Figure A.1. HttpWatch


Firebug Net Panel

Firebug has many features critical to any web developer and is described more thoroughly in the section called “Web Development Tools”. The Firebug Net Panel, however, deserves mention here. Net Panel displays HTTP waterfall charts, making it an easy alternative for developers who already have Firebug installed. I especially like Net Panel’s use of vertical lines to mark the DOMContentLoaded and onload events in the page load timeline, as shown in Figure A.2, “Firebug Net Panel”. This is a feature that other packet sniffers should adopt.

Figure A.2. Firebug Net Panel


One drawback of Net Panel is that the timing information can be affected by the web page itself. Firebug is implemented in JavaScript and therefore executes in the same Firefox process as the current web page. As a result, if network events happen while JavaScript is executing in the main page, Net Panel is blocked from recording the correct timing information for those requests. Net Panel’s accuracy is sufficient in most situations, and its ease of use makes it a good choice. If you require more precise time measurements or have a page with long-running blocks of JavaScript, you should consider using one of the other packet sniffers mentioned in this section.

An additional constraint is that Firebug is a Firefox add-on, so it isn’t available in other browsers.

AOL Pagetest

AOL Pagetest is an Internet Explorer plug-in that produces HTTP waterfall charts. It also identifies areas for improving performance, as discussed in the section called “Performance Analyzers”.

VRTA

VRTA from Microsoft focuses on improving network performance. Its HTTP waterfall chart is more detailed than other network monitors, putting an emphasis on reusing existing TCP connections. See the section called “Performance Analyzers” for more information about VRTA.

IBM Page Detailer

IBM Page Detailer used to be my preferred packet sniffer, but IBM stopped selling the professional version. The basic version is still available, but it lacks many features that I consider mandatory, such as support for analyzing HTTPS requests and the ability to export data. IBM Page Detailer runs on Microsoft Windows.

I use IBM Page Detailer when analyzing browsers other than Internet Explorer and Firefox, such as Opera and Safari (since these browsers aren’t supported by HttpWatch). IBM Page Detailer can monitor network traffic for any process that uses HTTP. This is enabled by editing the wd_WS2s.ini file and adding the process’s name to the Executable line, like so:

Executable=(FIREFOX.EXE),(OPERA.EXE),(SAFARI.EXE)

There’s an interesting twist that prevents IBM Page Detailer from analyzing Chrome: Chrome has a separate process for the browser UI plus one for each tab. IBM Page Detailer attaches to the browser UI process, and so it doesn’t see any of the HTTP traffic for the actual web pages being loaded. Nevertheless, if support for HTTPS and exporting data isn’t required, IBM Page Detailer is a useful alternative.

Web Inspector Resources Panel

Safari’s Web Inspector, similar to Firebug, is a web development tool that includes a network monitor. See the section called “Web Development Tools” for more information.

Fiddler

The main distinguishing feature of Fiddler, built by Eric Lawrence from the Microsoft Internet Explorer team, is that it supports a scripting capability that allows for setting breakpoints and manipulating HTTP traffic. One downside is that it acts as a proxy, and so it may alter the behavior of the browser (e.g., the number of open connections per server). If you need a scripting capability and are mindful of any side effects of using a proxy, I highly recommend Fiddler. It runs on Microsoft Windows.

Charles

Charles is an HTTP proxy, similar to Fiddler. It has many of the same features as Fiddler, including the ability to analyze both HTTP and HTTPS traffic, and bandwidth throttling. Charles supports Microsoft Windows, Mac OS X, and Linux.

Wireshark

Wireshark evolved from Ethereal. It analyzes HTTP requests at the packet level. Its UI is not as graphical as other network monitors. It also doesn’t have the concept of a “web page,” so it’s up to you to discern where the web page’s packets start and end. If you have to look at traffic at the packet level, such as to analyze chunked encoding, Wireshark is the best choice. It’s available on many platforms, including Microsoft Windows, Mac OS X, and Linux.
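To make the chunked encoding case concrete, here is a minimal sketch of what a chunked HTTP response body looks like once its bytes have been pulled out of a packet capture. Each chunk is a hexadecimal size line, the chunk data, and a trailing CRLF; a zero-size chunk ends the body. The function name and the sample body are hypothetical and are not part of Wireshark, and trailers are ignored for brevity.

```python
def decode_chunked(raw: bytes) -> list:
    """Split a chunked HTTP body into its chunks (trailers are ignored)."""
    chunks = []
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, ended by CRLF.
        line_end = raw.index(b"\r\n", pos)
        size_field = raw[pos:line_end].split(b";", 1)[0]  # drop chunk extensions
        size = int(size_field, 16)
        if size == 0:
            break  # a zero-size chunk marks the end of the body
        start = line_end + 2
        chunks.append(raw[start:start + size])
        pos = start + size + 2  # skip the chunk data and its trailing CRLF
    return chunks


# Hypothetical capture: an early flush of the page head, then the rest.
body = b"6\r\n<html>\r\n1a\r\n<body>Hello, world!</body>\r\n0\r\n\r\n"
for chunk in decode_chunked(body):
    print(len(chunk), chunk)
```

Seeing where one chunk ends and the next begins, together with the packet timestamps, is the kind of detail the higher-level network monitors usually hide.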

Web Development Tools

Packet sniffers show the network activity while a page is loading, but there’s more to a web page’s performance than just HTTP requests. Chapters 1 and 2 discuss how JavaScript and modifications to the DOM can slow a page down. The web development tools presented in this section—Firebug, Web Inspector, and IE Developer Toolbar—include features such as DOM inspectors, JavaScript debuggers and profilers, CSS editors, and network monitors.

These tools are the tip of the iceberg. More extensive tools are needed to give developers visibility into memory consumption, CPU load, JavaScript execution, CSS application, and HTML parsing and rendering over the entire page load timeline. And this analysis is needed without altering normal browser behavior.

Firebug

Firebug is the most popular web development tool, with more than 14 million (yes, million!) downloads. It was created by Joe Hewitt in January 2006. It includes inspectors for HTML, CSS, DOM, and layout. Firebug’s Net Panel, discussed in the section called “Packet Sniffers”, provides an HTTP waterfall chart of network activity. Firebug also has a JavaScript command line and console, as well as a JavaScript debugger and profiler. The debugger and profiler are Firebug’s strongest features.

Firebug is an add-on to Firefox. Although porting the JavaScript debugging and profiling functionality to other browsers would be a tremendous undertaking, many of Firebug’s other features are available across browsers by virtue of Firebug Lite. Firebug Lite is a bookmarklet, and therefore it works in all the major browsers. It had a major upgrade by Azer Koçulu and now includes inspectors for HTML, DOM, and CSS, as well as a JavaScript command line and console. Providing a common UI across all browsers and a fairly complete set of features, Firebug Lite is the perfect recipe for solving nasty browser incompatibility bugs.

Developers love Firebug because of their ability to extend it. This open extension model makes it possible to add on to Firebug’s features in a way that also allows for that new functionality to be shared with other developers. You can find useful Firebug extensions at http://getfirebug.com/extensions/index.html.

Web Inspector

Safari’s Web Inspector had a significant upgrade at the end of 2008. The Resources Panel, mentioned previously, is shown in Figure A.3, “Safari Web Inspector”. Web Inspector’s functionality is similar to Firebug. It has a console with autocompletion, a DOM and CSS inspector, and a JavaScript debugger and profiler.

Figure A.3. Safari Web Inspector


IE Developer Toolbar

The Internet Explorer Developer Toolbar has a feature set similar to Firebug Lite. It doesn’t have JavaScript debugging or profiling, but it does support validating HTML and CSS, DOM inspection, and pixel layout tools. The IE Developer Toolbar is targeted at Internet Explorer 6 and 7. The functionality has been built into Internet Explorer 8 under the Developer Tools menu item.

Performance Analyzers

YSlow was the first widely used performance “lint” tool. AOL Pagetest, VRTA, and neXpert were released subsequently. Each of these tools has its own set of performance best practices. I’ve aggregated all of these best practices in Table A.1, “Performance best practices”, with an indication of which rules are evaluated by each particular tool. I’ve grouped the best practices into three categories:

  • The rules included in High Performance Web Sites

  • The best practices described in this book

  • Other rules that I haven’t addressed but that are incorporated in at least one of these tools

Looking at Table A.1, “Performance best practices”, it’s clear that there is little overlap in the best practices espoused by each tool. In one sense, this is good—bringing in different perspectives on the performance problem leads to the discovery of new best practices. But this diversity has a more important and unfavorable impact: confusion and fragmentation in the web development community. It’s unclear which set of best practices is best. The choice of tool might be dictated by development environment rather than by the content of the performance analysis.
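As a rough way to quantify that overlap, the X marks in Table A.1 can be tallied. The sketch below transcribes the table by hand (practices that no tool checks are left out) and counts how many practices are checked by one, two, three, or all four tools. The variable names are illustrative only.

```python
from collections import Counter

# Tools that check each practice, transcribed from the X marks in Table A.1.
# Practices that no tool checks are omitted.
CHECKED_BY = {
    "Combine JavaScript and CSS": {"YSlow", "Pagetest"},
    "Use CSS sprites": {"YSlow", "VRTA"},
    "Use a CDN": {"YSlow", "Pagetest"},
    "Set Expires in the future": {"YSlow", "Pagetest", "VRTA", "neXpert"},
    "Gzip text responses": {"YSlow", "Pagetest", "VRTA", "neXpert"},
    "Put CSS at the top": {"YSlow"},
    "Put JavaScript at the bottom": {"YSlow"},
    "Avoid CSS expressions": {"YSlow"},
    "Make JavaScript and CSS external": {"YSlow"},
    "Reduce DNS lookups": {"YSlow"},
    "Minify JavaScript": {"YSlow", "Pagetest"},
    "Avoid redirects": {"YSlow", "VRTA", "neXpert"},
    "Remove dupe scripts": {"YSlow"},
    "Remove ETags": {"YSlow", "Pagetest", "neXpert"},
    "Load scripts asynchronously": {"VRTA"},
    "Optimize images": {"Pagetest"},
    "Shard domains": {"VRTA"},
    "Simplify CSS selectors": {"VRTA"},
    "Use persistent connections": {"Pagetest", "VRTA", "neXpert"},
    "Reduce cookies": {"Pagetest", "neXpert"},
    "Avoid network congestion": {"VRTA"},
    "Increase MTU, TCP window": {"VRTA"},
    "Avoid server congestion": {"VRTA"},
}

# How many practices are checked by exactly 1, 2, 3, or 4 tools?
coverage = Counter(len(tools) for tools in CHECKED_BY.values())
shared_by_all = sorted(p for p, tools in CHECKED_BY.items() if len(tools) == 4)

for tool_count in sorted(coverage):
    print(f"checked by {tool_count} tool(s): {coverage[tool_count]} practices")
print("checked by all four:", shared_by_all)
```

With this transcription, only two practices (setting Expires and gzipping text) are checked by all four tools, while thirteen are checked by a single tool.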

Across the developers of these tools, there is more agreement on performance best practices than is reflected in Table A.1, “Performance best practices”. The inconsistencies arise for several reasons. There’s a desire to introduce new best practices and to focus less on covering what has already been covered somewhere else. Development time is always an issue; developers may decide to skip the implementation of well-known best practices. Don’t underestimate the impact of personal interests; for instance, it’s clear that the developers of VRTA have more interest in and familiarity with networking issues than I do.

Table A.1. Performance best practices

(X = the tool checks this practice; - = it does not)

Best practice                       YSlow  Pagetest  VRTA  neXpert

High Performance Web Sites
  Combine JavaScript and CSS        X      X         -     -
  Use CSS sprites                   X      -         X     -
  Use a CDN                         X      X         -     -
  Set Expires in the future         X      X         X     X
  Gzip text responses               X      X         X     X
  Put CSS at the top                X      -         -     -
  Put JavaScript at the bottom      X      -         -     -
  Avoid CSS expressions             X      -         -     -
  Make JavaScript and CSS external  X      -         -     -
  Reduce DNS lookups                X      -         -     -
  Minify JavaScript                 X      X         -     -
  Avoid redirects                   X      -         X     X
  Remove dupe scripts               X      -         -     -
  Remove ETags                      X      X         -     X

Even Faster Web Sites
  Don’t block the UI thread         -      -         -     -
  Split JavaScript payload          -      -         -     -
  Load scripts asynchronously       -      -         X     -
  Inline scripts before stylesheet  -      -         -     -
  Write efficient JavaScript        -      -         -     -
  Minimize uncompressed size        -      -         -     -
  Optimize images                   -      X         -     -
  Shard domains                     -      -         X     -
  Flush the document early          -      -         -     -
  Avoid iframes                     -      -         -     -
  Simplify CSS selectors            -      -         X     -

Other
  Use persistent connections        -      X         X     X
  Reduce cookies                    -      X         -     X
  Avoid network congestion          -      -         X     -
  Increase MTU, TCP window          -      -         X     -
  Avoid server congestion           -      -         X     -

Moving forward, web developers would be well served if it became possible for these and other tools to share a common set of performance best practices. I fully expect this will happen. These tools were created in the spirit of evangelizing a faster web experience for all users and to help developers easily identify where they can make the greatest improvement to their site’s speed. In that spirit, it makes sense to give developers tools that are more consistent regardless of their platform and tool of choice.

That’s the future. For now, the following sections provide descriptions of YSlow, AOL Pagetest, VRTA, and neXpert, as they exist today.

YSlow

I created YSlow while working at Yahoo!. It existed first as a bookmarklet, and then as a Greasemonkey script. Joe Hewitt was kind enough to explain how to port YSlow to be a Firebug extension. Swapnil Shinde did a lot of the coding to get it to work with Firebug. The motivation I gave Swapnil was that I was certain YSlow would be used by as many as 10,000 people. YSlow was released in July 2007 and crossed the 1 million download mark a year and a half later. The name is a play on the question “whY is this page Slow?”

YSlow contains the following rules, which are echoed as chapters in High Performance Web Sites. When YSlow was released, I also posted summaries of each rule at http://developer.yahoo.com/performance/rules.html. The folks at Yahoo! have since updated that page to include 34 rules! Here are the original 13 rules that are still the basis for YSlow’s performance analysis:

  • Rule 1: Make Fewer HTTP Requests

  • Rule 2: Use a Content Delivery Network

  • Rule 3: Add an Expires Header

  • Rule 4: Gzip Components

  • Rule 5: Put Stylesheets at the Top

  • Rule 6: Put Scripts at the Bottom

  • Rule 7: Avoid CSS Expressions

  • Rule 8: Make JavaScript and CSS External

  • Rule 9: Reduce DNS Lookups

  • Rule 10: Minify JavaScript

  • Rule 11: Avoid Redirects

  • Rule 12: Remove Duplicate Scripts

  • Rule 13: Configure ETags

YSlow, as an extension to Firebug, is available only within Firefox. It generates a score for each rule and an overall score based on a weighted average of the individual rule scores. It also displays a list of all the resources used in the page as well as overall statistics (number of requests, total page weight, etc.). It has other useful tools, including integration with JSLint and output of all the CSS and JavaScript into a single browser window for easy searching.
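The idea of an overall score computed as a weighted average of per-rule scores can be sketched as follows. This is only an illustration of the arithmetic: the function name, the 0 to 100 scale, and the scores and weights are all made up here and are not YSlow's actual values.

```python
def overall_score(rule_scores: dict, weights: dict) -> float:
    """Weighted average of per-rule scores (each assumed to be on a 0-100 scale)."""
    total_weight = sum(weights[rule] for rule in rule_scores)
    weighted_sum = sum(score * weights[rule] for rule, score in rule_scores.items())
    return weighted_sum / total_weight


# Hypothetical scores and weights for three of the rules; not YSlow's real values.
scores = {
    "Make Fewer HTTP Requests": 60.0,
    "Gzip Components": 100.0,
    "Configure ETags": 80.0,
}
weights = {
    "Make Fewer HTTP Requests": 8.0,
    "Gzip Components": 8.0,
    "Configure ETags": 4.0,
}
print(overall_score(scores, weights))
```

Because the weights differ, a poor score on a heavily weighted rule drags the overall score down more than the same score on a lightly weighted one.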

AOL Pagetest

AOL Pagetest and its web-based counterpart, WebPagetest, analyze web pages using these best practices:

  • Enable browser caching of static assets

  • Use one CDN for all static assets

  • Combine static CSS and JavaScript files

  • Gzip-encode all appropriate text assets

  • Compress images

  • Use persistent connections

  • Proper cookie usage

  • Minify JavaScript

  • No ETag headers

AOL Pagetest is a plug-in for Internet Explorer. WebPagetest is accessible through any browser; it runs Internet Explorer on the backend server. In addition to performance analysis, both provide an HTTP waterfall chart, screenshots, page load times, and summary statistics.

The deployment of this functionality via the WebPagetest web site is intriguing. WebPagetest is fairly popular, but it hasn’t gotten the wide adoption it deserves. It lets you analyze any web site from any browser, without the hassle of downloading, installing, and configuring an application or plug-in. It does this by running AOL Pagetest in Internet Explorer on the WebPagetest site’s backend servers. WebPagetest users, from any browser, simply enter the URL of the site they want to analyze into the web-based form, and the results are presented a minute or so later. Figure A.4, “WebPagetest” shows the results for http://www.aol.com/.

Making WebPagetest available through a web page form makes it easy to use for everyone, including nondevelopers, but it does have some limitations. It’s important to remember that the results are always generated using Internet Explorer running in WebPagetest’s remote location. This can be confusing. Notice in Figure A.4, “WebPagetest” that I’m using Firefox; remembering that these results were produced using Internet Explorer is a challenge. Similarly, the results do not necessarily reflect your local conditions. If you’re trying to debug a problem with your current Internet connection, or you’re loading a page that depends on your current cookies, that can’t be captured by WebPagetest. AOL Pagetest (the downloaded, locally installed Internet Explorer plug-in) or one of the other packet sniffers mentioned in the previous section is the better choice for analyzing your current browsing experience.

Figure A.4. WebPagetest


VRTA

VRTA from Microsoft is short for Visual Round Trip Analyzer. It displays HTTP waterfall charts, but these are more detailed than those found in other tools. VRTA focuses on network optimization. One key aspect of this is reusing existing TCP connections. In most HTTP waterfall charts, each HTTP request is a separate horizontal bar. Instead, VRTA represents each TCP connection as a horizontal bar. This makes it easy to see how well TCP connections are being utilized. VRTA also shows a bit rate histogram, to show how well the available bandwidth is utilized.
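The per-connection view can be approximated from any request log that records which TCP connection carried each request. The sketch below is not VRTA's implementation; the connection ids, paths, and function name are hypothetical. It groups requests by connection and lists the connections that were never reused.

```python
from collections import defaultdict

# Hypothetical capture: (connection id, URL path) for each request, in order.
requests = [
    (1, "/index.html"),
    (1, "/main.css"),
    (2, "/app.js"),
    (1, "/logo.png"),
    (3, "/ad.js"),
]


def requests_per_connection(log):
    """Group request paths by the TCP connection that carried them."""
    by_connection = defaultdict(list)
    for connection_id, path in log:
        by_connection[connection_id].append(path)
    return dict(by_connection)


grouped = requests_per_connection(requests)
# Connections that carried only one request were opened but never reused.
single_use = [conn for conn, paths in grouped.items() if len(paths) == 1]

for conn, paths in grouped.items():
    print(f"connection {conn}: {len(paths)} request(s) {paths}")
print("never reused:", single_use)
```

A request-per-bar waterfall would show five bars for this log; a connection-per-bar view shows three, two of which carried a single request.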

In addition to its sophisticated network charts, VRTA evaluates the page download information against the following set of performance best practices:

  • Open enough ports

  • Limit the number of small files to be downloaded

  • Load JavaScript files outside of the JavaScript engine

  • Turn on keepalives

  • Identify network congestion

  • Increase network maximum transmission unit (MTU) or TCP window size

  • Identify server congestion

  • Check for unnecessary round trips

  • Set expiration dates

  • Think before you redirect

  • Use compression

  • Edit your CSS

neXpert

neXpert is also from Microsoft. It’s an add-on to Fiddler (see the section called “Packet Sniffers” for more information about Fiddler). It uses Fiddler to gather information about the resources downloaded for a web page. neXpert analyzes this information against a set of performance best practices and produces a report of suggested improvements. neXpert goes further than other performance analyzers in that it predicts the impact these improvements might have on the web page’s load time. The list of performance best practices analyzed by neXpert includes the following:

  • HTTP response codes

  • Compression

  • ETags

  • Cache headers

  • Connection header

  • Cookies

Miscellaneous

The tools in this section address specific web performance areas not covered in the previous sections. I use all of these tools on a regular, if not daily, basis.

Hammerhead

Improving web performance requires measuring page load times. Although this sounds simple, in reality it’s extremely difficult to gather load time measurements in an accurate and statistically sound way that is representative of real-world users. There’s no single solution. Instead, multiple techniques are required, including measuring real-world traffic, bucket testing, and scripted or synthetic testing. The problem is that all of these techniques are costly, in terms of both dollars and calendar time.

I created Hammerhead to make it easier for developers to measure load times early in the development process. Hammerhead is an extension to Firebug. To test, or “hammer,” a set of web pages, enter the URLs into Hammerhead, along with the number of measurements desired. Figure A.5, “Hammerhead” shows an example.

Figure A.5. Hammerhead


Hammerhead loads each URL the specified number of times and records each measurement, as well as the average and median load times. The pages are loaded with both an empty and a primed cache (Hammerhead manages the cache for you). Although Hammerhead measurements are gathered under just one set of test conditions (your development environment), they provide a quick and easy way to compare two or more web page alternatives.
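The summary statistics described above are simple to reproduce for any list of measurements. The following sketch is not Hammerhead's code; the function name and the load times are invented for illustration. It computes the average and median for an empty-cache and a primed-cache run.

```python
from statistics import mean, median


def summarize(load_times_ms):
    """Return the average and median of a list of load times in milliseconds."""
    return {"average": mean(load_times_ms), "median": median(load_times_ms)}


# Hypothetical measurements for one URL; the 2900 ms run is an outlier.
empty_cache = [1200, 1250, 1180, 2900, 1220]
primed_cache = [400, 420, 390, 410, 405]

print("empty cache: ", summarize(empty_cache))
print("primed cache:", summarize(primed_cache))
```

In the empty-cache sample, the single slow run pulls the average (1550 ms) well above the median (1220 ms), which is one reason to record both.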

Smush.it

Smush.it is a service for analyzing and optimizing images in your web page. It was created by Stoyan Stefanov and Nicole Sullivan, the authors of Chapter 10, Optimizing Images. Smush.it tells you how many bytes you can save by optimizing your images, as shown in Figure A.6, “Smush.it”. It even produces the optimized images for you as a single ZIP file for easy download. There is also a Smush.it bookmarklet and Firefox extension, so you can get similar functionality inside the browser.

Figure A.6. Smush.it


Cuzillion

Almost every day I wonder about or am asked about a performance edge case. Do external scripts load in parallel if there’s an inline script in between them? What if there’s an inline script and a stylesheet in between them? Is the behavior the same on Firefox 3.1 and Chrome 2.0?

Instead of writing a new HTML page for each edge case that comes up, I use Cuzillion, shown in Figure A.7, “Cuzillion”. It has a graphical web page “avatar” onto which you can drag-and-drop different types of resources (external scripts, inline scripts, stylesheets, inline style blocks, images, and iframes). Clicking on a resource exposes a variety of configuration settings such as the domain used for loading the resource and how long it takes to respond.

Figure A.7. Cuzillion


I created Cuzillion while I was working on Chapter 4, Loading Scripts Without Blocking. I needed to run hundreds of test cases. Creating a test framework made this possible in a fraction of the time. The name comes from the tag line: “‘cuz there are a zillion pages to check.”

UA Profiler

When Google released Chrome, Dion Almaer (coauthor of Chapter 2, Creating Responsive Web Applications) asked whether I was going to review it from a performance perspective. Rather than put Chrome through its paces manually, I created a set of HTML pages, each of which contains a specific test: are scripts loaded in parallel, do prefetch links work, and so forth. I then chained those pages together so that the tests would all run automatically.

UA Profiler, shown in Figure A.8, “UA Profiler”, is this set of browser performance tests. In addition to providing a performance test suite for browsers, UA Profiler is also a repository for gathering test results to share with the larger web community. Anyone can point any web client (as long as it supports JavaScript) at UA Profiler and contribute another data point to the results database. By allowing the community to execute the tests, I avoid the cost of running a regression test lab, and also get results under a wider variety of test conditions.

Figure A.8. UA Profiler


For web developers, UA Profiler is useful for confirming how a given browser handles a specific optimization. For example, if you’re adding future caching headers to a redirect but it still doesn’t seem to be cached, you can check UA Profiler to make sure you’re using a browser that supports redirect caching.
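For reference, "future caching headers on a redirect" might look like the sketch below, which builds the response headers for a redirect with a far-future expiry. The function name, the one-year lifetime, and the example URL are assumptions made for illustration, and the headers are assumed to accompany a 301 response; whether a given browser actually caches such a redirect is exactly what a tool like UA Profiler helps confirm.

```python
import time
from email.utils import formatdate
from typing import Optional


def cacheable_redirect_headers(location: str,
                               max_age_days: int = 365,
                               now: Optional[float] = None) -> dict:
    """Build headers for a redirect response that carries a far-future expiry."""
    if now is None:
        now = time.time()
    max_age = max_age_days * 24 * 60 * 60  # lifetime in seconds
    return {
        "Location": location,
        # HTTP dates are expressed in GMT.
        "Expires": formatdate(now + max_age, usegmt=True),
        "Cache-Control": f"max-age={max_age}",
    }


# Hypothetical target URL.
for name, value in cacheable_redirect_headers("http://www.example.com/new-page").items():
    print(f"{name}: {value}")
```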
