 Published on Web DevCenter (http://www.oreillynet.com/javascript/)


JSON: Appendix E - JavaScript: The Good Parts

by Douglas Crockford

Farewell: the leisure and the fearful time Cuts off the ceremonious vows of love And ample interchange of sweet discourse, Which so long sunder'd friends should dwell upon: God give us leisure for these rites of love! Once more, adieu: be valiant, and speed well!

--William Shakespeare, The Tragedy of Richard the Third

This excerpt is from JavaScript: The Good Parts. This authoritative book scrapes away the language's bad features to reveal a subset of JavaScript that is more reliable, readable, and maintainable than the language as a whole, a subset you can use to create truly extensible and efficient code.


JavaScript Object Notation (JSON) is a lightweight data interchange format. It is based on JavaScript's object literal notation, one of JavaScript's best parts. Even though it is a subset of JavaScript, it is language independent. It can be used to exchange data between programs written in all modern programming languages. It is a text format, so it is readable by humans and machines. It is easy to implement and easy to use. There is a lot of material about JSON at http://www.JSON.org/.

JSON Syntax

JSON has six kinds of values: objects, arrays, strings, numbers, booleans (true and false), and the special value null. Whitespace (spaces, tabs, carriage returns, and newline characters) may be inserted before or after any value. This can make JSON texts easier for humans to read. Whitespace may be omitted to reduce transmission or storage costs.
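As a minimal sketch of the six kinds of values and the whitespace rule, the following assumes an environment that provides the standard JSON.parse and JSON.stringify functions. The sample texts and variable names are made up for illustration.

```javascript
// One sample text for each of the six kinds of JSON value.
var samples = {
    object:  '{"a": 1}',
    array:   '[1, 2, 3]',
    string:  '"hello"',
    number:  '-12.5e1',
    boolean: 'true',
    nil:     'null'
};

var parsed = {};
var kind;
for (kind in samples) {
    if (Object.prototype.hasOwnProperty.call(samples, kind)) {
        parsed[kind] = JSON.parse(samples[kind]);
    }
}

// Whitespace before or after any value does not change the result.
var compact = JSON.parse('{"a":[1,2]}');
var spaced = JSON.parse(' {\n\t"a" : [ 1 ,\r\n 2 ]\n} ');
var sameResult = JSON.stringify(compact) === JSON.stringify(spaced);   // true
```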

A JSON object is an unordered container of name/value pairs. A name can be any string. A value can be any JSON value, including arrays and objects. JSON objects can be nested to any depth, but generally it is most effective to keep them relatively flat. Most languages have a feature that maps easily to JSON objects, such as an object, struct, record, dictionary, hash table, property list, or associative array.

The JSON array is an ordered sequence of values. A value can be any JSON value, including arrays and objects. Most languages have a feature that maps easily onto JSON arrays, such as an array, vector, list, or sequence.

A JSON string is wrapped in double quotes. The \ character is used for escaping. JSON allows the / character to be escaped so that JSON can be embedded in HTML <script> tags. HTML does not allow the sequence </ except to start the </script> tag. JSON allows <\/, which produces the same result but does not confuse HTML.
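A short sketch of the escaping rules, assuming the standard JSON.parse and JSON.stringify functions are available. The replace step that produces an embeddable text is one possible approach, not something the format itself prescribes.

```javascript
// In JavaScript source, '\\/' denotes the two characters \ and /.
var escaped = JSON.parse('"<\\/script>"');
var plain = JSON.parse('"</script>"');
var slashSame = escaped === plain;       // true: both are </script>

// Other escapes use the same \ prefix.
var tabbed = JSON.parse('"a\\tb"');      // a, tab character, b
var unicode = JSON.parse('"\\u0041"');   // "A"

// One way to make a JSON text safe to embed in a <script> element:
// escape every / that follows a <.
var embeddable = JSON.stringify('</script>').replace(/<\//g, '<\\/');
var roundTrip = JSON.parse(embeddable);  // still </script>
```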

JSON numbers are like JavaScript numbers. A leading zero is not allowed on integers because some languages use that to indicate octal. That kind of radix confusion is not desirable in a data interchange format. A number can be an integer, real, or scientific.
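The number rules can be checked with a small sketch. It assumes the standard JSON.parse function; tryParse is a hypothetical helper that returns the error name instead of throwing.

```javascript
// Hypothetical helper: return the parsed value, or the name of the error.
var tryParse = function (text) {
    try {
        return JSON.parse(text);
    } catch (e) {
        return e.name;
    }
};

var integer = tryParse('42');          // 42
var real = tryParse('-0.5');           // -0.5
var scientific = tryParse('1.5e3');    // 1500
var leadingZero = tryParse('0123');    // 'SyntaxError': no octal-looking integers
var hex = tryParse('0x1F');            // 'SyntaxError': decimal is the only radix
```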

That's it. That is all of JSON. JSON's design goals were to be minimal, portable, textual, and a subset of JavaScript. The less we need to agree on in order to interoperate, the more easily we can interoperate.

[
    {
        "first": "Jerome",
        "middle": "Lester",
        "last": "Howard",
        "nick-name": "Curly",
        "born": 1903,
        "died": 1952,
        "quote": "nyuk-nyuk-nyuk!"
    },
    {
        "first": "Harry",
        "middle": "Moses",
        "last": "Howard",
        "nick-name": "Moe",
        "born": 1897,
        "died": 1975,
        "quote": "Why, you!"
    },
    {
        "first": "Louis",
        "last": "Feinberg",
        "nick-name": "Larry",
        "born": 1902,
        "died": 1975,
        "quote": "I'm sorry. Moe, it was an accident!"
    }
]
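Once a text like the one above is parsed, it becomes an ordinary array of objects. The sketch below assumes the standard JSON.parse function and repeats the example text with the quote members left out for brevity. Note that a name such as "nick-name" is not a legal JavaScript identifier, so it must be reached with bracket notation.

```javascript
var stoogesText = '[' +
    '{"first": "Jerome", "middle": "Lester", "last": "Howard", ' +
        '"nick-name": "Curly", "born": 1903, "died": 1952},' +
    '{"first": "Harry", "middle": "Moses", "last": "Howard", ' +
        '"nick-name": "Moe", "born": 1897, "died": 1975},' +
    '{"first": "Louis", "last": "Feinberg", ' +
        '"nick-name": "Larry", "born": 1902, "died": 1975}' +
    ']';

var stooges = JSON.parse(stoogesText);

// Bracket notation is required because of the hyphen in the name.
var nick = stooges[0]['nick-name'];                  // "Curly"

// Members may simply be absent; the third object has no "middle".
var hasMiddle = stooges[2].hasOwnProperty('middle'); // false

// Numbers arrive as JavaScript numbers, ready for arithmetic.
var ages = [];
var i;
for (i = 0; i < stooges.length; i += 1) {
    ages.push(stooges[i].died - stooges[i].born);
}
// ages is [49, 78, 73]
```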

Using JSON Securely

JSON is particularly easy to use in web applications because JSON is JavaScript. A JSON text can be turned into a useful data structure with the eval function:

var myData = eval('(' + myJSONText + ')');

(The concatenation of the parentheses around the JSON text is a workaround for an ambiguity in JavaScript's grammar.)
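The ambiguity is that a { at the start of a statement begins a block, not an object literal. A minimal sketch of the difference, with made-up variable names:

```javascript
var emptyText = '{}';

// Without parentheses, {} is read as an empty block statement.
var asBlock = eval(emptyText);                // undefined
// With parentheses, it is read as an expression: an object literal.
var asObject = eval('(' + emptyText + ')');   // an empty object

var pairText = '{"a": 1, "b": 2}';
var blockError;
try {
    // Read as a block, "a": 1 is not a valid statement.
    eval(pairText);
} catch (e) {
    blockError = e.name;                      // 'SyntaxError'
}
var pairObject = eval('(' + pairText + ')');  // an object with a and b
```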

The eval function has horrendous security problems, however. Is it safe to use eval to parse a JSON text? Currently, the best technique for obtaining data from a server in a web browser is through XMLHttpRequest. XMLHttpRequest can obtain data only from the same server that produced the HTML. eval-ing text from that server is no less secure than the original HTML. But that assumes the server is malicious. What if the server is simply incompetent?

An incompetent server might not do the JSON encoding correctly. If it builds JSON texts by slapping together some strings rather than using a proper JSON encoder, then it could unintentionally send dangerous material. If it acts as a proxy and simply passes JSON text through without determining whether it is well formed, then it could send dangerous material again.

The danger can be avoided by using the JSON.parse method instead of eval (see http://www.JSON.org/json2.js). JSON.parse will throw an exception if the text contains anything dangerous. It is recommended that you always use JSON.parse instead of eval to defend against server incompetence. It is also good practice for the day when the browser provides safe data access to other servers.
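To illustrate the difference, here is a sketch with a made-up text that is not well-formed JSON because it contains a function call. It assumes an environment with the standard JSON.parse function; sideEffectRan stands in for whatever damage real injected code might do.

```javascript
var sideEffectRan = false;

// Hypothetical text from a careless server: the value is executable code.
var suspectText =
    '{"total": (function () { sideEffectRan = true; return 1; }())}';

// JSON.parse refuses the text without running any of it.
var parseRejected = false;
try {
    JSON.parse(suspectText);
} catch (e) {
    parseRejected = e.name === 'SyntaxError';
}
var ranAfterParse = sideEffectRan;    // false

// eval accepts the text and runs the embedded function.
var evaluated = eval('(' + suspectText + ')');
var ranAfterEval = sideEffectRan;     // true
```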

There is another danger in the interaction between external data and innerHTML. A common Ajax pattern is for the server to send an HTML text fragment that gets assigned to the innerHTML property of an HTML element. This is a very bad practice. If the HTML text contains a <script> tag or its equivalent, then an evil script will run. This again could be due to server incompetence.
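One common mitigation, offered here as an assumption and not as something this text prescribes, is to treat server data as plain text and never as markup. The escapeHTML helper below is a hypothetical sketch of that idea, and stealSecrets is a made-up name standing in for an evil script.

```javascript
// Hypothetical helper: replace the characters that HTML treats specially.
// The & replacement must come first so that later entities are not
// escaped a second time.
var escapeHTML = function (text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
};

var fromServer = '<script>stealSecrets()</script>';
var safeMarkup = escapeHTML(fromServer);
// safeMarkup is '&lt;script&gt;stealSecrets()&lt;/script&gt;'

// In a browser, assigning the raw text to an element's textContent
// property also avoids HTML parsing altogether.
```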

What specifically is the danger? If an evil script gets to run on your page, it gets access to all of the state and capabilities of the page. It can interact with your server, and your server will not be able to distinguish the evil requests from legitimate requests. The evil script has access to the global object, which gives it access to all of the data in the application except for variables hidden in closures. It has access to the document object, which gives it access to everything that the user sees. It also gives the evil script the capability to carry on a dialog with the user. The browser's location bar and all of the anti-phishing chrome will tell the user that the dialog should be trusted. The document object also gives the evil script access to the network, allowing it to load more evil scripts, or to probe for sites within your firewall, or to send the secrets it has learned to any server in the world.

This danger is a direct consequence of JavaScript's global object, which is far and away the worst part of JavaScript's many bad parts. These dangers are not caused by Ajax or JSON or XMLHttpRequest or Web 2.0 (whatever that is). These dangers have been in the browser since the introduction of JavaScript, and will remain until JavaScript is replaced. Be careful.

A JSON Parser

This is an implementation of a JSON parser in JavaScript:

var json_parse = function () {

// This is a function that can parse a JSON text, producing a JavaScript
// data structure. It is a simple, recursive descent parser.

// We are defining the function inside of another function to avoid creating
// global variables.

     var at,     // The index of the current character
         ch,     // The current character
         escapee = {
             '"':  '"',
             '\\': '\\',
             '/':  '/',
             b:    '\b',
             f:    '\f',
             n:    '\n',
             r:    '\r',
             t:    '\t'
         },
         text,

         error = function (m) {

// Call error when something is wrong.

             throw {
                 name:    'SyntaxError',
                 message: m,
                 at:      at,
                 text:    text
             };
         },

         next = function (c) {

// If a c parameter is provided, verify that it matches the current character.

             if (c && c !== ch) {
                 error("Expected '" + c + "' instead of '" + ch + "'");
             }

// Get the next character. When there are no more characters,
// return the empty string.

             ch = text.charAt(at);
             at += 1;
             return ch;
         },

         number = function () {

// Parse a number value.

             var number,
                 string = '';

             if (ch === '-') {
                 string = '-';
                 next('-');
             }
             while (ch >= '0' && ch <= '9') {
                 string += ch;
                 next();
             }
             if (ch === '.') {
                 string += '.';
                 while (next() && ch >= '0' && ch <= '9') {
                     string += ch;
                 }
             }
             if (ch === 'e' || ch === 'E') {
                 string += ch;
                 next();
                 if (ch === '-' || ch === '+') {
                     string += ch;
                     next();
                 }
                 while (ch >= '0' && ch <= '9') {
                     string += ch;
                     next();
                 }
             }
             number = +string;
             if (isNaN(number)) {
                 error("Bad number");
             } else {
                 return number;
             }
         },

         string = function () {

// Parse a string value.

             var hex,
                 i,
                 string = '',
                 uffff;

// When parsing for string values, we must look for " and \ characters.

             if (ch === '"') {
                 while (next()) {
                     if (ch === '"') {
                         next();
                         return string;
                     } else if (ch === '\\') {
                         next();
                         if (ch === 'u') {
                             uffff = 0;
                             for (i = 0; i < 4; i += 1) {
                                 hex = parseInt(next(), 16);
                                 if (!isFinite(hex)) {
                                     break;
                                 }
                                 uffff = uffff * 16 + hex;
                             }
                             string += String.fromCharCode(uffff);
                         } else if (typeof escapee[ch] === 'string') {
                             string += escapee[ch];
                         } else {
                             break;
                         }
                     } else {
                         string += ch;
                     }
                 }
             }
             error("Bad string");
         },

         white = function () {

// Skip whitespace.

             while (ch && ch <= ' ') {
                 next();
             }
         },

         word = function () {

// true, false, or null.

             switch (ch) {
             case 't':
                 next('t');
                 next('r');
                 next('u');
                 next('e');
                 return true;
             case 'f':
                 next('f');
                 next('a');
                 next('l');
                 next('s');
                 next('e');
                 return false;
             case 'n':
                 next('n');
                 next('u');
                 next('l');
                 next('l');
                 return null;
             }
             error("Unexpected '" + ch + "'");
         },

         value,  // Placeholder for the value function.

         array = function () {

// Parse an array value.

             var array = [];

             if (ch === '[') {
                 next('[');
                 white();
                 if (ch === ']') {
                     next(']');
                     return array;   // empty array
                 }
                 while (ch) {
                     array.push(value());
                     white();
                     if (ch === ']') {
                         next(']');
                         return array;
                     }
                     next(',');
                     white();
                 }
             }
             error("Bad array");
         },

         object = function () {

// Parse an object value.

             var key,
                 object = {};

             if (ch === '{') {
                 next('{');
                 white();
                 if (ch === '}') {
                     next('}');
                     return object;   // empty object
                 }
                 while (ch) {
                     key = string();
                     white();
                     next(':');
                     object[key] = value();
                     white();
                     if (ch === '}') {
                         next('}');
                         return object;
                     }
                     next(',');
                     white();
                 }
             }
             error("Bad object");
         };

     value = function () {

// Parse a JSON value. It could be an object, an array, a string, a number,
// or a word.

         white();
         switch (ch) {
         case '{':
             return object();
         case '[':
             return array();
         case '"':
             return string();
         case '-':
             return number();
         default:
             return ch >= '0' && ch <= '9' ? number() : word();
         }
     };

// Return the json_parse function. It will have access to all of the above
// functions and variables.

     return function (source, reviver) {
         var result;

         text = source;
         at = 0;
         ch = ' ';
         result = value();
         white();
         if (ch) {
             error("Syntax error");
         }

// If there is a reviver function, we recursively walk the new structure,
// passing each name/value pair to the reviver function for possible
// transformation, starting with a temporary boot object that holds the result
// in an empty key. If there is not a reviver function, we simply return the
// result.

         return typeof reviver === 'function' ?
             function walk(holder, key) {
                 var k, v, value = holder[key];
                 if (value && typeof value === 'object') {
                     for (k in value) {
                         if (Object.prototype.hasOwnProperty.call(value, k)) {
                             v = walk(value, k);
                             if (v !== undefined) {
                                 value[k] = v;
                             } else {
                                 delete value[k];
                             }
                         }
                     }
                 }
                 return reviver.call(holder, key, value);
             }({'': result}, '') : result;

     };
}();
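The reviver convention described in the comments above can be tried out with a short sketch. To keep the example self-contained it uses the standard JSON.parse function, which accepts the same (text, reviver) arguments; the sample text, member names, and the visited array are made up for illustration.

```javascript
var recordText = '{"title": "report", ' +
    '"created": "2009-01-15T00:00:00.000Z", ' +
    '"password": "hunter2"}';

var visited = [];   // the keys in the order the reviver sees them

var revived = JSON.parse(recordText, function (key, value) {
    visited.push(key);
    if (key === 'created') {
        return new Date(value);   // transform a string into a Date
    }
    if (key === 'password') {
        return undefined;         // undefined deletes the member
    }
    return value;                 // keep everything else unchanged
});

// The members are visited first; the final call has the empty key
// and receives the whole result.
var lastKey = visited[visited.length - 1];    // ''
var year = revived.created.getUTCFullYear();  // 2009
var hasPassword = 'password' in revived;      // false
```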


Copyright © 2009 O'Reilly Media, Inc.