Encode Text for URLs

Make sure the text in XML/HTTP queries is valid for URLs.

One thing to keep in mind when making XML/HTTP requests is that each request is an ordinary URL, just like the address of a web page. This means that spaces and symbols need to be encoded. Spaces aren’t allowed in URLs, so anything after a space could be disregarded by the server. Also, characters like ampersands (&), question marks (?), and number signs (#) have special meaning in a URL: the question mark starts the query string, the ampersand separates parameters, and the number sign starts a fragment. So if you’re doing an XML/HTTP Amazon ArtistSearch for a band like Kruder & Dorfmeister, you’ve got trouble, because the spaces and ampersand will break the request. But you can translate the characters into a URL-friendly format.

Technically, you can encode these characters by using the percent sign (%) followed by the two-digit hexadecimal value of the character. The hexadecimal value for a space is 20 (32 in decimal), so a space is represented as %20 in a URL. Spaces can also be escaped as plus signs (+) for many systems, including Amazon’s. Here are some commonly escaped characters and their encoded values:

Character            Encoded value
Ampersand (&)        %26
Question mark (?)    %3F
Number sign (#)      %23
Comma (,)            %2C
Colon (:)            %3A
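
The values listed here can be reproduced in a few lines of code. What follows is a minimal Python sketch, not part of the original hack; the helper name percent_encode_char is made up for illustration. It writes each byte of a character as a percent sign followed by two uppercase hexadecimal digits.

```python
def percent_encode_char(ch: str) -> str:
    """Return the %XX form of a character (hypothetical helper for illustration).

    The character is converted to UTF-8 bytes first, so a non-ASCII
    character yields one %XX group per byte.
    """
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))


# Print the encoded form of a space and each character from the list above.
for ch in " &?#,:":
    print(repr(ch), "->", percent_encode_char(ch))
```

Running this prints %20 for the space, followed by %26, %3F, %23, %2C, and %3A, matching the values listed.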

The ArtistSearch mentioned will only work if the band name is encoded as Kruder%20%26%20Dorfmeister. Doing this by hand each time you make a request is out of the question. Luckily, this is such a common task that most programming environments have built-in functions to handle this for you.
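
As one illustration of such a built-in function, here is a short Python sketch using the standard library's urllib.parse module. It is an example under assumptions rather than one of the hack's own listings: the base URL is a placeholder and the parameter name ArtistSearch is used only for illustration, so neither should be read as Amazon's actual request format.

```python
from urllib.parse import quote, quote_plus, urlencode

artist = "Kruder & Dorfmeister"

# quote() percent-encodes spaces and reserved characters;
# safe="" means no characters (not even "/") are left unescaped.
encoded = quote(artist, safe="")
print(encoded)        # Kruder%20%26%20Dorfmeister

# quote_plus() does the same but writes spaces as plus signs.
encoded_plus = quote_plus(artist)
print(encoded_plus)   # Kruder+%26+Dorfmeister

# urlencode() builds a whole query string from name/value pairs.
# Both the base URL and the parameter name are hypothetical placeholders.
base_url = "http://example.com/search"
query = urlencode({"ArtistSearch": artist})
request_url = base_url + "?" + query
print(request_url)    # http://example.com/search?ArtistSearch=Kruder+%26+Dorfmeister
```

Either encoded form avoids the raw spaces and ampersand; which one a given service expects depends on that service.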

The Code

Here are a few common ways to escape text ...
