Take a peek at the iTunes Music Store metadata and use the metadata for your own web applications.
Apple’s iTunes Music Store (iTMS) is more than just a place to buy DRM-restricted songs for $0.99 apiece; it is also a massive audio information repository. This searchable database contains loads of valuable metadata about each song track and album—song and album name, publication date, and record label, to name but a few—not to mention a free 30-second preview for each track and a thumbnail image of each CD cover. Of course, a major limitation is that to search this trove of information you need the iTunes application, which means you need to be sitting in front of a Mac OS X or Windows 2000/XP machine. Here are just some of the possible actions that this limitation precludes:
Browsing the iTunes Music Store from your cell phone
Querying the iTMS from Linux or (gasp!) Mac OS 9
Borrowing thumbnail images and preview clips for use in your own applications
Crosschecking iTunes tracks against RIAA Radar (http://www.magnetbox.com/riaa/) to avoid buying RIAA-owned tracks (see http://www.riaa.com)
Developing a full-blown iTMS client for your favorite platform
There are many other desirable actions like these, all singing the same refrain: it sure would be nice to be able to access the iTMS from anywhere.
Thankfully, there is a solution, and it’s called (appropriately enough) iTMS-4-ALL.
iTMS-4-ALL (http://hcsoftware.sourceforge.net/jason-rohrer/itms4all/; free) is a Perl-based CGI script that lets you search the iTMS from any web browser. In addition to being a useful search tool in its own right, the script serves as an example of how to interact with Apple’s Store server.
You can download and run iTMS-4-ALL on your own web server or just take it for a spin at http://itunes.punboy.net/cgi-bin/itms4all.pl. Figure 4-64 shows iTMS-4-ALL in action.
Before diving into the code for this hack, let’s examine the details of the iTMS protocol. How does your iTunes client communicate with Apple’s Music Store server? What kind of information is exchanged that might be useful to us? Here is what we know so far:
iTunes communicates with Apple almost exclusively through HTTP.
iTunes authentication (logging in so you can actually buy something) happens not through HTTP, but instead through HTTPS. For some reason, iTunes will not direct its HTTPS requests through a web proxy, even though other applications (such as Internet Explorer) will.
iTunes fetches gzipped (i.e., compressed using the GZIP format) XML files from Apple to lay out its GUI (to display the storefront, genre pages, and search results).
Every gzipped XML file is encrypted with AES-128 (Rijndael) in CBC mode. The CBC initialization vector is included as one of the HTTP headers (x-apple-crypto-iv). In other words, you essentially need two 128-bit strings to decrypt the XML: the first one (the initialization vector) is provided right in the HTTP response, while the second one (the AES key) is supposed to be a secret shared by Apple’s server and your iTunes client.
The secret AES key used by Apple and your iTunes client is 8a9dad399fb014c131be611820d78895. This secret key is used over and over, though a fresh initialization vector is selected for each communication. (Sean Kasun gleaned this key from the iTunes application.)
Fetching information from Apple (for example, searching for “Xiu Xiu,” a flamboyant post-rock band) involves the following steps:
iTunes sends the following HTTP (web) request to phobos.apple.com on port 80:
GET /WebObjects/MZSearch.woa/wa/com.apple.jingle.search.DirectAction/search?term=Xiu%20Xiu HTTP/1.1
User-Agent: iTunes/4.2 (Macintosh; U; PPC Mac OS X 10.2)
Accept-Language: en-us, en;q=0.50
Cookie: countryVerified=1
Accept-Encoding: gzip, x-aes-cbc
Connection: close
Host: phobos.apple.com
Apple responds with the following wodge of HTTP:
HTTP/1.1 200 Apple
Date: Fri, 16 Apr 2004 13:55:07 GMT
Content-Length: 4320
Content-Type: text/xml; charset=UTF-8
Cache-Control: no-cache
Connection: close
Server: Apache/1.3.27 (Darwin)
Pragma: no-cache
content-encoding: gzip, x-aes-cbc
x-apple-max-age: 3600
x-apple-crypto-iv: 19953b75e9846ea59715be906cdca0c8
x-apple-protocol-key: 2
x-apple-asset-version: 2118
x-apple-application-instance: 20
Via: 1.1 netcache04 (NetCache NetApp/5.2.1R3)

[-- encrypted gzip archive starts here --]
iTunes then initializes an AES-128 CBC cipher with its key (8a9dad399fb014c131be611820d78895) and the initialization vector provided by x-apple-crypto-iv (19953b75e9846ea59715be906cdca0c8). iTunes decrypts the GZIP archive and then un-gzips it to get the raw XML. In other words, the decryption algorithm is initialized with two 128-bit strings (the AES key and the initialization vector) and then used to decode the encrypted data. After decryption, the data is still in GZIP-compressed form and needs to be decompressed before it can be used.
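The first part of the last step, picking the initialization vector out of the response headers, can be sketched in a few lines of Perl. This is only an illustration using core Perl: parse_headers is a hypothetical helper and is not taken from iTMS-4-ALL, and the header text is a shortened copy of the response shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of iTMS-4-ALL): split a raw HTTP header
# block into a hash keyed by lowercased header name.
sub parse_headers {
    my ($raw) = @_;
    my %headers;
    for my $line ( split /\r?\n/, $raw ) {
        # Skip the status line and anything that is not "Name: value".
        next unless $line =~ /^([^:\s]+):\s*(.*?)\s*$/;
        $headers{ lc $1 } = $2;
    }
    return %headers;
}

# A few of the response headers shown above.
my $rawHeaders = join "\r\n",
    "HTTP/1.1 200 Apple",
    "Content-Type: text/xml; charset=UTF-8",
    "content-encoding: gzip, x-aes-cbc",
    "x-apple-crypto-iv: 19953b75e9846ea59715be906cdca0c8";

my %headers = parse_headers($rawHeaders);
my $ivHex   = $headers{'x-apple-crypto-iv'};

# A 128-bit IV should arrive as exactly 32 hex digits.
die "unexpected IV format\n"
    unless defined $ivHex && $ivHex =~ /^[0-9a-fA-F]{32}$/;
print "IV: $ivHex\n";
```

Lowercasing the header names is a design choice for this sketch: HTTP header names are case-insensitive, and the response above mixes capitalized and lowercase names.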
The full XML document for search results is too long to show here (one example is 72 KB of text when uncompressed). The XML includes lots of layout information, so Apple can change the way results are displayed to the user without upgrading the iTunes client. The dict entries near the end of the document contain information for each track matching your search. These entries are dictionaries (think about looking up something in the dictionary: you want a definition associated with a particular word) that map various key names to pieces of metadata. Here is an example dict entry:
<dict>
<key>kind</key><string>song</string>
<key>artistName</key> <string>Xiu Xiu</string>
<key>artistId</key><string>3208396</string>
<key>bitRate</key><integer>128</integer>
<key>buyParams</key><string>productType=S&salableAdamId=5390052&price=990</string>
<key>price</key><integer>990</integer>
<key>copyright</key><string>_ 2004 5 Rue Christine</string>
<key>dateModified</key><date>2004-03-10T06:44:25Z</date>
<key>discCount</key><integer>1</integer>
<key>discNumber</key><integer>1</integer>
<key>duration</key><integer>179164</integer>
<key>explicit</key><integer>0</integer>
<key>fileExtension</key><string>m4p</string>
<key>genre</key><string>Alternative</string>
<key>genreId</key><integer>20</integer>
<key>playlistName</key><string>Fabulous Muscles</string>
<key>playlistArtistName</key><string>Xiu Xiu</string>
<key>playlistArtistId</key><integer>3208396</integer>
<key>playlistId</key><string>5390070</string>
<key>previewURL</key><string>http://a1535.phobos.apple.com/Music/y2004/m02/d06/h14/s05.ojrmonwq.p.m4p</string>
<key>previewLength</key><integer>30</integer>
<key>relevance</key><string>1.0</string>
<key>releaseDate</key><string>2004-02-17T08:00:00Z</string>
<key>sampleRate</key><integer>44100</integer>
<key>songId</key><integer>5390052</integer>
<key>comments</key><string></string>
<key>trackCount</key><integer>10</integer>
<key>trackNumber</key><integer>2</integer>
<key>songName</key><string>I Luv the Valley OH!</string>
<key>vendorId</key><integer>1143</integer>
<key>year</key><integer>2004</integer>
</dict>
Just look at all that lovely metadata! The album name (Fabulous Muscles) is provided under the playlistName key, while the song name (I Luv the Valley OH!) is tagged with the songName key. Of particular interest is the previewURL, which in this case is http://a1535.phobos.apple.com/Music/y2004/m02/d06/h14/s05.ojrmonwq.p.m4p; this URL can be fetched by any web browser (baked into iTunes or not) and played on most platforms (Mac, Windows, Unix, etc.) using VideoLAN’s VLC media player (http://www.videolan.org; free).
In addition to the metadata included in each dict entry, the search results also include CD cover thumbnails, which appear in the XML as URLs for JPEG files. In our example results, the cover JPEG for Fabulous Muscles, shown in Figure 4-64, has the URL http://a1.phobos.apple.com/Music/y2004/m02/d06/h14/s05.kmxqqbbr.60x60-75.jpg. The current iTunes Music Store incarnation includes up to four thumbnails with each set of search results.
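To make the key-to-value mapping concrete, here is a minimal Perl sketch that reads a dict entry into a hash. It is not how iTMS-4-ALL is known to parse the results: the regular expression is a shortcut that assumes each key is followed directly by a simple string, integer, or date element, and a real XML parser would be more robust. The sample is a shortened copy of the entry above, and %track is a name chosen for illustration.

```perl
use strict;
use warnings;

# A shortened copy of the example dict entry.
my $dictXml = <<'XML';
<dict>
<key>kind</key><string>song</string>
<key>artistName</key> <string>Xiu Xiu</string>
<key>duration</key><integer>179164</integer>
<key>playlistName</key><string>Fabulous Muscles</string>
<key>comments</key><string></string>
<key>songName</key><string>I Luv the Valley OH!</string>
<key>dateModified</key><date>2004-03-10T06:44:25Z</date>
</dict>
XML

# Pair each <key> with the <string>, <integer>, or <date> that follows it.
my %track;
while ( $dictXml =~ m{<key>([^<]+)</key>\s*<(string|integer|date)>([^<]*)</\2>}g ) {
    $track{$1} = $3;
}

printf "%s, from the album %s by %s\n",
    $track{songName}, $track{playlistName}, $track{artistName};
```

Once the entry is in a hash, fields such as previewURL can be looked up by name in the same way.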
This is the protocol that iTunes uses to interact with the iTMS server, but how do you interact with the server sans iTunes? Here is where you get to start hacking.
With knowledge of the protocol in hand, you can now start writing code to fetch search results from Apple and access the XML-formatted metadata.
wget is a command-line agent for grabbing data off the Web. In general, if you pass a URL to the wget command, wget will download the contents pointed to by the URL and save them to disk. wget is standard issue on many Unix-like platforms; Mac OS X ships with curl rather than wget, so you may need to install wget there. You can also download it for Windows platforms from various sources (try Googling for “wget for Windows”).
You can grab encrypted iTMS data from Apple yourself with wget, but you need to specify an iTunes User-Agent header to override wget’s default User-Agent header:
$ wget "http://phobos.apple.com/WebObjects/MZSearch.woa/wa/com.apple.jingle.search.DirectAction/search?term=Xiu%20Xiu" -U "iTunes/4.0 (Macintosh; U; PPC Mac OS X 10.2)"
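If you want to build the same search URL for other terms, the term has to be percent-encoded the way “Xiu Xiu” becomes Xiu%20Xiu. The following Perl sketch shows one way to do it with core Perl only. The helper name search_url is hypothetical, and the code assumes the term is plain ASCII (or already a byte string); it is not taken from iTMS-4-ALL.

```perl
use strict;
use warnings;

# Base URL taken from the search request shown earlier.
my $searchBase = "http://phobos.apple.com/WebObjects/MZSearch.woa/wa/"
    . "com.apple.jingle.search.DirectAction/search";

# Hypothetical helper: percent-encode every character outside the
# unreserved set (letters, digits, "-", "_", ".", "~").
sub search_url {
    my ($term) = @_;
    ( my $encoded = $term )
        =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $searchBase . "?term=" . $encoded;
}

print search_url("Xiu Xiu"), "\n";
```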
Of course, the fetched file is encrypted with AES, as described above. Unfortunately, there are no standard-issue tools for decrypting these files, so we need to resort to some relatively simple Perl code to go any further.
To decrypt AES-128 CBC, you need two nonstandard Perl modules: Crypt::CBC and Crypt::Rijndael. Both modules can be downloaded from CPAN (http://www.cpan.org).
Tip
In case you are wondering, Rijndael is another name for AES, since the Rijndael algorithm was selected as the AES standard.
CBC.pm is pure Perl, but the Rijndael module must be compiled for your platform. Compilation instructions are included with the module package that you download from CPAN. Once installed, these modules can be included in your Perl program as follows:
use Crypt::CBC;
use Crypt::Rijndael;
You can get the encryption initialization vector (IV) from the x-apple-crypto-iv HTTP header, as described previously. Apple picks a fresh IV for each response, and you must use the IV included with a response to decrypt that response. Assume the IV is 19953b75e9846ea59715be906cdca0c8. You can set up variables for the key and IV as follows:
my $iTunesKeyHex = "8a9dad399fb014c131be611820d78895";
my $ivHex = "19953b75e9846ea59715be906cdca0c8";
The CBC module requires that both keys and IVs be in binary form, though we currently have them in hex-encoded form. We can pack our key and IV into binary form as follows:
my $iTunesKeyBinary = pack( "H*", $iTunesKeyHex );
my $ivBinary = pack( "H*", $ivHex );
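As a quick sanity check on the packing step, the following standalone sketch confirms that a 32-digit hex string packs down to 16 bytes (128 bits) and that unpack reverses the conversion. It reuses the example IV from the text; nothing here is specific to iTMS-4-ALL.

```perl
use strict;
use warnings;

my $ivHex    = "19953b75e9846ea59715be906cdca0c8";
my $ivBinary = pack( "H*", $ivHex );

# Each pair of hex digits becomes one byte.
printf "%d hex digits -> %d bytes\n", length($ivHex), length($ivBinary);

# unpack with the same template turns the bytes back into hex digits.
die "round trip failed\n" unless unpack( "H*", $ivBinary ) eq $ivHex;
```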
Using these binary values, you can create a Rijndael CBC cipher as follows:
my $cipher = Crypt::CBC->new( {
    'key'            => $iTunesKeyBinary,
    'cipher'         => 'Rijndael',
    'iv'             => $ivBinary,
    'regenerate_key' => 0,
    'padding'        => 'standard',
    'prepend_iv'     => 0
} );
You can think of this initialized cipher object as a black box that takes encrypted data as input and outputs decrypted data. Assuming that you have your encrypted GZIP data stored in a variable called $encryptedSearchResults, you can finally decrypt the results as follows:
my $decryptedSearchResultsGZIP = $cipher->decrypt( $encryptedSearchResults );
Now, your results can be decompressed with GZIP, producing raw XML that you can peruse, parse, and otherwise enjoy.
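One way to do the decompression step in Perl is with the IO::Uncompress::Gunzip module, which is part of the core distribution in recent versions of Perl. This is a sketch under that assumption and not necessarily the approach iTMS-4-ALL takes. So that the example runs without contacting Apple, it first gzips a small XML fragment in memory as a stand-in for the cipher output; $sampleXml and $searchResultsXML are names chosen for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Stand-in for the decrypted response body: a tiny XML fragment,
# gzipped in memory. In real use this variable would hold the
# output of $cipher->decrypt.
my $sampleXml =
    "<dict><key>songName</key><string>I Luv the Valley OH!</string></dict>";
my $decryptedSearchResultsGZIP;
gzip( \$sampleXml => \$decryptedSearchResultsGZIP )
    or die "gzip failed: $GzipError\n";

# Decompress from one scalar into another.
my $searchResultsXML;
gunzip( \$decryptedSearchResultsGZIP => \$searchResultsXML )
    or die "gunzip failed: $GunzipError\n";

print $searchResultsXML, "\n";
```

Passing scalar references keeps everything in memory, which suits a CGI script that never needs the intermediate files on disk.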
iTMS-4-ALL is a Perl-based CGI script that pulls all of the aforementioned pieces together into a user-friendly package. The script can be installed on any web server that supports CGI and Perl and then accessed from any web browser. The user interface for searching the iTMS was shown earlier in Figure 4-64. If you want to explore the script right away, you can download the code from http://hcsoftware.sourceforge.net/jason-rohrer/itms4all/. A live installation of the script is also available on that page, so you can search the iTMS from your browser without installing anything.
The HTML user interface generated by iTMS-4-ALL is basic by design: it works in all web browsers, including text-mode applications such as Lynx and the palmtop microbrowsers present on cell phones. Thus, iTMS-4-ALL not only unshackles iTMS searching from the officially supported iTunes platforms, it also enables searching away from the desktop. You can now browse the iTunes store while sitting on the bus.
Installing the script on your own web server is relatively painless. All necessary Perl modules are included with the download package, and a script is provided to compile the modules for your server’s platform. After running the compilation script, you need to copy the files into your web server’s cgi-bin directory. For example, if your server keeps CGI scripts in /httpd/cgi-bin, you would type:
cp -r itms4all.pl Crypt IO auto /httpd/cgi-bin
Finally, you need to make sure that your web server has permission to execute your script. For most common server setups, you can grant permission with the following command:
chmod o+x /httpd/cgi-bin/itms4all.pl
This command grants execution permission (x) to the other users (o), including your web server. Now you are ready to test the script. If your server had the address http://www.myserver.com, you could run the script by pointing your browser to http://www.myserver.com/cgi-bin/itms4all.pl.
—Jason Rohrer