O'Reilly Hacks
Amazon Hacks
By Paul Bausch
August 2003

HACK
#33
Sort Your Recommendations by Average Customer Rating
Find the highest rated items among your Amazon product recommendations

The Code

Amazon doesn't offer a way to sort recommendations by customer rating, so this script first gathers all of your Amazon book recommendations into one list. Given your Amazon account email address and password, the script logs in as you and requests the book recommendations page. It then keeps requesting pages in a loop, picking out the details of each recommended product with regular expressions. Once the ASINs and ratings are stored in arrays (with titles and authors in hashes keyed by ASIN), they can be sorted and printed in any order you want; in this case, by average star rating, highest first.
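The sorting step on its own can be sketched without touching Amazon at all. The records below are made-up sample data, and the `sort_by_rating` helper and the array-of-hashes layout are assumptions for illustration; the script itself uses parallel arrays and hashes instead.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical sample data; the real script fills its lists from Amazon's pages.
my @recs = (
    { asin => 'ASIN0001', title => 'First Sample Book',  rating => '4.5' },
    { asin => 'ASIN0002', title => 'Second Sample Book', rating => '3'   },
    { asin => 'ASIN0003', title => 'Third Sample Book',  rating => '5'   },
);

# Hypothetical helper: sort numerically by rating, highest first.
sub sort_by_rating {
    my @items = @_;
    return sort { $b->{rating} <=> $a->{rating} } @items;
}

my @sorted = sort_by_rating(@recs);
print "$_->{title} ($_->{asin}): $_->{rating} stars\n" for @sorted;
```

Swapping `$a` and `$b` in the comparison would list the lowest rated items first.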

Be sure to insert your email address and password in the proper places below. You'll also need write permission in the script's directory, because the script stores Amazon cookies there in a text file, cookies.lwp.
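If you'd like to confirm the write permission before running the script, a quick check along these lines may help. The `can_store_cookies` helper is hypothetical and not part of the hack; it uses only Perl's built-in file tests.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: true if a cookie file could be created
# or updated in the given directory.
sub can_store_cookies {
    my ($dir, $file) = @_;
    my $path = "$dir/$file";
    return -w $path if -e $path;   # an existing file must be writable.
    return -w $dir;                # otherwise the directory must be.
}

print can_store_cookies('.', 'cookies.lwp')
    ? "OK: cookies.lwp can be written here.\n"
    : "Problem: cookies.lwp cannot be written here.\n";
```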

#!/usr/bin/perl
# get_recommendations.pl
#
# A script to log on to Amazon, retrieve
# recommendations, and sort by highest rating.
# Usage: perl get_recommendations.pl

use warnings;
use strict;
use HTTP::Cookies;
use LWP::UserAgent;

# Amazon email and password.
my $email = 'insert Amazon account email';
my $password = 'insert Amazon account password';

# Amazon login URL for normal users.
my $logurl = "http://www.amazon.com/exec/obidos/flex-sign-in-done/";

# Now login to Amazon.
my $ua = LWP::UserAgent->new;
$ua->agent("(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)");
$ua->cookie_jar( HTTP::Cookies->new('file' => 'cookies.lwp','autosave' => 1));
my %headers = ( 'content-type' => "application/x-www-form-urlencoded" );
$ua->post($logurl, [ email       => $email,
          password    => $password,
          method      => 'get', 
          opt         => 'oa',
          page        => 'recs/instant-recs-sign-in-standard.html',
          response  => 'tg/recs/recs-post-login-dispatch/-/recs/pd_rw_gw_r',
          'next-page' => 'recs/instant-recs-register-standard.html',
          action      => 'sign-in' ], %headers);

# Set some variables to hold
# our sorted recommendations.
my (%title_list, %author_list);
my (@asins, @ratings, $done);

# We're logged in, so request the recommendations.
my $recurl = "http://www.amazon.com/exec/obidos/tg/". 
             "stores/recs/instant-recs/-/books/0/t";

# Set all Amazon recommendations in
# an array /  title and author in hashes.
until ($done) {

     # send the request for the recommendations
     my $content = $ua->get($recurl)->content;
     #print $content;

     # loop through the HTML looking for matches.
     while ($content =~ m!<td colspan=2 width=100%>.*?detail/-/(.*?)/ref.*?<b>(.*?)</b>.*?by (.*?)\n.*?Average Customer Review&#58;.*?(.*?)out of 5 stars.*?<td colspan=3><hr noshade size=1></td>!mgis) {
         my ($asin,$title,$author,$rating) = ($1||'',$2||'',$3||'',$4||'');
         $title  =~ s!<.+?>!!g;          # drop HTML tags.
         $rating =~ s!\n!!g;             # remove newlines.
         $rating =~ s! !!g;              # remove spaces.
         $title_list{$asin} = $title;    # store the title.
         $author_list{$asin} = $author;  # and the author.
         push (@asins, $asin);           # and the ASINs.
         push (@ratings, $rating);       # and th... OK!
     }

     # see if there are more results... if so continue the loop
     if ($content =~ m!<a href=(.*?instant-recs.*?)>more results.*?</a>!i) {
        $recurl = "http://www.amazon.com$1";# reassign the URL.
     } else { $done = 1; } # nope, we're done.
}

# sort the results by highest star rating and print!
for (reverse sort { $ratings[$a] <=> $ratings[$b] } 0..$#ratings) {
    next unless $asins[$_]; # skip blanks.
    print "$title_list{$asins[$_]}  ($asins[$_])\n" . 
          "by $author_list{$asins[$_]} \n" .
          "$ratings[$_] stars.\n\n";
}
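To see how the extraction pattern picks a page apart, here is a simplified sketch run against a made-up fragment. The HTML below is hypothetical, shaped only to resemble what the script's regular expression expects, and the pattern is a cut-down variant (it drops the surrounding table-cell anchors and trims whitespace around the rating), not the exact one used above. The `/x` modifier lets the pattern span several commented lines, which is why the literal spaces are written as `[ ]` and the `#` is escaped.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical fragment shaped like one recommendation; not real Amazon HTML.
my $html = <<'HTML';
<td colspan=2 width=100%>
<a href=/exec/obidos/tg/detail/-/ASIN0001/ref=sample><b>Sample Title</b></a>
by Sample Author
Average Customer Review&#58; 4.5 out of 5 stars
<td colspan=3><hr noshade size=1></td>
HTML

my @found;
while ($html =~ m{
        detail/-/ (.*?) /ref       # ASIN sits between "detail/-/" and "/ref"
        .*? <b> (.*?) </b>         # title is the first bold text after it
        .*? by [ ] (.*?) \n        # author runs to the end of the "by" line
        .*? Review&\#58; \s* (.*?) \s* out [ ] of [ ] 5 [ ] stars   # rating
    }gisx) {
    push @found, { asin => $1, title => $2, author => $3, rating => $4 };
}

print "$_->{title} ($_->{asin}) by $_->{author}: $_->{rating} stars\n"
    for @found;
```

Because patterns like this depend on the exact markup of the page, expect to adjust them whenever the HTML changes.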


© 2007 O'Reilly Media, Inc.