O'Reilly Hacks
Amazon Hacks
By Paul Bausch
August 2003

HACK
#45
Find Purchase Circles by Zip Code
Combining two different Web Services can create a new feature

The Code

This PHP script combines two different services with screen scraping. It looks for all of the cities within a particular zip code at the U.S. Postal Service web site. Then it finds matching purchase circles at Amazon and provides links to them. Create zip_circle.php with the following code:

<?php
$strZip = $_GET['zipcode'];
$zipPage = "";
$indexPage = "";
$cntCity = 0;
$myCity = "";

set_time_limit(60);

//Truncate a string to at most $num characters, breaking at a space, period, or comma
function left($str_in,$num) {
    return preg_replace("/^(.{1,$num})[ .,].*/s", "\\1", $str_in);
}

function findCircle($city) {
    $indexPage = "";
    //Find purchase circle browse codes
    $abc = "a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z";
    $a_abc = explode(",", $abc);
    $thisLetter = strtolower(substr($city,0,1));
    for($j = 0;$j < count($a_abc); $j++) { //include the last letter, z
        if ($thisLetter == $a_abc[$j]) {
            $thisCode = 226120 + ($j + 1);
        }
    }
    $url = "http://www.amazon.com/exec/obidos/tg/cm/browse-communities/-/" . 
           $thisCode . "/t/";
    $contents = fopen($url, "r");
    do {
        $data = fread($contents, 4096);
        if (strlen($data) == 0) {
            break;
        }
        $indexPage .= $data;
    } while (1);
    fclose($contents);
    $k = 0;
    //Pattern text as printed in the listing; # delimiters avoid escaping the slashes
    if (preg_match_all('#i>.*?<a href=/exec/obidos/tg/browse/-/(.*?)/t/.*?>(.*?)</a>.*?<l#s',$indexPage,$cities)) {
      foreach ($cities[2] as $cityName) {
        if (strtolower($city) == strtolower($cityName)) {
          $link = "http://www.amazon.com/exec/obidos/tg/cm/browse-communities/-/";
          $link = $link . $cities[1][$k];
          $link = "<a href=\"" . $link . "\">";
          $link = $link . $cityName . "</a>";
          return $link;
        }
        $k++;
       } // foreach
    } //if
    //Reached when the page did not match or no city name matched
    return "No purchase circle found for " . $city;
} //function

//Get cities associated with zip codes from USPS
$url = "http://www.usps.com/zip4/zip_response.jsp?zipcode=" . urlencode($strZip);
$contents = fopen($url,"r");
while (!feof ($contents))
    $zipPage .= fgets($contents, 4096);
fclose ($contents);

if (preg_match_all('#<tr valign="top" bgcolor=".*?">(.*?)</tr>#s',$zipPage,$cityState)) {
    foreach ($cityState[0] as $cs) {
        $cntCity++;
        if (preg_match_all('#\n\n(.*?)</font></td>\n#',$cs,$c)) {
            foreach ($c[1] as $d) {
                $myCity = $myCity . $d . ", ";
            }
            //chr(8) separates the city lists of successive table rows
            $myCity = $myCity . chr(8);
            $myCity = str_replace(", " . chr(8), chr(8), $myCity);
        } else {
            echo "city not found.";
        }
    }
}

echo "<h2>Purchase Circles</h2>";

$a_myCity = explode(chr(8), $myCity);
for($i = 0;$i < count($a_myCity)-1; $i++) {
    $thisCity = $a_myCity[$i];
    echo findCircle($thisCity) . "<br>";
}
?>

Scraping both the USPS site and Amazon can take some time, so the time limit for the script has been increased to 60 seconds.
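
Note that on non-Windows systems set_time_limit counts only the script's own execution time, not time spent waiting on stream operations, so a per-request timeout can be a useful complement. The following is a minimal sketch of that idea, not part of the original hack: fetchPage is a hypothetical helper name, and the 20-second default is an arbitrary assumption.

```php
<?php
// Hypothetical helper (not in the original script): fetch a page with a
// per-request timeout instead of relying only on set_time_limit().
function fetchPage($url, $timeoutSeconds = 20) {
    // The 'timeout' option applies to http:// URLs; other wrappers ignore it.
    $context = stream_context_create(array(
        'http' => array('timeout' => $timeoutSeconds),
    ));
    // file_get_contents returns false on failure; @ suppresses the warning.
    $page = @file_get_contents($url, false, $context);
    return ($page === false) ? "" : $page;
}
```

With a helper like this, each of the two fopen/fread loops in the script could in principle be replaced by a single call such as $zipPage = fetchPage($url); an empty string then signals that the page could not be retrieved.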

This script relies on the fact that certain purchase circle pages have predictable IDs. Armed with the knowledge that every city that starts with A is in Purchase Circle 226121, we can assume that cities that start with B will be in Purchase Circle ID 226122, etc. As cities are found at the USPS site, their first letter is matched up with a Purchase Circle ID, and the page is scraped to find an exact match.
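
The letter-to-ID arithmetic described above can be sketched on its own. This is an illustrative rewrite rather than the author's code: circleIndexId and circleIndexUrl are hypothetical names, and the assumption that the IDs run consecutively all the way from A (226121) to Z is extrapolated from the two values the text gives.

```php
<?php
// Hypothetical helper: map a city's first letter to its index-page ID.
// Assumption: IDs are consecutive, starting at 226121 for the letter A.
function circleIndexId($city) {
    $first = strtolower(substr(trim($city), 0, 1));
    if (!preg_match('/^[a-z]$/', (string)$first)) {
        return null; // empty name, or it does not start with a letter
    }
    return 226121 + (ord($first) - ord('a'));
}

// Hypothetical helper: build the index-page URL the script scrapes,
// using the same URL shape as the listing.
function circleIndexUrl($city) {
    $id = circleIndexId($city);
    if ($id === null) {
        return null;
    }
    return "http://www.amazon.com/exec/obidos/tg/cm/browse-communities/-/"
         . $id . "/t/";
}
```

For example, circleIndexId("Boston") yields 226122, matching the ID given above for cities that start with B. Returning null for names that do not start with a letter avoids building a URL from an undefined code.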


© 2007 O'Reilly Media, Inc.