
The Code

This PHP script combines two different services with screen scraping. It looks for all of the cities within a particular zip code at the U.S. Postal Service web site. Then it finds matching purchase circles at Amazon and provides links to them. Create zip_circle.php with the following code:

<?php
$strZip = $_GET['zipcode'];
$zipPage = "";
$indexPage = "";
$cntCity = 0;
$myCity = "";
set_time_limit(60);
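//get a certain number of characters, cutting at a space, period, or comma
//(preg_replace stands in for ereg_replace, which was removed in PHP 7)
function left($str_in,$num) {
return preg_replace("/^(.{1,$num})[ .,].*/s","\\1", $str_in);
}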
function findCircle($city) {
$indexPage = "";
//Find purchase circle browse codes
$abc = "a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z";
$a_abc = explode(",", $abc);
$thisLetter = strtolower(substr($city,0,1));
//loop over all 26 letters (the original bound skipped "z")
for($j = 0;$j < count($a_abc); $j++) {
if ($thisLetter == $a_abc[$j]) {
$thisCode = 226120 + ($j + 1);
}
}
$url = "http://www.amazon.com/exec/obidos/tg/cm/browse-communities/-/" .
$thisCode . "/t/";
$contents = fopen($url,"r");
do {
$data = fread ($contents, 4096);
if (strlen($data) == 0) {
break;
}
$indexPage .= $data;
} while(1);
fclose ($contents);
$k = 0;
if (preg_match_all('/i>.*?<a href=\/exec\/obidos\/tg\/browse\/-\/(.*?)\/t\/.*?>(.*?)<\/a>.*?<l/s',$indexPage,$cities)) {
foreach ($cities[2] as $cityName) {
if (strtolower($city) == strtolower($cityName)) {
$link = "http://www.amazon.com/exec/obidos/tg/cm/browse-communities/-/";
$link = $link . $cities[1][$k];
$link = "<a href=" . $link . ">";
$link = $link . $cityName . "</a>";
return $link;
}
$k++;
} // foreach
} //if
//reached when the page did not parse or no city name matched
return "No purchase circle found for " . $city;
} //function
//Get cities associated with zip codes from USPS
$url = "http://www.usps.com/zip4/zip_response.jsp?zipcode=" . $strZip;
$contents = fopen($url,"r");
while (!feof ($contents))
$zipPage .= fgets($contents, 4096);
fclose ($contents);
if (preg_match_all('/<tr valign="top" bgcolor=".*?">(.*?)<\/tr>/s',$zipPage,$cityState)) {
foreach ($cityState[0] as $cs) {
$cntCity++;
if (preg_match_all('/\n\n(.*?)<\/font><\/td>\n/',$cs,$c)) {
foreach ($c[1] as $d) {
$myCity = $myCity . $d . ", ";
}
$myCity = $myCity . chr(8);
$myCity = str_replace(", ".chr(8), chr(8), $myCity);
} else {
echo "city not found.";
}
}
}
echo "<h2>Purchase Circles</h2>";
$a_myCity = explode(chr(8), $myCity);
for($i = 0;$i < count($a_myCity)-1; $i++) {
$thisCity = $a_myCity[$i];
echo findCircle($thisCity) . "<br>";
}
?>
Scraping both the USPS site and Amazon can take some time, so the script raises its time limit to 60 seconds with set_time_limit(60).

This script relies on the fact that certain purchase circle pages have predictable IDs. Knowing that every city that starts with A is listed under Purchase Circle ID 226121, we can assume that cities that start with B are listed under Purchase Circle ID 226122, and so on through the alphabet. As cities are found at the USPS site, the first letter of each one is matched to a Purchase Circle ID, and that page is scraped to find an exact match on the city name.
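
The letter-to-ID lookup and the exact-match step can be sketched in isolation, without touching either site. This is only an illustration, not the script's own code: the helper names circleIndexId and matchCircle are hypothetical, the sample city codes are made up, and the sketch assumes the IDs really do run consecutively from 226121 (A) through 226146 (Z), which the text above infers rather than confirms. It also assumes PHP 7.1 or later for the nullable return types.

```php
<?php
// Hypothetical helper (not part of the original script): map a city name
// to the ID of the index page for its first letter. Assumes the IDs run
// consecutively from 226121 (A) through 226146 (Z).
function circleIndexId(string $city): ?int {
    $letter = strtolower(substr(trim($city), 0, 1));
    if (!preg_match('/^[a-z]$/', $letter)) {
        return null; // no index page for digits, punctuation, or empty input
    }
    return 226120 + (ord($letter) - ord('a') + 1);
}

// Hypothetical helper: case-insensitive exact match of a city against the
// names scraped from an index page. $circles maps scraped name => browse code.
function matchCircle(string $city, array $circles): ?string {
    foreach ($circles as $name => $code) {
        if (strcasecmp($city, $name) === 0) {
            return (string) $code;
        }
    }
    return null;
}

// Made-up sample data standing in for a scraped index page.
$sample = ['Sebastopol' => '000001', 'Seattle' => '000002'];

echo circleIndexId('Boston'), "\n";          // prints 226122
echo matchCircle('SEATTLE', $sample), "\n";  // prints 000002
```

Returning null for input that does not start with a letter keeps a malformed city name from turning into a request for a nonexistent page, which the caller can then report as "No purchase circle found".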
© 2007 O'Reilly Media, Inc.