Create a comma-delimited file from a list of phone numbers returned by Google.
Just because Google’s API doesn’t support the phonebook: [Hack #17] syntax doesn’t mean that you can’t make use of Google phonebook data.
This simple Perl script takes a page of Google phonebook: results and produces a comma-delimited text file suitable for import into Excel or your average database application. The script doesn’t use the Google API, though, because the API doesn’t yet support phonebook lookups. Instead, you’ll need to run the search in your trusty web browser and save the results to your computer’s hard drive as an HTML file. Point the script at the HTML file and it’ll do its thing.
Which results should you save? You have two choices depending on which syntax you’re using:
If you’re using the phonebook: syntax, save the second page of results, reached by clicking the “More business listings...” or “More residential listings...” links on the initial results page.
If you’re using the bphonebook: or rphonebook: syntax, simply save the first page of results. Depending on how many pages of results you have, you might have to run the program several times.
Because this program is so simple, you might be tempted to plug this code into a program that uses LWP::Simple to automatically grab result pages from Google, automating the entire process. You should know that accessing Google with automated queries outside of the Google API is against their Terms of Service.
#!/usr/bin/perl
# phonebook2csv
# Google Phonebook results in CSV suitable for import into Excel
# Usage: perl phonebook2csv.pl < results.html > results.csv

# CSV header
print qq{"name","phone number","address"\n};

my @listings = split /<hr size=1>/, join '', <>;

foreach (@listings[1..($#listings-1)]) {
    s!\n!!g;       # drop spurious newlines
    s!<.+?>!!g;    # drop all HTML tags
    s!"!""!g;      # double escape " marks
    print '"' . join('","', (split /\s+-\s+/)[0..2]) . "\"\n";
}
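To see what each step of the script does, it can help to run the same split, strip, and escape sequence over a small fragment. The sketch below wraps those steps in a subroutine (listings_to_csv is a name made up for this example) and feeds it a made-up page. The HTML is hypothetical: it only assumes what the script itself assumes, namely that listings are separated by <hr size=1> and that fields within a listing are separated by " - ". Real saved pages may well look different. Unlike the original, it also substitutes an empty string for any missing field.

```perl
#!/usr/bin/perl
# Sketch: the same split/strip/escape steps as phonebook2csv,
# wrapped in a subroutine and run on a made-up fragment.
use strict;
use warnings;

# Turn a saved results page (one string) into a list of CSV lines.
# listings_to_csv is a hypothetical helper name, not part of the hack.
sub listings_to_csv {
    my ($html) = @_;
    my @chunks = split /<hr size=1>/, $html;
    my @rows;

    # Skip the first and last chunks: the page header and footer.
    foreach my $chunk (@chunks[1 .. $#chunks - 1]) {
        $chunk =~ s!\n!!g;       # drop spurious newlines
        $chunk =~ s!<.+?>!!g;    # drop all HTML tags
        $chunk =~ s!"!""!g;      # double any " marks for CSV

        # Keep the first three " - " separated fields; a missing
        # field becomes an empty string (an addition in this sketch).
        my @fields = (split /\s+-\s+/, $chunk)[0 .. 2];
        push @rows,
            '"' . join('","', map { defined $_ ? $_ : '' } @fields) . '"';
    }
    return @rows;
}

# HYPOTHETICAL input, not copied from a real Google results page.
my $sample = <<'HTML';
<html><body>Results header
<hr size=1>
<b>John Doe</b> - (555) 555-5555 - 1 Doe Street, Doe, OR 99999
<hr size=1>
<b>Jane "JJ" Doe</b> - (555) 555-5555 - Tanning, FL 90210
<hr size=1>
Page footer</body></html>
HTML

print "$_\n" for listings_to_csv($sample);
```

Run as is, this prints two rows. Note that the quotation marks around JJ come out doubled, which is how CSV readers such as Excel expect embedded quotes to be written:

"John Doe","(555) 555-5555","1 Doe Street, Doe, OR 99999"
"Jane ""JJ"" Doe","(555) 555-5555","Tanning, FL 90210"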
Run the script from the command line, specifying the phonebook results HTML filename and the name of the CSV file you wish to create or to which you wish to append additional results. For example, using results.html as our input and results.csv as our output:
$ perl phonebook2csv.pl < results.html > results.csv
Leaving off the > and CSV filename sends the results to the screen for your perusal:
$ perl phonebook2csv.pl < results.html
"name","phone number","address"
"John Doe","(555) 555-5555","Wandering, TX 98765"
"Jane Doe","(555) 555-5555","Horsing Around, MT 90909"
"John and Jane Doe","(555) 555-5555","Somewhere, CA 92929"
"John Q. Doe","(555) 555-5555","Freezing, NE 91919"
"Jane J. Doe","(555) 555-5555","1 Sunnyside Street, Tanning, FL 90210"
"John Doe, Jr.","(555) 555-5555","Beverly Hills, CA 90210"
"John Doe","(555) 555-5555","1 Lost St., Yonkers, NY 91234"
"John Doe","(555) 555-5555","1 Doe Street, Doe, OR 99999"
"John Doe","(555) 555-5555","Beverly Hills, CA 90210"
Using a double >> before the CSV filename appends the current set of results to the CSV file, creating it if it doesn’t already exist. This is useful for combining more than one set of results, represented by more than one saved results page:
$ perl phonebook2csv.pl < results_1.html > results.csv
$ perl phonebook2csv.pl < results_2.html >> results.csv
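One thing to watch when appending: the script prints its CSV header on every run, so a combined file ends up with a copy of the header row at the start of each appended batch. A minimal sketch of one way to clean that up follows. The dedupe_header subroutine is a name invented for this example, and the sample lines are illustrative data taken from the output shown earlier.

```perl
#!/usr/bin/perl
# Sketch: keep the first CSV header row, drop any later copies.
use strict;
use warnings;

# Must match the header printed by phonebook2csv exactly.
my $HEADER = qq{"name","phone number","address"};

# dedupe_header is a hypothetical helper name. It takes chomped
# lines and returns them with repeated header rows removed.
sub dedupe_header {
    my @lines = @_;
    my $seen  = 0;
    # Non-header lines always pass; a header line passes only once.
    return grep { $_ ne $HEADER || !$seen++ } @lines;
}

# Illustrative combined file: two batches, each with its own header.
my @combined = (
    $HEADER,
    '"John Doe","(555) 555-5555","Wandering, TX 98765"',
    $HEADER,
    '"Jane Doe","(555) 555-5555","Horsing Around, MT 90909"',
);

print "$_\n" for dedupe_header(@combined);
```

To apply the same idea to a real file, read and chomp its lines from standard input in place of the @combined list, then redirect the output to a new CSV file.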