
Summarize Results by Domain
Get an overview of the sorts of domains
(educational, commercial, foreign, and so forth) found in the results
of a Google query.
The Code
Save the code as suffixcensus.cgi, a CGI script
["How to Run the Hacks" in the
Preface], on your web server:

#!/usr/local/bin/perl
# suffixcensus.cgi
# Generates a snapshot of the kinds of sites responding to a
# query. The suffix is the .com, .net, or .uk part.
# suffixcensus.cgi is called as a CGI with form input.

# Your Google API developer's key.
my $google_key='insert key here';

# Location of the GoogleSearch WSDL file.
my $google_wdsl = "./GoogleSearch.wsdl";

# Number of times to loop, retrieving 10 results at a time.
my $loops = 10;

use SOAP::Lite;
use CGI qw/:standard *table/;

# Print the page header and the query form.
print
  header( ),
  start_html("SuffixCensus"),
  h1("SuffixCensus"),
  start_form(-method=>'GET'),
  'Query: ', textfield(-name=>'query'),
  ' ',
  submit(-name=>'submit', -value=>'Search'),
  end_form( ), p( );

if (param('query')) {
  my $google_search = SOAP::Lite->service("file:$google_wdsl");
  my %suffixes;

  # Fetch $loops pages of 10 results each (offsets 0, 10, 20, ...).
  for (my $offset = 0; $offset < $loops*10; $offset += 10) {
    my $results = $google_search ->
      doGoogleSearch(
        $google_key, param('query'), $offset, 10, "false", "", "false",
        "", "latin1", "latin1"
      );

    # Stop early when Google has no more results.
    last unless @{$results->{resultElements}};

    # Tally the suffix of each result's host name.
    map { $suffixes{ ($_->{URL} =~ m#://.+?\.(\w{2,4})/#)[0] }++ }
      @{$results->{resultElements}};
  }

  # Print one table row per suffix, sorted alphabetically.
  print
    h2('Results: '), p( ),
    start_table({cellpadding => 5, cellspacing => 0, border => 1}),
    map( { Tr(td(uc $_),td($suffixes{$_})) } sort keys %suffixes ),
    end_table( );
}

print end_html( );
Be sure to replace insert key here with
your Google API key.
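
The tallying step can be tried offline, without a key or a web server. The following is only a minimal sketch, not part of suffixcensus.cgi: the helpers suffix_of and tally_suffixes and the sample URLs are made up for illustration. The script's own pattern expects a slash after the host name; this variant is a little more forgiving and also accepts URLs with no trailing slash, while skipping hosts that have no alphabetic suffix (bare IP addresses, for instance).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in suffixcensus.cgi): return the last label
# of a URL's host name in lowercase, or undef if there is none.
sub suffix_of {
    my ($url) = @_;

    # Host = everything after "://" (and any user@ part) up to the
    # first slash, colon, question mark, or hash.
    my ($host) = $url =~ m{^[a-z][a-z0-9+.-]*://(?:[^/?#@]*@)?([^/:?#]+)}i
        or return;

    # Suffix = the final dot-separated label, letters only.
    my ($suffix) = $host =~ m{\.([a-z]{2,})$}i
        or return;

    return lc $suffix;
}

# Hypothetical helper: count suffixes across a list of URLs and return
# a hash reference of suffix => count. URLs with no suffix are skipped.
sub tally_suffixes {
    my @urls = @_;
    my %count;
    for my $url (@urls) {
        my $suffix = suffix_of($url);
        next unless defined $suffix;
        $count{$suffix}++;
    }
    return \%count;
}

# Made-up sample URLs, standing in for a page of Google results.
my @sample = (
    'http://www.example.com/',
    'http://www.example.com/about',
    'http://cs.example.edu/~user/',
    'http://www.example.co.uk',
    'http://192.0.2.1/',
);

my $count = tally_suffixes(@sample);
printf "%-5s %d\n", uc($_), $count->{$_} for sort keys %$count;
```

Run from the command line, this prints one line per suffix, much as the CGI script prints one table row per suffix.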

© 2007 O'Reilly Media, Inc.