O'Reilly Hacks

HACK #37
Find the Largest Page
We all know about Feeling Lucky with Google. But how about Feeling Large?

The Code

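Save the following code as a CGI script named goolarge.cgi in your web server's cgi-bin directory (see "How to Run the Hacks" in the Preface). Be sure to replace the placeholder text "insert key here" with your Google API key. The script expects to find GoogleSearch.wsdl in the current directory (./GoogleSearch.wsdl), which for most web servers means alongside the script.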

#!/usr/local/bin/perl
# goolarge.cgi
# A take-off on "I'm Feeling Lucky": redirects the browser to the largest
# document (by cached size, in K) found in the first n results, where
# n = number of loops x 10 results per loop.
# goolarge.cgi is called as a CGI with form input.
     
# Your Google API developer's key.
my $google_key='insert key here';
     
# Location of the GoogleSearch WSDL file.
my $google_wdsl = "./GoogleSearch.wsdl";
     
# Number of times to loop, retrieving 10 results at a time.
my $loops = 10;
     
use strict;
     
use SOAP::Lite;
use CGI qw/:standard/;
     
# Display the query form.
unless (param('query')) {
  print
    header( ),
    start_html("GooLarge"),
    h1("GooLarge"),
    start_form(-method=>'GET'),
    'Query: ', textfield(-name=>'query'),
    '   ',
    submit(-name=>'submit', -value=>"I'm Feeling Large"),
    end_form( ), p( ), end_html( );
}
     
# Run the query.
else {
  my $google_search  = SOAP::Lite->service("file:$google_wdsl");
  # Start the running maximum at 0 so the first comparison is well defined.
  my ($largest_size, $largest_url) = (0, undef);
     
  # Offsets run 0, 10, ..., ($loops - 1) * 10, giving exactly $loops requests.
  for (my $offset = 0; $offset < $loops*10; $offset += 10) {
     
    my $results = $google_search -> 
      doGoogleSearch(
        $google_key, param('query'), $offset, 
        10, "false", "",  "false", "", "latin1", "latin1"
      );
     
    # Stop paging as soon as a request comes back with no results.
    last unless @{ $results->{'resultElements'} || [] };
     
    # Keep track of the largest size and its associated URL.
    foreach (@{$results->{'resultElements'}}) {
      substr($_->{cachedSize}, 0, -1) > $largest_size and
        ($largest_size, $largest_url) = 
        (substr($_->{cachedSize}, 0, -1), $_->{URL});
    }
  }
     
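  # Redirect the browser to the largest result, or say so if nothing was found.
  if (defined $largest_url) {
    print redirect($largest_url);
  }
  else {
    print header( ), start_html("GooLarge"), p('No results'), end_html( );
  }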
}
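
The core of the hack is the "keep the largest so far" step, which can be tried without a Google API key or a web server. The following is only a sketch under stated assumptions: the helper name largest_result and the sample data are made up for illustration, and each cachedSize is assumed to be a string such as "12k" (empty when Google has no cached copy). It needs Perl 5.10 or later for the // operator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original hack): returns the
# (size, url) pair of the largest element, or an empty list if no
# element carries a usable cachedSize.
sub largest_result {
    my @elements = @_;
    my ($largest_size, $largest_url) = (0, undef);

    for my $element (@elements) {
        # Assumption: cachedSize looks like "12k"; take the leading digits.
        my ($size) = ($element->{cachedSize} // '') =~ /^(\d+)/;
        next unless defined $size;

        # Same comparison the CGI script makes inside its foreach loop.
        ($largest_size, $largest_url) = ($size, $element->{URL})
            if $size > $largest_size;
    }

    return defined $largest_url ? ($largest_size, $largest_url) : ();
}

# Made-up sample data standing in for @{$results->{resultElements}}.
my @sample = (
    { URL => 'http://example.com/a', cachedSize => '12k'  },
    { URL => 'http://example.com/b', cachedSize => '101k' },
    { URL => 'http://example.com/c', cachedSize => ''     },
);

my ($size, $url) = largest_result(@sample);
print "$url ($size K)\n";
```

Matching the leading digits rather than chopping the last character with substr keeps the comparison numeric even when cachedSize is empty or missing, and returning an empty list gives the caller an easy way to detect the no-results case.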



© 2007 O'Reilly Media, Inc.