O'Reilly Hacks
Yahoo! Hacks
By Paul Bausch
October 2005

HACK #90
Compare the Popularity of Related Search Terms
By gluing together two different kinds of Yahoo! Web Services requests, you can find the most widely used alternate search terms for any given topic.
The Code

This hack relies on several nonstandard Perl modules that you might need to install before you can run the script. As with most examples in this book, LWP::Simple makes API requests and XML::Simple parses the response. In addition, URI::Escape is used to encode queries so they can be used in a URL, and Number::Format is used as a quick way to add commas to large numbers.
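
Before running the script, it can help to confirm that the four modules are actually present. The following is a minimal sketch and not part of the original hack; the module_available() helper is a hypothetical name introduced for illustration, and it uses only core Perl.

```perl
#!/usr/bin/perl
# Reports which of the modules this hack needs are installed.
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, else 0
sub module_available {
    my ($module) = @_;
    # Turn Some::Module into Some/Module.pm, the form require expects
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# The four modules named in the text
my @needed = qw(LWP::Simple XML::Simple URI::Escape Number::Format);

for my $module (@needed) {
    printf "%-16s %s\n", $module,
        module_available($module) ? "ok" : "missing (try: cpan $module)";
}
```

Any module reported as missing can usually be installed from CPAN before you try pop_related.pl.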

Because this script makes two different kinds of requests to the Yahoo! server, the code for accessing Yahoo! has been encapsulated in the query_yahoo( ) function, which returns a parsed XML response based on the parameters sent to it.

TIP

The function accepts a results value that tells Yahoo! how many search results to return. By changing this value, you can find more or fewer related search terms.
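
To see what query_yahoo( ) sends, and where the results value fits in, the URL construction can be sketched on its own without contacting Yahoo!. The build_yahoo_url() helper and the YourAppID placeholder below are hypothetical, and the regular expression is a core-Perl stand-in for URI::Escape that assumes ASCII input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: builds the request URL without fetching it,
# so you can inspect exactly what would be sent
sub build_yahoo_url {
    my ($appID, $type, $results, $query) = @_;

    # URL-encode the query: every character outside the unreserved
    # set becomes %XX (a stand-in for URI::Escape, ASCII only)
    (my $escaped = $query) =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;

    return "http://api.search.yahoo.com/WebSearchService/V1/$type"
         . "?appid=$appID&query=$escaped&results=$results";
}

# The two kinds of request the script makes: up to 25 related terms,
# then a single web result per term (only the total count is needed)
print build_yahoo_url("YourAppID", "relatedSuggestion", 25, "perl hacks"), "\n";
print build_yahoo_url("YourAppID", "webSearch", 1, '"perl hacks"'), "\n";
```

Raising or lowering the third argument in the relatedSuggestion call is all it takes to ask for more or fewer related terms.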

Save the following code to a file called pop_related.pl:

	#!/usr/bin/perl
	# pop_related.pl
	# Accepts a search term, finds related search terms, queries
	# Yahoo! to find the total results available for each, and
	# prints a report with the popularity of each related term.
	# Usage: pop_related.pl <query>
	#
	# You can create an AppID, and read the full documentation
	# for Yahoo! Web Services at http://developer.yahoo.net/
	
	use strict;
	use LWP::Simple;
	use XML::Simple;
	use URI::Escape; 
	use Number::Format;

	# Set your unique Yahoo! Application ID 
	my $appID = "insert your app ID";

	# Grab the incoming search query 
	my $query = join(' ', @ARGV) or die "Usage: pop_related.pl <query>\n";

	# Initialize some variables 
	my ($final_related, $final_total);

	# Define the file header 
	format STDOUT_TOP= 
				Related Search Terms

	query: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 
		   $query
	---------------------------------------------
	Related Search				Total
	---------------------------------------------
	.

	# Define the line-item details
	format STDOUT=
	@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @>>>>>>>>
	$final_related, $final_total
	.	

	# Make the API call with query_yahoo() function, URL-encoding
	# the query so spaces and punctuation survive in the URL
	my $yahoo_xml = &query_yahoo($appID,"relatedSuggestion",25,uri_escape($query));

	# Initialize results array
	my @popresults;

	# Loop through the related terms, looking up the total for each
	foreach my $related (@{$yahoo_xml->{Result}}) {
		# Quote the term to search for the exact phrase, then URL-encode it
		my $related_query = uri_escape("\"$related\"");

		# Make the API call with query_yahoo() function
		my $related_xml = &query_yahoo($appID,"webSearch",1,$related_query);

		# Grab Total Available Results for related term
		my $total = $related_xml->{totalResultsAvailable};

		# Store in a hash, add to results array
		my $thisRelated = {
			related => $related,
			total => $total,
		};
		push @popresults, $thisRelated;
	}

	# Sort the array by total, largest first
	@popresults = sort { $b->{total} <=> $a->{total} } @popresults;

	# And print the results
	my $x = Number::Format->new();
	for my $pop (@popresults) {
		$final_related = $pop->{related};
		# Add commas to the total before writing the line
		$final_total = $x->format_number($pop->{total});

		write;
	}

	# This function assembles a Y!WS URL and returns a parsed response
	sub query_yahoo {
		my ($appID,$type,$results,$query) = @_;

		# Construct a Yahoo! Search Query with only required options
		my $req_url = "http://api.search.yahoo.com/";
		$req_url .= "WebSearchService/V1/$type?";
		$req_url .= "appid=$appID";
		$req_url .= "&query=$query";
		$req_url .= "&results=$results";

		# Make the request, stopping if no response comes back
		my $yahoo_response = get($req_url)
			or die "Request failed: $req_url\n";

		# Parse the XML; ForceArray keeps Result a list even when
		# only one result is returned
		my $xmlsimple = XML::Simple->new();
		return $xmlsimple->XMLin($yahoo_response, ForceArray => ['Result']);
	}

The report is laid out with Perl's built-in format declarations and printed with the write function. The formats here are bound to standard output (STDOUT) on the command line, but you could just as easily bind them to another Perl filehandle if you'd rather print the results to a text file automatically.
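
Switching the report to a file could look something like the following. This is a sketch under stated assumptions rather than part of the original hack: the REPORT filehandle, the report.txt file name, and the sample data are all made up for illustration, and the totals are assumed to be formatted already.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the script's sorted results
my @popresults = (
    { related => "perl hacks",  total => "1,200,000" },
    { related => "yahoo hacks", total => "845,000" },
);

# Variables the format reads from, as in pop_related.pl
my ($final_related, $final_total);

# A format is matched to a filehandle by name, so the REPORT
# format is used whenever write(REPORT) is called
format REPORT =
@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @>>>>>>>>>>>>
$final_related, $final_total
.

# Open the REPORT filehandle on a file (report.txt is an assumed name)
open(REPORT, '>', 'report.txt') or die "Can't write report.txt: $!\n";

for my $pop (@popresults) {
    $final_related = $pop->{related};
    $final_total   = $pop->{total};
    write(REPORT);
}

close(REPORT) or die "Can't close report.txt: $!\n";
```

A header for the file would go in a format named REPORT_TOP, in the same way that STDOUT_TOP supplies the header for standard output.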

