Listing: go-ogle
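#!/usr/bin/perl
use strict;
use warnings;
use Socket;

$|++;    # unbuffered output, so each query prints as soon as it is seen

# ngrep: -l line-buffers stdout, -q is quiet mode, -i ignores case
open(NG, "ngrep -lqi '(GET|POST).*/(search|find)' |")
    or die "Can't run ngrep: $!\n";
print "Go ogle online.\n";

my $go   = 0;     # true once a header line has been printed for a request
my %host = ();    # cache of IP address -> hostname lookups

while (<NG>) {
    # Packet header line: "T source:port -> destination:80"
    if (/^T (\d+\.\d+\.\d+\.\d+):\d+ -> (\d+\.\d+\.\d+\.\d+):80/) {
        my ($src, $dst) = ($1, $2);
        for my $ip ($src, $dst) {
            # Resolve each address once; fall back to the bare IP
            $host{$ip} ||= gethostbyaddr(inet_aton($ip), AF_INET) || $ip;
        }
        print "$host{$src} -> $host{$dst} : ";
        $go = 1;
        next;
    }
    # Payload line carrying the query string
    if (/(q|p|query|for)=(.*)?(&|HTTP)/) {
        next unless $go;
        my $q = $2;
        $q =~ s/(\+|&.*)/ /g;                       # '+' is a space; drop later parameters
        $q =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # decode %XX escapes
        print "$q\n";
        $go = 0;
    }
}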
I call the script go-ogle. It runs ngrep, looking for any GET or POST
request whose URL contains a path component beginning with search or
find. The results look something like this:
Go ogle online.
caligula.nocat.net -> www.google.com : o'reilly mac os x conference
caligula.nocat.net -> s1.search.vip.scd.yahoo.com : junk mail $$$
tiberius.nocat.net -> altavista.com : babel fish
caligula.nocat.net -> 166-140.amazon.com : Brazil
livia.nocat.net -> 66.161.12.119 : lart
It unescapes encoded strings in the query (note the ' in the Google query and the
$$$ from Yahoo). It also converts IP addresses to hostnames for
you (ngrep doesn't seem to have that feature, probably so it can
optimize capturing for speed). The last two results are interesting:
the Brazil query was actually run on http://www.imdb.com/, and the
last one was sent to http://www.dictionary.com/.
Evidently IMDB is now in a partnership with Amazon, and
Dictionary.com's search machine
doesn't have a PTR record. It's
amazing how much you can learn about the world by watching other
people's packets.
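The parsing and unescaping steps can be tried without capturing any traffic. The following is a minimal sketch, not the script itself: decode_query and parse_header are hypothetical helper names, the two sample lines are made up to resemble ngrep output, and the sketch uses tr/+/ / plus a stricter two-hex-digit pattern for the escapes rather than the script's own substitutions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: undo the URL encoding of one query value.
sub decode_query {
    my ($q) = @_;
    $q =~ tr/+/ /;                              # '+' stands for a space
    $q =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX -> the byte it encodes
    return $q;
}

# Hypothetical helper: pull source and destination IPs out of an ngrep
# TCP header line; returns an empty list when the line is not a header.
sub parse_header {
    my ($line) = @_;
    return $line =~ /^T (\d+\.\d+\.\d+\.\d+):\d+ -> (\d+\.\d+\.\d+\.\d+):80/
        ? ($1, $2) : ();
}

# Made-up sample input, shaped like two lines of ngrep output.
my @sample = (
    "T 10.15.6.2:3412 -> 10.15.6.9:80 [AP]",
    "GET /search?q=o%27reilly+mac+os+x+conference&hl=en HTTP/1.0..",
);

my ($src, $dst) = parse_header($sample[0]);
my ($raw) = $sample[1] =~ /[?&]q=([^& ]*)/;     # value of the q parameter only
print "$src -> $dst : ", decode_query($raw), "\n";
```

Run as is, this should print 10.15.6.2 -> 10.15.6.9 : o'reilly mac os x conference. No hostname lookup is attempted here, so the addresses are shown as plain IPs.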
Note that you must be root to run ngrep, and for
best results, it should be run from the router at the edge of your
network.
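Since ngrep needs root, a script like this could check for it before starting the capture. The guard below is only a sketch: is_root is a hypothetical helper, and it assumes that an effective UID of 0 is what matters (some systems can grant capture privileges in other ways).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true when the given effective UID is root's (0).
# With no argument it checks $>, the effective UID of this process.
sub is_root {
    my ($euid) = @_;
    $euid = $> unless defined $euid;
    return $euid == 0;
}

# Assumption: warn rather than die, since capture rights may come from elsewhere.
warn "go-ogle: not running as root, ngrep will probably fail to capture\n"
    unless is_root();
```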
regex compile: Unmatched ) or \)
on linux-2.6.16-gentoo-r9
Pepa