An undocumented but powerful feature of Google’s search and API is the ability to search within a particular date range.
Before delving into the actual use of date-range searching, there are a few things you should understand. The first is this: a date-range search has nothing to do with the creation date of the content and everything to do with the indexing date of the content. If I create a page on March 8, 1999, and Google doesn’t get around to indexing it until May 22, 2002, for the purposes of a date-range search, the date in question is May 22, 2002.
The second thing is that Google can index a page several times, and the date attached to the page can change when it does. So don’t count on a date-range search staying consistent from day to day. The daterange: timestamp can change when a page is indexed more than once; whether it does depends on whether the content of the page has changed.
Third, Google doesn’t “stand behind” the results of a search done using the date-range syntaxes. So if you get a weird result, you can’t complain to them. Google would rather you use the date-range options on their advanced search page, but that page allows you to restrict your options only to the last three months, six months, or year.
Why would you want to search by daterange:? There are several reasons:
It narrows down your search results to fresher content. Google might find some obscure, out-of-the-way page and index it only once. Two years later this obscure, never-updated page is still turning up in your search results. Limiting your search to a more recent date range will result in only the most current of matches.
It helps you dodge current events. Say John Doe sets a world record for eating hot dogs and immediately afterward rescues a baby from a burning building. Less than a week after that happens, Google’s search results are going to be filled with John Doe. If you’re searching for information on (another) John Doe, babies, or burning buildings, you’ll scarcely be able to get rid of him.
However, you can avoid Mr. Doe’s exploits by setting the date-range syntax to before the hot dog contest. This also works well for avoiding recent, heavily covered news events such as a crime spree or a forest fire and annual events of at least national importance such as national elections or the Olympics.
It allows you to compare results over time; for example, if you want to search for occurrences of “Mac OS X” and “Windows XP” over time.
Of course, a count like this isn’t foolproof; indexing dates change over time. But generally it works well enough that you can spot trends.
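One way to set up such a comparison is to generate one query per month and note the result count for each by hand. The following is only a sketch under stated assumptions: the function names are hypothetical, and it relies on the daterange: syntax and Julian dates explained later in this hack, rounding each Julian date down to an integer.

```python
import calendar
from datetime import date
from typing import List

# Python's date.toordinal() counts days in the proleptic Gregorian calendar.
# The Julian date at midnight UTC is ordinal + 1721424.5; dropping the .5
# rounds it down to the integer that daterange: expects.
JD_OFFSET_FLOOR = 1721424


def julian_day_floor(d: date) -> int:
    """Julian date of a calendar day, rounded down (hypothetical helper)."""
    return d.toordinal() + JD_OFFSET_FLOOR


def monthly_queries(terms: str, year: int, first_month: int, last_month: int) -> List[str]:
    """Build one daterange: query per calendar month, inclusive."""
    queries = []
    for month in range(first_month, last_month + 1):
        last_day = calendar.monthrange(year, month)[1]
        start = julian_day_floor(date(year, month, 1))
        end = julian_day_floor(date(year, month, last_day))
        queries.append(f"{terms} daterange:{start}-{end}")
    return queries
```

Running monthly_queries('"Mac OS X"', 2002, 1, 6) and then the same call for "Windows XP" gives two sets of queries whose result counts you can compare month by month.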
Using the daterange: syntax is as simple as:
daterange:startdate-enddate
The catch is that the date must be expressed as a Julian date, a continuous count of days since noon UTC on January 1, 4713 BC. So, for example, the start of July 8, 2002 (midnight UTC) is Julian date 2452463.5 and the start of May 22, 1968 is 2439998.5. Furthermore, Google isn’t fond of decimals in its daterange: queries; use only integers: 2452463 or 2452464 (depending on whether you prefer to round down or up) for the July 8, 2002 example.
Tip
There are plenty of places you can convert Julian dates online. We’ve found a couple of nice converters at the U.S. Naval Observatory Astronomical Applications Department (http://aa.usno.navy.mil/data/docs/JulianDate.html) and Mauro Orlandini’s home page (http://www.tesre.bo.cnr.it/~mauro/JD/), the latter converting either Julian to Gregorian or vice versa. More may be found via a Google search for julian date (http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=julian+date).
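If you would rather do the conversion yourself, the arithmetic is short. The sketch below is an illustration, not an official converter: the function names are our own, and it assumes Gregorian calendar dates (Python's date type) and treats each day as starting at midnight UTC, which is where the .5 comes from.

```python
import math
from datetime import date

# Julian date at 00:00 UTC = proleptic Gregorian ordinal + 1721424.5
JD_AT_ORDINAL_ZERO = 1721424.5


def julian_date(d: date) -> float:
    """Julian date at the start (midnight UTC) of the given calendar day."""
    return d.toordinal() + JD_AT_ORDINAL_ZERO


def julian_integer(d: date, round_up: bool = False) -> int:
    """Integer form for daterange:, rounded down by default or up on request."""
    jd = julian_date(d)
    return math.ceil(jd) if round_up else math.floor(jd)


if __name__ == "__main__":
    print(julian_date(date(2002, 7, 8)))     # 2452463.5
    print(julian_date(date(1968, 5, 22)))    # 2439998.5
    print(julian_integer(date(2002, 7, 8)))  # 2452463
```

Whichever rounding direction you choose, use it for both ends of a range so the window keeps the length you intended.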
You can use the daterange: syntax with most other Google special syntaxes, with the exception of the link: syntax, which doesn’t mix well [Hack #8] with other special syntaxes [Section 1.5], and Google’s Special Collections [Chapter 2] (e.g., stocks: and phonebook:).
daterange: does wonders for narrowing your search results. Let’s look at a couple of examples. Geri Halliwell left the Spice Girls around May 27, 1998. If you wanted to get a lot of information about the breakup, you could try doing a date search in a ten-day window, say, May 25 to June 4. That query would look like this:
"Geri Halliwell" "Spice Girls" daterange:2450958-2450968
At this writing, you’ll get about two dozen results, including several news stories about the breakup. If you wanted to find less formal sources, search for Geri or Ginger Spice instead of Geri Halliwell.
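A query like the one above can be assembled from ordinary calendar dates. This is a minimal sketch with hypothetical helper names; it rounds Julian dates down, and the search URL simply reuses the http://www.google.com/search?q= form seen in the converter tip, so treat the URL shape as an assumption rather than a documented interface.

```python
from datetime import date
from urllib.parse import urlencode


def julian_day_floor(d: date) -> int:
    """Julian date at midnight UTC (ordinal + 1721424.5), rounded down."""
    return d.toordinal() + 1721424


def daterange_query(terms: str, start: date, end: date) -> str:
    """Append a daterange: restriction covering start through end."""
    if end < start:
        raise ValueError("end date must not precede start date")
    return f"{terms} daterange:{julian_day_floor(start)}-{julian_day_floor(end)}"


def search_url(query: str) -> str:
    """URL-encode the query for a browser (URL form is an assumption)."""
    return "http://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    q = daterange_query('"Geri Halliwell" "Spice Girls"',
                        date(1998, 5, 25), date(1998, 6, 4))
    print(q)  # "Geri Halliwell" "Spice Girls" daterange:2450958-2450968
    print(search_url(q))
```

Swapping the terms for Geri or Ginger Spice only changes the first argument; the date window stays the same.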
That example’s a bit on the silly side, but you get the idea. Any topic that you can clearly divide into before and after dates (an event, a death, an overwhelming change in circumstances) can be reflected in a date-range search.
You can also use an individual event’s date to change the results of a larger search. For example, former ImClone CEO Sam Waksal was arrested on June 12, 2002. You don’t have to search for the name Sam Waksal to get a very narrow set of results for June 13, 2002:
imclone daterange:2452439-2452439
Similarly, if you search for imclone before the date of 2452439, you’ll get very different results.
And as an interesting exercise, try a search that reflects the arrest, only end the date range a few days before the actual arrest:
imclone investigated daterange:2452000-2452435
This is a good way to find information or analysis that predates the actual event, but that provides background that might help explain the event itself. (Unless you use the date-range search, usually this kind of information is buried underneath news of the event itself.)
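When you are handed a range like the one above, it helps to check which calendar dates it actually covers. The sketch below uses hypothetical function names and assumes each integer is read as noon UTC on a Gregorian calendar day; because the integers come from rounding, the decoded date can be a day off from the one the searcher had in mind.

```python
from datetime import date
from typing import Tuple

# An integer Julian date n falls at noon UTC on the day whose
# proleptic Gregorian ordinal is n - 1721425.
NOON_OFFSET = 1721425


def calendar_date(julian_day: int) -> date:
    """Calendar day on which the given integer Julian date falls."""
    return date.fromordinal(julian_day - NOON_OFFSET)


def decode_daterange(term: str) -> Tuple[date, date]:
    """Turn 'daterange:START-END' into a (start, end) pair of dates."""
    prefix, _, span = term.partition(":")
    if prefix != "daterange" or "-" not in span:
        raise ValueError(f"not a daterange: term: {term!r}")
    start, end = span.split("-", 1)
    return calendar_date(int(start)), calendar_date(int(end))


if __name__ == "__main__":
    print(calendar_date(2452439))                         # 2002-06-13
    print(decode_daterange("daterange:2452000-2452435"))
```

Read this way, the background search runs from the end of March 2001 to June 9, 2002, which stops three days short of the arrest.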
But what about narrowing your search results based on content creation date?
Searching for materials based on content creation is difficult. There’s no standard date format (score one for Julian dates), many people don’t date their pages anyway, some pages don’t contain date information in their header, and some content management systems routinely stamp pages with today’s date, confusing things still further.
We can offer few suggestions for searching by content creation date. Try adding a string of common date formats to your query. If you wanted something from May 2003, for example, you could try appending:
("May * 2003" | "May 2003" | 05/03 | 05/*/03)
A query like that uses up most of your ten-word query limit, however, so it’s best to be judicious, perhaps by cycling through these formats one at a time. If any one of these is giving you too many results, try restricting your search to the title tag of the page.
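Typing these format strings by hand gets tedious if you try several months. As a sketch only (the function names and the hardcoded English month list are our assumptions), the following builds both the combined group shown above and the individual formats for cycling through one at a time.

```python
from typing import List

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def month_variants(year: int, month: int) -> List[str]:
    """Common ways a page might mention a given month and year."""
    name = MONTH_NAMES[month - 1]
    mm = f"{month:02d}"
    yy = f"{year % 100:02d}"
    return [
        f'"{name} * {year}"',  # e.g. "May 9, 2003" via the * wildcard
        f'"{name} {year}"',
        f"{mm}/{yy}",
        f"{mm}/*/{yy}",
    ]


def month_group(year: int, month: int) -> str:
    """All variants OR'ed together in one parenthesized group."""
    return "(" + " | ".join(month_variants(year, month)) + ")"


if __name__ == "__main__":
    print(month_group(2003, 5))
    # ("May * 2003" | "May 2003" | 05/03 | 05/*/03)
```

Appending a single entry from month_variants() to your query, rather than the whole group, keeps the word count down.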
If you’re feeling really lucky you can search for a full date, like May 9, 2003. You then have to decide whether to search for the date in the format above or as one of many variations: 9 May 2003, 9/5/2003, 9 May 03, and so forth. Exact-date searching will severely limit your results and shouldn’t be used except as a last-ditch option.
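If you do go down this road, a small helper can list the variations so you don’t miss one. This is a sketch with a hypothetical function name; it covers only the formats mentioned above, with English month names and day-first numeric dates assumed.

```python
from datetime import date
from typing import List

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def exact_date_variants(d: date) -> List[str]:
    """Spell one calendar date in the formats discussed above."""
    name = MONTH_NAMES[d.month - 1]
    yy = f"{d.year % 100:02d}"
    return [
        f"{name} {d.day}, {d.year}",    # May 9, 2003
        f"{d.day} {name} {d.year}",     # 9 May 2003
        f"{d.day}/{d.month}/{d.year}",  # 9/5/2003 (day first)
        f"{d.day} {name} {yy}",         # 9 May 03
    ]


if __name__ == "__main__":
    for variant in exact_date_variants(date(2003, 5, 9)):
        # Wrap multi-word variants in quotes to search them as phrases.
        print(f'"{variant}"')
```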
When using date-range searching, you’ll have to be flexible in your thinking, more general in your search than you otherwise would be (because the date-range search will narrow your results down a lot), and persistent in your queries because different dates and date ranges will yield very different results. But you’ll be rewarded with smaller result sets that are focused on very specific events and topics.