Fetching a URL from a Perl Script
Problem
You have a URL that you want to fetch from a script.
Solution
Use the get function from the CPAN module LWP::Simple, part of LWP.
use LWP::Simple;
$content = get($URL);
Discussion
The right library makes life easier, and the LWP modules are the right ones for this task.
The get function from LWP::Simple returns undef on error, so check for errors this way:
use LWP::Simple;
unless (defined ($content = get $URL)) {
    die "could not get $URL\n";
}
When it’s run that way, however, you can’t determine the cause of the error. For this and other elaborate processing, you’ll have to go beyond LWP::Simple.
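As a minimal sketch of what going beyond LWP::Simple can look like, the snippet below uses LWP::UserAgent directly so that the failure reason is available. The helper name fetch_with_reason and the 10-second timeout are assumptions made for illustration; they are not part of the recipe.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper (not from the recipe): returns (content, undef) on
# success, or (undef, status line) when the request fails.
sub fetch_with_reason {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 10);   # timeout is an assumed value
    my $response = $ua->get($url);                 # an HTTP::Response object
    return ($response->content, undef) if $response->is_success;
    return (undef, $response->status_line);        # status code plus message
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my ($content, $error) = fetch_with_reason($ARGV[0]);
    die "could not get $ARGV[0]: $error\n" if defined $error;
    print length($content), " bytes fetched\n";
}
```

Unlike the undef returned by get, the status line tells you whether the failure was, for example, a missing page or a server that could not be reached.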
Example 20.1 is a program that fetches a document remotely. If it fails, it prints out the error status line. Otherwise it prints out the document title and the number of bytes of content. We use three modules from LWP and one other from CPAN.
- LWP::UserAgent
This module creates a virtual browser. The object returned from the new constructor is used to make the actual request. We’ve set the name of our agent to “Schmozilla/v9.14 Platinum” just to give the remote webmaster browser-envy when they see it in their logs.
- HTTP::Request
This module creates a request but doesn’t send it. We create a GET request and set the referring page to a fictitious URL.
- HTTP::Response
This is the object type returned when the user agent actually runs the request. We check it for errors and contents.
- URI::Heuristic
This curious little module uses Netscape-style guessing ...
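The roles of the first three modules can be sketched roughly as follows. This is an illustration under stated assumptions, not the book's Example 20.1: the helper name build_request and the referring URL are hypothetical, and URI::Heuristic is left out because its description is cut short above.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# LWP::UserAgent: the "virtual browser", with the agent name from the text.
my $ua = LWP::UserAgent->new;
$ua->agent("Schmozilla/v9.14 Platinum");

# HTTP::Request: hypothetical helper that creates a GET request but does
# not send it, and sets the referring page.
sub build_request {
    my ($url, $referer) = @_;
    my $req = HTTP::Request->new(GET => $url);
    $req->referer($referer);
    return $req;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $url = $ARGV[0];
    # The referring URL below is a made-up placeholder.
    my $req = build_request($url, "http://example.com/fictitious.html");

    # HTTP::Response: returned when the user agent runs the request.
    my $response = $ua->request($req);
    if ($response->is_error) {
        print "$url: ", $response->status_line, "\n";
    } else {
        my $title = $response->title;
        $title = "(no title)" unless defined $title;
        printf "%s: %s (%d bytes)\n", $url, $title, length $response->content;
    }
}
```

Keeping request construction separate from sending it mirrors the division of labor described above: the request object can be inspected or modified before the user agent ever contacts the server.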