Programming with Perl Modules: Chapter 1

Introduction to Perl Modules and CPAN

What Are Packages?
So What's in a Name?
Packages and Symbol Tables
Package Constructors and Destructors
Use Versus Require
Object-Oriented Programming
Method Invocation
The CPAN Architecture

Perl modules are best described as batches of reusable code. Want to send an email message from your Perl program? You could write the code from scratch, or you can just use Net::SMTP. Want to give your script an elegant graphical interface? Take a look at the pTk module, which does just that.

The virtues extolled for Perl programmers are laziness, impatience, and hubris. Together, these admirable characteristics have led to the creation and use of many publicly accessible Perl modules. Because of laziness, programmers would rather write modules than repeat a procedure over and over (and would rather use modules written by other people than write new code from scratch). Because of impatience, programmers write consolidated code that is flexible enough to anticipate their future needs. And because of hubris, programmers share their triumphs with the rest of the Perl community and continually tweak their modules until they're the best they can be.

Recent Perl distributions include a variety of modules that perform a number of tasks, from parsing command-line arguments using the Getopt modules, to timing programs using Benchmark.

This chapter offers a conceptual overview of packages and modules in Perl and an introduction to the structure of the Comprehensive Perl Archive Network (CPAN).

If you're interested in writing your own Perl modules, refer to Chapter 13, Contributing to CPAN, which details the process of writing Perl modules and how to register with and distribute your contributions through CPAN.

What Are Packages?

Most people consider it rude to enter someone's home without knocking on the door. Even if you're a family member or close friend, you're probably imposing on someone's privacy if you don't alert them when you arrive.

Perl provides a mechanism to separate residents and guests, known as packages. A package can act like the front door to your house; you only invite people you know to come inside; you decide who can enter. People who live in your house extend the same courtesy to your neighbors by knocking before entering those people's homes.

Your residence and property might be compared to a package's namespace: when you buy a property, the mortgage is in your name--"Nathan owns this house."

So, what if I live in a duplex or condominium? The same applies. Although there may be 20 units in your building, each unit has its own address and door.

A package, then, is a namespace: it keeps code in one package from accidentally affecting variables in another.

The extent of the effects of the package statement includes everything from the package declaration through the end of the enclosing block, eval, end of file, or declaration of another package--whichever comes first. A package statement affects only dynamic variables (globals, even when local()ized), not lexical variables (declared with my()).
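
As a minimal sketch of these scoping rules, the following shows a package statement inside a block losing its effect at the closing brace. The package names Outer and Inner, the variables, and the where() subroutines are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

package Outer;
our $owner = 'Nathan';          # the global $Outer::owner

{
    package Inner;              # in effect only until the closing brace
    our $guest = 'Sue';         # the global $Inner::guest
    sub where { return __PACKAGE__ }
}

# The block has ended, so this code is compiled in package Outer again.
sub where { return __PACKAGE__ }

print join(' / ', Inner::where(), Outer::where()), "\n";
print "$Outer::owner / $Inner::guest\n";
```

Each where() reports the package it was compiled in, which is why the two calls give different answers even though both subroutines live in the same file.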

So What's in a Name?

As mentioned above, a package starts with a package statement; let's work with a package named BushWhack:

package BushWhack;

Let's add a subroutine called lawn_kid():

sub lawn_kid {
    my $lk = shift;
    print("$lk is a lawn kid.\n");
}

The code compiled in package BushWhack can access lawn_kid() without fully qualifying its name:

lawn_kid('Joe'); # or
BushWhack::lawn_kid('Joe\'s sister Sue');

Now let's add package LawnCare to the same file:

package LawnCare;

Bear in mind that it is confusing to have multiple packages in the same file. Look at this:

my $asleep = 151;
my $not_paying_attention = 20;
package DUH; print "$not_paying_attention, $asleep\n";
package WAKEUP; print "$not_paying_attention, $asleep\n";

Oops. $not_paying_attention is visible in both pieces of code, because a package declaration affects only dynamics (globals), not lexicals (my()s). Both packages could also have their own global $not_paying_attention, accessed as $DUH::not_paying_attention and $WAKEUP::not_paying_attention, respectively. But code compiled in those packages in a different scope (block, eval, file) can't get at the lexical $not_paying_attention from the scope above. And a lexical can't be qualified with a package namespace.
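
To make the distinction concrete, here is a small sketch in which one file-scoped lexical is shared by two packages while each package keeps its own global. The $status globals and the report() subroutines are hypothetical additions, not part of the example above.

```perl
use strict;
use warnings;

my $not_paying_attention = 20;   # lexical: belongs to the file, not to a package

package DUH;
$DUH::status = 'dozing';         # a package global, stored in DUH's symbol table
sub report { return "$not_paying_attention, $DUH::status" }

package WAKEUP;
$WAKEUP::status = 'alert';       # a different global with the same short name
sub report { return "$not_paying_attention, $WAKEUP::status" }

package main;
print DUH::report(), "\n";
print WAKEUP::report(), "\n";
```

Both report() subroutines see the same lexical because they share its file scope, but each sees only the global that was qualified with its own package name.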

The awful_chemical() subroutine is in package LawnCare:

sub awful_chemical {
        my $ac = shift;
        print("$ac is a(n) awful chemical.\n");
}

This function, however, can't be called from BushWhack in the same way. To call awful_chemical() from BushWhack, qualify it with the name of the package where the subroutine lives, that is, call it as LawnCare::awful_chemical(). Otherwise, you'll get an undefined subroutine error.

You can also create a nested package such as BushWhack::LawnCare. Nesting implies nothing about the order of name lookups, however: symbols are either local to the current package or must be fully qualified from the outer package down. In other words, there is no place in BushWhack where $LawnCare::variable refers to $BushWhack::LawnCare::variable.
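
The lookup rule for nested package names can be checked with a short sketch. The subroutines short_name() and full_name() are hypothetical helpers introduced only to show which variable each spelling reaches.

```perl
use strict;
use warnings;

package BushWhack::LawnCare;
our $variable = 'nested';

package BushWhack;

# Compiled in BushWhack, yet $LawnCare::variable still names a variable in the
# top-level package LawnCare, which has never been given a value.
sub short_name {
    return defined $LawnCare::variable ? $LawnCare::variable : 'undefined';
}

# Only the name qualified from the outer package down reaches the nested one.
sub full_name { return $BushWhack::LawnCare::variable }

package main;
print BushWhack::short_name(), "\n";
print BushWhack::full_name(), "\n";
```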

Packages and Symbol Tables

A package's namespace is a symbol table. The symbol table is stored in a hash named after the package with two colons appended: if you name a package BushWhack, its symbol table is %BushWhack::. The symbol table of the default package is %main::, or %:: for short. The keys of the hash are the identifiers in the package, and the value stored under each key is the corresponding typeglob.

In other words, *BushWhack::variable is the value of $BushWhack::{"variable"}. The typeglob notation is the more efficient of the two because it does the symbol table lookup at compile time:

local *low_flyer = *BushWhack::variable;       # compile time
local *low_flyer = $BushWhack::{"variable"};   # run time

You can look up all the keys and variables of a package with the following example. You may use undef() on these to free their memory, and they will then be reported as undefined. You shouldn't undefine anything here unless you don't plan to load these packages again. Because the memory has already been allocated, leaving them defined saves time when you load them again:[1]

foreach $symbol_name (sort keys %BushWhack::) {
        local *local_sym = $BushWhack::{$symbol_name};
        print "\$$symbol_name is defined\n" if defined $local_sym;
        # modern Perl rejects defined() on arrays and hashes,
        # so test for a non-empty aggregate instead
        print "\@$symbol_name is defined\n" if @local_sym;
        print "\%$symbol_name is defined\n" if %local_sym;
}
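
The same symbol table hash can be used to answer narrower questions. As a sketch under stated assumptions, the hypothetical helper subs_in_package() below lists only the subroutines a package defines; the contents given to BushWhack here are made up for the demonstration.

```perl
use strict;
use warnings;

package BushWhack;
our $variable = 42;
sub lawn_kid { return "$_[0] is a lawn kid." }

package main;

# Return the sorted names of all subroutines defined in the named package.
sub subs_in_package {
    my ($package) = @_;
    no strict 'refs';    # the symbol table is reached through a symbolic name
    return sort grep { defined &{"${package}::$_"} } keys %{"${package}::"};
}

print join(', ', subs_in_package('BushWhack')), "\n";
```

The key for $variable is also in %BushWhack::, but it is filtered out because no subroutine is defined under that name.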

Package Constructors and Destructors

The BEGIN and END routines act as package constructors and destructors. A BEGIN subroutine is executed as soon as it has been completely defined, before the rest of the file is even parsed; it's a way for the compiler to make a call into the interpreter.

Even if you have a subroutine call that appears before BEGIN, BEGIN still executes first:

package MakeRoom;
sub call_me_now { print "I'm gonna be first!\n"; } # umm, no
call_me_now();
BEGIN { print "See, told you that I'd be first.\n"; }
END { print "th-th-th-that's all folks.\n"; }

This outputs:

See, told you that I'd be first.
I'm gonna be first!
th-th-th-that's all folks.

Multiple BEGIN blocks are executed in the order they have been defined:

package Repetition;
sub call_me_now { 
        # err, no
        print "Hey, I *said* that I was going to be first!\n";
}
call_me_now();
BEGIN { print "Yeah, I'm first!\n"; }
BEGIN { print "And I'm next!\n"; }
END { print "Well, you've got nothing to complain about - I'm ",
                "last\n"; }

This outputs:

Yeah, I'm first!
And I'm next!
Hey, I *said* that I was going to be first!
Well, you've got nothing to complain about - I'm last

You can't call BEGIN; it's undefined as soon as it's finished running. Any code it uses returns to Perl's memory pool.

The END subroutine does what it says. Code contained in an END subroutine is executed as late as possible, when the interpreter is exiting, even if the interpreter is exiting because of a die().

A program can have multiple END blocks; they run in the reverse order of their definition, so the last END defined is executed first and the first END defined is executed last:[2]

END { print "Am I *really* first?\n"; }
$random_file = 'some_file.ext';
open(FOO, $random_file)
    or die("can't open $random_file: $!");
END { print "Am I *really* second?\n"; }

This program outputs:

can't open some_file.ext: No such file or directory at myscript line 4.
Am I *really* second?
Am I *really* first?


Modules

Unlike languages like C++ or Java, Perl doesn't use an explicit class declaration. A module may work like a class if you implement its subroutines as methods. A package can inherit methods from other packages by listing their names in its @ISA array.

So, what's a module? A module is a package stored in a file with the same name; it is intended to be reused. Modules can export symbols into their caller's package, but they don't have to: a module that works as a class makes its functionality available through method calls instead. Class modules can also export their symbols but typically should not.

Regardless of the mechanism you use to write a module or any other goodies, such as exporting symbols or creating objects, Perl modules have a .pm extension.[3]

Since you probably won't be writing your modules to be pragmas (compiler directives), you should capitalize module names. Since we must use the package name as its filename, such as Nathan::LastName, we'll use a filename like Nathan/ In this example, we'll be discussing Some::Module, contributed by Tom Christiansen.

Create a file called Some/, and insert the following into it:

package Some::Module;  # assumes Some/
use strict;
BEGIN {
   use Exporter   ();
   use vars       qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);
   # set the version for version checking
   $VERSION     = 1.00;
   @ISA         = qw(Exporter);
   @EXPORT      = qw(&func1 &func2 &func3);
   %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],
   # your exported package globals go here,
   # as well as any optionally exported functions
   @EXPORT_OK   = qw($Var1 %Hashit);
}
use vars      @EXPORT_OK;
# nonexported package globals go here
use vars      qw(@more $stuff);
# initialize package globals, first exported ones
$Var1   = '';
%Hashit = ();
# then the others (which are still accessible as $Some::Module::stuff)
$stuff  = '';
@more   = ();
# all file-scoped lexicals must be created before
# the functions below that use them.
# file-private lexicals go here
my $priv_var    = '';
my %secret_hash = ();
# here's a file-private function as a closure,
# callable as &$priv_func;  it cannot be prototyped.
my $priv_func = sub {
   # stuff goes here.
};
# make all your functions, whether exported or not;
# remember to put something interesting in the {} stubs
sub func1      {}    # no prototype
sub func2()    {}    # proto'd void
sub func3($$)  {}    # proto'd to 2 scalars
# this one isn't exported, but could be called!
sub func4(\%)  {}    # proto'd to 1 hash ref
END { }       # module clean-up code here (global destructor)
1;            # a module file must return a true value

Let's look at this example more closely.

  1. After the package declaration and enabling of the strict pragma, we use a BEGIN block to initialize @EXPORT_OK at compile time; we'll need it for use vars @EXPORT_OK later.

    BEGIN {
       use Exporter   ();

    We use the Exporter module as if we're going to import symbols, but we're not importing anything. When you follow the module name in a use statement with an empty list, (), you're telling Perl not to import any symbols into your current package. In this case, we're not importing any symbols from Exporter, just loading it in at compile-time.

  2. Next, we bring in some globals the Exporter needs.

       use vars       qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);
       # set the version for version checking
       $VERSION     = 1.00;

    @ISA tells the interpreter where to look for a method that can't be found in the current package; this is how Perl handles inheritance. One class (package) recursively inherits methods from all classes (packages) listed in its @ISA array. @ISA contains names of other packages; the packages are searched (depth-first, recursively) for the methods the interpreter is seeking. For example, we'll be using Exporter to handle importing.

       @ISA         = qw(Exporter);

  3. Now, let's say what we'll be exporting. In this case, it's functions that are exported by default (&func1, &func2, &func3) and package globals that are exported only on request ($Var1, %Hashit).

       @EXPORT      = qw(&func1 &func2 &func3);
       %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],
       # your exported package globals go here,
       # as well as any optionally exported functions
       @EXPORT_OK   = qw($Var1 %Hashit);
    }

  4. Finally, it's time to add the nonexported globals, lexicals, and function prototypes, and the global destructor.

    END { }
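
To see the effect of @EXPORT and @EXPORT_OK from the caller's side, here is a minimal sketch. The module My::Greeter and its functions are hypothetical, and the module is defined inline (with an %INC entry to mark it as already loaded) only so that the example fits in a single file; a real module would live in its own .pm file.

```perl
use strict;
use warnings;

BEGIN {
    package My::Greeter;                  # hypothetical module name
    use Exporter ();
    our @ISA       = qw(Exporter);
    our @EXPORT    = qw(hello);           # exported by default
    our @EXPORT_OK = qw(goodbye);         # exported only on request
    sub hello   { return "hello, $_[0]" }
    sub goodbye { return "goodbye, $_[0]" }
    $INC{'My/Greeter.pm'} = __FILE__;     # tell use() the module is already loaded
}

use My::Greeter;                          # imports hello() only

print hello('world'), "\n";
print My::Greeter::goodbye('world'), "\n";   # not imported, so fully qualified
```

Asking for goodbye() explicitly, with use My::Greeter qw(goodbye), would import it as well, because it is listed in @EXPORT_OK.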

Use Versus Require

The use statement implies a BEGIN block. The library module is loaded and symbols imported as soon as the use statement is compiled (even before the rest of the file). use allows modules to declare subroutines that are visible as list operators to the rest of the file. More important, it also makes visible prototypes from the module subroutines from that point onward. Of course, prototypes are compile-time only, so are ignored on method calls.

In other words, use and require (which is used for reading Perl libraries) are two different things.

You can use Perl modules in your program with:

use Module;
use Module LIST;
This is not the same as:

require "Module";
require "";
If you use require, you aren't importing anything from the module unless you explicitly call its import method. If you choose not to use use, you must do something like the following:

BEGIN { require ""; import Module; }

BEGIN { require ""; import Module LIST; }

Let's say you have a module called TestModule containing the function test_me_out(). If you choose to use require, you need to qualify the function with its package name in order to call it:

require TestModule;
$value = TestModule::test_me_out();

You can't call the function by its unqualified name after a plain require. Doing so results in a call to a function that doesn't exist, main::test_me_out():

require TestModule;
$value = test_me_out(); # wrong!

You can use use to import the names from TestModule and then call test_me_out():

use TestModule;
$value = test_me_out();
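
Because TestModule is only an illustration, the following sketch shows the same contrast with two modules that ship with Perl, List::Util and File::Basename. The choice of modules and functions is ours, not the chapter's.

```perl
use strict;
use warnings;

# Spelled out by hand: roughly what "use List::Util qw(sum);" does.
BEGIN {
    require List::Util;
    List::Util->import('sum');
}

print sum(1, 2, 3), "\n";    # sum() was imported at compile time

# A plain require loads the module at run time and imports nothing,
# so the function has to be called by its fully qualified name.
require File::Basename;
print File::Basename::basename('/home/nvp/Perl/Modules'), "\n";
```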

Object-Oriented Programming

Stop me if you've heard this one before:

[Language name here] is a revolutionary object-oriented programming language...


It makes your life easier when you're trying to generate a canvas filled with bouncing heads.

What does this have to do with object-oriented programming? And if anything, what does this have to do with Perl?

You can use Perl modules for object-oriented programming (OOP), but this doesn't mean you'll need to write (or even rewrite) your modules with object-oriented methodology in mind. Let's put Perl modules and OOP into perspective:

  • An object is simply a reference that happens to know which class it belongs to.

  • A class is simply a package that happens to provide methods to deal with object references.

  • A method is simply a subroutine that expects an object reference (or a package name, for class methods) as the first argument.

An Object Is Simply a Reference

Unlike C++ or Java, Perl doesn't have a predefined syntax for constructors. Perl constructors must allocate new memory, whereas C++ constructors just initialize memory that has already been allocated by the time they're called. In an object-oriented Perl module, a constructor is simply a subroutine that returns a reference to something that has been "blessed" into a class with bless(). bless() marks the thing a reference points to as belonging to a package, so the interpreter knows where to start looking for method definitions.

Here's a minimal case:

package FrothyMug;
sub new { bless {} }

{} returns a reference to a new anonymous hash, an empty one with no key/value pairs. bless() tells the hash that it's now a FrothyMug and returns the reference. It's the referenced thing that is blessed, not the reference itself, so the hash knows its class no matter which reference is used to reach it.

The same constructor can be written more verbosely:

sub new {
   my $self = {};
   bless $self;
   return $self;
}

The one-argument form always blesses into the current package, so a subclass that inherits new() would get objects of the parent class. You must use the two-argument form of bless if you plan on dealing with inheritance (which you probably will do sooner or later):

sub new {
   my $class = shift;
   my $self = {};
   bless $self, $class;
   return $self;
}

Remember the package examples we showed before? The function trolling() in a package GoFish can only be called if it's been fully qualified. Other packages must call this function (if they're allowed) with GoFish::trolling().

The scenario is similar here. A package's methods treat the reference as any other reference. Outside the package, the reference should only be accessed through the package's methods.
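
As a short sketch of why the two-argument form matters, the subclass below inherits the constructor and still gets objects of its own class. TallFrothyMug, the contents() method, and the contents field are hypothetical names added for the demonstration.

```perl
use strict;
use warnings;

package FrothyMug;

sub new {
    my $class = shift;
    my $self  = { contents => 'root beer' };
    return bless $self, $class;     # bless into whatever class was asked for
}

sub contents { my $self = shift; return $self->{contents} }

package TallFrothyMug;              # hypothetical subclass
our @ISA = ('FrothyMug');           # inherits new() and contents()

package main;

my $mug = TallFrothyMug->new;
print ref($mug), " holds ", $mug->contents, "\n";
```

With the one-argument form, ref($mug) would report FrothyMug, because the constructor was compiled in that package.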

A Class Is Simply a Package

C++ and Java use class declarations; Perl does not. You create a class by putting subroutine definitions and a package declaration into a file. Yes, it's that easy.

The interpreter uses @ISA (see the previous section, "Modules") to search for missing methods. When it finds a method in a base class, Perl caches the result; if you change @ISA or define new subroutines, the cache is invalidated and Perl looks the method up again. If the method isn't found in the object's package or in any package reachable through @ISA, Perl looks in UNIVERSAL. If that fails too, Perl repeats the search, this time looking for an AUTOLOAD routine: first in the object's package, then in the packages listed in its @ISA, and finally in UNIVERSAL. If an AUTOLOAD routine is found, it is called in place of the missing method; if not, the program exits with an error that the method can't be found.
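
The lookup order can be illustrated with a small sketch. The classes Animal and Dog and all of their methods are hypothetical; the point is only to show one method found directly, one found through @ISA, and one caught by AUTOLOAD.

```perl
use strict;
use warnings;

package Animal;

sub new   { my $class = shift; return bless {}, $class }
sub speak { return 'generic noise' }

our $AUTOLOAD;                       # Perl stores the missing method's name here
sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # don't intercept object destruction
    return "no method '$name', handled by AUTOLOAD";
}

package Dog;
our @ISA = ('Animal');
sub fetch { return 'fetching' }

package main;

my $dog = Dog->new;
print $dog->fetch, "\n";             # found in Dog itself
print $dog->speak, "\n";             # found in Animal through @ISA
print $dog->roll_over, "\n";         # found nowhere, so Animal::AUTOLOAD runs
```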

Perl only does method inheritance, that is, interface inheritance. Access to instance data is left to the class. This isn't a problem because most classes' objects use an anonymous hash, which is very much like the grassy areas in the heartland of the United States--the anonymous hash acts like a grassy field where herds of cattle (other classes) come to graze.

A Method Is Simply a Subroutine

Perl doesn't use any special syntax for method definition; a method is a subroutine. A method's first argument will be the object or package that invokes it:

meth Class;
meth $obj;
meth $obj_or_classname;

Class and instance methods

Class and object methods could be called static and instance methods, except that static is a fighting word in the Perl community. Class methods expect the name of the class as the first argument passed to the method. The constructor is an example of a class method. Many class methods simply ignore their first argument, because they already know which package they're in and don't care which package they were invoked through.

You can also use a class method to look up an object by name:

sub find_my_object_by_name {
   my ($class, $name) = @_;
   # look up and return the object stored under $name
}

An instance method expects an object as its first argument. Typically it shifts the first argument into a self or this variable, and then uses that as an ordinary reference:

sub display_widget {
   my $self = shift;
   my @keys = @_ ? @_ : sort keys %$self;
   foreach my $key (@keys) {
      print "\t$key => $self->{$key}\n";
   }
}
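
Putting the two kinds of method together, here is a sketch of a class that registers its objects by name. The Widget class, its %registry table, and the describe() method are assumptions made for the example; only find_my_object_by_name() takes its name from the text above.

```perl
use strict;
use warnings;

package Widget;

my %registry;                        # file-private table of objects by name

# Class method: the first argument is the class name.
sub new {
    my ($class, %args) = @_;
    my $self = bless { %args }, $class;
    $registry{ $args{name} } = $self;
    return $self;
}

# Class method: looks an object up by name.
sub find_my_object_by_name {
    my ($class, $name) = @_;
    return $registry{$name};
}

# Instance method: the first argument is the object itself.
sub describe {
    my $self = shift;
    return join ', ', map { "$_ => $self->{$_}" } sort keys %$self;
}

package main;

Widget->new(name => 'knob', color => 'red');
my $widget = Widget->find_my_object_by_name('knob');
print $widget->describe, "\n";
```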

Method Invocation

There are two ways to invoke a method; we'll cover both of them in this section. Let's say that we have two statements:

$object = method Class "Whatever";
method_2 $object 'Param 1', 'Param 2';

We can combine these statements into one with a BLOCK in the indirect object slot:

method_2 { method Class "Whatever" } 'Param 1', 'Param 2';

Those of you who salivate over C++ (or even at perl -e 'print "\007";') will probably like the -> notation, which does the same as the above. You'll need to use parentheses if you'll be passing any arguments:

$object = Class->method("Whatever");
$object->method_2('Param 1', 'Param 2');

Yes, the parentheses are important. Freedom is nice, but it's not always appropriate to let things hang out, particularly when this causes your program to act unreliably. You should probably avoid coding techniques such as:

$parrot = Bird->noisy("Shh"), 'be', 'quiet';
$parrot->shoot(times => 5), pain => 'likely';

And shamefully I must admit that:

m1 $ob->m2;

parses as:

$ob->m1->m2;

not as:

$ob->m2->m1;
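
Whichever syntax you choose, the invocant always arrives as the method's first argument. The sketch below, which borrows the Bird and noisy() names from the lines above but invents their implementation, compares an arrow call with calling the same subroutine by hand.

```perl
use strict;
use warnings;

package Bird;

sub new   { my ($class, %args) = @_; return bless { %args }, $class }
sub noisy { my ($self, @words) = @_; return join ' ', $self->{name}, 'says', @words }

package main;

my $parrot = Bird->new(name => 'Polly');

# Method call: Perl finds noisy() for us and passes $parrot as the first argument.
my $via_arrow = $parrot->noisy('Shh', 'be', 'quiet');

# Plain subroutine call: we name the package and pass the object ourselves.
# This skips inheritance, so it is shown only to expose the mechanics.
my $via_sub = Bird::noisy($parrot, 'Shh', 'be', 'quiet');

print "$via_arrow\n";
print "$via_sub\n";
```

All three words reach the method here because they are inside the parentheses, which is exactly what the comma-laden examples above fail to do.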


The CPAN Architecture

The Comprehensive Perl Archive Network represents the development interests of a cross-section of the Perl community, including Perl utilities, modules, documentation, and (of course) the Perl distribution itself. CPAN was created by Jarkko Hietaniemi and Andreas Koenig.

The Perl Resource Kit contains a complete CPAN distribution, so access to the Perl modules discussed in this book is at your fingertips. See the accompanying Perl Utilities Guide for more information on how to install modules from the Perl Resource Kit CD.

CPAN Mirrors

CPAN is designed to be supported and maintained at many identified sites, or mirrors, across the globe. This ensures that anyone with an Internet connection can have reliable access to its contents at any time. Since the structure of all CPAN sites is the same, a user searching for the current version of Perl can be sure that the latest.tar.gz file is the same on every site.

Users can connect directly to a CPAN site if they know its address. However, to make CPAN easier to use, site maintainers have also developed a multiplexor that automates downloading from CPAN sites. If a user visiting a multiplexed site selects a file, the multiplexor connects to the CPAN site best suited to the user.


How Is CPAN Organized?

CPAN materials are categorized by Perl modules, distributions, documentation, announcements, ports, scripts, and contributing authors. Each category is linked with related categories. For example, links to a graphing module written by an author appear in both the CPAN modules and author areas.

Most CPAN materials are distributed "tar-gzipped." tar and gzip are popular UNIX data-archiving formats. Non-UNIX users must first download software that can extract these files. One such program for Microsoft Windows 95 is Winzip.

Since CPAN provides the same offerings worldwide, the directory structure has been standardized so files can be located in the same location in the directory hierarchy at all CPAN sites. All CPAN sites use CPAN as the root directory, from which the user can select a specific Perl item. The CPAN snapshot that appears on your CD-ROM contains the same directory structure, starting with a CPAN directory.

From the CPAN directory you have the following choices:

Current directory is CPAN
CPAN.html       An HTML formatted CPAN info page
ENDINGS         Describes what the ".tgz" file extensions mean
MIRRORED.BY     A list of sites mirroring CPAN
MIRRORING.FROM  A list of sites mirrored by CPAN
README          A brief description of what you'll find on CPAN
README.html     An HTML formatted version of the README file
RECENT          Recent additions to the CPAN site
RECENT.DAY      Recent additions to the CPAN site (daily)
RECENT.html     An HTML formatted list of recent additions
RECENT.WEEK     Recent additions to the CPAN site (weekly)
ROADMAP         What you'll find on CPAN and where
ROADMAP.html    An HTML formatted version of ROADMAP
SITES           An exhaustive list of CPAN sites
SITES.html      An HTML formatted version of SITES
authors         A list of CPAN authors
clpa            An archive of comp.lang.perl.announce
doc             Various Perl documentation, FAQs, etc.
indices         All that is indexed.
latest.tar.gz   The latest Perl distribution sources
misc            Misc Perl stuff like Larry Wall quotes and gifs
modules         Modules for Perl version 5
other-archives  Other things yet uncategorized
ports           Various Perl ports
scripts         Various scripts appearing in Perl books
src             The Perl sources from various versions
The directory we're most concerned with is modules. It categorizes modules in three ways:

by-author   Modules organized by author's registered CPAN name
by-category Modules categorized by subject matter (see below)
by-module   Modules categorized by namespace (i.e., MIME)
In CPAN, Perl modules are currently organized into 21 categories. Each category is linked to contributors and related modules. The modules chosen for discussion in this book fit into many of these categories.

Once you've chosen the area from which you'd like to download a module, you should tell your ftp client to request a directory listing for the area. You'll find a list of files in the directory; tar files have a .tar.gz extension and README files have a .readme extension.

Here's a sample directory listing from a CPAN site:

CGI-Response-0.03.readme@       CGI_Imagemap-1.00.readme@
CGI-Response-0.03.tar.gz@       CGI_Imagemap-1.00.tar.gz@
CGI-modules-2.75.tar.gz@        DOUGM@
CGI-modules-2.76.readme@        LDS@
CGI-modules-2.76.tar.gz@        MGH@             MIKEH@             MUIR@             SHGUN@
If your ftp client supports inline viewing of files on an ftp server, select the .readme file of the most current archive and review its contents carefully. README files often give special instructions about building the module, list other modules needed for proper functioning, and tell you if the module can't be built under certain versions of Perl.

How Do I Install the Module?

Most system administrators install popular software so that it can be executed globally. When you log in to your account, your system administrator might even announce software installations or upgrades in the login message. Perl modules can also follow this pattern. Since many Perl modules are useful to everyone, the modules are installed so they can be used globally, generally in a branch of the lib directory with the rest of the Perl libraries.

If you have root privileges or write access to the locations where Perl modules are installed on your system, you can easily follow these steps when installing most modules:

perl Makefile.PL
make test
make install
If you don't have write permission to global areas (e.g., if you have your UNIX account with an ISP), you'll probably have to install your modules locally. You might also install modules locally if you wish to test a module in your home directory before installing for the world at large. To install a module locally, you must pass the PREFIX argument to Perl when generating a Makefile from Makefile.PL. The PREFIX argument tells MakeMaker to use the directory following PREFIX as the base directory when installing the module.

For example, to install a module in the directory /home/nvp/Perl/Modules, the PREFIX argument would look like:

perl Makefile.PL PREFIX=/home/nvp/Perl/Modules
Then you would follow the same steps as above:

make test
make install
You now have one more step. Since Perl generally looks in systemwide areas for modules, it won't find local modules unless you tell Perl where to find them. Otherwise, you'll receive an error message like the following:

Can't locate <ModuleName>.pm in @INC.
BEGIN failed--compilation aborted.
For example, if the module has been installed in /home/nvp/Perl/Modules, you need to tell Perl to look in that location with use lib 'path':

#!/usr/local/bin/perl -w
use lib '/home/nvp/Perl/Modules';
use ModuleName;
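
What use lib actually changes is @INC, the list of directories Perl searches for modules. The following sketch reuses the directory from the example above and simply prints where Perl will look first; nothing needs to be installed there for it to run.

```perl
#!/usr/local/bin/perl -w
use strict;
use lib '/home/nvp/Perl/Modules';    # directory from the example above

# use lib runs at compile time and puts its directories at the front of @INC,
# so local modules are found before any systemwide copies.
print "Perl will search $INC[0] first\n";
```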

Where Is the Module Documented?

Many of the modules you'll be interested in are covered in this book. However, there is often also documentation provided by the module authors themselves, written in a special format called pod. Most of the pod documentation for CPAN modules is printed in the Perl Module Reference, Volumes 1 and 2.

"Pod" stands for "plain old documentation." If you are familiar with mark-up languages like HTML, you won't have a difficult time understanding pod. Pod is plain text marked up with special tags and embedded in a Perl module or script; humans can read it without running it through a formatter. Perl skips over pod when the script is executed, so programmers may also use pod tags as multiline comments. You'll find several examples of auto-generating pod tags in Chapter 13, Contributing to CPAN.

Pod files are installed into a subdirectory of the Perl lib directory, Pod, which contains the base manpages included with your Perl distribution. You can view these pages by using the perldoc command or by converting the pod file to the format of your choice by using one of the pod2XXX tools (e.g., pod2text, which converts to plain text; pod2html, which converts to HTML; or pod2man, which converts to standard manpage (n/troff) format).
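
For readers who haven't seen pod before, here is a minimal sketch of a script with pod embedded in it. The greet() function and the documentation text are hypothetical; the =head1, =pod, and =cut tags are standard pod.

```perl
use strict;
use warnings;

=head1 NAME

greet - a hypothetical example of a documented script

=head1 DESCRIPTION

Perl skips everything from a pod tag down to the next =cut line.

=cut

sub greet { return "hello, $_[0]" }

=pod

Pod used as a multiline comment: this text is never executed.

=cut

print greet('CPAN'), "\n";
```

Saved to a file, the same script can be handed to one of the pod2XXX tools mentioned above, such as pod2text, to extract just the documentation.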

For the nonstandard modules installed on your system, you can also use the perldoc command. For example:

perldoc CGI
shows the pod documentation for the module.

Most modules are also distributed with manpages formatted in nroff or troff, in which case you can use the man command. If the man command fails, either the manpages were not installed or the location of the Perl manpages is not in your MANPATH. If the latter is the case, ask your system administrator to add the location of the Perl manpages to your system's global manpath.

How Do I Know What Modules Are Installed on My System?

Each time a module is installed globally, information gets appended to perllocal.pod. This file contains the date, the location, the linktype (dynamic versus static), and the version of the module installed, as well as information about any executables installed with the module. You can parse this file using one of the pod-conversion tools previously mentioned. You can also use the CPAN setup tool discussed in Chapter 2 of the Perl Utilities Guide.
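
A different and complementary check, which does not read perllocal.pod at all, is to ask Perl whether it can load a given module. The helper module_info() below is a hypothetical sketch of that approach; File::Basename is used only because it ships with Perl, and No::Such::Module is assumed not to exist on your system.

```perl
use strict;
use warnings;

# Return the version and file location of a module, or nothing if Perl can't load it.
sub module_info {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;     # Some::Module -> Some/Module.pm
    return unless eval { require $file; 1 };
    return { path => $INC{$file}, version => $module->VERSION };
}

for my $module ('File::Basename', 'No::Such::Module') {
    if (my $info = module_info($module)) {
        my $version = defined $info->{version} ? $info->{version} : 'unknown';
        print "$module $version is installed at $info->{path}\n";
    }
    else {
        print "$module is not installed\n";
    }
}
```

Unlike perllocal.pod, this reports only what the current perl can find through @INC, so it also reflects modules installed locally and made visible with use lib.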

[1] Warning: this counterintuitive behavior of defined() on aggregates may be changed, fixed, or broken in a future release of Perl.

[2] ENDs can be circumvented by signals that you have to trap on your own.

[3] A module can also call dynamically linked executables or autoload subroutines associated with the module, but this is transparent to the user.


Copyright 1998, O'Reilly & Associates, Inc.