Perl has excellent tools for creating, testing, and distributing modules. On the other hand, Perl’s good for writing standalone programs that don’t need anything else to be useful. I want my programs to be able to use the module development tools and be testable in the same way as modules. To do this, I restructure my programs to turn them into modulinos.
Other languages aren’t as DWIM as Perl, and they make us create a top-level subroutine that serves as the starting point for the application. In C or Java, I have to name this subroutine main:

    /* hello_world.c */
    #include <stdio.h>

    int main ( void ) {
        printf( "Hello C World!\n" );
        return 0;
    }
Perl, in its desire to be helpful, already knows this and does it for me. My entire program is the main routine, which is how Perl ends up with the default package main. When I run my Perl program, Perl starts to execute the code it contains as if I had wrapped my main subroutine around the entire file.
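As a quick check of that default, here is a minimal sketch that is not part of the program I build in this chapter. Code at the top level of a file is compiled into package main unless I say otherwise, so I can reach a top-level subroutine through its fully qualified name. The subroutine name hello is only an illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# No package statement here, so Perl compiles this into package main.
sub hello { return "Hello Perl World!" }

print "Current package: ", __PACKAGE__, "\n";
print main::hello(), "\n";
```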
In a module most of the code is in methods or subroutines, so most of it doesn’t immediately execute. I have to call a subroutine to make something happen. Try that with your favorite module; run it from the command line. In most cases, you won’t see anything happen. I can use perldoc’s -l switch to locate the actual module file so I can run it to see nothing happen:

    $ perldoc -l Astro::MoonPhase
    /usr/local/lib/perl5/site_perl/5.8.7/Astro/MoonPhase.pm
    $ perl /usr/local/lib/perl5/site_perl/5.8.7/Astro/MoonPhase.pm
I can write my program as a module and then decide at runtime how to treat the code. If I run my file as a program, it will act just like a program, but if I include it as a module, perhaps in a test suite, then it won’t run the code and it will wait for me to do something. This way I get the benefit of a standalone program while using the development tools for modules.
My first step takes me backward in Perl evolution. I need to get that main routine back and then run it only when I decide I want to run it. For simplicity, I’ll do this with a “Just another Perl hacker” (JAPH) program, but develop something more complex later.
Normally, Perl’s version of “Hello World” is simple, but I’ve thrown in package main just for fun and used the string “Just another Perl hacker,” instead. I don’t need the package statement for anything other than reminding the next maintainer what the default package is. I’ll use this idea later:

    #!/usr/bin/perl
    package main;

    print "Just another Perl hacker, \n";
Obviously, when I run this program, I get the string as output. I don’t want that in this case though. I want it to behave more like a module so when I run the file, nothing appears to happen. Perl compiles the code, but doesn’t have anything to execute. I wrap the entire program in its own subroutine:

    #!/usr/bin/perl
    package main;

    sub run {
        print "Just another Perl hacker, \n";
    }

The print statement won’t run until I execute the subroutine, and now I have to figure out when to do that. I have to know how to tell the difference between a program and a module.
The caller built-in tells me about the call stack, which lets me know where I am in Perl’s descent into my program. Programs and modules can use caller, too; I don’t have to use it in a subroutine. If I use caller in the top level of a file I run as a program, it returns nothing because I’m already at the top level. That’s the root of the entire program. Since I know that for a file I use as a module caller returns something and that when I call the same file as a program caller returns nothing, I have what I need to decide how to act depending on how I’m called:

    #!/usr/bin/perl
    package main;

    run() unless caller();

    sub run {
        print "Just another Perl hacker, \n";
    }
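To watch caller change its answer, I can use a sketch like the following, which assumes a Unix-like system where the list form of a piped open is available. It writes a throwaway file whose top level reports what caller returns, then runs that file once as a program and once through require. The names Probe.pm, write_probe, and output_of are invented for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Spec;
use File::Temp qw(tempdir);

# Write a throwaway file whose top level reports what caller() returns.
# The name Probe.pm is made up for this illustration.
sub write_probe {
    my $dir  = tempdir( CLEANUP => 1 );
    my $file = File::Spec->catfile( $dir, 'Probe.pm' );

    open my $fh, '>', $file or die "Could not write '$file': $!";
    print {$fh} <<'PROBE';
package Probe;
my @caller = caller();
print @caller ? "loaded by package $caller[0]\n" : "run as a program\n";
1;
PROBE
    close $fh or die "Could not close '$file': $!";

    return $file;
}

# Run a command without a shell and return everything it prints.
sub output_of {
    my @command = @_;

    open my $pipe, '-|', @command or die "Could not run '@command': $!";
    my $output = do { local $/; <$pipe> };
    close $pipe;

    return $output;
}

my $file = write_probe();

# As a program: caller() returns an empty list at the top level.
print output_of( $^X, $file );

# As a module: require supplies a caller, here package main of the -e code.
print output_of( $^X, '-e', 'require $ARGV[0]', $file );
```

The first line of output should be "run as a program" and the second "loaded by package main", which is exactly the difference that run() unless caller() relies on.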
I’m going to save this program in a file, but now I have to decide how to name it. Its dual nature doesn’t suggest a file extension, but I want to use this file as a module later, so I could go along with the module file-naming convention, which adds a .pm to the name. That way, I can use it and Perl can find it just as it finds other modules. Still, the terms program and module get in the way because it’s really both. It’s not a module in the usual sense, though, and I think of it as a tiny module, so I call it a modulino.
Now that I have my terms straight, I save my modulino as Japh.pm. It’s in my current directory, so I also want to ensure that Perl will look for modules there (i.e., it has “.” in the search path). Perl 5.26 and later no longer include “.” by default, so with those versions I add -I. to the command line. I check the behavior of my modulino. First, I use it as a module. From the command line, I can load a module with the -M switch. I use a “null program,” which I specify with the -e switch. When I load it as a module nothing appears to happen:

    $ perl -MJaph -e 0
    $
Perl compiles the module and then goes through the statements it can execute immediately. It executes caller, which returns a list describing the code that loaded my modulino (its package, filename, and line number). Since this is a true value, the unless catches it and doesn’t call run(). I’ll do more with this in a moment.
Now I want to run Japh.pm as a program. This time, caller returns nothing because it is at the top level. Since that is false, the unless lets the statement through, so Perl invokes run() and I see the output. The only difference is how I called the file. As a module it does module things, and as a program it does program things. Here I run it as a script and get output:

    $ perl Japh.pm
    Just another Perl hacker, 
    $
Now that I have the basic framework of a modulino, I can take advantage of its benefits. Since my program doesn’t execute if I include it as a module, I can load it into a test program without it doing anything. I can use all of the Perl testing framework to test programs, too.
If I write my code well, separating things into small subroutines that only do one thing, I can test each subroutine on its own. Since the run subroutine does its work by printing, I use Test::Output to capture standard output and compare the result:

    use Test::More tests => 2;
    use Test::Output;

    use_ok( 'Japh' );

    stdout_is( sub{ main::run() }, "Just another Perl hacker, \n" );
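Test::Output comes from CPAN. Where I can’t count on it being installed, a sketch like this one captures printed output with core Perl only, by temporarily selecting an in-memory filehandle. This is my own illustration and not how Test::Output is implemented: capture_stdout is a hypothetical helper, and run is a stand-in copy of the modulino’s subroutine so the example runs on its own.

```perl
use strict;
use warnings;
use Test::More tests => 1;

# Stand-in for the modulino's run(), copied here to keep the sketch self-contained.
sub run { print "Just another Perl hacker, \n" }

# Hypothetical helper: call $code and return whatever it prints
# to the default output handle.
sub capture_stdout {
    my $code = shift;

    my $buffer = '';
    open my $fh, '>', \$buffer or die "Could not open in-memory handle: $!";

    my $previous = select $fh;   # print with no filehandle now goes to $fh
    $code->();
    select $previous;            # put the original default handle back
    close $fh;

    return $buffer;
}

is( capture_stdout( \&run ), "Just another Perl hacker, \n", 'run() prints the message' );
```

The helper only catches output sent to the selected handle, so a print to an explicit STDOUT would slip past it, and it does not restore the handle if the code dies. For real test suites I would still reach for Test::Output.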
This way, I can test each part of my program until I finally put everything together in my run() subroutine, which now looks more like what I would expect from a program in C, where the main routine calls everything in the right order.
There are a variety of ways to make a Perl distribution, and we covered these in Chapter 15 of Intermediate Perl. If I start with a program that I already have, I like to use my scriptdist program, which is available on CPAN (and beware, because everyone seems to write this program for themselves at some point). It builds a distribution around the program based on templates I created in ~/.scriptdist, so I can make the distro any way that I like, which also means that you can make it any way that you like, not just my way. At this point, I need the basic tests and a Makefile.PL to control the whole thing, just as I do with normal modules. Everything ends up in a directory named after the program but with .d appended to it. I typically don’t use that directory name for anything other than a temporary placeholder since I immediately import everything into source control. Notice I leave myself a reminder that I have to change into the directory before I do the import. It only took me 50 or 60 times to figure that out:

    $ scriptdist Japh.pm
    Home directory is /Users/brian
    RC directory is /Users/brian/.scriptdist
    Processing Japh.pm...
    Making directory Japh.pm.d...
    Making directory Japh.pm.d/t...
    RC directory is /Users/brian/.scriptdist
    cwd is /Users/brian/Dev/mastering_perl/trunk/Scripts/Modulinos
    Checking for file [.cvsignore]... Adding file [.cvsignore]...
    Checking for file [.releaserc]... Adding file [.releaserc]...
    Checking for file [Changes]... Adding file [Changes]...
    Checking for file [MANIFEST.SKIP]... Adding file [MANIFEST.SKIP]...
    Checking for file [Makefile.PL]... Adding file [Makefile.PL]...
    Checking for file [t/compile.t]... Adding file [t/compile.t]...
    Checking for file [t/pod.t]... Adding file [t/pod.t]...
    Checking for file [t/prereq.t]... Adding file [t/prereq.t]...
    Checking for file [t/test_manifest]... Adding file [t/test_manifest]...
    Adding [Japh.pm]...
    Copying script...
    Opening input [Japh.pm] for output [Japh.pm.d/Japh.pm]
    Copied [Japh.pm] with 0 replacements
    Creating MANIFEST...
    ------------------------------------------------------------------
    Remember to commit this directory to your source control system.
    In fact, why not do that right now? Remember, `cvs import` works
    from within a directory, not above it.
    ------------------------------------------------------------------
Inside the Makefile.PL I only have to make a few minor adjustments to the usual module setup so it handles things as a program. I put the name of the program in the anonymous array for EXE_FILES and ExtUtils::MakeMaker will do the rest. When I run make install, the program ends up in the right place (also based on the PREFIX setting). If I want to install a manpage, instead of using MAN3PODS, which is for programming support documentation, I use MAN1PODS, which is for application documentation:

    use ExtUtils::MakeMaker;

    # assumed value; the excerpt uses $script_name without showing where it is set
    my $script_name = 'Japh.pm';

    WriteMakefile(
        'NAME'      => $script_name,
        'VERSION'   => '0.10',

        # programs to install alongside the module files
        'EXE_FILES' => [ $script_name ],

        'PREREQ_PM' => {},

        # section 1 manpage, for application documentation
        'MAN1PODS'  => {
            $script_name => "\$(INST_MAN1DIR)/$script_name.1",
            },

        clean => { FILES => "*.bak $script_name-*" },
        );
An advantage of EXE_FILES is that ExtUtils::MakeMaker modifies the shebang line to point to the path of the perl binary that I used to run Makefile.PL. I don’t have to worry about the location of perl.
Once I have the basic distribution set up, I start off with some basic tests. I’ll spare you the details since you can look in scriptdist to see what it creates. The compile.t test simply ensures that everything at least compiles. If the program doesn’t compile, there’s no sense going on. The pod.t file checks the program documentation for Pod errors (see Chapter 15 for more details on Pod), and the prereq.t test ensures that I’ve declared all of my prerequisites with Perl. These are the tests that clear up my most common mistakes (or, at least, the most common ones before I started using these test files with all of my distributions).
Before I get started, I’ll check to ensure everything works correctly. Now that I’m treating my program as a module, I’ll test it every step of the way. The program won’t actually do anything until I run it as a program, though:

    $ cd Japh.pm.d
    $ perl Makefile.PL; make test
    Checking if your kit is complete...
    Looks good
    Writing Makefile for Japh.pm
    cp Japh.pm blib/lib/Japh.pm
    cp Japh.pm blib/script/Japh.pm
    /usr/local/bin/perl "-MExtUtils::MY" -e "MY->fixin(shift)" blib/script/Japh.pm
    /usr/local/bin/perl "-MTest::Manifest" "-e" "run_t_manifest(0,↲
    'blib/lib', 'blib/arch', )"
    Level is
    Test::Manifest::test_harness found [t/compile.t t/pod.t t/prereq.t]
    t/compile....ok
    t/pod........ok
    t/prereq.....ok
    All tests successful.
    Files=3, Tests=4, 6 wallclock secs ( 3.73 cusr + 0.48 csys = 4.21 CPU)

Now that I have all of the infrastructure in place, I want to further develop the program. Since I’m treating it as a module, I want to add additional subroutines that I can call when I want it to do the work. These subroutines should be small and easy to test. I might even be able to reuse these subroutines by simply including my modulino in another program. It’s just a module, after all, so why shouldn’t other programs use it?
First, I move away from a hardcoded message. I’ll do this in baby steps to illustrate the development of the modulino, and the first thing I’ll do is move the actual message to its own subroutine. That hides the message to print behind an interface, and later I’ll change how I get the message without having to change the run subroutine. I’ll also be able to test message separately. At the same time, I’ll put the entire program in its own package, which I’ll call Japh. That helps compartmentalize anything I do when I want to test the modulino or use it in another program:

    #!/usr/bin/perl
    package Japh;

    run() unless caller();

    sub run {
        print message(), "\n";
    }

    sub message {
        'Just another Perl hacker, ';
    }
I can add another test file to the t/ directory now. My first test is simple. I check that I can use the modulino and that my new subroutine is there. I won’t get into testing the actual message yet since I’m about to change that:[61]

    # message.t
    use Test::More tests => 2;

    # load the modulino by its package name, not its filename
    use_ok( 'Japh' );

    # message() now lives in the Japh package
    ok( defined &Japh::message, 'message() is defined' );
Now I want to be able to configure the message. At the moment it’s in English, but maybe I don’t always want that. How am I going to get the message in other languages? I could do all sorts of fancy internationalization things, but for simplicity I’ll create a file that contains the language, the template string for that language, and the locales for that language. Here’s a configuration file that maps the locales to a template string for that language:

    en_US "Just another %s hacker, "
    eu_ES "apenas otro hacker del %s, "
    fr_FR "juste un autre hacker de %s, "
    de_DE "gerade ein anderer %s Hacker, "
    it_IT "appena un altro hacker del %s, "

I add some bits to read the language file. I need to add a subroutine, get_template, to read the file and pick the correct template, and run hands that template to my message routine. Since message now gets a template string, it uses sprintf to fill it in. I also add another subroutine, get_topic, to return the type of hacker I am. I won’t branch out into the various ways I can get the topic, although you can see how I’m moving the program away from doing (or saying) one thing to making it much more flexible:

    sub run {
        my $template = get_template();

        print message( $template ), "\n";
    }

    sub message {
        my $template = shift;

        return sprintf $template, get_topic();
    }

    sub get_topic { 'Perl' }

    sub get_template { ... shown later ... }
I can add some tests to ensure that my new subroutines still work and also check that the previous tests still work.
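As a sketch of what those tests might look like at this stage, I can check get_topic and the sprintf-based message directly. To keep the example runnable on its own I copy the two subroutines into an inline Japh package; in the real test file I would load the modulino with use_ok( 'Japh' ) instead.

```perl
use strict;
use warnings;
use Test::More tests => 2;

{
    # Inline copies of the modulino's subroutines, only for this sketch.
    package Japh;

    sub get_topic { 'Perl' }

    sub message {
        my $template = shift;
        return sprintf $template, get_topic();
    }
}

is( Japh::get_topic(), 'Perl', 'get_topic returns the default topic' );

is(
    Japh::message( 'Just another %s hacker, ' ),
    'Just another Perl hacker, ',
    'message fills the topic into the template',
    );
```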
Being quite pleased with myself that my modulino now works in many languages and that the message is configurable, I’m disappointed to find out that I’ve just introduced a possible problem. Since the user can decide the format string, he can do anything that printf allows him to do,[62] and that’s quite a bit. I’m using user-defined data to run the program, so I should really turn on taint checking (see Chapter 3), but even better than that, I should get away from the problem rather than trying to put a bandage on it.
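One mild illustration of the trouble, as a sketch rather than a catalog of everything a format string can do: a width specifier lets a nine-character template turn into a million characters of output. The message subroutine here is a simplified stand-in that hardcodes the topic.

```perl
use strict;
use warnings;

# Simplified stand-in for the modulino's message(), with the topic hardcoded.
sub message {
    my $template = shift;
    return sprintf $template, 'Perl';
}

my $friendly = 'Just another %s hacker, ';
my $hostile  = '%1000000s';   # pad the topic out to one million characters

print message( $friendly ), "\n";
print "Hostile template produced ", length( message( $hostile ) ), " characters\n";
```

Whoever controls the configuration file controls how much work and memory my program spends on a single message, and that is before anyone tries the more exotic format specifiers.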
Instead of sprintf, I’ll use the Template module. My format strings will turn into templates:

    en_US "Just another [% topic %] hacker, "
    eu_ES "apenas otro hacker del [% topic %], "
    fr_FR "juste un autre hacker de [% topic %], "
    de_DE "gerade ein anderer [% topic %] Hacker, "
    it_IT "Solo un altro hacker del [% topic %], "
Inside my modulino, I’ll include the Template module and configure the Template parser so it doesn’t evaluate Perl code. I only need to change message because nothing else needs to know how message does its work:

    sub message {
        my $template = shift;

        require Template;

        # no include path, no variable interpolation, no embedded Perl
        my $tt = Template->new(
            INCLUDE_PATH => '',
            INTERPOLATE  => 0,
            EVAL_PERL    => 0,
            );

        # process returns false on failure, so report the error
        $tt->process( \$template, { topic => get_topic() }, \ my $cooked )
            or die $tt->error;

        return $cooked;
    }
Now I have a bit of work to do on the distribution side. My modulino now depends on Template so I need to add that to the list of prerequisites. This way, CPAN (or CPANPLUS) will automatically detect the dependency and install it as it installs my modulino. That’s just another benefit of wrapping the program in a distribution:

    WriteMakefile(
        ...
        'PREREQ_PM' => {
            Template => '0',
            },
        ...
        );
What happens if there is no configuration file, though? My message subroutine should still do something, so I give it a default message from get_template, but I also issue a warning with carp:

    use Carp qw(carp);

    sub get_template {
        my $default = "Just another [% topic %] hacker, ";

        my $file = "t/config.txt";

        # declare $fh outside the condition so it is still in scope for the loop
        my $fh;
        unless( open $fh, "<", $file ) {
            carp "Could not open '$file'";
            return $default;
        }

        # an explicit argument wins, then LANG, which the end-to-end tests set
        my $locale = shift || $ENV{LANG} || 'en_US';

        while( <$fh> ) {
            chomp;
            my( $this_locale, $template ) = m/(\S+)\s+"(.*?)"/;
            next unless defined $this_locale;   # skip lines that don't parse
            return $template if $this_locale eq $locale;
        }

        return $default;
    }
You know the drill by now: the new additions to the program require more tests. Again, I’ll leave that up to you.
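If you want a starting point, here is one possible sketch. It does not test get_template itself, since that subroutine reads a fixed file. Instead it uses a hypothetical helper, template_from, that applies the same line-matching pattern to any filehandle, so the test can supply the configuration as a string. The names template_from and lookup are my inventions for this example.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: the same parsing idea as get_template, but reading
# from any filehandle so a test can pass in its own configuration.
sub template_from {
    my( $fh, $locale, $default ) = @_;

    while( my $line = <$fh> ) {
        my( $this_locale, $template ) = $line =~ m/(\S+)\s+"(.*?)"/;
        next unless defined $this_locale;    # skip lines that don't parse
        return $template if $this_locale eq $locale;
    }

    return $default;
}

my $default = 'Just another [% topic %] hacker, ';

my $config = <<'CONFIG';
en_US "Just another [% topic %] hacker, "
this line is not a valid entry
fr_FR "juste un autre hacker de [% topic %], "
CONFIG

# Open the configuration string as an in-memory file and look up one locale.
sub lookup {
    my $locale = shift;
    open my $fh, '<', \$config or die "Could not read config string: $!";
    return template_from( $fh, $locale, $default );
}

is( lookup( 'fr_FR' ), 'juste un autre hacker de [% topic %], ', 'known locale' );
is( lookup( 'en_US' ), 'Just another [% topic %] hacker, ',      'first entry' );
is( lookup( 'xx_XX' ), $default, 'unknown locale falls back to the default' );
```

Splitting the parsing from the file handling this way is a design choice, not something the modulino requires, but it keeps the tests independent of whatever happens to be in t/config.txt.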
Finally, I need to test the whole thing as a program. I’ve tested the bits and pieces individually, but do they all work together? To find out, I use the Test::Output module to run an external command and capture the output. I’ll compare that with what I expect. How I do this for programs depends on what the particular program is supposed to actually do. To run my program inside the test file, I wrap it in a subroutine and use the value of $^X for the perl binary I should use. That will be the same perl binary that’s running the tests:

    #!/usr/bin/perl

    use File::Spec;

    use Test::More 'no_plan';
    use Test::Output;

    my $script = File::Spec->catfile( qw(blib script Japh.pm ) );

    sub run_program {
        print `$^X $script`;
    }

    {   # test for US English
    local %ENV;
    $ENV{LANG} = 'en_US';

    stdout_is( \&run_program, "Just another Perl hacker, \n" );
    }

    {   # test for Spanish
    local %ENV;
    $ENV{LANG} = 'eu_ES';

    stdout_is( \&run_program, "apenas otro hacker del Perl, \n" );
    }

    {   # test with no LANG setting
    local %ENV;
    delete $ENV{LANG};

    stdout_is( \&run_program, "Just another Perl hacker, \n" );
    }

    {   # test with nonsense LANG setting
    local %ENV;
    $ENV{LANG} = 'blah blah';

    stdout_is( \&run_program, "Just another Perl hacker, \n" );
    }
Once I create the program distribution, I can upload it to CPAN (or anywhere else that I like) so other people can download it. To create the archive, I do the same thing I do for modules. First, I run make disttest, which creates a distribution, unwraps it in a new directory, and runs the tests. That ensures that the archive I give out has the necessary files and everything runs properly (well, most of the time):

    $ make disttest

After that, I create the archive in whichever format I like:

    $ make tardist
    ==OR==
    $ make zipdist
Finally, I upload it to PAUSE and announce it to the world. In real life, however, I use my release utility that comes with Module::Release and this (and much more) all happens in one step.
As a module living on CPAN, my modulino is a candidate for CPAN Testers, the loosely connected group of volunteers and automated computers that test just about every module. They don’t test programs, but our modulino doesn’t look like a program.
There is a little-known area of CPAN called “scripts” where people have uploaded standalone programs without the full distribution support.[63] Kurt Starsinic did some work on it to automatically index the programs by category, and his solution simply looks in the program’s Pod documentation for a section called “SCRIPT CATEGORIES.”[64] If I wanted, I could add my own categories to that section, and the programs archive should automatically index those on its next pass:

    =pod SCRIPT CATEGORIES

    CPAN/Administrative

    =cut
I can create programs that look like modules. The entire program (outside of third-party modules) exists in a single file. Although it runs just like any other program, I can develop and test it just like a module. I get all the benefits of both forms, including testability, dependency handling, and installation. Since my program is a module, I can easily re-use parts of it in other programs, too.
“How a Script Becomes a Module” originally appeared on Perlmonks: http://www.perlmonks.org/index.pl?node_id=396759.
I also wrote about this idea for The Perl Journal in “Scripts as Modules.” Although it’s the same idea, I chose a completely different topic: turning the RSS feed from TPJ into HTML: http://www.ddj.com/dept/lightlang/184416165.
Denis Kosykh wrote “Test-Driven Development” for The Perl Review 1.0 (Summer 2004): http://www.theperlreview.com/Issues/subscribers.html.
[61] If you like Test-Driven Development, just switch the order of the tests and program changes in this chapter. Make the new tests before you change the program.
[62] The Sys::Syslog module once suffered from this problem, and its bug report explains the situation. See Dyad Security’s notice for details: http://dyadsecurity.com/webmin-0001.html.