Since Doug MacEachern introduced mod_perl 1.0[1] in 1996, he has
had to tweak it with every change in Apache and Perl, while
maintaining compatibility with the older versions. These rewrites
have led to very complex source code, with hundreds of #ifdefs and workarounds for various incompatibilities in older Perl and Apache versions.
Apache 2.0, however, is based on a new threads design, requiring that mod_perl be based on a thread-safe Perl interpreter. Perl 5.6.0 was the first Perl version to support internal thread-safety across multiple interpreters. Since Perl 5.6.0 and Apache 2.0 are the very minimum requirements for the newest version of mod_perl, backward compatibility was no longer a concern, so this seemed like a good time to start from scratch. mod_perl 2.0 was the result: a leaner, more efficient mod_perl that’s streamlined for Apache 2.0.
mod_perl 2.0 includes a mechanism for building the Perl interface to
the Apache API automatically, allowing us to easily adjust mod_perl
2.0 to the ever-changing Apache 2.0 API during its development
period. Another important feature is the Apache::Test framework, which was originally developed for mod_perl 2.0 but then was adopted by Apache 2.0 developers to test the core server features and third-party modules. Moreover, the tests written using the Apache::Test framework could be run with Apache 1.3 and 2.0, assuming that both supported the same features.
Many other interesting changes have already happened to mod_perl in Version 2.0, and more will be developed in the future. Some of these will be covered in this chapter, and some you will discover on your own while reading mod_perl documentation.
At the time of this writing, mod_perl 2.0 is considered beta when used with the prefork Multi-Processing Model module (MPM) and alpha when used with a threaded MPM. It is likely that Perl 5.8.0+ will be required for mod_perl 2.0 to move past alpha with threaded MPMs. Also, the Apache 2.0 API hasn’t yet been finalized, so it’s possible that certain examples in this chapter may require modifications once production versions of Apache 2.0 and mod_perl 2.0 are released.
In this chapter, we’ll first discuss the new features in Apache 2.0, Perl 5.6 and later, and mod_perl 2.0 (in that order). Then we’ll cover the installation and configuration of mod_perl 2.0. Details on the new functionality implemented in mod_perl 2.0 are provided in Chapter 25.
Whereas Apache 1.2 and 1.3 were based on the NCSA httpd code base, Apache 2.0 rewrote big chunks of the 1.3 code base, mainly to support numerous new features and enhancements. Here are the most important new features:
- Apache Portable Runtime (APR)
The APR presents a standard API for writing portable client and server applications, covering file I/O, logging, shared memory, threads, managing child processes, and many other functionalities needed for developing the Apache core and third-party modules in a portable and efficient way. One important effect is that it significantly simplifies the code that uses the APR, making it much easier to review and understand the Apache code, and increasing the number of revealed bugs and contributed patches.
The APR uses the concept of memory pools, which significantly simplifies the memory-management code and reduces the possibility of memory leaks (which always haunt C programmers).
- I/O filtering
Apache 2.0 allows multiple modules to filter both the request and the response. Now one module can pipe its output to another module as if it were being sent directly from the TCP stream. The same mechanism works with the generated response.
With I/O filtering in place, simple filters (e.g., data compression and decompression) can easily be implemented, and complex filters (e.g., SSL) can now be implemented without needing to modify the server code (unlike with Apache 1.3).
To make the filtering mechanism efficient and avoid unnecessary copying, the bucket brigades model was used, as follows.
A bucket represents a chunk of data. Buckets linked together comprise a brigade. Each bucket in a brigade can be modified, removed, and replaced with another bucket. The goal is to minimize the data copying where possible. Buckets come in different types: files, data blocks, end-of-stream indicators, pools, etc. You don’t need to know anything about the internal representation of a bucket in order to manipulate it.
The stream of data is represented by bucket brigades. When a filter is called, it gets passed the brigade that was the output of the previous filter. This brigade is then manipulated by the filter (e.g., by modifying some buckets) and passed to the next filter in the stack.
Figure 24-1 depicts an imaginary bucket brigade. The figure shows that after the presented bucket brigade has passed through several filters, some buckets were removed, some were modified, and some were added. Of course, the handler that gets the brigade doesn’t know the history of the brigade; it can only see the existing buckets in the brigade. We will see bucket brigades in use when discussing protocol handlers and filters.
- Multi-Processing Model modules (MPMs)
In the previous Apache generation, the same code base was trying to manage incoming requests for different platforms, which led to scalability problems on certain (mostly non-Unix) platforms. This also led to an undesired complexity of the code.
Apache 2.0 introduces the concept of MPMs, whose main responsibility is to map the incoming requests to either threads, processes, or a threads/processes hybrid. Now it’s possible to write different processing modules specific to various platforms. For example, Apache 2.0 on Windows is much more efficient and maintainable now, since it uses mpm_winnt, which deploys native Windows features.
Here is a partial list of the major MPMs available as of this writing:
- prefork
The prefork MPM implements Apache 1.3’s preforking model, in which each request is handled by a different forked child process.
- worker
The worker MPM implements a hybrid multi-process/multi-threaded approach based on the pthreads standard.
- mpmt_os2, netware, winnt, and beos
These MPMs also implement the hybrid multi-process/multi-threaded model, like worker, but unlike worker, each is based on the native OS thread implementations, while worker uses the pthread library available on Unix.
On platforms that support more than one MPM, it’s possible to switch MPMs as needs change. For example, on Unix it’s possible to start with the prefork MPM, then migrate to a more efficient threaded MPM as demand grows and the code matures (assuming that the code base is capable of running in a threaded environment).
- New hook scheme
In Apache 2.0 it’s possible to dynamically register functions for each Apache hook, with more than one function registered per hook. Moreover, when adding new functions, you can specify where the new function should be added—for example, a function can be inserted between two already registered functions, or in front of them.
- Protocol modules
The previous Apache generation could speak only the HTTP protocol. Apache 2.0 has introduced a “server framework” architecture, making it possible to plug in handlers for protocols other than HTTP. The protocol module design also abstracts the transport layer, so protocols such as SSL can be hooked into the server without requiring modifications to the Apache source code. This allows Apache to be extended much further than in the past, making it possible to add support for protocols such as FTP, NNTP, POP3, RPC flavors, and the like. The main advantage is that protocol plug-ins can take advantage of Apache’s portability, process/thread management, configuration mechanism, and plug-in API.
- GNU Autoconf-based configuration
Apache 2.0 uses the ubiquitous GNU Autoconf for its configuration process, to make the configuration process more portable.
- Parsed configuration tree
Apache 2.0 makes the parsed configuration tree available at runtime, so modules needing to read the configuration data (e.g., mod_info) don’t have to re-parse the configuration file, but can reuse the parsed tree.
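The bucket brigade model described under I/O filtering can be pictured with a toy sketch in plain Perl. This is only an illustration of the idea, not the Apache or APR API: the bucket representation (a hash with type and data keys) and the names make_brigade, upcase_filter, trailer_filter, run_filters, and brigade_text are all made up for this example. Each filter receives the brigade produced by the previous one and may modify, remove, or add buckets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy model only (hypothetical names, not the Apache API): a "bucket" is a
# hash with a type and, for data buckets, a payload; a "brigade" is an array
# reference holding the buckets in order, ending with an end-of-stream bucket.
sub make_brigade {
    my @chunks = @_;
    return [ (map { +{ type => 'data', data => $_ } } @chunks),
             +{ type => 'eos' } ];
}

# Filter 1: modify data buckets in place (uppercase the payload).
sub upcase_filter {
    my ($brigade) = @_;
    for my $bucket (@$brigade) {
        $bucket->{data} = uc $bucket->{data} if $bucket->{type} eq 'data';
    }
    return $brigade;
}

# Filter 2: remove empty data buckets and add a new bucket before the EOS.
sub trailer_filter {
    my ($brigade) = @_;
    my @kept = grep { $_->{type} ne 'data' || length($_->{data}) } @$brigade;
    my $eos  = pop @kept;    # assumes the brigade ends with an EOS bucket
    return [ @kept, +{ type => 'data', data => "\n-- end --" }, $eos ];
}

# Pass the brigade through each filter in turn; every filter sees only the
# output of the previous one, not the brigade's history.
sub run_filters {
    my ($brigade, @filters) = @_;
    $brigade = $_->($brigade) for @filters;
    return $brigade;
}

# Concatenate the payloads of all data buckets.
sub brigade_text {
    my ($brigade) = @_;
    return join '', map { $_->{data} } grep { $_->{type} eq 'data' } @$brigade;
}

my $out = run_filters(make_brigade('hello, ', '', 'world'),
                      \&upcase_filter, \&trailer_filter);
print brigade_text($out), "\n";
```

The script prints the uppercased text followed by the trailer line. A real filter works on opaque bucket objects and avoids copying data where it can, which this toy makes no attempt to model.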
All these new features boost Apache’s performance, scalability, and flexibility. The APR helps the overall performance by doing lots of platform-specific optimizations in the APR internals and giving the developer the already greatly optimized API.
The I/O layering helps performance too, since now modules don’t need to waste memory and CPU cycles to manually store the data in shared memory or pnotes in order to pass the data to another module (e.g., to provide gzip compression for outgoing data).
And, of course, an important impact of these features is the simplification and added flexibility for the core and third-party Apache module developers.
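The hook scheme described earlier, in which several functions can be registered for one hook and a new function can be placed in front of or between those already registered, can be sketched as follows. This is a toy model in plain Perl, not the Apache hook API; register_hook, run_hook, the hook name, and the numeric order argument are assumptions made for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical registry: hook name => list of { name, order, code } entries.
my %hooks;

# Register $code under $hook. Lower $order values run earlier, so a caller
# can place a function in front of, or between, existing ones.
sub register_hook {
    my ($hook, $name, $order, $code) = @_;
    push @{ $hooks{$hook} }, +{ name => $name, order => $order, code => $code };
    return;
}

# Run every function registered for $hook in order and collect the results.
sub run_hook {
    my ($hook, @args) = @_;
    my @entries = sort { $a->{order} <=> $b->{order} } @{ $hooks{$hook} || [] };
    return map { $_->{name} . ': ' . $_->{code}->(@args) } @entries;
}

register_hook('fixup', 'log_request',  20, sub { "logged $_[0]" });
register_hook('fixup', 'check_access',  0, sub { "checked $_[0]" });    # in front
register_hook('fixup', 'add_headers',  10, sub { "decorated $_[0]" });  # in between

print "$_\n" for run_hook('fixup', '/index.html');
```

Although log_request is registered first, it runs last; check_access runs first and add_headers runs between the two.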
As mentioned earlier, Perl 5.6.0 is the minimum requirement for mod_perl 2.0. However, certain new features work only with Perl 5.8.0 and higher.
The following are the important changes in the recent Perl versions that had an impact on mod_perl. For a complete list of changes, see the appropriate perldelta manpage. The 5.6 generation of Perl introduced the following features:
The beginnings of support for running multiple interpreters concurrently in different threads. In conjunction with the perl_clone( ) API call, which can be used to selectively duplicate the state of any given interpreter, it is possible to compile a piece of code once in an interpreter, clone that interpreter one or more times, and run all the resulting interpreters in distinct threads. See the perlembed and perl561delta manpages.
The core support for declaring subroutine attributes, which is used by mod_perl 2.0’s method handlers (with the : method attribute). See the attributes manpage.
The warnings pragma, which allows programmers to force the code to be super clean, via the setting:
use warnings FATAL => 'all';
which will abort any code that generates warnings. This pragma also allows fine control over what warnings should be reported. See the perllexwarn manpage.
Certain CORE:: functions can now be overridden via the CORE::GLOBAL:: namespace. For example, mod_perl now can override exit( ) globally by defining CORE::GLOBAL::exit. So when exit( ) is called, CORE::GLOBAL::exit( ) gets invoked. Note that you can still use CORE::exit( ) to get the original behavior. See the perlsub manpage.
The XSLoader extension as a simpler alternative to DynaLoader. See the XSLoader manpage.
Large-file support. If you have filesystems that support files larger than 2 GB, you may now also be able to create and access them from Perl. See the perl561delta manpage.
Multiple performance enhancements. See the perl561delta manpage.
Numerous memory leaks were fixed. See the perl561delta manpage.
Improved security features: more potentially unsafe operations taint their results for improved security. See the perlsec and perl561delta manpages.
Perl is now available on new platforms: GNU/Hurd, Rhapsody/Darwin, and EPOC.
Overall, multiple bugs and problems were fixed in Perl 5.6.1, so if you plan on running the 5.6 generation, you should run at least 5.6.1. It is possible that when this book is released 5.6.2 will be out, which will then incorporate the bug fixes from Perl 5.8.0.
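Two of the Perl 5.6 features listed above, the CORE::GLOBAL:: override and fatal warnings, can be tried in a standalone script. The following is a minimal sketch that needs only core Perl; the names @exit_codes, handler_like_code, and strict_concat are invented for the example, and the override here merely records the exit code and returns, which is not necessarily how mod_perl implements its own override.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our @exit_codes;    # records the codes passed to the overridden exit( )

# The override must be installed at compile time, before any code that calls
# exit( ) is compiled, hence the BEGIN block.
BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::exit = sub { push @exit_codes, (@_ ? $_[0] : 0); return };
}

# Hypothetical stand-in for code that calls exit( ) inside a long-lived
# process. With the toy override in place, exit( ) simply returns.
sub handler_like_code {
    exit(3);
    return "still alive";
}

# Inside this sub every warning is promoted to an exception.
sub strict_concat {
    use warnings FATAL => 'all';
    my ($left, $right) = @_;
    return $left . $right;    # dies if either argument is undef
}

my $result = handler_like_code();
print "exit( ) intercepted with code(s) @exit_codes; got: $result\n";

my $ok = eval { strict_concat('a', undef); 1 };
print "warning promoted to exception: $@" unless $ok;
```

Calling CORE::exit( ) instead of exit( ) would bypass the override and really terminate the process.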
Perl 5.8.0 has introduced the following features:
The experimental PerlIO layer, introduced in 5.6.0, has been stabilized and become the default I/O layer in 5.8.0. Now the I/O stream can be filtered through multiple I/O layers. See the perlapio and perliol manpages.
For example, this allows mod_perl to interoperate with the APR I/O layer and even use the APR I/O layer in Perl code. See the APR::PerlIO manpage.
Another example of using this new feature is the extension of the open( ) functionality to create anonymous temporary files via:
open my $fh, "+>", undef or die $!;
That is a literal undef( ), not an undefined value. See the open( ) entry in the perlfunc manpage.
More keywords are now overridable via CORE::GLOBAL::. See the perlsub manpage.
The signal handling in Perl has been notoriously unsafe because signals have been able to arrive at inopportune moments, leaving Perl in an inconsistent state. Now Perl delays signal handling until it is safe.
File::Temp was added to allow creation of temporary files and directories in an easy, portable, and secure way. See the File::Temp manpage.
A new command-line option, -t, is available. It is the little brother of -T: instead of dying on taint violations, lexical warnings are given. This is meant only as a temporary debugging aid while securing the code of old legacy applications. It is not a substitute for -T. See the perlrun manpage.
A new special variable, ${^TAINT}, was introduced. It indicates whether taint mode is enabled. See the perlvar manpage.
Thread implementation is much improved since 5.6.0. The Perl interpreter should now be completely thread-safe, and 5.8.0 marks the arrival of the threads module, which allows Perl programs to work with threads (creating them, sharing variables, etc.).
Much better support for Unicode has been added.
Numerous bugs and memory leaks have been fixed. For example, now you can localize the tied Apache::DBI database handles without leaking memory.
Perl is now available on new platforms: AtheOS, Mac OS Classic, MinGW, NCR MP-RAS, NonStop-UX, NetWare, and UTS. Also, the following platforms are again supported: BeOS, DYNIX/ptx, POSIX-BC, VM/ESA, and z/OS (OS/390).
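Several of the Perl 5.8 additions listed above (anonymous temporary files, File::Temp, and the ${^TAINT} variable) can be exercised together in a short script. This is a minimal sketch assuming Perl 5.8.0 or later; the helper names roundtrip_anonymous and roundtrip_named are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write $text to an anonymous temporary file and read it back.
# The third argument to open( ) is a literal undef.
sub roundtrip_anonymous {
    my ($text) = @_;
    open(my $fh, "+>", undef) or die "cannot create anonymous temp file: $!";
    print {$fh} $text;
    seek($fh, 0, 0) or die "seek failed: $!";
    local $/;                 # slurp mode
    return scalar <$fh>;
}

# Do the same with a named temporary file created by File::Temp; the file
# is removed when the program exits (UNLINK => 1).
sub roundtrip_named {
    my ($text) = @_;
    my ($fh, $filename) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close($fh) or die "close failed: $!";
    open(my $in, '<', $filename) or die "cannot reopen $filename: $!";
    local $/;
    my $data = <$in>;
    return $data;
}

print "anonymous: ", roundtrip_anonymous("hello\n");
print "named:     ", roundtrip_named("world\n");

# ${^TAINT} is 0 unless the script was started with -T or -t.
print "taint mode flag: ${^TAINT}\n";
```

Running the same script with perl -T or perl -t should change only the last line of output.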
The new features introduced by Apache 2.0 and the Perl 5.6 and 5.8 generations provide the base of the new mod_perl 2.0 features. In addition, mod_perl 2.0 reimplements itself from scratch, providing such new features as a new build and testing framework. Let’s look at the major changes since mod_perl 1.0.
In order to adapt to the Apache 2.0 threads architecture (for threaded MPMs), mod_perl 2.0 needs to use thread-safe Perl interpreters, also known as ithreads (interpreter threads). This mechanism is enabled at compile time and ensures that each Perl interpreter instance is reentrant—that is, multiple Perl interpreters can be used concurrently within the same process without locking, as each instance has its own copy of any mutable data (symbol tables, stacks, etc.). This of course requires that each Perl interpreter instance is accessed by only one thread at any given time.
The first mod_perl generation has only a single PerlInterpreter, which is constructed by the parent process, then inherited across the forks to child processes.
mod_perl 2.0 has a configurable number of PerlInterpreters and two classes of interpreters, parent and clone. A parent is like the one in mod_perl 1.0, where the main interpreter created at startup time compiles any preloaded Perl code. A clone is created from the parent using the Perl API perl_clone( ) function. At request time, parent interpreters are used only for making more clones, as the clones are the interpreters that actually handle requests. Care is taken by Perl to copy only mutable data, which means that no runtime locking is required and read-only data such as the syntax tree is shared from the parent, which should reduce the overall mod_perl memory footprint.
Rather than creating a PerlInterpreter for each thread, by default mod_perl creates a pool of interpreters. The pool mechanism helps cut down memory usage a great deal. As already mentioned, the syntax tree is shared between all cloned interpreters. If your server is serving more than just mod_perl requests, having a smaller number of PerlInterpreters than the number of threads will clearly cut down on memory usage. Finally, perhaps the biggest win is memory reuse: as calls are made into Perl subroutines, memory allocations are made for variables when they are used for the first time. Subsequent use of variables may allocate more memory; e.g., if a scalar variable needs to hold a longer string than it did before, or an array has new elements added. As an optimization, Perl hangs onto these allocations, even though their values go out of scope.
mod_perl 2.0 has much better control over which PerlInterpreters are used for incoming requests. The interpreters are stored in two linked lists, one for available interpreters and another for busy ones. When needed to handle a request, one interpreter is taken from the head of the available list, and it’s put back at the head of the same list when it’s done. This means that if, for example, you have ten interpreters configured to be cloned at startup time, but no more than five are ever used concurrently, those five continue to reuse Perl’s allocations, while the other five remain much smaller, but ready to go if the need arises.
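The effect of taking interpreters from the head of the available list and returning them to the head can be seen in a small simulation. This is a toy model in plain Perl, not mod_perl’s implementation: an “interpreter” is just a hash with a request counter, and new_pool, acquire, release, and simulate are names invented for the sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy pool: two lists, one of available interpreters and one of busy ones.
sub new_pool {
    my ($size) = @_;
    return {
        available => [ map { +{ id => $_, served => 0 } } 1 .. $size ],
        busy      => [],
    };
}

# Take an interpreter from the head of the available list.
sub acquire {
    my ($pool) = @_;
    my $interp = shift @{ $pool->{available} }
        or die "no interpreter available";
    push @{ $pool->{busy} }, $interp;
    return $interp;
}

# Put it back at the head of the available list, so that it is among the
# first to be reused.
sub release {
    my ($pool, $interp) = @_;
    @{ $pool->{busy} } = grep { $_ != $interp } @{ $pool->{busy} };
    $interp->{served}++;
    unshift @{ $pool->{available} }, $interp;
    return;
}

# Serve $rounds batches of $concurrency simultaneous requests.
sub simulate {
    my ($pool_size, $concurrency, $rounds) = @_;
    my $pool = new_pool($pool_size);
    for (1 .. $rounds) {
        my @in_use = map { acquire($pool) } 1 .. $concurrency;
        release($pool, $_) for @in_use;
    }
    return $pool;
}

my $pool = simulate(10, 5, 100);
printf "interpreter %2d served %3d requests\n", $_->{id}, $_->{served}
    for sort { $a->{id} <=> $b->{id} } @{ $pool->{available} };
```

With ten interpreters and at most five concurrent requests, the same five interpreters handle every request in this model and the other five are never touched, which mirrors the ten-interpreter scenario described above.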
The interpreter pool mechanism has been abstracted into an API known as tipool (thread item pool). This pool, currently used to manage a pool of PerlInterpreter objects, can be used to manage any data structure in which you wish to have a smaller number of items than the number of configured threads.
It’s important to notice that the Perl ithreads implementation ensures that Perl code is thread-safe, at least with respect to the Apache threads in which it is running. However, it does not ensure that functions and extensions that call into third-party C/C++ libraries are thread-safe. In the case of non-thread-safe extensions, if it is not possible to fix those routines, care needs to be taken to serialize calls into such functions (either at the XS or Perl level). See Perl 5.8.0’s perlthrtut manpage.
Note that while Perl data is thread-private unless explicitly shared and threads themselves are separate execution threads, the threads can affect process-scope state, affecting all the threads. For example, if one thread does chdir("/tmp"), the current working directory of all threads is now /tmp. While each thread can correct its current working directory by storing the original value, there are functions whose process-scope changes cannot be undone. For example, chroot( ) changes the root directory of all threads, and this change is not reversible. Refer to the perlthrtut manpage for more information.
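Storing the original working directory and restoring it afterwards can be wrapped in a small helper. The following is a minimal single-threaded sketch using only core modules; with_cwd is a name invented for the example. Note that it only limits how long the changed directory is in effect: in a threaded process, other threads would still observe the new directory while the callback runs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

# Run $code with the working directory temporarily changed to $dir, and
# restore the original directory afterwards, even if $code dies.
sub with_cwd {
    my ($dir, $code) = @_;
    my $original = getcwd();
    chdir($dir) or die "cannot chdir to $dir: $!";
    my @result = eval { $code->() };
    my $error  = $@;
    chdir($original) or die "cannot return to $original: $!";
    die $error if $error;
    return wantarray ? @result : $result[0];
}

my $before = getcwd();
my $inside = with_cwd(File::Spec->tmpdir, sub { getcwd() });
print "inside:   $inside\n";
print "restored: ", (getcwd() eq $before ? "yes" : "no"), "\n";
```

No comparable helper is possible for chroot( ), since that change cannot be reversed.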
As we mentioned earlier, Apache 2.0 uses two APIs:
The Apache Portable Runtime (APR) API, which implements a portable and efficient API to generically work with files, threads, processes, shared memory, etc.
The Apache API, which handles issues specific to the web server
mod_perl 2.0 provides its own very flexible special-purpose XS code generator, which is capable of doing things none of the existing generators can handle. It’s possible that in the future this generator will be generalized and used for other projects of a high complexity.
This generator creates the Perl glue code for the public APR and Apache APIs, almost without a need for any extra code (just a few thin wrappers to make the API more Perlish).
Since APR can be used outside of Apache, the Perl APR:: modules can be used outside of Apache as well.
In addition to the already mentioned new features in mod_perl 2.0, the following are of major importance:
Apache 2.0 protocol modules are supported. Later we will see an example of a protocol module running on top of mod_perl 2.0.
mod_perl 2.0 provides a very simple-to-use interface to the Apache filtering API; this is of great interest because in mod_perl 1.0 the Apache::Filter and Apache::OutputChain modules, used for filtering, had to go to great lengths to implement filtering and couldn’t be used for filtering output generated by non-Perl modules. Moreover, incoming-stream filtering has now become possible. We will discuss filtering and see a few examples later on.
A feature-full and flexible Apache::Test framework was developed especially for mod_perl testing. While intended to test the core mod_perl features, it is also used by third-party module writers to easily test their modules. Moreover, Apache::Test was adopted by Apache and is currently used to test Apache 1.3, 2.0, and other ASF projects. Anything that runs on top of Apache can be tested with Apache::Test, whether the target is written in Perl, C, PHP, etc.
The support of the new MPMs makes mod_perl 2.0 able to scale better on a wider range of platforms. For example, if you’ve happened to try mod_perl 1.0 on Win32 you probably know that parallel requests had to be serialized (i.e., only a single request could be processed at a time), rendering the Win32 platform unusable with mod_perl as a heavy production service. Thanks to the new Apache MPM design, mod_perl 2.0 can now efficiently process parallel requests on Win32 platforms (using its native win32 MPM).
mod_perl 2.0 provides new configuration directives for the newly added features and improves upon existing ones. For example, the PerlOptions directive provides fine-grained configuration for what were compile-time only options in the first mod_perl generation. The Perl*FilterHandler directives provide a much simpler Apache filtering API, hiding most of the details underneath. We will talk in detail about these and other options in Section 24.5.
The new Apache::Directive module provides a Perl interface to the Apache configuration tree, which is another new feature in Apache 2.0.
The rewrite of mod_perl gives us a chance to build a smarter, stronger, and faster implementation based on lessons learned over the years since mod_perl was introduced. There are some optimizations that can be made in the mod_perl source code, some that can be made in the Perl space by optimizing its syntax tree, and some that are a combination of both.
Since as of this writing mod_perl 2.0 hasn’t yet been released, the installation instructions may change a bit, but the basics should be the same. Always refer to the mod_perl documentation for the correct information.
First download the latest stable sources of Apache 2.0, mod_perl 2.0, and Perl 5.8.0.[2] Remember that mod_perl 1.0 works only with Apache 1.3, and mod_perl 2.0 requires Apache 2.0. You can get the sources from:
mod_perl 2.0—http://perl.apache.org/dist/
Apache 2.0—http://httpd.apache.org/dist/
Perl 5.8.0—http://cpan.org/src/
You can always find the most up-to-date download information at http://perl.apache.org/download/.
Next, build Apache 2.0:
Extract the source (as usual, replace x with the correct version number):
panic% tar -xzvf httpd-2.0.xx.tar.gz
If you don’t have GNU tar(1), use the appropriate tools and flags to extract the source.
Configure:
panic% cd httpd-2.0.xx
panic% ./configure --prefix=/home/httpd/httpd-2.0 --with-mpm=prefork
Adjust the --prefix option to the directory where you want Apache 2.0 to be installed. If you want to use a different MPM, adjust the --with-mpm option. The easiest way to find all of the configuration options for Apache 2.0 is to run:
panic% ./configure --help
Finally, build and install:
panic% make && make install
If you don’t have Perl 5.6.0 or higher installed, or you need to rebuild it because you want to enable certain compile-time features or you want to run one of the threaded MPMs, which require Perl 5.8.0, build Perl (we will assume that you build Perl 5.8.0):
Extract the source:
panic% tar -xzvf perl-5.8.0.tar.gz
Configure:
panic% cd perl-5.8.0
panic% ./Configure -des -Dprefix=$HOME/perl/perl-5.8.0 -Dusethreads
This configuration accepts all the defaults suggested by the Configure script and produces a terse output. The -Dusethreads option enables Perl ithreads. The -Dprefix option specifies a custom installation directory, which you may want to adjust. For example, you may decide to install it in the default location provided by Perl, which is /usr/local under most systems.
For a complete list of configuration options and for information on installation on non-Unix systems, refer to the INSTALL document.
Now build, test, and install Perl:
panic% make && make test && make install
Before proceeding with the installation of mod_perl 2.0, it’s advisable to install at least the LWP package into your newly installed Perl distribution so that you can fully test mod_perl 2.0 later. You can use CPAN.pm to accomplish that:
panic% $HOME/perl/perl-5.8.0/bin/perl -MCPAN -e 'install("LWP")'
Now that you have Perl 5.8.0 and Apache 2.0 installed, you can proceed with the mod_perl 2.0 installation:
Extract the source:
panic% tar -xzvf mod_perl-2.0.x.tar.gz
Remember the nightmare number of options for mod_perl 1.0? You need only two options to build mod_perl 2.0. If you need more control, read install.pod in the source mod_perl distribution or online at http://perl.apache.org/docs/2.0/user/. Configure:
panic% cd mod_perl-2.0.x
panic% perl Makefile.PL MP_AP_PREFIX=/home/stas/httpd/prefork \
    MP_INST_APACHE2=1
The MP_AP_PREFIX option specifies the base directory of the installed Apache 2.0, under which the include/ directory with Apache C header files can be found. For example, if you have installed Apache 2.0 in the directory \Apache2 on Win32, you should use:
MP_AP_PREFIX=\Apache2
The MP_INST_APACHE2 option is needed only if you have mod_perl 1.0 installed under the same Perl tree. You can remove this option if you don’t have or don’t plan to install mod_perl 1.0.
Now build, test, and install mod_perl 2.0:
panic% make && make test && make install
On Win32 you have to use nmake instead of make, and the && chaining doesn’t work on all Win32 platforms, so instead you should do:
C:\modperl-2.0\> nmake
C:\modperl-2.0\> nmake test
C:\modperl-2.0\> nmake install
Apache 2.0 binaries can be obtained from http://httpd.apache.org/dist/binaries/.
Perl 5.6.1 or 5.8.0 binaries can be obtained from http://cpan.org/ports/index.html.
For mod_perl 2.0, as of this writing only the binaries for the Win32 platform are available, kindly prepared and maintained by Randy Kobes. Once mod_perl 2.0 is released, various OS distributions will provide binary versions for their platforms.
If you are not on a Win32 platform you can safely skip to the next section.
There are three ways of obtaining a binary mod_perl 2.0 package for Win32:
- PPM
The first, for ActivePerl users, is through PPM, which assumes you already have ActivePerl (build 6xx or later), available from http://www.activestate.com/, and a Win32 Apache 2.0 binary, available from http://www.apache.org/dist/httpd/binaries/win32/. In installing this, you may find it convenient when transcribing any Unix-oriented documentation to choose installation directories that do not have spaces in their names (e.g., C:\Apache2).
After installing Perl and Apache 2.0, you can then install mod_perl 2.0 via the PPM utility. ActiveState does not maintain mod_perl in its PPM repository, so you must get it from somewhere else. One way is simply to do:
C:\> ppm install http://theoryx5.uwinnipeg.ca/ppmpackages/mod_perl-2.ppd
Another way, which will be useful if you plan on installing additional Apache modules, is to set the repository within the PPM shell utility as follows:
PPM> set repository theoryx5 http://theoryx5.uwinnipeg.ca/cgi-bin/ppmserver?urn:/PPMServer
or, for PPM3:
PPM> rep add theoryx5 http://theoryx5.uwinnipeg.ca/cgi-bin/ppmserver?urn:/PPMServer
mod_perl 2.0 can then be installed as:
PPM> install mod_perl-2
This will install the necessary modules under an Apache2/ subdirectory in your Perl tree, so as not to disturb an existing Apache/ directory from mod_perl 1.0. See the next section for instructions on how to add this directory to the @INC path for searching for modules.
The mod_perl PPM package also includes the necessary Apache DLL mod_perl.so; a post-installation script that will offer to copy this file to your Apache2 modules directory (e.g., C:\Apache2\modules) should be run. If this is not done, you can get the file mod_perl-2.tar.gz from http://theoryx5.uwinnipeg.ca/ppmpackages/x86/. This file, when unpacked, contains mod_perl.so in the top-level directory.
Note that the mod_perl package available from this site will always use the latest mod_perl sources compiled against the latest official Apache release; depending on changes made in Apache, you may or may not be able to use an earlier Apache binary. However, in the Apache Win32 world it is a particularly good idea to use the latest version, for bug and security fixes.
- Apache/mod_perl binary
At ftp://theoryx5.uwinnipeg.ca/pub/other/ you can find an archive called Apache2.tar.gz containing a binary version of Apache 2.0 with mod_perl 2.0. This archive unpacks into an Apache2 directory, underneath which is a blib subdirectory containing the necessary mod_perl files (enabled with a PerlSwitches directive in httpd.conf). Some editing of httpd.conf will be necessary to reflect the location of the installed directory. See the Apache2.readme file for further information.
This package, which is updated periodically, is compiled against recent CVS sources of Apache 2.0 and mod_perl 2.0. As such, it may contain features, and bugs, not present in the current official releases. Also for this reason, these may not be binary-compatible with other versions of Apache 2.0/mod_perl 2.0.
- Apache/mod_perl/Perl 5.8 binary distribution
Because mod_perl 2.0 works best with Perl 5.8 in threaded environments such as Apache 2.0 with the win32 MPM, there is a package including Perl 5.8, Apache 2.0, and mod_perl 2.0. To get this, look for the perl-5.8-win32-bin.tar.gz package at ftp://theoryx5.uwinnipeg.ca/pub/other/, and extract it to C:\, which will give you an Apache2 directory containing the Apache 2.0 installation along with mod_perl 2.0, and a Perl directory containing the Perl installation (you should add this Perl directory to your path).
Similar to mod_perl 1.0, in order to use mod_perl 2.0 a few configuration settings should be added to httpd.conf. They are quite similar to the 1.0 settings, but some directives were renamed and new directives were added.
To enable mod_perl as a DSO, add this to httpd.conf:
LoadModule perl_module modules/mod_perl.so
This setting specifies the location of the mod_perl module relative to the ServerRoot setting, so you should put it somewhere after ServerRoot is specified.
Win32 users need to make sure that the path to the Perl binary (e.g., C:\Perl\bin) is in the PATH environment variable. You could also add the directive:
LoadFile "/Path/to/your/Perl/bin/perl5x.dll"
to httpd.conf to load your Perl DLL, before loading mod_perl.so.
To prevent you from inadvertently loading mod_perl 1.0 modules, mod_perl 2.0 Perl modules are installed into dedicated directories under Apache2/. The Apache2 module prepends the locations of the mod_perl 2.0 libraries to @INC: the resulting @INC is the same as the core @INC, but with Apache2/ prepended. This module has to be loaded just after mod_perl has been enabled. This can be accomplished with:
use Apache2 ( );
in the startup file. If you don’t use a startup file, you can add:
PerlModule Apache2
to httpd.conf, due to the order in which the PerlRequire and PerlModule directives are processed.
Next, a startup file with Perl code usually is loaded:
PerlRequire "/home/httpd/httpd-2.0/perl/startup.pl"
It’s used to adjust Perl module search paths in @INC, preload commonly used modules, precompile constants, etc. A typical startup.pl file for mod_perl 2.0 is shown in Example 24-1.
Example 24-1. startup.pl
use Apache2 ( );
use lib qw(/home/httpd/perl);
# enable if the mod_perl 1.0 compatibility is needed
# use Apache::compat ( );
# preload all mp2 modules
# use ModPerl::MethodLookup;
# ModPerl::MethodLookup::preload_all_modules( );
use ModPerl::Util ( ); #for CORE::GLOBAL::exit
use Apache::RequestRec ( );
use Apache::RequestIO ( );
use Apache::RequestUtil ( );
use Apache::Server ( );
use Apache::ServerUtil ( );
use Apache::Connection ( );
use Apache::Log ( );
use APR::Table ( );
use ModPerl::Registry ( );
use Apache::Const -compile => ':common';
use APR::Const -compile => ':common';
1;
In this file the Apache2 module is loaded, so the 2.0 modules will be found. Afterwards, @INC is adjusted to include nonstandard directories with Perl modules:
use lib qw(/home/httpd/perl);
If you need the backward-compatibility layer, to run 1.0 modules that haven’t yet been ported to work with mod_perl 2.0, load Apache::compat:
use Apache::compat ( );
Next, preload the commonly used mod_perl 2.0 modules and precompile the common constants. You can preload all mod_perl 2.0 modules by uncommenting the following two lines:
use ModPerl::MethodLookup;
ModPerl::MethodLookup::preload_all_modules( );
Finally, the startup.pl file must end with 1; so that loading it returns a true value.
Now you can pass Perl’s
command-line switches in
httpd.conf by using the
PerlSwitches
directive, instead of using
complicated workarounds.
For example, to enable warnings and taint checking, add:
PerlSwitches -wT
The -I command-line switch can be used to adjust
@INC
values:
PerlSwitches -I/home/stas/modperl
For example, you can use that technique to set different
@INC
values for different virtual hosts, as we
will see later.
mod_perl 2.0 provides two types of core handlers: modperl and perl-script.
modperl is configured as:
SetHandler modperl
This is the bare mod_perl handler type, which just calls the Perl*Handler’s callback function. If you don’t need the features provided by the perl-script handler, you can gain some performance by using the modperl handler instead. (This handler isn’t available in mod_perl 1.0.)
Unless the Perl*Handler
callback running under the
modperl
handler is configured with:
PerlOptions +SetupEnv
or calls:
$r->subprocess_env;
in a void context (which has the same effect as PerlOptions +SetupEnv for the handler that called it), only the following environment variables are accessible via %ENV:
MOD_PERL and GATEWAY_INTERFACE (always)
PATH and TZ (if you had them defined in the shell or httpd.conf)
Therefore, if you simply want to pass some configuration variables from httpd.conf and don’t want the overhead of populating %ENV, consider using PerlSetVar and PerlAddVar instead of PerlSetEnv and PerlPassEnv.
perl-script is configured as:
SetHandler perl-script
Most mod_perl handlers use the perl-script
handler. Here are a few things to note:
PerlOptions +GlobalRequest is in effect unless PerlOptions -GlobalRequest is specified.
PerlOptions +SetupEnv is in effect unless PerlOptions -SetupEnv is specified.
STDIN and STDOUT get tied to the request object $r, which makes it possible to read from STDIN and print directly to STDOUT via print( ), instead of having to use explicit calls like $r->print( ).
Several special global Perl variables are saved before the handler is called and restored afterward (as in mod_perl 1.0). These include %ENV, @INC, $/, STDOUT’s $|, and END blocks.
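The saving and restoring of global variables described above can be pictured with Perl's own local operator. The following is only a sketch in plain Perl, not mod_perl's actual implementation: the wrapper name run_with_saved_globals and the DEMO_VAR variable are hypothetical, and END blocks are not modeled at all.

```perl
use strict;
use warnings;

# Hypothetical wrapper: run a code ref with private copies of some of the
# globals listed above. `local` puts the original values back as soon as
# this sub returns, whatever the handler did to them.
sub run_with_saved_globals {
    my ($handler) = @_;
    local %ENV = %ENV;
    local @INC = @INC;
    local $/   = $/;
    local $|   = $|;
    return $handler->();
}

$ENV{DEMO_VAR} = 'before';

my $seen = run_with_saved_globals(sub {
    $ENV{DEMO_VAR} = 'changed inside handler';
    push @INC, '/tmp/handler-only';
    $/ = undef;
    return $ENV{DEMO_VAR};
});

print "inside:  $seen\n";
print "outside: $ENV{DEMO_VAR}\n";
```

The handler sees and modifies its own copies, while code that runs after it finds %ENV, @INC, and $/ exactly as they were before the call.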
Let’s demonstrate
the differences between the
modperl
and perl-script
core
handlers. Example 24-2 represents a simple mod_perl
response handler that prints out the environment variables as seen by
it.
Example 24-2. Apache/PrintEnv1.pm
package Apache::PrintEnv1;

use strict;
use warnings;

use Apache::RequestRec ( ); # for $r->content_type

use Apache::Const -compile => 'OK';

sub handler {
    my $r = shift;

    $r->content_type('text/plain');
    for (sort keys %ENV){
        print "$_ => $ENV{$_}\n";
    }

    return Apache::OK;
}

1;
This is the required configuration for the
perl-script
handler:
PerlModule Apache::PrintEnv1
<Location /print_env1>
    SetHandler perl-script
    PerlResponseHandler Apache::PrintEnv1
</Location>
Now issue a request to http://localhost/print_env1, and you should see all the environment variables printed out.
The same response handler, adjusted to work with the
modperl
core handler, is shown in Example 24-3.
Example 24-3. Apache/PrintEnv2.pm
package Apache::PrintEnv2;

use strict;
use warnings;

use Apache::RequestRec ( ); # for $r->content_type
use Apache::RequestIO ( );  # for $r->print

use Apache::Const -compile => 'OK';

sub handler {
    my $r = shift;

    $r->content_type('text/plain');
    $r->subprocess_env;
    for (sort keys %ENV){
        $r->print("$_ => $ENV{$_}\n");
    }

    return Apache::OK;
}

1;
The configuration now will look like this:
PerlModule Apache::PrintEnv2
<Location /print_env2>
    SetHandler modperl
    PerlResponseHandler Apache::PrintEnv2
</Location>
Under the modperl core handler STDOUT is not tied to $r, so Apache::PrintEnv2 cannot use print( ); it uses $r->print( ) to generate the response instead. %ENV is not populated by default either; therefore, subprocess_env( ) is called in a void context. Alternatively, we could add this to the configuration section:
PerlOptions +SetupEnv
If you issue a request to http://localhost/print_env2, you should see all the environment variables printed out as with http://localhost/print_env1.
The PerlOptions directive provides fine-grained configuration for what were compile-time-only options in the first mod_perl generation. It also provides control over what class of PerlInterpreter is used for a <VirtualHost> or for a location configured with <Location>, <Directory>, etc.
Options are enabled by prepending + and disabled with -. The options are discussed in the following sections.
Enable
On by default; can be used to disable mod_perl for a given <VirtualHost>. For example:
<VirtualHost ...>
    PerlOptions -Enable
</VirtualHost>
Clone
Share the parent Perl interpreter, but give the <VirtualHost> its own interpreter pool. For example, should you wish to fine-tune interpreter pools for a given virtual host:
<VirtualHost ...>
    PerlOptions +Clone
    PerlInterpStart 2
    PerlInterpMax 2
</VirtualHost>
This might be worthwhile in the case where certain hosts have their own sets of large modules, used only in each host. Tuning each host to have its own pool means that the hosts will continue to reuse the Perl allocations in their specific modules.
When cloning a Perl interpreter, to inherit the parent Perl interpreter’s PerlSwitches, use:
<VirtualHost ...>
    ...
    PerlSwitches +inherit
</VirtualHost>
Parent
Create a new parent Perl interpreter for the given <VirtualHost> and give it its own interpreter pool (implies the Clone option).
A common problem with mod_perl 1.0 was that the namespace was shared by all code within the process. Consider two developers using the same server, each of whom wants to run a different version of a module with the same name. This example will create two parent Perl interpreters, one for each <VirtualHost>, each with its own namespace and pointing to a different path in @INC:
<VirtualHost ...>
    ServerName dev1
    PerlOptions +Parent
    PerlSwitches -Mblib=/home/dev1/lib/perl
</VirtualHost>

<VirtualHost ...>
    ServerName dev2
    PerlOptions +Parent
    PerlSwitches -Mblib=/home/dev2/lib/perl
</VirtualHost>
Perl*Handler
Disable specific Perl*Handlers (all compiled-in handlers are enabled by default). The option name is derived from the Perl*Handler name by stripping the Perl and Handler parts of the word. So PerlLogHandler becomes Log, which can be used to disable PerlLogHandler:
PerlOptions -Log
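The naming rule is mechanical enough to express in a few lines of plain Perl. The sketch below is only an illustration: the helper names handler_option_name and disable_handlers_line are hypothetical and are not part of mod_perl.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the leading "Perl" and the trailing "Handler"
# from a directive name; returns undef for anything that is not a
# Perl*Handler directive.
sub handler_option_name {
    my ($directive) = @_;
    return $directive =~ /^Perl(\w+)Handler$/ ? $1 : undef;
}

# Hypothetical helper: build a PerlOptions line that disables the given
# Perl*Handler directives.
sub disable_handlers_line {
    my @options;
    for my $directive (@_) {
        my $option = handler_option_name($directive);
        die "not a Perl*Handler directive: $directive\n"
            unless defined $option;
        push @options, "-$option";
    }
    return join ' ', 'PerlOptions', @options;
}

print disable_handlers_line(
    qw(PerlAuthenHandler PerlAuthzHandler PerlAccessHandler)
), "\n";
```

The generated line can be pasted into a <VirtualHost> or other container; the sketch does not check whether a given handler is actually compiled in.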
Suppose one of the hosts does not want to allow users to configure PerlAuthenHandler, PerlAuthzHandler, PerlAccessHandler, and <Perl> sections:
<VirtualHost ...>
    PerlOptions -Authen -Authz -Access -Sections
</VirtualHost>
Or maybe it doesn’t want users to configure anything but the response handler:
<VirtualHost ...>
    PerlOptions None +Response
</VirtualHost>
Autoload
Resolve Perl*Handlers at startup time, loading the modules from disk if they’re not already loaded.
In mod_perl 1.0, configured Perl*Handlers that are not fully qualified subroutine names are resolved at request time, loading the handler module from disk if needed. In mod_perl 2.0, configured Perl*Handlers are resolved at startup time. By default, modules are not auto-loaded during startup-time resolution. It is possible to enable this feature with:
PerlOptions +Autoload
Consider this configuration:
PerlResponseHandler Apache::Magick
In this case, Apache::Magick is the package name, and the subroutine name will default to handler. If the Apache::Magick module is not already loaded, PerlOptions +Autoload will attempt to pull it in at startup time. With this option enabled you don’t have to explicitly load the handler modules. For example, you don’t need to add:
PerlModule Apache::Magick
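The resolution rule (a fully qualified subroutine name is used as is, anything else is treated as a package whose subroutine defaults to handler, optionally loading the module first) can be sketched in plain Perl. This is an illustration of the rule only, not mod_perl's real resolver: resolve_handler, its autoload argument, and the My::Demo package are all hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical resolver illustrating the naming rule only.
sub resolve_handler {
    my ($name, %opt) = @_;
    no strict 'refs';

    # A fully qualified subroutine name that is already defined wins.
    return \&{$name} if defined &{$name};

    # Otherwise treat $name as a package. With autoload => 1, try to
    # load the module from disk if its handler() is not defined yet.
    if ($opt{autoload} && !defined &{"${name}::handler"}) {
        (my $file = "$name.pm") =~ s{::}{/}g;
        eval { require $file; 1 } or return;
    }

    return defined &{"${name}::handler"} ? \&{"${name}::handler"} : undef;
}

# A stand-in handler package, so the sketch runs without mod_perl.
package My::Demo;
sub handler { return 'My::Demo::handler called' }

package main;

print resolve_handler('My::Demo')->(), "\n";           # package name
print resolve_handler('My::Demo::handler')->(), "\n";  # full sub name
print defined(resolve_handler('No::Such::Handler::Module', autoload => 1))
    ? "loaded\n" : "not found\n";
```

Without the autoload argument the sketch never touches the disk, which mirrors the default described above, where a module that has not been loaded explicitly is not found at startup.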
GlobalRequest
Set up the global Apache::RequestRec object for use with Apache->request. This setting is needed, for example, if you use CGI.pm to process the incoming request.
This setting is enabled by default for sections configured as:
<Location ...>
    SetHandler perl-script
    ...
</Location>
And can be disabled with:
<Location ...>
    SetHandler perl-script
    PerlOptions -GlobalRequest
    ...
</Location>
ParseHeaders
Scan output for HTTP headers. This option provides the same functionality as mod_perl 1.0’s PerlSendHeader directive, but it’s more robust. It usually must be enabled for registry scripts that send the HTTP header with:
print "Content-type: text/html\n\n";
MergeHandlers
Turn on merging of Perl*Handler arrays. For example, with this setting:
PerlFixupHandler Apache::FixupA
<Location /inside>
    PerlFixupHandler Apache::FixupB
</Location>
a request for /inside runs only Apache::FixupB (mod_perl 1.0 behavior). But with this configuration:
PerlFixupHandler Apache::FixupA
<Location /inside>
    PerlOptions +MergeHandlers
    PerlFixupHandler Apache::FixupB
</Location>
a request for /inside will run both the Apache::FixupA and Apache::FixupB handlers.
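The difference between the two configurations can be modeled in a few lines of plain Perl. This is a toy model under stated assumptions, not mod_perl code: the function effective_handlers and its arguments are hypothetical, the model assumes that a section without its own handlers inherits the outer list, and it assumes that merged handlers run in outer-then-inner order.

```perl
use strict;
use warnings;

# Hypothetical model: which handlers run for a section, given the list
# configured outside it (parent), the list configured inside it (child),
# and whether +MergeHandlers is in effect for the section.
sub effective_handlers {
    my (%arg) = @_;
    my @parent = @{ $arg{parent} || [] };
    my @child  = @{ $arg{child}  || [] };

    return @parent unless @child;   # assumption: nothing to override
    return $arg{merge} ? (@parent, @child) : @child;
}

my %config = (
    parent => ['Apache::FixupA'],
    child  => ['Apache::FixupB'],
);

print "default: ",
    join(', ', effective_handlers(%config)), "\n";
print "merged:  ",
    join(', ', effective_handlers(%config, merge => 1)), "\n";
```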
SetupEnv
Set up environment variables for each request, à la mod_cgi.
When this option is enabled, mod_perl fiddles with the environment to make it appear as if the code is called under the mod_cgi handler. For example, the $ENV{QUERY_STRING} environment variable is initialized with the contents of Apache::args( ), and the value returned by Apache::server_hostname( ) is put into $ENV{SERVER_NAME}.
Those who have moved to the mod_perl API no longer need this extra %ENV population and can gain by disabling it, since %ENV population is expensive. Code using the CGI.pm module requires PerlOptions +SetupEnv because that module relies on a properly populated CGI environment table.
This option is enabled by default for sections configured as:
<Location ...>
    SetHandler perl-script
    ...
</Location>
Since this option adds an overhead to each request, if you don’t need this functionality you can turn it off for a certain section:
<Location ...>
    SetHandler perl-script
    PerlOptions -SetupEnv
    ...
</Location>
or globally affect the whole server:
PerlOptions -SetupEnv
<Location ...>
    ...
</Location>
It can still be enabled for sections that need this functionality.
When this option is disabled you can still read environment variables set by you. For example, when you use the following configuration:
PerlOptions -SetupEnv
<Location /perl>
    PerlSetEnv TEST hi
    SetHandler perl-script
    PerlResponseHandler ModPerl::Registry
    Options +ExecCGI
</Location>
and you issue a request for setupenvoff.pl from Example 24-4,
Example 24-4. setupenvoff.pl
use Data::Dumper;
my $r = Apache->request( );
# content_type( ) is used here, as in Examples 24-2 and 24-3;
# the mod_perl 1.0 send_http_header( ) call would need Apache::compat
$r->content_type('text/plain');
print Dumper(\%ENV);
you should see something like this:
$VAR1 = {
    'GATEWAY_INTERFACE' => 'CGI-Perl/1.1',
    'MOD_PERL' => 'mod_perl/2.0.1',
    'PATH' => '/bin:/usr/bin',
    'TEST' => 'hi'
};
Notice that we got the value of the environment variable TEST.
The following directives are available only when mod_perl runs under a threaded MPM:
PerlInterpMax
If all running interpreters are in use, mod_perl will clone new interpreters to handle the request, up until this number of interpreters is reached. When PerlInterpMax is reached, mod_perl will block until an interpreter becomes available.
PerlInterpMinSpare
The minimum number of spare interpreters to keep available; mod_perl clones interpreters up to this number before a request comes in.
PerlInterpMaxSpare
mod_perl will throttle down the number of interpreters to this number as those in use become available.
PerlInterpMaxRequests
The maximum number of requests an interpreter should serve. The interpreter is destroyed and replaced with a fresh clone when this number is reached.
PerlInterpScope
As mentioned, when a request in a threaded MPM is handled by mod_perl, an interpreter must be pulled from the interpreter pool. The interpreter is then available only to the thread that selected it, until it is released back into the interpreter pool. By default, an interpreter will be held for the lifetime of the request, equivalent to this configuration:
PerlInterpScope request
For example, if a PerlAccessHandler
is configured,
an interpreter will be selected before it is run and not released
until after the logging phase.
Interpreters will be shared across subrequests by default; however, it is possible to configure the interpreter scope to be per subrequest on a per-directory basis:
PerlInterpScope subrequest
With this configuration, an autoindex-generated page, for example, would select an interpreter for each item in the listing that is configured with a Perl*Handler.
It is also possible to configure the scope to be per handler:
PerlInterpScope handler
With this configuration, an interpreter will be selected before PerlAccessHandlers are run and put back immediately afterwards, before Apache moves on to the authentication phase. If a PerlFixupHandler is configured further down the chain, another interpreter will be selected and again put back afterwards, before PerlResponseHandler is run.
For protocol handlers, the interpreter is held for the lifetime of
the connection. However, a C protocol module (e.g., mod_ftp) might
hook into mod_perl and provide a request_rec
record. In this case, the default scope is that of the request (the
download of one file). Should a mod_perl handler want to maintain
state for the lifetime of an FTP connection, it is possible to do so
on a
per-<VirtualHost>
basis:
PerlInterpScope connection
The httpd server startup options can be retrieved using Apache::exists_config_define( ). For example, to check if the server was started in single-process mode:
panic% httpd -DONE_PROCESS
use the following code:
if (Apache::exists_config_define("ONE_PROCESS")) {
    print "Running in a single process mode";
}
[1] Here and in the rest of this and the next chapter we refer to the mod_perl 1.x series as mod_perl 1.0 and to 2.0.x as mod_perl 2.0 to keep things simple. Similarly, we call the Apache 1.3.x series Apache 1.3 and the 2.0.x series Apache 2.0.
[2] Perl 5.6.1 can be used with prefork, but if you build from source why not go for the best?