Chapter 12. Custom Mason Subclasses
Something that we have tried very hard to do beginning with the 1.10
release of Mason is to make it easier to customize
Mason’s behavior. Jon Swartz was already on this
track even back with the release of 0.80, which saw the first
appearance of the HTML::Mason::Resolver
classes,
but 1.10 tries to bring this to new levels.
Starting with 1.10 it has become possible to subclass almost every core class that comes with Mason. Some obvious candidates for subclassing include the Lexer, Compiler, and Resolver. This chapter will demonstrate how you might go about implementing subclasses of various Mason objects.
Class::Container as a Superclass
A number of modules in Mason are subclasses of Class::Container. This is a class that was created to encapsulate some common behaviors for Mason objects. Originally, it was called HTML::Mason::Container, but Ken Williams decided to package this class separately and release it to CPAN, as it solves some fundamental problems of a large object-oriented system. Any Mason object that takes parameters to its constructor must inherit from this module. Of course, since all of the classes that you might consider subclassing inherit from Class::Container already, you shouldn’t need to inherit from it directly. However, you may need to use some of its methods. We will briefly cover a few of them here, but see the Class::Container documentation for more details.
The modules in the Mason core distribution that are Class::Container subclasses are HTML::Mason::ApacheHandler, HTML::Mason::CGIHandler, HTML::Mason::Interp, HTML::Mason::Compiler, HTML::Mason::Lexer, HTML::Mason::Resolver, and HTML::Mason::Request.
The most important methods that Class::Container provides are valid_params( ) and contained_objects( ), both of which are class methods. The first, valid_params( ), is called in order to register the valid parameters for a class’s new( ) constructor. The second method, contained_objects( ), is used to register the objects, if any, that a given class contains.
The contained_objects( )
method is not something you will have to
use for all of your subclasses, since most of the time you
won’t be altering the structure of
Mason’s framework, you’ll just be
plugging your own classes into it. This method is called with a hash
that contains as its keys parameter names that the
class’s constructor accepts and as its values the
default name of the contained class.
For example, HTML::Mason::Compiler
contains the
following code:
__PACKAGE__->contained_objects( lexer => 'HTML::Mason::Lexer' );
This says that the HTML::Mason::Compiler->new( )
method will accept a lexer
parameter and that, if
no such parameter is given, then an object of the
HTML::Mason::Lexer
class will be constructed.
Class::Container
also implements a bit of magic
here, so that if HTML::Mason::Compiler->new( )
is called with a lexer_class
parameter, it will
load the class, instantiate a new object of that class, and use that
for the lexer. In fact, it’s even smart enough to
notice if parameters given to HTML::Mason::Compiler->new( )
are really intended for this subclass, and it will make
sure that they get passed along.
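The lookup order just described (an explicit object, then a class named by a *_class parameter, then the registered default) can be modeled in a few lines of plain Perl. This is a toy sketch, not Class::Container's actual code: the Toy::* packages and the %CONTAINED table are invented for illustration, and the real module also loads the named class and forwards constructor parameters, both of which this sketch skips.

```perl
use strict;
use warnings;

package Toy::Lexer;
sub new { my ($class, %p) = @_; return bless {%p}, $class }

package Toy::FancyLexer;
our @ISA = ('Toy::Lexer');

package Toy::Compiler;

# Parameter name => default class, roughly what contained_objects( ) records
my %CONTAINED = ( lexer => 'Toy::Lexer' );

sub new {
    my ($class, %p) = @_;
    my $self = bless {}, $class;
    for my $name ( keys %CONTAINED ) {
        # An explicit object wins; otherwise use the class named by
        # "<name>_class", falling back to the registered default.
        my $object    = delete $p{$name};
        my $obj_class = delete $p{"${name}_class"} || $CONTAINED{$name};
        $self->{$name} = $object || $obj_class->new;
    }
    return $self;
}

package main;

print ref( Toy::Compiler->new->{lexer} ), "\n";
print ref( Toy::Compiler->new( lexer_class => 'Toy::FancyLexer' )->{lexer} ), "\n";
```

The first call prints the default class name and the second prints the class supplied through lexer_class.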
The valid_params( )
method is a bit more complex.
It also takes a list of key/value pairs as arguments. The keys are
the names of parameters accepted by the new( )
method, while the values are hash references defining a validation
specification for the parameter. This specification is largely the
same as that used by the
Params::Validate
module, with a few additions (but no subtractions).
One addition is that each parameter, excluding those that represent contained objects, may also define a value for parse. This tells Mason how to parse this parameter if it is defined as part of an Apache configuration file. If no parse parameter is provided, a sensible default will be guessed from the value of the Params::Validate type argument.
The upshot of this is that your subclasses can define their own constructor parameters and Mason will then check for these parameters in an Apache configuration file.
As an example,
HTML::Mason::Compiler
contains the following:
__PACKAGE__->valid_params
    (
     allow_globals =>
     { parse => 'list', type => ARRAYREF, default => [ ],
       descr => "An array of names of Perl variables that are" .
                " allowed globally within components" },

     default_escape_flags =>
     { parse => 'string', type => SCALAR, default => '',
       descr => "Escape flags that will apply by default to" .
                " all Mason tag output" },

     lexer =>
     { isa => 'HTML::Mason::Lexer',
       descr => "A Lexer object that will scan component" .
                " text during compilation" },

     preprocess =>
     { parse => 'code', type => CODEREF, optional => 1,
       descr => "A subroutine through which all component text" .
                " will be sent during compilation" },

     postprocess_perl =>
     { parse => 'code', type => CODEREF, optional => 1,
       descr => "A subroutine through which all Perl code" .
                " will be sent during compilation" },

     postprocess_text =>
     { parse => 'code', type => CODEREF, optional => 1,
       descr => "A subroutine through which all plain text will" .
                " be sent during compilation" },
    );

__PACKAGE__->contained_objects( lexer => 'HTML::Mason::Lexer' );
The type, default, and optional parameters are part of the validation specification used by Params::Validate. The various constants used, ARRAYREF, SCALAR, and so on, are all exported by Params::Validate. The parameters passed to valid_params( ) correspond to the MasonAllowGlobals, MasonDefaultEscapeFlags, MasonLexerClass, MasonPreprocess, MasonPostprocessPerl, and MasonPostprocessText httpd.conf configuration variables. Yes, Class is added automatically to the lexer param because lexer was also given to the contained_objects( ) method.
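The naming pattern in that list (allow_globals becomes MasonAllowGlobals, and so on) can be sketched as a small function. The helper name param_to_conf_name is hypothetical and is not part of Mason's public API; it only reproduces the pattern visible in the names listed above.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the httpd.conf variable name that
# corresponds to a constructor parameter name.
sub param_to_conf_name {
    my ($param) = @_;
    # Capitalize each underscore-separated word and join them together
    my $studly = join '', map { ucfirst lc } split /_/, $param;
    return "Mason$studly";
}

print param_to_conf_name($_), "\n"
    for qw( allow_globals default_escape_flags lexer_class postprocess_perl );
```

Note that the contained lexer object is looked up under lexer_class, which is how the MasonLexerClass variable arises.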
The descr
parameter is used when we generate the
HTML::Mason::Params
documentation and is probably
not something you’d need to use.
For more details, see both the Class::Container
and Params::Validate
documentation.
Syntax: Your Very Own Lexer
A request heard every so often on the Mason users list is for some way to create an XML-based markup language that can be used with Mason and that can be compiled to a Mason component object.
Despite the panic the thought of such a thing inspires in us, in the interests of good documentation, we will show the beginnings of such a lexer.
This lexer object will make use of several modules from CPAN, including XML::SAX::ParserFactory and XML::SAX::Base. The former is what it sounds like, a factory for SAX parsers (SAX2 parsers, actually). The latter is what any SAX2 handler should use as a base class. It implements a default no-op method for all the possible SAX2 methods, allowing you to simply implement those that you need. Our lexer will be a SAX2 handler, so we will inherit from XML::SAX::Base.
A quick side note on SAX (Simple API for XML): SAX is an event-based API for parsing XML. As the parser finds XML constructs, such as tags or character data, it calls appropriate methods in a SAX handler, such as start_element() or characters(). The parser is an event producer and the handler, like our Lexer, is an event consumer. In our case, the Lexer will also be generating events for the Compiler, though these will not be SAX events.
For more information on Perl’s implementation of SAX2, see the perl-xml project on Sourceforge at http://perl-xml.sourceforge.net/.
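To make the producer/consumer relationship concrete, here is a deliberately tiny model in plain Perl. It is not a SAX parser: toy_parse and Toy::Handler are invented for illustration, the "parser" only understands bare tags without attributes, and the event hashes carry just a Name or Data key.

```perl
use strict;
use warnings;

# Event consumer: records every event it is sent.
package Toy::Handler;
sub new           { return bless { events => [] }, shift }
sub start_element { push @{ $_[0]{events} }, "start:$_[1]{Name}" }
sub end_element   { push @{ $_[0]{events} }, "end:$_[1]{Name}" }
sub characters    { push @{ $_[0]{events} }, "chars:$_[1]{Data}" }

package main;

# Event producer: walks the text and calls handler methods as it goes.
sub toy_parse {
    my ($handler, $text) = @_;
    while ( $text =~ m{\G(?:<(/?)([\w:]+)>|([^<]+))}gc ) {
        if    ( defined $3 ) { $handler->characters( { Data => $3 } ) }
        elsif ( $1 )         { $handler->end_element( { Name => $2 } ) }
        else                 { $handler->start_element( { Name => $2 } ) }
    }
}

my $handler = Toy::Handler->new;
toy_parse( $handler, '<b>hi</b>' );
print join( ', ', @{ $handler->{events} } ), "\n";   # start:b, chars:hi, end:b
```

A real SAX2 handler receives richer element hashes (with keys such as Prefix, LocalName, and Attributes), as the lexer code later in this section shows.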
For the purposes of our example, let’s assume that any element that is not in the mason XML namespace will be output verbatim, as will any text. For tags, we’ll just implement <mason:args>, <mason:init>, <mason:perl>, and <mason:output> in this example.[25] The <mason:init> tag will contain XML-escaped Perl code, while the <mason:args> tag will contain zero or more <mason:arg> tags. Each <mason:arg> tag will have the attributes name and default, with name being required. We will also implement a <mason:component> tag in order to provide a single top-level containing tag for the component, which is an XML requirement.
This is only a subset of the Mason syntax set, but it’s enough to show you how to customize a fairly important part of the system.
Using these tags, we might have some XML like this:
<?xml version="1.0"?>
<mason:component xmlns:mason="http://www.masonbook.com/">
 This is plain text.
 <b>This is text in an HTML tag</b>
 <mason:perl>
  my $x;
  if ($y > 10) {
      $x = 10;
  } else {
      $x = 100;
  }
 </mason:perl>

 $x is <mason:output>$x</mason:output>
 $y is <mason:output>$y</mason:output>

 <mason:args>
  <mason:arg name="$y" />
  <mason:arg name="@z" default="(2,3)" />
 </mason:args>
 <mason:init>
  $y *= $_ foreach @z;
 </mason:init>
</mason:component>
OK, that looks just beautiful!
Let’s start with the preliminaries.
package HTML::Mason::Lexer::XML;

$VERSION = '0.01';

use strict;

use HTML::Mason::Exceptions( abbr => [ qw( param_error syntax_error error ) ] );

use HTML::Mason::Lexer;

use Params::Validate qw(:all);

use XML::SAX::Base;
use XML::SAX::ParserFactory;

use base qw(HTML::Mason::Lexer XML::SAX::Base);  # Lexer comes first
As mentioned before, XML::SAX::Base provides default no-op methods for all of the possible SAX2 events, of which there are many. Since we’re not interested in most of them, it’s nice to have them safely ignored. We inherit from HTML::Mason::Lexer because it provides a few methods that the compiler class needs, such as object_id( ).
Because we’re staunch generalists, we won’t insist that the XML namespace of our tags needs to be 'mason'. We’ll let the user override this with a parameter if desired:
__PACKAGE__->valid_params
    (
     xml_namespace =>
     { parse => 'string', type => SCALAR, default => 'mason',
       descr => "Prefix of XML tags indicating Mason sections" },
    );
We don’t need to make a separate new( )
method in our module, since we can just inherit the one
provided by our base Lexer class. The main action will happen in the
lex( )
method:
sub lex {
    my ($self, %p) = @_;

    local $self->{name} = $p{name};
    local $self->{compiler} = $p{compiler};
We need a convenient place to keep these, so we stick them into
$self
for the duration of lexing.
Perl’s local( )
function makes
sure these entries expire at the end of the lex( )
method:
    $self->{state} = [ ];
We’ll need to keep a stack of what tags
we’ve seen so we can check that tags
aren’t improperly nested and in order to handle
characters( )
events correctly:
    my $parser = XML::SAX::ParserFactory->parser( Handler => $self );
We could have created the parser object in our
new()
method, but to store it we would have had to
save it in the lexer object’s structure, which would
have created a circular reference. Doing it this way guarantees that
the reference to the parser will go out of scope when
we’re finished using it.
    $parser->parse_string( $p{comp_source} );
}
The last bit tells the parser to parse the component text we were given. That will cause the parser to in turn call methods for each SAX event that occurs while parsing the string.
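The use of local on hash entries in lex( ) can be seen in isolation in the following sketch. The $lexer hash and the two subroutines are stand-ins invented for this illustration; they are not Mason code.

```perl
use strict;
use warnings;

my $lexer = { name => undef };

sub current_name {
    return defined $lexer->{name} ? $lexer->{name} : '(none)';
}

sub lex_like {
    my ($name) = @_;
    local $lexer->{name} = $name;   # old value is restored when we return
    return current_name();          # code called from here sees the new value
}

print lex_like('/index.html'), "\n";   # /index.html
print current_name(), "\n";            # (none)
```

The restore happens however the subroutine exits, including via an exception, which is convenient for a lexer that may throw a syntax error partway through.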
Now we’ll take a look at our event-handling methods.
The first is start_element( ), which will be called whenever an XML tag is first encountered:
sub start_element {
    my $self = shift;
    my $elt  = shift;

    if ( ! defined $elt->{Prefix} ||
         $elt->{Prefix} ne $self->{xml_namespace} ) {
        $self->_verbatim_start_element($elt);
        return;
    }
If we got something that isn’t in our designated namespace we’ll just pass it through to the compiler as text to be output:
    if ( $elt->{LocalName} eq 'component' ) {
        $self->{compiler}->start_component;
    }
When the component starts, we notify the compiler so it can do any initialization that it needs to do:
    foreach my $block ( qw( init perl args ) ) {
        if ( $elt->{LocalName} eq $block ) {
            $self->_start_block($block);
            last;
        }
    }

    if ( $elt->{LocalName} eq 'output' ) {
        $self->_start_output;
    }

    if ( $elt->{LocalName} eq 'arg' ) {
        $self->_handle_argument($elt);
    }
}
The rest of this method is basically a switch statement. Depending on what type of element we receive, we call the appropriate internal method to handle that element.
Let’s look at some of the individual methods that are called:
sub _verbatim_start_element {
    my $self = shift;
    my $elt  = shift;

    my $xml = '<' . $elt->{Name};

    my @att;
    foreach my $att ( values %{ $elt->{Attributes} } ) {
        push @att, qq|$att->{Name}="$att->{Value}"|;
    }

    if (@att) {
        $xml .= ' ';
        $xml .= join ' ', @att;
    }

    $xml .= '>';

    $self->{compiler}->text( text => $xml );
}
Basically, this method goes through some contortions to regenerate the original XML element and then passes it on to the compiler as plain text. It should be noted that this implementation will end up converting tags like <foo/> into tag pairs like <foo></foo>. This is certainly valid XML but it may be a bit confusing to users. Unfortunately, there is no easy way to retrieve the exact text of the source document to determine how a tag was originally written, and with XML you’re not supposed to care anyway.
Back to our subclass. The next method to implement is our
_start_block( )
method. This will handle the
beginning of a number of blocks in a simple generic fashion:
sub _start_block {
    my $self  = shift;
    my $block = shift;

    if ( $self->{state}[-1] &&
         $self->{state}[-1] ne 'def' &&
         $self->{state}[-1] ne 'method' ) {
        syntax_error "Cannot nest a $block tag inside a $self->{state}[-1] tag";
    }
What we are doing here is making it impossible to do something like
nest a <mason:init>
tag inside a
<mason:perl>
block. In fact, the only tags
that can contain other tags are method and subcomponent definition
tags, which are unimplemented in this example.
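The nesting rule can be tried out on its own with an ordinary array as the stack. In this sketch, nesting_error is a hypothetical stand-alone function that returns a message instead of throwing, so it runs without Mason installed.

```perl
use strict;
use warnings;

# Returns an error message, or nothing if $tag may be opened here.
sub nesting_error {
    my ($stack, $tag) = @_;
    my $current = $stack->[-1];
    return unless $current;                                 # top level
    return if $current eq 'def' || $current eq 'method';    # may contain tags
    return "Cannot nest a $tag tag inside a $current tag";
}

my @state;
for my $tag (qw( method init perl )) {
    my $err = nesting_error( \@state, $tag );
    print $err ? "$err\n" : "open $tag\n";
    push @state, $tag unless $err;
}
```

Here method and init are accepted, while perl is rejected because the innermost open tag is init.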
We notify the compiler that a new block has started and then push the block name onto our internal stack so we have access to it later:
    $self->{compiler}->start_block( block_type => $block );
    push @{ $self->{state} }, $block;
}
Again, we check for basic logical errors:
sub _start_output {
    my $self = shift;

    if ( $self->{state}[-1] &&
         $self->{state}[-1] ne 'def' &&
         $self->{state}[-1] ne 'method' ) {
        syntax_error "Cannot nest an output tag inside a $self->{state}[-1] tag";
    }
Again, we push this onto the stack so we know that this was the last tag we saw:
    push @{ $self->{state} }, 'output';
}
The variable name and default are expressed as attributes of the element. The weird '{}name' syntax is intentional. Read the Perl SAX2 spec mentioned earlier for more details on what this means.
sub _handle_argument {
    my $self = shift;
    my $elt  = shift;

    my $var     = $elt->{Attributes}{'{}name'}{Value};
    my $default = $elt->{Attributes}{'{}default'}{Value};
We want to check that the variable name is a valid Perl variable name:
    # Anchor at both ends so trailing junk after the name is rejected too
    unless ( $var =~ /^[\$\@%][^\W\d]\w*\z/ ) {
        syntax_error "Invalid variable name: $var";
    }
Then we tell the compiler that we just found a variable declaration.
    $self->{compiler}->variable_declaration( block_type => 'args',
                                             type => substr( $var, 0, 1 ),
                                             name => substr( $var, 1 ),
                                             default => $default );
}
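The name check and the sigil/name split performed in _handle_argument( ) can be exercised separately. This sketch uses an end-anchored form of the pattern (\z at the end), so that a value such as '$a-b' is rejected; is_valid_var and split_var are names made up for this illustration.

```perl
use strict;
use warnings;

# A sigil, then a letter or underscore, then any number of word characters
my $valid_var = qr/^[\$\@%][^\W\d]\w*\z/;

sub is_valid_var { return $_[0] =~ $valid_var ? 1 : 0 }

# Split a declaration into the two pieces the compiler is given
sub split_var {
    my ($var) = @_;
    return ( substr( $var, 0, 1 ), substr( $var, 1 ) );
}

for my $name ( '$y', '@z', '%opts', 'y', '$9lives', '$a-b' ) {
    printf "%-8s %s\n", $name, is_valid_var($name) ? 'ok' : 'invalid';
}

my ($type, $name) = split_var('@z');
print "type=$type name=$name\n";   # type=@ name=z
```

The first three names pass; the last three fail for lack of a sigil, a leading digit, and a stray hyphen, respectively.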
That wraps up all the methods that start_element( )
calls. Now let’s move
on to handling a characters( )
SAX event. This
happens whenever the SAX parser encounters data outside of an XML
tag.
sub characters {
    my $self  = shift;
    my $chars = shift;

    if ( ! $self->{state}[-1] ||
         $self->{state}[-1] eq 'def' ||
         $self->{state}[-1] eq 'method' ) {
        $self->{compiler}->text( text => $chars->{Data} );
        return;
    }
If we’re in the main body of a component, subcomponent, or method, we simply pass the character data on as text:
    if ( $self->{state}[-1] eq 'init' ||
         $self->{state}[-1] eq 'perl' ) {
        $self->{compiler}->raw_block( block_type => $self->{state}[-1],
                                      block => $chars->{Data} );
        return;
    }
Character data in a <mason:init>
or
<mason:perl>
section is passed to the
compiler as the contents of that block. The compiler knows what type
of tag is currently being processed and handles it appropriately.
    if ( $self->{state}[-1] eq 'output' ) {
        $self->{compiler}->substitution( substitution => $chars->{Data} );
    }
}
If we are in a substitution tag, we call a different compiler method instead. Otherwise, we’ll simply end up discarding the contents.
Since we may be dealing with text where whitespace is significant (as opposed to HTML), we’ll want to pass on whitespace as if it were character data:
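# characters( ) expects the event hash itself and reads its Data key,
# so we hand over the whole hash rather than just the string.
sub ignorable_whitespace { $_[0]->characters($_[1]) }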
This method may be called if the XML parser finds a chunk of “ignorable whitespace.” Frankly, we can never ignore whitespace, because it is just so cool, and without it our code would be unreadable. But apparently XML parsers can ignore it.[26]
The last thing we need to handle is
an
end_element( )
event:
sub end_element {
    my $self = shift;
    my $elt  = shift;

    if ( ! defined $elt->{Prefix} ||
         $elt->{Prefix} ne $self->{xml_namespace} ) {
        $self->_verbatim_end_element($elt);
        return;
    }
Again, XML elements not in our designated namespace are passed on verbatim to the compiler:
    if ( $elt->{LocalName} eq 'component' ) {
        $self->{compiler}->end_component;
        return;
    }
If we have reached the end tag of the component, we inform the compiler that the component is complete and we return:
    return if $elt->{LocalName} eq 'arg';
We don’t need to do anything to end argument
declarations. The work needed to handle this element happened when we
called _handle_argument( )
from our
start_element( )
method.
    if ( $self->{state}[-1] ne $elt->{LocalName} ) {
        syntax_error "Something very weird happened. " .
                     "We encountered an ending tag for a $elt->{LocalName} tag " .
                     "before ending our current tag ($self->{state}[-1]).";
    }
Actually, this should just never happen: XML does not allow tag overlap and, if the parser finds overlapping tags, it should die rather than passing them to us. But we believe in being paranoid. If there is an error in the logic of this lexer code, this might help us in catching it.
    if ( $elt->{LocalName} eq 'output' ) {
        pop @{ $self->{state} };
        return;
    }
Any output that needed to be sent has already been dealt with via the characters( ) method so we simply need to change our state if the end tag was </mason:output>:
    $self->{compiler}->end_block( block_type => $elt->{LocalName} );
    pop @{ $self->{state} };
}
The only remaining possibilities at this point are either <mason:perl>, <mason:init>, or <mason:args>. For these we simply tell the compiler that the block is over, change our state, and finish.
The last method we need to write is
_verbatim_end_element()
to pass through tag
endings for non-Mason tags:
sub _verbatim_end_element {
    my $self = shift;
    my $elt  = shift;

    $self->{compiler}->text( text => "</$elt->{Name}>" );
}
This concludes our sample lexer subclass. Note that there are a couple of things missing here. First of all, there is no handling of subcomponents or methods. This wouldn’t be too terribly hard as it’s mostly an issue of calling the right methods on the compiler.
We also would want to handle line numbers. The default Mason lexer
keeps track of line numbers in the source file so that the compiler
can output appropriate #line
directives in the
object file, meaning that errors are reported relative to the source
file. This feature isn’t required but can be very
nice to have.
Some of the unhandled potential tags like
<mason:text>
would be extremely trivial to
implement. The <mason:flags>
and
<mason:attr>
tags could be modeled on the
code for handling <mason:args>
. And of
course, we need to handle component calls too. This is the point in
this example where we say, “finishing this is left
as an exercise to the reader.”
To use this new lexer class, we would either place the following in the httpd.conf file:
PerlSetVar MasonLexerClass HTML::Mason::Lexer::XML
or, when creating the ApacheHandler object, we would simply pass in 'HTML::Mason::Lexer::XML' as the value of the lexer_class parameter.
Output: Compiling to a Different Output
So you’ve decided that you really hate Mason and you want to use Embperl instead. But you have a number of Mason components you’ve already written that you’d like to save. Well, you can create your own compiler to generate Embperl code from Mason. In this case, we’ll use the lexer as is and rewrite the compiler from scratch. There isn’t really a one-to-one match between Mason and Embperl’s features so this example will, like the lexer example, be limited in scope. Finding an intelligent way to convert Mason’s methods and subcomponents to Embperl is beyond the scope of this book.
In case you are unfamiliar with Embperl, it uses the following syntax: [+ +] tags contain code whose results should be sent to the browser, like Mason’s substitution tag (<% %>). The [* *] tags contain Perl code that is not intended to generate output. This is equivalent to Mason’s %-lines and <%perl> blocks. Finally, Embperl also has a [! !] tag similar to Mason’s <%once> block.
There are other Embperl tags but, once again, this is a simplified example.
Embperl does have a feature similar to Mason’s inheritance system called EmbperlObject, but translating between the two is nontrivial.
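The correspondence just described can be written down as a lookup table. This is only a sketch of the mapping as stated in the text above; %EMBPERL_TAG and to_embperl are names invented for this illustration and are not part of Mason or Embperl.

```perl
use strict;
use warnings;

# Mason construct => Embperl delimiters, per the description above
my %EMBPERL_TAG = (
    substitution => [ '[+', '+]' ],   # Mason's <% %>
    perl         => [ '[*', '*]' ],   # %-lines and <%perl> blocks
    once         => [ '[!', '!]' ],   # <%once> blocks
);

sub to_embperl {
    my ($kind, $code) = @_;
    my $tag = $EMBPERL_TAG{$kind}
        or die "No Embperl equivalent recorded for '$kind'\n";
    return "$tag->[0] $code $tag->[1]";
}

print to_embperl( substitution => '$x' ), "\n";            # [+ $x +]
print to_embperl( perl         => 'my $x = 10;' ), "\n";   # [* my $x = 10; *]
```

The compiler developed in the rest of this section applies the same wrapping, but spreads it across the callback methods that the lexer invokes.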
So let’s make our new compiler:
package HTML::Mason::Compiler::ToEmbperl;

$VERSION = '0.01';

use strict;

use HTML::Mason::Lexer;
use HTML::Mason::Exceptions ( abbr => [qw(syntax_error)] );
use HTML::Mason::Compiler;
use base qw(HTML::Mason::Compiler);
This pulls in the basic packages we’ll need. Even though we really aren’t inheriting much from HTML::Mason::Compiler, we still subclass it as anything expecting a compiler will check that what it is given is a subclass of HTML::Mason::Compiler.
Of course, in our case, we won’t be using this compiler with the HTML::Mason::Interp class, so the point is moot but important to mention.
sub compile {
    my ($self, %p) = @_;

    $self->lexer->lex( comp_source => $p{comp_source},
                       name => 'Embperl',
                       compiler => $self );

    return $self->component_as_embperl;
}
The only parameter we expect is comp_source. We tell the lexer the name of the component is 'Embperl' since we don’t really care what the name is in this context. Presumably we are being called by some sort of script that is simply going to take the Embperl-ized component and write it to disk somewhere. The name is used for reporting syntax errors when a component is run, but that won’t be an issue in this case.
sub start_component {
    my $self = shift;

    $self->{once_header} = '';
    $self->{header} = '';
    $self->{body}   = '';
    $self->{footer} = '';

    # The attribute the other methods consult to track the current block
    $self->{in_block} = '';
}
This method is called to give the compiler a chance to reset its state, so that’s what we do.
We will be storing blocks of code in each of the first four attributes. When we encounter a <%once> block, it will go in the once_header attribute. For <%init> blocks, we can put them in the header attribute. %-lines, <%perl> blocks, <%text> blocks, substitution tags, and text will be placed immediately into the body attribute. Finally, any <%cleanup> blocks will go into the footer attribute.
The in_block attribute will be used to keep track of what type of block we are in after a call to our start_block() method.
This example will ignore other Mason syntax such as component calls, subcomponents, methods, and <%shared>. Again, this will be left as an exercise for the reader.
sub start_block {
    my ($self, %p) = @_;

    syntax_error "Cannot nest a $p{block_type} inside a $self->{in_block} block"
        if $self->{in_block};
This is to make sure that the component is following the syntax rules we expect.
    $self->{in_block} = $p{block_type};
}
Then we record what kind of block we are starting, which will be something like init or perl.
The next method, raw_block(), is called for all of the blocks that we handle except the <%text> block:
sub raw_block {
    my ($self, %p) = @_;

    for ($self->{in_block}) {
        /^once$/    and $self->{once_header} .= $p{block};
        /^init$/    and $self->{header}      .= $p{block};
        /^perl$/    and $self->{body}        .= "[* $p{block} *]";
        /^cleanup$/ and $self->{footer}      .= $p{block};
    }
}
This switchlike statement stores the code given to us in the appropriate attribute. If it is a <%perl> block, we wrap it in the relevant Embperl tag; otherwise, we simply store it as is in the appropriate slot.
sub text_block {
    my ($self, %p) = @_;
    $self->{body} .= $p{block};
}

sub text {
    my ($self, %p) = @_;
    $self->{body} .= $p{text};
}
The first method is called when the lexer finds a
<%text>
block. The second is called for
regular text. Both of these get placed into the
body
attribute for later use.
sub substitution {
    my ($self, %p) = @_;
    $self->{body} .= "[+ $p{substitution} +]";
}
This method handles substitution tags (<% %>) though it ignores the fact that this method can also be given an escape parameter. This could be handled via Embperl’s $escmode variable (again, left as an exercise for the reader).
sub perl_line {
    my ($self, %p) = @_;

    $self->{body} .= "[* $p{line} *]";
}
This method is called for %-lines.
Then we need to implement the end_block() method:
sub end_block {
    my ($self, %p) = @_;

    syntax_error "end of $p{block_type} encountered while in $self->{in_block} block"
        unless $self->{in_block} eq $p{block_type};
This is another sanity check, like the one in the start_block() method. It’s always a good thing to make sure that the lexer is giving us the kind of input that we would expect.
    $self->{in_block} = undef;
}
And we reset our in_block
attribute so that the
next call to start_block()
succeeds.
The last method to implement is
the
component_as_embperl()
method, which simply will
return a big block of text, our new Embperl page:
sub component_as_embperl {
    my $self = shift;

    my $page = '';

    if ( length $self->{once_header} ) {
        $page .= "[! $self->{once_header} !]\n";
    }

    if ( length $self->{header} ) {
        $page .= "[* $self->{header} *]\n";
    }

    if ( length $self->{body} ) {
        $page .= "$self->{body}\n";
    }

    if ( length $self->{footer} ) {
        $page .= "[* $self->{footer} *]\n";
    }

    return $page;
}
And there you have it — a perfectly good Mason component brutally butchered and turned into an Embperl page. I hope you’re happy with yourself!
Storage: Replacing the Resolver
Occasionally, people on the Mason users list wonder if they can store
their component source in an RDBMS. The way to achieve this is to
create your own
HTML::Mason::Resolver
subclass.
The resolver’s job is to take a component path and figure out where the corresponding component is.
We will show an example that connects to a MySQL server containing the following table:
MasonComponent
----------------------------------------
path           VARCHAR(255)  PRIMARY KEY
component      TEXT          NOT NULL
last_modified  DATETIME      NOT NULL
Our code starts as follows:
package HTML::Mason::Resolver::MySQL;

$VERSION = '0.01';

use strict;

use DBI;
use Params::Validate qw(:all);

use HTML::Mason::ComponentSource;
use HTML::Mason::Resolver;
use base qw(HTML::Mason::Resolver);

__PACKAGE__->valid_params
    (
     db_name  => { parse => 'string', type => SCALAR },
     user     => { parse => 'string', type => SCALAR, optional => 1 },
     password => { parse => 'string', type => SCALAR, optional => 1 },
    );
These parameters will be used to connect to the MySQL server containing our components. Readers familiar with the Perl DBI will realize that there are a number of other parameters that we could take.
Our constructor method, new( ), needs to do a bit of initialization to set up the database connection, so we override our base class’s method:
sub new {
    my $class = shift;
    my $self = $class->SUPER::new(@_);
We invoke the new( )
method provided by our
superclass, which validates the parameters in @_
and makes sure they get sent to the right contained objects. The
latter concern doesn’t seem so important in this
case since we don’t have any contained objects, but
the point is that if somebody subclasses our
HTML::Mason::Resolver::MySQL
class and adds
contained objects, our new( )
method will still do
the right thing with its parameters.
Now we connect to the database in preparation for retrieving components later:
    $self->{dbh} =
        DBI->connect
            ( "dbi:mysql:$self->{db_name}",
              $self->{user}, $self->{password}, { RaiseError => 1 } );

    return $self;
}
A resolver needs to implement two methods left unimplemented in the parent HTML::Mason::Resolver class. These are get_info() and glob_path(). The first is used to retrieve information about the component matching a particular component path. The second takes a glob pattern like /path/* or /path/*/foo/* and returns the component paths of all the components that match that wildcard path.
Additionally, if we want this resolver to be usable with the ApacheHandler module, we need to implement a method called apache_request_to_comp_path( ), which takes an Apache object and translates it into a component path.
Given a path, we want to get the time when this component was last modified, in the form of a Unix timestamp, which is what Mason expects:
sub get_info {
    my ($self, $path) = @_;

    my ($last_mod) =
        $self->{dbh}->selectrow_array
            ( 'SELECT UNIX_TIMESTAMP(last_modified) FROM MasonComponent WHERE path = ?',
              {}, $path );

    return unless $last_mod;
If there was no entry in the database for the given path, we simply return, which lets Mason know that no matching component was found:
    return
        HTML::Mason::ComponentSource->new
            ( comp_path       => $path,
              friendly_name   => $path,
              last_modified   => $last_mod,
              comp_id         => $path,
              source_callback => sub { $self->_get_source($path) },
            );
}
The get_info()
method returns its information in
the form of a
HTML::Mason::ComponentSource
object. This is a very simple class that
holds information about a component.
Its constructor accepts the following parameters:
- comp_path
- friendly_name
The string given for this parameter will be used to identify the component in error messages. For our resolver, the component path works for this parameter as well because it is the primary key for the MasonComponent table in the database, allowing us to uniquely identify a component.
For other resolvers, this might differ from the component path. For example, the filesystem resolver that comes with Mason uses the component’s absolute path on the filesystem.
- last_modified
This is the last modification time for the component, as seconds since the epoch.
- comp_id
This should be a completely unique identifier for the component. Again, since the component path is our primary key in the database, it works well here.
- source_callback
This is a subroutine reference that, when called, returns the source text of the component.
Mason could have had you simply create an HTML::Mason::ComponentSource subclass that implemented a source() method for your resolver, but we thought that rather than requiring you to write such a do-nothing subclass, it would be easier to simply use a callback instead.
Our _get_source() method is trivially simple:
sub _get_source {
    my $self = shift;
    my $path = shift;

    return
        $self->{dbh}->selectrow_array
            ( 'SELECT component FROM MasonComponent WHERE path = ?', {}, $path );
}
- comp_class
This is the component class into which this particular component should be blessed when it is created. This must be a subclass of HTML::Mason::Component. The default is HTML::Mason::Component.
- extra
This optional parameter should be a hash reference. It is used to pass information from the resolver to the component class.
This is needed since an HTML::Mason::Resolver subclass and an HTML::Mason::Component subclass can be rather tightly coupled, but they must communicate with each other through the interpreter (this may change in the future).
Next is our glob_path() method:
sub glob_path {
    my $self = shift;
    my $pattern = shift;

    $pattern =~ s/\*/%/g;
The pattern given will be something that could be passed to
Perl’s
glob()
function. We simply replace this with
the SQL equivalent for a LIKE
search:
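    # DBI provides selectcol_arrayref(), which returns an array reference,
    # so we dereference it to return the list of paths.
    return
        @{ $self->{dbh}->selectcol_arrayref
               ( 'SELECT path FROM MasonComponent WHERE path LIKE ?', {}, $pattern ) };
}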
Then we return all the matching paths in the database.
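The pattern translation can be tested without a database. The sketch below goes one step beyond the substitution in glob_path(): it also backslash-escapes any % or _ already present in the path, on the assumption that backslash is the LIKE escape character (MySQL's default). The function name glob_to_like is hypothetical.

```perl
use strict;
use warnings;

# Translate a simple glob pattern (only * is handled) into a LIKE pattern.
sub glob_to_like {
    my ($pattern) = @_;
    $pattern =~ s/([%_\\])/\\$1/g;   # protect characters special to LIKE
    $pattern =~ s/\*/%/g;            # glob wildcard => SQL wildcard
    return $pattern;
}

print glob_to_like('/path/*'),       "\n";   # /path/%
print glob_to_like('/path/*/foo/*'), "\n";   # /path/%/foo/%
print glob_to_like('/my_dir/*'),     "\n";   # /my\_dir/%
```

One difference worth keeping in mind is that % in a LIKE pattern also matches slashes, so '/path/%' can match components in deeper directories than the glob /path/* would.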
Since we may want to use this resolver with ApacheHandler, we will
also implement the
apache_request_to_comp_path( )
method:
sub apache_request_to_comp_path {
    my $self = shift;
    my $r = shift;

    my $path = $r->uri;

    return $path
        if $self->{dbh}->selectrow_array
               ( 'SELECT 1 FROM MasonComponent WHERE path = ?', {}, $path );

    return undef unless $r->path_info;

    $path .= $r->path_info;

    return $path
        if $self->{dbh}->selectrow_array
               ( 'SELECT 1 FROM MasonComponent WHERE path = ?', {}, $path );

    return undef;
}
We generate a component path by taking the requested URI and looking for it in the database. If it doesn't exist and the request has path info, we append the path info and try the altered path. If there is no path info, or the altered path doesn't exist either, we give up and return undef, which will cause the ApacheHandler module to return a NOT FOUND status for this request.
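To make the lookup order easier to follow, here is a small sketch that runs the same decision logic without Apache or a database. The hash %known stands in for the MasonComponent table, and resolve_comp_path() is a hypothetical helper, not a Mason method.

```perl
use strict;
use warnings;

# Stand-in for the MasonComponent table: the set of known component paths.
my %known = map { $_ => 1 } ( '/index.html', '/news/archive' );

# Hypothetical helper mirroring the lookup order described above.
sub resolve_comp_path {
    my ( $uri, $path_info ) = @_;

    return $uri if $known{$uri};

    # No path info means there is nothing else to try.
    return undef unless defined $path_info && length $path_info;

    my $path = $uri . $path_info;
    return $known{$path} ? $path : undef;
}

for my $case ( [ '/index.html', '' ], [ '/news', '/archive' ], [ '/missing', '' ] ) {
    my $found = resolve_comp_path(@$case);
    print defined $found ? $found : 'NOT FOUND', "\n";
}
```

The three cases cover a direct hit, a hit only after appending the path info, and a miss.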
That’s it, all done. And nothing left as an exercise for the reader this time.
As with the lexer, this can be used either via an httpd.conf directive:
PerlSetVar MasonResolverClass HTML::Mason::Resolver::MySQL
or by passing the resolver_class
parameter to the
new()
method for
HTML::Mason::Interp
.
Request: A Request Object with a Built-in Session
Wouldn’t it be cool to have a request object with a built-in session? “Yes, it would,” you answer. “Child’s play,” we say.
When a request is made using this object, it should either find an old
session or create a new one. Then in our components we will simply
call $m->session()
to get back a hash reference
that will persist between requests.
For simplicity’s sake, we won’t make this class configurable as to what type of session to use, though it could be done.[27]
    package HTML::Mason::Request::WithSession;

    $VERSION = '0.01';

    use strict;

    # Import a subroutine error( ) which throws an HTML::Mason::Exception
    # object
    use HTML::Mason::Exceptions ( abbr => [ 'error' ] );

    use HTML::Mason::ApacheHandler;
    use base qw(HTML::Mason::Request);
One problem unique to subclassing the Request object is that Mason
already comes with two of its own Request subclasses. These are
HTML::Mason::Request::ApacheHandler
and
HTML::Mason::Request::CGI
, which are used
by the ApacheHandler and CGIHandler, respectively.
In order to cooperate with the ApacheHandler and CGIHandler modules,
we want to subclass the appropriate class. However, we
can’t know which one to subclass when we are loaded,
because it is possible that we will be loaded
before the ApacheHandler or CGIHandler module.
We’ll take care of this in our
new()
method, which will be discussed momentarily.
Our session will be implemented using cookies and Cache::FileCache for storage, just as we saw in Chapter 11:
    use Apache::Cookie;
    use Cache::FileCache;
    use Digest::SHA1;
We solve our subclassing problem with the following code. There is
nothing wrong with changing a class’s inheritance
dynamically in Perl, so that’s what we do. The
alter_superclass()
method is provided by the
HTML::Mason::Request
base class, and does the
right thing even given multiple inheritance. It also cooperates with
Class::Container
to make sure that it sees any
changes made to the inheritance hierarchy:
    sub new {
        my $class = shift;

        $class->alter_superclass( $HTML::Mason::ApacheHandler::VERSION ?
                                  'HTML::Mason::Request::ApacheHandler' :
                                  $HTML::Mason::CGIHandler::VERSION ?
                                  'HTML::Mason::Request::CGI' :
                                  'HTML::Mason::Request' );

        return $class->SUPER::new(@_);
    }
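If changing inheritance at runtime is unfamiliar, the following sketch shows the underlying idea in plain Perl, without Mason. It assigns to @ISA directly rather than calling alter_superclass(), so it does not handle multiple inheritance or notify Class::Container. All the package names and the $My::Env::USE_APACHE flag are made up for illustration; the flag stands in for the $HTML::Mason::ApacheHandler::VERSION check.

```perl
use strict;
use warnings;

package My::Base;
sub new      { my $class = shift; return bless( {}, $class ) }
sub describe { return 'plain request' }

package My::Base::Apache;
our @ISA = ('My::Base');
sub describe { return 'Apache request' }

package My::WithSession;
our @ISA = ('My::Base');

sub new {
    my $class = shift;

    # Pick the parent at construction time, once we know what is loaded.
    @ISA = ( $My::Env::USE_APACHE ? 'My::Base::Apache' : 'My::Base' );

    return $class->SUPER::new(@_);
}

package main;

$My::Env::USE_APACHE = 1;
my $request = My::WithSession->new;
print $request->describe, "\n";    # prints Apache request
```

Because SUPER:: is resolved through @ISA when the method is called, the constructor picks up the new parent immediately after the assignment.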
We make a session, call exec()
in our parent
class, taking care to preserve the caller’s
scalar/list context, and then save the session. If an exception is
thrown, we simply rethrow it:
    sub exec {
        my $self = shift;

        $self->_make_session;

        my @result;

        if (wantarray) {
            @result = eval { $self->SUPER::exec(@_) };
        } elsif (defined wantarray) {
            $result[0] = eval { $self->SUPER::exec(@_) };
        } else {
            eval { $self->SUPER::exec(@_) };
        }

        # copy this in case _save_session overwrites $@
        my $e = $@;

        $self->_save_session;

        die $e if $e;

        return wantarray ? @result : defined wantarray ? $result[0] : undef;
    }
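The context-preserving pattern used in exec() is worth seeing in isolation. The sketch below applies it to an arbitrary code reference; call_with_cleanup() is a hypothetical helper of our own, not something Mason provides.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code in the caller's context, always run
# $cleanup afterward, and rethrow any exception from $code.
sub call_with_cleanup {
    my ( $code, $cleanup ) = @_;

    my $want = wantarray;
    my @result;

    if ($want) {
        @result = eval { $code->() };
    }
    elsif ( defined $want ) {
        $result[0] = eval { $code->() };
    }
    else {
        eval { $code->() };
    }

    # Copy $@ before cleanup, which might clobber it.
    my $error = $@;
    $cleanup->();
    die $error if $error;

    return $want ? @result : $result[0];
}

# Reports the context in which it was called.
my $context = sub { wantarray ? 'list' : defined wantarray ? 'scalar' : 'void' };

my @list   = call_with_cleanup( $context, sub { } );
my $scalar = call_with_cleanup( $context, sub { } );

print "$list[0] $scalar\n";    # prints list scalar
```

The three-way branch is needed because eval passes its own context through to the wrapped call, so each branch must invoke the code in the matching context.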
Making a new session for subrequests is probably incorrect behavior, so we simply reuse our parent's session object if a subrequest is exec()'d:
    sub _make_session {
        my $self = shift;

        if ( $self->is_subrequest ) {
            $self->{session} = $self->parent_request->session;
            return;
        }
This code is pulled almost verbatim from Chapter 11:
        my %c = Apache::Cookie->fetch;
        my $session_id =
            exists $c{masonbook_session} ? $c{masonbook_session}->value : undef;

        $self->{session_cache} =
            Cache::FileCache->new( { cache_root => '/tmp',
                                     namespace => 'Mason-Book-Session',
                                     default_expires_in => 60 * 60 * 24, # 1 day
                                     auto_purge_interval => 60 * 60 * 24, # 1 day
                                     auto_purge_on_set => 1 } );

        my $session;
        if ($session_id) {
            $session = $self->{session_cache}->get($session_id);
        }

        unless ($session) {
            $session = { _session_id => Digest::SHA1::sha1_hex( time, rand, $$ ) };
        }

        Apache::Cookie->new( $self->apache_req,
                             name => 'masonbook_session',
                             value => $session->{_session_id},
                             path => '/',
                             expires => '+1d',
                           )->bake;

        $self->{session} = $session;
    }
Also just like Chapter 11:
    sub _save_session {
        my $self = shift;

        $self->{session_cache}->set
            ( $self->{session}{_session_id} => $self->{session} );
    }
And to finish it off, a simple accessor method:
sub session { $_[0]->{session} }
Wow, nice and simple. Of course, this would need to be customized for
your environment, or you can use the previously mentioned
HTML::Mason::Request::WithApacheSession
module
available from CPAN.
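The find-or-create logic at the heart of this class can be tried out without Apache or Cache::FileCache. In this sketch an in-memory hash stands in for the file cache, the session ID is passed in directly rather than read from a cookie, and the core Digest::SHA module replaces Digest::SHA1. The helper names are hypothetical.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# In-memory stand-in for Cache::FileCache, keyed by session ID.
my %store;

# Hypothetical helper: return the stored session for $id, or a new one.
sub find_or_create_session {
    my $id = shift;

    my $session = defined $id ? $store{$id} : undef;
    $session ||= { _session_id => sha1_hex( time, rand, $$ ) };

    return $session;
}

# Hypothetical helper: store the session under its own ID.
sub save_session {
    my $session = shift;
    $store{ $session->{_session_id} } = $session;
}

# First request: no cookie yet, so a new session is created.
my $first = find_or_create_session(undef);
$first->{visits}++;
save_session($first);

# Second request: the cookie carries the ID, so the same data comes back.
my $second = find_or_create_session( $first->{_session_id} );
$second->{visits}++;
save_session($second);

print "visits: $second->{visits}\n";    # prints visits: 2
```

An unknown or expired ID falls through to the same branch as a missing one, which matches the behavior of _make_session() above.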
Once again, you have two options to use this new subclass. If you are configuring Mason via your httpd.conf file, do this:
PerlSetVar MasonRequestClass HTML::Mason::Request::WithSession
or in your handler.pl you can load the module
and then pass a request_class
parameter to the
HTML::Mason::Interp
class’s
constructor.
Argument Munging: ApacheHandler
One of the main reasons that you might consider creating your own ApacheHandler class is to change the way arguments are processed. For example, we might want to create objects based on certain incoming arguments.
Our subclass starts like many others:
    package HTML::Mason::ApacheHandler::AddObjects;

    $VERSION = '0.01';

    use strict;

    use HTML::Mason::ApacheHandler;
    use base qw(HTML::Mason::ApacheHandler);
This should look pretty familiar. Now we’ll load a few more classes, which we’ll be using to create objects:
    use Date::ICal;   # date object
    use MyApp::User;  # user object
And now we override the argument-processing subroutine,
request_args()
:
    sub request_args {
        my $self = shift;

        my ($args, $r, $cgi_object) = $self->SUPER::request_args(@_);
ApacheHandler’s request_args()
method returns three items. The first is a hash reference containing
the arguments that will be passed to the component. The second is the
Apache
or Apache::Request
object for the current request, and the third is a
CGI.pm
object. The CGI.pm
object is created only when the ApacheHandler’s
args_method
attribute is set to
CGI
.
        if ( exists $args->{epoch} ) {
            $args->{date} = Date::ICal->new( epoch => $args->{epoch} );
        }

        if ( exists $args->{user_id} ) {
            $args->{user} = MyApp::User->new( user_id => $args->{user_id} );
        }
This bit of code simply creates some useful objects if certain
incoming arguments exist. Finally we return the munged
$args
hash reference, along with the other return
values from our superclass’s
request_args()
method:
        return ($args, $r, $cgi_object);
    }
Now, whenever the client submits an argument called
epoch
, there will be an additional argument,
date
, a Date::ICal
object,
available in components. Similarly, a request with a
user_id
parameter means that there will be an
argument user
containing a
MyApp::User
object passed to components.
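The override pattern can be exercised on its own with stub classes. In the sketch below, My::Handler stands in for HTML::Mason::ApacheHandler and returns fixed arguments, and My::Date stands in for Date::ICal. All of these names are hypothetical and exist only to make the example runnable without Apache.

```perl
use strict;
use warnings;

# Stub standing in for HTML::Mason::ApacheHandler; it returns fixed arguments.
package My::Handler;
sub new { my $class = shift; return bless( {}, $class ) }

sub request_args {
    my $self = shift;
    return ( { epoch => 86400, name => 'example' }, undef, undef );
}

# Stub object class standing in for Date::ICal.
package My::Date;
sub new   { my ( $class, %p ) = @_; return bless( +{%p}, $class ) }
sub epoch { return $_[0]->{epoch} }

package My::Handler::AddObjects;
our @ISA = ('My::Handler');

sub request_args {
    my $self = shift;
    my ( $args, $r, $cgi_object ) = $self->SUPER::request_args(@_);

    # Add an object alongside the raw argument; the original stays intact.
    $args->{date} = My::Date->new( epoch => $args->{epoch} )
        if exists $args->{epoch};

    return ( $args, $r, $cgi_object );
}

package main;

my ($args) = My::Handler::AddObjects->new->request_args;
my $date = $args->{date};
print ref($date), ' ', $date->epoch, "\n";    # prints My::Date 86400
```

Note that the subclass returns all three values from the superclass, since the caller expects the same list it would get from the original method.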
Using this class is a little different from what we’ve seen previously. In our httpd.conf file, we’d have something like this:
    <Location /mason>
      SetHandler perl-script
      PerlHandler HTML::Mason::ApacheHandler::AddObjects
    </Location>
Similarly, in our handler.pl file,
we’d simply create this object
instead of an
HTML::Mason::ApacheHandler
object.
More Reader Exercises
Consider the following possibilities for Mason subclasses:
A resolver subclass that allows you to have two files per component. One file could be primarily HTML and the other would be code. The resolver subclass would simply concatenate the two together.
A lexer subclass enabling ASP-style syntax.
An ApacheHandler subclass that munges incoming strings into Unicode, using the Encode module from CPAN.
A CGIHandler subclass that performs the same argument-to-object transformation seen in our example ApacheHandler subclass.
[25] The equivalent of <% %>
in the sane
world where people don’t use XML for
everything!
[26] See Section 2.10 of the W3C XML 1.0 Recommendation for the definition of “ignorable whitespace.”
[27] This is left as an exercise... Actually, this was left to one of the authors. Dave Rolsky recently created HTML::Mason::Request::WithApacheSession, which is a highly configurable module that expands on the example shown in this section. This module is available from a CPAN mirror near you.