Search the Catalog
Perl for System Administration

Perl for System Administration

Managing multi-platform environments with Perl

By David N. Blank-Edelman
1st Edition July 2000
1-56592-609-9, Order Number: 6099
400 pages, $34.95

Chapter 9
Log Files

In this chapter:
Text Logs
Binary Log Files
Stateful and Stateless Data
Disk Space Problems
Log Analysis
Module Information for This Chapter
References for More Information

If this weren't a book on system administration, an entire chapter on log files would seem peculiar. But system administrators have a very special relationship with log files. Like Doctor Doolittle, who could talk to the animals, system administrators are expected to be able to communicate with a large menagerie of software and hardware. Much of this communication takes place through log files, so we become log file linguists. Perl can be a big help in this process.

It is impossible to touch on all of the different kinds of processing and analysis you can do with logs. Entire books have been devoted to just statistical analysis of this sort of data. However, this chapter should give you some general approaches to the topic and Perl tools to whet your appetite for more.

Text Logs

Logs come in different flavors, so we need several approaches for dealing with them. The most common type of log file is one composed entirely of lines of text. Popular server packages like Apache (web), INN (Usenet news), and Sendmail (email) spew log text in voluminous quantities. Most logs on Unix machines look similar because they are created by a centralized logging facility known as syslog. For our purposes, we can treat files created by syslog like any other text file.

Here's a simple Perl program to scan for the word "error" in a text-based log file:

open(LOG,"logfile") or die "Unable to open logfile:$!\n";
	print if /\berror\b/i;

Perl-savvy readers are probably itching to turn it into a one-liner. For those folks:

perl -ne 'print if /\berror\b/i' logfile

Binary Log Files

Sometimes it's not that easy writing programs to deal with log files. Instead of nice, easily parseable text lines, some logging mechanisms produce nasty, gnarly binary files with proprietary formats that can't be parsed with a single line of Perl. Luckily, Perl isn't afraid of these miscreants. Let's look at a few approaches we can take when dealing with these files. We're going to look at two different examples of binary logs: Unix's wtmp file and NT/2000's event logs.

Back in Chapter 3, User Accounts, we touched briefly on the notion of logging in and logging out of a Unix host. Login and logout activity is tracked in a file called wtmp on most Unix variants. It is common to check this file whenever there is a question about a user's connection habits (e.g., from what hosts does this person usually log in?).

On NT/2000, the event logs play a more generalized role. They are used as a central clearinghouse for logging practically all activity that takes place on these machines including login and logout activity, OS messages, security events, etc. Their role is analogous to the Unix syslog service we mentioned earlier.

Using unpack( )

Perl has a function called unpack( ) especially designed to parse binary and structured data. Let's take a look at how we might use it to deal with the wtmp files. The format of wtmp differs from Unix variant to Unix variant. For this specific example we'll look at the wtmp files found on SunOS 4.1.4 and Digital Unix 4.0 because they are pretty simple. Here's a plain text translation of the first three records in a SunOS 4.1.4 wtmp file:

0000000    ~  \0  \0  \0  \0  \0  \0  \0   r   e   b   o   o   t  \0  \0
0000020   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000040    ,   /   ;   4   c   o   n   s   o   l   e  \0   r   o   o   t
0000060   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000100   \0  \0  \0  \0   ,   /   ; 203   c   o   n   s   o   l   e  \0
0000120   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000140   \0  \0  \0  \0  \0  \0  \0  \0   ,   /   < 230

Unless you are already familiar with the structure of this file, that "ASCII dump" (as it is called) of the data looks like line noise or some other kind of semi-random garbage. So how do we become acquainted with this file's structure?

The easiest way to understand the format of this file is to look at the source code for programs that read and write to it. If you are not literate in the C language, this may seem like a daunting task. Luckily, we don't actually have to understand or even look at most of the source code; we can just examine the portion that defines the file format.

All of the operating system programs that read and write to the wtmp file get their file definition from a single, short C include file, which is likely to be found at /usr/include/utmp.h. The part of the file we need to look at begins with a definition of the C data structure that will be used to hold the information. If you search for struct utmp { you'll find the portion we need. The next lines after struct utmp { define each of the fields in this structure. These lines should be each be commented using the /* text */ C comment convention.

Just to give you an idea of how different two versions of wtmp can be, compare the relevant excerpts from utmp.h on these two operating systems.

SunOS 4.1.4:

struct utmp {
        char    ut_line[8];   /* tty name */
        char    ut_name[8];   /* user id */
        char    ut_host[16];  /* host name, if remote */
        long    ut_time;      /* time on */

Digital Unix 4.0:

struct utmp {
        char    ut_user[32];  /* User login name */
        char    ut_id[14];    /* /etc/inittab id- IDENT_LEN in init */
        char    ut_line[32];  /* device name (console, lnxx) */
        short   ut_type;      /* type of entry */
        pid_t   ut_pid;       /* process id */
        struct exit_status {
            short e_termination; /* Process termination status */
            short e_exit;        /* Process exit status */
        } ut_exit;               /* The exit status of a process
                               * marked as DEAD_PROCESS.
        time_t  ut_time;         /* time entry was made */
        char    ut_host[64];     /* host name same as MAXHOSTNAMELEN */

These files provide all of the clues we need to compose the necessary unpack( ) statement. unpack( ) takes a data format template as its first argument. It uses this template to determine how to disassemble the (usually) binary data it receives in its second argument. unpack( ) will take apart the data as instructed, returning a list where each element of the list corresponds to an element of your template.

Let's construct our template piece by piece based on the C structure from the SunOS utmp.h include file above. There are many possible template letters we can use. I've translated the ones we'll use below, but you should check the pack( ) section of the perlfunc manual page for more information. Constructing these templates is not always straightforward; C compilers occasionally pad values out to satisfy alignment constraints. The command pstruct that ships with Perl can often help with quirks likes these.

We do not have any of these complications with our data format. Table 9-1 gives the utmp.h breakdown.

Table 9-1: Translating the utmp.h C Code to an unpack( ) Template

C Code

unpack( ) Template

Template Letter/Repeat # Translation

char ut_line[8];


ASCII string (space padded), 8 bytes long

char ut_name[8];


ASCII string (space padded), 8 bytes long

char ut_host[16];


ASCII string (space padded), 16 bytes long

long ut_time;


A signed "long" value (may not be the same as the size of a "long" value on a particular machine)

Having constructed our template, let's use it in a real piece of code:

# this is the template we're going to feed to unpack(  )
$template = "A8 A8 A16 l";
# this uses pack(  ) to help us determine the size (in bytes)
# of a single record
$recordsize = length(pack($template,(  )));
# open the file
open(WTMP,"/var/adm/wtmp") or die "Unable to open wtmp:$!\n";
# read it in one record at a time
while (read(WTMP,$record,$recordsize)) {
     # unpack it, using our template
    # handle the records with a null character specially 
    # (see below)
    if ($name and substr($name,0,1) ne "\0"){
        print "$tty:$name:$host:" ,
               scalar localtime($time),"\n"; 
    else {
        print "$tty:(logout):(logout):",
               scalar localtime($time),"\n"; 
# close the file

Here's the output of this little program:

~:reboot::Mon Nov 17 15:24:30 1997
:0:dnb::0:Mon Nov 17 15:35:08 1997
ttyp8:user:host.mcs.anl.go:Mon Nov 17 18:09:49 1997 Nov 17 19:03:44 1997
ttyp6:(logout):(logout):Mon Nov 17 19:26:26 1997
ttyp1:dnb:traal-22.ccs.neu:Mon Nov 17 23:47:18 1997
ttyp1:(logout):(logout):Tue Nov 18 00:39:51 1997

Here are two small comments on the code:

Calling an OS (or Someone Else's) Binary

Groveling through wtmp files on systems is such a common task that Unix systems ship a command for printing a human readable dump of the binary file called last. Here's some sample output showing approximately the same data as the output of our previous example:

dnb       ttyp6    traal-22.ccs.neu Mon Nov 17 23:47 - 00:39  (00:52)
dnb       ttyp1    traal-22.ccs.neu Mon Nov 17 23:47 - 00:39  (00:52)
dnb       ttyp6 Mon Nov 17 19:03 - 19:26  (00:22)
user      ttyp8    host.mcs.anl.go Mon Nov 17 18:09 - crash (27+11:50)
dnb       :0       :0               Mon Nov 17 15:35 - 17:35 (4+02:00)
reboot    ~                         Mon Nov 17 15:24 

We can easily call binaries like last from Perl. This code will show all of the unique user names found in our current wtmp file:

# location of the last command binary
$lastexec = "/usr/ucb/last";
open(LAST,"$lastexec|") or die "Unable to run $lastexec:$!\n";
    $user = (split)[0];
    print "$user","\n" unless exists $seen{$user};
close(LAST) or die "Unable to properly close pipe:$!\n";

So why use this method when unpack( ) looks like it can serve all of your needs? Portability. As we've shown, the format of the wtmp file differs from Unix variant to Unix variant. On top of this, a vendor can change the format of wtmp between OS releases, rendering your perfectly good unpack( ) template invalid.

However, one thing you can reasonably depend on is the continued presence of a last command that will read this format, independent of any underlying format changes. If you use the unpack( ) method, you have to create and maintain separate template strings for each different wtmp format you plan to parse.[1]

The biggest disadvantage of using this method over unpack( ) is the increased sophistication of the field parsing you need to do in the program. With unpack( ), all of the fields are automatically extracted from the data for you. Using our last example, you may find yourself with split( ) or regular-expression-resistant output like this, all in the same output:

user   console                       Wed Oct 14 20:35 - 20:37  (00:01)
user   pts/12   Wed Oct 14 09:19 - 18:12  (08:53)
user   pts/17   Tue Oct 13 13:36 - 17:09  (03:33)
reboot system boot                   Tue Oct  6 14:13 

Your eye has little trouble picking out the columns, but any program that parses this output will have to deal with the missing information in lines one and four. unpack( ) can still be used to tease apart this output because it has fixed field widths, but that's not always possible.

Using the OS's Logging API

For this approach, let's switch our focus to Windows NT/2000's Event Log Service. As we mentioned before, it unfortunately does not log to plain text files. The best, and only, supported way to get to the data is through a set of special API calls. Most users rely on the Event Viewer program, shown in Figure 9-1, to retrieve this data for them.

Figure 9-1. The NT4 Event Viewer


Luckily, there is a Perl module written by Jesse Dougherty (updated by Martin Pauley and Bret Giddings) that allows easy access to the Event Log API calls.[2] Here's a simple program that just dumps a listing of events in the System event log in a syslog-like format. We'll walk through a more complex version of this program later in this chapter.

use Win32::EventLog;
# each event has a type, this is a translation of the common types
%type = (1  => "ERROR",
         2  => "WARNING",
         4  => "INFORMATION",
         8  => "AUDIT_SUCCESS",
         16 => "AUDIT_FAILURE");
# if this is set, we also retrieve the full text of every 
# message on each Read(  )
$Win32::EventLog::GetMessageText = 1; 
# open the System event log
$log = new Win32::EventLog("System") 
  or die "Unable to open system log:$^E\n";
# read through it one record at a time, starting with the first entry
    print scalar localtime($entry->{TimeGenerated})." ";
    print $entry->{Computer}."[".($entry->{EventID} &
          0xffff)."] ";
    print $entry->{Source}.":".$type{$entry->{EventType}};
    print $entry->{Message};

Command-line utilities like last that dump event logs into plain ASCII format also exist for NT/2000. We'll see one of these utilities in action later in this chapter.

Stateful and Stateless Data

In addition to the format in which the log file presents its data, it is important to think about the contents of these files because what the data represents and how it is represented both contribute to our plan of attack when programming. With log file contents, often a distinction can be made between data that is stateful and data that is stateless. Let's see a couple of examples that will make this distinction clear.

Here's a three-line snippet from an Apache web server log. Each line represents a request answered by the web server: - - [13/Dec/1998:00:04:20 -0500] "GET home/u1/tux/tuxedo05.gif
HTTP/1.0" 200 18666 - - [13/Dec/1998:00:04:21 -0500] "GET home/u2/news.htm 
HTTP/1.0" 200 6748 - - [13/Dec/1998:00:04:22 -0500] "GET home/u1/bgc.jpg HTTP/1.1" 304 -

Here are a few lines from a printer daemon log file:

Aug 14 12:58:46 warhol  printer: cover/door open 
Aug 14 12:58:58 warhol  printer: error cleared 
Aug 14 17:16:26 warhol  printer: offline or intervention needed 
Aug 14 17:16:43 warhol  printer: error cleared 
Aug 15 20:40:45 warhol  printer: paper out 
Aug 15 20:40:48 warhol  printer: error cleared

In both cases, each line of the log file is independent of every other line in the file. We can find patterns or aggregate lines together gathering statistics, but there's nothing inherent in the data that connects the log file entries to each other.

Now consider some slightly doctored entries from a sendmail mail log:

Dec 13 05:28:27 mailhub sendmail[26690]: FAA26690: from=<>, size=643, class=0, pri=30643,
nrcpts=1, msgid=<>, proto=ESMTP,
  Dec 13 05:29:13 mailhub sendmail[26695]: FAA26695: from=<>, size=9600, class=0, pri=39600,
nrcpts=1, msgid=<>, proto=ESMTP, [129.10.116.
  Dec 13 05:29:15 mailhub sendmail[26691]: FAA26690: to=<>, delay=00:00:02, xdelay=00:00:01,
mailer=local, stat=Sent
  Dec 13 05:29:19 mailhub sendmail[26696]: FAA26695: to="|IFS=' '&&exec /usr/bin/procmail -f-||exit 75 #user", ctladdr=user
(6603/104), delay=00:00:06, xdelay=00:00:06, mailer=prog, stat=Sent

Unlike the previous examples, there is a definite connection between the lines in the file. Figure 9-2 makes that connection explicit.

Figure 9-2. Related entries in the sendmail log


Each line has at least one partner entry that shows the source and destinations of each message. When a message enters the system it is assigned a unique "Message-ID," highlighted above, which identifies that message while it is in play. This Message-ID allows us to associate related lines in an interleaved log file, essentially giving a message an existence or "state" in between entries of a log file.

Sometimes we care about the "distance" between state transitions. Take, for instance, the wtmp file we saw earlier in this chapter. Not only are we interested in when someone logs in and when they log out (the two state transitions in the log), but in the time between these two events, i.e., how long they were logged in.

Finally, the most sophisticated log files can add another twist. Here are some excerpts from a POP (Post Office Protocol) server's log file while the server is in debug mode. The names and IP addresses have been changed to protect the innocent:

Jan 14 15:53:45 mailhub popper[20243]: Debugging turned on
Jan 14 15:53:45 mailhub popper[20243]: (v2.53) Servicing request from "client" at 129.X.X.X
Jan 14 15:53:45 mailhub popper[20243]: +OK QPOP (version 2.53) at mailhub starting.  
Jan 14 15:53:45 mailhub popper[20243]: Received: "USER username"
Jan 14 15:53:45 mailhub popper[20243]: +OK Password required for username.
Jan 14 15:53:45 mailhub popper[20243]: Received: "pass xxxxxxxxx"
Jan 14 15:53:45 mailhub popper[20243]: +OK username has 1 message (26627 octets).
Jan 14 15:53:46 mailhub popper[20243]: Received: "LIST"
Jan 14 15:53:46 mailhub popper[20243]: +OK 1 messages (26627 octets)
Jan 14 15:53:46 mailhub popper[20243]: Received: "RETR 1"
Jan 14 15:53:46 mailhub popper[20243]: +OK 26627 octets
<message text appears here>
Jan 14 15:53:56 mailhub popper[20243]: Received: "DELE 1"
Jan 14 15:53:56 mailhub popper[20243]: Deleting message 1 at offset 0 of length 26627
Jan 14 15:53:56 mailhub popper[20243]: +OK Message 1 has been deleted.
Jan 14 15:53:56 mailhub popper[20243]: Received: "QUIT"
Jan 14 15:53:56 mailhub popper[20243]: +OK Pop server at mailhub signing off.
Jan 14 15:53:56 mailhub popper[20243]: (v2.53) Ending request from "user" at (client) 129.X.X.X

Not only do we encounter connections ("Servicing request from...") and disconnections ("Ending request from..."), but we have information detailing what took place in between these state transitions.

Each of these middle events also provides potentially useful "distance" information. If there was a problem with our POP server, we might look to see how long each step in the above output took.

In the case of an FTP server, you may be able to draw some conclusions from this data about how people interact with your site. On average, how long do people stay connected before they transfer files? Do they pause between commands for a long time? Do they always travel from one part of your site to another before downloading the same file? The interstitial data can be a rich source of information.

Disk Space Problems

The downside to having programs that can provide useful or verbose logging output is the amount of disk space this output can consume. This is a concern for all three operating systems covered in this book: Unix, MacOS, and Windows NT/2000. Of the three, NT/2000 is probably the least troublesome of the lot because its central logging facility has built-in autotrimming support. MacOS does not have a central logging facility, but it too can run servers that will happily produce enough logging output to fill your disks if given the chance.

Usually, the task of keeping the log files down to a reasonable size is handed off to the system administrator. Most Unix vendors provide some sort of automated log size management mechanism with the OS, but it often only handles the select set of log files shipped with the machine. As soon as you add another service to a machine that creates a separate log file, it becomes necessary to tweak (or even toss) the vendor-supplied solution.

Log Rotation

The usual solution to the space problem is to rotate your log files. (We'll see an unusual solution later in this section.) After a specific duration has passed or a file size has been reached, the current log file is moved to another name, e.g., logfile.0. The logging process is then continued into an empty file. At the next interval or limit, we repeat the process, first moving the backup file (logfile.0 ) to another name (like logfile.1). This process is repeated until a set number backup files have been created. The oldest backup file at that point is deleted. Figure 9-3 shows a picture of this process.

Figure 9-3. A pictorial representation of log rotation


This method allows us to keep a reasonable, finite amount of log data around. Table 9-2 provides one recipe for log rotation and the Perl functions needed to perform each step.

Table 9-2: A Recipe for Log Rotation in Perl



Move the older backup logs out of the way (i.e., move each one to a new name in the sequence)

rename( ) or &File::Copy::move( ) if moving files cross-filesystem.

If necessary, signal the process creating this particular log file to close the current file and cease logging to disk until told otherwise

kill( ) for programs that take signals, system( ) or `` (backticks) if another administrative program has to be called for this purpose

Copy or move the log file that was just in use to another file

&File::Copy to copy, rename( ) to rename (or &File::Copy::move( ) if moving files cross-filesystem)

If necessary, truncate the current log file

truncate( ) or open(FILE,">filename")

If necessary, signal the logging process to resume logging

See step 2 of this table

If desired, compress or post-process the copied file

system( ) or backticks to run a compression program or other code for post-processing

Delete other, older log file copies

stat( ) to examine file sizes and dates, unlink( ) to delete files

There are many variations on this theme. Everyone and their aunt's vendor have written their own script to do log rotation. It should come as no surprise that there is a Perl module to handle log rotation. Let's take a quick look at Logfile::Rotate by Paul Gampe.

Logfile::Rotate uses the object-oriented programming convention of first creating a new log file object instance and then running a method of that instance. First, we create a new instance with the parameters found in Table 9-3.

Table 9-3: Logfile::Rotate Parameters




Name of log file to rotate

Count (optional, default: 7)

Number of backup files to keep around

Gzip (optional, default: Perl's default as found during the Perl build)

Full path to gzip compression program executable


Code to be executed after the rotation has been completed, as in step 5 of Table 9-2

Here's some example code that uses these parameters:

use Logfile::Rotate;
$logfile = new Logfile::Rotate(
                File   => "/var/adm/log/syslog",
                Count  => 5,
                Gzip   => "/usr/local/bin/gzip",
                Signal => 
                   sub {
			          open PID, "/etc/" or
                           die "Unable to open pid file:$!\n";
                        chomp($pid = <PID>);
                        close PID;
                        # should check validity first
                        kill -HUP $pid; 

This locks the log file you've specified and prepares the module for rotating it. Once you've created this object, actually rotating the log is trivial:

$logfile->rotate(  );
undef $logfile;

The undef line is there to be sure that the log file is unlocked after rotation (it stays locked while the logfile object is still in existence).

As mentioned in the documentation, if this module is run by a privileged user (like root) there are a few concerns. First, Logfile::Rotate makes a system call to run the gzip program, potentially a security hole. Second, the Signal subroutine must be coded in a defensive manner. In the previous example, we don't check to see that the process ID retrieved from /etc/ is actually the correct PID for syslog. It would be better to use one of the process table listing strategies we discussed in Chapter 4, User Activity, before sending the signal via kill( ). See Chapter 1, Introduction, for more tips on coding defensively.

Circular Buffering

We've just discussed the traditional log rotation method for dealing with storage of ever-growing logs. Let me show you a more unusual approach that you can add to your toolkit.

Here's a common scenario: you're trying to debug a server daemon that provides a torrent of log output. You're only interested in a small part of the total output, perhaps just the lines the server produces after you run some sort of test with a special client. Saving all of the log output to disk as per usual would fill your disk quickly. Rotating the logs as often as needed with this volume of output would slow down the server. What do you do?

I wrote a program called bigbuffy to deal with this conundrum. The approach is pretty straightforward. bigbuffy reads from its usual "standard" or "console" input one line at a time. These lines are stored in a circular buffer of a set size. When the buffer is full, it starts filling from the top again. This read-store process continues forever until bigbuffy receives a signal from the user. Upon receiving this signal, it dumps the current contents of the buffer to a file and returns to its normal cycle. What's left behind on disk is essentially a window into the log stream, showing just the data you need.

bigbuffy can be paired with a service-monitoring program like those found in Chapter 5, TCP/IP Name Services. As soon as the monitor detects a problem, it can signal bigbuffy to dump its log buffer. Now you've got a snapshot of the log localized to the failure instance (assuming your buffer is large enough and your monitor noticed the problem in time).

Here's a simplified version of bigbuffy. The code is longer than the examples we've seen so far in this chapter, but it is not very complex. We'll use it in a moment as a springboard for addressing some important issues like input blocking and security:

$buffsize = 200; # default circular buffer size (in lines)
use Getopt::Long;
# parse the options
GetOptions("buffsize=i" => \$buffsize,
           "dumpfile=s" => \$dumpfile);
# set up the signal handler and initialize a counter
# and away we go! (with just a simple 
# read line--store line loop)
while (<>){
    # insert line into data structure 
    # note, we do this first, even if we've caught a signal. 
    # Better to dump an extra line than lose a line of data if 
    # something goes wrong in the dumping process
    $buffer[$whatline] = $_;
    # where should the next line go? 
	($what_line %= $buff_size)++;
    # if we receive a signal, dump the current buffer
    if ($dumpnow) {
	   &dodump(  );
sub setup {
    die "USAGE: $0 [--buffsize=<lines>] --dumpfile=<filename>"
      unless (length($dumpfile));
    $SIG{'USR1'} = \&dumpnow; # set a signal handler for dump
    $whatline = 1; # start line in circular buffer
# simple signal handler that just sets an exception flag, 
# see perlipc(1)
sub dumpnow {
    $dumpnow = 1;
# dump the circular buffer out to a file, appending to file if
# it exists
sub dodump{
    my($line);     # counter for line dump
    my($exists);   # flag, does the output file exist already?
    my(@firststat,@secondstat); # to hold output of lstats
    $dumpnow = 0;  # reset the flag and signal handler
    $SIG{'USR1'} = \&dumpnow;
    if (-e $dumpfile and (! -f $dumpfile or -l $dumpfile)) {
       warn "ALERT: dumpfile exists and is not a plain file, 
             skipping dump.\n";
       return undef;
    # we have to take special precautions when we're doing an 
    # append. The next set of "if" statements perform a set of 
    # security checks while opening the file for append
    if (-e $dumpfile) {
	    $exists = 1;
	    unless(@firststat = lstat $dumpfile){
	        warn "Unable to lstat $dumpfile, 
                  skipping dump.\n";
	        return undef;
	    if ($firststat[3] != 1) {
	        warn "$dumpfile is a hard link, skipping dump.\n";
	        return undef;
    unless (open(DUMPFILE,">>$dumpfile")){
	    warn "Unable to open $dumpfile for append, 
             skipping dump.\n";
	    return undef;
    if ($exists) {
	    unless (@secondstat = lstat DUMPFILE){
	        warn "Unable to lstat opened $dumpfile, 
                  skipping dump.\n";
	        return undef;
	    if ($firststat[0] != $secondstat[0] or # check dev num
	        $firststat[1] != $secondstat[1] or # check inode
	        $firststat[7] != $secondstat[7])   # check sizes
	       warn "SECURITY PROBLEM: lstats don't match, 
                skipping dump.\n";
	       return undef;
    $line = $whatline;
    print DUMPFILE "-".scalar(localtime).("-"x50)."\n";
    do {
        # in case buffer was not full
	    last unless (defined $buffer[$line]);
	    print DUMPFILE $buffer[$line];
	    $line = ($line == $buffsize) ? 1 : $line+1;
    } while ($line != $whatline);
    # zorch the active buffer to avoid leftovers 
    # in future dumps
    $whatline = 1;
    $buffer = (  );
    return 1;

A program like this can stir up a few interesting implementation issues.

Input blocking in log processing programs

I mentioned earlier that this is a simplified version of bigbuffy. For ease of implementation, especially cross-platform, this version has an unsavory characteristic: while dumping data to disk, it can't continue reading input. During a buffer dump, the program sending output to bigbuffy may be told by the OS to pause operation pending the drain of its output buffer. Luckily, the dump is fast, so the window where this could happen is very small, but this is still less passive than we'd like.

Two possible solutions to this problem include:

Both approaches are hard to pull off portably in a cross-platform environment, hence the simplified version shown in this book.

Security in log processing programs

You may have noticed that bigbuffy troubles itself more than usual with the opening and writing of its output file. This is an example of the defensive coding style mentioned earlier in "Log Rotation." If this program is to be used to debug server daemons, it is likely to be run by privileged users on a system. It is important to think about unpleasant situations that might allow this program to be abused.

For example, take the case where the output file we've specified is maliciously swapped with a link to another file. If we naively opened and wrote to the file, we might find ourselves inadvertently stomping on an important file like /etc/passwd instead. Even if we checked the output file before opening it, a nasty person may have switched it on us before we began to write to it. To avoid this scenario:

If we didn't have to append, we could have opened a temporary file with a randomized name (so it couldn't be guessed ahead of time) and renamed the temporary file into place.

These sorts of gyrations are necessary on most Unix systems because Unix was not originally designed with security as a high priority. Symbolic-link security breaches are not a problem under NT4 since they are a little-used part of the POSIX subsystem, and MacOS, which doesn't have the notion of "privileged user."[3]

Log Analysis

Some system administrators never get past the rotation phase in their relationship with their log files. As long as the necessary information exists on disk when it is needed for debugging, they never put any thought into using their log file information for any other purpose. I'd like to suggest that this is a shortsighted view, and that a little log file analysis can go a long way. We're going to look at a few approaches you can use for performing log file analysis in Perl, starting with the most simple and getting more complex as we go along.

Most of the examples in this section use Unix log files for demonstration purposes, since the average Unix system has more log files than sample systems from either of the other two operating systems put together, but the approaches offered here are not OS-specific.

Stream Read-Count

The easiest approach is the simple "read-and-count." We read through a stream of log data, looking for interesting data, and increment a counter when we find it. Here's a simple example, which counts the number of times a machine has rebooted based on the contents of a Solaris 2.6 wtmpx file:[4]

# template for Solaris 2.6 wtmpx, see the pack(  ) doc 
# for more information
$template = "A32 A4 A32 l s s2 x2 l2 l x20 s A257 x"; 
# determine the size of a record
$recordsize = length(pack($template,(  )));
# open the file
open(WTMP,"/var/adm/wtmpx") or die "Unable to open wtmpx:$!\n";
# read through it one record at a time
while (read(WTMP,$record,$recordsize)) {
    if ($ut_line eq "system boot"){
        print "rebooted ".scalar localtime($tv_sec)."\n";
print "Total reboots: $reboots\n";

Let's extend this methodology and see an example of statistics gathering using the Windows NT Event Log facility. As mentioned before, NT has a well-developed and fairly sophisticated system-logging mechanism. This sophistication makes it a bit trickier for the beginning Perl programmer. We'll have to use some Win32-specific Perl module routines to get at the basic log information.

NT programs and operating system components log their activities by posting "events" to one of several different event logs. These events are recorded by the OS with basic information like when the event was posted, which program or OS function is posting the event, what kind of event (informational or something more serious) is being posted, etc.

Unlike Unix, the actual description of the event, or log message, is not actually stored with the event entry. Instead, an EventID is posted to the log. This EventID contains a reference to a specific message compiled into a program library (.dll ). Retrieving a log message given an EventID is tricky. The process involves looking up the proper library in the Registry and loading the library by hand. Luckily, the current version of Win32::EventLog performs this process for us automatically (see $Win32::EventLog::GetMessageText in our first Win32::Eventlog example earlier, "Using the OS's Logging API."

For our next example, we're going to generate some simple statistics on the number of entries currently in the System log, where they have come from, and their level of severity. We'll write this program in a slightly different manner than the first NT logging example in this chapter.

Our first step is to load the Win32::EventLog module that contains the glue between Perl and the Win32 event log routines. We then initialize a hash table that will be used to contain the results of our calls to the log-reading routines. Perl would normally take care of this for us, but sometimes it is good to add code like this for the benefit of others who will be reading the program. Finally, we set up a small list of event types that we will use later for printing statistics:

use Win32::EventLog;
my %event=('Length',NULL,
# partial list of event types, i.e., Type 1 is "Error",
# 2 is "Warning", etc.
@types = ("","Error","Warning","","Information");

Our next step is to open up the System event log. The Open( ) places an EventLog handle into $EventLog that we can use as our connection to this particular log:

  or die "Could not open System log:$^E\n";

Once we have this handle, we can use it to retrieve the number of events in the log and the number of the oldest record:


We use this information as part of our first Read( ) statement, which positions us to the place in the log right before the first record. This is the equivalent of seek( )ing to the beginning of a file:

$EventLog->Win32::EventLog::Read((EVENTLOG_SEEK_READ |
                                  $numevents + $oldestevent, $event);

And then from here on in, it is a simple loop to read each log entry in turn. The EVENTLOG_SEQUENTIAL_READ flag says "continue reading from the position of the last record read." The EVENTLOG_FORWARDS_READ flag moves us forward in chronological order.[5] The third argument to Read( ) is the record offset, in this case 0, because we want to pick up right where we left off. As we read each record, we record its Source and EventType in a hash table of counters.

# loop through all of the events, recording the number of 
# Source and EventTypes
for ($i=0;$i<$numevents;$i++) {
                     0, $event);
# now print out the totals
print "-->Event Log Source Totals:\n";
for (sort keys %source) {
    print "$_: $source{$_}\n";
print "-"x30,"\n";
print "-->Event Log Type Totals:\n";
for (sort keys %types) {
    print "$types[$_]: $types{$_}\n";
print "-"x30,"\n";
print "Total number of events: $numevents\n";

My results look like this:

--> Event Log Source Totals:
Application Popup: 4
DCOM: 12
Dhcp: 12
EventLog: 351
Mouclass: 6
NWCWorkstation: 2
Print: 27
Rdr: 12
RemoteAccess: 108
SNMP: 350
Serial: 175
Service Control Manager: 248
Sparrow: 5
Srv: 201
msbusmou: 162
msi8042: 3
msinport: 162
mssermou: 151
qic117: 2
--> Event Log Type Totals:
Error: 493
Warning: 714
Information: 1014
Total number of events: 2220

As promised, here's some sample code that relies on a last-like program to dump the contents of the event log. It uses a program called ElDump by Jesper Lauritsen, downloaded from ElDump is similar to DumpEl found in the NT Resource Kit:

$eldump = 'c:\bin\eldump';      # path to ElDump
# output data field separated by ~ and without full message
# text (faster)
$dumpflags = '-l system -c ~ -M'; 
open(ELDUMP,"$eldump $dumpflags|") or die "Unable to run $eldump:$!\n";
print STDERR "Reading system log.";
    ($date,$time,$source,$type,$category,$event,$user,$computer) = 
    print STDERR ".";
print STDERR "done.\n";
# for each type of event, print out the sources and number of 
# events per source
foreach $type (qw(Error Warning Information 
                  AuditSuccess AuditFailure)){
    print "-" x 65,"\n";
    print uc($type)."s by source:\n";
    for (sort keys %$type){
        print "$_ ($$type{$_})\n";
print "-" x 65,"\n";

Here's a snippet from the output:

ERRORs by source:
Cdrom (2)
DCOM (15)
Dhcp (2524)
Disk (1)
EventLog (5)
RemoteAccess (30)
Serial (24)
Service Control Manager (100)
Sparrow (2)
atapi (2)
i8042prt (4)
WARNINGs by source:
Cdrom (22)
Dhcp (76)
Print (8)
Srv (82)

A simple stream read-count variation

A simple variation of the stream read-count approach involves taking multiple passes through the data. This is sometimes necessary for large data sets and cases where it takes an initial scan through the data before you can determine the difference between interesting and non-interesting data. Programmatically, this means after the first pass through the input, either:

Here's an example where a multiple-pass read-count approach might be useful. Imagine you have to deal with a security breach where an account on your system has been compromised. One of the first questions you might want to ask is "Has any other account been compromised from the same source machine?" Finding a comprehensive answer to this seemingly simple question turns out to be trickier than you might expect. Let's take a first shot at the problem. This SunOS-specific code (see the initial template) takes the name of a user as its first argument and an optional regular expression as a second argument for filtering out hosts we wish to ignore:

$template       = "A8 A8 A16 l"; # for SunOS 4.1.x
$recordsize     = length(pack($template,(  )));
($user,$ignore) = @ARGV;
print "-- scanning for first host contacts from $user --\n";
open(WTMP,"/var/adm/wtmp") or die "Unable to open wtmp:$!\n";
while (read(WTMP,$record,$recordsize)) {
    if ($user eq $name){
       next if (defined $ignore and $host =~ /$ignore/o);
	    if (length($host) > 2 and !exists $contacts{$host}){
	        $connect = localtime($time);
print "-- scanning for other contacts from those hosts --\n";
die "Unable to seek to beginning of wtmp:$!\n"
  unless (seek(WTMP,0,0));
while (read(WTMP,$record,$recordsize)) {
    # if it is not a logout, and we're looking for this host,
    # and this is a connection from a user *other* than the 
    # compromised account, then record
    if (substr($name,1,1) ne "\0" and	
	     exists $contacts{$host} and 
	     $name ne $user){
	         $connect = localtime($time);
# here's the output format, may need to be adjusted based on template
format STDOUT = 
@<<<<<<<<    @<<<<<<<<<<<<<<  @<<<<<<<<<<<<<<<<<<

First the program scans through a wtmp file looking for all logins from the compromised user. As it finds them, it compiles a hash of all of the hosts from which these logins took place. It then rolls back to the beginning of the file and scans for connections from that host list, printing matches as it finds them. It would be easy to modify this program to scan all of the files in a directory of rotated wtmp log files.

One problem with this program is that it is too specific. It will only match exact hostnames. If an intruder is coming in from a dialup modem bank of an ISP (which they often are), chances are the hostnames will change with each dial-up connection. Still, partial solutions like this often help a great deal.

Besides its simplicity, the stream read-count approach we've been discussing has the advantage of being faster and less memory-intensive than other method. It works best with the stateless type of log files we discussed early on in the chapter. But sometimes, especially when dealing with stateful data, we need to use a different plan of attack.


The opposite extreme of our previous approach, where we passed by the data as fast as possible, is to read it into memory and deal with it after reading. Let's look at a few versions of this strategy.

First, an easy example: let's say you have an FTP transfer log and you want to know which files have been transferred the most often. Here are some sample lines from a wu-ftpd FTP server transfer log:

Sun Dec 27 05:18:57 1998 1 11868 /net/ a _ o a ftp 0 *
Sun Dec 27 05:52:28 1998 25 269273 /CPAN/doc/FAQs/FAQ/PerlFAQ.html a _ o a mozilla@ ftp 0 *
Sun Dec 27 06:15:04 1998 1 11868 /CPAN/MIRRORING.FROM b _ o a
edu ftp 0 *
Sun Dec 27 06:15:05 1998 1 35993 /CPAN/RECENT.html b _ o a ftp 0 *

Here's the list of fields for each line of the previous output (please see the wu-ftpd server man page xferlog(5) for details on each field).

Field #

Field Name




transfer-time (in seconds)























Here's some code to show which files have been transferred most often:

$xferlog = "/var/adm/log/xferlog";
open(XFERLOG,$xferlog) or die "Unable to open $xferlog:$!\n";
while (<XFERLOG>){
for (sort {$files{$b} <=> $files{$a}||$a cmp $b} keys %files){
    print "$_:$files{$_}\n";

We read each line of the file, using the name of the file as a hash key and incrementing the value for that key. The name of the file is extracted from each log line using an array index that references a specific element of the list returned by the split( ) function:


You may notice that the specific element we reference (8) is different from the 8th field in the xferlog field listing above. This is an unfortunate result of the lack of field delimiters in the original file. We are splitting on whitespace (the default for split( )), so the date field becomes five separate list items.

One subtle trick in this code sample is in the anonymous sort function we use to sort the values:

for (sort {$files{$b} <=> $files{$a}||$a cmp $b} keys %files){

Note that the places of $a and $b have been switched from their alphabetical order in the first portion. This causes sort to return the items in descending order, thus showing us the more frequently transferred files first. The second portion of the anonymous sort function (||$a cmp $b) assures that we list files with the same number of transfers in a sorted order.

If we want to limit this script to counting only certain files or directories, we could let the user specify a regular expression as the first argument to this script. For example, adding:

next unless /$ARGV[0]/o;

to the while( ) loop allows you to specify a regular expression to limit which files will be counted.

Regular Expressions

Regular expression crafting is often one of the most important parts of log parsing. Regexps are used like programmatic sieves to extract the interesting data from the non-interesting data in your logs. The regular expressions used in this chapter are very basic, but you'll probably be creating more sophisticated regexps for your own use. You may wish to use the subroutine or compiled regular expression techniques introduced in the previous chapter to use them even more efficiently.

Any time you spend learning how to wield regexp power will benefit you in many ways. One of the best for learning about regular expressions is Jeffrey Friedl's book, Mastering Regular Expressions (O'Reilly).

Let's take a look at another example of the read-remember-process approach using our "breach-finder" program from the previous section. Our earlier code only showed us successful logins from the intruder sites. We have no way of knowing about unsuccessful attempts. For that information, we're going to have to bring in another log file.


This problem exposes one of Unix's flaws: Unix systems tend to store log information in a number of different places and formats. Few tools are provided for dealing with these disparities (luckily we have Perl). It is not uncommon to need more than one data source to solve problems like these.

The log file that will be the most help to us in this endeavor is the one generated through syslog by Wietse Venema's Unix security tool tcpwrappers. tcpwrappers provides gatekeeper programs and libraries that can be used to control access to network services. An individual network service like telnet can be configured so that a tcpwrappers program first handles all network connections to this service. After a connection is made, the tcpwrappers program will syslog the connection attempt and then either pass the connection off to the real service or take some action (like dropping the connection). The choice of whether to let the connection through is based on some simple user-provided rules (e.g., allow only certain originating hosts). tcpwrappers can also take preliminary precautions to make sure the connection is coming from the place it purports to come from using a DNS reverse-lookup. It can also be configured to log the name of the user who made the connection (via the RFC931 ident protocol) if possible. For a more detailed description of tcpwrappers, see Simson Garfinkel and Gene Spafford's book Practical Unix & Internet Security (O'Reilly).

For our purposes, we can just add some code to our previous breach-finder program that scans the tcpwrappers log (tcpdlog in this case) for connections from the suspect hosts we found in our scan of wtmp. If we add the following code to the end of our previous code sample:

# tcpd log file location
$tcpdlog        = "/var/log/tcpd/tcpdlog"; 
$hostlen        = 16;  # max length of hostname in wtmp file
print "-- scanning tcpdlog --\n";
open(TCPDLOG,$tcpdlog) or die "Unable to read $tcpdlog:$!\n";
    next if !/connect from /; # we only care about connections
    ($connecto,$connectfrom) = /(.+):\s+connect from\s+(.+)/;
    $connectfrom =~ s/^.+@//;
    # tcpwrappers can log the entire hostname, not just the first N 
    # characters like some wtmp logs. As a result, we need to truncate 
    # the hostname at the same place as the wtmp file if we want to 
    # perform a hash lookup below
    $connectfrom = substr($connectfrom,0,$hostlen);
    print if (exists $contacts{$connectfrom} and 
              $connectfrom !~ /$ignore/o);

we get output that looks like this:

-- scanning for first host contacts from user --
user       host.ccs.neu  Fri Apr  3 13:41:47
-- scanning for other contacts from those hosts --
user2      host.ccs.neu  Thu Oct  9 17:06:49
user2      host.ccs.neu  Thu Oct  9 17:44:31
user2      host.ccs.neu  Fri Oct 10 22:00:41
user2      host.ccs.neu  Wed Oct 15 07:32:50
user2      host.ccs.neu  Wed Oct 22 16:24:12
-- scanning tcpdlog --
Jan 12 13:16:29 host2 in.rshd[866]: connect from
Jan 13 14:38:54 host3 in.rlogind[4761]: connect from
Jan 15 14:30:17 host4 in.ftpd[18799]: connect from
Jan 16 19:48:19 host5 in.ftpd[5131]: connect from

You may have noticed that the output above found connections from two different time ranges. We found connections in wtmp from April 3 to October 22, while the tcpwrappers data appeared to show only January connections. The difference in dates is an indication that our wtmp files and our tcpwrappers files are rotated at different speeds. You need to be aware of these details when writing code that tacitly assumes the two log files being correlated refer to the same time period.

For a final and more sophisticated example of the read-remember-process approach, let's look at a task that requires combining stateful and stateless data. If you wanted a more comprehensive picture of the activity on a wu-ftpd server, you might want to use code to correlate the login and logout activity logged in a machine's wtmp file with the file transfer information recorded by wu-ftpd in its xferlog file. It might be nice if you could see output that showed when an FTP session started and finished, and what transfers took place during that session.

Here's a snippet of sample output from the code we're about to assemble. It shows four FTP sessions in March. The first session shows one file being transferred to the machine. The next two show files being transferred from that machine, and the last shows a connection without any transfers:

Thu Mar 12 18:14:30 1998-Thu Mar 12 18:14:38 1998 pitpc.ccs.neu.ed
        -> /home/dnb/makemod
Sat Mar 14 23:28:08 1998-Sat Mar 14 23:28:56 1998 traal-22.ccs.neu
        <- /home/dnb/.emacs19
Sat Mar 14 23:14:05 1998-Sat Mar 14 23:34:28 1998 traal-22.ccs.neu
        <- /home/dnb/lib/emacs19/cperl-mode.el
        <- /home/dnb/lib/emacs19/filladapt.el
Wed Mar 25 21:21:15 1998-Wed Mar 25 21:36:15 1998 traal-22.ccs.neu
        (no transfers in xferlog)

Producing this output turns out to be non-trivial, since we need to pigeonhole stateless data into a stateful log. The xferlog transfer log shows only the time and the host that initiated the transfer. The wtmp log shows the connection and disconnections from other hosts to the server. Let's walk through how to combine the two types of data using a read-remember-process approach. We'll define some variables for the program, and then call the subroutines to perform each task:

# for date->Unix time (secs from Epoch) conversion
use Time::Local; 
$xferlog = "/var/log/xferlog"; # location of transfer log
$wtmp = "/var/adm/wtmp";       # location of wtmp
$template = "A8 A8 A16 l";     # SunOS 4.1.4 template for wtmp 
$recordsize = length(pack($template,(  ))); # size of each wtmp entry
$hostlen = 16;   # max length of the hostname in wtmp
# month name to number mapping
%month = qw{Jan 0 Feb 1 Mar 2 Apr 3 May 4 Jun 5 Jul 6 
            Aug 7 Sep 8 Oct 9 Nov 10 Dec 11};  
&ScanXferlog;    # scan the transfer log
&ScanWtmp;       # scan the wtmp log
&ShowTransfers;  # correlate and print transfers

Now let's look at the procedure that reads the wu-ftpd xferlog log file:

# scans a wu-ftpd transfer log and populates the %transfers 
# data structure
sub ScanXferlog {
    print STDERR "Scanning $xferlog...";
    open(XFERLOG,$xferlog) or 
       die "Unable to open $xferlog:$!\n";
    while (<XFERLOG>){
       # use an array slice to select the fields we want
	   ($mon,$mday,$time,$year,$rhost,$fname,$direction) = 
       # add the direction of transfer to the filename, 
       # i is "transferred in"
	   $fname = ($direction eq 'i' ? "-> " : "<- ") . $fname;
       # convert the transfer time to Unix epoch format
	   ($hours,$min,$sec) = split(':',$time);
	   $unixdate =  
	   # put the data into a hash of lists of lists:
    print STDERR "done.\n";

The push( ) line of Perl in the previous code probably deserves a little explanation. This line creates a hash of lists of lists that looks something like this:

$transfers{hostname} = 
  ([time1, filename1], [time2, filename2],[time3, filename3]...)

The %transfers hash is keyed on the name of the host that initiated the transfer. We truncate that name to the largest string size our wtmp can hold as we create each hash entry.

For each host, we store a list of transfer pairs, each pair recording when a file was transferred and the name of that file. We're choosing to store the time in "seconds since the epoch" for ease of comparison later.[6] The subroutine timelocal( ) from the module Time::Local helps us convert to that standard. Because we're scanning a file transfer log written in chronological order, these lists of pairs are built in chronological order as well, a property that will come in handy later.

Let's move on to scanning wtmp :

# scans the wtmp file and populates the @sessions structure 
# with ftp sessions
sub ScanWtmp {
    print STDERR "Scanning $wtmp...\n";
    open(WTMP,$wtmp) or die "Unable to open $wtmp:$!\n";
    while (read(WTMP,$record,$recordsize)) {
        # don't even bother to unpack if record does not begin
        # with ftp. NOTE: this creates a wtmp format dependency 
        # as a trade-off for speed
	    next if (substr($record,0,3) ne "ftp");
        # if we find an open connection record, then 
        # create a hash of list of lists. The LoL will be used
        # as a stack below.
	    if ($name and substr($name,0,1) ne "\0"){
       # if we find a close connection record, we try to pair
       # it with a previous open connection record we recorded 
       # before
	    else {
               unless (exists $connections{$tty}){
               warn "found lone logout on $tty:" . 
                     scalar localtime($time)."\n";
           # we'll use the previous open connect and this
           # close connect to record this as a single session. 
           # To do that we create a list of lists where each 
           # list is (hostname, login, logout) 
                [@{shift @{$connections{$tty}}},$time]);
	        # if no more connections on the stack for that
            # tty, remove from hash
	        delete $connections{$tty} 
              unless (@{$connections{$tty}});
    print STDERR "done.\n";

Let's look at what's going in this code. We read through wtmp one record at a time. If that record begins with ftp, we know that this is an FTP session. As the comment says, the line of code that makes that decision has an implicit assumption about the format of a wtmp record. If tty was not the first field in the record, this test would not work. However, being able to test whether a line is of interest without having to unpack( ) is worth it.

Once we've found a line that begins with ftp, we take it apart to determine if it describes opening or closing of an FTP session. If it is an opening of a connection, we record that in %connections, a data structure that keeps tabs on all the open sessions. Like %transfers in our previous subroutine, it is a hash of list of lists, this time keyed on the tty (i.e., terminal) of each connection. Each of the values in this hash is a set of pairs detailing the connection hostname and time.

Why use such a complicated data structure to keep track of the open connections? Unfortunately, there isn't a simple "open-close open-close open-close" pairing of lines in wtmp. For instance, take a look at these lines from wtmp (as printed by our first wtmp program earlier in this chapter):

ftpd1833:dnb:ganges.ccs.neu.e:Fri Mar 27 14:04:47 1998
ttyp7:(logout):(logout):Fri Mar 27 14:05:11 1998
ftpd1833:dnb:hotdiggitydog-he:Fri Mar 27 14:05:20 1998
ftpd1833:(logout):(logout):Fri Mar 27 14:06:20 1998
ftpd1833:(logout):(logout):Fri Mar 27 14:06:43 1998

Notice the two open FTP connection records on the same tty (lines 1 and 3). If we just stored a single connection per tty in a plain hash, we'd lose the first connection record when we found the second one.

Instead, we use the list of lists keyed off every tty in %connections as a stack. When we see a connection opening, we add a (host, login-time) pair for the connection to the stack kept for that tty. Each time we see a close connection line for this tty, we "pop" one of the open connection records off the stack and store our complete information about the session as a whole in another data structure. That's the purpose of this line of code:

push(@sessions,[@{shift @{$connections{$tty}}},$time]);

Let's untangle this line from the inside out to make sure everything is clear. The part in bold type returns a reference to the stack/list of open connection pairs for a specific tty:

push(@sessions,[@{shift @{$connections{$tty}}},$time]);

This pops the reference to the first connection pair off that stack:

push(@sessions,[@{shift @{$connections{$tty}}},$time]);

We dereference it to get at the actual (host, login-time) connection pair list. If we place this pair at the beginning of another list that ends with the connection time, Perl will interpolate the connection pair and we'll have a single, three-element list. This gives us a triad of (host, login-time, logout-time):

push(@sessions,[@{shift @{$connections{$tty}}},$time]);

Now that we have all of the parts (initiating host, connection start, and end) of an FTP session in a single list, we can push a reference to that list on to the @sessions lists of lists for future use:

push(@sessions,[@{shift @{$connections{$tty}}},$time]);

We have a list of sessions thanks to this one, very busy line of code.

To finish the job in our &ScanWtmp subroutine we check if the stack is empty for a tty, i.e., there are no more open connection requests pending. If this is the case we can delete that tty's entry from the hash, since we know the connection has ended:

delete $connections{$tty} unless (@{$connections{$tty}});

Time to do the actual correlation between our two different data sets. This task falls to our &ShowTransfers subroutine. For each session, it prints out the connection triad, and then the files transferred during this session.

# iterate over the session log, pairing sessions
# with transfers
sub ShowTransfers {
    foreach $session (@sessions){
	    # print session times
	    print scalar localtime($$session[1]) . "-" .
	          scalar localtime($$session[2]) . 
              " $$session[0]\n";
       # find all files transferred in this connection triad 
       # and print them
	    print &FindFiles(@{$session}),"\n";

Here's the hard part, deciding whether a particular login session had any transfers:

# returns all of the files transferred for a given connect 
# session triad
sub FindFiles{
    my($rhost,$login,$logout) = @_;
    # easy case, no transfers in this login
    unless (exists $transfers{$rhost}){
	    return "\t(no transfers in xferlog)\n";
    # easy case, first transfer we have on record is 
    # after this login
    if ($transfers{$rhost}->[0]->[0] > $logout){
	    return "\t(no transfers in xferlog)\n";
    # find any files transferred in this session
    foreach $transfer (@{$transfers{$rhost}}){
        # if transfer happened before login 
 	    next if ($$transfer[0] < $login);  
        # if transfer happened after logout
	    last if ($$transfer[0] > $logout); 
        # if we've already used this entry
	    next unless (defined $$transfer[1]);  
	    undef $$transfer[1];
    ($#found > -1 ? @found : "\t(no transfers in xferlog)\n")

We can eliminate the easy cases first. If we've never seen transfers initiated by this host, or the first transfer associated with this host occurs after the session triad we are checking has ended, we know no files have been transferred during that session.

If we can't eliminate the easy cases, we need to look through our lists of transfers. We check if each transfer made from the host in question occurred after the session started, but before the session ended. We skip to the next transfer if either of these conditions isn't true. We also avoid testing the other transfers for the host as soon as we've found a transfer that takes place after the session has ended. Remember we mentioned that all of the transfers are added to the data structure in chronological order? Here's where it pays off.

The last test we make before considering a transfer entry to be valid may look a little peculiar:

# if we've already used this entry
next unless (defined $$transfer[1]);  

If two anonymous FTP sessions from the same host overlap in time, we have no way of knowing which session is responsible for initiating a transfer of that file. There is simply no information from either of our logs that can help us make that determination. The best we can do in this case is make up a standard and keep to it. The standard here is "attribute the transfer to the first session possible." This test line above, and the subsequent undefing of the filename value as a flag, enforces that standard.

If this final test passes, we declare victory and add the filename to the list of files transferred in that session. The session and its accompanying file transfers are printed.

Read-remember-process programs that have to do this sort of correlation can get fairly sophisticated, especially when they are bringing together data sources where the correlation is a bit fuzzy. So in good Perl spirit, let's see if we can take an easier approach.

Black Boxes

In the Perl world, if you are trying to write something generally useful, another person may have beat you to it and published their code for the task. This gives you an opportunity to simply feed your data into their module in a prescribed way and receive results without having to know how the task was performed. This is often known as a "black box approach."

One example is the SyslogScan package by Rolf Harold Nelson. Earlier in this chapter we noted that parsing a mail log file from sendmail can be tricky because the lines are stateful. Each line often has one or more sibling lines interspersed with other lines later in the log. The SyslogScan package offers an easy way to refer to each mail delivery without having to manually scan the file and pair up all of the relevant lines. It can filter for certain addresses and keep track of some rudimentary statistics for the messages it has seen.

SyslogScan is object-oriented, so the first step is to load the module and create a new object instance:

use SyslogScan::DeliveryIterator;
# a list of mail syslog files
$maillogs = ["/var/log/mail/maillog"]; 
$iterator = new SyslogScan::DeliveryIterator(syslogList => $maillogs);

The new method of SyslogScan::DeliveryIterator returns an iterator, essentially a pointer into the file that shuttles forward message delivery by message delivery. By using an iterator, we are spared from the actual work of scanning ahead in the file looking for all of the lines related to a particular message. If we call the next( ) method of that iterator, it will hand us back a delivery object. This object encapsulates the information about that delivery previously spread over several lines in the log. For example, this code:

while ($delivery = $iterator -> next(  )){
    print $delivery->{Sender}." -> ".

prints out information like: -> -> ->

It gets even snazzier. If we feed a SyslogScan iterator object to the new method of the SyslogScan::Summary module, new will take all of the output from that iterator's next method and return a summary object. This summary object contains usage summaries for all of the delivery objects that iterator could possible return.

But the SyslogScan package takes this functionality to still another level. If we now feed a summary object to the new method of SyslogScan::ByGroup, we get a bygroup object that has grouped all of the summaries into domains and compiled stats for those groups. Here's the magic we just described in action:

use SyslogScan::DeliveryIterator;
use SyslogScan::Summary;
use SyslogScan::ByGroup;
use SyslogScan::Usage;
# the location of our maillog
$maillogs = ["/var/log/mail/maillog"]; 
# get an iterator for this file
$iterator = new SyslogScan::DeliveryIterator(syslogList => $maillogs);
# feed this iterator to ::Summary, receive a summary object
$summary = new SyslogScan::Summary($iterator);
# feed this summary object to ::ByGroup and receive a 
# stats-by-group object
$bygroup = new SyslogScan::ByGroup($summary);
# print the contents of this object
foreach $group (sort keys %$bygroup){
                      {groupUsage}->getBroadcastVolume(  )};
                      {groupUsage}->getSendVolume(  )};
                      {groupUsage}->getReceiveVolume(  )};
                      {groupUsage}->getReceiveVolume(  )};
format STDOUT_TOP =
Name                         Bmesg  BByytes  Smesg  SBytes   Rmesg  Rbytes
---------------------------  -----  -------- ------ -------- ------ -------
format STDOUT =
@<<<<<<<<<<<<<<<<<  @>>>>> @>>>>>>> @>>>>> @>>>>>>> @>>>>> @>>>>>>>

The result is a report detailing the number and bytes of broadcast, sent, and received messages. Here's an excerpt from some sample output:

Name                 Bmesg  BByytes  Smesg  SBytes   Rmesg  Rbytes
-------------------  -----  -------- ------ -------- ------ -------          1     3420      1     3420      0        0             1      984      1      984      4     7812                  3    10830      3    10830      1      787          1     1245      1     1245      0        0                0        0      0        0      1     2040

The plus side of the black box approach is that you can often get a great deal done, thanks to the hard work of the module or script author, with very little code of your own. The minus side to using the black box approach is the trust you have to place in another author's code. It may have subtle bugs or use an approach that does not scale for your needs. It is best to look over the code before you drop it into production in your site.

Using Databases

The last approach we'll discuss requires the most knowledge outside of the Perl domain to implement. As a result, we'll only take a very simple look at a technique that over time will probably become more prevalent.

The previous examples we've seen work fine on reasonably-sized data sets when run on machines with a reasonable amount of memory, but they don't scale. For situations where you have lots of data, especially if the data comes from different sources, databases are the natural tool.

There are at least two ways to make use of databases from Perl. The first is one I'll call a "Perl-only" method. With this method, all of the database activity takes place in Perl, or libraries tightly coupled to Perl. The second way uses Perl modules like the DBI family to make Perl a client of another database like MySQL, Oracle, or MS-SQL. Let's use both of these approaches for log processing and analysis.

Using Perl-only databases

As long as the data set is not too large, we can probably stick to a Perl-only solution. We'll extend our ubiquitous breach-finder for an example. So far our code just dealt with connections on a single machine. If we wanted to find out about logins from intruders on any of our machines, how would we do it?

Our first step is to drop all of the wtmp data for our machines into a database of some sort. For the purpose of this example, assume that all of the machines in question have direct access to some shared directory via some network file system like NFS. Before we proceed, we need to choose a database format.

My "Perl database format" of choice is the Berkeley DB format. I use quotes around "Perl database format" because, while the support for DB is shipped with the Perl sources, the actually DB libraries must be procured from another source ( and installed before the Perl support can be built. Table 9-4 provides a comparison between the different supported database formats.

Table 9-4: Comparison of the Supported Perl Database Formats


Unix Support

NT/2000 Support

Mac Support

Key or Value Size Limits

Byte-Order Independent

"old" dbm






"new" dbm










1K (default)



Yes [7]

Yes [8]





Yes a

Yes a




I like the Berkeley-DB format because it can handle larger data sets and is byte-order independent. The byte-order independence is particularly important for the Perl code we're about to see, since we'll want to read and write to the same file from different machines which may have different architectures.

We'll start off by populating the database. For the sake of simplicity and portability, we're calling the last program to avoid having to unpack( ) several different wtmp files ourselves. Here's the code, with an explanation to follow:

use DB_File;
use FreezeThaw qw(freeze thaw);
use Sys::Hostname; # to get the current hostname
use Fcntl;         # for the definition of O_CREAT and O_RDWR
# find the executable for the last program
(-x "/bin/last" and $lastex = "/bin/last") or 
  (-x "/usr/ucb/last" and $lastex = "/usr/ucb/last");
$userdb    = "userdata";     # user database file
$connectdb = "connectdata";  # connection database file
$thishost  = &hostname;
open(LAST,"$lastex|") or
    die "Can't run the program $lastex:$!\n";
# read each line of the output from "last"
while (<LAST>){
    next if /^reboot\s/ or /^shutdown\s/ or 
           /^ftp\s/     or /^wtmp\s/; 
    ($user,$tty,$host,$day,$mon,$date,$time) = split;  
    next if $tty =~ /^:0/ or $tty =~ /^console$/;
    next if (length($host) < 4);
    $when = $mon." ".$date." ".$time;
	# save each record in a hash of list of lists
# tie to a database file, creating it (for Read & Write); if 
# it does not exist see the footnote in the text re: $DB_BTREE
tie %userdb, "DB_File",$userdb,O_CREAT|O_RDWR, 0600, $DB_BTREE
  or die "Unable to open $userdb database for r/w:$!\n";
# iterate through the users and store the info in our 
# database using freeze
foreach $user (keys %users){
    if (exists $userdb{$user}){
   	    ($userinfo) = thaw($userdb{$user});
	    $userdb{$user}=freeze $userinfo;
    else {
	    $userdb{$user}=freeze $users{$user};
untie %userdb;
# do the same for the connections
tie %connectdb, "DB_File",$connectdb,O_CREAT|O_RDWR, 
                0600, $DB_BTREE
  or die "Unable to open $connectdb database for r/w:$!\n";
foreach $connect (keys %connects){
    if (exists $connectdb{$connect}){
	    ($connectinfo) = thaw($connectdb{$connect});
    else {
untie %connectdb;

Our code takes the output from the last program and does the following:

  1. Filters out the lines that are not useful.
  2. Squirrels away the output in two hash of list of lists data structures that look like this:
  3. $users{username} = 
        [[current host, connecting host, connect time],
         [current host, connecting host, connect time]
    $connects{host} = 
        [[current host, username1, connect time],
         [current host, username2, connect time],

  4. Takes this data structure in memory and attempts to merge it into a database.

This last step is the most interesting, so let's explore it more carefully. We tie the hashes %userdb and %connectdb to database files.[9] This magic allows us to access those hashes transparently, while Perl handles storing and retrieving data to the database files behind the scenes. But hashes only store simple strings. How do we get our "hashes of list of lists" into a single hash value for storage?

Ilya Zakharevich's FreezeThaw module is used to store our complex data structure in a single scalar that can be used as a hash value. FreezeThaw can take an arbitrary Perl data structure and encode it as a string. There are other modules like this, Data::Dumper by Gurusamy Sarathy (shipped with Perl) and Storable by Raphael Manfredi being the most prevalent. FreezeThaw offers the most compact representation of a complex data structure, hence its use here. Each of these modules has its strong points, so be sure to investigate all three if you have a task like ours.

In our program we check if an entry for this user or host exists. If it doesn't, we simply "freeze" the data structure into a string and store that string in the database using our tied hash. If it does exist, we "thaw" the existing data structure found in the database back into memory, add our data, then re-freeze and re-store it.

If we run this code on several machines, we'll have a database with some potentially useful information to feed to the next version of our breach-finder program.


An excellent time to populate a database like this is just after a log rotation of a wtmp file has taken place.

The database population code presented here is too bare-sbones for production use. One glaring deficiency is the lack of a mechanism to prevent multiple instances of the program from updating the database at the same time. Given that file locking over NFS is known to be dicey at best, it might be easier to call code like this from a larger program that serializes the process of collecting information from each machine in turn.

Now that we have a database full of data, let's walk through our new improved breach-finder program that uses this information:

use DB_File;
use FreezeThaw qw(freeze thaw);
use Fcntl;
# accept the username and hosts to ignore as command-line arguments
($user,$ignore) = @ARGV;
# database files we'll be using
$userdb    ="userdata";
$connectdb ="connectdata";
tie %userdb, "DB_File",$userdb,O_RDONLY,666,$DB_BTREE
  or die "Unable to open $userdb database for reading:$!\n";
tie %connectdb, "DB_File",$connectdb,O_RDONLY,666,$DB_BTREE
  or die "Unable to open $connectdb database 
          for reading:$!\n";

We've loaded the modules we need, taken our input, set a few variables, and tied them to our database files. Now it's time to do some work:

# we can exit if we've never seen a connect from this user
unless (exists $userdb{$user}){
    print "No logins from that user.\n";
    untie %userdb;
    untie %connectdb;
($userinfo) = thaw($userdb{$user});
print "-- first host contacts from $user --\n";
foreach $contact (@{$userinfo}){
    next if (defined $ignore and $contact->[1] =~ /$ignore/o);
    print $contact->[1] . " -> " . $contact->[0] . 
          " on ".$contact->[2]."\n";

This code says: if we've seen this user at all, we reconstitute that user's contact records in memory using thaw( ). For each contact, we test to see if we've been asked to ignore the host it came from. If not, we print a line for that contact and record the originating host in the %otherhosts hash.

We use a hash here as a simple way of collecting the unique list of hosts from all of the contact records. Now that we have the list of hosts the intruder may have connected from, we need to find out all of the other users who have connected from these potentially compromising hosts.

Finding this information will be easy because when we recorded which users logged in to which machines, we also recorded the inverse (i.e., which machines were logged into by which users) in another database file. We now look at all of the records from the hosts we identified in the previous step. If we are not told to ignore a host, and we have connection records for it, we capture a unique list of users who have logged into that host using the %userseen hash:

print "-- other connects from source machines  --\n";
foreach $host (keys %otherhosts){
    next if (defined $ignore and $host =~ /$ignore/o);
    next unless (exists $connectdb{$host});
	($connectinfo) = thaw($connectdb{$host});
	foreach $connect (@{$connectinfo}){
	    next if (defined $ignore and $connect->[0] =~ /$ignore/o);

The final act of this three-step drama has a nice circular flair. We return to our original user database to find all of the connections made by suspect users from suspect machines:

foreach $user (sort keys %userseen){
    next unless (exists $userdb{$user});
    ($userinfo) = thaw($userdb{$user});
    foreach $contact (@{$userinfo}){
        next if (defined $ignore and $contact->[1] =~ /$ignore/o);
        write if (exists $otherhosts{$contact->[1]});

All that's left is to sweep up the theater and go home:

untie %userdb;
untie %connectdb;
format STDOUT =
@<<<<<<<< @<<<<<<<<<<<<<<< -> @<<<<<<<<<<<<<<< on @<<<<<<<<<<<

Here's some example output from the program (again, with the user and hostnames changed to protect the innocent):

-- first host contacts from baduser --
badhost1.exampl -> on Jan 18 09:55
badhost2.exampl -> on Jan 19 11:53
-- other connects from source machines  --
baduser2:  badhost1.exampl -> machine2.ccs.neu.e on Dec 15 13:26
baduser2:  badhost2.exampl -> machine2.ccs.neu.e on Dec 11 12:45
baduser3:  badhost1.exampl -> machine1.ccs.neu.ed on Jul 13 16:20
baduser4:  badhost1.exampl -> machine1.ccs.neu.ed on Jun 9 11:53
baduser:   badhost1.exampl -> machine1.ccs.neu.ed on Jan 18 09:55
baduser:   badhost2.exampl -> machine2.ccs.neu.e on Jan 19 11:53

This is a lovely example program, but it doesn't really scale past a small cluster of machines. For every subsequent run of the program, it may have to read a record from the database, thaw( ) it back into memory, add some new data to the record, freeze( ) it again, and store it back in the database. This can be CPU-time-and memory-intensive. The whole process potentially happens once per user and machine connection, so things slow down very quickly.

Using Perl-cliented SQL databases

Let's look at one way you might deal with very large datasets. You may need to load your data into a more sophisticated SQL database (commercial or otherwise) and query the information you need from it using SQL. If you're not familiar with SQL, I recommend you take a quick peek at Appendix D, The Fifteen-Minute SQL Tutorial, before looking at this section.

Populating the database could look like this:

use DBI;
use Sys::Hostname;
$db = "dnb"; # the name of the database we're using
# locate the last executable
(-x "/bin/last" and $lastex = "/bin/last") or 
  (-x "/usr/ucb/last" and $lastex = "/usr/ucb/last");
# connect to a Sybase database using user "dnb" and a password 
# provided on the command line
$dbh = DBI->connect('dbi:Sybase:','dnb',$ARGV[0]);
die "Unable to connect: $DBI::errstr\n" 
  unless (defined $dbh);
# change to the database we'll be using
$dbh->do("use $db") or
  die "Unable to change to $db: ".$dbh->errstr."\n";
# create the lastinfo table if it doesn't exist
unless ($dbh->selectrow_array(
          q{SELECT name from sysobjects WHERE name="lastinfo"})){
          $dbh->do(q{create table lastinfo (username char(8),
                                            localhost char(40),
                                            otherhost varchar(75),
                                            when char(18))}) or
      die "Unable to create lastinfo: ".$dbh->errstr."\n";
$thishost  = &hostname;
$sth = $dbh->prepare(
  qq{INSERT INTO lastinfo(username,localhost,otherhost,when) 
   VALUES (?,'$thishost', ?, ?)}) or
  die "Unable to prepare insert: ".$dbh->errstr."\n";
open(LAST,"$lastex|") or die "Can't run the program $lastex:$!\n";
while (<LAST>){
    next if /^reboot\s/ or /^shutdown\s/ or 
            /^ftp\s/    or /^wtmp\s/; 
    ($user,$tty,$host,$day,$mon,$date,$time) = split;  
    next if $tty =~ /^:0/ or $tty =~ /^console$/;
    next if (length($host) < 4);
    $when = $mon." ".$date." ".$time;

This creates a table called lastinfo with username, localhost, otherhost, and when columns. We iterate over the output of last, inserting non-bogus entries into this table.

Now we can use databases to do what they do so well. Here is a set of sample SQL queries that could easily be wrapped in Perl using the DBI or ODBC interfaces we explored in Chapter 7, SQL Database Administration:

-- how many entries in the database?
select count (*) from lastinfo;
-- how many users have logged in?
select count (distinct username) from lastinfo;
-- how many separate hosts have connected to our machines?
select count (distinct otherhost) from lastinfo;
-- which local hosts has the user "dnb" logged into?
select distinct localhost from lastinfo where username = "dnb";

These examples should give you a taste of the sort of "data mining" you can do once all of the data is in a real database. Each of those queries took only a second or so to run. Databases can be fast, powerful tools for system administration.

The subject of log analysis is a vast one. Hopefully this chapter has given you a few tools and a little inspiration.

Module Information for This Chapter




Win32::EventLog (ships with ActivePerl)






Getopt::Long (ships with Perl)



Time::Local (ships with Perl)






DB_File (ships with Perl)






Sys::Hostname (ships with Perl)



Fcntl (ships with Perl)






References for More Information

1. There's a bit of handwaving here, since you still have to track where the last executable is found in each Unix environment and compensate for any differences in the format of each program's output.

2. Log information in Windows 2000 can also be retrieved using the Window Management Instrumentation (WMI) framework we touched on in Chapter 4, User Activity. Win32::EventLog is easier to use and understand.

3. In fairness, both NT and MacOS have their own unique security weaknesses. Also, there is considerable time and effort these days being spent to "harden" various Unix distributions (e.g., OpenBSD).

4. wtmpx is a file that uses an extended form of the u/wtmp format. It was designed to record events without some of the field length constraints of the classic format (e.g., 16-character remote hostnames). Under Solaris, each login/logout is logged to both wtmp and wtpmx.

5. Here's another place where the Win32 event log routines are more flexible than usual. Our code could have moved to the to the end of the log and read backwards in time if we wanted to do that for some reason.

6. This is seconds since some arbitrary starting point. For example, the epoch on most Unix machines is 00:00:00 GMT on January 1, 1970.

7. Actual database libraries may have to be downloaded separately.

8. Database library and Perl module must be downloaded from the net (

9. You don't usually have to use the BTREE form of storage when using DB_File, but this program can store some very long values. Those values caused the Version 1.85 DB_HASH storage method to croak in testing (causing corrupted data), while the BTREE storage method seemed to handle the pounding. Later version of the DB libraries may not have this bug.

Back to: Perl for System Administration

O'Reilly Home | O'Reilly Bookstores | How to Order | O'Reilly Contacts
International | About O'Reilly | Affiliated Companies

© 2001, O'Reilly & Associates, Inc.