CGI Programming on the World Wide Web
By Shishir Gundavaram
1st Edition March 1996
This book is out of print, but it has been made available online through the O'Reilly Open Books Project.
4.4 Decoding Forms in Other Languages
Since Perl contains powerful pattern-matching operators and string manipulation functions, it is very simple to decode form information. Unfortunately, this process is not as easy when dealing with other high-level languages, as most of them lack these kinds of operators. However, there are various libraries of functions on the Internet that make the decoding process easier, as well as the uncgi program (http://www.hyperion.com/~koreth/uncgi.html).
C Shell (csh)
It is difficult to decode form information using native C shell commands. csh was not designed to perform this type of string manipulation. As a result, you have to use external programs to achieve the task. The easiest and most versatile package available for handling form queries is uncgi, which decodes the form information and stores it in environment variables that can be accessed not only by csh, but also by any other language, such as Perl, Tcl, and C/C++. For example, if the form contains two text fields, named "user" and "age," uncgi will place the form data in the variables WWW_user and WWW_age, respectively. Here is a simple form and a csh CGI script to handle the information:
    <HTML>
    <HEAD><TITLE>Simple C Shell and uncgi Example</TITLE></HEAD>
    <BODY>
    <H1>Simple C Shell and uncgi Example</H1>
    <HR>
    <FORM ACTION="/cgi-bin/uncgi/simple.csh" METHOD="POST">
    Enter name: <INPUT TYPE="text" NAME="name" SIZE=40><BR>
    Age: <INPUT TYPE="text" NAME="age" SIZE=3 MAXLENGTH=3><BR>
    What do you like:<BR>
    <SELECT NAME="drink" MULTIPLE>
    <OPTION>Coffee
    <OPTION>Tea
    <OPTION>Soft Drink
    <OPTION>Alcohol
    <OPTION>Milk
    <OPTION>Water
    </SELECT>
    <P>
    <INPUT TYPE="submit" VALUE="Submit the form">
    <INPUT TYPE="reset" VALUE="Clear all fields">
    </FORM>
    <HR>
    </BODY>
    </HTML>

Notice the URL associated with the ACTION attribute! It points to the uncgi executable, with extra path information (your program name). The server executes uncgi, which then invokes your program based on the path information. Remember, your program does not necessarily have to be a csh script; it can be a program written in any language. Now, let's look at the program.
    #!/usr/local/bin/csh
    echo "Content-type: text/plain"
    echo ""

The usual header information is printed out.
    if ($?WWW_name) then
        echo "Hi $WWW_name -- Nice to meet you."
    else
        echo "Don't want to tell me your name, huh?"
        echo "I know you are calling in from $REMOTE_HOST."
        echo ""
    endif

uncgi takes the information in the "name" text entry field and places it in the environment variable WWW_name.
In csh, environment variables are accessed by prefixing a "$" to the name (e.g., $REMOTE_HOST). When checking for the existence of variables, however, you must use the C shell's $? construct. I use $? in the conditional to check for the existence of WWW_name. You cannot check for the existence of data directly:
    if ($WWW_name) then
        ....
    else
        ....
    endif

If the user did not enter any data into the "name" text entry field, uncgi will not set a corresponding environment variable. If you then try to check for data using the method shown above, the C shell will give you an error indicating the variable does not exist.
The same procedure is applied to the "age" text entry field.
    if ($?WWW_age) then
        echo "You are $WWW_age years old."
    else
        echo "Are you shy about your age?"
    endif
    echo ""
    if ($?WWW_drink) then
        echo "You like: $WWW_drink" | tr '#' ' '
    else
        echo "I guess you don't like any fluids."
    endif
    exit(0)

Here is another important point to remember. Since the form contains a scrolled list with the multiple selection property, uncgi will place all the selected values in the variable, separated by the "#" symbol. The UNIX command tr converts each "#" character to a space for viewing purposes.
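As a rough illustration of the WWW_ naming convention and the presence check used in the script above, the following C sketch builds the variable name for a form field and looks it up in the environment. The helper names uncgi_var_name and uncgi_field are hypothetical (they are not part of uncgi), and the 256-byte buffer size is an arbitrary assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build the environment variable name for a form field: "WWW_" + field.
 * Returns 0 on success, or -1 if the result does not fit in buf.
 * (Hypothetical helper for illustration; not part of uncgi.) */
int uncgi_var_name(const char *field, char *buf, size_t size)
{
    int written = snprintf(buf, size, "WWW_%s", field);

    return (written < 0 || (size_t)written >= size) ? -1 : 0;
}

/* Return the value uncgi stored for a form field, or NULL when the
 * variable is not set (the C analogue of a false $? test in csh). */
const char *uncgi_field(const char *field)
{
    char var_name[256];   /* assumed to be large enough for field names */

    if (uncgi_var_name(field, var_name, sizeof var_name) != 0) {
        return NULL;
    }
    return getenv(var_name);
}
```

For example, uncgi_field("age") returns NULL when WWW_age is not set, which plays the same role as the $?WWW_age test in the csh script.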
C/C++
There are a few form decoding function libraries for C and C++. These include the previously mentioned uncgi library, and Enterprise Integration Technologies Corporation's (EIT) libcgi. Both of them are simple to use.
C/C++ decoding using uncgi
Let's look at an example using uncgi (assuming the HTML form is the same as the one used in the previous example):
    #include <stdio.h>
    #include <stdlib.h>

These two header files, for standard I/O and the standard library, are used in the following program. The getenv function, used to access environment variables, is declared in stdlib.h.
    int main (void)
    {
        char *name, *age, *drink, *remote_host;

        printf ("Content-type: text/plain\n\n");
        uncgi();

Four variables are declared to store environment variable data. The uncgi function retrieves the form information and stores it in environment variables. For example, a form variable called name would be stored in the environment variable WWW_name.
        name = getenv ("WWW_name");
        age = getenv ("WWW_age");
        drink = getenv ("WWW_drink");
        remote_host = getenv ("REMOTE_HOST");

The getenv standard library function reads the environment variables, and returns a string containing the appropriate information.
        if (name == NULL) {
            printf ("Don't want to tell me your name, huh?\n");
            printf ("I know you are calling in from %s.\n\n", remote_host);
        } else {
            printf ("Hi %s -- Nice to meet you.\n", name);
        }
        if (age == NULL) {
            printf ("Are you shy about your age?\n");
        } else {
            printf ("You are %s years old.\n", age);
        }
        printf ("\n");

Depending on the user information in the form, various informational messages are output.
        if (drink == NULL) {
            printf ("I guess you don't like any fluids.\n");
        } else {
            printf ("You like: ");
            while (*drink != '\0') {
                if (*drink == '#') {
                    printf (" ");
                } else {
                    printf ("%c", *drink);
                }
                ++drink;
            }
            printf ("\n");
        }
        exit(0);
    }

The program checks each character in order to convert the "#" symbols to spaces. If the character is a "#" symbol, a space is output. Otherwise, the character itself is displayed. This process takes up eight lines of code, and is difficult to implement when compared to Perl. In Perl, it can be done simply like this:
    $drink =~ s/#/ /g;

This example points out one of the major deficiencies of C for CGI program design: pattern matching.
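The character-by-character conversion described above can also be wrapped in a small reusable helper, though it remains more code than the Perl one-liner. The sketch below is one possible way to do it; replace_char is a hypothetical name, and the function modifies the string in place, so it assumes the caller passes a writable copy rather than the string returned directly by getenv.

```c
#include <stddef.h>

/* Replace every occurrence of `from` in the string s with `to`, in place.
 * Returns the number of characters that were replaced.
 * (Hypothetical helper for illustration.) */
size_t replace_char(char *s, char from, char to)
{
    size_t count = 0;

    for (; *s != '\0'; ++s) {
        if (*s == from) {
            *s = to;
            ++count;
        }
    }
    return count;
}
```

Given a writable buffer that holds "Coffee#Tea#Milk", calling replace_char (buffer, '#', ' ') leaves "Coffee Tea Milk" in the buffer and returns 2.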
C/C++ decoding using libcgi
Now, let's look at another example in C. But this time, we will use EIT's libcgi library, which you can get from http://wsk.eit.com/wsk/dist/doc/libcgi/libcgi.html.
    #include <stdio.h>
    #include "cgi.h"

The header file cgi.h contains the prototypes for the functions in the library. Simply put, the file--like all the other header files--contains a list of all the functions and their arguments.
    cgi_main (cgi_info *cgi)
    {
        char *name, *age, *drink, *remote_host;

Notice that there is no main function in this program. The libcgi library actually contains the main function, which fills a struct called cgi_info with environment variables and data retrieved from the form. It passes this struct to your cgi_main function. In the function I've written here, the variable cgi refers to that struct:
        form_entry *form_data;

The variable type form_entry is a linked list that is meant to hold key/value pairs, and is defined in the library. In this program, form_data is declared as a pointer to a form_entry.
        print_mimeheader ("text/plain");

The print_mimeheader function is used to output a specific MIME header. Technically, this function is not any different from doing the following:
        printf ("Content-type: text/plain\n\n");

However, the function does simplify things a bit, in that the programmer does not have to worry about accidentally forgetting to output the two newline characters after the MIME header.
        form_data = get_form_entries (cgi);
        name = parmval (form_data, "name");
        age = parmval (form_data, "age");
        drink = parmval (form_data, "drink");

The get_form_entries function parses the cgi struct for form information, and places it in the variable form_data. The function takes care of decoding the hexadecimal characters in the input. The parmval function retrieves the value corresponding to each form variable (key).
        if (name == NULL) {
            printf ("Don't want to tell me your name, huh?\n");
            printf ("I know you are calling in from %s.\n\n", cgi->remote_host);
        } else {
            printf ("Hi %s -- Nice to meet you.\n", name);
        }

Notice how the REMOTE_HOST environment variable is accessed. The libcgi library places all the environment variable information into the cgi struct.
Of course, you can still use the getenv function to retrieve environment information.
        if (age == NULL) {
            printf ("Are you shy about your age?\n");
        } else {
            printf ("You are %s years old.\n", age);
        }
        printf ("\n");
        if (drink == NULL) {
            printf ("I guess you don't like any fluids.\n");
        } else {
            printf ("You like: %s", drink);
            printf ("\n");
        }
        free_form_entries (form_data);
        exit(0);
    }

Unfortunately, this library does not handle multiple keys properly. For example, if the form has multiple checkboxes with the same variable name, libcgi will return just one value for a specific key.
Once the form processing is complete, you should call the free_form_entries function to remove the linked list from memory.
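As noted earlier, get_form_entries takes care of decoding the hexadecimal characters in the input. How libcgi does this internally is not shown here; the following is only a minimal sketch of the general technique, assuming the usual form encoding in which "+" stands for a space and %XX stands for the byte with hexadecimal value XX. The names hex_value and url_decode are hypothetical and are not part of libcgi.

```c
#include <ctype.h>

/* Convert one hexadecimal digit to its value (0 to 15).
 * The caller must already have checked the digit with isxdigit(). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Decode a URL-encoded string in place: '+' becomes a space and
 * %XX becomes the byte with hexadecimal value XX. Malformed escapes
 * are copied through unchanged. Returns s.
 * (Hypothetical helper for illustration; not part of libcgi.) */
char *url_decode(char *s)
{
    char *src = s;
    char *dst = s;

    while (*src != '\0') {
        if (*src == '+') {
            *dst++ = ' ';
            src++;
        } else if (*src == '%' && isxdigit((unsigned char)src[1])
                               && isxdigit((unsigned char)src[2])) {
            *dst++ = (char)(hex_value(src[1]) * 16 + hex_value(src[2]));
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return s;
}
```

Because the decoded text is never longer than the encoded text, the conversion can safely reuse the same buffer. Note that a complete decoder would split the input into key/value pairs on "&" and "=" before decoding each piece, so that encoded separators inside a value are not misread.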
In addition to the functions discussed, libcgi offers numerous other ones to aid in form processing. One of the functions that you might find useful is the mcode function. Here is an example illustrating this function:
    switch (mcode (cgi)) {
        case MCODE_GET:
            printf("Request Method: GET\n");
            break;
        case MCODE_POST:
            printf("Request Method: POST\n");
            break;
        default:
            printf("Unrecognized method: %s\n", cgi->request_method);
    }

The mcode function reads the REQUEST_METHOD information from the cgi struct and returns a code identifying the type of request.
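If libcgi is not available, a similar classification can be approximated with the standard library alone. The sketch below is an assumption about how such a helper might look, not a description of how mcode is implemented; the enum values and the classify_method name are hypothetical and are not the MCODE_* constants from libcgi.

```c
#include <string.h>

/* Hypothetical request-method codes for this sketch. These are NOT the
 * MCODE_* constants defined by libcgi. */
enum method_code {
    METHOD_UNKNOWN = 0,
    METHOD_GET,
    METHOD_POST
};

/* Map a REQUEST_METHOD string to a code. A NULL pointer (variable not
 * set) or any unrecognized method yields METHOD_UNKNOWN. */
enum method_code classify_method(const char *method)
{
    if (method == NULL) {
        return METHOD_UNKNOWN;
    }
    if (strcmp(method, "GET") == 0) {
        return METHOD_GET;
    }
    if (strcmp(method, "POST") == 0) {
        return METHOD_POST;
    }
    return METHOD_UNKNOWN;
}
```

A CGI program could call classify_method (getenv ("REQUEST_METHOD")) and switch on the result in the same way as the mcode example.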
Tcl
Unlike C/C++, Tcl does contain semi-efficient pattern matching functions. These functions can be used to decode form information. However, according to benchmark test results posted in comp.lang.perl, the regular expression functions as implemented in Tcl are quite inefficient, especially when compared to Perl. But you are not limited to writing form decoding routines in Tcl, beca