Section 9.12. Forms Programming

If you create forms, sooner or later you'll need to create the server-side application that processes them. Don't panic. There is nothing magic about server-side programming, nor is it overly difficult. With a little practice and some perseverance, you'll be cranking out forms applications.

The most important advice we can give about forms programming is easy to remember: copy others' work. Writing a forms application from scratch is fairly hard; copying a functioning forms application and modifying it to support your form is far easier.

Fortunately, server vendors know this, and they usually supply sample forms applications with their server. Rummage about for a directory named cgi-src, and you should discover a number of useful examples you can easily copy and reuse.

We can't hope to replicate all the useful stuff that came with your server or provide a complete treatise on forms programming. What we can do is offer a simple example of both GET and POST applications, giving you a feel for the work involved and hopefully getting you moving in the right direction.

Before we begin, keep in mind that not all servers invoke these applications in the same manner. Our examples cover the broad class of servers derived from the original NCSA HTTP server. They also should work with the Netscape Communications family of server products and the open source Apache server. In all cases, consult your server documentation for complete details. You will find more detailed information in CGI Programming with Perl, by Scott Guelich, Gunther Birznieks, and Shishir Gundavaram, and Webmaster in a Nutshell, by Stephen Spainhour and Robert Eckstein, both published by O'Reilly.

One alternative to CGI programming is the Java servlet model, covered in Java Servlet Programming, by Jason Hunter with William Crawford (O'Reilly). Servlets can be used to process GET and POST form submissions, although they are actually more general objects. There are no examples of servlets in this book.

9.12.1 Returning Results

Before we begin, we need to discuss how server-side applications end. All server-side applications pass their results back to the server (and on to the user) by writing those results to the application's standard output as a MIME-encoded file. Hence, the first line of the application's output must be a MIME Content-Type descriptor. If your application returns an HTML document, the first line is:

Content-type: text/html

The second line must be completely empty. Your application can return other content types, too; just include the correct MIME type. A GIF image, for example, is preceded with:

Content-type: image/gif

Generic text that is not to be interpreted as HTML can be returned with:

Content-type: text/plain

This is often useful for returning the output of other commands that generate plain text instead of HTML.
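The header rule above can be captured in a small helper. The following is only a sketch: the function name write_response and its parameters are our own invention for illustration, not part of any server's library. It writes the Content-type line, the mandatory empty line, and then the body.

```c
#include <stdio.h>

/* Hypothetical helper (not a server-supplied routine): write a CGI
   response to out. Returns 0 on success, -1 if a write fails. */
int write_response(FILE *out, const char *mime_type, const char *body)
{
    /* The header line, followed by the required empty second line */
    if (fprintf(out, "Content-type: %s\n\n", mime_type) < 0)
        return -1;

    /* Everything after the empty line is the document itself */
    if (fputs(body, out) == EOF)
        return -1;

    return 0;
}
```

A forms application would typically call it as write_response(stdout, "text/plain", text), since the server reads the application's standard output.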

9.12.2 Handling GET Forms

The GET method is one of two ways to pass form parameters from client to server. With GET, parameters are passed as part of the URL that invokes the server-side forms application. A typical invocation of a GET-style application might use a URL like this:

http://www.kumquat.com/cgi-bin/dump_get?name=bob&phone=555-1212

When the server processes this URL, it invokes the application named dump_get stored in the directory named cgi-bin. Everything after the question mark is passed to the application as parameters.

Things diverge a bit at this point, due to the nature of the GET-style URL. While forms place name/value pairs in the URL, it is possible to invoke a GET-style application with only values in the URL. Thus, the following is a valid invocation as well, with parameters separated by plus signs (+):

http://www.kumquat.com/cgi-bin/dump_get?bob+555-1212

This is a common invocation when the application is referenced by a searchable document with the <isindex> tag. The parameters typed by the user into the document's text-entry field are passed to the server-side application as unnamed parameters separated by plus signs.

If you invoke your GET application with named parameters, your server passes those parameters to the application in one way; unnamed parameters are passed differently.
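How does a server tell the two cases apart? As far as we know, servers that follow the CGI conventions treat a query string with no literal equals sign as a list of unnamed parameters. The sketch below illustrates that test; the function name has_named_parameters is hypothetical, and your server's actual rule may differ, so check its documentation.

```c
#include <string.h>

/* Hypothetical check: returns 1 if the query string contains a
   literal '=' (name/value pairs), 0 if it does not or is missing. */
int has_named_parameters(const char *query)
{
    return query != NULL && strchr(query, '=') != NULL;
}
```

For the two URLs shown earlier, name=bob&phone=555-1212 counts as named parameters, while bob+555-1212 does not.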

9.12.2.1 Using named parameters with GET applications

Named parameters are passed to GET applications by creating an environment variable named QUERY_STRING and setting its value to the entire portion of the URL following the question mark. Using our previous example, the value of QUERY_STRING would be set to:

name=bob&phone=555-1212

Your application must retrieve this variable and extract from it the parameter name/value pairs. Fortunately, most servers come with a set of utility routines that performs this task for you, so a simple C program that just dumps the parameters might look like:

#include <stdio.h>
#include <stdlib.h>

#define MAX_ENTRIES 10000

typedef struct {
    char *name;
    char *val;
} entry;

/* Utility routines supplied with the server */
char *makeword(char *line, char stop);
char x2c(char *what);
void unescape_url(char *url);
void plustospace(char *str);

int main(int argc, char *argv[])
{
    entry entries[MAX_ENTRIES];
    int num_entries, i;
    char *query_string;

    /* Get the value of the QUERY_STRING environment variable.
       It is NULL if the variable was never set. */
    query_string = getenv("QUERY_STRING");

    /* Extract the parameters, building a table of entries */
    for (num_entries = 0;
         query_string != NULL && query_string[0] != '\0' &&
             num_entries < MAX_ENTRIES;
         num_entries++) {
        entries[num_entries].val = makeword(query_string, '&');
        plustospace(entries[num_entries].val);
        unescape_url(entries[num_entries].val);
        entries[num_entries].name =
            makeword(entries[num_entries].val, '=');
    }

    /* Spit out the HTML boilerplate */
    printf("Content-type: text/html\n");
    printf("\n");
    printf("<html>");
    printf("<head>");
    printf("<title>Named Parameter Echo</title>\n");
    printf("</head>");
    printf("<body>");
    printf("You entered the following parameters:\n");
    printf("<ul>\n");

    /* Echo the parameters back to the user */
    for (i = 0; i < num_entries; i++)
        printf("<li> %s = %s\n", entries[i].name, entries[i].val);

    /* And close out with more boilerplate */
    printf("</ul>\n");
    printf("</body>\n");
    printf("</html>\n");

    return 0;
}

The example program begins with a few declarations that define the utility routines that scan through a character string and extract the parameter names and values.[7] The body of the program obtains the value of the QUERY_STRING environment variable using the getenv( ) library function, uses the utility routines to extract the parameters from that value, and then generates a simple HTML document that echoes those values back to the user.

[7] These routines are usually supplied by the server vendor. They are not part of the standard C or Unix libraries.
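If your server does not include these routines, they are not hard to write. What follows is a minimal sketch of our own, based only on how the example program uses them; it is not the vendor's code, and a production version should handle errors more carefully. The helper hexval is a hypothetical addition.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of line up to (not including) stop,
   and shift the rest of line down so it starts after stop. */
char *makeword(char *line, char stop)
{
    size_t x = 0, y = 0;
    char *word = malloc(strlen(line) + 1);

    if (word == NULL)
        return NULL;
    while (line[x] != '\0' && line[x] != stop) {
        word[x] = line[x];
        x++;
    }
    word[x] = '\0';
    if (line[x] != '\0')
        x++;                          /* skip the stop character */
    while ((line[y++] = line[x++]) != '\0')
        ;                             /* move the remainder to the front */
    return word;
}

/* Hypothetical helper: numeric value of one hexadecimal digit */
static int hexval(char c)
{
    if (isdigit((unsigned char)c))
        return c - '0';
    return toupper((unsigned char)c) - 'A' + 10;
}

/* Convert two hexadecimal digits to the character they encode */
char x2c(char *what)
{
    return (char)(hexval(what[0]) * 16 + hexval(what[1]));
}

/* Replace each %XX escape in url with the character it encodes */
void unescape_url(char *url)
{
    size_t x, y;

    for (x = 0, y = 0; url[y] != '\0'; x++, y++) {
        if (url[y] == '%' &&
            isxdigit((unsigned char)url[y + 1]) &&
            isxdigit((unsigned char)url[y + 2])) {
            url[x] = x2c(&url[y + 1]);
            y += 2;
        } else {
            url[x] = url[y];
        }
    }
    url[x] = '\0';
}

/* Browsers encode spaces as plus signs; turn them back into spaces */
void plustospace(char *str)
{
    for (; *str != '\0'; str++)
        if (*str == '+')
            *str = ' ';
}
```

Note that makeword modifies the string passed to it, which is why the example program can keep calling it on query_string until the string is empty.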

For real applications, you should insert your actual processing code after the parameter extraction and before the HTML generation. Of course, you'll also need to change the HTML generation to match your application's functionality.

9.12.2.2 Using unnamed parameters with GET applications

Unnamed parameters get passed to the application as command-line parameters. This makes writing the server-side application almost trivial. Here is a simple shell script that dumps the parameter values back to the user:

#!/bin/csh -f
#
# Dump unnamed GET parameters back to the user

echo "Content-type: text/html"
echo
echo '<html>'
echo '<head>'
echo '<title>Unnamed Parameter Echo</title>'
echo '</head>'
echo '<body>'
echo 'You entered the following parameters:'
echo '<ul>'

foreach i ($*)
   echo '<li>' $i
end

echo '</ul>'
echo '</body>'
echo '</html>'

exit 0

Again, we follow the same general style: output a generic document header, including the MIME Content-Type, followed by the parameters and some closing boilerplate. To convert this to a real application, replace the foreach loop with commands that actually do something.
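The same idea carries over to C, where the unnamed parameters arrive in the argv array of main. Here is a sketch of the list-writing step only; the function name write_argument_list is hypothetical, and the surrounding header and boilerplate output are left out.

```c
#include <stdio.h>

/* Hypothetical helper: write each command-line parameter as an HTML
   list item to out. Returns the number of parameters written. */
int write_argument_list(FILE *out, int argc, char *argv[])
{
    int i, count = 0;

    fputs("<ul>\n", out);
    for (i = 1; i < argc; i++) {      /* argv[0] is the program name */
        fprintf(out, "<li>%s\n", argv[i]);
        count++;
    }
    fputs("</ul>\n", out);

    return count;
}
```

A complete application would call write_argument_list(stdout, argc, argv) from main after printing the Content-type header and the empty line.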

9.12.3 Handling POST Forms

Applications that use POST-style parameters expect to read encoded parameters from their standard input. Like GET-style applications with named parameters, they can take advantage of the server's utility routines to parse these parameters.

Here is a program that echoes the POST-style parameters back to the user:

#include <stdio.h>
#include <stdlib.h>

#define MAX_ENTRIES 10000

typedef struct {
    char *name;
    char *val;
} entry;

/* Utility routines supplied with the server */
char *makeword(char *line, char stop);
char *fmakeword(FILE *f, char stop, int *len);
char x2c(char *what);
void unescape_url(char *url);
void plustospace(char *str);

int main(int argc, char *argv[])
{
    entry entries[MAX_ENTRIES];
    int num_entries, i;
    int cl;                  /* bytes of form data left to read */
    char *content_length;

    /* The server reports the size of the posted data in the
       CONTENT_LENGTH environment variable */
    content_length = getenv("CONTENT_LENGTH");
    cl = (content_length != NULL) ? atoi(content_length) : 0;

    /* Parse parameters from stdin, building a table of entries */
    for (num_entries = 0;
         cl > 0 && !feof(stdin) && num_entries < MAX_ENTRIES;
         num_entries++) {
        entries[num_entries].val = fmakeword(stdin, '&', &cl);
        plustospace(entries[num_entries].val);
        unescape_url(entries[num_entries].val);
        entries[num_entries].name =
            makeword(entries[num_entries].val, '=');
    }

    /* Spit out the HTML boilerplate */
    printf("Content-type: text/html\n");
    printf("\n");
    printf("<html>");
    printf("<head>");
    printf("<title>Named Parameter Echo</title>\n");
    printf("</head>");
    printf("<body>");
    printf("You entered the following parameters:\n");
    printf("<ul>\n");

    /* Echo the parameters back to the user */
    for (i = 0; i < num_entries; i++)
        printf("<li> %s = %s\n", entries[i].name, entries[i].val);

    /* And close out with more boilerplate */
    printf("</ul>\n");
    printf("</body>\n");
    printf("</html>\n");

    return 0;
}

Again, we follow the same general form. The program starts by declaring the various utility routines needed to parse the parameters, along with a data structure to hold the parameter list. The actual code begins by reading the parameter list from the standard input and building a list of parameter names and values in the array named entries. Once this is complete, a boilerplate document header is written to the standard output, followed by the parameters and some closing boilerplate.

Like the other examples, this program is handy for checking the parameters being passed to the server application early in the forms- and application-debugging process. You can also use it as a skeleton for other applications by inserting appropriate processing code after the parameter list is built up and altering the output section to send back the appropriate results.
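The one routine the POST program needs beyond the GET utilities is fmakeword, which reads a word from a stream rather than from a string. Here is a rough sketch of how such a routine might work, assuming its third argument counts the bytes of form data still unread. This is our own illustration, not the server vendor's implementation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read characters from f into a newly allocated string until stop is
   seen, the stream ends, or *remaining bytes have been consumed.
   *remaining is decremented for every byte read, stop included. */
char *fmakeword(FILE *f, char stop, int *remaining)
{
    size_t size = 64, len = 0;
    char *word = malloc(size);
    int c;

    if (word == NULL)
        return NULL;

    while (*remaining > 0 && (c = fgetc(f)) != EOF) {
        (*remaining)--;
        if (c == stop)
            break;
        if (len + 1 >= size) {        /* keep room for the terminator */
            char *bigger = realloc(word, size * 2);
            if (bigger == NULL) {
                free(word);
                return NULL;
            }
            word = bigger;
            size *= 2;
        }
        word[len++] = (char)c;
    }
    word[len] = '\0';

    return word;
}
```

Because the count stops the loop, the routine never tries to read past the posted data, even if the stream itself stays open.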