Recipe 19.1 Writing a CGI Script

19.1.1 Problem

You want to write a CGI script to process the contents of an HTML form. In particular, you want to access the form contents and produce valid output in return.

19.1.2 Solution

A CGI script is a server-side program launched by a web server to generate dynamic content. It receives encoded information from the remote client (user's browser) via STDIN and environment variables, and it must produce a valid HTTP header and body on STDOUT. The standard CGI module, shown in Example 19-1, painlessly manages input and output encoding.

Example 19-1. hiweb
  #!/usr/bin/perl -w
  # hiweb - load CGI module to decode information given by web server
  use strict;
  use CGI qw(:standard escapeHTML);
  # get a parameter from a form
  my $value = param('PARAM_NAME');
  # output a document
  print header( ), start_html("Howdy there!"),
        p("You typed: ", tt(escapeHTML($value))),
        end_html( );

19.1.3 Discussion

CGI is just a protocol, a formal agreement between a web server and a separate program. The server encodes the client's form input data, and the CGI program decodes the form and generates output. The protocol says nothing regarding which language the program must be written in; programs and scripts that obey the CGI protocol have been written in C, shell, Rexx, C++, VMS DCL, Smalltalk, Tcl, Python, and of course Perl.

The full CGI specification lays out which environment variables hold which data (such as form input parameters) and how it's all encoded. In theory, it should be easy to follow the protocol to decode the input, but in practice, it is surprisingly tricky to get right. That's why we strongly recommend using the CGI module. The hard work of handling the CGI requirements correctly and conveniently has already been done, freeing you to write the core of your program without getting bogged down in network protocols.
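To make the difficulty concrete, here is a minimal sketch of decoding a URL-encoded query string by hand, using only core Perl. The subroutine name parse_query and the sample string are hypothetical, made up for illustration. The sketch ignores multipart uploads, character encodings, and other cases that the CGI module already handles, so treat it as an illustration rather than a replacement.

```perl
use strict;
use warnings;

# parse_query is a hypothetical helper, not part of the CGI module.
# It decodes a URL-encoded string into a hash that maps each field
# name to an array ref of its values.
sub parse_query {
    my ($query) = @_;
    my %form;
    for my $pair (split /[&;]/, $query) {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                              # a plus sign encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX encodes a single byte
        }
        push @{ $form{$name} }, $value;
    }
    return \%form;
}

my $form = parse_query('name=Larry+Wall&lang=Perl&lang=C%2B%2B');
print "name: $form->{name}[0]\n";
print "lang: @{ $form->{lang} }\n";
```

Note that the plus signs must be translated before the %XX escapes are decoded; otherwise an encoded literal plus (%2B) would wrongly turn into a space.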

CGI scripts are called in two main ways, referred to as methods, but don't confuse HTTP methods with Perl object methods! The HTTP GET method is used in document retrievals where an identical request will produce an identical result, such as a dictionary lookup. A GET stores form data in the URL. This means it can be conveniently bookmarked for canned requests, but has limitations on the total request size. The HTTP POST method sends form data in the body of the request, separate from the URL. It has no size limitations, but cannot be bookmarked. Forms that update information on the server, such as mailing in feedback or modifying a database entry, should use POST. Client browsers and intervening proxies are free to cache and refresh the results of GET requests behind your back, but they may not cache POST requests. GET is suitable only for short read-only requests, whereas POST works for forms of any size, as well as for updates and feedback responses. By default, therefore, the CGI module uses POST for all forms it generates.
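The difference between the two methods shows up in where a script finds the still-encoded form data. The following is a rough sketch, assuming the usual CGI environment variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH. The helper name raw_form_data and the sample data are hypothetical, and the requests are simulated so the code runs outside a web server.

```perl
use strict;
use warnings;

# raw_form_data is a hypothetical helper. Given a hash ref of CGI
# environment variables and an input filehandle, it returns the
# still-encoded form data.
sub raw_form_data {
    my ($env, $in) = @_;
    my $method = $env->{REQUEST_METHOD} || 'GET';
    if ($method eq 'POST') {
        # POST: the data is the request body; read CONTENT_LENGTH bytes
        my $length = $env->{CONTENT_LENGTH} || 0;
        my $body   = '';
        read($in, $body, $length) if $length > 0;
        return $body;
    }
    # GET: the data is whatever followed the "?" in the URL
    return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : '';
}

# Simulate both kinds of request. A real script would pass \%ENV and \*STDIN.
my $get = raw_form_data({ REQUEST_METHOD => 'GET', QUERY_STRING => 'word=camel' });

my $posted = 'comment=Nice+book';
open(my $fh, '<', \$posted) or die "can't open in-memory file: $!";
my $post = raw_form_data(
    { REQUEST_METHOD => 'POST', CONTENT_LENGTH => length($posted) }, $fh);

print "GET data:  $get\n";
print "POST data: $post\n";
```

The CGI module does this dispatching for you, which is one more reason to call param instead of reading the environment yourself.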

With few exceptions, mainly related to file permissions and highly interactive work, CGI scripts can do nearly anything other programs can do. They can send results back in many formats: plain text, HTML documents, XML files, sound files, pictures, or anything else specified in the HTTP header. Besides producing plain text or HTML text, they can also redirect the client browser to another location, set server cookies, request authentication, and give errors.
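As a brief sketch of two of those non-HTML responses, the code below builds a redirect and a header that sets a cookie, using the CGI module's redirect, cookie, and header functions. The URL, the cookie name visits, and the one-hour expiry are placeholders chosen for illustration.

```perl
use strict;
use warnings;
use CGI qw(:standard);

# A redirect replaces the usual document. The URL is a placeholder.
my $redirect = redirect('http://www.example.com/moved.html');

# A cookie rides along in the header of an ordinary response.
my $cookie = cookie(-name => 'visits', -value => 1, -expires => '+1h');
my $page   = header(-cookie => $cookie);

# A real script sends one response or the other, never both.
print $page;
```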

The CGI module provides two different interfaces: a procedural one for casual use, and an object-oriented one for power users with complicated needs. Virtually all CGI scripts should use the simple procedural interface, but unfortunately, most of the module's documentation uses examples with the original object-oriented approach. For backward compatibility, the procedural interface is not the default: if you want it, you need to ask for it specifically using the :standard import tag. See Chapter 12 for more on import tags.
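For comparison, here is a small sketch that fetches the same parameter through both interfaces. The field name favorite and its value are invented, and the two %ENV assignments merely simulate a GET request so the code can run from the command line.

```perl
use strict;
use warnings;
use CGI qw(:standard);

# Simulate a GET request so the example runs outside a web server.
$ENV{REQUEST_METHOD} = 'GET';
$ENV{QUERY_STRING}   = 'favorite=camel';

# Procedural interface: a function imported by the :standard tag.
my $procedural = param('favorite');

# Object-oriented interface: the same call as a method on a query object.
my $query = CGI->new;
my $oo    = $query->param('favorite');

print "procedural:      $procedural\n";
print "object-oriented: $oo\n";
```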

To read the user's form input, pass the param function a field name. If you have a form field named "favorite", then param("favorite") returns its value. With some types of form fields, such as scrolling lists, the user can choose more than one option. For these, param returns a list of values, which you could assign to an array.

For example, here's a script that pulls in values of three form fields, the last one having many return values:

use CGI qw(:standard);
my $who   = param("Name");       # single value
my $phone = param("Number");     # single value
my @picks = param("Choices");    # list context: every selected value

Called without arguments, param returns a list of the names of the form parameters in list context, or the number of form parameters in scalar context.
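The sketch below exercises these calling styles against a simulated GET request. The field names match the example above, but the values are invented. Be aware that recent releases of CGI.pm warn when param is called with a field name in list context and provide multi_param for that case; the code keeps to param to match the text.

```perl
use strict;
use warnings;
use CGI qw(:standard);

# Simulate a GET request so the example runs outside a web server.
$ENV{REQUEST_METHOD} = 'GET';
$ENV{QUERY_STRING}   = 'Name=Gnat&Number=555-1212&Choices=red&Choices=blue';

my @names = param();             # no arguments, list context: field names
my $count = param();             # no arguments, scalar context: a count
my $who   = param("Name");       # one value
my @picks = param("Choices");    # every value of a multivalued field

print "fields: @names ($count in all)\n";
print "$who picked: @picks\n";
```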

That's all there is to accessing the user's input. Do with it whatever you please, then generate properly formatted output. This is nearly as easy. Remember that unlike regular programs, a CGI script's output must be formatted in a particular way: it must first emit a set of headers followed by a blank line before any normal output.
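To see what that format looks like on the wire, here is a bare-bones sketch that builds a response by hand, without the CGI module. The variable names and the message text are arbitrary.

```perl
use strict;
use warnings;

my $headers = "Content-Type: text/plain\r\n";   # one header per line
my $blank   = "\r\n";                            # a blank line ends the headers
my $body    = "Hello from a CGI script.\n";     # normal output comes last

my $response = $headers . $blank . $body;
print $response;
```

Leaving out the blank line is a classic mistake: the server then cannot tell where the headers stop and typically reports an error instead of your document.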

As shown in the Solution, the CGI module helps with output as well as input. The module provides functions for generating HTTP headers and HTML code. The header function builds the header for you. By default, it produces headers for a text/html document, but you can change the Content-Type and supply other optional header parameters as well:

print header( -TYPE    => 'text/plain',
              -EXPIRES => '+3d' );

The CGI module can also be used to generate HTML. It may seem trivial, but this is where the CGI module shines: the creation of dynamic forms, especially stateful ones such as shopping carts. The CGI module even has functions for generating forms and tables.
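As a small sketch of those HTML-generating functions, the code below builds a one-field form with start_form, textfield, submit, and end_form. The field name favorite, the button label, and the page title are invented for illustration.

```perl
use strict;
use warnings;
use CGI qw(:standard);

# Build the form as a string so it can be inspected or reused.
my $form = join "\n",
    start_form(),
    p("Favorite animal: ", textfield(-name => 'favorite')),
    submit(-value => 'Send'),
    end_form();

print header(), start_html("Pick one"), $form, end_html();
```

Because start_form is called without arguments here, the generated form uses the module's defaults, including the POST method.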

When printing form widgets, the characters &, <, >, and " in HTML output are automatically replaced with their entity equivalents. This is not the case with arbitrary user output. That's why the Solution imports and makes use of the escapeHTML function: if the user types any of those special characters, they won't cause formatting errors in the HTML.
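Here is a short sketch of escapeHTML at work on a string containing all four special characters. The sample input is invented, standing in for something a user might type into a form.

```perl
use strict;
use warnings;
use CGI qw(:standard escapeHTML);

# Pretend this came from a form field.
my $typed = '<b>Tom & "Jerry"</b>';

# Replace &, <, >, and " with their entity equivalents.
my $safe = escapeHTML($typed);

print p("You typed: ", tt($safe)), "\n";
```

Without the call to escapeHTML, the browser would treat the b tags as markup and render the name in bold instead of showing what the user actually typed.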

For a full list of functions and their calling conventions, see the CGI module's documentation.

19.1.4 See Also

The documentation for the standard CGI module; Recipe 19.6