2.2 HTTP

An HTTP request is just that: an incoming request from a client (typically a web browser) to a server for a specific document. It's worth looking at this a bit more closely. Let's start with the two main methods for retrieving a document via HTTP, GET and POST.

2.2.1 GET

Generally speaking, a GET is a simple request for a page with some parameters sent in a specific format. Examine the request shown in Example 2-1, as you might type it into the address field of your browser.

Example 2-1. HTTP GET request URI
http://localhost:1234/example.jsp?page=12&format=simple

Looking at the HTTP GET request, notice that the request contains two parameters: page and format. When the web application gets this request, it may examine these parameters and return different results based on the values of these parameters. Example 2-2 shows the actual text sent to the web server.

Example 2-2. Bytes sent for an HTTP GET request
GET /example.jsp?page=12&format=simple HTTP/1.1
Host: cascadetg.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5)
            Gecko/20031007 Firebird/0.7
Accept: text/xml,application/xml,application/xhtml+xml,text/
        html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/
        jpeg,image/gif;q=0.2,*/*;q=0.1
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: CP=null*

In many ways, it can help to think of a page request as a very simple execution of a command-line program, or even just a function being called. Conceptually, this is how a servlet works at the most primitive level: a stream of bytes from the client is passed in, and a stream of bytes is then sent back.
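To make the "function call" view concrete, here is a minimal sketch that treats a request like the one in Example 2-2 as plain text going in and plain text coming out. The class name RequestSketch and the methods parseQuery and handle are hypothetical names chosen for illustration; this is not how a servlet container is actually implemented, and the sketch assumes a well-formed request line and Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestSketch {

    /** Splits a query string such as "page=12&format=simple" into name/value pairs. */
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Names and values are percent-encoded on the wire.
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    /** Takes the raw request text and returns the raw response text. */
    static String handle(String rawRequest) {
        // The first line looks like: GET /example.jsp?page=12&format=simple HTTP/1.1
        // (assumption: the request line is well formed; no error handling here).
        String requestLine = rawRequest.split("\r?\n", 2)[0];
        String target = requestLine.split(" ")[1];
        int q = target.indexOf('?');
        Map<String, String> params = parseQuery(q < 0 ? "" : target.substring(q + 1));

        // Build a plain-text reply that echoes the two parameters.
        String body = "page=" + params.get("page") + ", format=" + params.get("format");
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 200 OK\r\n"
             + "Content-Type: text/plain; charset=utf-8\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"
             + body;
    }

    public static void main(String[] args) {
        String request = "GET /example.jsp?page=12&format=simple HTTP/1.1\r\n"
                       + "Host: cascadetg.com\r\n"
                       + "\r\n";
        System.out.println(handle(request));
    }
}
```

A real servlet never parses the request line itself; the container does that work and hands the servlet the parameters already decoded. The sketch only shows that nothing more than text is involved underneath.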

2.2.2 POST

A POST is conceptually similar to a GET, but the data is sent in the body of the request rather than in the URI. A POST is commonly used to handle a form submission, which can carry more complex data than a GET. Example 2-3 shows a simple HTML form.

Example 2-3. HTML form
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Simple Form</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<form action="http://localhost:1234/" method="post"
    enctype="multipart/form-data" name="simpleForm" >
  <p>Text  Field <input type="text" name="textfield"></p>
  <p>File to upload <input type="file" name="file"></p>
  <p><input type="submit" name="Submit" value="Submit">
  </p>
</form>
</body>
</html>

Note the form and input tags in the example. The input elements generate user interface elements in the web browser. When the user clicks the Submit button, as shown in Figure 2-3, the web browser examines these user interface elements and sends the appropriate data to the web server.

Figure 2-3. A simple HTML form

An example of the data sent to the server is shown in Example 2-4. You'll notice that the values of the text field and the Submit button are sent as part of this text.

Example 2-4. Bytes sent for an HTTP POST request
POST / HTTP/1.1
Host: cascadetg.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5)
            Gecko/20031007 Firebird/0.7
Accept: text/xml,application/xml,application/xhtml+xml,text/
        html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/
        jpeg,image/gif;q=0.2,*/*;q=0.1
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: CP=null*
Content-Type: multipart/form-data;
              boundary=---------------------------41184676334
Content-Length: 438

-----------------------------41184676334
Content-Disposition: form-data; name="textfield"

Test
-----------------------------41184676334
Content-Disposition: form-data; name="file"; filename="tiny.txt"
Content-Type: text/plain

A small text file.

Nothing to see here.  Move along.

-----------------------------41184676334
Content-Disposition: form-data; name="Submit"

Submit
-----------------------------41184676334--

Note that the file is encoded and sent along with the rest of the content in this request (using MIME-based encoding). This demonstrates how a simple stream of bytes can contain complex information, including form data and even files. Most of the standard Internet protocols, such as HTTP, NNTP (used in Usenet newsgroups), SMTP, POP, and FTP, are based on similar, more or less human-readable formats. In many ways, these textual formats aren't particularly efficient, but they are easy to understand and debug. The main difference between "classic" Internet protocols, such as HTTP and NNTP, and web service protocols (such as SOAP) is the latter's reliance on XML for representing and parsing the bytes.
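Because the body in Example 2-4 is plain text split by a boundary string, a few lines of code are enough to take it apart. The following is a rough sketch only: the class MultipartSketch, its parse method, and the short boundary value BOUNDARY are hypothetical and chosen for illustration. It assumes text-only parts and CRLF line endings, and a production application would more likely rely on a tested library or the servlet container for this work.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultipartSketch {

    // Matches name="..." but not filename="..." in a Content-Disposition header.
    private static final Pattern NAME = Pattern.compile("\\bname=\"([^\"]*)\"");

    /** Returns field name -> field content for a text-only multipart body. */
    static Map<String, String> parse(String body, String boundary) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Each part is introduced by "--" followed by the boundary from the Content-Type header.
        String delimiter = "--" + boundary;
        for (String part : body.split(Pattern.quote(delimiter))) {
            // Skip the empty text before the first delimiter and the closing "--" marker.
            if (part.trim().isEmpty() || part.startsWith("--")) {
                continue;
            }
            // A blank line separates the part's headers from its content.
            int split = part.indexOf("\r\n\r\n");
            if (split < 0) {
                continue;
            }
            String headers = part.substring(0, split);
            String content = part.substring(split + 4);
            // Drop the line break that precedes the next delimiter.
            if (content.endsWith("\r\n")) {
                content = content.substring(0, content.length() - 2);
            }
            Matcher m = NAME.matcher(headers);
            if (m.find()) {
                fields.put(m.group(1), content);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String body = "--BOUNDARY\r\n"
                    + "Content-Disposition: form-data; name=\"textfield\"\r\n"
                    + "\r\n"
                    + "Test\r\n"
                    + "--BOUNDARY\r\n"
                    + "Content-Disposition: form-data; name=\"Submit\"\r\n"
                    + "\r\n"
                    + "Submit\r\n"
                    + "--BOUNDARY--\r\n";
        parse(body, "BOUNDARY").forEach((name, value) ->
                System.out.println(name + " = " + value));
    }
}
```

Treating the body as a String is the main simplification here. An uploaded file can contain arbitrary bytes, so real code has to work on the byte stream rather than on decoded text.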

2.2.3 Potential of Bytes

Once you start thinking of the Internet and networking as readable (and possibly mutable) streams of bytes, many interesting ideas become apparent. For example, certain products read and then pass along streams of bytes. Such a product might alter a web page to block pornography or generate reports on the information being sent. Another obvious example is a server application that goes out on the Internet, automatically downloads web pages, searches for links and other information, and then builds a searchable index (as Google and other Internet search engines do).
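The "read and then pass along" idea can be sketched in a few lines. The class PassThroughSketch and its relay method are hypothetical names used only for illustration. The sketch copies bytes unchanged from one stream to another and counts them, which is roughly the starting point for the reporting products described above; a filtering product would inspect or rewrite the buffer before writing it out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class PassThroughSketch {

    /** Copies every byte from in to out unchanged and returns how many bytes were relayed. */
    static long relay(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            // A filter or report generator would examine buffer[0..n) here.
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the client and server sockets.
        byte[] request = "GET /example.jsp?page=12&format=simple HTTP/1.1\r\n\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream forwarded = new ByteArrayOutputStream();
        long count = relay(new ByteArrayInputStream(request), forwarded);
        System.out.println("Relayed " + count + " bytes");
    }
}
```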

As discussed in Chapter 1, it turns out one popular idea is to read the contents of other web pages, parse the HTML, and then generate your own "new" page featuring that content. Aside from being illegal and extremely rude, it's incredibly inefficient. You may only be interested in one or two thousand bytes in that response, and yet Amazon is sending you a hundred times that much data (HTML formatting, information about other products, etc.). It's much more efficient to directly call another server with a request, and just receive the data you're looking for: in other words, a remote procedure call.