A Crash Course on Databases

A tutorial-based book should keep you actively working, so I try to refrain from pausing the action for long-winded passages explaining esoterica. But you will not get very far developing dynamic Web sites if you do not have a solid familiarity (though not necessarily expertise) with databases. In this section, I'll introduce you to basic database concepts and vocabulary, using Microsoft Access. I strongly encourage you to spend additional time learning to work with databases, as you continue to master dynamic Web site development. For now, this section should be enough to get you started.

Introducing Database Objects

In the simplest terms, a database is a system of storage for data. But in contemporary use, the term database generally means a lot morecertainly in the case of Microsoft Access or an enterprise-level database system, such as Microsoft SQL Server or Oracle. Each of these is a relational database management system (RDBMS). The RDBMS model was developed in the 1970s and 1980s to enable database managers to store data in a way that reflected relationships between different types of data. We'll return to the idea of relationships momentarily, but first you should understand the objects that make up databases.

Data in a database is stored in tables. At first glance, tables look like Excel spread sheets, in that they are made of rows and columns. The columns, called fields, contain a single category of information. The rows, called records, contain a single set of information comprising one element of data for each field. For example, in a table called tbl_customers, you might expect to find fields for first name, address, city, state, postal code, phone number, and so on. Each individual customer would have her or his own record.

The accompanying figure shows a table from the Newland database, which contains basic information about countries. You can easily see each country listed in a row in the countryName field, and you'll see that each country has the same type of information listed. You can edit tables directly by clicking in a cell and typing away, but there is a better way.


Tables aren't the only type of object you can expect to find in databases. Also of note are forms, reports, and queries. Forms are used to insert new data and modify existing data. The form used to build the country table shown in the preceding screenshot can be seen in the following screenshot. Forms make it easy to insert and edit information, and you can also use them to control the type of information entered, which helps ensure the integrity of the data entered. At the bottom of the form is a group of record navigation buttons, which you can use to access the record you want to edit, or to create a new record from scratch.


Whereas forms are a means of inputting information into tables, reports are a means of outputting that data. You probably noticed that it was impossible to read all of the information in the table directly, because the fields weren't wide enough to accommodate all the text. You can use reports to make data presentable. Better yet, you can selectively show only some data in reports, rather than showing all of it, which makes reports much more useful.


You use queries to show data selectively in reports. Queries are like searches: You provide certain criteria, and the database returns a report. For example, you could obtain a report of all the countries in the Newland database that begin with T, which would return Taiwan and Thailand. Queries are written in a language called SQL, or Structured Query Language. You will use SQL heavily when working with databases, beginning with this chapter, because you must use SQL to retrieve data from a database to make it available for Web pages. The reason for this is that SQL is the primary means that developers have to communicate with database systems. Access has a visual SQL editor, but you can't use it in Dreamweaver. In Dreamweaver, you can hand-code SQL or use Dreamweaver's visual SQL editor.


Familiarity with SQL is a prerequisite for dynamic Web application development. You'll learn a fair amount in this book, but for an easy-to-read and comprehensive introduction, check out SQL: Visual QuickStart Guide (Peachpit).

The SQL snippet in the accompanying screenshot retrieves all of the records in the countryName, population, country_currency, and description fields of the tbl_country table.


Understanding Relationships

Relationships are a crucial concept when it comes to working with modern databases. They enable developers to specify how different database tables are connected with one another through shared data. By creating relationships, database designers are able to model data into tables that reflect reality and enable efficient maintenance of data over time. I have created the database file you will use for the book, and it already has many relationships in place. While you won't be creating any more relationships, you will often need to retrieve and use data from more than one table together, and you can't do this unless you understand relationships.

It is perhaps easiest to understand relationships by following an example. Imagine you use an Excel spreadsheet to store information about financial transactions at your company. You want to store each transaction as a separate row, so you create a spreadsheet with the following columns:











Over time, hundreds of records are added to this spreadsheet. Many of these records are for repeat customers. The problem is, each time a customer returns her or his address information is stored again. As time passes, some of these repeat customers move. Their new addresses are duly entered in the spreadsheet, but all of the former records have the old address. Chances are, sooner or later, someone will inadvertently use the wrong address. Updating these addresses is hard, because there are so many; and, unfortunately, in Excel there's not much you can do about this problem.

A more logical way to represent the transaction is to separate the customer from the transaction. One table would track individual sales, but the customer information would be stored in a separate table for customers. The customer table would have one and only one record for each customer. If a customer moved, you would need to update only the single record that applied to the customer, not all of the records of his or her transactions. Then, back in the transaction table, instead of listing all of the customer information, you would list a unique identifier that referenced the customer in the customer table. Databases enable you to create this type of relationship between tables.


If you've ever wondered why catalog companies have product IDs or customer IDs that you have to refer to, this is why: Those IDs are unique numbers in their database that refer to you and only you, or to a given product and only that product.

Thus, the customer table would look as follows:









And the transaction table would look as follows:






Notice that both tables have a field for cust_ID. The cust_ID in the customer table is a unique identifier in that table, also called the primary key. No two columns in this table will ever have the same cust_ID. It's possible that there will be two John Smiths, and it's possible that two people will reside at postal code 90210. But each row is guaranteed to be unique, because each row has its own unique primary key, cust_ID.

In contrast, the cust_ID in the transaction table could be repeated multiple timesthis would mean that the same customer had ordered more than one time. When you use the primary key of one table as a field in a different table, it is referred to as a foreign key. By placing foreign keys in tables, you create relationships between tables. Again, the benefit of doing this is that you remove redundant information and better maintain the integrity of your data.


To facilitate the discussion, I've simplified these tables. For example, you would normally expect to see a third table to handle products (that is, an inventory table), with product_ID used as the foreign key in the transaction table. Also, this example assumes that a customer can have only a single credit card. Obviously, you can add new tables, fields, and relationships to handle these realities.

The following figure shows the relationships between the two tables described in this example. The line between the two tables indicates the relationship. The number 1 on the left side indicates that in the tbl_customers table, the cust_ID is unique, while the infinity character on the right indicates that cust_ID can appear many times. This is known as a one-to-many relationship.


The power of relationships extends beyond preventing redundancy. For example, you can write a SQL query that pulls data out of both tables, using certain criteria. For instance, you could write a query that lists all of the first and last names of customers who spent over $100. You can also create forms that write to more than one table.


A copy of this simple database can be found on the CD-ROM, in Lesson08/Complete/transaction.mdb.

Databases on the Web

Now that you have a sense for what databases look like and what they can do, you should be primed to understand how they work on the Web.

Database content is used on the Web in many different ways. The simplest waywhich you'll do yourself at the end of this lessonis to display the contents of a field on the Web. But you can display more than a simple string of text. You can also display multiple fields, populate menus, and handle user authentication. In addition to reading and displaying information, you can also collect it and store it in database tables. Using this technique, you can create user registration, surveys, quizzes, and content management systems.


Content management systems let users add or modify site content without having to know any HTML, buy any special software, or worry about uploading files, using Web forms and databases. The user types the site content into a Web form. This content is then stored in a database and output to a different Web page.

If you compare database-driven Web pages to database objects, you'll realize that you can effectively use Web pages and Web forms as a surrogate for database reports and forms. Pages that output data are like reports, while pages with Web forms perform the same function as database forms. The advantage to such a system is that users don't need Access (or whichever RDBMS your organization uses) to maintain its content, and users can be distributed all over the world and still have access to the data storage, thanks to the Internet.

So much for the cool features; now how does it all work? Just as you have access to files sitting on your hard drive, so a user has access to files sitting on a server's hard drive. An Internet user does not have permission to start modifying files on any serverthat would be a security nightmare. But you can give applications on your system access to certain files; using a technology called ODBC, you can create a data source name (DSN) that you can use to allow ASP or ColdFusion to read and write to your Access database. The relationship is depicted in the following figure.


There's a lot going on in this figure. Before I deconstruct it, though, notice how far the user is from the actual data; this complex sequence enables the user to access data, but protects the data from convenient access, providing some measure of security.


Security is a major issue with dynamic applications that link to data sources. You don't want malicious users to gain access to data, whether to read it or to write it. It is the job of the server administrator to secure the server, but it is your job as application developer to secure the application. I don't cover security issues in this book, but as you start developing real-world database-driven applications, you need to take this topic seriously. A great place to start is Macromedia's Security Development Center, a free resource containing dozens of articles, white papers, tips, tutorials, and more. It can be found at http://www.macromedia.com/desdev/security/.

Let's look at the figure in detail. You'll notice the figure is divided into four different regions: the client layer, the middleware layer, the ODBC layer, and the database layer.

The client layer includes static HTML interpreted by the browser. This static HTML may have been coded as static HTML, or as dynamic HTML, but as you know, by the time it is returned to the browser, it's all static HTML.

ODBC has all of the information needed to access the database, including the appropriate database driver and the path to the database. These two pieces of information are stored in ODBC and are referred to using the DSN.

Finally, the SQL reaches the database itself, and it looks in the proper table(s) and retrieves the appropriate information.

So much for the way in. On the way out, the data is sent back the same way it came in, until it gets to the middleware layer. At this point, the data retrieved is stored in the server's memory (RAM). It is not actually on the page yet. A set of data stored in memory as the result of a query is called a recordset. You use ASP or ColdFusion to output the recordset data (or any subset of it) inside the HTML, much as you did with form variables, so that the user can see it.