NULL Values

No discussion of data types would be complete without talking about NULL values. NULL is not really a data type, but rather a value that can be held by any data type. A column (or other expression) of any given data type can hold all permissible values for that type, or it can hold no value. When a column has no value, it is said to be NULL. For example, a column of type SMALLINT can hold values between ?32768 and +32767: it can also be NULL. A TIME column can hold values from midnight to noon, but a TIME value can also be NULL.

NULL values represent missing, unknown, or not-applicable values. For example, let's say that you want to add a membership_expiration_date to the customers table. Some customers might be permanent members?their memberships will never expire. For those customers, the membership_expiration_date is not applicable and should be set to NULL. You may also find some customers who don't want to provide you with their birth dates. The birth_date column for these customers should be NULL.

In one case, NULL means not applicable. In the other case, NULL means don't know. A NULL membership_expiration_date does not mean that you don't know the expiration date, it means that the expiration date does not apply. A NULL birth_date does not mean that the customer was never born(!); it means that the date of birth is unknown.

Of course, when you create a table, you can specify that a given column cannot hold NULL values (NOT NULL). When you do so, you aren't affecting the data type of the column; you're just saying that NULL is not a legal value for that particular column. A column that prohibits NULL values is mandatory; a column that allows NULL values is optional.

You may be wondering how a data type could hold all values legal for that type, plus one more value. The answer is that PostgreSQL knows whether a given column is NULL not by looking at the column itself, but by first examining a NULL indicator (a single bit) stored separately from the column. If the NULL indicator for a given row/column is set to TRUE, the data stored in the row/column is meaningless. This means that a data row is composed of values for each column plus an array of indicator bits?one bit for each optional column.



    Part II: Programming with PostgreSQL