Chapter 3: SQL Data Types

Overview

Previously we defined database as an organized collection of information. Not only does that mean that data have to be organized according to a company's business rules, but also the database organization should reflect the nature of the information. Databases can store dollar amounts, quantities, names, date and time values, binary files, and more. These can be further classified by type, which reflects the "nature" of the data: numbers, characters, dates, etc.

Note

Data type is a characteristic of a database table column that determines what kind of data it can hold.

One can ask: why do we need data types at all? Wouldn't it be easier simply have one uniform data type and store everything, let's say, in the form of character strings?

There are many reasons why we don't do that. Some of them are historical. For example, when relational databases were born in the late twentieth century, hard disk space and memory storage were at premium, so the idea was to store everything as efficiently as possible. Already existing programming languages had some built-in rules for how to store different types of data. For example, any English character (plus special characters and digits) could be represented using its ASCII equivalent and the necessary storage for it was one byte (more about ASCII later in this chapter). Numbers are traditionally stored in the form of binary strings (native to computer architecture). To represent a number from negative 32,768 to positive 32,767, two bytes (or sixteen bits) of storage are sufficient (2¹⁶). But if we used ASCII characters to represent numbers, we would need six bytes to store any integer greater than 9,999 (five bytes for digits, and one for the plus or minus sign), five bytes for whole numbers greater than 999 (four bytes for digits, one byte for the sign), and so on. Now imagine — if we have a million of records, we could save four million bytes (about 4M). Sounds like almost nothing today, but back in the 1970s that was incredibly large storage space. The principle of effectiveness is still in place, of course, but now we are talking different scales.

Note

One byte consists of eight bits. Each bit is a binary number that can either be 0 or 1. All information is stored in memory or on the hard disk in form of ones and zeroes, representing the only two states computers understand: zero means no signal, and one indicates the presence of the signal.

Another reason is logical consistency. Every data type has its own rules, sort order, relations with other data types, implicit conversion rules, and so on. It is definitely easier to work with sets of like values, say dates, rather than with a mixture of dates, numbers, and character strings. Try comparing library shelves where all materials are sorted and classified (fiction is in one room, kids' literature in another, audio books in their special area, videotapes somewhere else) with a pile of chaotically mixed books, tapes, white papers, and CDs and think what would you prefer for finding information.

And the last thing to mention — some modern data types (particularly movie files) are too large and too complicated to store them in a traditional way. Now we are going to discuss existing SQL data types in more details.