MySQL Column Types

Each table in a database is made up of one or more columns. When you create a table using a CREATE TABLE statement, you specify a type for each column. A column type is more specific than a data type, which is just a general category, such as "number" or "string." A column type precisely characterizes the kind of values a given table column can contain, such as SMALLINT or VARCHAR(32).

MySQL's column types are the means by which you describe what kinds of values a table's columns contain, which in turn determines how MySQL treats those values. For example, if you have numeric values, you can store them using a numeric or a string column type, but MySQL will treat the values somewhat differently depending on how you store them. Each column type has several characteristics:

  • What kind of values you can store in it

  • How much space values take up, and whether the values are fixed-length (all values of the type taking the same amount of space) or variable-length (the amount of space depending on the particular value being stored)

  • How values of the type are compared and sorted

  • Whether the type allows NULL values

  • Whether the type can be indexed

The following discussion surveys MySQL's column types briefly in a broad overview and then describes in more detail the properties that characterize each type.

Overview of Column Types

MySQL provides column types for values from all the general data type categories except the NULL value. NULL spans all types in the sense that the property of whether a column can contain NULL values is treated as a type attribute.

MySQL has numeric column types for both integer and floating-point values, as shown in Table 2.2. Integer columns can be signed or unsigned. A special attribute allows sequential integer column values to be generated automatically, which is useful in applications that require a series of unique identification numbers.

Table 2.2. Numeric Column Types
Type Name Meaning
TINYINT A very small integer
SMALLINT A small integer
MEDIUMINT A medium-sized integer
INT A standard integer
BIGINT A large integer
FLOAT A single-precision floating-point number
DOUBLE A double-precision floating-point number
DECIMAL A floating-point number, represented as a string

MySQL string column types are shown in Table 2.3. Strings can hold anything, even arbitrary binary data such as images or sounds. Strings can be compared according to whether or not they are case sensitive. In addition, you can perform pattern matching on strings. (Actually, in MySQL you can perform pattern matching on any column type, but it's most often done with string types.)

Table 2.3. String Column Types
Type Name Meaning
CHAR A fixed-length character string
VARCHAR A variable-length character string
TINYBLOB A very small BLOB (binary large object)
BLOB A small BLOB
MEDIUMBLOB A medium-sized BLOB
LONGBLOB A large BLOB
TINYTEXT A very small text string
TEXT A small text string
MEDIUMTEXT A medium-sized text string
LONGTEXT A large text string
ENUM An enumeration; column values may be assigned one enumeration member
SET A set; column values may be assigned multiple set members

MySQL date and time types are shown in Table 2.4, where CC, YY, MM, DD hh, mm, and ss represent century, year, month, day, hour, minute, and second. For temporal values, MySQL provides types for dates and times (either combined or separate) and timestamps (a special type that allows you to track when changes were last made to a record). There is also a type for efficiently representing year values when you don't need an entire date.

Table 2.4. Date and Time Column Types
Type Name Meaning
DATE A date value, in 'CCYY-MM-DD' format
TIME A time value, in 'hh:mm:ss' format
DATETIME A date and time value, in 'CCYY-MM-DD hh:mm:ss' format
TIMESTAMP A timestamp value, in CCYYMMDDhhmmss format
YEAR A year value, in CCYY format

Creating Tables

To create a table, issue a CREATE TABLE statement and specify a list of the columns that make up the table. Each column has a name and a type, and various attributes can be associated with each type. The following example creates a table named mytbl containing three columns named f, c, and i:

CREATE TABLE mytbl 
(
    f FLOAT(10,4),
    c CHAR(15) NOT NULL DEFAULT 'none',
    i TINYINT UNSIGNED NULL
);

The syntax for declaring a column is as follows:

col_name col_type [col_attributes] [general_attributes] 

The name of the column, col_name, is always first in the definition. The precise rules for naming columns are given in the "MySQL Naming Rules" section of Chapter 3. Briefly summarized, column names can be up to 64 characters long and can consist of alphanumeric characters from the server's default character set, as well as the underscore and dollar sign characters ('_' and '$'). Function names (words such as POS and MIN) are not reserved and can be used as column names; but keywords, such as SELECT, DELETE, and CREATE, normally are reserved and cannot be used, but as of MySQL 3.23.6, you can include other characters within a name or use reserved words by enclosing the name within backtick ('`') characters. A column name can begin with any character that is legal in a name, including a digit. However, unless quoted within backticks, a name cannot consist entirely of digits because then it would appear to be a number.

The column type col_type indicates the specific kind of values the column can hold. The type specifier can also indicate the maximum length of the values you store in the column. For some types, you specify the length explicitly as a number. For others, the length is implied by the type name. For example, CHAR(10) specifies an explicit length of 10 characters, whereas TINYBLOB values have an implicit maximum length of 255 characters. Some of the type specifiers allow you to indicate a maximum display width (how many characters to use for displaying values). Floating-point types allow the number of decimal places to be specified, so you can control how precise values are.

Following the column type, you can specify optional type-specific attributes as well as more general attributes. These attributes function as type modifiers. They cause MySQL to change the way it treats column values in some way:

  • The type-specific attributes that are allowable depend on the column type you choose. For example, UNSIGNED is allowable only for numeric types, and BINARY is allowable only for CHAR and VARCHAR.

  • The general attributes can be given for any column type, with a few exceptions. You can specify NULL or NOT NULL to indicate whether a column can hold NULL values. For all but BLOB and TEXT types, you can specify DEFAULT def_value to indicate that a column should be assigned the value def_value when a new row is created that does not explicitly specify the column's value. The value of def_value must be a constant; it cannot be an expression or refer to other columns.

If multiple column attributes are given, there are some constraints on the order in which they may appear. In general, you should be safe if you specify column type-specific attributes such as UNSIGNED or ZEROFILL before general attributes such as NULL or NOT NULL.

The rest of this section discusses the syntax for declaring each of MySQL's column types and the properties that characterize them, such as their range and storage requirements. The type specifications are shown as you use them in CREATE TABLE statements. Optional information is indicated by square brackets ([]). For example, the syntax MEDIUMINT[(M)] indicates that the maximum display width, specified as (M), is optional. On the other hand, for VARCHAR(M), the lack of brackets indicates that (M) is required.

Numeric Column Types

MySQL's numeric column types fall into two general classifications:

  • Integer types are used for numbers that have no fractional part, such as 1, 43, -3, 0, or -798432. You can use integer columns for data represented by whole numbers, such as weight to the nearest pound, height to the nearest inch, number of stars in a galaxy, number of people in a household, or number of bacteria in a petri dish.

  • Floating-point types are used for numbers that may have a fractional part, such as 3.14159, -.00273, -4.78, or 39.3E+4. You can use floating-point column types for values that may have a fractional part or that are extremely large or small. Some types of data you might represent as floating-point values are average crop yield, distances, money values, unemployment rates, or stock prices.

Integer types are the simplest. Floating-point types are more complex, particularly because their behavior has changed at certain points in MySQL's development.

Floating-point values can be assigned to integer columns but will be rounded to the nearest integer. Conversely, integer values can be assigned to floating-point columns. They are treated as floating-point values with a fractional part of zero.

Table 2.5 shows the name and range of each numeric type, and Table 2.6 shows the amount of storage required for values of each type.

Table 2.5. Numeric Column Type Ranges
Type Specification Range
TINYINT[(M)] Signed values: ?128 to 127 (?27 to 27 ? 1)
  Unsigned values: 0 to 255 (0 to 28 ? 1)
SMALLINT[(M)] Signed values: ?32768 to 32767 (?215 to 215 ? 1)
  Unsigned values: 0 to 65535 (0 to 216 ? 1)
MEDIUMINT[(M)] Signed values: ?8388608 to 8388607 (?223 to 223 ? 1)
  Unsigned values: 0 to 16777215 (0 to 224 ? 1)
INT[(M)] Signed values: ?2147683648 to 2147483647 (?231 to 231 ? 1)
  Unsigned values: 0 to 4294967295 (0 to 232 ? 1)
BIGINT[(M)] Signed values: ?9223372036854775808 to 9223372036854775807 (?263 to 263 ? 1)
  Unsigned values: 0 to 18446744073709551615 (0 to 264 ? 1)
FLOAT[(M,D)] Minimum non-zero values: ±1.175494351E?38
  Maximum non-zero values: ±3.402823466E+38
DOUBLE[(M,D)] Minimum non-zero values: ±2.2250738585072014E?308
  Maximum non-zero values: ±1.7976931348623157E+308
DECIMAL([M[,D]]) Varies; range depends on M and D

Table 2.6. Numeric Column Type Storage Requirements
Type Specification Storage Required
TINYINT[(M)] 1 byte
SMALLINT[(M)] 2 bytes
MEDIUMINT[(M)] 3 bytes
INT[(M)] 4 bytes
BIGINT[(M)] 8 bytes
FLOAT[(M,D)] 4 bytes
DOUBLE[(M,D)] 8 bytes
DECIMAL([M[,D]]) M bytes (MySQL < 3.23), M+2 bytes (MySQL 3.23)

Integer Column Types

MySQL provides five integer types: TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT. INTEGER is a synonym for INT. These types vary in the range of values they can represent and in the amount of storage space they require. (Types with a larger range require more storage.) Integer columns can be declared as UNSIGNED to disallow negative values; this shifts the range for the column upward to begin at 0.

When you declare an integer column, you can specify an optional display size M. If given, M should be an integer from 1 to 255. It represents the number of characters used to display values for the column. For example, MEDIUMINT(4) specifies a MEDIUMINT column with a display width of 4. If you declare an integer column without an explicit width, a default width is assigned. The defaults are the lengths of the "longest" values for each type. Note that displayed values are not chopped to fit within M characters. If the printable representation of a particular value requires more than M characters, MySQL displays the full value.

The display size M for an integer column is related only to the number of characters used to display column values. It has nothing to do with the number of bytes of storage space required. For example, BIGINT values require 8 bytes of storage regardless of the display width. It is not possible to magically cut the required storage space for a BIGINT column in half by declaring it as BIGINT(4). Nor does M have anything to do with the range of values allowed. If you declare a column as INT(3), that will not restrict it to a maximum value of 999.

The following statement creates a table to illustrate the default values of M and D for integer column types:

CREATE TABLE mytbl 
(
    itiny     TINYINT,
    itiny_u   TINYINT UNSIGNED,
    ismall    SMALLINT,
    ismall_u  SMALLINT UNSIGNED,
    imedium   MEDIUMINT,
    imedium_u MEDIUMINT UNSIGNED,
    ireg      INT,
    ireg_u    INT UNSIGNED,
    ibig      BIGINT,
    ibig_u    BIGINT UNSIGNED
);

If you issue a DESCRIBE mytbl statement after creating the table, the number following each type name shows the value that MySQL uses by default in the absence of an explicit display width specifier: [1]

[1] Due to a minor glitch, the display width for BIGINT will be 21 (not 20) if you run this query using a version of MySQL older than 3.23.

+-----------+-----------------------+ 
| Field     | Type                  |
+-----------+-----------------------+
| itiny     | tinyint(4)            |
| itiny_u   | tinyint(3) unsigned   |
| ismall    | smallint(6)           |
| ismall_u  | smallint(5) unsigned  |
| imedium   | mediumint(9)          |
| imedium_u | mediumint(8) unsigned |
| ireg      | int(11)               |
| ireg_u    | int(10) unsigned      |
| ibig      | bigint(20)            |
| ibig_u    | bigint(20) unsigned   |
+-----------+-----------------------+
Floating-Point Column Types

MySQL provides three floating-point types: FLOAT, DOUBLE, and DECIMAL. Synonymous types are DOUBLE PRECISION and REAL for DOUBLE, and NUMERIC for DECIMAL. Ranges for these types differ from ranges for integer types in the sense that there is not only a maximum value a floating-point type can represent, but also a minimum non-zero value. The minimum values provide a measure of how precise the type is, which is often important for recording scientific data. (There are, of course, corresponding negative maximum and minimum values.)

Floating-point types can be declared as UNSIGNED, although not until MySQL 4.0.2 for FLOAT and DOUBLE. Unlike the integer types, declaring a floating-point type UNSIGNED doesn't shift the type's range upward, it merely eliminates the negative end.

For each floating-point type, you can specify a maximum display size M and the number of decimal places D. The value of M should be from 1 to 255. The value of D can be from 0 to 30, but should be no more than M?2. (If you're more familiar with ODBC terms, M and D correspond to the ODBC concepts of "precision" and "scale.")

For FLOAT and DOUBLE, M and D are optional. If they are omitted, these types are treated as follows:

  • Prior to MySQL 3.23.6, FLOAT and DOUBLE are treated as FLOAT(10,2) and DOUBLE(16,4) with stored values rounded to 2 and 4 decimals, respectively.

  • For MySQL 3.23.6 and later, FLOAT and DOUBLE are stored to the full precision allowed by your hardware.

For DECIMAL, M and D may or may not be optional, depending on your version of MySQL:

  • Prior to MySQL 3.23.6, M and D are required for DECIMAL columns.

  • For MySQL 3.23.6 and later, if D is omitted, it defaults to 0. If M is omitted as well, it defaults to 10. In other words, the following equivalences hold:

    DECIMAL = DECIMAL(10) = DECIMAL(10,0) 
    DECIMAL(n) = DECIMAL(n,0)
    

FLOAT(p) syntax is also allowed for ODBC compatibility. However, the precise behavior of columns specified using this syntax is somewhat complicated:

  • Prior to MySQL 3.23, the allowable values of p are 4 and 8, indicating the number of bytes of storage per value. FLOAT(4) and FLOAT(8) are treated as FLOAT(10,2) and DOUBLE(16,4) with stored values rounded to 2 and 4 decimals, respectively.

  • For MySQL 3.23.0 to 3.23.5, the allowable values of p are still 4 and 8 and indicate the number of bytes of storage, but FLOAT(4) and FLOAT(8) are treated as single-precision and double-precision columns with values stored to full hardware precision.

  • For MySQL 3.23.6 and later, p can range from 0 to 53 and indicates the minimum number of bits of precision required for stored values. For p values from 0 to 24, the column is treated as single-precision. For values from 25 to 53, the column is treated as double-precision.

More confusing still is that MySQL allows FLOAT4 and FLOAT8 as synonyms, but what they are synonyms for depends on your version of MySQL:

  • Prior to MySQL 3.23.6, FLOAT4 and FLOAT8 are equivalent to FLOAT(10,2) and DOUBLE(16,4).

  • For MySQL 3.23.6 and later, FLOAT4 and FLOAT8 are equivalent to FLOAT and DOUBLE.

If you carefully compare these equivalences to those for FLOAT(4) and FLOAT(8), you'll see that FLOAT4 and FLOAT8 are not quite the same as FLOAT(4) and FLOAT(8), although you might have expected them to be.

Checking How MySQL Treats a Type Specification

If you're not sure how your version of MySQL will treat a given floating-point column specification, try the following. Create a table that contains a column defined the way you're wondering about and then use DESCRIBE to see how MySQL reports the type. For example, in MySQL 3.23.0, if you create a column using FLOAT4, you'd see the following:

mysql> CREATE TABLE t (f FLOAT4); 
mysql> DESCRIBE t;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| f     | float(10,2) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+

In MySQL 3.23.6, you'd see the following instead:

mysql> CREATE TABLE t (f FLOAT4); 
mysql> DESCRIBE t;
+-------+-------+------+-----+---------+-------+
| Field | Type  | Null | Key | Default | Extra |
+-------+-------+------+-----+---------+-------+
| f     | float | YES  |     | NULL    |       |
+-------+-------+------+-----+---------+-------+

The lack of a (M,D) indicator in the latter case indicates that values are stored to the full precision allowed by the hardware.

This technique actually works to see how MySQL treats any column definition, but I have found it most useful for floating-point types.

Choosing Numeric Column Types

When you choose a numeric type, consider the range of values you need to represent and choose the smallest type that will cover the range. Choosing a larger type wastes space, leading to tables that are unnecessarily large and that cannot be processed as efficiently as if you had chosen a smaller type. For integer values, TINYINT is the best if the range of values in your data is small, such as a person's age or number of siblings. MEDIUMINT can represent millions of values and can be used for many more types of values, at some additional cost in storage space. BIGINT has the largest range of all but requires twice as much storage as the next smallest integer type (INT) and should be used only when really necessary. For floating-point values, DOUBLE takes twice as much space as FLOAT. Unless you need exceptionally high precision or an extremely large range of values, you can probably represent your data at half the storage cost by using FLOAT.

Every numeric column's range of values is determined by its type. If you attempt to insert a value that lies outside the column's range, truncation occurs; MySQL clips the value to the appropriate endpoint of the range and uses the result. No truncation occurs when values are retrieved.

Value truncation occurs according to the range of the column type, not the display width. For example, a SMALLINT(3) column has a display width of 3 and a range from -32768 to 32767. The value 12345 is wider than the display width but within the range of the column, so it is inserted without clipping and retrieved as 12345. The value 99999 is outside the range, so it is clipped to 32767 when inserted. Subsequent retrievals retrieve the value 32767.

In general, values assigned to a floating-point column are rounded to the number of decimals indicated by the column specification. If you store 1.23456 in a FLOAT(8,1) column, the result is 1.2. If you store the same value in a FLOAT(8,4) column, the result is 1.2346. This means you should declare floating-point columns with a sufficient number of decimals to give you values as precise as you require. If you need accuracy to thousandths, don't declare a type with only two decimal places.

The DECIMAL type is a floating-point type, but it differs from FLOAT and DOUBLE in that DECIMAL values actually are stored as strings and have a fixed number of decimals. The significance of this fact is that DECIMAL values are not subject to roundoff error the way that FLOAT and DOUBLE columns are?a property that makes DECIMAL especially applicable to currency calculations. The corresponding tradeoff is that DECIMAL values are not as efficient as floating-point values stored in native format that the processor can operate on directly.

The maximum possible range for DECIMAL is the same as for DOUBLE, but the effective range is determined by the values of M and D. If you vary M and hold D fixed, the range becomes larger as M becomes larger. This is illustrated by Table 2.7. If you hold M fixed and vary D, the range becomes smaller as D becomes larger, although the precision increases. This is shown by Table 2.8.

Table 2.7. How M Affects the Range of DECIMAL(M,D)
Type Specification Range (for MySQL < 3.23) Range (for MySQL 3.23)
DECIMAL(4,1) ?9.9 to 99.9 ?999.9 to 9999.9
DECIMAL(5,1) ?99.9 to 999.9 ?9999.9 to 99999.9
DECIMAL(6,1) ?999.9 to 9999.9 ?99999.9 to 999999.9

Table 2.8. How D Affects the Range of DECIMAL(M,D)
Type Specification Range (for MySQL < 3.23) Range (for MySQL 3.23)
DECIMAL(4,0) ?999 to 9999 ?9999 to 99999
DECIMAL(4,1) ?9.9 to 99.9 ?999.9 to 9999.9
DECIMAL(4,2) ?.99 to 9.99 ?99.99 to 999.99

The range for a given DECIMAL type depends on your version of MySQL. As of MySQL 3.23, DECIMAL values are handled according to the ANSI specification, which states that a type of DECIMAL(M,D) must be able to represent any value with M digits and D decimal places. For example, DECIMAL(4,2) must be able to represent values from ?99.99 to 99.99. Because the sign character and decimal point must still be stored, this requires an extra two bytes, so DECIMAL(M,D) values for MySQL 3.23 and later use M+2 bytes. For DECIMAL(4,2), six bytes are needed for the "widest" value (?99.99). At the positive end of the range, the sign byte is not needed to hold a sign character, so MySQL uses it to extend the range beyond that required by the ANSI specification. In other words, for DECIMAL(4,2), the maximum value that can be stored in the six bytes available is 999.99.

There are two special conditions that reduce the DECIMAL storage requirement of M+2 bytes to a lesser value:

  • If D is 0, DECIMAL values have no fractional part and no byte need be allocated to store the decimal point. This reduces the required storage by one byte.

  • If a DECIMAL column is UNSIGNED, no sign character need be stored, also reducing the required storage by one byte.

For versions of MySQL prior to 3.23, DECIMAL values are represented in a slightly different fashion. A DECIMAL(M,D) column is stored using M bytes per value, and the sign character and decimal point (if needed) are included in the M bytes. Thus, for a type DECIMAL(4,2), the range is ?.99 to 9.99 because those cover all the possible 4-character values. If D is 0, no decimal point need be stored, and the byte usually used for that purpose can be used to store another digit. The effect is to extend the range of the column by an extra order of magnitude. (This explains why the pre-3.23 range in Table 2.8 shifts by a factor of 10 for DECIMAL(4,2) compared to DECIMAL(4,1), but by a factor of 100 for DECIMAL(4,1) compared to DECIMAL(4,0). I bet you didn't even notice that!)

Numeric Column Type Attributes

The ZEROFILL attribute can be specified for all numeric types. It causes displayed values for the column to be padded with leading zeros to the display width. You can use ZEROFILL when you want to make sure column values always display using a given number of digits. Actually, it's more accurate to say "a given minimum number of digits" because values wider than the display width are displayed in full without being chopped. You can see this by issuing the following statements:

mysql> DROP TABLE IF EXISTS mytbl; 
mysql> CREATE TABLE mytbl (my_zerofill INT(5) ZEROFILL);
mysql> INSERT INTO mytbl VALUES(1),(100),(10000),(1000000);
mysql> SELECT my_zerofill FROM mytbl;
+-------------+
| my_zerofill |
+-------------+
|       00001 |
|       00100 |
|       10000 |
|     1000000 |
+-------------+

Note that the final value, which is wider than the column's display width, is displayed in full.

The UNSIGNED attribute disallows negative values. It is most often used with integer types. Making an integer column UNSIGNED doesn't change the "size" of the underlying data type's range; it just shifts the range upward. Consider this table specification:

CREATE TABLE mytbl 
(
    itiny   TINYINT,
    itiny_u TINYINT UNSIGNED
);

itiny and itiny_u are both TINYINT columns with a range of 256 values but differ in the particular allowable values. The range of itiny is ?128 to 127, whereas the range of itiny_u is shifted up, resulting in a range of 0 to 255.

UNSIGNED is useful for integer columns into which you plan to store information that doesn't take on negative values, such as population counts or attendance figures. If you use a signed column for such values, you use only half of the column type's range. By making the column UNSIGNED, you effectively double your range. If you use the column for sequence numbers, it will take twice as long to run out of values if you make it UNSIGNED.

You can also specify UNSIGNED for floating-point columns, although the effect is slightly different than for integer columns. The range does not shift upward; instead, the upper end remains unchanged and the lower end becomes zero. A precaution to observe is that you should not use UNSIGNED with FLOAT or DOUBLE columns prior to MySQL 4.0.2. In earlier versions, MySQL allows these types to be declared as UNSIGNED, but doing so may result in unpredictable column behavior. (This prohibition does not apply to DECIMAL.)

One other attribute, AUTO_INCREMENT, can be specified for integer column types only. Use the AUTO_INCREMENT attribute when you want to generate unique identifiers or values in a series. When you insert NULL into an AUTO_INCREMENT column, MySQL generates the next sequence value and stores it in the column. Normally, unless you take steps to cause otherwise, AUTO_INCREMENT values begin at 1 and increase by 1 for each new row. The sequence can be affected if you delete rows from the table. This depends on the table type, which determines whether or not sequence values are reused.

You can have at most one AUTO_INCREMENT column in a table. The column should be NOT NULL, and it should be declared as a PRIMARY KEY or as a UNIQUE key. Also, because sequence values are always positive, you normally declare the column UNSIGNED as well. For example, you can declare an AUTO_INCREMENT column in any of the following ways:

CREATE TABLE ai (i INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY); 
CREATE TABLE ai (i INT UNSIGNED AUTO_INCREMENT NOT NULL, PRIMARY KEY (i));
CREATE TABLE ai (i INT UNSIGNED AUTO_INCREMENT NOT NULL, UNIQUE (i));

It is always allowable to declare an AUTO_INCREMENT column explicitly NOT NULL, as shown. However, for versions 3.23 and later, MySQL treats AUTO_INCREMENT columns as NOT NULL automatically.

The behavior of AUTO_INCREMENT columns is discussed further in the "Working with Sequences" section later in this chapter.

Following the attributes just described, which are specific to numeric columns, you can also specify the general attributes NULL or NOT NULL. If you do not specify NULL or NOT NULL for a numeric column, the default is NULL. You can also specify a default value using the DEFAULT attribute. If you do not specify a default value, one is chosen automatically. For all numeric column types, the default is NULL for columns that may contain NULL, and 0 otherwise.

The following table contains three INT columns, having default values of -1, 1, and NULL:

CREATE TABLE t 
(
    i1 INT DEFAULT -1,
    i2 INT DEFAULT 1,
    i3 INT DEFAULT NULL
); 

String Column Types

MySQL provides several string types to hold character data. Strings are often used for values like the following:

'N. Bertram, et al.' 
'Pencils (no. 2 lead)'
'123 Elm St.'
'Monograph Series IX'

But strings are actually "generic" types in a sense because you can use them to represent any value. For example, you can use string types to hold binary data, such as images or sounds, or output from gzip, should you want to store compressed data.

Table 2.9 shows the types provided by MySQL for declaring string-valued columns and the maximum size and storage requirements of each type. For variable-length column types, the amount of storage taken by a value varies from row to row and depends on the length of the values actually stored in the column. This length is represented by L in the table.

The extra bytes required in addition to L are the number of bytes needed to store the length of the value. MySQL handles variable-length values by storing both the content of the value and its length. These extra bytes are treated as an unsigned integer. Notice the correspondence between a variable-length type's maximum length, the number of extra bytes required for that type, and the range of the unsigned integer type that uses the same number of bytes. For example, MEDIUMBLOB values can be up to 224?1 bytes long and require 3 bytes to record the result. The 3-byte integer type MEDIUMINT has a maximum unsigned value of 224?1. That's not a coincidence.

Table 2.9. String Column Types
Type Specification Maximum Size Storage Required
CHAR[(M)] M bytes M bytes
VARCHAR(M) M bytes L+1 bytes
TINYBLOB, TINYTEXT 28 ? 1 bytes L+1 bytes
BLOB, TEXT 216 ? 1 bytes L+2 bytes
MEDIUMBLOB, MEDIUMTEXT 224 ? 1 bytes L+3 bytes
LONGBLOB, LONGTEXT 232 ? 1 bytes L+4 bytes
ENUM('value1','value2',...) 65535 members 1 or 2 bytes
SET('value1','value2',...) 64 members 1, 2, 3, 4, or 8 bytes

For ENUM and SET, the column definition includes a list of legal values. Attempting to store a value other than those causes the value to be converted to '' (the empty string). For the other string types, values that are too long are chopped to fit. But string types range from very small to very large, with the largest type able to hold nearly 4GB of data, so you should be able to find something long enough to avoid truncation of your information.[2]

[2] The effective maximum column size is actually imposed by the maximum packet size of the client/server communication protocol. This value is 16MB prior to MySQL 4, and 1GB for MySQL 4 and later.

ENUM and SET values are stored internally as numbers, as detailed later in the "ENUM and SET Column Types" section. Values for the other string types are stored as a sequence of bytes and treated either as bytes or characters, depending on whether the type holds binary or non-binary strings:

  • A binary string is treated as a generic sequence of bytes, without respect to any character set. BLOB columns hold binary values, as do CHAR and VARCHAR columns if they are declared with the BINARY attribute.

  • A non-binary string is treated as a sequence of characters and interpreted with respect to the properties of a particular character set. TEXT columns hold non-binary strings, as do CHAR and VARCHAR columns if they are declared without the BINARY attribute. For a single-byte character set, each character takes one byte. For multi-byte character sets, characters can take more than one byte. In MySQL 4.1 and later, columns can be assigned character sets individually. Prior to MySQL 4.1, the server's default character set is used to interpret character strings.

Use of a character set causes non-binary strings to be compared and sorted using the character set's collating sequence. By contrast, a binary string has no character set and thus no collating sequence. This results in some differences in the way binary and non-binary strings are interpreted:

  • Binary strings are processed byte-by-byte in comparisons based only on the underlying numeric value of each byte. One implication of this property is that binary values are case sensitive, because the lowercase and uppercase versions of a given letter have different numeric codes.

  • Non-binary strings are processed character-by-character in comparisons using the character set collating sequence. For most character sets, uppercase and lowercase versions of a given letter have the same collating value, which means that non-binary string comparisons are not case sensitive. Similar characters with different accents also may have the same collating value. For example, 'E' and 'É' compare as the same character in the latin1 character set.

There are a few character sets that do treat uppercase and lowercase as having different collating values and that distinguish between accent marks: cp1521csas, cp1527ltlvcsas, latin1csas, maccecsas, and macromancsas. Note that these character set names each end with csas, which means "case sensitive, accent sensitive." They're something of a special case, so although elsewhere in this book I discuss non-binary strings as not case sensitive, keep in mind that these character sets exist as exceptions to the rule.

The distinction between characters and bytes can be seen easily by considering the length of a string containing multi-byte characters. For example, in MySQL 4.1 and later, you can use the CONVERT() function to generate a string in any available character set. The following statement creates @s as a string using ucs2, a character set that uses two bytes to encode each character:

mysql> SET @s = CONVERT('ABC' USING ucs2); 

What is the "length" of the string @s? It depends. If you measure with CHAR_LENGTH(), which is multi-byte aware, you get the length in characters. If you measure with LENGTH(), which is not multi-byte aware, you get the length in bytes:

mysql> SELECT CHAR_LENGTH(@s), LENGTH(@s); 
+-----------------+------------+
| CHAR_LENGTH(@s) | LENGTH(@s) |
+-----------------+------------+
|               3 |          6 |
+-----------------+------------+

A binary string has no character set and is treated simply as a sequence of individual bytes. Consequently, the length of the string is the same whether measured in characters or bytes:

mysql> SET @s = BINARY CONVERT('ABC' USING ucs2); 
mysql> SELECT CHAR_LENGTH(BINARY @s), LENGTH(BINARY @s);
+------------------------+-------------------+
| CHAR_LENGTH(BINARY @s) | LENGTH(BINARY @s) |
+------------------------+-------------------+
|                      6 |                 6 |
+------------------------+-------------------+

The difference between lengths in characters and in bytes is significant for interpreting the meaning of string column types. For example, a column declaration of VARCHAR(20) doesn't really mean "20 characters maximum," it means "as many characters as will fit in 20 bytes." For single-byte character sets, the two are the same because the number of characters is the same as the number of bytes. But, for a multi-byte character set, the number of characters can be many less than 20.

The CHAR and VARCHAR Column Types

CHAR and VARCHAR are the most commonly used string types. The difference between them is that CHAR is a fixed-length type and VARCHAR is a variable-length type. Values in a CHAR(M) column each take M bytes; shorter values are right-padded with spaces when they are stored. (Trailing spaces are stripped off on retrieval, however.) Values in a VARCHAR(M) column are stored using only as many bytes as necessary, plus one byte to record the length. Trailing spaces are stripped from VARCHAR values when they are stored; this differs from the ANSI SQL standard for VARCHAR values. (A VARCHAR type for which trailing spaces are not stripped may be introduced in a future version of MySQL.)

CHAR and VARCHAR columns can be declared with a maximum length M from 1 to 255. M is optional for CHAR and defaults to 1 if missing. Beginning with MySQL 3.23, CHAR(0) is also legal. CHAR(0) is useful as a placeholder when you want to declare a column but don't want to allocate space for it if you're not sure yet how wide to make it. You can use ALTER TABLE to widen the column later. A CHAR(0) column can also be used to represent on/off values if you allow it to be NULL. Values in such a column can have two values?NULL or the empty string. A CHAR(0) column takes very little storage space in the table?only a single bit. As of MySQL 4.0.2, VARCHAR(0) is allowable as well, but it's treated as CHAR(0).

Keep two general principles in mind when choosing between CHAR and VARCHAR column types:

  • If your values are all the same length, VARCHAR actually will use more space due to the extra byte required to record the length of values. On the other hand, if your values vary in length, VARCHAR columns have the advantage of taking less space. A CHAR(n) column always takes n bytes, even if it is empty or NULL.

  • If your values don't vary much in length, CHAR is a better choice than VARCHAR if you're using MyISAM or ISAM tables. For such table types, tables with fixed-length rows can be processed more efficiently than tables with variable-length rows.

With a few limited exceptions, you cannot mix CHAR and VARCHAR within the same table. MySQL will even change columns from one type to another, depending on the circumstances. (This is something that other databases do not do.) The principles that apply are as follows:

  • Table rows are fixed-length only if all the columns in the table are fixed-length types.

  • If even a single column has a variable length, table rows become variable-length as well.

  • If table rows are variable-length, any fixed-length columns in the column may as well be converted to variable-length equivalents when that will save space.

What this means is that if you have VARCHAR, BLOB, or TEXT columns in a table, you cannot also have CHAR columns; MySQL silently converts them to VARCHAR. Suppose you create a table as follows:

CREATE TABLE mytbl 
(
    c1 CHAR(10),
    c2 VARCHAR(10)
);

If you issue a DESCRIBE query, the output is as follows:

mysql> DESCRIBE mytbl; 
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| c1    | varchar(10) | YES  |     | NULL    |       |
| c2    | varchar(10) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+

Notice that the presence of the VARCHAR column causes MySQL to convert c1 to VARCHAR as well. If you try using ALTER TABLE to convert c1 to CHAR, it won't work. The only way to convert a VARCHAR column to CHAR is to convert all VARCHAR columns in the table at the same time:

mysql> ALTER TABLE mytbl MODIFY c1 CHAR(10), MODIFY c2 CHAR(10); 
mysql> DESCRIBE mytbl;
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| c1    | char(10) | YES  |     | NULL    |       |
| c2    | char(10) | YES  |     | NULL    |       |
+-------+----------+------+-----+---------+-------+

The BLOB and TEXT column types are variable-length like VARCHAR, but they have no fixed-length equivalent, so you cannot use CHAR columns in the same table as BLOB or TEXT columns. Any CHAR column will be converted to VARCHAR.

The exception to non-mixing of fixed- and variable-length columns is that CHAR columns shorter than four characters are not converted to VARCHAR. For example, MySQL will not change the CHAR column in the following table to VARCHAR:

CREATE TABLE mytbl 
(
    c1 CHAR(2),
    c2 VARCHAR(10)
);

Y