2.2 Tables

Tables are the fundamental core of a relational database. Relational theory describes tables as abstract objects, ascribing no significance to the order of the rows or the columns that make up a table. However, tables also exist in physical form on disk in your database server, with physical ordering that affects performance. When an application queries for those physical bytes stored on disk or cached in memory, the server processes must have some way to reach them.

The physical layout of table rows affects the performance of reads of those rows, so it is important to understand the types of tables and how they affect the layout. Figure 2-2 shows four different physical tables, illustrating four basic patterns of table growth and aging, and shows how these patterns affect data layouts.

Figure 2-2. Physical table growth and aging

Tables occupy one or more contiguous areas of disk space (called extents on Oracle^[1]) that the server can read with minimal read-head movement and maximal efficiency. The database organizes table rows in blocks, which are too small to show here, usually 2KB-16KB in size. These blocks are constant-sized across most or all of the database (depending on the vendor). The blocks are the smallest units that the database reads from disk and caches, as discussed earlier. As formerly empty blocks within an extent become occupied, the high-water markthe highest point of the table that the database needs to scanrises toward the top of the extent until, reaching the top, it triggers allocation of a new extent.^[2] Above the high-water mark is space reserved for future writes, but not space that the database will ever touch for a read. The high-water mark does not move downward unless you rebuild or truncate the table. Figure 2-2 illustrates growth patterns described in the following sections.

^[1] Technically, an extent is contiguous according to virtual disk addressing, which is all the database software knows about. At a lower level, a RAID or other disk-striping/mirroring system can translate these contiguous virtual addresses to blocks on different disks, but you can safely ignore this subtlety when tuning SQL.

^[2] This most closely describes Oracle, but the differences between vendors, in this area, are not important to SQL tuning.

2.2.1 Continuous Growth

The continuous growth pattern, shown for T1 in Figure 2-2, is the most common pattern among transaction tables, which continually acquire new rows but almost never lose old rows. It is often regrettable that old rows stay around long after they have outlived their usefulness, but deciding what is truly safe to purge is hard (and scary) work, even ignoring the effort of writing the routines to do the work. Somehow, this work always ends up at the end of the priority list for a product's features (and who needs it in the initial release of a product?), much to the delight of disk vendors.

In continuous growth tables, the level of interest in rows tends to drop with their age, making the newest rows, conveniently stored together at the top of the table and most likely to be queried, best to cache. When the newest rows are the hottest, the natural clustering of new rows makes optimum use of the cache, and even a very large table can see a good cache-hit ratio (the fraction of logical I/Os that avoid physical I/Os) if you use indexed access that avoids the older rows.

A query that touches all of a continuous growth table (up to the high-water mark, that is), then discards all but some of the newest rows, will look good when the table is new and tiny. However, whole-table-access runtime grows linearly, assuming a constant table-growth rate, and will likely become intolerable. An access path that touches only new rows will maintain roughly constant efficiency, given steady table growth, since you will have a roughly constant number of rows created in the last week or so.

2.2.2 Purge Eldest

The purge eldest pattern, shown for T2 in Figure 2-2, I call the Ouroboros pattern, after the mythical snake that continually eats its own tail. In this table, the oldest rows are periodically purged (all of the oldest rows, not just some subset), completely emptying their blocks and making them available for inserts of the newest rows. The high-water mark need not move once this table reaches mature size, assuming you delete rows (once you begin deletes) at the same rate you insert them. The head of this snake (holding the newest rows) is always chasing the tail (holding the oldest rows), which retreats with every fresh delete. From the point of view of keeping the newest rows physically close together, this pattern is as good as the continuous growth pattern, with the added advantage that, since table growth halts once purging begins, the whole table has a better chance to be well cached. Note that this is an idealized case that is rarely seen, since retention of a few of the oldest rows, or a growth rate that exceeds the purge rate, will tend to gradually mix old and new rows ever more thoroughly.

2.2.3 Purge, Not by Age

The purge, not by age pattern, shown for T3 in Figure 2-2, reflects deletes that are not age-driven. Blocks become available for fresh inserts as soon as they have empty space that exceeds some threshold (typically 60% empty on Oracle), staying eligible until empty space falls below another threshold (typically 10% empty on Oracle). This happens to blocks that are scattered fairly randomly throughout the table, so new rows scatter accordingly, and caching becomes harder with time both as a result of average blocks being emptier and as a result of interesting rows being more spread out. However, such a pattern of purging tends to imply that the level of interest in rows is not age-related, tending to make such a table hard to cache regardless of whether purging scatters new rows.

2.2.4 Complete Purge and Regrowth

The complete purge and regrowth pattern, shown for T4 in Figure 2-2, reflects a wholesale purge with the beginnings of regrowth. In Figure 2-2, the entire table contents were recently deleted and the oldest rows shown are not that old, since the table has a long way to grow before it reaches its old high-water mark. This pattern is similar to the continuous growth pattern shown in T1, except that, since the table was formerly large and has not been rebuilt, the high-water mark has not fallen back, forcing full table scans to do just as much physical I/O as before the purge.

Oracle's TRUNCATE command, as an alternative to DELETE, can drop the high-water mark. On any database, you can also lower the high-water mark by dropping and rebuilding the table.