SQL Server Memory Management

By default, SQL Server 2000 dynamically allocates and deallocates system memory as needed to balance the needs of other applications running on the machine and the needs of its own internal components. As in versions prior to SQL Server 7.0, you do have the option of configuring SQL Server to use a fixed amount of memory; however, unless you have identified a specific problem that requires allocating a fixed amount of memory to SQL Server, it is best to let SQL Server dynamically manage memory.

Regardless of whether the memory is allocated dynamically or is fixed, the total memory space for each SQL Server instance on a system is considered a single unified cache and is managed as a collection of various memory pools. Each memory pool has its own policies and purposes, and memory can be requested by and granted to any of several internal components.

The Buffer Manager and Memory Pools

The main memory component in SQL Server is the buffer pool. Any memory not being used by another memory-related component remains in the buffer pool. The buffer manager handles bringing data and index pages into the buffer pool so that data can be accessed and shared among users. When other components require memory, they request a buffer from the buffer pool. The memory buffers are pages in memory that are the same size (8KB) as a data or index page.

Unlike pre-7.0 versions of SQL Server, there is no fixed, separate procedure cache area. The procedure cache is simply another memory pool in which query trees and plans from stored procedures, triggers, user-defined functions, or ad hoc queries can reside while executing. The size of the procedure cache grows and shrinks as needed (for more information on procedure cache and how it's used, see Chapter 28, "Creating and Managing Stored Procedures in SQL Server," and Chapter 35, "Understanding Query Optimization").

In addition, other pools are used by memory-intensive queries that use sorting or hashing, and by special memory objects that need less than one 8KB page.

The Buffer Manager

Processes in SQL Server can only access data and index pages in memory. Direct access to pages on disk is not allowed. The buffer manager is the process that manages and controls access to the "in memory" versions of the physical disk pages. When a process needs to access a data or index page, the page must first exist in memory in the buffer pool. If the requested page isn't in memory, a physical I/O has to be performed to retrieve the record and then the process can access the in-memory copy of the page. Because physical I/Os are more expensive and time consuming to perform, you want to have a system that performs the fewest physical I/Os as possible. One solution is to provide more memory to SQL Server. The more pages that can reside in memory, and stay there, the more likely a page will be found in memory when requested.

Accessing Memory Buffers

A database appears to a SQL Server process as a simple sequence of numbered pages. The pages in the database are uniquely identified by the database ID, file number, and page number. When a process needs to access a page, it submits a request to the buffer manager, specifying the unique database ID, file number, and page number identifier.

To provide fast access to data in memory and avoid memory scans for data pages, pages in the buffer pool are hashed for fast access. Hashing is a technique that uniformly maps a key via a hash function across a set of hash buckets. A hash bucket is a structure in memory that contains an array of pointers (implemented as a linked list) to the buffer pages. A linked list chain of hash pages is built when the pointers to buffer pages do not fit on a single hash page. The number of hash buckets used is determined dynamically by SQL Server depending on the total size of the buffer pool.

In SQL Server, the hash values are generated from a combination of the database ID, file number, and page number. The hash function generates the hash key and stores it in a hash bucket. Essentially, the hash bucket serves as an index to the specific page needed. Hashing allows SQL Server to find a specific data page in memory quickly and with only a few memory reads, even in large memory environments. Use of a hashing algorithm also allows SQL Server to quickly determine whether a requested page is not in memory and has to be read in from disk, without having to perform an exhaustive scan of the data cache.

If the requested page is found in memory, the buffer manager returns a pointer to the memory buffer holding the page to the process that requested it. If the page is not in memory, the buffer manager must first initiate a disk I/O request to bring the page into memory. After the page has been read into a memory buffer, the pointer to that buffer is returned to the requesting process.

If the process that requested the page ends up modifying any information on the page, the page is marked as "dirty" and the buffer manager is notified when the process is finished making updates to the page. The buffer manager is then responsible for making sure that the changes to the page are written out to disk in a way that coordinates with logging and transaction management. This is handled by the checkpoint and lazywriter processes.

The Checkpoint Process

Checkpoints help minimize the amount of work that SQL Server must do when databases are recovered during system startup. Checkpoints flush modified data pages from memory out to disk so that those changes don't have to be redone during database recovery (for more details on the checkpoint process and how it affects recovery, see Chapter 31, "Transaction Management and the Transaction Log").

The checkpoint process takes care of performing automated checkpoints that occur based on the setting of the recovery interval. When recovery time is estimated to exceed the configured recovery interval, the checkpoint process issues a checkpoint for the database. The method SQL Server uses to determine whether a checkpoint is necessary is to examine how large the log has grown since the last checkpoint. SQL Server assumes that 10MB of log equates to 1 minute of recovery time. Therefore, if the recovery interval is set to 5 minutes, a checkpoint is triggered when the log grows by 50MB since the last checkpoint. The default recovery interval in SQL Server 2000 is 0, which means that SQL Server dynamically chooses the appropriate recovery interval depending on the amount of database activity.

SQL Server limits checkpoint operations to a maximum of 100 concurrent write operations in an effort to limit the amount of resources checkpoints consume. Also, to optimize checkpoints and to make sure that checkpoints don't perform more work than necessary, the checkpoint process keeps track of which pages it has written out to disk already by setting the generation flag for each buffer in the cache. Without this flag to help keep track of the pages already written, checkpoint operations could potentially write the same page to disk multiple times. When the checkpoint is initiated, all generation flags are turned off. As the checkpoint process checks a page, it toggles the generation bit to the opposite value. If it happens to revisit a page whose generation bit has already been toggled, SQL Server skips writing that page as the page might have already been written due to its proximity to other pages that were already written. Again, to minimize overhead, SQL Server attempts to gather physically contiguous pages together that need to be written and write them out once to avoid excessive I/O.

If you want to find out how often checkpoints are occurring, and possibly adjust the recovery interval setting, start up SQL Server with trace flag 3502. This trace flag writes information to SQL Server's error log every time a checkpoint occurs.

The Lazywriter Process

Before a requested page can be brought into memory, a free buffer must be available in the buffer pool. If no free buffers are available in the free buffer list, an existing buffer needs to be reused. When an existing buffer has to be reused, many buffer pages might have to be searched simply to locate a buffer to free up for use. If the buffer found is marked as dirty?in other words, it contains changes that haven't been written out to disk first?the buffer manager must first write the changes out before the page can be reused and assigned to the requesting process. The requesting process would have to wait for this process to complete. To provide optimal performance, it is crucial that SQL Server keep a supply of buffers available for immediate use.

The buffer pool is managed by the lazywriter process, which uses a clock algorithm to sweep through the buffer pool looking for pages to be placed in the free buffer pool. The lazywriter thread maintains a pointer into the buffer pool that "sweeps" sequentially through it (like the hand on a clock). As the lazywriter visits each buffer, it determines whether that buffer has been referenced since the last sweep by examining a reference count value stored in the buffer header. If the reference count is greater than 0, the buffer remains in the pool, but its reference count is decremented. If the reference count reaches 0, the buffer is made available for reuse?if the page is still dirty, it is first written out to disk before being removed from the hash lists and put onto the free list.


The lazywriter is so named because it is not an aggressive process. It runs as a low-priority process, working only during periods of low disk I/O. This is done to minimize the impact of the lazywriter operations on other SQL Server processes.

To keep frequently referenced pages in memory, SQL Server increments the reference count of a buffer each time the buffer is accessed by any process. For data or index pages, the reference count is incrmented by 1 each time. For objects that are expensive to create, such as stored procedure plans, a higher reference count is set initially that better reflects their "replacement cost."

The lazywriter applies a bit of a weighting factor to pages in memory to keep the more expensive pages in the buffer. When sweeping through checking the reference counts, instead of simply decrementing the reference count by 1, it divides the reference count by 4. This provides a means for the more frequently referenced pages (those with a high reference count) and those with a high replacement cost to remain in cache longer because it takes longer for the reference count to reach 0. This keeps the pages that are likely to be reused often in the buffer pool longer.

SQL Server 2000 dynamically estimates the number of free pages that should be available based on the load on the system and the number of stalls occurring. A stall is what occurs when a process needs a free page, and none are available. The process has to be put into a sleep status until the lazywriter can free up some pages. The minimum size of the buffer free list is computed dynamically as a percentage of the overall buffer pool size but is always between 128KB and 4MB.

The lazywriter process monitors the free buffer list and when the number of pages on the free list falls below its minimum size, it kicks off the process to begin sweeping the buffer cache. If a lot of stalls occur, SQL Server increases the minimum size of the free list. If the load on the system is light and few stalls occur, the minimum free list size can be reduced, and the excess pages can be used for hashing additional data and index pages or query plans. The Performance Monitor has counters that let you examine not only the number of free pages but also the number of stalls occurring (for more information on monitoring SQL Server with Performance Monitor, see Chapter 37, "Monitoring SQL Server Performance").

All user threads in SQL Server also perform a function similar to that of the lazywriter process. This happens when a user process needs to read a page from disk into a buffer. After the read has been initiated, the client process worker thread checks to see whether the free list is too small, and if it is, the user thread itself performs the same function as the lazywriter, searching for buffers that can be freed. Having user worker threads share in the work of the lazywriter helps better distribute the cost across all the CPUs in an SMP environment. In a multiple CPU environment, SQL Server 2000 maintains a separate free list for each CPU. When a user thread needs a free page, it first checks the free list for the CPU it is currently running on. Only when no pages are available in its own free list will the user thread check the free lists of other CPUs.

Keeping Pages in the Cache Permanently

If you have critical tables you want to keep in memory, such as code lookup tables, you can mark these tables so that their pages are never put on the free list and are therefore kept in memory indefinitely. This process is referred to as pinning a table. When a table is pinned, none of the pages associated with that table are marked as free, and thus are never reused until the table is unpinned.

You use the sp_tableoption stored procedure to pin or unpin a table by specyifying TRUE or FALSE for the pintable option. Be careful not to pin too many tables to the cache as you can end up starving the buffer cache of available pages for other processes. In general, only pin tables after you've carefully tuned your system, when you have plenty of memory available, and you have a good feel for which tables are the heavy hitters.

Remember, if pages are accessed frequently, there might be no need to pin the table. Because only pages that have a reference count of 0 are placed in the free list, pages accessed repeatedly should rarely, if ever, end up on the free list, which is as it should be.

Large Memory Support

One of the first solutions to database performance problems is to throw more memory at the system (primarily because adding memory is much cheaper than replacing the CPU or disk subsystem, or even cheaper than hiring a consultant to come in and tune the system correctly). The reasoning behind adding more memory is, of course, to reduce the need for physical I/O by increasing your cache-hit ratio.

Keep in mind that some systems will not benefit from more memory. For example, if your database is 500MB in size and you already have 1GB of RAM installed, you could cache the entire database and still have memory left over. Additionally, typically only a relatively small portion of most databases is accessed frequently and repeatedly. A memory configuration that's only a small percentage of the entire database size can often still yield a high cache-hit ratio. More memory will not help out for the rarely run report that performs a large, single pass, table scan. Because the data is accessed infrequently, it's not likely to stay in cache, and the first read of a page will still have to come in from disk. Only subsequent queries against the table could experience a benefit.

When running the Enterprise Edition of SQL Server 2000 on either the Windows 2000 Advanced Server or Windows 2000 Datacenter Server platforms, you can allocate more than the default maximum of 4GB of memory by enabling the Windows 2000 Address Windowing Extensions (AWE) API. When this option is enabled, a SQL Server instance can then access up to 8GB of physical memory on Advanced Server and up to 64GB on Datacenter Servers. Although standard 32-bit addressing supports up to only 4GB of physical memory, the AWE API allows the additional memory to be acquired as nonpaged memory. The memory manager can then dynamically map views of the nonpaged memory into the 32-bit address space. You must be careful when using this extension because nonpaged memory cannot be swapped out. SQL Server allocates the entire chunk requested and does not release it back to the operating system until SQL Server is shut down. Other applications or other instances of SQL Server running on the same machine might not be able to get the memory they need.

Enabling AWE for the Enterprise Edition of SQL Server involves three steps. First, you must ensure that the account that starts your SQL Server instance has the Windows policy right to lock pages in memory. The user account that you specify to start the SQL Server service during installation will have the page-locking permission granted to it automatically. However, if you have changed the SQL Server startup account since installation, you need to verify that the page-locking permission has been assigned to that account.

The second step is to set the AWE-enabled option to 1 with sp_configure:

EXEC sp_configure 'awe enabled', '1' 

Finally, you also have to configure Windows 2000 to address physical memory above 4GB by adding the /pae switch to the Windows 2000 startup command in the boot.ini file.


When using AWE with SQL Server 2000, SQL Server can no longer dynamically allocate RAM. By default, it will grab all available memory, leaving only 128MB available for Windows and other applications. You also need to configure the max server memory option to limit the amount of memory that SQL Server allocates. Be sure to leave enough memory for Windows and any other applications running on the server, usually at least 500MB.

The Log Manager

As described in Chapter 31, the buffer manager uses a "write-ahead" strategy when writing changes to the transaction log?log records for a transaction are always written to disk before the modified data pages are written. Write-ahead logging ensures that all databases can be recovered to a consistent state. Because processes must wait for acknowledgment that a transaction has been physicially written to disk in the transaction log before it can consider it committed, all writes to the transaction log are performed synchronously. Writes to data pages can be made asynchronously, without waiting for acknowledgment, because if a failure occurs writing the data, the transactions can still be rolled back or committed from the information in the transaction log.

The log manager in SQL Server 2000 maintains contiguous regions of memory, called log caches, that are separate from the buffer pools used to store data and index pages. Within these regions of memory, the log manager formats and manages the log records before they are written out to disk. Two or more of these memory regions are used to help achieve maximum throughput?while one log cache is being written out to disk, the other log cache can still be receiving new records. The log manager also maintains two log queues: a flushQueue, which contains full log caches waiting to be written to disk, and a freeQueue, which contains log caches that have been flushed and can be reused.

All transaction records are first written to an available log cache. When the transaction commits, the log cache is placed into the flushQueue (if it isn't already there), to be written out to disk. The user process then goes into a wait queue and does not perform further processing until its log cache records have been flushed to disk.

The flushQueue is monitored by the log writer, a dedicated thread that reads the flushQueue in sequence and flushes the log caches out to disk one at a time. When the flush for a particular log cache is completed, any processes waiting on that log cache are woken up and can resume work.

    Part III: SQL Server Administration
    Part IV: Transact-SQL
    Part V: SQL Server Internals and Performance Tuning
    Part VI: Additional SQL Server Features