The Buffer Pools

The database buffer pool area is a piece of memory used to cache a table's index and data pages as they are being read from disk to be scanned or modified. The buffer pool area helps to improve database system performance by allowing data to be accessed from memory instead of from disk. Because memory access is much faster than disk access, the less often that DB2 needs to read from or write to a disk, the better the system will perform.

When a database is created, there will be one default buffer pool created for the database. This buffer pool is named IBMDEFAULTBP; it has a page size of 4 KB; and it will be sized depending on the operating system. For Windows, the default buffer pool will be 250 pages or 1 MB, whereas for UNIX, the default buffer pool will be 1,000 pages or 4 MB. The default buffer pool cannot be dropped; however, the size of the default buffer pool can be changed, using the ALTER BUFFERPOOL statement.

A database's buffer pool(s) are allocated in memory when the database is activated or when the first application connects to the database. When all applications disconnect from the database and it is deactivated, the buffer pool memory is deallocated and freed back to the operating system. With DB2 UDB Version 8, buffer pools can now be created, dropped, and resized while the database is active.

Buffer Pool Usage

When a DB2 agent acting on behalf of an application retrieves a row in a table for the first time, DB2 UDB must first read the page containing that row from disk and place it in the buffer pool. The next time any application requests data, DB2 UDB will look for it in the buffer pool. If the requested data is in the buffer pool, it can be retrieved without disk access, resulting in faster performance.

Once a page is read into the buffer pool, it will remain in the buffer pool until the database is stopped or until the space used by the page in the buffer pool is needed to read another page into the buffer pool. When reading a page into the buffer pool, it will be placed into an empty buffer pool page if one exists. If the buffer pool is full (i.e., there are no empty pages available), DB2 will choose a victim page, based on the following:

  • How recently the page was last accessed

  • The probability that the page will be read again by the last agent that accessed it

  • The type of data on the page (i.e., index or data)

  • Whether the page was changed in the buffer pool but has not yet been written to disk

To guarantee the integrity of the data and transactions, changed pages must first be written to disk before they can be replaced in the buffer pool. Because the transaction must wait for the page to be written to disk, this can slow down the performance of the system. To overcome this condition and improve performance, DB2 UDB uses buffer pool cleaners periodically to scan the buffer pool for dirty pages (i.e., pages that have changed since being read into the buffer pool) and asynchronously write them to disk. To optimize the access to the data on these pages, they will not be removed from the buffer pool after they have been written to disk; however, they will no longer be dirty, so they can be replaced in the buffer pool without the need to write them to disk.

Reading pages from disk is an expensive, time-consuming operation. Reading several consecutive pages into the buffer pool using a single I/O operation can greatly reduce this overhead and allow applications to process their work faster. In addition, multiple parallel I/O operations to read several ranges of pages into the buffer pool can help reduce I/O wait time. In DB2 UDB this is accomplished using I/O prefetchers to read pages from disk into the buffer pool in the expectation that they will be required by an application. Prefetching of index and data pages into the buffer pool can help improve performance by reducing the I/O wait time and further taking advantage of parallel I/O operations.

NOTE

Prefetching and buffer pool cleaning will be discussed in more detail later in this chapter.


Creating Buffer Pools

The CREATE BUFFERPOOL statement will create a buffer pool for the database that the user is connected to. In previous versions of DB2 UDB, the buffer pool would not be allocated until the database was stopped and restarted (i.e., all current applications were disconnected from the database and another connection was established). With DB2 UDB Version 8, the buffer pool can be allocated immediately, if there is enough memory available to handle the request.

The CREATE BUFFERPOOL statement has options to specify the following:

  • bufferpool-name: Specifies the name of the buffer pool. The name cannot be used for any other buffer pools and cannot begin with the characters SYS or IBM.

  • IMMEDIATE: Specifies that the buffer pool will be created immediately if there is enough memory available on the system. If there is not enough reserved space in the database shared memory to allocate the new buffer pool, a warning is returned, and buffer pool creation will be DEFERRED, as described below.

    • This is the default.

  • DEFERRED: Specifies that the buffer pool will be created the next time the database is stopped and restarted.

  • ALL DBPARTITIONNUMS: Specifies that the buffer pool will be created on all partitions in the database. This is the default if no database partition group is specified.

  • DATABASE PARTITION GROUP: Specifies the database partition group(s) on which the buffer pool will be created. The buffer pool will be created on all database partitions that are part of the specified database partition groups.

  • SIZE: Specifies the size of the buffer pool and is defined in number of pages. In a partitioned database, this will be the default size for all database partitions where the buffer pool exists.

    • The EXCEPT ON DBPARTITIONNUMS clause described below allows the buffer pool to have different sizes on the different database partitions.

  • EXCEPT ON DBPARTITIONNUMS: Specifies the database partition or partitions where the buffer pool will be sized differently than specified by the SIZES clause. This can be specified as a single partition, a list of partitions, or a range of partitions. To specify a list of partitions, separate the database partitions in the list by commas. To specify a range of partitions, use the TO clause between the partition numbers. When specifying a range of partitions, the second database partition number must be higher than the first database partition number, and the buffer pool must exist on all of the database partitions. After the partitions or partitions have been specified, their size must also be specified.

  • NUMBLOCKPAGES: Specifies the number of pages to be created in the block-based area of the buffer pool. The actual value of NUMBLOCKPAGES may differ from what was specified because the size must be a multiple of the BLOCKSIZE. The block-based area of the buffer pool cannot be more than 98% of the total buffer pool size. Specifying a value of 0 will disable block I/O for the buffer pool.

  • BLOCKSIZE: Specifies the number of pages within a given block in the block-based area of the buffer pool. The block size must be between two and 256 pages, and the default value is 32 pages.

  • PAGESIZE: Specifies the page size used for the buffer pool. The default page size is 4 KB, or 4096 bytes. The page size can be specified in both bytes and KB.

  • EXTENDED STORAGE/NOT EXTENDED STORAGE: Specifies whether buffer pool victim pages will be copied to a secondary cache called extended storage. Extended storage is more efficient than retrieving data from disk but less efficient than retrieving data from the buffer pool, so it is not applicable to 64-bit environments.

NOTE

Once defined for a buffer pool, the page size and buffer pool name cannot be altered.


Enabling block-based I/O by setting NUMBLOCKPAGES to a value greater than 0 can help performance for applications that do sequential prefetching. If the applications do not perform sequential prefetching of pages into the buffer pool, this area of the buffer pool will be wasted.

NOTE

NUMBLOCKPAGES does not need to be set to allow prefetching to occur.


Block-based buffer pools have the following limitations:

  • A buffer pool cannot be made block-based and also use extended storage.

  • A buffer pool cannot be made block-based and also support Address Windowing Extensions (AWE) on Windows.

  • If a buffer pool is enabled for AWE support and block-based I/O is enabled, the block-based I/O support will be disabled.

The following statement:

CREATE BUFFERPOOL BP1
SIZE 25000

will create a buffer pool named BP1 with a size of 100 MB (25,000 4-KB pages = 100 MB). Because the page size is not specified, the buffer pool will use the default page size of 4 KB. Because the IMMEDIATE option is the default, the buffer pool will be allocated immediately and available for use, as long as there is enough memory available to fulfill the request.

The following statement:

CREATE BUFFERPOOL BP2
SIZE 25000
PAGESIZE 8 K

will create a buffer pool named BP2 with a size of 200 MB (25,000 8-KB pages = 200 MB). The buffer pool will use an 8-KB page size. Because the IMMEDIATE option is the default, the buffer pool will be allocated immediately and available for use, as long as there is enough memory available to fulfill the request.

The following statement:

CREATE BUFFERPOOL BP3
DEFERRED
SIZE 1000000

will create a buffer pool named BP3 with a size of 4 GB (1,000,000 4-KB pages = 4 GB). Because the page size is not specified, the buffer pool will use the default page size of 4 KB. Because the DEFERRED option is specified, the buffer pool will not be allocated until the database is stopped and restarted.

The following statement:

CREATE BUFFERPOOL BP4
IMMEDIATE
SIZE 100000
NUMBLOCKPAGES 32000
BLOCKSIZE 256

will create a buffer pool named BP4 with a size of 400 MB (100,000 4-KB pages = 400 MB). Because the page size is not specified, the buffer pool will use the default page size of 4 KB. The buffer pool will have 128 MB set aside for block-based I/O to help optimize sequential prefetch activity. Because the IMMEDIATE option is specified, the buffer pool will be allocated immediately and available for use, as long as there is enough memory available to fulfill the request. Because the block size is set to 256 pages, it is recommended that the extent size for the table spaces assigned to this buffer pool also be set to 256 pages.

NOTE

This may not always be possible, because several table spaces with different extent sizes may be bound to the same block-based buffer pool.


The proportions of the page area and block area are not maintained. The block-based area of the buffer pool will be altered only if:

  • NUMBLOCKPAGES is specified in the ALTER BUFFERPOOL statement

  • The buffer pool SIZE is altered to something that would cause NUMBLOCKPAGES to be greater than 98% of SIZE.

NOTE

Altering the buffer pool size to a value less than the NUMBLOCKPAGES is not allowed, unless the BUFFPAGE parameter is used.


For example. Consider the buffer pool BP1, defined as:

create bufferpool BP1 size 9600 numblockpages 4800 blocksize 32

This gives a page-based area of 4,800 pages and a block-based area of 4,800 pages.

The statement:

alter bufferpool BP1 size 6400

would change the buffer pool to have a page-based area of 1,600 pages and a block-based area of 4,800 pages.

In the following statement, the specified SIZE is less than the current NUMBLOCKPAGES value of 4,800:

alter bufferpool BP1 size 3200

This would cause the following error to be returned.

[View full width]
DB21034E The command was processed as an SQL statement because it was not a valid Command graphics/ccc.gif Line Processor command. During SQL processing it returned: SQL20150N The number of block graphics/ccc.gif pages for a buffer pool is too large for the size of the buffer pool. SQLSTATE=54052

In the following statement, the specified SIZE is less than the current NUMBLOCKPAGES value of 4,800. However, because the BUFFPAGE parameter is used, the buffer pool will be reduced in size, but the block-based area will have to be reduced accordingly. This is done to allow for a certain number of pages to exist in the page-based area for workloads that do not consist of sequential prefetching. Use this statement:

update db cfg for TEST using BUFFPAGE 3200
alter bufferpool BP1 size -1

for a partitioned database with 12 partitions and partition group DPG1 defined to include database partitions 0, 1, 2, 3, 4, 5, 6, and 7. To create a buffer pool named BP5 with a size of 25,000 pages and a page size of 32 KB that will be allocated only on the partitions where DPG1 is defined, use the following statement:

CREATE BUFFERPOOL BP5
DATABASEPARTITIONGROUP DPG1
SIZE 25000
PAGESIZE 32K

For the same database and partition groups as described above, to create a buffer pool named BP6 with a page size of 32 KB and a size of 25,000 pages on partitions 0, 1, 2, 3, 4, and 5 and a size of 35,000 pages on partitions 6 and 7, use either of the following statements:

CREATE BUFFERPOOL BP6
DATABASEPARTITIONGROUP DPG1
SIZE 25000
PAGESIZE 32K
EXCEPT ON DBPARTITIONNUMS 6 TO 7 SIZE 35000

CREATE BUFFERPOOL BP6
DATABASEPARTITIONGROUP DPG1
SIZE 25000
PAGESIZE 32K
EXCEPT ON DBPARTITIONNUM 6 SIZE 35000
EXCEPT ON DBPARTITIONNUM 7 SIZE 35000

How Many Buffer Pools Should a Database Have?

There are two main schools of thought regarding the number of buffer pools that should be configured for a database:

  • Use one big buffer pool and let DB2's aging algorithm take care of which pages are in the buffer pool.

  • Use multiple buffer pools and assign them to specific table spaces to ensure that highly active pages are kept in the buffer pool.

If DB2 is installed in 32-bit mode, there are limitations on the maximum amount of shared memory that can be addressed by a process. Because the buffer pool is allocated in shared memory, there are limits on the maximum size of the buffer pool(s) that can be created for a database, regardless of the amount of physical memory on the system. If DB2 is installed in 64-bit mode, the shared memory limit is much higher, and most systems will run out of real memory before hitting the operating system limit.

If they are properly sized and assigned to the right table spaces, multiple buffer pools will outperform a single buffer pool. However, multiple buffer pools will require constant monitoring and tuning in order to keep them performing optimally.

Other than its size, a single buffer pool needs no tuning. DB2 has a highly optimized algorithm for aging pages in the buffer pool that uses several techniques designed to optimize the buffer pool hit ratio by:

  • Favoring important pages, such as index pages

  • Placing pages that are unlikely to be accessed again on "Hate stacks," which are used to identify victim pages quickly

When to Consider Multiple Buffer Pools

Multiple buffer pools should be considered under the following conditions:

  • There are multiple applications accessing the database, and the user wants to favor a particular application.

  • Certain tables do not need large buffer pools, for example:

    • Tables that are always appended to, such as journal or history tables.

    • Huge tables (i.e., bigger than the buffer pool) that are only ever fully scanned.

      - These tables will likely always require disk I/O, regardless of buffer pool size.

      - Assign these tables to a dedicated buffer pool; otherwise, when they are scanned, they will likely cause other important pages to be flushed from the buffer pool.

      - Be careful to make these tables' dedicated buffer pool large enough for effective prefetching.

  • If tables are usually scanned concurrently, separate them into their own dedicated buffer pools.

    • UDB will try to keep these pages in memory; but in aggregate, this could swamp the buffer pool and interfere with the access to other objects.

  • To isolate high-priority or high-usage tables or indexes from other database activity.

  • If people occasionally run large reports on OLTP systems and the reports do large joins, sorts, etc. that cause large temporary tables to be created.

    • These temporary tables can sometimes overwhelm the buffer pool, so it can be beneficial to create a buffer pool for the temporary table space.

    • Size this buffer pool according to the typical overflow sorts, if there are any, under normal operations.

    • This buffer pool typically does not need to be large because it is normally accessed sequentially.

    • This way, the reports have less impact on the overall performance of the online system.

Buffer Pool Overhead

For every page of buffer pool space and/or extended storage, DB2 creates a 100-byte descriptor in the database heap to store a descriptor of the page. Before creating large buffer pools, it may be necessary to increase the size of the database heap (dbheap) configuration parameter to hold the descriptors for the buffer pool pages.

For a 1-GB buffer pool with a page size of 4 KB, there must be 262,144 pointers allocated in the database heap. This will use up approximately 25 MB of database heap.

For a 32-GB buffer pool with a page size of 4 KB, there must be 8,388,608 pointers allocated in the database heap. This will use up approximately 800 MB of database heap.

32- and 64-bit Considerations

DB2 UDB Version 8 can be installed as a 32-bit or a 64-bit application. When DB2 is installed on the 64-bit version of UNIX, setting the DATABASE_MEMORY configuration parameter to AUTOMATIC allows DB2 to grow its memory usage as needed, when the buffer pools grow, or when additional memory is needed for database control blocks.

When DB2 is installed in 32-bit mode, the amount of shared memory available and the memory management options are more limited than in 64-bit mode. To guarantee that memory is available to create a new buffer pool or enlarge an existing buffer pool with the IMMEDIATE option, memory must be reserved when the database is started. If the database configuration parameter DATABASE_MEMORY is set to a numeric value, when the database is activated, DB2 will allocate the minimum of the following:

  • The total shared memory used by the buffer pools, lock list, database heap, shared sort heap threshold, package cache, and utility heap.

  • The amount of memory specified by DATABASE_MEMORY.

NOTE

The db2level command now displays whether DB2 is installed in 32- or 64-bit mode, in addition to the version and fixpack level.


Shared Memory/Buffer Pool Limits

When DB2 is installed in 32-bit mode, there are operating system limitations to the maximum size of a shared memory that a process can allocate. Because each database will have all of its shared memory allocated in the same segment, there is a maximum amount of shared memory that can be allocated per database.

It is important to note that the limitation is on the total amount of shared memory that can be allocated/addressed, and the buffer pool is just one of the items that gets allocated in the database's shared memory. The following items are allocated in shared memory:

  • Buffer pools (buffpage or BP size)

  • Lock list (locklist)

  • Package cache (pckcachesz)

  • Shared sorts

    • If intra-partition parallelism is enabled, reserve the shared sort heap threshold (SHEAPTHRES_SHR).

  • Database heap (dbheap)

    • Log buffer (logbufsz)

    • Catalog cache (catalogcache_sz)

  • Utility heap (util_heap_sz)

In 32-bit AIX, there are a maximum of 16, 256 MB memory segments addressable by a process. Of these, only seven can be used by DB2 for shared memory. Of these seven segments (or 1.75 GB), one is used for memory mapped I/O, and one may be used by the fast communications manager (FCM) for inter- or intra-partition communication (if there are multiple database partitions or intra-partition parallelism is enabled). To maximize the amount of shared memory that can be used for the buffer pool, memory-mapped I/O can be disabled, and the FCM can be forced to use the network instead of shared memory.

To disable memory mapped I/O, set the following DB2 registry variables to no.

  • DB2_MMAP_READ

  • DB2_MMAP_WRITE

Using memory-mapped I/O is most beneficial for systems that contain table spaces with few file containers, i.e., SMS or DMS files. Memory-mapped I/O helps to avoid i-node latching by spreading the I/Os across more files (and more underlying i-nodes). If memory-mapped I/O is turned off, make sure that any SMS table space (or DMS table space using file containers) has enough containers (i.e., 3+ per file system) to avoid i-node contention. Otherwise, any performance gain from the extra memory will be lost due to contention on disk.

To force the FCM to use the network instead of shared memory, set the following DB2 registry variable to no.

  • DB2_FORCE_FCM_BP

The maximum addressable amount of shared memory for the 32-bit version of DB2 varies, depending on the operating system. Table 2.1 shows shared memory limits.

Table 2.1. Operating System Shared Memory Limits

Operating System

Shared Memory Limit

AIX

1.75 GB

Solaris

3.35 GB

Windows NT/2000/XP

2 GB

3 GB if using Advanced Server and /3GB set in boot.ini

64 GB with AWE support; requires that DB2_AWE registry variable be set

Linux

Kernel 2.3 or earlier:

768 KB if less than 2 GB of real memory

1.1GB if 2 GB or more of real memory

Kernel 2.4 or higher

1.75 GB

HP/UX

800 KB

Maximizing Buffer Pool Size on Windows

When working with Windows 2000, the total addressable memory can be up to 64 GB; therefore, the maximum buffer pool sizes that can be created on Windows equals 64 GB minus the memory used by the operating system and other DB2 memory allocations, assuming that the server is dedicated to DB2. The support for large memory addressing on Windows is provided by the Microsoft Address Windowing Extensions (AWE). Through AWE, Windows 2000 Advanced Server provides support for up to 8 GB of memory addressing, whereas Windows 2000 Data Center Server provides support for up to 64 GB of memory.

To take advantage of memory addresses above 2 GB, both DB2 and Windows 2000 must be configured correctly to support AWE. To be able to address up to 3 GB of memory, the /3GB Windows 2000 boot option must be set. To enable access to more than 4 GB of memory via the AWE memory interface, the /PAE Windows 2000 boot option must be set. To verify that you have the correct boot option selected, under Control Panel, select System, then select "Startup and Recovery." From the drop-down list, you can see the available boot options. If the boot option (/3GB or /PAE) you want is selected, you are ready to proceed to the next task in setting up AWE support. If the option you want is not available for selection, you must add the option to the boot.ini file on the system drive. The boot.ini file contains a list of actions to be done when the operating system is started. Add /3GB or /PAE, or both (separated by blanks), at the end of the list of existing parameters. Once you have saved this changed file, you can verify and select the correct boot option, as mentioned above.

Windows 2000 also has to be modified to associate the right to "lock pages in memory" with the userid that was used to install DB2. To set the "lock pages in memory" correctly, once you have logged on to Windows 2000 as the user who installed DB2, under the Start menu on Windows 2000, select the "Administrative Tools" folder, then the "Local Security Policy" program. Under the local policies, you can select the user rights assignment for "lock pages in memory."

DB2 also requires the setting of the DB2_AWE registry variable to be able to take advantage of the larger memory addressing. This registry variable must be set to the buffer pool ID of the buffer pool that will be larger than 3 GB and have a need for AWE support, as well as the number of physical pages and the address window pages to be allocated for the buffer pool.

The buffer pool ID is found in the BUFFERPOOLID column in the catalog view SYSCAT.BUFFERPOOLS. The number of physical pages to allocate should be less than the total number of available pages of memory, and the actual number chosen will depend on the working environment. For example, in an environment where only DB2 UDB and database applications are used on the server, normally select a value between one-half of the available memory up to 1 GB less than the available memory. In an environment where other nondatabase applications are also running on the server, these values will need to be reduced to leave memory available for the other applications. The number used in the DB2_AWE registry variable is the number of physical pages to be used in support of AWE and for use by DB2. The upper limit on the address window pages is 1.5 GB, or 2.5 GB when the /3GB Windows 2000 boot option is in effect.

Hidden Buffer Pools

When a database is activated or started (i.e., during the first connection to the database), DB2 automatically creates four hidden buffer pools for the database, in addition to the IBMDEFAULTBP and any user created buffer pools. These buffer pools are hidden and do not have entries in the system catalog tables. In addition, these buffer pools cannot be used directly by assigning table spaces to them and cannot be altered.

There will be one hidden buffer pool per page size (i.e., 4 KB, 8 KB, 16 KB, and 32 KB) to ensure that there is a buffer pool available under all circumstances. DB2 UDB will use these buffer pools under the following conditions:

  • When the CREATE BUFFERPOOL statement is executed and the IMMEDIATE option is specified, but there is not enough memory available to allocate the buffer pool.

    • If this occurs, a message is written to the administration notification log.

    • Any table spaces that are using the buffer pool will be remapped to the hidden buffer pool with the same page size.

  • When the IBMDEFAULTBP and/or any of the user-created buffer pools cannot be allocated when the database is activated or started.

    • If this occurs, a message is written to the administration notification log.

    • Any table space that is using a buffer pool that was not allocated will be remapped to the hidden buffer pool with the same page size.

    • DB2 will be fully functional because of the hidden buffer pools, but performance will be drastically reduced.

  • When a table space is created and its page size does not correspond to the page size of any of the user created buffer pools.

  • During a roll forward operation if a buffer pool is created and the DEFERRED option is specified.

    • Any user-created table spaces that are assigned to this buffer pool will be remapped to the hidden buffer pool with the same page size for the duration of the roll forward operation.

By default, the hidden buffer pools will be created with a size of 16 pages. This can be changed, using the DB2_OVERRIDE_BPF registry variable. To change the size of the hidden buffer pools to use 64 pages each, set the registry variable as follows:

DB2SET DB2_OVERRIDE_BPF=64

Altering Buffer Pools

The ALTER BUFFERPOOL statement will change the defined attributes for the specified buffer pool in the database that the user is connected to. In previous versions of DB2 UDB, the buffer pool size could not be changed until the database was stopped and restarted (i.e., all current applications were disconnected from the database and another connection was established).

With DB2 UDB Version 8, the buffer pool size can be changed immediately, provided that there is enough memory available to handle the request. However, changes to any other buffer pool configuration parameters will be deferred until the database is stopped and restarted.

The ALTER BUFFERPOOL statement has options to specify the following:

  • bufferpool-name: Specifies the name of the buffer pool. The name cannot be used for any other buffer pools and cannot begin with the characters SYS or IBM.

  • IMMEDIATE: Specifies that the buffer pool will be created immediately if there is enough memory available on the system. If there is not enough reserved space in the database shared memory to allocate the new buffer pool, a warning is returned, and buffer pool creation will be DEFERRED. This is the default.

  • DEFERRED: Specifies that the buffer pool will be created the next time that the database is stopped and restarted

  • DBPARTITIONNUM: Specifies the database partition group on which the ALTER BUFFERPOOL statement will take effect.

  • SIZE: Specifies the size of the buffer pool and is defined in number of pages. In a partitioned database, this will be the default size for all database partitions where the buffer pool exists. The EXCEPT ON DBPARTITIONNUMS clause allows the buffer pool to have different sizes on the different database partitions.

  • EXTENDED STORAGE/NOT EXTENDED STORAGE: Specifies whether buffer pool victim pages will be copied to a secondary cache called extended storage. Extended storage is more efficient than retrieving data from disk but less efficient than retrieving data from the buffer pool so is not applicable to 64-bit environments.

  • ADD DATABASE PARTITION GROUP: Specifies the database partition group(s) to which the buffer pool will be added. The buffer pool will be created on all database partitions that are part of the specified database partition group(s).

  • NUMBLOCKPAGES: Specifies the number of pages to be created in the block-based portion of the buffer pool. The actual value of NUMBLOCKPAGES may differ from what was specified because the size must be a multiple of the BLOCKSIZE. The block-based portion of the buffer pool cannot be more than 98% of the total buffer pool size. Specifying a value of 0 will disable block I/O for the table space.

  • BLOCKSIZE: Specifies the number of pages used for the block-based I/O. The block size must be between 2 and 256 pages, and the default value is 32 pages.

A buffer pool cannot be altered to use both block-based I/O and extended storage at the same time.

NOTE

Before altering the number of pages in a buffer pool, it is important to understand and assess the impact on applications accessing the database. A change in buffer pool size may result in a different access plan.


For the buffer pool created with the following statement:

CREATE BUFFERPOOL BP1
SIZE 25000

In order to change the size of the buffer pool to 200 MB use the following statement:

ALTER BUFFERPOOL BP1
SIZE 50000

For the buffer pool created with the following statement:

CREATE BUFFERPOOL BP2
IMMEDIATE
SIZE 100000
NUMBLOCKPAGES 32000
BLOCKSIZE 256

In order to change the size of the block-based area of the buffer pool to 200 MB use the following statement:

ALTER BUFFERPOOL BP2
NUMBLOCKPAGES 50000

NOTE

Changes to the size of the block-based area of the buffer pool will not take effect immediately. They will take effect when the database is stopped and restarted.


For the buffer pool created with the following statement:

CREATE BUFFERPOOL BP3
DATABASEPARTITIONGROUP DPG1
SIZE 25000
PAGESIZE 32K

In order to allocate this buffer pool on the database partitions in the partition group DPG2, use the following statement:

ALTER BUFFERPOOL BP3
ADD DATABASEPARTITIONGROUP DPG2

For the buffer pool created with the following statement:

CREATE BUFFERPOOL BP4
IMMEDIATE
SIZE 100000
NUMBLOCKPAGES 32000
BLOCKSIZE 256

the following statement:

ALTER BUFFERPOOL BP4
EXTENDED STORAGE

would produce an error because block-based I/O and extended storage cannot both be enabled at the same time.

Block-Based Buffer Pools Can Improve Sequential Prefetching

The DB2 UDB buffer pools are page-based; therefore, when contiguous pages on disk are prefetched into the buffer pool, they likely will be placed into noncontiguous pages within the buffer pool. Sequential prefetching can be enhanced if contiguous pages can be read from disk into contiguous pages within the buffer pool.

This can be accomplished by enabling block-based I/O for the buffer pool, using the NUMBLOCKPAGES and BLOCKSIZE parameters for the buffer pool. A block-based buffer pool will contain a page-based area, as well as a block-based area, with sizes based on the NUMBLOCKPAGES parameter. The page-based area will be used for any I/O operation that is not performed using sequential prefetching. The block-based area of the buffer pool will be made up of a number of "blocks," where each block will contain a set number of contiguous pages, defined by the BLOCKSIZE.

To make the block-based I/O as efficient as possible, it is important to try to match the extent size for the table spaces, using the buffer pool with the block size for the buffer pool. Because the prefetch operation will attempt to read an extent from the table space into a block in the buffer pool, having these use the same size allows for the most optimal use of the memory. DB2 will still use the block-based area of the buffer pool for prefetching if the extent size is larger than the block size of the buffer pool. If the extent size is smaller than the block size, DB2 may still use the block-based area for prefetching, but if there is too much difference between the extent and block sizes, DB2 may chose to use the page-based area of the buffer pool, depending on the size difference.

If some of the pages that have been requested in the prefetch request have already been read into the page-based area of the buffer pool, using page-based I/O, the prefetcher may or may not read the data into the block-based area, depending on how much of the block would be wasted. The I/O server allows some wasted pages in each buffer pool block, but if too much of a block would be wasted, the I/O server will prefetch the data into the page-based area of the buffer pool. Space is considered as wasted under the following conditions:

  • The page would be empty because the extent size is smaller than the block size

  • The page already exists in the page-based area of the buffer pool

System Catalog Views Relating to Buffer Pools

There are two system catalog views that can be used to retrieve information about the buffer pools defined within a database:

  • SYSCAT.BUFFERPOOLS

  • SYSCAT.BUFFERPOOLDBPARTITIONS

SYSCAT.BUFFERPOOLS View

This view contains a row for each buffer pool defined in each database partition group in the database. It contains the following columns:

  • BPNAME: Name of the buffer pool. The name is defined when you create the buffer pool with the CREATE BUFFERPOOL statement.

  • BUFFERPOOLID: The unique identifier assigned to the buffer pool when it is created

  • DBPGNAME: The name of the database partition group where the buffer pool is defined. Will be blank if the buffer pool is on all partitions.

  • NPAGES: The size of the buffer pool, in number of pages.

  • PAGESIZE: The page size for the buffer pool.

  • ESTORE: Indicates whether the buffer pool is using extended storage as a secondary cache.

    • Y = Yes

    • N = No

  • NUMBLOCKPAGES: The number of pages in the buffer pool set aside for block-based I/O, in number of pages.

    • Zero if block-based I/O is not enabled

  • BLOCKSIZE: The block size of a given block in the block-based area of the buffer pool.

    • Zero if block-based I/O is not enabled

The following is an example of the contents of this view:

[View full width]
select * from SYSCAT.BUFFERPOOLS BPNAME BUFFERPOOLID DBPGNAME NPAGES PAGESIZE ESTORE NUMBLOCKAPGES graphics/ccc.gif BLOCKSIZE -------------------------------------------------------------------------------------------- IBMDEFAULTBP 1 - 250 4096 N 0 0 BP3 2 PG2 25000 32768 N 0 0 BP4 3 - 10000 4096 N 3200 256 3 record(s) selected.

To calculate the sizes of the buffer pools in MB, use the following statement:

select BPNAME, NPAGES*PAGESIZE/1024/1024 AS SIZE_in_MB from SYSCAT.BUFFERPOOLS

The following is an example of the output of this statement:

BPNAME                               SIZE_in_MB
----------------------------------------------
IBMDEFAULTBP                         9
BP3                                  781
BP4                                  39

3 record(s) selected.
SYSCAT.BUFFERPOOLDBPARTITIONS View

This view contains a row for each database partition in the database partition group, where the size of the buffer pool is different from the default size specified in the column NPAGES in the SYSCAT.BUFFERPOOLS view. It contains the following columns:

  • BUFFERPOOLID: The unique identifier assigned to the buffer pool when it is created

  • DBPARTITIONNUM: The database partition number where the buffer pool has a different size.

  • NPAGES: The size of the buffer pool, in number of pages on the specified database partition.

The following is an example of the contents of this view:

select * from SYSCAT.BUFFERPOOLDBPARTITIONS

BUFFERPOOLID             DBPARTITIONNUM          NPAGES
-----------------------------------------------------
2                        3                       30000

1 record(s) selected.

What Is Prefetching?

When an agent acting on behalf of an application needs to access table or index pages, it will first look for the pages in the database buffer pool area. If the page cannot be found in the buffer pool area, it will be read from disk into the buffer pool. I/O is very expensive, and in this case, the agent cannot do anything but wait for the read request to finish before it can access the page. These page reads are typically done one page at a time, and if the application also needs to access subsequent pages within the table or index, this is not an efficient method for reading the pages into the buffer pool.

In many situations, DB2 UDB can anticipate the pages that will be requested by an application and read them into the buffer pool before the agent actually attempts to access them. This is referred to as prefetching. Prefetching can improve the database performance because the pages will be found in the buffer pool when the agent accesses them, reducing or eliminating the time the application must wait for the page to be read from disk into the buffer pool. This is more relevant to DSS-type workloads that scan large indexes and tables than for OLTP-type workloads that involve less scanning and more random insert/update/delete activity.

Prefetching can be enabled by the DB2 optimizer when it is building the access plan for a statement and determines that it will be scanning a large portion of a table or index. It can also be enabled or triggered when DB2 is executing an access plan and detects that it has read a number of pages in sequence and will likely continue to do so. This is known as sequential detection and can be enabled or disabled, using the database configuration parameter SEQDETECT.

Restart recovery will automatically use the prefetchers to improve restart time; however, prefetching can be disabled during restart recovery by setting the DB2 registry variable DB2_AVOID_PREFETCH to ON.

How Does Prefetching Work?

Prefetching will be requested by a DB2 agent if the access plan has specified that prefetching should be done or if sequential detection is enabled and a number of pages are read in sequence. The DB2 agent will make prefetch requests that will be examined by the prefetch manager and assigned to the DB2 I/O servers (also known as prefetchers).

The prefetch requests are sent to a queue where they are examined by the prefetch manager. The prefetch manager examines the requests to determine whether any of the prefetch requests from different applications can be combined and handled more efficiently in one prefetch operation. If no prefetch requests can be combined, the queue acts in a first in-first out (FIFO) manner.

To ensure that the prefetchers do not work too aggressively, reading data into the buffer pool and overwriting pages before they can be accessed, the amount of prefetching is controlled by the size of the buffer pool, as well as the prefetch size and the access plan generated by the optimizer. For smaller buffer pools, prefetching may be scaled back to ensure that prefetched pages do not flood the buffer pool and kick out pages that are currently being used or have been prefetched into the buffer pool and have not been accessed.

Within DB2 UDB, there are two main types of data and index prefetch requests handled by the prefetchers: range and list prefetching.

Range prefetching is used to prefetch a sequential range of pages from a database object and is used during table scans and when triggered by sequential detection. If a block-based area is set aside in the buffer pool, DB2 will perform a "big block" read to read a contiguous series of pages into a single private memory buffer, then will copy the block of pages to the block-based area of the buffer pool with one memory copy operation.

Internally, the mechanism used is big block read or "vectored" read (on platforms that support an efficient vectored read API). A big block read involves reading a contiguous block of pages into a single, private memory buffer. Each individual page is then copied into its own buffer pool slot (and all of these slots are likely to be scattered throughout the buffer pool). A vectored read API will read a contiguous block of pages from disk into different memory buffers (a different memory buffer for each page of the block). In this case, those memory buffers can actually be the individual buffer pool slots, and the overhead of the copy from one memory buffer to another is avoided (hence, an efficient vectored read API is preferred over big block read). The block-based area of the buffer pool overcomes this by setting aside contiguous blocks of the buffer pool equal in size to the extent size to make prefetching more efficient, especially on operating systems that do not support vectored reads.

List prefetching is used to prefetch a list of pages into the buffer pool area when an index scan produces a list of record IDs (RIDs) that must be prefetched into the buffer pool. List prefetching is normally decided on and specified in the access plan when the SQL statement is optimized. A list prefetch request may get converted to a range prefetch request if the pages are found to be sequential.

There is one prefetch queue that is monitored by the prefetch manager and shared between all of the configured prefetchers. The prefetch queue can handle up to 100 prefetch requests.

When a prefetch request is made, the request will be broken up into a number of smaller I/O requests. By default, the number of I/O requests will be determined by the number of containers that are in the table space. However, if the DB2_PARALLEL_IO registry variable is set, DB2 will start up a number of prefetchers equal to the prefetch size divided by the extent size. This allows multiple prefetchers to be working on the requests in parallel.

Choosing the optimal prefetch size

The prefetch size for a table space can be changed using the ALTER TABLESPACE statement so it can be adjusted if the prefetchers are either over-prefetching or under-prefetching. Over-prefetching results in wasted space in the buffer pool and/or overwriting pages in the buffer pool that are likely to be reused by other applications. Under-prefetching results in high prefetch wait times and usually causes the DB2 agents to perform the I/O themselves.

As a general rule of thumb, set the prefetch size as follows:

For a table space with multiple containers:

prefetch size = extent size * number of containers

For a table space with a single container on a Redundant Array of Independent Disks (RAID) or striped disk with the DB2_PARALLEL_IO variable set:

prefetch size = extent size * number of disks in the stripe set

If the containers are on separate physical disks, this will allow parallel I/O that will enable the prefetchers to read all extents simultaneously. The prefetching can be made more aggressive by increasing the prefetch size to be an integer multiple of the above formula. However, for this to be effective, there must be sufficient disk bandwidth to support it. In addition, the number of prefetchers may need to be increased (but only if the registry variable DB2_PARALLEL_IO is enabled, because the number of requests will be (prefetch size/extent size). With the registry variable DB2_PARALLEL_IO disabled (which is the default), the number of requests will be based on the number of containers in the table space, so increasing the prefetch size will not result in more prefetch requests and will increase the amount of I/O requested in each request.

Consider the following example:

A table space has two containers with a prefetch size set to twice the extent size. With the registry variable DB2_PARALLEL_IO disabled, there will be two prefetch requests since there are two containers in the table space, and each prefetch request will be for one extent. With the registry variable DB2_PARALLEL_IO enabled, there will be two prefetch requests because the prefetch size divided by the extent size equals 2, and each prefetch request will be for one extent.

If the prefetch size is increased to be four times the extent size, with the registry variable DB2_PARALLEL_IO disabled, there will still only be two prefetch requests because there are two containers in the table space. However, each prefetch request will now be for two extents. With the registry variable DB2_PARALLEL_IO enabled, there will be four prefetch requests because the prefetch size divided by the extent size equals 4, and each prefetch request will still be for one extent.

Increasing the prefetch size when the registry variable DB2_PARALLEL_IO is disabled will normally result in the same number of prefetch requests but with more I/O assigned to each request.

Choosing the number of prefetchers

If prefetch size is set as suggested above (i.e., prefetch size = extent size x number of containers) for all of the table spaces in the database, and all of the table spaces are being scanned at the same time, then the number of prefetchers should be equal to the number of disks belonging to the database. However, if one or more of the table spaces has been set up using more aggressive prefetching (i.e., the prefetch size is a multiple of this value) and/or some of the table spaces are not being scanned at the same time as the others, then the calculation becomes more complicated. To determine the number of prefetchers required in this case:

  • Determine the number of table spaces that will potentially be scanned at the same time.

  • For each of these table spaces, determine the number of prefetchers required to service a scan of it (based on the formulas above).

  • Sum these values to determine the total number of prefetchers required.

Prefetching example

When the optimizer determines that a table scan can benefit from prefetching, DB2 will not prefetch the entire table in at once because this could potentially flood the buffer pool and result in important pages being removed; it does the prefetching in stages.

In the example shown in Figure 2.1, the prefetch size is equal to four times the extent size. Each block represents an extent.

Figure 2.1. DB2 prefetching.

graphics/02fig01.gif

New prefetch requests are made each time a trigger-point page is read by the agent. Trigger-point pages are located at every multiple of the prefetch size within the table.

When the table scan starts, the first trigger point will be hit when the first page in the table is read. The first trigger point will request that DB2 prefetch the first two prefetch-sized blocks in the table. The first block will actually be one page less than the regular prefetch size and will start on the second page in the table to avoid prefetcher/agent contention on that first page. All subsequent trigger points will make a prefetch request to bring in the next prefetch-sized block of pages. If the system is configured and tuned properly, the agent should never have to wait on I/O because the prefetcher will have already read the page into the buffer pool.

If there is more than one container in the table space and/or DB2_PARALLEL_IO is enabled, it will be much more efficient to have multiple prefetchers performing the I/O, rather than one prefetcher. In this case, the prefetch request is broken up into smaller requests, and each of the requests is placed onto the prefetch queue. The prefetchers can then service the requests in parallel, potentially performing much faster than a single prefetcher doing all of the work.

In the example above, if DB2_PARALLEL_IO is enabled, the larger request would be broken into four (prefetch size / extent size = 4) smaller prefetch requests. If DB2_PARALLEL_IO is disabled (which is the default), the number of prefetch requests would be based on the number of containers in the table space.

In the example, the last trigger point will not result in prefetching because there are not enough pages remaining to fill an entire prefetch request.

Other roles of the prefetchers

In addition to the true prefetching work, prefetchers have some other key functions. These functions include:

  1. DROPPING TEMPORARY OBJECTS: During query processing, one or more system temporary tables may be created to hold intermediate or final result sets. When a system temporary table is no longer required, it is dropped. This involves purging (removing) its pages from the buffer pool, and this can take a relatively long time. So, rather than having the agent drop the table and wait for it to complete, it instead makes a prefetcher request that will handle the drop table work. This reduces the agent's response time because it does not have to do the work itself.

  2. ADDING DMS CONTAINERS: When more than one container is added to a DMS table space in a single ALTER TABLESPACE statement, the prefetchers assist the agent in allocating the disk space for those containers. The agent handles the allocation for one container, and for each of the remaining containers it creates a prefetch request to add the container. This allows the container allocation to be done in parallel, speeding up the process.

  3. RESIZING DMS CONTAINERS: This request type is similar to adding DMS containers, but it handles allocation of new space when the RESIZE or EXTEND options of the ALTER TABLESPACE statement are used.

  4. TRAVERSING INDEX LEAF PAGES: This request type is mainly used by the REORG utility when scanning an index object. Given the root page of an index, the leaf level is traversed using the prefetchers.

What Is Page Cleaning?

When an agent acting on behalf of an application needs to access table or index pages, it will first look for the pages in the database buffer pool area. If the page cannot be found in the buffer pool area, it will be read from disk into the buffer pool. If the buffer pool is full, DB2 must select a "victim" page to be overwritten in the buffer pool. If the victim page is dirty (i.e., it was changed since it was read into the buffer pool and has not been written to disk), it must first be written to disk before it can be overwritten. During the write operation, the application must wait. To reduce the likelihood that the victim page will be dirty, DB2 uses page cleaners to write dirty pages asynchronously to disk before they are chosen as victims.

An additional benefit of the page cleaners w