Example 1: Create Index Taking Too Long

Problem Description

CustomerA was creating indexes for a large table in its database, but the time required to create the index appeared to be excessive. The following is the db2batch output, which shows the time required to create the index:

---------------------------------------------
Statement number: 1

create index x1 on staff (id)

Elapsed Time is:           1068.922   seconds
---------------------------------------------

Problem Analysis and Resolution

Step 1: Examine the Database and Database Manager Configuration Parameters

Database manager configuration

The three parameters in the database manager configuration that typically have a big impact on the create index operation are the SHEAPTHRES, MAX_QUERYDEGREE, and INTRA_PARALLEL, and these parameters are highlighted in bold below. In this case, intra-partition parallelism was disabled; therefore, the index creation could not take advantage of the multiple CPUs in the server. If intra-partition parallelism is enabled, there will be sort heap used during the index creation in all subagent processes; therefore, the sort heap threshold (SHEAPTHRES) may also need to be increased.

Node type = Database Server with local clients

Database manager configuration release level            = 0x0a00

CPU speed (millisec/instruction)             (CPUSPEED) = 4.000000e-05

Max number of concurrently active databases     (NUMDB) = 8
Data Links support                          (DATALINKS) = NO
Federated Database System Support           (FEDERATED) = NO
Transaction processor monitor name        (TP_MON_NAME) =

Default charge-back account           (DFT_ACCOUNT_STR) =

Java Development Kit installation path       (JDK_PATH) = /wsdb/v81/
  bldsupp/AIX/jdk1.3.0

Diagnostic error capture level              (DIAGLEVEL) = 3
Notify Level                              (NOTIFYLEVEL) = 3
Diagnostic data directory path               (DIAGPATH) =

Default database monitor switches
  Buffer pool                         (DFT_MON_BUFPOOL) = OFF
  Lock                                   (DFT_MON_LOCK) = OFF
  Sort                                   (DFT_MON_SORT) = OFF
  Statement                              (DFT_MON_STMT) = OFF
  Table                                 (DFT_MON_TABLE) = OFF
  Timestamp                         (DFT_MON_TIMESTAMP) = ON
  Unit of work                            (DFT_MON_UOW) = OFF
Monitor health of instance and databases   (HEALTH_MON) = OFF

SYSADM group name                        (SYSADM_GROUP) = BUILD
SYSCTRL group name                      (SYSCTRL_GROUP) =
SYSMAINT group name                    (SYSMAINT_GROUP) =

Database manager authentication        (AUTHENTICATION) = SERVER
Cataloging allowed without authority   (CATALOG_NOAUTH) = YES
Trust all clients                      (TRUST_ALLCLNTS) = YES
Trusted client authentication          (TRUST_CLNTAUTH) = CLIENT
Use SNA authentication                   (USE_SNA_AUTH) = NO
Bypass federated authentication            (FED_NOAUTH) = NO

Default database path                       (DFTDBPATH) = /home/dsnow

Database monitor heap size (4KB)          (MON_HEAP_SZ) = 90
Java Virtual Machine heap size (4KB)     (JAVA_HEAP_SZ) = 1024
Audit buffer size (4KB)                  (AUDIT_BUF_SZ) = 0
Size of instance shared memory (4KB)  (INSTANCE_MEMORY) = AUTOMATIC
Backup buffer default size (4KB)            (BACKBUFSZ) = 1024
Restore buffer default size (4KB)           (RESTBUFSZ) = 1024

Sort heap threshold (4KB)                  (SHEAPTHRES) = 20000

Directory cache support                     (DIR_CACHE) = YES

Application support layer heap size (4KB)   (ASLHEAPSZ) = 15
Max requester I/O block size (bytes)         (RQRIOBLK) = 32767
Query heap size (4KB)                   (QUERY_HEAP_SZ) = 1000
DRDA services heap size (4KB)            (DRDA_HEAP_SZ) = 128

Priority of agents                           (AGENTPRI) = SYSTEM
Max number of existing agents               (MAXAGENTS) = 200
Agent pool size                        (NUM_POOLAGENTS) = 100(calculated)
Initial number of agents in pool       (NUM_INITAGENTS) = 0
Max number of coordinating agents     (MAX_COORDAGENTS) = MAXAGENTS
Max no. of concurrent coordinating agents  (MAXCAGENTS) = MAX_COORDAGENTS
Max number of client connections      (MAX_CONNECTIONS) = MAX_COORDAGENTS

Keep fenced process                        (KEEPFENCED) = YES
Number of pooled fenced processes         (FENCED_POOL) = MAX_COORDAGENTS
Initialize fenced process with JVM     (INITFENCED_JVM) = NO
Initial number of fenced processes     (NUM_INITFENCED) = 0

Index re-creation time                       (INDEXREC) = RESTART

Transaction manager database name         (TM_DATABASE) = 1ST_CONN
Transaction resync interval (sec)     (RESYNC_INTERVAL) = 180
SPM name                                     (SPM_NAME) =
SPM log size                          (SPM_LOG_FILE_SZ) = 256
SPM resync agent limit                 (SPM_MAX_RESYNC) = 20
SPM log path                             (SPM_LOG_PATH) =

TCP/IP Service name                          (SVCENAME) =
Discovery mode                               (DISCOVER) = SEARCH
Discovery communication protocols       (DISCOVER_COMM) =
Discover server instance                (DISCOVER_INST) = ENABLE

Maximum query degree of parallelism   (MAX_QUERYDEGREE) = ANY
Enable intra-partition parallelism     (INTRA_PARALLEL) = NO

No. of int. communication buffers(4KB)(FCM_NUM_BUFFERS) = 512

Database configuration for database sample

The three parameters in the database manager configuration that typically have a big impact on the create index operation are the LOGBUFSZ, SORTHEAP, and BUFFPAGE, and these parameters are highlighted in bold below.

The log buffer size (LOGBUFSZ) is important because the create index operation will cause log records to be created. In this case, the log buffer was very small and had to be increased.
The sort list heap (SORTHEAP) is used to sort the index keys while creating an index. Because the table was large, the sort heap was also increased so that the sorting need not spill to a temp table.
- On AIX, also verify the ulimit for the instance owner to ensure it is not too small, causing sort overflows. In this case, the operating system was not AIX, so this was not the cause.
In this case, the default buffer pool size (BUFFPAGE) is 1,000 4-KB pages. To determine the real size of the buffer pool, use the following statement:
```
select * from syscat.bufferpools
```
- In this case, there was only one buffer pool (IBMDEFAULTBP), and it had a size of 300,000 4-KB pages.
- This was done on a 32-bit instance of DB2, so the size of the buffer pool could not have been increased much higher without hitting the operating system limitations.

[View full width]
Database configuration release level                    = 0x0a00
Database release level                                  = 0x0a00

Database territory                                      = US
Database code page                                      = 819
Database code set                                       = ISO8859-1
Database country/region code                            = 1
Dynamic SQL Query management           (DYN_QUERY_MGMT) = DISABLE

Discovery support for this database       (DISCOVER_DB) = ENABLE

Default query optimization class         (DFT_QUERYOPT) = 5
Degree of parallelism                      (DFT_DEGREE) = 1
Continue upon arithmetic exceptions   (DFT_SQLMATHWARN) = NO
Default refresh age                   (DFT_REFRESH_AGE) = 0
Number of frequent values retained     (NUM_FREQVALUES) = 10
Number of quantiles retained            (NUM_QUANTILES) = 20

Backup pending                                          = NO

Database is consistent                                  = NO
Rollforward pending                                     = NO
Restore pending                                         = NO

Multi-page file allocation enabled                      = NO

Log retain for recovery status                          = NO
User exit for logging status                            = NO

Data Links Token Expiry Interval (sec)      (DL_EXPINT) = 60
Data Links Write Token Init Expiry Intvl(DL_WT_IEXPINT) = 60
Data Links Number of Copies             (DL_NUM_COPIES) = 1
Data Links Time after Drop (days)        (DL_TIME_DROP) = 1
Data Links Token in Uppercase                (DL_UPPER) = NO
Data Links Token Algorithm                   (DL_TOKEN) = MAC0

Database heap (4KB)                            (DBHEAP) = 1200
Size of database shared memory (4KB)  (DATABASE_MEMORY) = AUTOMATIC
Catalog cache size (4KB)              (CATALOGCACHE_SZ) = (MAXAPPLS*4)
Log buffer size (4KB)                        (LOGBUFSZ) = 8
Utilities heap size (4KB)                (UTIL_HEAP_SZ) = 5000
Buffer pool size (pages)                     (BUFFPAGE) = 1000
Extended storage segments size (4KB)    (ESTORE_SEG_SZ) = 16000
Number of extended storage segments   (NUM_ESTORE_SEGS) = 0
Max storage for lock list (4KB)              (LOCKLIST) = 100

Max size of appl. group mem set (4KB) (APPGROUP_MEM_SZ) = 20000
Percent of mem for appl. group heap   (GROUPHEAP_RATIO) = 70
Max appl. control heap size (4KB)     (APP_CTL_HEAP_SZ) = 128
Sort heap thres for shared sorts (4KB) (SHEAPTHRES_SHR) = (SHEAPTHRES)
Sort list heap (4KB)                         (SORTHEAP) = 256
SQL statement heap (4KB)                     (STMTHEAP) = 4096
Default application heap (4KB)             (APPLHEAPSZ) = 256
Package cache size (4KB)                   (PCKCACHESZ) = (MAXAPPLS*8)
Statistics heap size (4KB)               (STAT_HEAP_SZ) = 4384

Interval for checking deadlock (ms)         (DLCHKTIME) = 10000
Percent. of lock lists per application       (MAXLOCKS) = 10
Lock timeout (sec)                        (LOCKTIMEOUT) = -1

Changed pages threshold                (CHNGPGS_THRESH) = 60
Number of asynchronous page cleaners   (NUM_IOCLEANERS) = 1
Number of I/O servers                   (NUM_IOSERVERS) = 1
Index sort flag                             (INDEXSORT) = YES
Sequential detect flag                      (SEQDETECT) = YES
Default prefetch size (pages)         (DFT_PREFETCH_SZ) = 32

Track modified pages                         (TRACKMOD) = OFF

Default number of containers                            = 1
Default tablespace extentsize (pages)   (DFT_EXTENT_SZ) = 32

Max number of active applications            (MAXAPPLS) = AUTOMATIC
Average number of active applications       (AVG_APPLS) = 1
Max DB files open per application            (MAXFILOP) = 64

Log file size (4KB)                         (LOGFILSIZ) = 1000
Number of primary log files                (LOGPRIMARY) = 3
Number of secondary log files               (LOGSECOND) = 2
Changed path to log files                  (NEWLOGPATH) =
Path to log files                                       = /home/dsnow/dsnow/NODE0000
/SQL00001/SQLOGDIR/
Overflow log path                     (OVERFLOWLOGPATH) =
Mirror log path                         (MIRRORLOGPATH) =
First active log file                                   =
Block log on disk full                (BLK_LOG_DSK_FUL) = NO
Percent of max active log space by transaction(MAX_LOG) = 0
Num. of active log files for 1 active UOW(NUM_LOG_SPAN) = 0

Group commit count                          (MINCOMMIT) = 1
Percent log file reclaimed before soft chckpt (SOFTMAX) = 100
Log retain for recovery enabled             (LOGRETAIN) = OFF
User exit for logging enabled                (USEREXIT) = OFF
Auto restart enabled                      (AUTORESTART) = ON
Index re-creation time                       (INDEXREC) = SYSTEM
                                                          (RESTART)
Default number of loadrec sessions    (DFT_LOADREC_SES) = 1
Number of database backups to retain   (NUM_DB_BACKUPS) = 12
Recovery history retention (days)     (REC_HIS_RETENTN) = 366

TSM management class                    (TSM_MGMTCLASS) =
TSM node name                            (TSM_NODENAME) =
TSM owner                                   (TSM_OWNER) =
TSM password                             (TSM_PASSWORD) =

Step 2: Adjust Relevant Parameters and Determine Values for the Task Being Optimized

The relevant database manager and database configuration parameters were adjusted, based on the analysis of their currently set values, and the appropriate values for the task being organized were determined. The configurations were updated as follows:

update dbm cfg using intra_parallel on
update dbm cfg using sheapthres 60000
update db cfg for sample using logbufsz 256
update db cfg for sample using sortheap 1000

The instance was stopped and restarted.

Step 3: Rerun the CREATE INDEX Statement

The db2batch output showing the timing of the create index is below. Although the create index time was improved, it was still higher than expected.

---------------------------------------------
Statement number: 1

create index x1 on staff (id)

Elapsed Time is:            897.214   seconds
---------------------------------------------

Step 4: Check the System for I/O Bottlenecks, Excess Paging, or Other Processes/Applications Using Excessive System Resources

This required operating system tools such as vmstat, iostat, and/or top to capture the memory, I/O, and process level information.

NOTE

It is important to capture the snapshots from these tools over a period of time, not taking just one snapshot, especially because the first line of the output of the vmstat and iostat tools contains average information since the server was started and is not really useful in analyzing a problem.

While the create index statement is running, capture vmstat output, as follows:

vmstat 5 > vmstat.out

This will capture a vmstat snapshot every 5 seconds and write the output to the file vmstat.out. This command will run until a Ctrl-C is issued on the terminal where the command was run. After this, examine the vmstat.out file.

Below is the vmstat.out file that was captured while the create index statement was running:

kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  1 16243 29168   0   0   0   0    0   0 305 2890 870 14  6 80  0
 2  1 16243 29165   0   0   0   0    0   0 212 2010 550 25  7 68  0
 0  1 16243 29165   0   0   0   0    0   0 217 1175 584 24  2 74  0
 0  1 16243 29165   0   0   0   0    0   0 224 1970 563 25  6 63  2
 2  1 16243 29165   0   0   0   0    0   0 217 1171 583 23  2 75  0
 2  1 16243 29165   0   0   0   0    0   0 211 1968 550 25  7 68  0
 0  1 16243 29165   0   0   0   0    0   0 217 1175 584 24  1 75  0
 2  1 16328 29072   0   0   0   0    0   0 217 1202 584 23  3 74  0

The analysis of the vmstat output above did not indicate any problems. The run queue was small and did not grow over time. There was no paging occurring at any time during the run (as indicated by pi = 0). The CPUs were not waiting for I/O (as indicated by the wa column).

While the create index statement was running, the iostat information was captured, as follows:

iostat 5 > iostat.out

The iostat.out file was created as follows:

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          12.6      13.2       7.3         46         0
hdisk1           8.8       9.6       6.4         28         0
cd0              0.0       0.0       0.0          0         0

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          10.6      11.3       6.9         37         0
hdisk1           9.8      10.9       6.8         33         0
cd0              0.0       0.0       0.0          0         0
Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          12.9      13.7       7.8         49         0
hdisk1           8.4       9.3       6.2         30         0
cd0              0.0       0.0       0.0          0         0

The iostat showed that the disks were relatively inactive. The most active disk was busy only 12.9% of the time. In addition, the amount of data read (Kb_read) for the disks was quite low.

In this case each of the Hdisks was actually a disk array that could support much higher throughput. Disk arrays make the disk volume look like one disk to the OS tools; therefore, it was important to understand the physical disk implementation when examining the output of these tools.

In reexamining the database and database manager configurations, the NUM_IOSERVERS configuration parameter warranted further investigation. The parameter was set as follows:

Number of I/O servers                   (NUM_IOSERVERS) = 1

DB2 was not taking advantage of I/O parallelism and was not reading from the disks in parallel. In addition, because the containers were on a RAID-5 disk, we checked the DB2_PARALLEL_IO registry variable to determine whether it was set for the table space(s) where the table was created. In this, case it was not enabled.

The number of prefetchers (I/O servers) was increased as follows:

update db cfg for sample using num_ioservers 6

In this case, the number of prefetchers was set to six because the disk volume is using RAID-5 technology with six data disks and one parity disk per array. Therefore, there were six disks servicing the table space for the data.

Parallel I/O was enabled as follows:

db2set DB2_PARALLEL_IO=*

The DB2 instance was stopped and restarted to enable the registry variable.

Step 5: Rerun the CREATE INDEX Statement

The db2batch output showing the timing of the create index is below. Although the create index time was improved, it was still higher than expected.

---------------------------------------------
Statement number: 1

create index x1 on staff (id)

Elapsed Time is:            437.391   seconds
---------------------------------------------

Step 6: Recheck the System Resource Usage

In this case, two sessions were opened, and the vmstat and iostat were captured for the entire time the index was being created.

Some of the entries from the beginning and end of the vmstat.out and iostat.out files are below:

kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  1 16243 29168   0   0   0   0    0   0 305 2890 870 14  6 80  0
 2  1 16243 29165   0   0   0   0    0   0 212 2010 550 35  7 58  0
 0  1 16243 29165   0   0   0   0    0   0 217 1175 584 34  2 64  0
 0  1 16243 29165   0   0   0   0    0   0 224 1970 563 35  6 53  2
  .
  .
  .
 2  1 16243 29165   0   0   0   0    0   0 211 1968 550 25  7 61  7
 0  1 16243 29165   0   0   0   0    0   0 217 1175 584 24  1 66  9
 2  1 16328 29072   0   0   0   0    0   0 217 1202 584 23  3 67  7



Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          12.6      13.2       7.3         46         0
hdisk1           8.8       9.6       6.4         28         0
cd0              0.0       0.0       0.0          0         0

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          10.6      11.3       6.9         37         0
hdisk1           9.8      10.9       6.8         33         0
cd0              0.0       0.0       0.0          0         0

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          12.9      13.7       7.8         49         0
hdisk1           8.4       9.3       6.2         30         0
cd0              0.0       0.0       0.0          0         0
  .
  .
  .
Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          24.7      36.1      16.3          1       317
Hdisk1          24.5      36.0      16.1          2       311
cd0              0.0       0.0       0.0          0         0

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          24.6      36.1      16.3          1       315
Hdisk1          24.9      36.9      16.4          2       321
cd0              0.0       0.0       0.0          0         0

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
Hdisk0          24.5      36.0      16.2          1       315
Hdisk1          24.6      36.4      16.5          2       323
cd0              0.0       0.0       0.0          0         0

An analysis of this information shows differences in the processing between the beginning and end of the create index processing. At the beginning of the processing, all of the I/O activity was shown to be reading the data file. At the end of the create index processing, all of the activity is write activity. This was due to the manner in which DB2 creates indexes.

Because DB2 logs only the creation of index pages and not the creation of every index key value, it must flush all of the new log pages to disk at the end of the index creation. With a CHNGPGS_THRESH setting of 60% and a 1.2-GB buffer pool, at the end of the index creation, there can be up to 720 MB of pages in the buffer pool that needs to be flushed to disk. By decreasing the value of the CHNGPGS_THRESH configuration parameter to 20%, there will be a maximum of only 240 MB of pages in the buffer pool to be flushed to disk. This can result in greatly improved index creation times.

Step 7: Adjust Relevant Parameters and Determine Values for the Task Being Optimized

update db cfg for sample using chngpgs_thresh 20

Step 8: Rerun the CREATE INDEX Statement

The db2batch output showing the timing of the create index improved dramatically.

---------------------------------------------
Statement number: 1

create index x1 on staff (id)

Elapsed Time is:            38.451   seconds
---------------------------------------------