CustomerA was creating indexes for a large table in its database, but the time required to create the index appeared to be excessive. The following is the db2batch output, which shows the time required to create the index:
--------------------------------------------- Statement number: 1 create index x1 on staff (id) Elapsed Time is: 1068.922 seconds ---------------------------------------------
The three parameters in the database manager configuration that typically have a big impact on the create index operation are the SHEAPTHRES, MAX_QUERYDEGREE, and INTRA_PARALLEL, and these parameters are highlighted in bold below. In this case, intra-partition parallelism was disabled; therefore, the index creation could not take advantage of the multiple CPUs in the server. If intra-partition parallelism is enabled, there will be sort heap used during the index creation in all subagent processes; therefore, the sort heap threshold (SHEAPTHRES) may also need to be increased.
Node type = Database Server with local clients Database manager configuration release level = 0x0a00 CPU speed (millisec/instruction) (CPUSPEED) = 4.000000e-05 Max number of concurrently active databases (NUMDB) = 8 Data Links support (DATALINKS) = NO Federated Database System Support (FEDERATED) = NO Transaction processor monitor name (TP_MON_NAME) = Default charge-back account (DFT_ACCOUNT_STR) = Java Development Kit installation path (JDK_PATH) = /wsdb/v81/ bldsupp/AIX/jdk1.3.0 Diagnostic error capture level (DIAGLEVEL) = 3 Notify Level (NOTIFYLEVEL) = 3 Diagnostic data directory path (DIAGPATH) = Default database monitor switches Buffer pool (DFT_MON_BUFPOOL) = OFF Lock (DFT_MON_LOCK) = OFF Sort (DFT_MON_SORT) = OFF Statement (DFT_MON_STMT) = OFF Table (DFT_MON_TABLE) = OFF Timestamp (DFT_MON_TIMESTAMP) = ON Unit of work (DFT_MON_UOW) = OFF Monitor health of instance and databases (HEALTH_MON) = OFF SYSADM group name (SYSADM_GROUP) = BUILD SYSCTRL group name (SYSCTRL_GROUP) = SYSMAINT group name (SYSMAINT_GROUP) = Database manager authentication (AUTHENTICATION) = SERVER Cataloging allowed without authority (CATALOG_NOAUTH) = YES Trust all clients (TRUST_ALLCLNTS) = YES Trusted client authentication (TRUST_CLNTAUTH) = CLIENT Use SNA authentication (USE_SNA_AUTH) = NO Bypass federated authentication (FED_NOAUTH) = NO Default database path (DFTDBPATH) = /home/dsnow Database monitor heap size (4KB) (MON_HEAP_SZ) = 90 Java Virtual Machine heap size (4KB) (JAVA_HEAP_SZ) = 1024 Audit buffer size (4KB) (AUDIT_BUF_SZ) = 0 Size of instance shared memory (4KB) (INSTANCE_MEMORY) = AUTOMATIC Backup buffer default size (4KB) (BACKBUFSZ) = 1024 Restore buffer default size (4KB) (RESTBUFSZ) = 1024 Sort heap threshold (4KB) (SHEAPTHRES) = 20000 Directory cache support (DIR_CACHE) = YES Application support layer heap size (4KB) (ASLHEAPSZ) = 15 Max requester I/O block size (bytes) (RQRIOBLK) = 32767 Query heap size (4KB) (QUERY_HEAP_SZ) = 1000 DRDA services heap size (4KB) (DRDA_HEAP_SZ) = 128 Priority of agents (AGENTPRI) = SYSTEM Max number of existing agents (MAXAGENTS) = 200 Agent pool size (NUM_POOLAGENTS) = 100(calculated) Initial number of agents in pool (NUM_INITAGENTS) = 0 Max number of coordinating agents (MAX_COORDAGENTS) = MAXAGENTS Max no. of concurrent coordinating agents (MAXCAGENTS) = MAX_COORDAGENTS Max number of client connections (MAX_CONNECTIONS) = MAX_COORDAGENTS Keep fenced process (KEEPFENCED) = YES Number of pooled fenced processes (FENCED_POOL) = MAX_COORDAGENTS Initialize fenced process with JVM (INITFENCED_JVM) = NO Initial number of fenced processes (NUM_INITFENCED) = 0 Index re-creation time (INDEXREC) = RESTART Transaction manager database name (TM_DATABASE) = 1ST_CONN Transaction resync interval (sec) (RESYNC_INTERVAL) = 180 SPM name (SPM_NAME) = SPM log size (SPM_LOG_FILE_SZ) = 256 SPM resync agent limit (SPM_MAX_RESYNC) = 20 SPM log path (SPM_LOG_PATH) = TCP/IP Service name (SVCENAME) = Discovery mode (DISCOVER) = SEARCH Discovery communication protocols (DISCOVER_COMM) = Discover server instance (DISCOVER_INST) = ENABLE Maximum query degree of parallelism (MAX_QUERYDEGREE) = ANY Enable intra-partition parallelism (INTRA_PARALLEL) = NO No. of int. communication buffers(4KB)(FCM_NUM_BUFFERS) = 512
The three parameters in the database manager configuration that typically have a big impact on the create index operation are the LOGBUFSZ, SORTHEAP, and BUFFPAGE, and these parameters are highlighted in bold below.
The log buffer size (LOGBUFSZ) is important because the create index operation will cause log records to be created. In this case, the log buffer was very small and had to be increased.
The sort list heap (SORTHEAP) is used to sort the index keys while creating an index. Because the table was large, the sort heap was also increased so that the sorting need not spill to a temp table.
On AIX, also verify the ulimit for the instance owner to ensure it is not too small, causing sort overflows. In this case, the operating system was not AIX, so this was not the cause.
In this case, the default buffer pool size (BUFFPAGE) is 1,000 4-KB pages. To determine the real size of the buffer pool, use the following statement:
select * from syscat.bufferpools
In this case, there was only one buffer pool (IBMDEFAULTBP), and it had a size of 300,000 4-KB pages.
This was done on a 32-bit instance of DB2, so the size of the buffer pool could not have been increased much higher without hitting the operating system limitations.
[View full width]Database configuration release level = 0x0a00 Database release level = 0x0a00 Database territory = US Database code page = 819 Database code set = ISO8859-1 Database country/region code = 1 Dynamic SQL Query management (DYN_QUERY_MGMT) = DISABLE Discovery support for this database (DISCOVER_DB) = ENABLE Default query optimization class (DFT_QUERYOPT) = 5 Degree of parallelism (DFT_DEGREE) = 1 Continue upon arithmetic exceptions (DFT_SQLMATHWARN) = NO Default refresh age (DFT_REFRESH_AGE) = 0 Number of frequent values retained (NUM_FREQVALUES) = 10 Number of quantiles retained (NUM_QUANTILES) = 20 Backup pending = NO Database is consistent = NO Rollforward pending = NO Restore pending = NO Multi-page file allocation enabled = NO Log retain for recovery status = NO User exit for logging status = NO Data Links Token Expiry Interval (sec) (DL_EXPINT) = 60 Data Links Write Token Init Expiry Intvl(DL_WT_IEXPINT) = 60 Data Links Number of Copies (DL_NUM_COPIES) = 1 Data Links Time after Drop (days) (DL_TIME_DROP) = 1 Data Links Token in Uppercase (DL_UPPER) = NO Data Links Token Algorithm (DL_TOKEN) = MAC0 Database heap (4KB) (DBHEAP) = 1200 Size of database shared memory (4KB) (DATABASE_MEMORY) = AUTOMATIC Catalog cache size (4KB) (CATALOGCACHE_SZ) = (MAXAPPLS*4) Log buffer size (4KB) (LOGBUFSZ) = 8 Utilities heap size (4KB) (UTIL_HEAP_SZ) = 5000 Buffer pool size (pages) (BUFFPAGE) = 1000 Extended storage segments size (4KB) (ESTORE_SEG_SZ) = 16000 Number of extended storage segments (NUM_ESTORE_SEGS) = 0 Max storage for lock list (4KB) (LOCKLIST) = 100 Max size of appl. group mem set (4KB) (APPGROUP_MEM_SZ) = 20000 Percent of mem for appl. group heap (GROUPHEAP_RATIO) = 70 Max appl. control heap size (4KB) (APP_CTL_HEAP_SZ) = 128 Sort heap thres for shared sorts (4KB) (SHEAPTHRES_SHR) = (SHEAPTHRES) Sort list heap (4KB) (SORTHEAP) = 256 SQL statement heap (4KB) (STMTHEAP) = 4096 Default application heap (4KB) (APPLHEAPSZ) = 256 Package cache size (4KB) (PCKCACHESZ) = (MAXAPPLS*8) Statistics heap size (4KB) (STAT_HEAP_SZ) = 4384 Interval for checking deadlock (ms) (DLCHKTIME) = 10000 Percent. of lock lists per application (MAXLOCKS) = 10 Lock timeout (sec) (LOCKTIMEOUT) = -1 Changed pages threshold (CHNGPGS_THRESH) = 60 Number of asynchronous page cleaners (NUM_IOCLEANERS) = 1 Number of I/O servers (NUM_IOSERVERS) = 1 Index sort flag (INDEXSORT) = YES Sequential detect flag (SEQDETECT) = YES Default prefetch size (pages) (DFT_PREFETCH_SZ) = 32 Track modified pages (TRACKMOD) = OFF Default number of containers = 1 Default tablespace extentsize (pages) (DFT_EXTENT_SZ) = 32 Max number of active applications (MAXAPPLS) = AUTOMATIC Average number of active applications (AVG_APPLS) = 1 Max DB files open per application (MAXFILOP) = 64 Log file size (4KB) (LOGFILSIZ) = 1000 Number of primary log files (LOGPRIMARY) = 3 Number of secondary log files (LOGSECOND) = 2 Changed path to log files (NEWLOGPATH) = Path to log files = /home/dsnow/dsnow/NODE0000 /SQL00001/SQLOGDIR/ Overflow log path (OVERFLOWLOGPATH) = Mirror log path (MIRRORLOGPATH) = First active log file = Block log on disk full (BLK_LOG_DSK_FUL) = NO Percent of max active log space by transaction(MAX_LOG) = 0 Num. of active log files for 1 active UOW(NUM_LOG_SPAN) = 0 Group commit count (MINCOMMIT) = 1 Percent log file reclaimed before soft chckpt (SOFTMAX) = 100 Log retain for recovery enabled (LOGRETAIN) = OFF User exit for logging enabled (USEREXIT) = OFF Auto restart enabled (AUTORESTART) = ON Index re-creation time (INDEXREC) = SYSTEM (RESTART) Default number of loadrec sessions (DFT_LOADREC_SES) = 1 Number of database backups to retain (NUM_DB_BACKUPS) = 12 Recovery history retention (days) (REC_HIS_RETENTN) = 366 TSM management class (TSM_MGMTCLASS) = TSM node name (TSM_NODENAME) = TSM owner (TSM_OWNER) = TSM password (TSM_PASSWORD) =
The relevant database manager and database configuration parameters were adjusted, based on the analysis of their currently set values, and the appropriate values for the task being organized were determined. The configurations were updated as follows:
update dbm cfg using intra_parallel on update dbm cfg using sheapthres 60000 update db cfg for sample using logbufsz 256 update db cfg for sample using sortheap 1000
The instance was stopped and restarted.
The db2batch output showing the timing of the create index is below. Although the create index time was improved, it was still higher than expected.
--------------------------------------------- Statement number: 1 create index x1 on staff (id) Elapsed Time is: 897.214 seconds ---------------------------------------------
This required operating system tools such as vmstat, iostat, and/or top to capture the memory, I/O, and process level information.
NOTE
It is important to capture the snapshots from these tools over a period of time, not taking just one snapshot, especially because the first line of the output of the vmstat and iostat tools contains average information since the server was started and is not really useful in analyzing a problem.
While the create index statement is running, capture vmstat output, as follows:
vmstat 5 > vmstat.out
This will capture a vmstat snapshot every 5 seconds and write the output to the file vmstat.out. This command will run until a Ctrl-C is issued on the terminal where the command was run. After this, examine the vmstat.out file.
Below is the vmstat.out file that was captured while the create index statement was running:
kthr memory page faults cpu ----- ----------- ------------------------ ------------ ----------- r b avm fre re pi po fr sr cy in sy cs us sy id wa 1 1 16243 29168 0 0 0 0 0 0 305 2890 870 14 6 80 0 2 1 16243 29165 0 0 0 0 0 0 212 2010 550 25 7 68 0 0 1 16243 29165 0 0 0 0 0 0 217 1175 584 24 2 74 0 0 1 16243 29165 0 0 0 0 0 0 224 1970 563 25 6 63 2 2 1 16243 29165 0 0 0 0 0 0 217 1171 583 23 2 75 0 2 1 16243 29165 0 0 0 0 0 0 211 1968 550 25 7 68 0 0 1 16243 29165 0 0 0 0 0 0 217 1175 584 24 1 75 0 2 1 16328 29072 0 0 0 0 0 0 217 1202 584 23 3 74 0
The analysis of the vmstat output above did not indicate any problems. The run queue was small and did not grow over time. There was no paging occurring at any time during the run (as indicated by pi = 0). The CPUs were not waiting for I/O (as indicated by the wa column).
While the create index statement was running, the iostat information was captured, as follows:
iostat 5 > iostat.out
The iostat.out file was created as follows:
Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 12.6 13.2 7.3 46 0 hdisk1 8.8 9.6 6.4 28 0 cd0 0.0 0.0 0.0 0 0 Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 10.6 11.3 6.9 37 0 hdisk1 9.8 10.9 6.8 33 0 cd0 0.0 0.0 0.0 0 0 Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 12.9 13.7 7.8 49 0 hdisk1 8.4 9.3 6.2 30 0 cd0 0.0 0.0 0.0 0 0
The iostat showed that the disks were relatively inactive. The most active disk was busy only 12.9% of the time. In addition, the amount of data read (Kb_read) for the disks was quite low.
In this case each of the Hdisks was actually a disk array that could support much higher throughput. Disk arrays make the disk volume look like one disk to the OS tools; therefore, it was important to understand the physical disk implementation when examining the output of these tools.
In reexamining the database and database manager configurations, the NUM_IOSERVERS configuration parameter warranted further investigation. The parameter was set as follows:
Number of I/O servers (NUM_IOSERVERS) = 1
DB2 was not taking advantage of I/O parallelism and was not reading from the disks in parallel. In addition, because the containers were on a RAID-5 disk, we checked the DB2_PARALLEL_IO registry variable to determine whether it was set for the table space(s) where the table was created. In this, case it was not enabled.
The number of prefetchers (I/O servers) was increased as follows:
update db cfg for sample using num_ioservers 6
In this case, the number of prefetchers was set to six because the disk volume is using RAID-5 technology with six data disks and one parity disk per array. Therefore, there were six disks servicing the table space for the data.
Parallel I/O was enabled as follows:
db2set DB2_PARALLEL_IO=*
The DB2 instance was stopped and restarted to enable the registry variable.
The db2batch output showing the timing of the create index is below. Although the create index time was improved, it was still higher than expected.
--------------------------------------------- Statement number: 1 create index x1 on staff (id) Elapsed Time is: 437.391 seconds ---------------------------------------------
In this case, two sessions were opened, and the vmstat and iostat were captured for the entire time the index was being created.
Some of the entries from the beginning and end of the vmstat.out and iostat.out files are below:
kthr memory page faults cpu ----- ----------- ------------------------ ------------ ----------- r b avm fre re pi po fr sr cy in sy cs us sy id wa 1 1 16243 29168 0 0 0 0 0 0 305 2890 870 14 6 80 0 2 1 16243 29165 0 0 0 0 0 0 212 2010 550 35 7 58 0 0 1 16243 29165 0 0 0 0 0 0 217 1175 584 34 2 64 0 0 1 16243 29165 0 0 0 0 0 0 224 1970 563 35 6 53 2 . . . 2 1 16243 29165 0 0 0 0 0 0 211 1968 550 25 7 61 7 0 1 16243 29165 0 0 0 0 0 0 217 1175 584 24 1 66 9 2 1 16328 29072 0 0 0 0 0 0 217 1202 584 23 3 67 7 Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 12.6 13.2 7.3 46 0 hdisk1 8.8 9.6 6.4 28 0 cd0 0.0 0.0 0.0 0 0 Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 10.6 11.3 6.9 37 0 hdisk1 9.8 10.9 6.8 33 0 cd0 0.0 0.0 0.0 0 0 Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 12.9 13.7 7.8 49 0 hdisk1 8.4 9.3 6.2 30 0 cd0 0.0 0.0 0.0 0 0 . . . Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 24.7 36.1 16.3 1 317 Hdisk1 24.5 36.0 16.1 2 311 cd0 0.0 0.0 0.0 0 0 Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 24.6 36.1 16.3 1 315 Hdisk1 24.9 36.9 16.4 2 321 cd0 0.0 0.0 0.0 0 0 Disks: % tm_act Kbps tps Kb_read Kb_wrtn Hdisk0 24.5 36.0 16.2 1 315 Hdisk1 24.6 36.4 16.5 2 323 cd0 0.0 0.0 0.0 0 0
An analysis of this information shows differences in the processing between the beginning and end of the create index processing. At the beginning of the processing, all of the I/O activity was shown to be reading the data file. At the end of the create index processing, all of the activity is write activity. This was due to the manner in which DB2 creates indexes.
Because DB2 logs only the creation of index pages and not the creation of every index key value, it must flush all of the new log pages to disk at the end of the index creation. With a CHNGPGS_THRESH setting of 60% and a 1.2-GB buffer pool, at the end of the index creation, there can be up to 720 MB of pages in the buffer pool that needs to be flushed to disk. By decreasing the value of the CHNGPGS_THRESH configuration parameter to 20%, there will be a maximum of only 240 MB of pages in the buffer pool to be flushed to disk. This can result in greatly improved index creation times.
The relevant database manager and database configuration parameters were adjusted, based on the analysis of their currently set values, and the appropriate values for the task being organized were determined. The changed pages threshold was updated as follows:
update db cfg for sample using chngpgs_thresh 20
The db2batch output showing the timing of the create index improved dramatically.
--------------------------------------------- Statement number: 1 create index x1 on staff (id) Elapsed Time is: 38.451 seconds ---------------------------------------------