The previous sections used the explain tool to examine DB2 access plans. The next sections will introduce the db2exfmt tool and discuss how to interpret its output. The db2exfmt tool reads the explain tables and builds an ASCII report with the explain information, as well as a graphical representation of the access plan.
The output of the db2exfmt tool includes the following information:
DB2 version and release level
Basic database configuration parameters
Original SQL statement text
The SQL statement as it was presented to the DB2 engine
"Optimized" SQL statement text
SQL-like representation of the query after it has been rewritten, views merged, constraints and triggers added
Triggers are not shown
An overview graph of the query access plan
Details of the LOw LEvel Plan OPerators (LOLEPOPs)
As shown below, the main information in the overview area is the version and release level of DB2, as well as the date and time when the tool was run.
DB2 Universal Database Version 8.1, 5622-044 (c) Copyright IBM Corp. 1991, 2002 Licensed Material - Program Property of IBM IBM DATABASE 2 Explain Table Format Tool ************************ EXPLAIN INSTANCE *************************** DB2_VERSION: 08.01.0 SOURCE_NAME: SQLC2E03 SOURCE_SCHEMA: NULLID SOURCE_VERSION: EXPLAIN_TIME: 2002-11-03-188.8.131.52000 EXPLAIN_REQUESTER: DWAINE
The database context area lists the configuration parameters that have the biggest impact on the performance of the database and its applications, including:
Buffer pool size
Sort heap size
Average number of applications
Database Context: ---------------- Parallelism: None CPU Speed: 1.094264e-006 Comm Speed: 1 Buffer Pool size: 80000 Sort Heap size: 4096 Database Heap size: 1200 Lock List size: 40 Maximum Lock List: 22 Average Applications: 1 Locks Available: 903
The package context indicates whether the SQL was dynamic or static, as well as the optimization level, isolation level, and degree of intra-partition parallelism used for the statement.
Package Context: ---------------- SQL Type: Dynamic Optimization Level: 5 Blocking: Block All Cursors Isolation Level: Cursor Stability --------------------- STATEMENT 1 SECTION 203 ---------------------- QUERYNO: 1 QUERYTAG: Statement Type: Select Updatable: No Deletable: No Query Degree: 1
The original statement contains the SQL statement as it was originally run by the application.
[View full width]Original Statement: ------------------- select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum (l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from tpcd.lineitem where l_shipdate <= date ('2002-12-01') - 90 day group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus
The optimized statement contains the SQL statement as it was rewritten by the query rewrite facility of the DB2 optimizer. The internal names (Q1, Q2, Q3) represent the table's position in the select list.
The query rewrite facility can also perform the following functions if applicable:
Automatically redirect the query to a materialized query table if one can be used to satisfy the query
Precompute constant expressions
[View full width]Optimized Statement: ------------------- SELECT Q3.$C7 AS "L_RETURNFLAG", Q3.$C6 AS "L_LINESTATUS", Q3.$C5 AS "SUM_QTY", Q3.$C4 AS "SUM_BASE_PRICE", Q3.$C3 AS "SUM_DISC_PRICE", Q3.$C2 AS "SUM_CHARGE",(Q3.$C5 / Q3.$C0) AS "AVG_QTY", (Q3.$C4 /Q3.$C0) AS "AVG_PRICE", (Q3.$C1 / Q3.$C0) AS "AVG_DISC", INTEGER(Q3.$C0) AS "COUNT_ORDER" FROM (SELECT SUM(Q2.$C2), SUM(Q2.$C3), SUM(Q2.$C4), SUM(Q2.$C5),SUM(Q2.$C6),SUM(Q2.$C7), Q2 .$C0, Q2.$C1 FROM (SELECT Q1.L_LINESTATUS, Q1.L_RETURNFLAG, Q1.COUNT, Q1.S5, Q1.S4, Q1.S3, Q1.S2, Q1.S1 FROM TPCD.L_SUMMARY AS Q1 WHERE (Q1.L_SHIPDATE <= '09/02/2002')) AS Q2 GROUP BY Q2.$C1, Q2.$C0) AS Q3 ORDER BY Q3.$C7, Q3.$C6
Access Plan: ----------- Total Cost: 23296.9 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 6 GRPBY ( 2) 23296.9 12728 | 24 MDTQ ( 3) 23296.9 12728 | 6 GRPBY ( 4) 23296.7 12728 | 6 TBSCAN ( 5) 23296.7 12728 | 6 SORT ( 6) 23296.7 12728 | 478775 TBSCAN ( 7) 22453.8 12728 | 496100 TABLE: TPCD.L_SUMMARY
The db2exfmt tool builds an ASCII/text graph of the access plan as above. The elements of the access plan are read from the bottom up.
Starting at the bottom of the access plan, we see that the base table accessed for this query is the L_SUMMARY table, and it has a cardinality of 496100 rows. The table is accessed via a table scan (relation scan) and the data then sorted. The output of the sort is then scanned, the data grouped on the specified column, and then directed to the coordinator partition, using a merge directed table queue. On the coordinator partition, the data is again grouped before being returned to the application.
The operators that can be encountered in the db2exfmt output can be broken into the following categories:
TBSCAN? Table Scan
IXSCAN? Index Scan
FETCH? Fetch from table
MSJOIN? Merge Scan Join
NLJOIN? Nested Loop Join
HSJOIN? Hash Join
GRPBY? Group By
TEMP? Insert into temp table
IXAND? Index ANDing
RIDSCA? Index ORing or List Prefetch
IXA? Star Schema Bitmap Indexing
BTQ? Broadcast Table Queue
DTQ? Directed Table Queue
MDTQ? Merge Directed Table Queue
MBTQ? Merge Broadcast Table Queue
LTQ? Local Table Queue, for intra-partition parallelism
In the following access plan example, the path of execution is read from the bottom up, and from left to right.
Each row that is found by the index scan (IXSCAN) in step 14 is passed to the nested loop join (NLJOIN) in step 13. The nested loop join (NLJOIN) then accesses the inner table, based on the join predicates and local predicates (if any) returned by the fetch (FETCH) in step 15, based on the index scan (IXSCAN) in step 16. Each joined row is returned from the nested loop join (NLJOIN) to the next operator in the access plan. Execution continues until the entire outer stream is exhausted.
| 3.87404 NLJOIN ( 13) 125.206 5 /-------+------\ 0.968511 4 IXSCAN FETCH ( 14) ( 15) 75.0966 100.118 3 4 | /----+---\ 4.99966e+06 4 1.99987e+07 INDEX: TPCD IXSCAN TABLE: TPCD UXP_NMPK ( 16) PARTSUPP 75.1018 3 | 1.99987e+07 INDEX: TPCD.UXPS_PK2KSC
However, the graph does not give all of the details. It is important to know why the fetch is required on the inner table and what columns are the tables being joined on. This information can be found in the access plan details.
For example, for the nested loop join in step 13, the detailed information is below. The cost information in the detailed information contains:
Total cost in units of timerons
Not the elapsed time in a serial environment
Based on the elapsed time in a parallel environment
Cost model is based on resource consumption
Total CPU and I/O resources consumed
Communication costs are considered in a parallel environment
Elapsed time could be different because of parallel I/O and overlap between CPU and I/O operations in a serial environment
Plan costs are cumulative
In general, each plan operator adds cost to the plan
Based on the detailed information below, the total cumulative cost is 125.206 timerons. The Re-Total Cost is the estimated cost to reexecute this sub plan. The Cumulative First Row Cost is the estimated cost to return the first row of the result set. The Estimated Bufferpool Buffers is the expected number of buffer pool pages required by this operator.
13) NLJOIN: (Nested Loop Join) Cumulative Total Cost: 125.206 Cumulative CPU Cost: 164264 Cumulative I/O Cost: 5 Cumulative Re-Total Cost: 0.062461 Cumulative Re-CPU Cost: 49744 Cumulative Re-I/O Cost: 0 Cumulative First Row Cost: 125.204 Estimated Bufferpool Buffers: 6 Arguments: --------- EARLYOUT: (Early Out flag) FALSE FETCHMAX: (Override for FETCH MAXPAGES) IGNORE ISCANMAX: (Override for ISCAN MAXPAGES) IGNORE
The arguments for the nested loop join indicate the following:
Indicates whether the optimizer will get the next outer row after finding the first match on the inner row. This guarantees one match on the inner.
Specifies the maximum number of pages to prefetch for a fetch or index scan.
A nested loop join can override the original settings if it is an ordered nested loop join.
13) NLJOIN: (Nested Loop Join) Predicates: ---------- 16) Predicate used in Join Relational Operator: Equal (=) Subquery Input Required: No Filter Factor: 5.00034e-08 Predicate Text: -------------- (Q1.PS_PARTKEY = Q2.P_PARTKEY)
The predicate information includes the estimated selectivity of the predicate, based on the table and column statistics, as well as the predicate being applied by the operator. In this case, the columns being joined are the PARTKEY columns in table Q1 and Q2.
The join will then have two input streams, one for the inner table and one for the outer table.
13) NLJOIN: (Nested Loop Join) Input Streams: ------------- 5) From Operator #14 Estimated number of rows: 0.968511 Partition Map ID 1 Partitioning: (MULT ) Multiple Partitions Number of columns: 3 Subquery predicate ID: Not Applicable Column Names: ------------ +$RID$+P_PARTKEY+P_NAME Partition Column Names: ---------------------- +1: PS_PARTKEY
The estimated stream cardinality from operator 14 in this case is .968511, and it is returning three columns. In this case, the operation is occurring on multiple partitions in the database. The partitioning key is PS_PARTKEY.
The estimated stream cardinality from operator 15, as seen below, is 4, and it is returning four columns. This operation is also occurring on multiple partitions, and the partitioning key is PS_PARTKEY.
13) NLJOIN: (Nested Loop Join) Input Streams: ------------- 9) From Operator #15 Estimated number of rows: 4 Partition Map ID: 1 Partitioning: (MULT ) Multiple Partitions Number of columns: 4 Subquery predicate ID: Not Applicable Column Names: ------------ +PS_PARTKEY(A)+PS_SUPPKEY(A)+$RID$+PS_AVAILQTY Partition Column Names: ---------------------- +1: PS_PARTKEY
The detailed information for the fetch operation shows that the columns PS_PARTKEY and PS_SUPPKEY are being passed to the fetch from the index scan in operation 16, and the fetch is then retrieving the PS_AVAILQTY column from the table PARTSUPP. The PS_AVAILQTY column must be retrieved from the table because it is not contained in the index used in operator 16.
15) FETCH : (Fetch) Arguments: --------- ... Input Streams: ------------- 7) From Operator #16 Column Names: ------------ +PS_PARTKEY(A)+PS_SUPPKEY(A)+$RID$ 8) From Object TPCD.PARTSUPP Column Names: ------------ +PS_AVAILQTY
In the index scan in step 16, the optimizer is applying a start and stop predicate to the index scan. The scan will read only the index leaf pages where Q1.PS_PARTKEY = Q2.P_PARTKEY; it does not need to scan the entire index. From step 16, the estimated number of rows returned by the index scan is four.
16) IXSCAN: (Index Scan) Predicates: ---------- 16) Start Key Predicate Relational Operator: Equal (=) Subquery Input Required: No Filter Factor: 5.00034e-08 Predicate Text: -------------- (Q1.PS_PARTKEY = Q2.P_PARTKEY) 16) Stop Key Predicate Relational Operator: Equal (=) Subquery Input Required: No Filter Factor: 5.00034e-08 Predicate Text: -------------- (Q1.PS_PARTKEY = Q2.P_PARTKEY)
The details for the sort operation in step 16 of the explain plan below indicate that I/O occurred during the sort. Therefore, the sort must have overflowed and could not be accomplished within the sort heap.
3.65665e+07 TBSCAN ( 15) 6.87408e+06 1.45951e+06 | 3.65665e+07 SORT ( 16) 6.14826e+06 1.30119e+06 | 3.65665e+07 TBSCAN ( 17) 2.00653e+06 1.14286e+06 | 3.74999e+07 TABLE: TPCD ORDERS
The detailed information about the table scan following the sort (i.e., step 15 above) will list the estimated number of buffer pool buffers, which gives an estimate for the size of the overflowed temporary table. Based on the following piece of the explain graph, the estimated size of the overflowed sort table will be 163976 pages.
15) TBSCAN: (Table Scan) . . . Estimated Bufferpool Buffers: 163976
The following explain graph shows an example of list prefetch. In step 12, the index scan is applying the predicates and returning the row identifiers (RIDs) to the sort operation in step 11. The RIDs are then sorted based on the page number, and passed to the RID Scan (RIDSCN) operation in step 10. The RID scan will build a list of the pages and call the prefetchers to retrieve the pages into the buffer pool. The fetch operation in step 9 can then fetch and process the pages because they should already be in the buffer pool, due to the work of the prefetchers.
455.385 FETCH ( 9) 308.619 61.2878 /----+---\ 455.385 15009 RIDSCN TABLE: TPCD ( 10) L_SUMMARY2 219.093 17.4697 | 455.385 SORT ( 11) 219.091 17.4697 | 455.385 IXSCAN ( 12) 218.559 17.4697 | 15009 INDEX: TPCD L_SUMMARY2_IDX
The following explain graph shows an example of index ORing. In steps 6, 8, and 10, the index scan is applying the predicates and returning the RIDs to the sort operations above. The RIDs are then sorted based on the page number, any duplicates are eliminated, and the results are then passed to the RID Scan (RIDSCN) operation in step 4. The RID scan will build a list of the pages and call the prefetchers to retrieve the pages into the buffer pool. The fetch operation in step 3 can then fetch and process the pages because they should already be in the buffer pool, due to the work of the prefetchers. In this case, the fetch operation must reapply the predicates due to the OR predicates.
59537.2 FETCH ( 3) 62819.1 37361.7 /----+---\ 59537.2 1.50002e+08 RIDSCN TABLE: TPCD ( 4) LINEITEM 803.781 82.2908 | +-----------------+-----------------+ 59383.3 150.002 4.00006 SORT SORT SORT ( 5) ( 7) ( 9) 653.127 75.5534 75.1057 76.2908 3 3 | | | 59383.3 150.002 4.00006 IXSCAN IXSCAN IXSCAN ( 6) ( 8) ( 10) 510.773 75.4177 75.1022 76.2908 3 3 | | | 1.50002e+08 1.50002e+08 1.50002e+08 INDEX: INDEX: INDEX: TPCD.L_SD TPCD.L_SK_PK TPCD.L_OK
The following explain graph shows an example of index ANDing. In steps 7 and 8, the index scan is applying the predicates and returning the RIDs from the index. The index ANDing (IXAND) operation then hashes the RIDs into the dynamic bitmap and starts returning the RIDs as it works on the last index. The RIDs are then sorted based on the page number, any duplicates are eliminated, and the results are then passed to the RIDSCN operation in step 4. The RID scan will build a list of the pages and call the prefetchers to retrieve the pages into the buffer pool. The fetch operation in step 3 can then fetch and process the pages because they should already be in the buffer pool, due to the work of the prefetchers. In this case, the fetch operation must reapply the predicates because the bitmap used is a reducing bitmap, and not all "qualified" rows are truly qualified.
As shown by the access plan graph below, index ANDing is considered when there is a large number of rows to process, but the expected result set is relatively small. In this case, the indexes scanned had approximately 250000 rows and 500000 rows, respectively, but the expected number of rows returned by the fetch is only four.
| 4.4314 FETCH ( 3) 5475.6 1952.4 /----+---\ 886.281 1.50002e+08 RIDSCN TABLE: TPCD ( 4) LINEITEM 4027.9 1100.96 | 886.281 SORT ( 5) 4027.9 1100.96 | 886.281 IXAND ( 6) 4026.01 1100.96 /------+-----\ 248752 534445 IXSCAN IXSCAN ( 7) ( 8) 1480.95 2509.07 430.024 670.935 | | 1.50002e+08 1.50002e+08 INDEX: TPCD INDEX: TPCD L_OK L_SD
Query predicates can be handled in two different manners within DB2:
When data is being fetched, the most straightforward approach is to return one row at a time and wait for the next request to fetch another one. Each returned row is then evaluated to determine whether it matches the given predicate(s). After evaluating the row against the given predicates, the next row is returned, and the same cycle is executed until reaching the end of the table being scanned. Although this approach might be the simplest, it causes a large number of round trips, which can turn into a performance penalty.
An alternative is when a record is fetched, to reference it in memory and directly evaluate the predicate to determine whether the row qualifies. If the row does qualify, it is returned at the end of the predicate evaluation. If the row does not qualify, the next row is fetched immediately.
A predicate that can be processed using method 2, described above, is known as a sargable predicate (SARG). There are three types of sargable predicates:
BLOCK? These predicates are resolved while scanning the block index scan for a multi-dimensional clustering (MDC) table.
INDEX? These predicates are resolved while performing a conventional index scan.
DATA (Database Managed Space, or DMS)? These predicates are resolved while scanning the data pages.
All predicates for which method 2, described above, cannot be applied are referred to as residual predicates (RES). In this case, the rows must be returned one at a time and evaluated, as described in method 1.
Based on the definitions and descriptions given above, it is evident that residual predicates are systematically more expensive in their processing than are sargable predicates. However, it is sometimes impossible to "push down" the evaluation of a predicate to make it sargable, and there are two main reasons for this:
LOB and Long Varchar data types cannot be evaluated directly without retrieving the row because, when fetching LOBs and Long Varchars, only a locator is returned, not the actual data.
The evaluation of a predicate requires that more than one page of data be fixed in the buffer pool.
As an example, consider a predicate comparing two columns from two different tables: The first row would need to be fetched from the first data page, and the second row would need to be fetched simultaneously from a different data page before the comparison can be made.
Performance problems in a database can be a result of:
The instance and database configuration parameter settings
The physical layout of the database
Inefficient SQL, normally due to incorrect/inefficient indexing
The next section will focus on analyzing SQL statements to determine what, if any, indexes should be created to help optimize problem SQL statements.
There are numerous methods that can be used to analyze the executing SQL statements to determine which statements, if any, need to be investigated. These include:
Statement event monitor
Dynamic SQL snapshot
SQL snapshot table function
The event monitor tracks each execution of an SQL statement, whereas the snapshot monitor and table function provide one entry for each unique statement, along with information such as the number of times the statement was executed, the total execution time, the number of sorts performed, etc.
To obtain an SQL snapshot or get valid information from the dynamic SQL table function, the STATEMENT snapshot monitor switch must be turned on. If the switch is turned on at the session level, the get snapshot or table function must be run in the same session. If the switch is set at the DB2 instance level, the get snapshot or table function can be run from any session with a connection to the database.
To set the STATEMENT monitor switch at the instance level, first run the following command, then stop and restart the DB2 instance.
update dbm cfg using dft_mon_stmt on
Analyzing the output of the SQL statement snapshot or table function can take a good deal of time, especially if a large number of statements have been run. A spreadsheet can be used to aid in the analysis of this information; however, the output of the SQL snapshot must be parsed and converted to a format that is readable by a spreadsheet first. Appendix B, parseSQL.pl, is a Perl program that will parse the output of the SQL snapshot and convert the information to comma-delimited format so it can be imported into a spreadsheet.
The output of the SQL snapshot table function can be either exported directly to ASCII delimited format to be analyzed using the same spreadsheet or inserted into a table and analyzed using SQL statements. To make the insertion into a table easiest, export the table in IXF format, so that the table to store the data does not need to exist. For example:
export to try.ixf of ixf SELECT * FROM TABLE(SNAPSHOT_DYN_SQL('SAMPLE',-1 )) as SNAPSHOT_DYN_SQL
To import this into a table to be analyzed, the table need not exist because the IXF file contains the table definition. The command to create a new table and import the data is as follows:
import from try.ixf of ixf create into sqlsnap
To capture the SQL snapshot using the table function and create an ASCII delimited file for analysis using a spreadsheet, use the following command:
export to try.del of del SELECT * FROM TABLE(SNAPSHOT_DYN_SQL('SAMPLE',-1 )) as SNAPSHOT_DYN_SQL
The information reported by the SQL snapshot or table function that is important when looking for poorly performing SQL is:
Number of executions
Total execution time
A statement that is run once a day and takes 10 seconds is much less important than one that runs 1,000 times a day and takes 10 seconds to run each time.
The Rows read gives an indication of the efficiency of the statement. A large number of rows read normally indicates that a table scan is being performed to resolve the query. Even if the query runs quickly, performing table scans is not efficient, and the query should be analyzed to determine whether it is indexed correctly.
The rows written and statement sorts elements of the snapshot are normally examined together. Overflowed sorts cause rows to be written to temporary tables in the database; therefore, if the statement is causing sorts to occur and there are no rows written, the sort is occurring within the sort heap. If there are rows written, the sort very likely overflowed. Regardless of whether or not the sort overflowed, all statements with sorts should be examined. Another item to examine is the average number of sorts per execution. To calculate this, divide the statement sorts by the number of executions, as follows:
Sorts per execution = (Statement sorts / Number of executions)
Any statement with more than one sort per execution should be examined immediately.
The total execution time is for all executions of the statement. Even though a statement may have a total execution time of 5,000 seconds, if it was run 20,000 times, this is not excessive. However, if it was run only once, this is very excessive. Therefore, it is important to examine the average execution time as follows:
Avg execution time = (Total execution time / Number of executions)
For example, in the following SQL snapshot entry, the total execution time is 2.25 seconds, which seems like a lot for a simple statement like this. However, the statement was executed over 15,000 times, so the average execution time is very low. There are no sorts happening when this statement is run, so in this case everything appears to be fine.
Number of executions = 15616 Number of compilations = 1 Worst preparation time (ms) = 11 Best preparation time (ms) = 11 Internal rows deleted = 0 Internal rows inserted = 0 Rows read = 0 Internal rows updated = 0 Rows written = 0 Statement sorts = 0 Total execution time (sec.ms) = 2.254846 Total user cpu time (sec.ms) = 0.980000 Total system cpu time (sec.ms) = 0.100000 Statement text = SELECT SUM(B.DEDUCTION) ...
For the following statement, the total execution time is 23.35 seconds for 29 executions. Although each execution is under one second, the most interesting piece of information in this snapshot is the fact that the 29 executions are causing 6604 sorts to occur. Although the statement may not be taking an excessive amount of time to run, it is using a great deal of resources to perform this many sorts and should be examined to determine whether the underlying tables are indexed correctly.
Number of executions = 29 Number of compilations = 1 Worst preparation time (ms) = 35 Best preparation time (ms) = 35 Internal rows deleted = 0 Internal rows inserted = 0 Rows read = 4 Internal rows updated = 0 Rows written = 0 Statement sorts = 6604 Total execution time (sec.ms) = 23.352040 Total user cpu time (sec.ms) = 23.590000 Total system cpu time (sec.ms) = 0.000000 Statement text = SELECT PAGE_NUM ...
When a poorly performing SQL statement is found, the following steps can be used to help determine the cause of the problem and find a solution:
Ensure that all tables in the query (and their indexes) have current statistics.
Distribution table statistics and detailed index statistics should be gathered.
Determine whether new/changed indexes will help.
Looking at the query sometimes is enough to determine a good index.
For complex queries, use the index advisor.
Determine whether an MQT will help.
Especially for a query that is executed many times
An indicator that the current indexes on a table may not be defined correctly (or that more indexes are needed) is when the SQL statement is performing one or more sorts per execution.
For the piece of the SQL snapshot shown below, a sort is required each time the statement is executed.
Number of executions = 378 . . . Statement sorts = 378 Statement text = SELECT custkey, custname FROM user1.customer ORDER BY custkey DESC
To determine whether the table has current statistics, execute the following statement:
select stats_time from syscat.tables where tabname='CUSTOMER'
If the output of the above statement is as follows:
STATS_TIME -------------------------- 2002-10-06-09.45.13.468000 1 record(s) selected.
and the current date is October 6, unless there has been a large number of inserts or deletes since 9:45 A.M., the statistics are current. The next step is to examine the current indexes defined on the CUSTOMER table, as follows:
describe indexes for table user1.customer
View the output of the above statement, as follows:
[View full width]Index schema Index name Unique rule Number of columns -------------- ---------------- -------------- -------------- 0 record(s) selected. SQL0100W No row was found for FETCH, UPDATE or DELETE; or the result of a query is an empty table. SQLSTATE=02000
In this case, there are no indexes that can be used by the optimizer to retrieve the data columns or to help order the data. Therefore, a good first step is to examine the statement to determine what indexes may help.
In this example, there are two columns being selected from the table; therefore, an index can be created on these two columns. The ORDER BY clause in the statement is descending (DESC); therefore, the index should be created either with the CUSTKEY column in descending order or using the ALLOW REVERSE SCANS option.
The order of the key columns in the index is also important. The key column with the highest cardinality (i.e., most unique values) should be the first key in the index because it is the most selective and will narrow down the search must faster.
To determine the cardinality of the key columns for the above select statement, use the following:
select colname, colcard from syscat.columns where tabname='CUSTOMER'
The output of the above statement looks like:
COLNAME COLCARD ------- ------- CUSTKEY 1709 CUSTNAME 203 ADDRESS 609 3 record(s) selected.
In this case, the customer key column (CUSTKEY) has the highest cardinality and, therefore, should be the first column in the index. The CUSTKEY column is also a unique identifier within this table. Thus, there are two options in this case:
Create a non-unique index with CUSTKEY and CUSTNAME.
Create a unique index on CUSTKEY and include the CUSTNAME column.
Therefore, any of the following four indexes would help improve the performance of this statement:
create index cust_ix on customer (custkey desc, custname) create index cust_ix on customer (custkey, custname) allow reverse scans create unique index cust_ix on customer (custkey desc) include (custname) create unique index cust_ix on customer (custkey) include (custname) allow reverse scans
The fact that the CUSTKEY column is unique also helps the optimizer choose better access plans; therefore, if a unique index can be used, this is preferable.
Because the table has current statistics, when the index is created, collect the index statistics at the same time, as follows:
create unique index cust_ix on customer (custkey) include (custname) allow reverse scans collect detailed statistics
After creating the index and rerunning the application, the SQL snapshot information for this statement looks like the following:
Number of executions = 378 . . . Statement sorts = 0 Statement text = SELECT custkey, custname FROM user1.customer ORDER BY custkey DESC
The following statement has been found to be running slowly:
select empid, empname from employees order by empid
If there are no indexes on the table, the first step should be to create one or more indexes to help select the EMPID and EMPNAME columns and sort the EMPID column. Before adding the index, the access plan for the statement looks like the following:
Access Plan: ----------- Total Cost: 4709.851 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 9600 TBSCAN ( 2) 470.851 109 | 9600 SORT ( 3) 470.849 109 | 9600 TBSCAN ( 4) 454.206 109 | 9600 TABLE: USER1 EMPLOYEES
Because the EMPID column is unique in this table, create an index and capture index statistics at the same time, as follows:
create unique index emp_ix on employees (empid) collect detailed statistics Access Plan: ----------- Total Cost: 650.384 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 109 FETCH ( 2) 650.384 2 /----+---\ 109 109 IXSCAN TABLE: USER1 ( 3) EMPLOYEES 0.248997 0 | 109 INDEX: USER1 EMP_IX
The new index has eliminated the sort from the access plan and has greatly reduced the query's cost; however, the query can still be improved. Because the query is accessing only two columns, drop the index above and create a new index with both columns to eliminate the need to FETCH the rows from the table, as follows:
create unique index emp_ix on employees (empid) include (empname) collect detailed statistics
In this case, the index contains all of the columns being selected, so the entire query should be able to be handled by the index, without needing to read data from the table. This is known as index-only access and is the fastest type of data access. The new access plan is shown below:
Access Plan: ----------- Total Cost: 50.2709 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 109 IXSCAN ( 2) 50.2709 2 | 109 INDEX: USER1 EMP_IX
In an index-only access plan, the index is scanned (IXSCAN) and the data returned either to the application or to another database operation other than a fetch from the base table, as shown previously.
When the DB2 optimizer compiles SQL statements, it can rewrite them into a form that can be optimized more easily. The optimizer then generates a number of alternative execution plans for satisfying the SQL statement. It estimates the execution cost of each alternative plan, using the statistics for tables, indexes, columns, and functions, and chooses the plan with the lowest estimated execution cost.
The optimizer must choose an access plan that will produce the result set for the query that was submitted. Therefore, as noted in the following guidelines, the query should be written to retrieve only the data that is required. This helps to ensure that the optimizer can choose the best access plan.
Some guidelines for writing efficient SELECT statements are:
Specify only the required columns.
Do not use select * unless all columns are needed.
Limit the number of rows returned.
Specify the FOR UPDATE clause if applicable.
Specify the OPTIMIZED FOR n ROWS clause.
Specify the FETCH FIRST n ROWS ONLY clause if applicable.
Specify the FOR FETCH ONLY clause if applicable.
Avoid data type conversions if possible.
Particularly numeric data type conversions.
Specify only those columns that are needed in the select list. Although it may be simpler to specify all columns with an asterisk (*), needless processing and returning of unwanted columns can result in slower response time.
Limit the number of rows selected by using predicates to restrict the answer set to only those rows that you require. There are four types of predicates, each with its own distinct method of processing and associated cost. The type of predicate is determined by how and when that predicate is used in the evaluation process. These predicate types are listed below, ordered in terms of performance, starting with the most favorable:
Range delimiting predicates
Index SARGable predicates
Data SARGable predicates
Range delimiting predicates are those used to define the start key and/or stop key for an index search.
Index SARGable predicates are not used to define the start/stop key for an index search but can be evaluated from the index because the columns involved in the predicate are part of the index key. For example, assume a table named STAFF and an index defined on the columns NAME, DEPT, and SVC_YEARS in the table. For the following SQL statement:
SELECT name, job, salary FROM staff WHERE name = 'John' and dept = 10 and svc_years > 5
The predicates name='John' and dept=10 would be range delimiting predicates, whereas svc_years > 5 would be evaluated as an index SARGable predicate, because the start key value for the index search cannot be determined by this information only. The start key value may be 6, 10, or even higher.
If the statement were written as