B.3 Checking the Execution Plans

For this exercise, imagine that the base development is performed on Oracle, with later testing to check that the same SQL functions correctly and performs well on DB2 and SQL Server. You learned of this SQL because it performed more slowly than expected on Oracle, so you already suspect it leads to a poor execution plan on at least that database. You will need to check execution plans on the other databases, which have not yet been tested.

B.3.1 Getting the Oracle Execution Plan

Place the SQL in a file named tmp.sql and run the script ex.sql, as described in Chapter 3. The result is as follows:

PLAN
--------------------------------------------------------------------------------
SELECT STATEMENT
  SORT ORDER BY
    NESTED LOOPS
      NESTED LOOPS
        NESTED LOOPS
          NESTED LOOPS
            NESTED LOOPS
              TABLE ACCESS FULL 4*CUSTOMERS
              TABLE ACCESS BY INDEX ROWID 1*ORDERS
                INDEX RANGE SCAN ORDER_CUSTOMER_ID
            TABLE ACCESS BY INDEX ROWID 2*ORDER_DETAILS
              INDEX RANGE SCAN ORDER_DETAIL_ORDER_ID
          TABLE ACCESS BY INDEX ROWID 5*SHIPMENTS
            INDEX UNIQUE SCAN SHIPMENT_PKEY
        TABLE ACCESS BY INDEX ROWID 6*ADDRESSES
          INDEX UNIQUE SCAN ADDRESS_PKEY
      TABLE ACCESS BY INDEX ROWID 3*PRODUCTS
        INDEX UNIQUE SCAN PRODUCT_PKEY

You notice that your database is set up to use the rule-based optimizer, so you switch to cost-based optimization, check that you have statistics on the tables and indexes, and check the plan again, finding a new result:

PLAN
--------------------------------------------------------------------------------
SELECT STATEMENT
  SORT ORDER BY
    HASH JOIN
      TABLE ACCESS FULL 3*PRODUCTS
      HASH JOIN
        HASH JOIN
          HASH JOIN
            HASH JOIN
              TABLE ACCESS FULL 4*CUSTOMERS
              TABLE ACCESS FULL 1*ORDERS
            TABLE ACCESS FULL 2*ORDER_DETAILS
          TABLE ACCESS FULL 5*SHIPMENTS
        TABLE ACCESS FULL 6*ADDRESSES

Neither execution plan is close to the optimum plan. Instead, both the rule-based and the cost-based optimization plans drive from full table scans of large tables. The database ought to reach the driving table on a highly selective index, so you know that an improvement is certainly both necessary and possible.

B.3.2 Getting the DB2 Execution Plan

Place the SQL in tmp.sql and run the following command according to the process described in Chapter 3:

cat head.sql tmp.sql tail.sql | db2 +c +p -t

The result is an error; DB2 complains that it sees inconsistent column types in the condition on Phone_Number. You discover that the Phone_Number column is of the VARCHAR type, which is incompatible with the number type of the constant 6505551212.

Unlike Oracle, DB2 does not implicitly convert character-type columns to numbers when SQL compares inconsistent datatypes. This is just as well, in this case, since such a conversion might deactivate an index on Phone_Number, if there is one. You might even suspect, already, that this is precisely what has caused poor performance in the Oracle baseline development environment.

You fix the problem in the most obvious way, placing quotes around the phone number constant to make it a character type:

SELECT C.Phone_Number, C.Honorific, C.First_Name, C.Last_Name, C.Suffix, 
C.Address_ID, A.Address_ID, A.Street_Addr_Line1, A.Street_Addr_Line2, 
A.City_Name, A.State_Abbreviation, A.ZIP_Code, OD.Deferred_Ship_Date, 
OD.Item_Count, P.Prod_Description, S.Shipment_Date 
FROM Orders O, Order_Details OD, Products P, Customers C, Shipments S, 
Addresses A
WHERE OD.Order_ID = O.Order_ID
AND O.Customer_ID = C.Customer_ID
AND OD.Product_ID = P.Product_ID
AND OD.Shipment_ID = S.Shipment_ID
AND S.Address_ID = A.Address_ID
AND C.Phone_Number = '6505551212'
AND O.Business_Unit_ID = 10
ORDER BY C.Customer_ID, O.Order_ID Desc, S.Shipment_ID, OD.Order_Detail_ID;

Placing this new version of the SQL in tmp.sql, you again attempt to get the execution plan:

$ cat head.sql tmp.sql tail.sql | db2 +c +p -t
DB20000I  The SQL command completed successfully.
DB20000I  The SQL command completed successfully.

OPERATOR_ID TARGET_ID OPERATOR_TYPE OBJECT_NAME        COST
----------- --------- ------------- ------------------ -----------
          1         - RETURN        -                          260
          2         1 NLJOIN        -                          260
          3         2 NLJOIN        -                          235
          4         3 NLJOIN        -                          210
          5         4 TBSCAN        -                          185
          6         5 SORT          -                          185
          7         6 NLJOIN        -                          185
          8         7 NLJOIN        -                          135
          9         8 FETCH         CUSTOMERS                   75
         10         9 IXSCAN        CUST_PH_NUMBER              50
         11         8 FETCH         ORDERS                      70
         12        11 IXSCAN        ORDER_CUST_ID               50
         13         7 FETCH         ORDER_DETAILS               75
         14        13 IXSCAN        ORDER_DTL_ORD_ID            50
         15         4 FETCH         PRODUCTS                    50
         16        15 IXSCAN        PRODUCT_PKEY                25
         17         3 FETCH         SHIPMENTS                   75
         18        17 IXSCAN        SHIPMENT_PKEY               50
         19         2 FETCH         ADDRESSES                   75
         20        19 IXSCAN        ADDRESS_PKEY                50

  20 record(s) selected.

DB20000I  The SQL command completed successfully.
$

That's more like it, just the execution plan you chose when you analyzed the SQL top-down, except for the minor issue of reaching Products before Shipments, which will have virtually no effect on the runtime. Since the type inconsistency involving Phone_Number might require correcting on SQL Server and Oracle, you need to try this modified version immediately on the other databases.

B.3.3 Getting the SQL Server Execution Plan

Suspecting that you already have the solution to slow performance for this query, you fire up SQL Server's Query Analyzer and use set showplan_text on to see a concise view of the execution plan of the statement modified with C.Phone_Number = '6505551212' to correct the type inconsistency. A click on the Query Analyzer's Execute-Query button results in the following output:

StmtText                                        
-------------------------------------------------------------------------------
  |--Bookmark Lookup(...(...[Products] AS [P]))
       |--Nested Loops(Inner Join)
            |--Bookmark Lookup(...(...[Addresses] AS [A]))
            |    |--Nested Loops(Inner Join)
            |         |--Sort(ORDER BY:([O].[Customer_ID] ASC, [O].[Order_ID] DESC, 
(wrapped line) [OD].[Shipment_ID] ASC, [OD].[Order_Detail_ID] ASC))
            |         |    |--Bookmark Lookup(...(...[Shipments] AS [S]))
            |         |         |--Nested Loops(Inner Join)
            |         |              |--Bookmark Lookup(...(...[Order_Details] AS 
[OD]))
            |         |              |    |--Nested Loops(Inner Join)
            |         |              |         |--Filter(WHERE:([O].[Business_Unit_
ID]=10))
            |         |              |         |    |--Bookmark Lookup(...(...
[Orders] AS [O]))
            |         |              |         |         |--Nested Loops(Inner Join)
            |         |              |         |              |--Bookmark Lookup(...
(...
(wrapped line) [Customers] AS [C]))
            |         |              |         |              |    |--Index Seek(...
(...
(wrapped line) [Customers].[Customer_Phone_Number] 
(wrapped line) AS [C]), SEEK:([C].[Phone_Number]='6505551212') ORDERED)
            |         |              |         |              |--Index Seek(...(...
(wrapped line) [Orders].[Order_Customer_ID] AS [O]), 
(wrapped line) SEEK:([O].[Customer_ID]=[C].[Customer_ID]) ORDERED)
            |         |              |         |--Index Seek(...(...
(wrapped line) [Order_Details].[Order_Detail_Order_ID] 
(wrapped line) AS [OD]), SEEK:([OD].[Order_ID]=[O].[Order_ID]) ORDERED)
            |         |              |--Index Seek(...(...[Shipments].[Shipment_PKey] 
(wrapped line) AS [S]), SEEK:([S].[Shipment_ID]=[OD].[Shipment_ID]) ORDERED)
            |         |--Index Seek(...(...[Addresses].[Address_PKey] 
(wrapped line) AS [A]), SEEK:([A].[Address_ID]=[S].[Address_ID]) ORDERED)
            |--Index Seek(...(...[Products].[Product_PKey] 
(wrapped line) AS [P]), SEEK:([P].[Product_ID]=[OD].[Product_ID]) ORDERED)

(19 row(s) affected)

Good news! The corrected SQL leads to exactly the optimum plan here. Just out of curiosity, you check the execution plan for the original SQL, and you find the same result! Evidently, SQL Server is doing the data conversion on the constant, avoiding disabling the index.