Oracle uses a SQL-centric approach to generating and displaying execution plans. You use SQL to place plan data into a table, after which you can view the data with a normal SQL query. The process can seem awkward at first, especially if you perform it manually. SQL Server sends execution-plan descriptions directly to your screen upon request, but Oracle's SQL-centric approach, writing to a plan table, is much more flexible when you wish to automate the process or analyze whole sets of execution plans at once.
Oracle places execution-plan data into a table, which is normally called PLAN_TABLE. If you do not already have a PLAN_TABLE in the schema you are using to investigate execution plans, create one. You can create an up-to-date PLAN_TABLE with the utlxplan.sql script in the rdbms/admin directory under ORACLE_HOME. If you cannot reach ORACLE_HOME, you can create a serviceable PLAN_TABLE with this script:
CREATE TABLE PLAN_TABLE( STATEMENT_ID VARCHAR2(30), TIMESTAMP DATE, REMARKS VARCHAR2(80), OPERATION VARCHAR2(30), OPTIONS VARCHAR2(30), OBJECT_NODE VARCHAR2(128), OBJECT_OWNER VARCHAR2(30), OBJECT_NAME VARCHAR2(30), OBJECT_INSTANCE NUMBER(38), OBJECT_TYPE VARCHAR2(30), OPTIMIZER VARCHAR2(255), SEARCH_COLUMNS NUMBER(38), ID NUMBER(38), PARENT_ID NUMBER(38), POSITION NUMBER(38), COST NUMBER(38), CARDINALITY NUMBER(38), BYTES NUMBER(38), OTHER_TAG VARCHAR2(255), OTHER LONG);
You use a four-step process from SQL*Plus to generate and display execution plans on Oracle with the least interference to other end users, who may also be using the plan table:
Delete all rows from Oracle's special execution-plan table PLAN_TABLE in the schema you are using to generate the execution plans. You can generate an execution plan for a SQL statement only from a database user that has the privilege to run that SQL statement. Therefore, you usually generate execution plans while connected to the same schema in which the SQL to be tuned runs.
|
Generate the execution-plan records in PLAN_TABLE with the SQL statement EXPLAIN PLAN FOR <Statement_To_Be_Tuned>;.
Display the execution plan with a statement like this:
SELECT LPAD(' ',2*(LEVEL-1))||OPERATION||' '||OPTIONS||' '|| DECODE(OBJECT_INSTANCE, NULL, OBJECT_NAME, TO_CHAR(OBJECT_INSTANCE)||'*'|| OBJECT_NAME) PLAN FROM PLAN_TABLE START WITH ID=0 CONNECT BY PRIOR ID = PARENT_ID ORDER BY ID;
Clean up your work with ROLLBACK;.
Let's follow this process to analyze the execution plan for a simple query:
SELECT Last_Name, First_Name, Salary FROM Employees WHERE Manager_ID=137 ORDER BY Last_Name, First_Name;
Following is the actual content of a SQL*Plus session to manually determine the execution plan of this query:
SQL> delete from plan_table; 0 rows deleted. SQL> EXPLAIN PLAN FOR SELECT Last_Name, First_Name, Salary FROM Employees 2 WHERE Manager_ID=137 3 ORDER BY Last_Name, First_Name; Explained. SQL> SELECT LPAD(' ',2*(LEVEL-1))||OPERATION||' '||OPTIONS||' '|| 2 DECODE(OBJECT_INSTANCE, NULL, OBJECT_NAME, 3 TO_CHAR(OBJECT_INSTANCE)||'*'|| OBJECT_NAME) PLAN 4 FROM PLAN_TABLE 5 START WITH ID=0 6 CONNECT BY PRIOR ID = PARENT_ID 7 ORDER BY ID; PLAN -------------------------------------------------------------------------------- SELECT STATEMENT SORT ORDER BY TABLE ACCESS BY INDEX ROWID 1*EMPLOYEES INDEX RANGE SCAN EMPLOYEES_MANAGER_ID 4 rows selected. SQL> rollback; Rollback complete.
This shows an execution plan that finds the index range (on the index Employees_Manager_ID) that covers employees who report to the manager with ID 137. That index range scan (as shown in the last row of output above the feedback 4 rows selected) delivers a list of rowids that point to specific rows in specific blocks of the Employees table. For each of those rowids, Oracle performs logical I/O and, if necessary, physical I/O to the necessary table block, where it finds the specific row indicated. Following the table reads, Oracle sorts the rows in ascending order, based on the indicated ORDER BY columns.
To a beginner, Oracle's process for displaying execution plans looks clumsy, I know, but you can automate the underlying steps with a little simple scripting. If you are working from Unix, create the following files:
-- File called head.sql: set pagesize 999 set feedback off DELETE FROM PLAN_TABLE WHERE STATEMENT_ID = '<Your name>'; EXPLAIN PLAN SET STATEMENT_ID = '<Your name>' FOR -- File called tail.sql: SELECT LPAD(' ',2*(LEVEL-1))||OPERATION||' '||OPTIONS||' '|| DECODE(OBJECT_INSTANCE, NULL, OBJECT_NAME, TO_CHAR(OBJECT_INSTANCE)||'*'|| OBJECT_NAME) PLAN FROM PLAN_TABLE START WITH ID=0 AND STATEMENT_ID = '<Your name>' CONNECT BY PRIOR ID = PARENT_ID AND STATEMENT_ID = '<Your name>' ORDER BY ID; ROLLBACK; -- File called ex.sql: !cat head.sql tmp.sql tail.sql > tmp2.sql spool tmp.out @tmp2 spool off
You then can iterate execution plans rapidly by editing a copy of the SQL in question (complete with terminating ;) in tmp.sql, using the editor of your choice, in one window. In another window, start a SQL*Plus session from the directory that holds head.sql, tail.sql, ex.sql, and tmp.sql. Generate new execution plans for the current version of tmp.sql (after you save it!) by issuing the command @ex from the SQL> prompt in the window that is running SQL*Plus. The process for analyzing and displaying execution plans then becomes:
Place the bare SQL to be analyzed into tmp.sql, in the same directory as ex.sql, head.sql, and tail.sql.
From a SQL*Plus session started in that same directory, run @ex from the SQL> prompt.
View the execution plan.
Tweak the database (for example, with index changes) and the SQL to be tuned in tmp.sql (following the methods of Chapter 4).
Save tmp.sql and loop back to Step 2. Repeat until you have the execution plan you want, and then save the corrected result someplace permanent.
With this process, it takes just seconds to make a change and see the results. If you need to print the execution plan or to view it with an editor (especially if it is large), it is already available, spooled to the file tmp.out.
In operating systems other than Unix, you can try similar tricks or you can always just add the contents of head.sql to the top of tmp.sql, add the contents of tail.sql to the bottom, and run @tmp from the SQL> prompt, an approach that works in any operating system.
In practice, about half the changes you will make to force the execution plan you want will be to tmp.sql, and the other half will be to the database environment, through SQL*Plus, with operations such as creating and dropping indexes, generating table and index statistics, or modifying session optimization parameters.
When tuning SQL, you'll usually want to verify that you are getting simple execution plans that drive through nested loops in the correct join order. I refer to these execution plans as robust, because they tend to scale well to high data volumes. Here's an example that returns a robust plan, to make the process clear, with the following SQL statement to be tuned, placed in tmp.sql:
-- File called tmp.sql SELECT /*+ RULE */ E.First_Name, E.Last_Name, E.Salary, LE.Description, M.First_Name, M.Last_Name, LM.Description FROM Locations LE, Locations LM, Employees M, Employees E WHERE E.Last_Name = :1 AND E.Manager_ID=M.Employee_ID AND E.Location_ID=LE.Location_ID AND M.Location_ID=LM.Location_ID AND UPPER(LE.Description)=:2;
From SQL*Plus, in the directory with tmp.sql, head.sql, tail.sql, and ex.sql, the command @ex from the SQL> prompt produces the following output, with indexes only on the primary keys and on Employees(Last_Name):
SQL> @ex PLAN ---------------------------------------------------------------------- SELECT STATEMENT NESTED LOOPS NESTED LOOPS NESTED LOOPS TABLE ACCESS BY INDEX ROWID 4*EMPLOYEES INDEX RANGE SCAN EMPLOYEE_LAST_NAME TABLE ACCESS BY INDEX ROWID 3*EMPLOYEES INDEX UNIQUE SCAN EMPLOYEE_PKEY TABLE ACCESS BY INDEX ROWID 2*LOCATIONS INDEX UNIQUE SCAN LOCATION_PKEY TABLE ACCESS BY INDEX ROWID 1*LOCATIONS INDEX UNIQUE SCAN LOCATION_PKEY SQL>
|
Here is how you read the execution-plan output:
All joins are nested loops, based on the nested series of rows stating NESTED LOOPS. If you have a mix of join methods, the first join executed will be the innermost (most highly indented) one, the last one listed. You'll read the order of join methods executed from the inside out, or from the bottom up.
|
The order of table access is Employees, twice, followed by Locations, twicethe same order they appear in the execution-plan output. When SQL references the same tables multiple times, aliases for those tables are mandatory. As you can see in the example FROM clause, the Employees table is aliased to both E and M. You might guess from the index choices that alias E, rather than alias M, represents the driving table, even though both aliases map to the same Employees table. It is less obvious which alias mapping to Locations the database reaches first. This is where the numbers in front of the table names come in: they indicate the order of the alias reference in the FROM clause, so you know that the first Locations alias, LE, is actually the last one the execution plan reaches.
|
All four table reads are through some index, as shown by the phrase TABLE ACCESS BY INDEX ROWID in front of each table name. The indexes used, and indication of whether the index use is guaranteed to be unique, come in the indented entries just below each table access. Thus, you know that the driving table E is reached through an index range scan (a read that at least potentially touches multiple rows at a time) on the index EMPLOYEE_LAST_NAME. The rest of the table accesses are unique reads through the tables' primary keys. Since all reads after the driving table are for unique joins, you know that the query will read at most the same number of rows for each of these other tables as it reads for the driving table.
|
When you find unique scans on an index, you can safely assume they serve an equality condition on a unique key. There is usually only one column or combination of columns the index could cover to provide that unique scan, but even if there is a choice, it does not especially matter which unique condition the database uses, so you can safely guess. Index range scans are another matter. If you do not already know the indexes for a table and how they are named for each combination of columns, and if the index names do not resolve the question, always check in case the index range scan is not the one you expected. The simplest script to provide this check is as follows:
-- File called index.sql column column_name format a40 set pagesize 999 SELECT INDEX_NAME, COLUMN_NAME FROM USER_IND_COLUMNS WHERE TABLE_NAME = UPPER('&&1') ORDER BY INDEX_NAME, COLUMN_POSITION;
From SQL*Plus, logged into the schema that holds the table you need to check, run @index <NameOfTable> from the SQL> prompt. The script lists multicolumn indexes in order, first column first. Here is an example use of this script:
SQL> @index Locations INDEX_NAME COLUMN_NAME ------------------------------ ---------------------------------------- LOCATION_PKEY LOCATION_ID SQL>
To see functional indexes, where those would apply (usually where you are matching UPPER(<Some_Column>) or LOWER(<Some_Column>), or a type conversion on a column), use the findex.sql script:
-- File called findex.sql set long 40 set pagesize 999 SELECT INDEX_NAME, COLUMN_EXPRESSION FROM USER_IND_EXPRESSIONS WHERE TABLE_NAME = UPPER('&&1') ORDER BY INDEX_NAME, COLUMN_POSITION;
I just explained how to find the join order, the join methods, and the table-access methods for the robust execution plan I showed earlier. If you combine that with the basics covered in Chapter 2, you should understand how Oracle will reach the data, from end to end. To test your understanding, try constructing a narrative that explains the full execution plan in English, as a set of instructions to the database. Compare your result with what follows. If it does not match well, try again later, after you have read a few more execution plans, to see if your understanding has improved. Here is the execution plan expressed in narrative form, as instructions to the database:
Using the condition E.Last_Name = :1, go to the index EMPLOYEE_LAST_NAME and find the list of rowids that correspond to employees with the requested last name.
For each of these rowids, go to the table Employees (E) with a single-block read (logical read, physical when necessary) according to each rowid from the previous step, using the block-address part of the rowid. Using the row-address part of the rowid, find the specific row that the rowid points to and read all necessary data (requested data for alias E) from that row.
For each such row, using the join condition E.Manager_ID=M.Employee_ID, go to the primary-key index EMPLOYEE_PKEY to find a single matching rowid that corresponds to the employee record of the manager for the employee whose record you already read. If no matching row is found, discard the result row being built.
Otherwise, for the matching rowid, go to the table Employees (M) with a single-block read (logical read, physical when necessary) according to the rowid from the previous step, using the block-address part of the rowid. Using the row-address part of the rowid, find the specific row that the rowid points to and read all necessary data (requested data for alias M) from that row. Append the applicable data to the incoming row from the earlier table read to build a partial result row.
For each such row, using the join condition M.Location_ID=LM.Location_ID, go to the primary-key index LOCATION_PKEY to find a single matching rowid that corresponds to the location record that matches the manager for the employee whose record you already read. If no matching row is found, discard the result row being built.
Otherwise, for the matching rowid, go to the table Locations (LM) with a single-block read (logical read, physical when necessary) according to the rowid from the previous step, using the block-address part of the rowid. Using the row-address part of the rowid, find the specific row that the rowid points to and read all necessary data (requested data for alias LM) from that row. Append the applicable data to the incoming row from the earlier table reads to build a partial result row.
For each such row, using the join condition E.Location_ID=LE.Location_ID, go to the primary-key index LOCATION_PKEY to find a single matching rowid that corresponds to the location record that matches the employee whose record you already read. If no matching row is found, discard the result row being built.
Otherwise, for the matching rowid, go to the table Locations (LE) with a single-block read (logical read, physical when necessary) according to the rowid from the previous step, using the block-address part of the rowid. Using the row-address part of the rowid, find the specific row that the rowid points to and read all necessary data (requested data for alias LE) from that row. Append the applicable data to the incoming row from the earlier table reads to complete the result row. Discard the whole result row if it contains data that fails to meet the condition UPPER(LE.Description)=:2. Otherwise, immediately return the fully built result row.
|
Execution plans for the SQL you tune will often be nonrobust in the beginning, often as a part of the performance problem you must resolve. These nonrobust execution plans use join methods other than nested loops. You often do not need to understand the nonoptimal execution plans you start with in detail, as long as you can recognize that they are not the plans you want. However, it is useful to have at least a rough idea of why the starting execution plans are as slow as they are, to guess how much better your optimal plans will be. Now, I'll show how alternative execution plans appear for the query you've been looking at for the past couple sections. If I drop all the indexes, the rule-based optimizer delivers a new execution plan:
PLAN -------------------------------------------------------------------------------- SELECT STATEMENT MERGE JOIN SORT JOIN MERGE JOIN SORT JOIN MERGE JOIN SORT JOIN TABLE ACCESS FULL 4*EMPLOYEES SORT JOIN TABLE ACCESS FULL 3*EMPLOYEES SORT JOIN TABLE ACCESS FULL 2*LOCATIONS SORT JOIN TABLE ACCESS FULL 1*LOCATIONS
This shows the same join order, but now the database performs sort-merge joins and finds the rows for each table through full table scans.
Hash joins are more common than merge joins in cost-based execution plans, and you will occasionally even prefer them over nested-loops joins, so I next show an example that produces this style of join. Note that the original SQL that produced the previous plan has a hint (/*+ RULE */) immediately following the SELECT keyword. If I replace the hint /*+ RULE */ with /*+ORDERED USE_HASH(M LE LM) */ and reverse the order of the FROM clausewith empty tables, no indexes, and complete statisticsthe cost-based optimizer delivers a new execution plan:
PLAN ---------------------------------------------------------------------- SELECT STATEMENT HASH JOIN HASH JOIN HASH JOIN TABLE ACCESS FULL 1*EMPLOYEES TABLE ACCESS FULL 2*EMPLOYEES TABLE ACCESS FULL 3*LOCATIONS TABLE ACCESS FULL 4*LOCATIONS
This is identical to the previous execution plan, except that it replaces the merge joins with hash joins.
There are other execution-plan features, such as indicators of which joins are outer joins and steps for sorts and sort-unique operations that discard duplicates that you will see regularly, but these are fairly self-explanatory and are not usually important to performance. The only remaining important subtleties that you will often see deal with subqueries and multipart execution plans. I'll cover both of these at once with one final example:
SELECT /*+ RULE */ E.First_Name, E.Nickname, E.Last_Name, E.Phone_Number, L.Description FROM Employees E, Locations L WHERE (E.First_Name='Kathy' OR E.Nickname='Kathy') AND E.Location_ID=L.Location_ID AND EXISTS (SELECT null FROM Wage_Payments P WHERE P.Employee_ID=E.Employee_ID AND P.Payment_Date > sysdate-31);
Place indexes on:
Employees(First_Name)
Employees(Nickname)
Locations(Location_ID)
Wage_Payments(Employee_ID)
You then find the following execution plan:
PLAN ---------------------------------------------------------------------- SELECT STATEMENT CONCATENATION FILTER NESTED LOOPS TABLE ACCESS BY INDEX ROWID 1*EMPLOYEES INDEX RANGE SCAN EMPLOYEE_NICKNAME TABLE ACCESS BY INDEX ROWID 2*LOCATIONS INDEX UNIQUE SCAN LOCATION_PKEY TABLE ACCESS BY INDEX ROWID 3*WAGE_PAYMENTS INDEX RANGE SCAN WAGE_PAYMENT_EMPLOYEE_ID FILTER NESTED LOOPS TABLE ACCESS BY INDEX ROWID 1*EMPLOYEES INDEX RANGE SCAN EMPLOYEE_FIRST_NAME TABLE ACCESS BY INDEX ROWID 2*LOCATIONS INDEX UNIQUE SCAN LOCATION_PKEY
The CONCATENATION step indicates that the optimizer has implemented this as the implicit UNION of essentially two distinct queries, one driving from the index on First_Name and the other driving from the index on Nickname. Following the completion of the outer query, the FILTER step implements the correlation join on P.Employee_ID=E.Employee_ID, following the index on the foreign key from Wage_Payments to Employees. This FILTER step is really no different than a nested-loops join, except that it halts after finding the first matching row, if there is one. Note that the second FILTER step refers back to the same correlation join to Wage_Payments as the first FILTER step. This is an artifact of the concatenated execution plan, which repeats the steps for the joins in the outer query, but not the steps for the correlated join.