Oracle currently offers two completely different optimizers, the rule-based optimizer (RBO) and the cost-based optimizer (CBO), and the methods for tuning on each differ.
The RBO is Oracle's original automated optimizer, back from the days of Oracle Version 6 and earlier. By rule-based, Oracle means that the optimizer uses only fixed properties of the tables, indexes, and SQL to guess an optimum execution plan from a set of simple rules of thumb (or heuristics) built into the automated optimizer. The RBO uses no data about the sizes of the tables or indexes, or about the distribution of data within those objects. It does use data on the fixed properties of the indexes: whether they are unique, which columns they cover, in which order, and how well those match up with the most selective-looking filter conditions and joins in the SQL. As tables grow and data distributions change, the RBO should go right on delivering the same plan indefinitely, as long as you don't alter the indexes (for example, from unique to nonunique) or change the table structure (for example, from an ordinary table to a partitioned table). However, at some future time, perhaps even in Oracle Database 10g, Oracle will drop all support for the rule-based optimizer, and cost-based optimization will become your only choice.
Since Oracle7, the RBO has been even more stable than before, because Oracle chose to freeze the RBO code beginning with Oracle7, except for rare, slight changes necessary to deliver functionally correct (as opposed to necessarily optimum) results. Therefore, an execution plan that is correct on the RBO today will likely stay unchanged until Oracle drops the RBO altogether. This is appealing from the perspective of stability, although the dark side of this stability is that the execution plans never get any better either.
Execution plans on the RBO never change to adapt to changing data distributions, and this is often cited as an argument to switch to the CBO. However, in my own experience, data-distribution change is the least of the reasons for cost-based optimization. In over 10 years, I have yet to find a single case in which it was important to use different execution plans for different real-world data distributions with the same SQL.
Another argument cited in favor of the CBO is that it can deliver parallel execution plans, plans that can bring multiple processors to bear on the SQL statement at once. I have not found this to be a compelling argument, since I have yet to find a real-world case in which the optimum SQL, with the optimum database design, required parallel execution for adequate performance. I expect some such cases exist in data-warehousing environments, which are not where most of my experience lies, I admit, but almost all cases in which parallel execution plans appear to shine are really covering up some mistake in database design, indexing, or application design, compensating for design deficiencies with horsepower. That, by itself, would not be such a bad thing; extra horsepower might be cheaper than fixing the application. However, parallel plans are usually in service of large batch processes, competing heavily for resources with online processes that are more critical to end users. Therefore, parallel plans often rob needed resources from other processes that are more critical.
These are the strongest arguments against using the RBO:
It will become unavailable in some future release, perhaps during Oracle Database 10g, and you will not be able to use an older release forever.
The CBO keeps getting better, while the RBO is stuck with all the old problems it has ever had.
The CBO has a huge inherent advantage in the information available to it to calculate the best plan.
The RBO cannot take advantage of features created since the CBO appeared in Oracle7, and in most cases the RBO will simply push queries that involve newer object types, such as bit-mapped indexes, off to the CBO. (See the following section, Section 4.2.1, for details about which features the RBO cannot handle.)
That said, the RBO does a surprisingly good job; its heuristics are well designed to get along with the tiny amount of information that the RBO uses to guess the best plan. In Chapter 6, I will describe properties of what I call a robust execution plan, one that behaves well across a wide range of data distributions. The RBO almost always delivers a robust plan when the necessary indexes are available and when the developer has not prevented use of an index with some index-disabling expression, as discussed earlier in this chapter. Given the right indexes, you can almost always get the best robust plan on either optimizer, with manual tuning. With automated tuning, the biggest advantage of the CBO is that it is more resourceful when dealing with imperfect indexing and nonoptimally written SQL; more often, it delivers at least an adequate plan in these cases, without manual tuning. When more than one robust plan is possible, the CBO is also more likely to find the best robust plan, while the RBO will pick one without knowing relative costs, unless you manually tune the SQL.
It is unrealistic to optimize Oracle queries simultaneously for both the rule-based and the cost-based optimizers. Therefore, you should understand the factors that lead Oracle to choose which optimizer it applies, so that you can control those factors and get the optimizer you choose.
The RBO cannot handle certain object types and object properties that did not yet exist when Oracle froze the RBO code. However, rather than simply have its code error out, Oracle modified the RBO code just enough to let it recognize the cases it cannot handle and to have it pass those cases on to the CBO. Thus, even if you think you have set up your system for rule-based optimization, the following circumstances will absolutely force cost-based optimization:
Bit-mapped indexes on any column of a table referenced in the SQL, even if those indexes are on columns the SQL does not touch.
Function-based indexes on a table referenced in the SQL, if such an index is on an expression the SQL references.
Partitioned tables touched by the SQL.
Tables or indexes configured with a parallel degree. The optimizer interprets these as a command to find parallel execution plans, which the RBO does not know how to do. As with bit-mapped indexes, indexes configured with a parallel degree will disable use of the RBO on a table referenced by your SQL, even if the parallel-degree index is on columns the SQL does not touch.
Unintentionally Disabling Use of the RBO
Here's a real-world scenario I have seen more than once: you have a stable production application performing nicely on the RBO when, without warning, large portions of the application suddenly slow to a crawl. Panic and finger-pointing ensue. After much investigation, it turns out that the night before, a database administrator (DBA) innocently dropped and recreated some large, central-table index, perhaps to move it to a new filesystem that had more space available. Your DBA cleverly recognized that this was such a large index that it might take prohibitively long to create the old-fashioned way, so he chose to create it in parallel, using something like this:
CREATE INDEX Order_Ship_Date ON Orders(Ship_Date) PARALLEL 10;
This brought 10 simultaneous threads to bear on the index creation and greatly sped up that process, meeting the time window available for the work. So far, this is well and good. What no one realized was that it also left behind an index property that instructed Oracle to use cost-based optimization, regardless of the database configuration, to attempt to find plans using this index in parallel threads for all SQL that referenced this table. Since no one expected the CBO to apply to this application, no one bothered to create statistics on the tables and indexes, so the CBO operated in ignorance of correct statistics and suddenly delivered horrible plans on most SQL to this central table. Once recognized, the problem is solvable with this command:
ALTER INDEX Order_Ship_Date PARALLEL 1;
Because this statement only tweaks a value in the data dictionary, rather than rebuild the index, it runs almost instantly, and the application immediately reverts to its old performance. You can find indexes with this problem with this query:
SELECT Index_Name FROM ALL_INDEXES WHERE Degree!=1;
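If many indexes turn out to be affected, you can let the data dictionary write the fix for you. The following is a sketch, not from the scenario above; note that, as an assumption worth verifying on your release, ALL_INDEXES.DEGREE is stored as a character column that can also hold the value 'DEFAULT', whose parallel behavior depends on instance settings, so the sketch skips it and you should review the generated statements before running them:

```sql
-- Hypothetical helper: generate an ALTER INDEX statement for every index
-- whose parallel degree is an explicit number other than 1. DEGREE is a
-- character column, so compare it as trimmed text.
SELECT 'ALTER INDEX ' || Owner || '.' || Index_Name || ' PARALLEL 1;' AS Fix_DDL
FROM ALL_INDEXES
WHERE TRIM(Degree) NOT IN ('1', 'DEFAULT');
```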
If the tables and indexes involved in your SQL do not prevent using the RBO, Oracle chooses between the RBO and the CBO as follows:
If any SELECT keyword in the SQL (even in a subquery or a view definition) is followed by any valid hint other than /*+ RULE */ or /*+ CHOOSE */, Oracle will use the CBO.
Otherwise, if any SELECT keyword in the SQL (even in a subquery or a view definition) is followed by /*+ CHOOSE */ and there are any statistics on any table or index referenced by the SQL, Oracle will choose the CBO.
Otherwise, if any SELECT keyword in the SQL (even in a subquery or a view definition) is followed by /*+ RULE */, Oracle will choose the RBO.
Otherwise, if the session optimizer_mode parameter is set at the session level (by ALTER SESSION SET OPTIMIZER_MODE=<Your_Choice>;), Oracle will choose according to that session-level parameter.
Otherwise, if the optimizer_mode parameter is set for the database instance, in the init.ora file, Oracle will choose according to that instance-level parameter.
Otherwise, Oracle will choose according to the ultimate default optimizer_mode parameter, CHOOSE.
In the last three steps of this decision cascade, Oracle chooses according to an optimizer_mode parameter, which you or your DBA sets. These are the four possible parameter values and how they affect the choice:
RULE
Oracle uses rule-based optimization.
ALL_ROWS
Oracle uses cost-based optimization with the goal of minimizing the cost of the whole query. This default version of cost-based optimization sometimes results in nonrobust plans (plans that use join methods other than nested loops), with risks described in Chapter 6. However, the optimizer chooses these plans only when it calculates that they are faster than the best robust plans.
FIRST_ROWS
Oracle uses cost-based optimization with the goal of minimizing the cost of reaching the first rows from the query. In practice, this tends to favor robust, nested-loops plans similar to those plans the rule-based optimizer favors, but built with much more knowledge of the data distributions and probable execution costs. The FIRST_ROWS optimization level creates the same effect as the OPTIMIZE FOR 1 ROW hint on DB2 and the OPTION(FAST 1) hint on SQL Server.
CHOOSE
Oracle uses cost-based optimization, as for the ALL_ROWS goal, unless no table or index involved in the query has optimization statistics available, in which case Oracle uses rule-based optimization.
Mixing Cost and Rule
The CHOOSE option offers the potential to mix your optimization styles on a database instance. For example, you might use CHOOSE if you wish to run two distinct applications: one that has SQL certified and optimized for rule-based optimization and one that expects cost-based optimization. You would then avoid generating statistics for the tables referenced by the first application, so its SQL would be optimized by the RBO.
Although this sounds good, I do not recommend it. Chances are, you will have overlap between the tables the two applications use; otherwise, you would not have bothered to put them on the same database instance. In that case, you will end up with some SQL optimized under the worst possible arrangement: using the CBO, but lacking statistics on one or more tables.
Even if you have completely disjoint sets of tables for the two applications, it is always much too likely that someone, some time, will generate statistics on some of the tables for the application that expects to use the RBO. Again, this results in the worst arrangement: cost-based plans against SQL that mixes tables with and without statistics. It is far safer to use separate instances, with the setting optimizer_mode=rule set instance-wide on the application that needs it. Alternatively, have one of the applications explicitly set the optimizer_mode when it connects to the database, overriding the instance optimizer_mode setting, and use the instance-wide setting to choose the optimizer the other application uses.
Here's a quick way to check the instance-level parameter for optimizer_mode:
SELECT VALUE FROM V$PARAMETER WHERE NAME = 'optimizer_mode';
When you have an execution plan in PLAN_TABLE, a quick way to see whether it is cost-based is to run the following query:
SELECT POSITION FROM PLAN_TABLE WHERE ID=0;
This returns the cost of the entire execution plan, in arbitrary units, when the plan is cost-based. When the result is not null, you have a cost-based plan; a null result indicates a rule-based plan.
Most of the methods for controlling rule-based execution plans are the universal techniques of controlling plans, covered in the first section of this chapter. The primary Oracle-specific method of tuning under a rule-based default optimizer_mode is simply to switch modes to cost-based optimization, usually with a hint such as /*+ FIRST_ROWS */. In other words, you can always control a plan via hints, and hints (with the exception of the /*+ RULE */ hint) in a statement cause Oracle to use the CBO for that statement.
However, if you prefer not to use cost-based optimization, thus precluding the use of hints, one RBO-specific technique remains: in the FROM clause, list tables and their aliases in exactly the reverse order from the join order you want. This usually gives enough control of the join order, without using the techniques described earlier in Section 4.1.4. In particular, eligible, unique joins toward primary keys happen in the reverse order they are listed in the FROM clause, without changing the join conditions. For example, consider this query:
SELECT /*+ RULE */ E.First_Name, E.Last_Name, E.Salary, LE.Description,
       M.First_Name, M.Last_Name, LM.Description
FROM Locations LM, Employees M, Locations LE, Employees E
WHERE E.Last_Name = 'Johnson'
  AND E.Manager_ID=M.Employee_ID
  AND E.Location_ID=LE.Location_ID
  AND M.Location_ID=LM.Location_ID
  AND LE.Description='Dallas';
Unlike the earlier version of this query in Chapter 3, which had the wrong order in the FROM clause, you now get the correct join order. In this correct execution plan, E joins to LE before joining to M or LM, as shown by the plan output:
SQL> @ex

PLAN
----------------------------------------------------------------------
SELECT STATEMENT
  NESTED LOOPS
    NESTED LOOPS
      NESTED LOOPS
        TABLE ACCESS BY INDEX ROWID 4*EMPLOYEES
          INDEX RANGE SCAN EMPLOYEE_LAST_NAME
        TABLE ACCESS BY INDEX ROWID 3*LOCATIONS
          INDEX UNIQUE SCAN LOCATION_PKEY
      TABLE ACCESS BY INDEX ROWID 2*EMPLOYEES
        INDEX UNIQUE SCAN EMPLOYEE_PKEY
    TABLE ACCESS BY INDEX ROWID 1*LOCATIONS
      INDEX UNIQUE SCAN LOCATION_PKEY
When the RBO otherwise has no preference based on the conditions and indexes, the RBO joins tables by working from right to left in the FROM clause. However, this method offers only limited control by itself, because the RBO follows its other rules of thumb before considering the join order in the FROM clause. For example, the RBO always chooses to perform unique indexed reads and joins before doing indexed range scans, when it can.
There are two main parts involved in tuning on the Oracle CBO:
Providing the optimizer with good statistics about the tables and indexes, so it can calculate the costs of alternatives accurately. This is effectively a prerequisite to any manual tuning on a CBO.
Adding hints to queries that the CBO fails to optimize well even with complete statistics about the tables and indexes that the queries reference.
Proving that a little knowledge is a dangerous thing, cost-based optimizers often do a terrible job if they do not have statistics on all the tables and indexes involved in the query. It is therefore imperative to maintain statistics on tables and indexes reliably, including regenerating statistics whenever table volumes change much or tables or indexes are rebuilt. It is safest to regenerate statistics periodically, during times that load is relatively quiet, such as nightly or at least weekly. The best way to generate and update statistics is with Oracle's DBMS_STATS package, documented at length in Oracle8i Supplied PL/SQL Packages Reference and Oracle9i Supplied PL/SQL Packages and Types Reference. Here is a simple example of using DBMS_STATS to generate statistics for a whole schema, Appl_Prod, sampling 10% of the data in the larger tables and cascading statistics collection to the indexes:
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS('Appl_Prod', 10, CASCADE => TRUE);
END;
/
Often, queries include conditions on highly skewed distributions, such as conditions on special types, codes, or flags, when these columns have only a few values. Normally, the CBO evaluates selectivity of a condition based on the assumption that all nonnull values of a column are equally selective. This assumption generally works well for foreign and primary keys that join business entities, but it is inaccurate when the columns have permanent special meanings and certain meanings apply much more rarely than others.
For example, in an Orders table, you might have a Status_Code column with three possible values: 'CL' for closed (i.e., fulfilled) orders, 'CA' for cancelled orders, and 'OP' for open orders. Most orders, by far, would be fulfilled, once the application has been running for a few months. A steady, significant fraction of orders would end up cancelled, so that value would also eventually point to a large list of orders. However, as long as the business keeps up with incoming orders, the number of open orders would remain moderate and steady, even as data accumulates for years. Quite early, a condition specifying Status_Code='OP' would be selective enough to justify indexed access, if you had an index with that leading column, and it is important to enable the optimizer to realize this fact, preferably without a lot of manual tuning. Enabling the CBO to recognize when a column is selective requires two things:
The SQL must mention the specific selective value, rather than use a bind variable, prior to Oracle 9i Database. Use of bind variables is commonly attractive, since it makes SQL more general and easier to share between processes. However, this need to hardcode especially selective values is the exception to that rule. If you use Status_Code=:1 instead of Status_Code='OP', prior to Oracle 9i, you will deny the CBO potential knowledge of the selectivity of the condition at parse time, when it does not yet know whether the bind variable :1 will be assigned a common or a rare Status_Code. Fortunately, in these cases, the usual reason to prefer using bind variables does not generally apply; since these special codes have special business meanings, it is unlikely that the SQL ever requires substituting a different value than the single selective value.
You need to provide the CBO with special statistics that quantify how rare the uncommon code, type, or status values are, so it can know which values are highly selective.
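The first requirement can be made concrete with a sketch built on the running Orders example (the selected column is illustrative; the point is the form of the Status_Code condition):

```sql
-- With a bind variable, the pre-9i CBO must guess the condition's
-- selectivity at parse time, before it knows whether :1 will hold a
-- common value like 'CL' or a rare one like 'OP':
SELECT O.Order_ID FROM Orders O WHERE O.Status_Code = :1;

-- With the rare value hardcoded, the CBO can use a histogram on
-- Status_Code to see that 'OP' matches only a small fraction of rows:
SELECT O.Order_ID FROM Orders O WHERE O.Status_Code = 'OP';
```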
Oracle stores special statistics on distribution when you request them, based on sorting the rows for a column and arranging the sorted list into a specified number of buckets that each contain the same number of rows. Since Oracle already knows that the range each bucket holds has the same number of rows, Oracle needs to know only the value-range endpoints in each bucket. In the current example, with 20 buckets, the first bucket might hold the range 'CA' to 'CA', and the second bucket might hold the range 'CA' to 'CL'. The next 17 buckets would hold the most common range, 'CL' to 'CL'. The last bucket would hold the range 'CL' to 'OP', which includes the rarest value. From this, Oracle can deduce that the selectivity of the column is 5-10% for the value 'CA', 85-95% for the value 'CL', and 0-5% for the value 'OP'. Since you want the optimizer to know more closely how selective the 'OP' value is, you would choose more buckets than this, perhaps the maximum of 254. (Oracle compresses the bucket information when so few values apply, so the large number of buckets should be inexpensive.) To create 254 buckets for the example case, in the schema owned by Appl_Prod, use this:
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS('Appl_Prod', 'Orders',
      METHOD_OPT => 'FOR COLUMNS SIZE 254 Status_Code');
END;
/
Generate the histogram statistics after you generate the general table statistics, because table-statistics generation deletes earlier histogram statistics.
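Putting that ordering rule together with the earlier calls, a periodic statistics job for this table might look like the following sketch (schema and table names are from the running example; the sampling choices are illustrative, not a recommendation):

```sql
BEGIN
  -- General table and index statistics first...
  DBMS_STATS.GATHER_TABLE_STATS('Appl_Prod', 'Orders',
      ESTIMATE_PERCENT => 10, CASCADE => TRUE);
  -- ...then the histogram, because the call above deletes any
  -- histogram on Status_Code gathered earlier.
  DBMS_STATS.GATHER_TABLE_STATS('Appl_Prod', 'Orders',
      METHOD_OPT => 'FOR COLUMNS SIZE 254 Status_Code');
END;
/
```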
Oracle uses hints for manual control of cost-based optimization. Syntactically, these hints take the form of comments, like /*+ <Hint_String> */, immediately following the SELECT keyword. Oracle recognizes that this syntax encloses a hint, not a comment, by the + at the beginning and by the location of the hint, which must immediately follow SELECT. However, since these are comments from the point of view of standard SQL syntax, they do not interfere with parsing the SQL if the SQL is also to be executed on non-Oracle databases.
Each hint directly affects only the SELECT block that has the comment. Thus, to control the order of joins and index choices within a subquery, place the hint after the SELECT keyword that begins the subquery. But to affect the outer-query order of joins and index choices, place a hint immediately after the outer-query SELECT.
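For example, the following sketch shows both placements at once (the table, alias, and index names are illustrative, not from the examples above):

```sql
-- ORDERED, after the outer SELECT, controls only the outer query's join
-- order; the INDEX hint, after the inner SELECT, controls only the
-- subquery's access path.
SELECT /*+ ORDERED */ O.Order_ID, C.Customer_ID
FROM Orders O, Customers C
WHERE O.Customer_ID = C.Customer_ID
  AND EXISTS (SELECT /*+ INDEX(OD Order_Detail_Order_ID) */ NULL
              FROM Order_Details OD
              WHERE OD.Order_ID = O.Order_ID);
```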
There are two basic extremes involved in tuning with hints:
Use as little direction as possible to get the execution plan you want, or at least to get close enough to the plan you want for reasonable performance. This approach reasons that the CBO has more information than you have and should be left free to adapt to changing data distributions and to take advantage of improvements in Oracle with future releases. By leaving the CBO the maximum degrees of freedom, you maximize its power to optimize well for you in the future. However, until you try, you won't know how much direction the CBO will need if it did not get the plan right in the first place, so this approach is likely to be iterative, involving the addition of one hint at a time until the CBO delivers a good plan.
If you did not get the plan you wanted from the CBO automatically, assume the CBO has made bad assumptions that will propagate to distort all of its calculations. Therefore, leave it with little freedom, specifying essentially the whole plan you want.
If you are confident in your chosen execution plan, as you should be if you apply the methods I describe later in this book, there is little reason to hold back from fully specifying that plan. I have yet to find a case where a well-chosen, robust execution plan needed to evolve to handle new data distributions or new database features. On the other hand, it is easy for SQL with a partially restricting set of hints to go wrong, especially if some table or index loses its statistics. When the CBO chooses incorrectly, the error that made the CBO choose incorrectly will likely propagate over the entire plan. For example, consider this query:
SELECT E.First_Name, E.Last_Name, E.Salary, LE.Description,
       M.First_Name, M.Last_Name, LM.Description
FROM Locations LM, Employees M, Locations LE, Employees E
WHERE E.Hire_Date > :1
  AND E.Manager_ID=M.Employee_ID
  AND E.Location_ID=LE.Location_ID
  AND M.Location_ID=LM.Location_ID
At parse time, when the optimizer does its work, it cannot know that the bind variable :1 will likely be set to a value in the current week, so it makes a conservative assumption about the selectivity of that condition on Hire_Date. Having made that assumption, it might not only forego using an index on Hire_Date (depending on the data distribution), but it might also further calculate that it will hit most of the rows of all the joined tables as well, and the CBO might choose full table scans with hash joins on them. Even if you instruct the CBO to use the index on Hire_Date, it still retains its initial assumption that the driving condition is unselective, and will likely retain its poor choices for the other joins and table-access methods. This is really no flaw in the optimizer; it cannot know what the application developer knows about the likely values to be assigned to the bind variable. However, the consequence is that, if you need to be any more specific than just specifying ALL_ROWS or FIRST_ROWS, chances are relatively high that the optimizer will need help across the board, to correct for some incorrect assumption somewhere.
These are the main hints to control table-access methods:
INDEX(<Alias_Name> <Index_Name>)
This directs Oracle, when possible, to access the alias <Alias_Name> using the index named <Index_Name>. Repeat this hint for each index/alias combination you need to control.
FULL(<Alias_Name>)
This directs Oracle, when possible, to access the alias <Alias_Name> using a full table scan. Repeat this hint for each full table scan you require.
INDEX_DESC(<Alias_Name> <Index_Name>)
This directs Oracle, when possible, to access the alias <Alias_Name> using the index named <Index_Name>, reaching the rows in descending order (the reverse of the normal index-sorted order). Repeat this hint for each index/alias combination you need to control, although it is unlikely you will need it more than once in a query.
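As a sketch of the syntax, the following combines the two most common table-access hints in a single query (the table, alias, and index names are illustrative):

```sql
-- Drive with a full scan of Orders; reach Customers through its
-- primary-key index. Both hints follow the same SELECT keyword.
SELECT /*+ FULL(O) INDEX(C Customer_Pkey) */ O.Order_ID, C.Customer_ID
FROM Orders O, Customers C
WHERE O.Customer_ID = C.Customer_ID
  AND O.Status_Code = 'OP';
```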
The INDEX and FULL hints are common and easy to use. The INDEX_DESC hint is useful only rarely, but it is occasionally vital to use. For example, if you want to know all about the last employee hired in April, you might use this query:
SELECT * FROM Employees E
WHERE Hire_Date>=TO_DATE('2003-04-01','YYYY-MM-DD')
  AND Hire_Date< TO_DATE('2003-05-01','YYYY-MM-DD')
ORDER BY Hire_Date DESC
You'll find the most recently hired employee you want at the top of the list of rows returned by this query. To avoid reading all the data for other employees hired in April, you might think to add a condition AND ROWNUM=1 to the query. However, this sometimes will not yield the desired result, because (depending on the data) Oracle will sometimes apply that condition before performing the descending sort. If Oracle uses a full table scan, it will return the first employee hired in April it finds in the table, likely the least recently hired. If it uses a simple index range scan on an index on Hire_Date, it will begin, as range scans generally do by default, at the low end of the index range, returning the first employee hired in April. However, the INDEX_DESC hint, with the index Employee_Hire_Date on the Hire_Date column, neatly solves the problem, returning the desired row with just a single logical I/O to the table:
SELECT /*+ INDEX_DESC(E Employee_Hire_Date) */ *
FROM Employees E
WHERE Hire_Date>=TO_DATE('2003-04-01','YYYY-MM-DD')
  AND Hire_Date< TO_DATE('2003-05-01','YYYY-MM-DD')
  AND ROWNUM=1
Note that I removed the explicit ORDER BY clause, since it gives the false impression that it has effect, given the condition on ROWNUM.
There are several other table-access hints that I have not described in this section, but I have never found them necessary.
These are the main hints to control the order of execution for joins and subqueries:
ORDERED
This directs Oracle, when possible, to join the tables in the FROM clause in the same order that they are listed.
LEADING(<Alias_Name>)
In the absence of an ORDERED hint, this selects the driving table, the first table in the join order. Although this gives less control over the join order than the ORDERED hint, it does not require modifying the FROM clause. Often, getting just the driving table correct is all you need to get at least close to the performance of the optimal plan. Later choices in the join order tend to matter less and will likely be well chosen by the optimizer, without your help.
PUSH_SUBQ
This hint instructs the optimizer to perform correlated subqueries at the first opportunity, as soon as the outer query reaches the join columns needed to evaluate them. Oracle's CBO normally performs correlated subqueries only after completing all the joins in the outer query.
The ORDERED and LEADING hints are common and straightforward to use. The PUSH_SUBQ hint is occasionally useful.
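For example, using the chapter's Employees/Locations schema, this sketch forces E as the driving table while leaving the rest of the join order, and the access paths, to the optimizer:

```sql
-- LEADING makes E drive the join, without reordering the FROM clause.
SELECT /*+ LEADING(E) */ E.First_Name, E.Last_Name, LE.Description
FROM Locations LE, Employees E
WHERE E.Last_Name = 'Johnson'
  AND E.Location_ID = LE.Location_ID;
```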
When it comes to subqueries, Oracle offers hint-based control only at the two extremes: executing subqueries as early or as late as possible. However, you can gain full control of when subqueries execute if you combine the PUSH_SUBQ hint with the earlier methods of postponing correlated joins. For example, consider the earlier query:
SELECT ...
FROM Orders O, Customers C, Regions R
WHERE O.Status_Code='OP'
  AND O.Customer_ID=C.Customer_ID
  AND C.Customer_Type_Code='GOV'
  AND C.Region_ID=R.Region_ID
  AND EXISTS (SELECT NULL FROM Order_Details OD
              WHERE O.Order_ID+0*C.Customer_ID=OD.Order_ID
                AND OD.Shipped_Flag='Y')
Without a hint, Oracle would execute the EXISTS check after joining all three outer-query tables. The point of the expression O.Order_ID+0*C.Customer_ID was to delay the EXISTS check until after the join to C, but not after the join to R. However, without any hint, all EXISTS conditions are automatically delayed until after all outer-query joins. To force the EXISTS condition to execute between the joins to C and R, use both the hint and the correlating-join-postponing expression:
SELECT /*+ PUSH_SUBQ */ ...
FROM Orders O, Customers C, Regions R
WHERE O.Status_Code='OP'
  AND O.Customer_ID=C.Customer_ID
  AND C.Customer_Type_Code='GOV'
  AND C.Region_ID=R.Region_ID
  AND EXISTS (SELECT NULL FROM Order_Details OD
              WHERE O.Order_ID+0*C.Customer_ID=OD.Order_ID
                AND OD.Shipped_Flag='Y')
Now, the PUSH_SUBQ hint causes Oracle to execute the EXISTS condition as early as possible, and the expression O.Order_ID+0*C.Customer_ID ensures that "as early as possible" doesn't come until after the join to C.
These are the main hints to control the join methods:
USE_NL(<List_Of_Aliases>)
This directs Oracle, when possible, to join the tables indicated in the alias list by using nested loops. The alias list is without commas: for example, USE_NL(T1 T2 T3).
USE_HASH(<List_Of_Aliases>)
This directs Oracle, when possible, to join to the tables indicated in the alias list by using hash joins. The alias list is without commas: for example, USE_HASH(T1 T2 T3).
Here's an example to illustrate the most frequently useful hints to yield complete control of an execution plan. I'll force the join order, the access method to every table, and the join method to every table. Consider the earlier example tuned for the RBO, shown at the end of Section 4.2.2. To fully force the same plan, but substitute a hash join for the first nested-loops join, with the employee locations read through the index on Description, use this query:
SELECT /*+ ORDERED USE_NL(M LM) USE_HASH(LE)
           INDEX(E Employee_Last_Name) INDEX(LE Location_Description)
           INDEX(M Employee_Pkey) INDEX(LM Location_Pkey) */
       E.First_Name, E.Last_Name, E.Salary, LE.Description,
       M.First_Name, M.Last_Name, LM.Description
FROM Employees E, Locations LE, Employees M, Locations LM
WHERE E.Last_Name = 'Johnson'
  AND E.Manager_ID=M.Employee_ID
  AND E.Location_ID=LE.Location_ID
  AND M.Location_ID=LM.Location_ID
  AND LE.Description='Dallas'
This results in the execution plan, as shown here:
SQL> @ex

PLAN
----------------------------------------------------------------------
SELECT STATEMENT
  NESTED LOOPS
    NESTED LOOPS
      HASH JOIN
        TABLE ACCESS BY INDEX ROWID 1*EMPLOYEES
          INDEX RANGE SCAN EMPLOYEE_LAST_NAME
        TABLE ACCESS BY INDEX ROWID 2*LOCATIONS
          INDEX RANGE SCAN LOCATION_DESCRIPTION
      TABLE ACCESS BY INDEX ROWID 3*EMPLOYEES
        INDEX UNIQUE SCAN EMPLOYEE_PKEY
    TABLE ACCESS BY INDEX ROWID 4*LOCATIONS
      INDEX UNIQUE SCAN LOCATION_PKEY