8.6 Enhancements in Oracle Database 10g

Oracle Database 10g introduces some new features for hierarchical queries. The new features include the CONNECT_BY_ROOT operator, the new CONNECT_BY_ISCYCLE and CONNECT_BY_ISLEAF pseudocolumns, and the NOCYCLE keyword. We will discuss each of these enhancements in the following sections.

8.6.1 Getting Data from the Root Row

Remember how you can use the PRIOR operator to retrieve a value from a node's parent row? You can now use the CONNECT_BY_ROOT operator to retrieve a value from a node's root. For example:

SELECT lname "Employee", CONNECT_BY_ROOT lname "Top Manager"

FROM employee

START WITH manager_emp_id = 7839

CONNECT BY PRIOR emp_id = manager_emp_id;



Employee             Top Manager

-------------------- ------------

JONES                JONES

SCOTT                JONES

ADAMS                JONES

FORD                 JONES

SMITH                JONES

BLAKE                BLAKE

ALLEN                BLAKE

WARD                 BLAKE

MARTIN               BLAKE

TURNER               BLAKE

JAMES                BLAKE

CLARK                CLARK

MILLER               CLARK

In this example, the hierarchy is built by starting with the rows that meet the condition manager_emp_id = 7839. This means that anyone whose manager is 7839 will be considered a root for this query. Those employees will be listed in the result set of the query along with the name of the top-most manager in their tree. The CONNECT_BY_ROOT operator returns that top-most manager name by accessing the root row for each row returned by the query.

8.6.2 Ignoring Cycles

Cycles are not allowed in a true tree structure. But life is not perfect, and someday you're bound to encounter hierarchical data containing cycles in which a node's child is also its parent. Such cycles are usually not good, need to be fixed, but can be frustratingly difficult to identify. You can try to find cycles by issuing a START WITH . . . CONNECT BY query, but such a query will report an error if there is a cycle (also known as a loop) in the data. In Oracle Database 10g, all this changes.

To allow the START WITH . . . CONNECT BY construct to work properly even if cycles are present in the data, Oracle Database 10g provides the new NOCYCLE keyword. If there are cycles in your data, you can use the NOCYCLE keyword in the CONNECT BY clause, and you will not get an error when hierarchically querying that data.

The test data we have in the employee table doesn't have a cycle. To test the NOCYCLE feature, you can introduce a cycle into the existing employee data by updating the manager_emp_id column of the top-most employee (KING with emp_id=7839) with the manager_emp_id of one of the lowest level employees (MARTIN with emp_id = 7654):

UPDATE employee

SET manager_emp_id

 = 7654

WHERE manager_emp_id IS NULL;

Now, if you perform the following hierarchical query, you will get an ORA-01436 error:

SELECT LEVEL, LPAD('  ',2*(LEVEL - 1)) || lname "EMPLOYEE", 

       emp_id, manager_emp_id

FROM employee

START WITH emp_id = 7839

CONNECT BY PRIOR emp_id = manager_emp_id;



     LEVEL EMPLOYEE                 EMP_ID MANAGER_EMP_ID

---------- -------------------- ---------- --------------

         1 KING                       7839           7654

         2   JONES                    7566           7839

         3     SCOTT                  7788           7566

         4       ADAMS                7876           7788

         3     FORD                   7902           7566

         4       SMITH                7369           7902

         2   BLAKE                    7698           7839

         3     ALLEN                  7499           7698

         3     WARD                   7521           7698

         3     MARTIN                 7654           7698

         4       KING                 7839           7654

         5         JONES              7566           7839

         6           SCOTT            7788           7566

         7             ADAMS          7876           7788

         6           FORD             7902           7566

ERROR:

ORA-01436: CONNECT BY loop in user data







15 rows selected.

Other than the error, notice that the whole tree starting with KING starts repeating under MARTIN. This is erroneous and confusing. Use the NOCYCLE keyword in the CONNECT BY clause to get rid of the error message, and to prevent the listing of erroneously cyclic data:

SELECT LEVEL, LPAD('  ',2*(LEVEL - 1)) || lname "EMPLOYEE", 

       emp_id, manager_emp_id

FROM employee

START WITH emp_id = 7839

CONNECT BY NOCYCLE PRIOR emp_id = manager_emp_id;



     LEVEL EMPLOYEE                 EMP_ID MANAGER_EMP_ID

---------- -------------------- ---------- --------------

         1 KING                       7839           7654

         2   JONES                    7566           7839

         3     SCOTT                  7788           7566

         4       ADAMS                7876           7788

         3     FORD                   7902           7566

         4       SMITH                7369           7902

         2   BLAKE                    7698           7839

         3     ALLEN                  7499           7698

         3     WARD                   7521           7698

         3     MARTIN                 7654           7698

         3     TURNER                 7844           7698

         3     JAMES                  7900           7698

         2   CLARK                    7782           7839

         3     MILLER                 7934           7782

This query recognizes that there is a cycle, ignores the cycle (as an impact of the NOCYCLE keyword), and returns the rows as if there were no cycles. Having the ability to query data containing cycles, your next problem is to identify those cycles.

You can use the NOCYCLE keyword regardless of whether you have a cycle in your data.


8.6.3 Identifying Cycles

It is sometimes difficult to identify cycles in hierarchical data. Oracle Database 10g's new pseudocolumn, CONNECT_BY_ISCYCLE, can help you identify the cycles in the data easily. CONNECT_BY_ISCYCLE can be used only in conjunction with the NOCYCLE keyword in a hierarchical query. The CONNECT_BY_ISCYCLE pseudocolumn returns 1 if the current row has a child that is also its ancestor; otherwise, it returns 0. For example:

SELECT lname, CONNECT_BY_ISCYCLE

FROM employee

START WITH emp_id = 7839

CONNECT BY NOCYCLE PRIOR emp_id = manager_emp_id;



LNAME                CONNECT_BY_ISCYCLE

-------------------- ------------------

KING                                  0

JONES                                 0

SCOTT                                 0

ADAMS                                 0

FORD                                  0

SMITH                                 0

BLAKE                                 0

ALLEN                                 0

WARD                                  0

MARTIN                                1

TURNER                                0

JAMES                                 0

CLARK                                 0

MILLER                                0

Since MARTIN is KING's manager in this data set, and MARTIN also comes under KING in the organization tree, the row for MARTIN has the value 1 for CONNECT_BY_ISCYCLE.

For correct results in subsequent queries, you should revert our example data back to its original state by rolling back the earlier change that forced a cycle in the data. If you have already committed the previous UPDATE, you should update the employee table again to set the manager_emp_id column to NULL for KING.


8.6.4 Identifying Leaf Nodes

In a tree structure, the nodes at the lowest level of the tree are referred to as leaf nodes. Leaf nodes have no children. CONNECT_BY_ISLEAF is a pseudocolumn that returns 1 if the current row is a leaf, and returns 0 if the current row is not a leaf. For example:

SELECT lname, CONNECT_BY_ISLEAF

FROM employee

START WITH manager_emp_id IS NULL

CONNECT BY PRIOR emp_id = manager_emp_id;



LNAME           CONNECT_BY_ISLEAF

--------------- -----------------

KING                            0

JONES                           0

SCOTT                           0

ADAMS                           1

FORD                            0

SMITH                           1

BLAKE                           0

ALLEN                           1

WARD                            1

MARTIN                          1

TURNER                          1

JAMES                           1

CLARK                           0

MILLER                          1

This new feature can help simplify SQL statements that need to identify all the leaf nodes in a hierarchy. Without this pseudocolumn, to identify the leaf nodes, you would write a query like the following:

SELECT emp_id, lname, salary, hire_date

FROM employee e

WHERE NOT EXISTS 

(SELECT emp_id FROM employee e1 WHERE e.emp_id = e1.manager_emp_id);



 EMP_ID LNAME               SALARY HIRE_DATE

------- --------------- ---------- ---------

   7369 SMITH                  800 17-DEC-80

   7499 ALLEN                 1600 20-FEB-81

   7521 WARD                  1250 22-FEB-81

   7654 MARTIN                1250 28-SEP-81

   7844 TURNER                1500 08-SEP-81

   7876 ADAMS                 1100 23-MAY-87

   7900 JAMES                  950 03-DEC-81

   7934 MILLER                1300 23-JAN-82

However, you can make this query much simpler by using the new pseudocolumn CONNECT_BY_ISLEAF, as shown here:

SELECT emp_id, lname, salary, hire_date

FROM employee e

WHERE CONNECT_BY_ISLEAF = 1

START WITH manager_emp_id IS NULL

CONNECT BY PRIOR emp_id = manager_emp_id;



EMP_ID LNAME                SALARY HIRE_DATE

------- --------------- ---------- ---------

   7876 ADAMS                 1100 23-MAY-87

   7369 SMITH                  800 17-DEC-80

   7499 ALLEN                 1600 20-FEB-81

   7521 WARD                  1250 22-FEB-81

   7654 MARTIN                1250 28-SEP-81

   7844 TURNER                1500 08-SEP-81

   7900 JAMES                  950 03-DEC-81

   7934 MILLER                1300 23-JAN-82

This query builds the complete organization tree, and filters out only the leaf nodes by performing the check CONNECT_BY_ISLEAF = 1.