Oracle Database 10g introduces some new features for hierarchical queries. The new features include the CONNECT_BY_ROOT operator, the new CONNECT_BY_ISCYCLE and CONNECT_BY_ISLEAF pseudocolumns, and the NOCYCLE keyword. We will discuss each of these enhancements in the following sections.
Remember how you can use the PRIOR operator to retrieve a value from a node's parent row? You can now use the CONNECT_BY_ROOT operator to retrieve a value from a node's root. For example:
SELECT lname "Employee", CONNECT_BY_ROOT lname "Top Manager" FROM employee START WITH manager_emp_id = 7839 CONNECT BY PRIOR emp_id = manager_emp_id; Employee Top Manager -------------------- ------------ JONES JONES SCOTT JONES ADAMS JONES FORD JONES SMITH JONES BLAKE BLAKE ALLEN BLAKE WARD BLAKE MARTIN BLAKE TURNER BLAKE JAMES BLAKE CLARK CLARK MILLER CLARK
In this example, the hierarchy is built by starting with the rows that meet the condition manager_emp_id = 7839. This means that anyone whose manager is 7839 will be considered a root for this query. Those employees will be listed in the result set of the query along with the name of the top-most manager in their tree. The CONNECT_BY_ROOT operator returns that top-most manager name by accessing the root row for each row returned by the query.
Cycles are not allowed in a true tree structure. But life is not perfect, and someday you're bound to encounter hierarchical data containing cycles in which a node's child is also its parent. Such cycles are usually not good, need to be fixed, but can be frustratingly difficult to identify. You can try to find cycles by issuing a START WITH . . . CONNECT BY query, but such a query will report an error if there is a cycle (also known as a loop) in the data. In Oracle Database 10g, all this changes.
To allow the START WITH . . . CONNECT BY construct to work properly even if cycles are present in the data, Oracle Database 10g provides the new NOCYCLE keyword. If there are cycles in your data, you can use the NOCYCLE keyword in the CONNECT BY clause, and you will not get an error when hierarchically querying that data.
The test data we have in the employee table doesn't have a cycle. To test the NOCYCLE feature, you can introduce a cycle into the existing employee data by updating the manager_emp_id column of the top-most employee (KING with emp_id=7839) with the manager_emp_id of one of the lowest level employees (MARTIN with emp_id = 7654):
UPDATE employee SET manager_emp_id = 7654 WHERE manager_emp_id IS NULL;
Now, if you perform the following hierarchical query, you will get an ORA-01436 error:
SELECT LEVEL, LPAD(' ',2*(LEVEL - 1)) || lname "EMPLOYEE", emp_id, manager_emp_id FROM employee START WITH emp_id = 7839 CONNECT BY PRIOR emp_id = manager_emp_id; LEVEL EMPLOYEE EMP_ID MANAGER_EMP_ID ---------- -------------------- ---------- -------------- 1 KING 7839 7654 2 JONES 7566 7839 3 SCOTT 7788 7566 4 ADAMS 7876 7788 3 FORD 7902 7566 4 SMITH 7369 7902 2 BLAKE 7698 7839 3 ALLEN 7499 7698 3 WARD 7521 7698 3 MARTIN 7654 7698 4 KING 7839 7654 5 JONES 7566 7839 6 SCOTT 7788 7566 7 ADAMS 7876 7788 6 FORD 7902 7566 ERROR: ORA-01436: CONNECT BY loop in user data 15 rows selected.
Other than the error, notice that the whole tree starting with KING starts repeating under MARTIN. This is erroneous and confusing. Use the NOCYCLE keyword in the CONNECT BY clause to get rid of the error message, and to prevent the listing of erroneously cyclic data:
SELECT LEVEL, LPAD(' ',2*(LEVEL - 1)) || lname "EMPLOYEE", emp_id, manager_emp_id FROM employee START WITH emp_id = 7839 CONNECT BY NOCYCLE PRIOR emp_id = manager_emp_id; LEVEL EMPLOYEE EMP_ID MANAGER_EMP_ID ---------- -------------------- ---------- -------------- 1 KING 7839 7654 2 JONES 7566 7839 3 SCOTT 7788 7566 4 ADAMS 7876 7788 3 FORD 7902 7566 4 SMITH 7369 7902 2 BLAKE 7698 7839 3 ALLEN 7499 7698 3 WARD 7521 7698 3 MARTIN 7654 7698 3 TURNER 7844 7698 3 JAMES 7900 7698 2 CLARK 7782 7839 3 MILLER 7934 7782
This query recognizes that there is a cycle, ignores the cycle (as an impact of the NOCYCLE keyword), and returns the rows as if there were no cycles. Having the ability to query data containing cycles, your next problem is to identify those cycles.
|
It is sometimes difficult to identify cycles in hierarchical data. Oracle Database 10g's new pseudocolumn, CONNECT_BY_ISCYCLE, can help you identify the cycles in the data easily. CONNECT_BY_ISCYCLE can be used only in conjunction with the NOCYCLE keyword in a hierarchical query. The CONNECT_BY_ISCYCLE pseudocolumn returns 1 if the current row has a child that is also its ancestor; otherwise, it returns 0. For example:
SELECT lname, CONNECT_BY_ISCYCLE FROM employee START WITH emp_id = 7839 CONNECT BY NOCYCLE PRIOR emp_id = manager_emp_id; LNAME CONNECT_BY_ISCYCLE -------------------- ------------------ KING 0 JONES 0 SCOTT 0 ADAMS 0 FORD 0 SMITH 0 BLAKE 0 ALLEN 0 WARD 0 MARTIN 1 TURNER 0 JAMES 0 CLARK 0 MILLER 0
Since MARTIN is KING's manager in this data set, and MARTIN also comes under KING in the organization tree, the row for MARTIN has the value 1 for CONNECT_BY_ISCYCLE.
|
In a tree structure, the nodes at the lowest level of the tree are referred to as leaf nodes. Leaf nodes have no children. CONNECT_BY_ISLEAF is a pseudocolumn that returns 1 if the current row is a leaf, and returns 0 if the current row is not a leaf. For example:
SELECT lname, CONNECT_BY_ISLEAF FROM employee START WITH manager_emp_id IS NULL CONNECT BY PRIOR emp_id = manager_emp_id; LNAME CONNECT_BY_ISLEAF --------------- ----------------- KING 0 JONES 0 SCOTT 0 ADAMS 1 FORD 0 SMITH 1 BLAKE 0 ALLEN 1 WARD 1 MARTIN 1 TURNER 1 JAMES 1 CLARK 0 MILLER 1
This new feature can help simplify SQL statements that need to identify all the leaf nodes in a hierarchy. Without this pseudocolumn, to identify the leaf nodes, you would write a query like the following:
SELECT emp_id, lname, salary, hire_date FROM employee e WHERE NOT EXISTS (SELECT emp_id FROM employee e1 WHERE e.emp_id = e1.manager_emp_id); EMP_ID LNAME SALARY HIRE_DATE ------- --------------- ---------- --------- 7369 SMITH 800 17-DEC-80 7499 ALLEN 1600 20-FEB-81 7521 WARD 1250 22-FEB-81 7654 MARTIN 1250 28-SEP-81 7844 TURNER 1500 08-SEP-81 7876 ADAMS 1100 23-MAY-87 7900 JAMES 950 03-DEC-81 7934 MILLER 1300 23-JAN-82
However, you can make this query much simpler by using the new pseudocolumn CONNECT_BY_ISLEAF, as shown here:
SELECT emp_id, lname, salary, hire_date FROM employee e WHERE CONNECT_BY_ISLEAF = 1 START WITH manager_emp_id IS NULL CONNECT BY PRIOR emp_id = manager_emp_id; EMP_ID LNAME SALARY HIRE_DATE ------- --------------- ---------- --------- 7876 ADAMS 1100 23-MAY-87 7369 SMITH 800 17-DEC-80 7499 ALLEN 1600 20-FEB-81 7521 WARD 1250 22-FEB-81 7654 MARTIN 1250 28-SEP-81 7844 TURNER 1500 08-SEP-81 7900 JAMES 950 03-DEC-81 7934 MILLER 1300 23-JAN-82
This query builds the complete organization tree, and filters out only the leaf nodes by performing the check CONNECT_BY_ISLEAF = 1.