Let's look at an example to understand how we can represent hierarchical information in a relational database. As a basis for the example, we'll use an organization chart showing how one employee reports to another within a large organization, as shown in Figure 8-1.
Figure 8-1 represents a hierarchy of employees. The information regarding an employee, his manager, and the reporting relationship need to be represented in one table, employee, as shown in the Entity Relationship Diagram in Figure 8-2.
In Figure 8-2, the employee table refers to itself. The column manager_emp_id refers to the emp_id column of the same table. To represent hierarchical data, you need to make use of a relationship such as when one column of a table references another column of the same table. When such a relationship is implemented using a database constraint, it is known as self-referential integrity constraint. The corresponding CREATE TABLE statement will look as follows:
CREATE TABLE employee ( emp_id NUMBER (4) CONSTRAINT emp_pk PRIMARY KEY, fname VARCHAR2 (15) NOT NULL, lname VARCHAR2 (15) NOT NULL, dept_id NUMBER (2) NOT NULL, manager_emp_id NUMBER (4) CONSTRAINT emp_fk REFERENCES employee(emp_id), salary NUMBER (7,2) NOT NULL, hire_date DATE NOT NULL, job_id NUMBER (3));
As a basis for the examples in this chapter, we'll use the following sample data:
SELECT emp_id, lname, dept_id, manager_emp_id, salary, hire_date FROM employee; EMP_ID LNAME DEPT_ID MANAGER_EMP_ID SALARY HIRE_DATE --------- ---------- --------- -------------- --------- --------- 7369 SMITH 20 7902 800 17-DEC-80 7499 ALLEN 30 7698 1600 20-FEB-81 7521 WARD 30 7698 1250 22-FEB-81 7566 JONES 20 7839 2000 02-APR-81 7654 MARTIN 30 7698 1250 28-SEP-81 7698 BLAKE 30 7839 2850 01-MAY-80 7782 CLARK 10 7839 2450 09-JUN-81 7788 SCOTT 20 7566 3000 19-APR-87 7839 KING 10 5000 17-NOV-81 7844 TURNER 30 7698 1500 08-SEP-81 7876 ADAMS 20 7788 1100 23-MAY-87 7900 JAMES 30 7698 950 03-DEC-81 7902 FORD 20 7566 3000 03-DEC-81 7934 MILLER 10 7782 1300 23-JAN-82
The employee table has two important aspects:
The column manager_emp_id
The emp_fk constraint
The column manager_emp_id stores the emp_id of the employee's manager. For example, the manager_emp_id for Smith is 7902, which means that Ford is Smith's manager. The employee King doesn't have a manager_emp_id, which indicates that King is the uppermost employee. To be able to represent the uppermost employee, the manager_emp_id column must be nullable.
There is a foreign key constraint on the manager_emp_id column. This enforces the rule that any value put in the manager_emp_id column must be the emp_id of a valid employee. Such a constraint is not mandatory when representing hierarchical information. However, it is a good practice to define database constraints to enforce such business rules.
Before moving on to the following sections on manipulating hierarchies, we will introduce some hierarchy terminology. The following list defines terms that we'll use often when working with hierarchical data:
A row in a table that represents a specific entry in a hierarchical tree structure. For example, in Figure 8-1 each employee is considered to be a node.
A node that is one level up in a tree. In Figure 8-1, King is the parent of Blake, and Blake is the parent of Martin. The term parent node is sometimes used in place of just parent.
A node that is one level down in a tree. In Figure 8-1, Blake is a child of King. Blake, in turn, has five children: Allen, Ward, Martin, Turner, and James. The term child node is sometimes used in place of just child.
The uppermost node in a hierarchical structure. The definition of a root is that it has no parent. In Figure 8-1, King is the root. You can have only one root in any given tree, but it's worth noting that you can have multiple trees in a hierarchical table. If our employee table stored information on employees from multiple companies, we would have one root per company. The term root node is sometimes used in place of root.
A node with no children, and sometimes called a leaf node. Leaf nodes are the antitheses of root nodes, and represent the lowest levels of a tree structure. The leaf nodes in Figure 8-1 are Adams, Smith, Allen, Ward, Martin, Turner, James, and Miller. Leaf nodes do not all need to be at the same level, but they do need to be without children.
A layer of nodes. In Figure 8-1, King constitutes one level. Jones, Blake, and Clark constitute the next level down, and so forth.