7.1 Set Operators

The following list briefly describes the four set operations supported by Oracle SQL:


UNION ALL

Combines the results of two SELECT statements into one result set.


UNION

Combines the results of two SELECT statements into one result set, and then eliminates any duplicate rows from that result set.


MINUS

Takes the result set of one SELECT statement, and removes those rows that are also returned by a second SELECT statement. Duplicate rows are eliminated.


INTERSECT

Returns only those rows that are returned by each of two SELECT statements. Duplicate rows are eliminated.

Before moving on to the details on these set operators, let's look at the following two queries, which we'll use as component queries in our subsequent examples. The first query retrieves all the customers in region 5:

SELECT cust_nbr, name

FROM customer

WHERE region_id = 5;



  CUST_NBR NAME

---------- ------------------------------

         1 Cooper Industries

         2 Emblazon Corp.

         3 Ditech Corp.

         4 Flowtech Inc.

         5 Gentech Industries

The second query retrieves all the customers with the sales representative 'MARTIN':

SELECT c.cust_nbr, c.name

FROM customer c

WHERE c.cust_nbr IN (SELECT o.cust_nbr

                     FROM cust_order o, employee e

                     WHERE o.sales_emp_id = e.emp_id

                     AND e.lname = 'MARTIN');



  CUST_NBR NAME

---------- ------------------------------

         4 Flowtech Inc.

         8 Zantech Inc.

If you look at the results returned by these two queries, you will notice that there is one common row (for Flowtech Inc.). The following sections discuss the effects of the various set operations between these two result sets.

7.1.1 UNION ALL

The UNION ALL operator merges the result sets of two component queries. This operation returns rows retrieved by either of the component queries, without eliminating duplicates. The following example illustrates the UNION ALL operation:

SELECT cust_nbr, name

FROM customer

WHERE region_id = 5

UNION ALL

SELECT c.cust_nbr, c.name

FROM customer c

WHERE c.cust_nbr IN (SELECT o.cust_nbr

                     FROM cust_order o, employee e

                     WHERE o.sales_emp_id = e.emp_id

                     AND e.lname = 'MARTIN');



  CUST_NBR NAME

---------- ------------------------------

         1 Cooper Industries

         2 Emblazon Corp.

         3 Ditech Corp.

         4 Flowtech Inc.

         5 Gentech Industries

         4 Flowtech Inc.

         8 Zantech Inc.



7 rows selected.

As you can see from the result set, there is one customer, which is retrieved by both the SELECTs, and therefore appears twice in the result set. The UNION ALL operator simply merges the output of its component queries, without caring about any duplicates in the final result set.

7.1.2 UNION

The UNION operator returns all distinct rows retrieved by two component queries. The UNION operation eliminates duplicates while merging rows retrieved by either of the component queries. The following example illustrates the UNION operation:

SELECT cust_nbr, name

FROM customer

WHERE region_id = 5

UNION

SELECT c.cust_nbr, c.name

FROM customer c

WHERE c.cust_nbr IN (SELECT o.cust_nbr

                     FROM cust_order o, employee e

                     WHERE o.sales_emp_id = e.emp_id

                     AND e.lname = 'MARTIN');



  CUST_NBR NAME

---------- ------------------------------

         1 Cooper Industries

         2 Emblazon Corp.

         3 Ditech Corp.

         4 Flowtech Inc.

         5 Gentech Industries

         8 Zantech Inc.



6 rows selected.

This query is a modification of the query from the preceding section; the keywords UNION ALL have been replaced with UNION. Now, the result set contains only distinct rows (no duplicates). To eliminate duplicate rows, a UNION operation needs to do some extra tasks as compared to the UNION ALL operation. These extra tasks include sorting and filtering the result set. If you observe carefully, you will notice that the result set of the UNION ALL operation is not sorted, whereas the result set of the UNION operation is sorted. (The result set of a UNION is sorted on the combination of all the columns in the SELECT list. In the preceeding example, the UNION result set will be sorted on the combination cust_nbr and name.) These extra tasks introduce a performance overhead to the UNION operation. A query involving UNION will take more time than the same query with UNION ALL, even if there are no duplicates to remove.

Unless you have a valid need to retrieve only distinct rows, use UNION ALL instead of UNION for better performance.


7.1.3 INTERSECT

INTERSECT returns only the rows retrieved by both component queries. Compare this with UNION, which returns the rows retrieved by any of the component queries. If UNION acts like "OR," INTERSECT acts like "AND." For example:

SELECT cust_nbr, name

FROM customer

WHERE region_id = 5

INTERSECT

SELECT c.cust_nbr, c.name

FROM customer c

WHERE c.cust_nbr IN (SELECT o.cust_nbr

                     FROM cust_order o, employee e

                     WHERE o.sales_emp_id = e.emp_id

                     AND e.lname = 'MARTIN');



  CUST_NBR NAME

---------- ------------------------------

         4 Flowtech Inc.

As you saw earlier, "Flowtech Inc." was the only customer retrieved by both SELECT statements. Therefore, the INTERSECT operator returns just that one row.

7.1.4 MINUS

MINUS returns all rows from the first SELECT that are not also returned by the second SELECT.

Oracle's use of MINUS does not follow the ANSI/ISO SQL standard. The corresponding ANSI/ISO SQL keyword is EXCEPT.


The following example illustrates how MINUS works:

SELECT cust_nbr, name

FROM customer

WHERE region_id = 5

MINUS

SELECT c.cust_nbr, c.name

FROM customer c

WHERE c.cust_nbr IN (SELECT o.cust_nbr

                     FROM cust_order o, employee e

                     WHERE o.sales_emp_id = e.emp_id

                     AND e.lname = 'MARTIN');



  CUST_NBR NAME

---------- ------------------------------

         1 Cooper Industries

         2 Emblazon Corp.

         3 Ditech Corp.

         5 Gentech Industries

You might wonder why you don't see "Zantech Inc." in the output. An important thing to note here is that the execution order of component queries in a set operation is from top to bottom. The results of UNION, UNION ALL, and INTERSECT will not change if you alter the ordering of component queries. However, the result of MINUS will be different if you alter the order of the component queries. If you rewrite the previous query by switching the positions of the two SELECTs, you get a completely different result:

SELECT c.cust_nbr, c.name

FROM customer c

WHERE c.cust_nbr IN (SELECT o.cust_nbr

                     FROM cust_order o, employee e

                     WHERE o.sales_emp_id = e.emp_id

                     AND e.lname = 'MARTIN')

MINUS

SELECT cust_nbr, name

FROM customer

WHERE region_id = 5;



  CUST_NBR NAME

---------- ------------------------------

         8 Zantech Inc.

In the second MINUS example, the first component query adds "Flowtech Inc." and "Zantech Inc." to the result set while the second component query removes "Flowtech Inc.", leaving " startref="ch07-idx-1000003842-2"/>"Zantech Inc." as the sole remaining row.

In a MINUS operation, rows may be returned by the second SELECT that are not also returned by the first. These rows are not included in the output.