The grouping examples you have seen so far represent simple ways of aggregating data using Oracle's extensions of the GROUP BY clause. These simple mechanisms were introduced in Oracle8i. In Oracle9i Database, Oracle enhanced this new functionality in some interesting and useful ways. Oracle now allows for:
Repeating column names in the GROUP BY clause
Grouping on composite columns
Concatenated groupings
In Oracle8i, repeating column names are not allowed in a GROUP BY clause. If the GROUP BY clause contains an extension (i.e., ROLLUP or CUBE), you cannot use the same column inside the extension as well as outside the extension. The following SQL is invalid in Oracle8i:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY o.year, ROLLUP (o.year, o.month, r.name); * ERROR at line 6: ORA-30490: Ambiguous expression in GROUP BY ROLLUP or CUBE list
However, the same query works in Oracle9i Database and later:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY o.year, ROLLUP (o.year, o.month, r.name); YEAR MONTH REGION TOTAL ---------- --------- -------------------- ---------- 2000 January Mid-Atlantic 1221394 2000 January New England 1018430 2000 January Southeast US 758042 2000 January 2997866 2000 February Mid-Atlantic 857352 2000 February New England 1231492 2000 February Southeast US 1236846 2000 February 3325690 2000 March Mid-Atlantic 1274062 2000 March New England 1132966 2000 March Southeast US 1311986 2000 March 3719014 2001 January Mid-Atlantic 610697 2001 January New England 509215 2001 January Southeast US 379021 2001 January 1498933 2001 February Mid-Atlantic 428676 2001 February New England 615746 2001 February Southeast US 618423 2001 February 1662845 2001 March Mid-Atlantic 637031 2001 March New England 566483 2001 March Southeast US 655993 2001 March 1859507 2000 10042570 2001 5021285 2000 10042570 2001 5021285 28 rows selected.
Repetition of o.year in the GROUP BY clause as well as in the ROLLUP operation repeats the summary rows of each year in the output and suppresses the grand total. Repetition of column names in a GROUP BY clause isn't very useful, but it's worth knowing that such constructs are allowed in Oracle9i and later.
Oracle8i supports grouping on individual columns only. Oracle9i extends the grouping operations to include grouping on composite columns. A composite column is a collection of two or more columns, but their values are treated as one for the grouping computation. Oracle8i allows group operations of the form ROLLUP (a,b,c), while, Oracle9i allows group operations of the form ROLLUP (a,(b,c)) as well. In this case, (b,c) is treated as one column for the purpose of the grouping computation. For example:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP ((o.year, o.month),r.name); YEAR MONTH REGION TOTAL ---------- --------- -------------------- ---------- 2000 January Mid-Atlantic 1221394 2000 January New England 1018430 2000 January Southeast US 758042 2000 January 2997866 2000 February Mid-Atlantic 857352 2000 February New England 1231492 2000 February Southeast US 1236846 2000 February 3325690 2000 March Mid-Atlantic 1274062 2000 March New England 1132966 2000 March Southeast US 1311986 2000 March 3719014 2001 January Mid-Atlantic 610697 2001 January New England 509215 2001 January Southeast US 379021 2001 January 1498933 2001 February Mid-Atlantic 428676 2001 February New England 615746 2001 February Southeast US 618423 2001 February 1662845 2001 March Mid-Atlantic 637031 2001 March New England 566483 2001 March Southeast US 655993 2001 March 1859507 15063855 25 rows selected.
In this example, two columns (o.year, o.month) are treated as one composite column. This causes Oracle to treat the combination of year and month as one dimension, and the summary rows are computed accordingly. Although this query is not allowed in Oracle8i, you can fake composite column groupings in Oracle8i by using the concatenation operator (||) to combine two columns and treat the result as one composite column. Oracle8i can then produce the same result as the previous query in Oracle 9i. For example:
SELECT TO_CHAR(o.year)||' '||TO_CHAR(TO_DATE(o.month,'MM'),'Month') Year_Month, r.name region, SUM(o.tot_sales) FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP (TO_CHAR(o.year)||' '|| TO_CHAR(TO_DATE(o.month,'MM'),'Month'), r.name); YEAR_MONTH REGION SUM(O.TOT_SALES) -------------------- -------------------- ---------------- 2000 February Mid-Atlantic 857352 2000 February New England 1231492 2000 February Southeast US 1236846 2000 February 3325690 2000 January Mid-Atlantic 1221394 2000 January New England 1018430 2000 January Southeast US 758042 2000 January 2997866 2000 March Mid-Atlantic 1274062 2000 March New England 1132966 2000 March Southeast US 1311986 2000 March 3719014 2001 February Mid-Atlantic 428676 2001 February New England 615746 2001 February Southeast US 618423 2001 February 1662845 2001 January Mid-Atlantic 610697 2001 January New England 509215 2001 January Southeast US 379021 2001 January 1498933 2001 March Mid-Atlantic 637031 2001 March New England 566483 2001 March Southeast US 655993 2001 March 1859507 15063855 25 rows selected.
This query converts the numeric month into the string expression of the name of the month and concatenates it with the string representation of the year. The same expression has to be used in the SELECT list and the ROLLUP clause. The expression TO_CHAR(o.year)||' '||TO_CHAR(TO_DATE( o.month,'MM'),'Month') is treated as one composite column.
With Oracle9i and later, you can have multiple ROLLUP, CUBE, or GROUPING SETS operations, or a combination of these under the GROUP BY clause in a query. This is not allowed in Oracle8i. You will get an error message if you attempt the following query in Oracle8i:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP (o.year, o.month), ROLLUP(r.name); * ERROR at line 6: ORA-30489: Cannot have more than one rollup/cube expression list
However, the same query works in Oracle9i and later:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP (o.year, o.month), ROLLUP(r.name); YEAR MONTH REGION TOTAL ---------- --------- -------------------- ---------- 2000 January Mid-Atlantic 1221394 2000 January New England 1018430 2000 January Southeast US 758042 2000 January 2997866 2000 February Mid-Atlantic 857352 2000 February New England 1231492 2000 February Southeast US 1236846 2000 February 3325690 2000 March Mid-Atlantic 1274062 2000 March New England 1132966 2000 March Southeast US 1311986 2000 March 3719014 2000 Mid-Atlantic 3352808 2000 New England 3382888 2000 Southeast US 3306874 2000 10042570 2001 January Mid-Atlantic 610697 2001 January New England 509215 2001 January Southeast US 379021 2001 January 1498933 2001 February Mid-Atlantic 428676 2001 February New England 615746 2001 February Southeast US 618423 2001 February 1662845 2001 March Mid-Atlantic 637031 2001 March New England 566483 2001 March Southeast US 655993 2001 March 1859507 2001 Mid-Atlantic 1676404 2001 New England 1691444 2001 Southeast US 1653437 2001 5021285 Mid-Atlantic 5029212 New England 5074332 Southeast US 4960311 15063855 36 rows selected.
When you have multiple grouping operations (ROLLUP, CUBE, or GROUPING SETS) in a GROUP BY clause, what you have is called a concatenated grouping. The result of the concatenated grouping is to produce a cross-product of groupings from each grouping operation. Therefore, the query:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP(o.year), ROLLUP (o.month), ROLLUP (r.name);
behaves as a CUBE and produces the same result as the query:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY CUBE (o.year, o.month, r.name);
Since a CUBE contains aggregates for all possible combinations of the grouping columns, the concatenated grouping of CUBES is no different from a regular CUBE, and all the following queries return the same result as the query shown previously:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY CUBE (o.year, o.month), CUBE (r.name); SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY CUBE (o.year), CUBE (o.month, r.name); SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY CUBE (o.year), CUBE (o.month), CUBE (r.name);
Concatenated groupings come in handy while using GROUPING SETS. Since GROUPING SETS produces only the subtotal rows, you can specify just the aggregation levels you want in your output by using a concatenated grouping of GROUPING SETS. The concatenated grouping of GROUPING SETS (a,b) and GROUPING SETS (c,d) will produce aggregate rows for the aggregation levels (a,c), (a,d), (b,c), and (b,d). The concatenated grouping of GROUPING SETS (a,b) and GROUPING SETS (c) will produce aggregate rows for the aggregation levels (a,c) and (b,c). For example:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.year, o.month), GROUPING SETS (r.name); YEAR MONTH REGION TOTAL ---------- --------- -------------------- ---------- 2000 Mid-Atlantic 3352808 2000 New England 3382888 2000 Southeast US 3306874 2001 Mid-Atlantic 1676404 2001 New England 1691444 2001 Southeast US 1653437 January Mid-Atlantic 1832091 January New England 1527645 January Southeast US 1137063 February Mid-Atlantic 1286028 February New England 1847238 February Southeast US 1855269 March Mid-Atlantic 1911093 March New England 1699449 March Southeast US 1967979 15 rows selected.
The concatenated grouping GROUP BY GROUPING SETS (O.YEAR, O.MONTH), GROUPING SETS (R.NAME) in this example produces rows for aggregate levels (O.YEAR, R.NAME) and (O.MONTH, R.NAME). Therefore, you see aggregate rows for (Year, Region) and (Month, Region) combinations in the output. The following example extends the previous query:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.year, o.month), GROUPING SETS (o.year, r. name); YEAR MONTH REGION TOTAL ---------- --------- -------------------- ---------- 1: 2000 10042570 2: 2001 5021285 3: 2000 January 2997866 4: 2000 February 3325690 5: 2000 March 3719014 6: 2001 January 1498933 7: 2001 February 1662845 8: 2001 March 1859507 9: 2000 Mid-Atlantic 3352808 10: 2000 New England 3382888 11: 2000 Southeast US 3306874 12: 2001 Mid-Atlantic 1676404 13: 2001 New England 1691444 14: 2001 Southeast US 1653437 15: January Mid-Atlantic 1832091 16: January New England 1527645 17: January Southeast US 1137063 18: February Mid-Atlantic 1286028 19: February New England 1847238 20: February Southeast US 1855269 21: March Mid-Atlantic 1911093 22: March New England 1699449 23: March Southeast US 1967979 23 rows selected.
This example produces four grouping combinations. Table 13-1 describes the various grouping combinations produced by this query and references their corresponding row numbers in the output.
Grouping combination |
Corresponding rows |
---|---|
(o.year, o.year) |
1-2 |
(o.year, r.name) |
9-14 |
(o.month, o.year) |
3-8 |
(o.month, r.name) |
15-23 |
The GROUPING SETS operation is independent of the order of columns. Therefore, the following two queries will produce the same results as shown previously:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.year, r.name), GROUPING SETS (o.year, o.month); SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.month, o.year), GROUPING SETS (r.name, o.year);
It is permissible to have a combination of ROLLUP, CUBE, and GROUPING SETS in a single GROUP BY clause, as in the following example:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.month, o.year), ROLLUP(r.name), CUBE (o.year);
However, the output from such queries seldom makes any sense. You should carefully evaluate the need for such a query if you intend to write one.
Unlike the ROLLUP and CUBE operations, the GROUPING SETS operation can take a ROLLUP or a CUBE as its argument. As you have seen earlier, GROUPING SETS produces only subtotal rows. However, there are times when you may need to print the grand total along with the subtotals. In such situations, you can perform the GROUPING SETS operation on ROLLUP operations, as in the following example:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (ROLLUP (o.year), ROLLUP (o.month), ROLLUP (r. name)); YEAR MONTH REGION TOTAL ---------- --------- -------------------- ---------- Mid-Atlantic 5029212 New England 5074332 Southeast US 4960311 January 4496799 February 4988535 March 5578521 2000 10042570 2001 5021285 15063855 15063855 15063855 11 rows selected.
This example produces the subtotals for each dimension, as expected from the regular GROUPING SETS operations. Also, it produces the grand total across all the dimensions. However, you get three identical grand-total rows. The grand-total rows are repeated because they are produced by each ROLLUP operation inside the GROUPING SETS. If you insist on only one grand-total row, you may use the DISTINCT keyword in the SELECT clause:
SELECT Distinct o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (ROLLUP (o.year), ROLLUP (o.month), ROLLUP (r. name)); YEAR MONTH REGION TOTAL ----- --------- -------------------- ---------- 2000 10042570 2001 5021285 February 4988535 January 4496799 March 5578521 Mid-Atlantic 5029212 New England 5074332 Southeast US 4960311 15063855 9 rows selected.
In this example, the DISTINCT keyword eliminated the duplicate grand-total rows. You can also eliminate duplicate rows by using the GROUP_ID function, as discussed later in this chapter.
If you are interested in subtotals and totals on composite dimensions, you can use composite or concatenated ROLLUP operations within GROUPING SETS, as in the following example:
SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (ROLLUP (o.year, o.month), ROLLUP(r.name)); YEAR MONTH REGION TOTAL --------- --------- -------------------- ---------- Mid-Atlantic 5029212 New England 5074332 Southeast US 4960311 2000 January 2997866 2000 February 3325690 2000 March 3719014 2000 10042570 2001 January 1498933 2001 February 1662845 2001 March 1859507 2001 5021285 15063855 15063855 13 rows selected.
This query generates subtotals for (year, month) combinations, subtotals for the region, subtotals for the year, and the grand total. Note that there are duplicate grand-total rows because of the multiple ROLLUP operations within the GROUPING SETS operation.