An aggregate function summarizes the results of an expression over a number of rows, returning a single value. The general syntax for most of the aggregate functions is as follows:

([DISTINCT | ALL]aggregate_function)expression

The syntax elements are:

*aggregate_function*-
Gives the name of the function?e.g., SUM, COUNT, AVG, MAX, MIN

*DISTINCT*-
Specifies that the aggregate function should consider only distinct values of the argument expression.

*ALL*-
Specifies that the aggregate function should consider all values, including all duplicate values, of the argument expression. The default is ALL.

*expression*-
Specifies a column, or any other expression, on which you want to perform the aggregation.

Let's look at a simple example. The following SQL uses the MAX function to find the maximum salary of all employees:

SELECT MAX(salary) FROM employee;MAX(SALARY) ----------- 5000

In subsequent sections, we use a series of slightly more involved
examples that illustrate various aspects of aggregate function
behavior. For those examples, we use the following
`cust_order` table:

DESC cust_orderName Null? Type -------------------------------- -------- -------------- ORDER_NBR NOT NULL NUMBER(7) CUST_NBR NOT NULL NUMBER(5) SALES_EMP_ID NOT NULL NUMBER(5) SALE_PRICE NUMBER(9,2) ORDER_DT NOT NULL DATE EXPECTED_SHIP_DT NOT NULL DATE CANCELLED_DT DATE SHIP_DT DATE STATUS VARCHAR2(20)SELECT order_nbr, cust_nbr, sales_emp_id, sale_price,order_dt, expected_ship_dtFROM cust_order;ORDER_NBR CUST_NBR SALES_EMP_ID SALE_PRICE ORDER_DT EXPECTED_ --------- -------- ------------ ---------- --------- --------- 1001 1 7354 99 22-JUL-01 23-JUL-01 1000 1 7354 19-JUL-01 24-JUL-01 1002 5 7368 12-JUL-01 25-JUL-01 1003 4 7654 56 16-JUL-01 26-JUL-01 1004 4 7654 34 18-JUL-01 27-JUL-01 1005 8 7654 99 22-JUL-01 24-JUL-01 1006 1 7354 22-JUL-01 28-JUL-01 1007 5 7368 25 20-JUL-01 22-JUL-01 1008 5 7368 25 21-JUL-01 23-JUL-01 1009 1 7354 56 18-JUL-01 22-JUL-01 1012 1 7354 99 22-JUL-01 23-JUL-01 1011 1 7354 19-JUL-01 24-JUL-01 1015 5 7368 12-JUL-01 25-JUL-01 1017 4 7654 56 16-JUL-01 26-JUL-01 1019 4 7654 34 18-JUL-01 27-JUL-01 1021 8 7654 99 22-JUL-01 24-JUL-01 1023 1 7354 22-JUL-01 28-JUL-01 1025 5 7368 25 20-JUL-01 22-JUL-01 1027 5 7368 25 21-JUL-01 23-JUL-01 1029 1 7354 56 18-JUL-01 22-JUL-01 20 rows selected.

Notice that the
column `sale_price`
in the
`cust_order`
table is nullable, and that it
contains NULL values for some rows. To examine the effect of NULLs in
an aggregate function, execute the following SQL:

SELECT COUNT(*), COUNT(sale_price) FROM cust_order;COUNT(*) COUNT(SALE_PRICE) -------- ----------------- 20 14

Notice the difference in the output of `COUNT(*)`
and `COUNT(sale_price)`. This is because
`COUNT(sale_price)` ignores NULLs, whereas
`COUNT(*)` doesn't. The reason
`COUNT(*)` doesn't ignore NULLs is
because it counts rows, not column values. The concept of NULL
doesn't apply to a row as a whole. Other than
`COUNT(*)`, there is only one other aggregate
function that doesn't ignore NULLs, and that is
GROUPING. All other aggregate functions ignore NULLs. We will discuss
GROUPING in Chapter 13. For now,
let's examine the effect of NULLs when they are
ignored.

SUM, MAX, MIN, AVG, etc., all ignore NULLs. Therefore, if you are
trying to find a value such as the average sale price in the
`cust_order` table, the average will be of the 14
rows that have a value for that column. The following example shows
the count of all rows, the total of all sale prices, and the average
of all sale prices:

SELECT COUNT(*), SUM(sale_price), AVG(sale_price)FROM cust_order;COUNT(*) SUM(SALE_PRICE) AVG(SALE_PRICE) --------------- --------------- --------------- 20 788 56.2857143

Note that `AVG(sale_price)` is not equal to
`SUM(sale_price) / COUNT(*)`. If it were, the result
of `AVG(sale_price)` would have been 788 / 20 =
39.4. But, since the AVG function ignores NULLS, it divides the total
sale price by 14, and not by 20. `AVG(sale_price)`
is equal to `SUM(sale_price) / COUNT(sale_price) (788 / 14 =
56.2857143)`.

There may be situations where you want an average to be taken over all the rows in a table, not just the rows with non-NULL values for the column in question. In those situations you have to use the NVL function within the AVG function call to assign 0 (or some other useful value) to the column in place of any NULL values. (DECODE, CASE, or the COALESCE function can be used in place of NVL. See Chapter 9 for details.) Here's an example:

SELECT AVG(NVL(sale_price,0)) FROM cust_order;AVG(NVL(SALE_PRICE,0)) ---------------------- 39.4

Notice that the use of NVL causes all 20 rows to be considered for
average computation, and the rows with NULL values for
`sale_price`
are assumed to have a 0 value for
that column.

Most
aggregate functions allow
the
use of DISTINCT or
ALL along with the expression
argument. DISTINCT allows you to disregard duplicate expression
values, while ALL causes duplicate expression values to be included
in the result. Notice that the column `cust_nbr` has
duplicate values. Observe the result of the following SQL:

SELECT COUNT(cust_nbr), COUNT(DISTINCT cust_nbr), COUNT(ALL cust_nbr)FROM cust_order;COUNT(CUST_NBR) COUNT(DISTINCTCUST_NBR) COUNT(ALLCUST_NBR) --------------- ----------------------- ------------------ 20 4 20

There are four distinct values in the `cust_nbr`
column. Therefore, `COUNT(DISTINCT cust_nbr)`
returns 4, whereas `COUNT(cust_nbr)` and
`COUNT(ALL cust_nbr)` both return 20. ALL is the
default, which means that if you don't specify
either DISTINCT or ALL before the expression argument in an aggregate
function, the function will consider all the rows that have a
non-NULL value for the expression.

An important thing to note here is that ALL doesn't
cause an aggregate function to consider NULL values. For example,
`COUNT(ALL SALE_PRICE)` in the following example
still returns 14, and not 20:

SELECT COUNT(ALL sale_price) FROM cust_order;COUNT(ALLSALE_PRICE) -------------------- 14

Since ALL is the default, you can explicitly use ALL with every aggregate function. However, the aggregate functions that take more than one argument as input don't allow the use of DISTINCT. These include CORR, COVAR_POP, COVAR_SAMP, and all the linear regression functions.

In addition, some functions that take only one argument as input don't allow the use of DISTINCT. This category includes STTDEV_POP, STDDEV_SAMP, VAR_POP, VAR_SAMP, and GROUPING.

If you try to use DISTINCT with an aggregate function that doesn't allow it, you will get an error. For example:

SELECT STDDEV_POP(DISTINCT sale_price)FROM cust_order;SELECT STDDEV_POP(DISTINCT sale_price) * ERROR at line 1: ORA-30482: DISTINCT option not allowed for this function

However, using ALL with such a function doesn't cause any error. For example:

SELECT STDDEV_POP(ALL sale_price)FROM cust_order;STDDEV_POP(ALLSALE_PRICE) ------------------------- 29.5282639