Cursors

Direct cursor support is new in PL/pgSQL version 7.2. Processing a result set using a cursor is similar to processing a result set using a FOR loop, but cursors offer a few distinct advantages that you'll see in a moment.

You can think of a cursor as a name for a result set. You must declare a cursor variable just as you declare any other variable. The following code snippet shows how you might declare a cursor variable:


...

DECLARE

  rental_cursor      CURSOR FOR SELECT * FROM rentals;

...

rental_cursor is declared to be a cursor for the result set of the query SELECT * FROM rentals. When you declare a variable of type CURSOR, you must include a query. The cursor variable is said to be bound to this query, and the variable is a bound cursor variable.

Before you can use a bound cursor, you must open the cursor using the OPEN statement:


...

DECLARE

  rental_cursor      CURSOR FOR SELECT * FROM rentals;

BEGIN



   OPEN rental_cursor;



...

If you try to OPEN a cursor that is already open, you will receive an error message (cursor "name" already in use). If you try to FETCH (see the section that follows) from a cursor that has not been opened, you'll receive an error message (cursor "name" is invalid). When you use a cursor, you first DECLARE it, then OPEN it, FETCH from it, and finally CLOSE it, in that order. You can repeat the OPEN, FETCH, CLOSE cycle if you want to process the cursor results again.

FETCH

After a bound cursor has been opened, you can retrieve the result set (one row at a time) using the FETCH statement. When you fetch a row from a cursor, you have to provide one or more destination variables that PL/pgSQL can stuff the results into. The syntax for the FETCH statement is


FETCH cursor-name INTO destination [ , destination [...]];

The destination (or destinations) must match the shape of a row returned by the cursor. For example, if the cursor SELECTs a row from the rentals table, there are three possible destinations:

  • A variable of type rentals%ROWTYPE

  • Three variables: one of type rentals.tape_id%TYPE, one of type rentals.customer_id%TYPE, and the last of type rentals.rental_date%TYPE

  • A variable of type RECORD

Let's look at each of these destination types in more detail.

When you FETCH into a variable of some %ROWTYPE, you can refer to the individual columns using the usual variable.column notation. For example:


...

DECLARE

  rental_cursor      CURSOR FOR SELECT * FROM rentals;

  rental             rentals%ROWTYPE;

BEGIN



   OPEN rental_cursor;



   FETCH rental_cursor INTO rental;

   --

   -- I can now access rental.tape_id,

   -- rental.customer_id, and rental.rental_date

   --

   IF ( overdue( rental.rental_date )) THEN

      ...

Next, I can FETCH into a comma-separated list of variables. In the previous example, the rental_cursor cursor will return rows that each contain three columns. Rather than fetching into a %ROWTYPE variable, I can declare three separate variables (of the appropriate types) and FETCH into those instead:


...

DECLARE

  rental_cursor      CURSOR FOR SELECT * FROM rentals;

  tape_id            rentals.tape_id%TYPE;

  customer_id        rentals.customer_id%TYPE;

  rental_date        rentals.rental_date%TYPE;

BEGIN



   OPEN rental_cursor;



   FETCH rental_cursor INTO tape_id, customer_id, rental_date;



   IF ( overdue( rental_date )) THEN

      ...

You are not required to use variables declared with %TYPE, but this is the perfect place to do so. At the time you create a function, you usually know which columns you will be interested in, and declaring variables with %TYPE will make your functions much less fragile in cases where the referenced column types might change.

You cannot combine composite variables and scalar variables in the same FETCH statement[7]:

[7] This seems like a bug to me. You may be able to combine composite and scalar variables in a future release.


...

DECLARE

  rental_cursor  CURSOR FOR SELECT *, now() - rental_date FROM rentals;

  rental         rentals%ROWTYPE;

  elapsed        INTERVAL;

  BEGIN



   OPEN rental_cursor;



   FETCH rental_cursor INTO rental, elapsed; -- WRONG! Can't combine

                                             -- composite and scalar

                                             -- variables in the same

                                             -- FETCH



   IF ( overdue( rental.rental_date )) THEN

      ...

The third type of destination that you can use with a FETCH statement is a variable of type RECORD. You may recall from earlier in this chapter that a RECORD variable is something of a chameleon?it adjusts to whatever kind of data that you put into it. For example, the following snippet uses the same RECORD variable to hold two differently shaped rows:


...

DECLARE

  rental_cursor   CURSOR FOR SELECT * FROM rentals;

  customer_cursor CURSOR FOR SELECT * FROM customers;

  my_data         RECORD;

BEGIN

   OPEN rental_cursor;

   OPEN customer_cursor;



    FETCH rental_cursor INTO my_data;

    --  I can now refer to:

    --     my_data.tape_id

    --     my_data.customer_id

    --     my_data.rental_date



    FETCH customer_cursor INTO my_data;

    --  Now I can refer to:

    --     my_data.customer_id

    --     my_data.customer_name

    --     my_data.phone

    --     my_data.birth_date

    --     my_data.balance

  ...

After you have executed a FETCH statement, how do you know whether a row was actually retrieved? If you FETCH after retrieving the entire result, no error occurs. Instead, each PL/pgSQL function has an automatically declared variable named FOUND. FOUND is a BOOLEAN variable that is set by the PL/pgSQL interpreter to indicate various kinds of state information. Table 7.1 lists the points in time where PL/pgSQL sets the FOUND variable and the corresponding values.

Table 7.1. FOUND Events and Values

Event

Value

Start of each function

FALSE

Start of an integer-FOR loop

FALSE

Within an integer-FOR loop

TRUE

Start of a FOR...SELECT loop

FALSE

Within a FOR...SELECT loop

TRUE

Before SELECT INTO statement

FALSE

After SELECT INTO statement

TRUE (if rows are returned)

Before FETCH statement

FALSE

After FETCH statement

TRUE (if a row is returned)

So, you can see that FOUND is set to TRUE if a FETCH statement returns a row. Let's see how to put all the cursor related statements together into a single PL/pgSQL function:


...

DECLARE

  next_rental  CURSOR FOR SELECT * FROM rentals;

  rental       rentals%ROWTYPE;

BEGIN

  OPEN next_rental;



  LOOP

    FETCH next_rental INTO rental;

    EXIT WHEN NOT FOUND;

    PERFORM process_rental( rental );

  END LOOP;



  CLOSE next_rental;

END;

...

The first thing you do in this code snippet is OPEN the cursor. Next, you enter a LOOP that will process every row returned from the cursor. Inside of the LOOP, you FETCH a single record, EXIT the loop if the cursor is exhausted, and call another function (process_rental())if not. After the loop terminates, close the cursor using the CLOSE statement.

So far, it looks like a cursor loop is pretty much the same as a FOR-IN-SELECT loop. What else can you do with a cursor?

Parameterized Cursors

You've seen that you must provide a SELECT statement when you declare a CURSOR. Quite often, you'll find that you don't know the exact values involved in the query at the time you're writing a function. You can declare a parameterized cursor to solve this problem.

A parameterized cursor is similar in concept to a parameterized function. When you define a function, you can declare a set of parameters (these are called the formal parameters, or formal arguments); those parameters can be used within the function to change the results of the function. If you define a function without parameters, the function will always return the same results (unless influenced by global, external data). Each language imposes restrictions on where you can use a parameter within a function. In general, function parameters can be used anywhere that a value-yielding expression can be used. When you make a call to a parameterized function, you provide a value for each parameter: The values that you provide (these are called the actual parameters, or actual arguments) are substituted inside of the function wherever the formal parameters appear.

When you define a cursor, you can declare a set of formal parameters; those parameters can be used with the cursor to change the result set of the query. If you define a cursor without parameters, the query will always return the same result set, unless influenced by external data. PL/pgSQL restricts the places that you can use a parameter within a cursor definition. A cursor parameter can be used anywhere that a value-yielding expression can be used. When you open a cursor, you must specify values for each formal parameter. The actual parameters are substituted inside of the cursor wherever the formal parameters appear.

Let's look at an example:


 1 ...

 2 DECLARE

 3   next_customer   CURSOR (ID INTEGER) FOR

 4                     SELECT * FROM customers WHERE

 5                     customer_id = ID;

 6   customer        customers%ROWTYPE;

 7   target_customer ALIAS FOR $1;

 8 BEGIN

 9

10   OPEN next_customer( target_customer );

11 ...

Lines 3, 4, and 5 declare a parameterized cursor. This cursor has a single formal parameter; an INTEGER named ID. Notice (at the end of line 5), that I have used the formal parameter within the cursor definition. When I open this cursor, I'll provide an INTEGER value for the ID parameter. The actual parameter that I provide will be substituted into the query wherever the formal parameter is used. So, if target_customer is equal to, say, 42, the cursor opened at line 10 will read:


SELECT * FROM customers WHERE customer_id = 42;

The full syntax for a cursor declaration is


variable-name CURSOR

    [ (param-name param-type [, param-name param-type ...] ) ]

  FOR select-query;

The full syntax for an OPEN statement is


OPEN cursor-name [ ( actual-param-value [, actual-param-value...] ) ];

You would parameterize a cursor for the same reasons that you would parameterize a function: you want the results to depend on the actual arguments. When you parameterize a cursor, you are also making the cursor more reusable. For example, I might want to process all the tapes in my inventory, but I want to process the tapes one distributor at a time. If I don't use a parameterized cursor, I have to declare one cursor for each of my distributors (and I have to know the set of distributors at the time I write the function). Using a parameterized cursor, I can declare the cursor once and provide different actual arguments each time I open the cursor:


 1 CREATE FUNCTION process_tapes_by_distributors( ) RETURNS INTEGER AS '

 2 DECLARE

 3 next_distributor  CURSOR FOR SELECT * FROM distributors;

 4 next_tape         CURSOR( ID ) CURSOR FOR

 5                       SELECT * FROM tapes WHERE dist_id = ID;

 6 dist              distributors%ROWTYPE;

 7 tape              tapes%ROWTYPE;

 8 count             INTEGER := 0;

 9 BEGIN

10   OPEN next_distributor;

11   LOOP

12     FETCH next_distributor INTO dist;

13     EXIT WHEN NOT FOUND;

14     OPEN next_tape( dist.distributor_id );

15     LOOP

16       FETCH next_tape INTO tape;

17       EXIT WHEN NOT FOUND;

18       PERFORM process_tape( dist, tape );

19       count := count + 1;

20     END LOOP;

21     CLOSE next_tape;

22   END LOOP;

23   CLOSE next_distributor;

24   RETURN( count );

25 END;

26 ' LANGUAGE 'plpgsql';

Notice that you can OPEN and CLOSE a cursor as often as you like. A cursor must be closed before it can be opened. Each time you open a parameterized cursor, you can provide new actual parameters.

Cursor References

Now, let's turn our attention to another aspect of cursor support in PL/pgSQL?cursor references.

When you declare a CURSOR variable, you provide a SELECT statement that is bound to the cursor. You can't change the text of the query after the cursor has been declared. Of course, you can parameterize the query to change the results, but the shape of the query remains the same: If the query returns rows from the tapes table, it will always return rows from the tapes table.

Instead of declaring a CURSOR, you can declare a variable to be of type REFCURSOR. A REFCURSOR is not actually a cursor, but a reference to a cursor. The syntax for declaring a REFCURSOR is


DECLARE

  ref-name REFCURSOR;

  ...

Notice that you do not specify a query when creating a REFCURSOR. Instead, a cursor is bound to a REFCURSOR at runtime. Here is a simple example:


 1 ...

 2 DECLARE

 3   next_rental CURSOR FOR SELECT * FROM rentals;

 4   next_tape   CURSOR FOR SELECT * FROM tapes;

 5   rental      rentals%ROWTYPE;

 6   tape        tape%ROWTYPE;

 7   next_row    REFCURSOR;

 8 BEGIN

 9   OPEN next_rental;

10   next_row := next_rental;

11   FETCH next_rental INTO rental;

12   FETCH next_row INTO rental;

13   CLOSE next_rental;

14

15   next_row := next_tape;

16   OPEN next_tape;

17   FETCH next_row  INTO tape;

18   CLOSE next_row;

19 ...

In this block, I've declared two cursors and one cursor reference. One of the cursors returns rows from the rentals table, and the other returns rows from the tapes table.

At line 9, the next_rental cursor opens. At line 10, I give a value to the next_row cursor reference. We now have two ways to access the next_rental cursor: through the next_rental cursor variable and through the next_row cursor reference. At this point, next_row refers to the next_rental cursor. You can see (at lines 11 and 12) that you can FETCH a row using either variable. Both FETCH statements return a row from the rentals table.

At line 14, the next_row cursor reference points to a different cursor. Now, when you FETCH from next_row, you'll get a row from the tapes table. Notice that you can point next_row to a cursor that has not yet been opened. You can CLOSE a cursor using a cursor reference, but you can't OPEN a cursor using a cursor reference.

Actually, you can open a cursor using a REFCURSOR; you just can't open a named cursor. When you declare a CURSOR variable, you are really creating a PostgreSQL cursor whose name is the same as the name of the variable. In the previous example, you created one cursor (not just a cursor variable) named next_rental and a cursor named next_tape. PL/pgSQL allows you to create anonymous cursors using REFCURSOR variables. An anonymous cursor is a cursor that doesn't have a name[8]. You create an anonymous cursor using the OPEN statement, a REFCURSOR, and a SELECT statement:

[8] An anonymous cursor does in fact have a name, but PostgreSQL constructs the name, and it isn't very reader-friendly. An anonymous cursor has a name such as <unnamed cursor 42>.


1 ...

2 DECLARE

3   next_row REFCURSOR;

4 BEGIN

5   OPEN next_row FOR SELECT * FROM customers;

6 ...

At line 5, you are creating an anonymous cursor and binding it to the next_row cursor reference. After an anonymous cursor has been opened, you can treat it like any other cursor. You can FETCH from it, CLOSE it, and lose it. That last part might sound a little fishy, so let me explain further. Take a close look at the following code fragment:


 1 CREATE FUNCTION leak_cursors( INTEGER ) RETURNS INTEGER AS '

 2   DECLARE

 3    next_customer CURSOR FOR SELECT * FROM customers;

 4      next_rental  REFCURSOR;

 5     customer      customers%ROWTYPE;

 6     rental        rentals%ROWTYPE;

 7     count         INTEGER := 0;

 8   BEGIN

 9

10    OPEN next_customer;

11

12    LOOP

13    FETCH next_customer INTO customer;

14    EXIT WHEN NOT FOUND;

15    OPEN next_rental FOR

16       SELECT * FROM rentals

17         WHERE rentals.customer_id = customer.customer_id;

18

19     LOOP

20     FETCH next_rental INTO rental;

21     EXIT WHEN NOT FOUND;

22

23     RAISE NOTICE ''customer_id = %, rental_date = %'',

24           customer.customer_id, rental.rental_date;

25

26      count := count + 1;

27     END LOOP;

28

29     next_rental := NULL;

30

31   END LOOP;

32    CLOSE next_customer;

33   RETURN( count );

34  END;

35 ' LANGUAGE 'plpgsql';

This function contains two loops: an outer loop that reads through the customers table and an inner loop that reads each rental for a given customer. The next_customer cursor is opened (at line 10) before the outer loop begins. The next_rental cursor is bound and opened (at lines 15, 16, and 17) just before the inner loop begins. After the inner loop completes, I set the next_rental cursor reference to NULL and continue with the outer loop. What happens to the cursor that was bound to next_rental? I didn't explicitly close the cursor, so it must remain open. After executing the assignment statement at line 29, I have no way to access the cursor again??remember, it's an anonymous cursor, so I can't refer to it by name. This situation is called a resource leak. A resource leak occurs when you create an object (in this case, a cursor) and then you lose all references to that object. If you can't find the object again, you can't free the resource. Avoid resource leaks; they're nasty and can cause performance problems. Resource leaks will also cause your code to fail if you run out of a resource (such as memory space). We can avoid the resource leak shown in this example by closing the next_rental before setting it to NULL.

You've seen what not to do with a cursor reference, but let's see what cursor references are really good for. The nice thing about a cursor reference is that you can pass the reference to another function, or you can return a reference to the caller. These are powerful features. By sharing cursor references between functions, you can factor your PL/pgSQL code into reusable pieces.

One of the more effective ways to use cursor references is to separate the code that processes a cursor from the code that creates the cursor. For example, you may find that we need a function to compute the total amount of money that we have received from a given customer over a given period of time. I might start by creating a single function that constructs a cursor and processes each row in that cursor:


...

  OPEN next_rental FOR

    SELECT * FROM rentals WHERE

      customer_id = $1 AND

      rental_date BETWEEN $2 AND $3;



  LOOP

    FETCH next_rental INTO rental

    -- accumulate rental values here

    ...

This is a good start, but it works only for a single set of conditions: a given customer and a given pair of dates. Instead, you can factor this one function into three separate functions.

The first function creates a cursor that, when opened, will return all rentals records for a given customer within a given period; the cursor is returned to the caller:


CREATE FUNCTION

select_rentals_by_customer_interval( INTEGER, DATE, DATE )

 RETURNS REFCURSOR AS '

  DECLARE

    next_rental  REFCURSOR;

  BEGIN

    OPEN next_rental FOR

      SELECT * FROM RENTALS WHERE

        customer_id = $1 AND

        rental_date BETWEEN $2 AND $3;

     RETURN( next_rental );

  END;

' LANGUAGE 'plpgsql';

The second function, given a cursor that returns rentals records, computes the total value of the rentals accessible through that cursor:


CREATE FUNCTION

compute_rental_value( REFCURSOR )

 RETURNS NUMERIC AS '

  DECLARE

    total       NUMERIC(7,2) := 0;

    rental      rentals%ROWTYPE;

    next_rental ALIAS FOR $1;

  BEGIN

    LOOP

      FETCH next_rental INTO rental;

      EXIT WHEN NOT FOUND;

      -- accumulate rental values here

      --

      -- pretend that this is a complex

      -- task which requires loads of amazingly

      -- clever code

      ...

    END LOOP;

    RETURN( total );

  END;

' LANGUAGE 'plpgsql';

The last function invokes the first two:


CREATE FUNCTION

compute_value_by_customer_interval( INTEGER, DATE, DATE )

 RETURNS NUMERIC AS '

  DECLARE

    curs  REFCURSOR;

    total NUMERIC(7,2);

  BEGIN

    curs  := select_rentals_by_customer_interval( $1, $2, $3 );

    total := compute_rental_value( curs );

    CLOSE curs;

    RETURN( total );

  END;

' LANGUAGE 'plpgsql';

The advantage to this approach is that you can construct a cursor using different selection criteria and call compute_total_value(). For example, you might want to compute the total values of all rentals of a given tape:


CREATE FUNCTION compute_tape_value( VARCHAR )

 RETURNS NUMERIC AS '

  DECLARE

    curs  REFCURSOR;

    total NUMERIC(7,2);

  BEGIN

    OPEN curs FOR SELECT * FROM rentals WHERE tape_id = $1;

    total := compute_rental_value( curs );

    CLOSE curs;

    RETURN( total );

  END;

' LANGUAGE 'plpgsql';



    Part II: Programming with PostgreSQL