PL/pgSQL is termed a block-structured language. A block is a sequence of statements between a matched set of DECLARE/BEGIN and END statements. Blocks can be nested??meaning that one block can entirely contain another block, which in turn can contain other blocks, and so on. For example, here is a PL/pgSQL function:
1 -- 2 -- my_factorial1.sql 3 -- 4 5 CREATE FUNCTION my_factorial(INTEGER) RETURNS INTEGER AS ' 6 DECLARE 7 arg INTEGER; 8 BEGIN 9 10 arg := $1; 11 12 IF arg IS NULL OR arg < 0 THEN 13 RAISE NOTICE ''Invalid Number''; 14 RETURN NULL; 15 ELSE 16 IF arg = 1 THEN 17 RETURN 1; 18 ELSE 19 DECLARE 20 next_value INTEGER; 21 BEGIN 22 23 next_value := my_factorial(arg - 1) * arg; 24 RETURN next_value; 25 END; 26 END IF; 27 END IF; 28 END; 29 ' LANGUAGE 'plpgsql';
The body of my_factorial() is actually the string between the opening single quote (following the word AS) and the closing single quote (just before the word LANGUAGE).
This function contains two blocks of code. The first block starts at line 6 and ends at line 28. The second block, which is nested inside the first, starts at line 19 and ends at line 25. The first block is called an outer block because it contains the inner block.
I'll talk about variable declarations in more detail in a moment, but I want to point out a few things here. At line 7, you declare a variable named arg. This variable has a well-defined lifetime. arg comes into existence when the function reaches the first DECLARE statement and goes out of existence as soon as the function reaches the END statement at line 27. The lifetime of a variable is also referred to as its scope. You can refer to a variable in any statement within the block that defines the scope of the variable. If you try to refer to a variable outside of its scope, you will receive a compilation error. Remember that you have two (nested) blocks in this function: the outer block and the inner block. Variables declared in an outer block can be used in inner blocks, but the reverse is not true. At line 23 (which is in the inner block), you use the arg variable, which was declared in the outer block. The variable next_value is declared within the inner block: If you try to use next_value in the outer block, you'll get an error.
This function (my_factorial()) contains two blocks, one nested within the other. You can nest blocks as deeply as you need to. You can also define blocks that are not nested. Here is the my_factorial() function again, but this time, I've included a few more blocks:
1 -- 2 -- my_factorial2.sql 3 -- 4 5 CREATE FUNCTION my_factorial(INTEGER) RETURNS INTEGER AS ' 6 DECLARE 7 arg INTEGER; 8 BEGIN 9 10 arg := $1; 11 12 IF arg IS NULL OR arg < 0 THEN 13 BEGIN 14 RAISE NOTICE ''Invalid Number''; 15 RETURN NULL; 16 END; 17 ELSE 18 IF arg = 1 THEN 19 BEGIN 20 RETURN 1; 21 END; 22 ELSE 23 DECLARE 24 next_value INTEGER; 25 BEGIN 26 next_value := my_factorial(arg - 1) * arg; 27 RETURN next_value; 28 END; 29 END IF; 30 END IF; 31 END; 32 ' LANGUAGE 'plpgsql';
This version still has an outer block (lines 6 through 31), but you have multiple inner blocks: lines 13 through 16, lines 19 through 21, and lines 23 through 28. As I said earlier, variables declared in an outer block can be used in inner blocks but the reverse is not true. If you had declared any variables in the block starting at line 19, you could not use any of those variables past the end of the block (at line 21).
Notice that you can indicate the beginning of a block with a DECLARE statement or with a BEGIN statement. If you need to declare any variables within a block, you must include a DECLARE section. If you don't need any local variables within a block, the DECLARE section is optional (an empty DECLARE section is perfectly legal).
Take a close look at line 14 in the previous example:
RAISE NOTICE ''Invalid Number'';
You may have noticed that there are two single quotes at the start of the string ''Invalid Number'' and there are two single quotes at the end. You have to quote strings inside of a function this way because the body of a function is defined as a string. Of course, you can quote embedded strings using any of the three methods described in Chapter 2, "Working with Data in PostgreSQL." You could have written the embedded string in any of the three following forms:
RAISE NOTICE ''Invalid Number''; RAISE NOTICE \'Invalid Number\'; RAISE NOTICE \047Invalid Number\047;
Now, let's go back and look at the components of a function in more detail.
You define a new PL/pgSQL function using the CREATE FUNCTION command. The CREATE FUNCTION command comes in two forms. The first form is used for language interpreters that are embedded into the PostgreSQL server?PL/pgSQL functions fall into this category:
CREATE [OR REPLACE] FUNCTION name ( [ argtype [, ...] ] ) RETURNS return_type AS 'definition' LANGUAGE langname [ WITH ( attribute [, ...] ) ]
The second form is used to define functions that are defined in an external language and compiled into a dynamically loaded object module:
CREATE [OR REPLACE] FUNCTION name ( [ argtype [, ...] ] ) RETURNS return_type AS 'obj_file', 'link_symbol' LANGUAGE langname [ WITH ( attribute [, ...] ) ]
I covered compiled functions in more detail in Chapter 6, "Extending PostgreSQL." For this chapter, I'll focus on the first form.
Each function has a name. However, the name alone is not enough to uniquely identify a PostgreSQL function. Instead, the function name and the data types of each argument (if any) are combined into a signature. A function's signature uniquely identifies the function within a database. This means that you can define many my_factorial() functions:
CREATE FUNCTION my_factorial( INTEGER )... CREATE FUNCTION my_factorial( REAL )... CREATE FUNCTION my_factorial( NUMERIC )...
Each of these functions is uniquely identified by its signature. When you call one of these functions, you provide the function name and an argument; PostgreSQL determines which function to use by comparing the data type of the arguments that you provide with the function signatures. If an exact match is found, PostgreSQL uses that function. If PostgreSQL can't find an exact match, it tries to find the closest match.
When you create a new function, you specify a list of arguments required by that function. In most programming languages, you would declare a name and a type for each function argument. In PL/pgSQL, you declare only the data type. The first argument is automatically named "$1", the second argument is named "$2", and so forth, up to a maximum of 16 arguments (I'll show you how to provide more meaningful names in a moment). You can use predefined data types and user-defined data types in a PL/pgSQL function.
It is important to remember that PL/pgSQL does not support default parameters. If you define a function that requires three parameters, you cannot call that function with fewer (or more) parameters. If you find that you need a function with a variable argument list, you can usually overload your function to obtain the same effect. When you overload a function, you define two (or more) functions with the same name but different argument lists. For example, let's define a function to compute the due date for a tape rental:
1 -- 2 -- compute_due_date.sql 3 -- 4 5 CREATE FUNCTION compute_due_date(DATE) RETURNS DATE AS ' 6 DECLARE 7 8 due_date DATE; 9 rental_period INTERVAL := ''7 days''; 10 11 BEGIN 12 13 due_date := $1 + rental_period; 14 15 RETURN due_date; 16 17 END; 18 ' LANGUAGE 'plpgsql';
This function takes a single parameter, a DATE value, and returns the date one week later. You might want a second version of this function that expects the rental date and a rental period:
20 -- compute_due_date.sql 21 -- 22 CREATE FUNCTION compute_due_date(DATE, INTERVAL) RETURNS DATE AS ' 23 BEGIN 24 25 RETURN( $1 + $2 ); 26 27 END; 28 ' LANGUAGE 'plpgsql';
Now you have two functions named compute_due_date(). One function expects a DATE value, and the other expects a DATE value and an INTERVAL value. The first function compute_due_date(DATE), provides the equivalent of a default parameter. If you call compute_due_date() with a single argument, the rental_period defaults to seven days.
I'd like to point out two things about the compute_due_date(DATE, INTERVAL) function.
First, a stylistic issue?the RETURN statement takes a single argument, the value to be returned to the caller. You can RETURN any expression that evaluates to the return_type of the function (we'll talk more about a function's return_type in a moment). I find it easier to read a RETURN statement if the expression is enclosed in parentheses (see line 25).
Second, you'll notice that I did not DECLARE any local variables. You can treat parameter variables just like any other variable??I used them in an expression in line 25. It's a rare occasion when you should settle for the automatic variable names supplied for function parameters. The name "$1" doesn't convey much meaning beyond telling you that this variable happens to be the first parameter. You should really provide a meaningful name for each parameter; this gives the reader some idea of what you intended to do with each parameter. Using the ALIAS statement, you can give a second, more meaningful, name to a parameter. Here is the compute_due_date(DATE, INTERVAL) function again, but this time I have given alternate names to the parameters:
20 -- compute_due_date.sql 21 -- 22 CREATE FUNCTION compute_due_date(DATE, INTERVAL) RETURNS DATE AS ' 23 DECLARE 24 rental_date ALIAS FOR $1; 25 rental_period ALIAS FOR $2; 26 BEGIN 27 28 RETURN( rental_date + rental_period ); 29 30 END; 31 ' LANGUAGE 'plpgsql';
ALIAS gives you an alternate name for a parameter: you can still refer to an aliased parameter using the $n form, but I don't recommend it. Why bother to give a meaningful name to a parameter and then ignore it?
Every PL/pgSQL function must return a value, even if it only returns NULL. When you create a function, you must declare the data type of the return value. Our compute_due_date() functions return a value of type DATE. A value is returned from a function using the RETURN expression statement. Keep in mind that PL/pgSQL will try to convert the returned expression into the type that you specified when you created the function. If you tried to RETURN( ''Bad Value'' ) from the compute_due_date() function, you would get an error (Bad Date External Representation). We'll see a special data type a little later (OPAQUE) that can be used only for trigger functions.
I'll skip over the function body[3] for the moment and look at the final component[4] required to define a new function. PostgreSQL functions can be written in a variety of languages. When you create a new function, the last component that you specify is the name of the language in which the body of the function is written. All the functions that you will see in this chapter are written in PL/pgSQL, which PostgreSQL knows as LANGUAGE 'plpgsql'.
[3] The function body is everything between the AS keyword and the LANGUAGE keyword. The function body is specified in the form of a string.
[4] When you create a function, you can also specify a set of optional attributes that apply to that function. These attributes tell PostgreSQL about the behavior of the function so that the query optimizer can know whether it can take certain shortcuts when evaluating the function. See the CREATE FUNCTION section in the PostgreSQL Programmer's Guide for more information.
Before you experiment much more with PL/pgSQL functions, it might be useful for you to know how to replace the definition of a function.
If you are using PostgreSQL 7.2 or later, you can use the CREATE OR REPLACE FUNCTION ... syntax. If a function with the same signature already exists, PostgreSQL will silently replace the old version of the function, otherwise, a new function is created.
If you are using a version of PostgreSQL older than 7.2, you will have to DROP the old function before you can create a new one. The syntax for the DROP FUNCTION command is
DROP FUNCTION name( [ argtype [, ...] ] );
Notice that you have to provide the complete signature when you drop a function; otherwise, PostgreSQL would not know which version of the function to remove.
Of course, you can use the DROP FUNCTION command to simply remove a function?you don't have to replace it with a new version.