A transaction is one of the mechanisms provided within SQL to enforce database integrity and maintain data consistency. The details of implementation differ among the RDBMS vendors, though the SQL92/99 spirit is generally preserved.
A transaction complements the concept of the session with additional granularity — it divides every operation that occurs within the session into logical units of work. In this way, database operations — those involving data and structure modifications — are performed step-by-step and can be rolled back at any time, or committed if every step is successful. The idea of the transaction is to provide a mechanism for ensuring that a multistep operation is performed as a single unit. If any of the steps involved in a transaction fails, the whole transaction is rolled back. If all steps have been completed successfully, the transaction can be either committed (to save all the changes into a database) or rolled back to undo all the changes.
The SQL standard defined transactions from the very beginning and enhanced the concept during subsequent iterations. According to the standard, a transaction is started automatically by RDBMS and continues until COMMIT or ROLLBACK statements are issued; the details were left for the vendors to implement.
A transaction must pass the ACID test:
Atomicity. Either all the changes are made or none.
Consistency. All the data involved into an operation must be left in a consistent state upon completion or rollback of the transaction; database integrity cannot be compromised.
Isolation. One transaction should not be aware of the modifications made to the data by any other transaction unless it was committed to the database. Different isolation levels can be set to modify this default behavior.
Durability. The results of a transaction that has been successfully committed to the database remain there.
One of the classic real-life example of a transaction involves an ATM (bank machine) withdrawal operation. Suppose you need $20 and you decide to withdraw this money from the nearest bank machine; you put in your bank card (User ID) and enter your PIN (personal identification number) to initiate the session. Once the bank confirms your identity, you are allowed to proceed; you ask for a money withdrawal operation in the amount of $20. That's where the transaction begins. There are several operations involved: the machine needs to check your account to verify that you have enough money to cover the transaction, subtract the money from your account, and release the money to you. If any of these steps (and some others, depending on the given bank policies) fails, the transaction must be aborted, and everything must revert to a state where it was before the transaction even began.
This means that you cannot get your cash, unless it was subtracted from your balance; the bank cannot subtract the money from your balance unless you have enough money to cover the transaction and you actually received your cash.
The transaction model, as it is defined in the ANSI/ISO standard, utilizes the implicit start of a transaction, with an explicit COMMIT, in the case of successful execution of all transactions logical units, or an explicit ROLLBACK, when the noncommitted changes need to be rolled back (e.g., when program terminates abnormally). Most vendors follow this model, while some — Microsoft SQL Server 2000 is one example — allow for explicit start of a transaction.
The COMMIT statement ends the current transaction and makes all changes made to the data during transaction permanent. The syntax is virtually identical for all three RDBMS vendors, as well as for the SQL99 standard, and is very straightforward:
The keyword WORK is not required, though it might be added for clarity; a simple COMMIT is usually all that is required.
Oracle 9i syntax looks like follows
COMMIT [WORK] [COMMENT (<text>)] [FORCE (<text>), [<int>]] ;
Here the COMMENT clause enables you to specify a comment (up to 255 bytes long) that is recorded for every pending transaction and can be viewed through DBA2_PC_PENDING dictionary view (see Chapter 13 for more information on system catalogs). The FORCE clause allows you to commit an in-doubt distributed (see more about distributed transactions later in the chapter) transaction manually; it commits only a named transaction and has no effect on all other transactions.
The IBM DB2 UDB syntax is identical to the standard. In IBM terminology, transaction is a unit of work (UOW). No authorization is required to issue the statement; all locks held by the transaction are released. Named transactions are not supported.
The following syntax will work both for Oracle 9i and IBM BDF2 UDB:
UPDATE customer SET cust_status_s = 'N'; COMMIT;
Microsoft SQL Server 2000 does support the SQL99 standard syntax — in addition to its own. The Microsoft syntax allows for committing named transaction whereas the standard one does not.
COMMIT [ TRAN [ SACTION ] [<transaction name>]]
As you can see, only COMMIT is required, everything else is optional, and the keywords can be shortened (i.e., TRAN instead of TRANSACTION). Alternatively COMMIT WORK can be used.
The following example illustrates the COMMIT statement using Microsoft SQL Server 2000 explicit transactions mode.
BEGIN TRAN SELECT * FROM customer UPDATE customer SET cust_status_s = 'N' COMMIT TRAN
No changes are taking place until the last COMMIT is executed. Only Microsoft requires a BEGIN TRANSACTION statement to start an explicit transaction; in both Oracle and DB2 UDB, transaction are always started implicitly for every DML or DDL statement.
When COMMIT is executed, SQL Server must start a transaction either implicitly or explicitly for another COMMIT to execute successfully; if no transaction is started, issuing this command will result in an error:
Server: Msg 3902, Level 16, State 1, Line 1 The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.
Neither Oracle nor DB2 UDB will complain, no matter how many times you execute COMMIT.
When changes made to the data in the databases need to be "undone" the ROLLBACK should be used. It may be issued anytime before the last COMMIT and results in automatic rollback of all changes made since the controlling transaction had started.
The syntax is identical in all RDBMS and SQL99 standards (see Table 7-4), save for using named transactions in Microsoft SQL Server 2000 and some Oracle-specific optional clauses. The following statement will attempt to update column CUST_STATUS_S in the CUSTOMER table of the ACME database, but all changes will be rolled back:
UPDATE customer SET cust_status_s = 'N' ROLLBACK WORK
ROLLBACK [WORK] [TO SAVEPOINT <savepoint name>] | [FORCE <text>]
IBM DB2 UDB
ROLLBACK [WORK] [TO SAVEPOINT <savepoint name>]
Microsoft SQL Server 2000
ROLLBACK [TRAN[SACTION]] [<transaction name>] [<savepoint name>]
As with a COMMIT statement, all the locks are released if the ROLLBACK command is issued.
The Oracle 9i WORK clause is optional and the TO SAVEPOINT clause is explained later in this chapter; the FORCE clause pertains to distributed transactions, acting very much the same as in the COMMIT transaction case; Microsoft SQL Server has an optional transaction name clause.
Because certain statements (like DDL) automatically issue a COMMIT before and after, every change to data that happened prior to the DDL statement would be committed as well.
Here is an example that is valid for all three RDBMS (assuming the IMPLICIT_TRANSACTIONS option is set to ON in Microsoft SQL Server 2000):
UPDATE customer SET cust_status_s = 'N' WHERE cust_id_n = 1 DELETE customer WHERE cust_id_n = 1 ROLLBACK WORK
Neither UPDATE nor DELETE will be committed to the database, as the whole transaction is rolled back.
Usually, a transaction consists of more than one SQL statement that you may want to either COMMIT or ROLLBACK. To add granularity to the transaction processing, the SAVEPOINT concept was introduced. It allows you to specify a named point within the transaction, usually after the last successful statement, and, if any error occurs after that, roll all the changes back not to the beginning of the transaction but to that particular SAVEPOINT. An explicit (or implicit, like the one issued after a DDL statement) COMMIT releases all SAVEPOINTs declared within a transaction.
Oracle 9i has the most straightforward syntax for the SAVEPOINT:
SAVEPOINT <savepoint name>;
Here is an example of using the SAVEPOINTs in Oracle:
UPDATE customer SET cust_status_s = 'N' WHERE cust_id_n = 1; SAVEPOINT first_upadate; DELETE customer WHERE cust_id_n = 2; SAVEPOINT first_delete; DELETE customer WHERE cust_id_n = 10; ROLLBACK first_update; COMMIT;
In the example above, only UPDATE gets committed to the database, all DELETEs are rolled back, and the SAVEPOINT first_delete is erased.
The savepoint name must be unique within the current transaction; if a new savepoint uses the same name, the previous savepoint is destroyed.
Here is the IBM DB2 UDB syntax for SAVEPOINT:
SAVEPOINT <savepoint name > [UNIQUE] [ON ROLLBACK RETAIN CURSORS] [ON ROLLBACK RETAIN LOCKS]
Several optional clauses can be specified with the standard SAVEPOINT statement. The UNIQUE clause indicates that the session does not intend to reuse the name, rendering it therefore unique; if this statement is omitted and the same name is used later in the transaction, the previous SAVEPOINT with that name will be destroyed and a new one created.
The ON ROLLBACK RETAIN CURSORS clause specifies what the system will do with implicit or explicit cursors opened after the SAVEPOINT statement in the case of a rollback; the last clause — ON ROLLBACK RETAIN LOCKS — changes the default behavior that instructs RDBMS not to release locks acquired after the SAVEPOINT statement.
See Chapter 14 for more information on explicit cursors. Both IBM and Oracle employ a concept of an implicit cursor — a special structure for manipulating data, when virtually every select statement opens one. The discussion of implicit cursors is beyond the scope of this book.
DB2 UDB also has RELEASE SAVEPOINT statement that destroys all the SAVEPONTS created after that named savepoint.
Microsoft SQL Server 2000 has the most unorthodox syntax, when it comes to establishing the SAVEPOINTs.
SAVE TRAN[SACTION] <savepoint name>
When rolling back to a specific SAVEPOINT, all data changes become undone, but all the locks are held until COMMIT or full ROLLBACK commands are issued. The SAVE TRAN [SACTION] statement is not supported in distributed transactions.
Here is an example illustrating use of the SAVE TRANSACTION statement in Microsoft SQL Server 2000:
BEGIN TRANSACTION trans1 UPDATE customer SET cust_status_s = 'N' WHERE cust_id_n = 1 SAVE TRANSACTION cust_1 UPDATE customer SET cust_status_s = 'N' WHERE cust_id_n = 2 ROLLBACK TRANSACTION cust_1 COMMIT TRANSACTION
This code begins a named transaction TRANS1, updates field CUST_STATUS_S for the customer whose ID is 1, then creates a SAVEPOINT with the name CUST_1. It then proceeds to update another customer's status, and then it rolls back the changes made for customer 2 by rolling back the transaction to the savepoint. The transaction is finally committed, and only the first update actually takes place.
There are different transaction isolation levels. Isolation levels refer to the ability of the transaction to see the world (data) outside its own scope, i.e., data modified by any other transaction. The SQL99 standard isolation levels are listed in Table 7-5.
This level is the lowest of all isolation levels, permitting dirty reads (i.e., able to see uncommitted data). No locks are issued, none honored.
This level specifies that shared locks will be held while data is being read. No dirty reads (containing uncommitted data) are permitted; though phantom reads (when row number changes between the reads) may occur.
No changes will be allowed for the data selected by a query (locked for updates, deletes, etc.), but phantom rows may appear.
The highest level of transaction isolation; places a lock for the whole dataset; no modifications from outside are allowed until the end of the transaction.
Oracle 9i has two transaction isolation levels — SERIALIZABLE and READ COMMITED. The SET TRANSACTION syntax for Oracle can be complicated:
SET TRANSACTION [READ ONLY] | [READ WRITE] [ISOLATION LEVEL [SERIALIZABLE | READ COMMITTED]] [USE ROLLBACK SEGMENT <segment name>] [NAME <transaction name>]
As you can see, the statement can be used to set many parameters, though it cannot be done all at once. To set a transaction as READ ONLY, the following statement could be used:
SET TRANSACTION READ ONLY NAME 'trans1'; SELECT * FROM CUSTOMER ; COMMIT;
After the transaction was set as READ ONLY, you cannot modify any data within this transaction either with UPDATE or INSERT statements.
Oracle is the only one among the "big three" RDBMS that provides for READ ONLY mode of a transaction. In full compliance with the SQL99 standard, this clause sets the transaction for read-only mode, and an error is generated if an attempt to change data is made. It establishes statement-level behavior, which becomes the default for the session.
There is some terminology confusion in how DB2 UDB defines transaction isolation levels. What SQL99 specifies as SERIALIZABLE, it names REPEATABLE READ (RR), which is the highest isolation level in DB2 UDB.
SQL99 REPEATABLE READ becomes READ STABILITY (RS), and a new level — CURSOR STABILITY — is introduced.
The last one, CURSOR STABILITY (CS), is the default for IBM DB2 UDB and resembles the READ COMMITTED level of the SQL99 standard. Essentially, it guarantees that a row of data will remain unchanged.
The UNCOMMITED READ (UR) level is the same as it is defined by the standard: no locks are acquired, so dirty reads are possible.
DB2 UDB also has NO COMMIT (NC) as the isolation level, which is not supported by its mainframe big brother DB2.
When establishing connection from within an application, the isolation level can be specified using PREP or BIND API directives, from the command-line processor the following statement may be used:
db2 => CHANGE ISOLATION TO UR DB20000I The CHANGE ISOLATION command completed successfully
You cannot change isolation levels while connected to DB2 UDB; the isolation level is specified before the connection is established. Use the TERMINATE command to disconnect from the DB2 UDB database.
Microsoft SQL Server 2000 supports all four levels of isolation. The isolation level is set for the whole session, not just a single transaction. To specify a level within the session, the following statement is used:
SET TRANSACTION ISOLATION LEVEL <level>
Here is an example, illustrating the importance of the transaction isolation level to manipulate consistent data using Microsoft SQL Server 2000. (The example, with minor modifications, is applicable to Oracle and DB2 UDB as well.) This example performs an update, selects the updated value, and then rolls back the transaction (OSQL interface, see Appendix E for more information):
1> SELECT cust_status_s 2> FROM customer 3> WHERE cust_id_n = 1 4> GO cust_status_s ------------- N (1 row affected) 1> SET TRANSACTION ISOLATION LEVEL READ COMMITTED 2> GO 1> BEGIN TRAN TRAN1 2> UPDATE customer 3> SET cust_status_s = 'Y' 4> WHERE cust_id_n = 1 5> GO (1 row affected) 1> SELECT cust_status_s 2> FROM customer 3> WHERE cust_id_n = 1 4> GO cust_status_s ------------- Y (1 row affected) 1> ROLLBACK TRAN TRAN1 2> GO 1> SELECT cust_status_s 2> FROM customer 3> WHERE cust_id_n = 1 4> GO cust_status_s ------------- N (1 row affected)
The transaction TRANS1 updates the field CUST_STATUS_S, changing it from Y to N, and then issues a SELECT statement that shows the changed data. The transaction isolation level for the session is READ COMMITED, so only changes committed to the database are supposed to be selected. Since the SELECT was issued within the same transaction, it will be able to see uncommitted changes made by this transaction update. The data changes will be visible to other transactions that attempt to select it within the sessions with transaction isolation level set to READ UNCOMMITED; but they are invisible for transactions with other levels of isolation — if they were issued prior to the ROLLBACK TRANSACTION statement. The example also shows that the data, after the transaction was rolled back, remain unchanged.