SQL Run

Database Programming

14. Transactions

In this chapter

Executing a Transaction

A transaction is a sequence of one or more SQL statements executed as a single logical unit of work. The DBMS considers a transaction to be an indivisible, all-or-nothing proposition: it executes all the transaction’s statements as a group, or it executes none of them.

For example, suppose that a bank customer transfers $500 from her savings account to her checking account. This operation consists of two separate actions, executed sequentially:

  1. Decrement savings balance by $500.
  2. Increment checking balance by $500.

Figure 14.1 shows the two SQL statements for this transaction. Now imagine that the DBMS fails—power outage, system crash, hardware problem—after it executes the first statement but before the second. The accounts would be out of balance without your knowledge. Accusations of malfeasance and prison time would soon follow.

To avoid legal problems, use a transaction to guarantee that both SQL statements are performed to maintain the accounts in proper balance. When something prevents one of the statements in a transaction from executing, the DBMS undoes (rolls back) the other statements of the transaction. If no error occurs, then the changes are made permanent (committed).

Figure 14.1Two SQL statements are needed when a banking customer transfers money from savings to checking.

UPDATE savings_accounts
  SET balance = balance - 500.00
  WHERE account_number = 208998628;

UPDATE checking_accounts
  SET balance = balance + 500.00
  WHERE account_number = 786783165;

Executing a Transaction

To learn how transactions work, you need to learn a few terms:

Commit.Committing a transaction makes all data modifications performed since the start of the transaction a permanent part of the database. After a transaction is committed, all changes made by the transaction become visible to other users and are guaranteed to be permanent if a crash or other failure occurs.

Roll back.Rolling back a transaction retracts any of the changes resulting from the SQL statements in the transaction. After a transaction is rolled back, the affected data are left unchanged, as though the SQL statements in the transaction were never executed.

Transaction log.The transaction log file, or just log, is a serial record of all modifications that have occurred in a database via transactions. The transaction log records the start of each transaction, the changes to the data, and enough information to undo or redo the changes made by the transaction (if necessary later). The log grows continually as transactions occur in the database.

Although it’s the DBMS’s responsibility to ensure the physical integrity of each transaction, it’s your responsibility to start and end transactions at points that enforce the logical consistency of the data, according to the rules of your organization or business. A transaction should contain only the SQL statements necessary to make a consistent change—no more and no fewer. Data in all referenced tables must be in a consistent state before the transaction begins and after it ends.

When you’re designing and executing transactions, some important considerations are:

Concurrency Control

To humans, computers appear to carry out two or more processes at the same time. In reality, computer operations occur not concurrently, but in sequence. The illusion of simultaneity appears because a microprocessor works with much smaller time slices than people can perceive. In a DBMS, concurrency control is a group of strategies that prevents loss of data integrity caused by interference between two or more users trying to access or change the same data simultaneously.

DBMSs use locking strategies to ensure transactional integrity and database consistency. Locking restricts data access during read and write operations; thus, it prevents users from reading data that are being changed by other users and prevents multiple users from changing the same data at the same time. Without locking, data can become logically incorrect, and statements executed against those data can return unexpected results. Occasionally you’ll end up in a deadlock, where you and another user, each having locked a piece of data needed for the other’s transaction, attempt to get a lock on each other’s piece. Most DBMSs can detect and resolve deadlocks by rolling back one user’s transaction so that the other can proceed (otherwise, you’d both wait forever for the other to release the lock). Locking mechanisms are very sophisticated; search your DBMS documentation for locking.

Concurrency transparency is the appearance from a transaction’s perspective that it’s the only transaction operating on the database. A DBMS isolates a transaction’s changes from changes made by any other concurrent transactions. Consequently, a transaction never sees data in an intermediate state; either it sees data in the state they were in before another concurrent transaction changed them, or it sees the data after the other transaction has completed. Isolated transactions let you reload starting data and replay (roll forward) a series of transactions to end up with the data in the same state they were in after the original transactions were executed.

For a transaction to be executed in all-or-nothing fashion, the transaction’s boundaries (starting and ending points) must be clear. These boundaries let the DBMS execute the statements as one atomic unit of work. A transaction can start implicitly with the first executable SQL statement or explicitly with the START TRANSACTION statement. A transaction ends explicitly with a COMMIT or ROLLBACK statement (it never ends implicitly). You can’t roll back a transaction after you commit it.

Oracle and Db2 transactions always start implicitly, so those DBMSs have no statement that marks the start of a transaction. In Microsoft Access, Microsoft SQL Server, MySQL, and PostgreSQL, you can (or must) start a transaction explicitly by using the BEGIN statement. The SQL:1999 standard introduced the START TRANSACTION statement—long after these DBMSs already were using BEGIN to start transactions, so the extended BEGIN syntax varies by DBMS. MySQL and PostgreSQL support START TRANSACTION (as a synonym for BEGIN).

To start a transaction explicitly:

To commit a transaction:

To roll back a transaction:

The SELECT statements in Listing 14.1 show that the UPDATE operations are performed by the DBMS and then undone by a ROLLBACK statement. See Figure 14.2 for the result.

Listing 14.1Within a transaction block, UPDATE operations (like INSERT and DELETE operations) are never final. See Figure 14.2 for the result.

SELECT SUM(pages), AVG(price) FROM titles;

BEGIN TRANSACTION;
  UPDATE titles SET pages = 0;
  UPDATE titles SET price = price * 2;
  SELECT SUM(pages), AVG(price) FROM titles;
ROLLBACK;

SELECT SUM(pages), AVG(price) FROM titles;

Figure 14.2Result of Listing 14.1. The results of the SELECT statements show that the DBMS cancelled the transaction.

SUM(pages) AVG(price)
---------- ----------
      5107    18.3875

SUM(pages) AVG(price)
---------- ----------
         0    36.7750

SUM(pages) AVG(price)
---------- ----------
      5107    18.3875

Listing 14.2 shows a more practical example of a transaction. I want to delete the publisher P04 from the table publishers without generating a referential-integrity error. Because some of the foreign-key values in titles point to publisher P04 in publishers, I first need to delete the related rows from the tables titles, titles_authors, and royalties. I use a transaction to be certain that all the DELETE statements are executed. If only some of the statements were successful, then the data would be left inconsistent. (For information about referential-integrity checks, see “Specifying a Foreign Key with FOREIGN KEY” in Chapter 11.)


Listing 14.2Use a transaction to delete publisher P04 from the table publishers and delete P04’s related rows in other tables.

BEGIN TRANSACTION;

  DELETE FROM title_authors
    WHERE title_id IN
      (SELECT title_id
         FROM titles
         WHERE pub_id = 'P04');

  DELETE FROM royalties
    WHERE title_id IN
      (SELECT title_id
         FROM titles
         WHERE pub_id = 'P04');

  DELETE FROM titles
    WHERE pub_id = 'P04';

  DELETE FROM publishers
    WHERE pub_id = 'P04';

COMMIT;

ACID

ACID is an acronym that summarizes the properties of a transaction:

Atomicity.Either all of a transaction’s data modifications are performed, or none of them are.

Consistency.A completed transaction leaves all data in a consistent state that maintains all data integrity. A consistent state satisfies all defined database constraints. (Note that consistency isn’t necessarily preserved at any intermediate point within a transaction.)

Isolation.A transaction’s effects are isolated (or concealed) from those of all other transactions. See “Concurrency Control” earlier in this chapter.

Durability.After a transaction completes, its effects are permanent and persist even if the system fails.

Transaction theory is a big topic, separate from the relational model. A good (if sometimes dated) reference is Transaction Processing: Concepts and Techniques by Jim Gray and Andreas Reuter.

Tips for Transactions