Scheduling and Locking Issues

The previous sections have focused primarily on making individual queries faster. MySQL also allows you to affect the scheduling priorities of statements, which may allow queries arriving from several clients to cooperate better so that individual clients aren't locked out for a long time. Changing the priorities can also ensure that particular kinds of queries are processed more quickly. This section looks at MySQL's default scheduling policy and the options that are available to you for influencing this policy. It also discusses the effect that table handler locking levels have on concurrency among clients. For the purposes of this discussion, a client performing a retrieval (a SELECT) is a reader. A client performing an operation that modifies a table (DELETE, INSERT, REPLACE, or UPDATE) is a writer.

MySQL's basic scheduling policy can be summed up as follows:

  • Write requests should be processed in the order in which they arrive.

  • Writes have higher priority than reads.

For MyISAM and ISAM tables, the scheduling policy is implemented with the aid of table locks. Whenever a client accesses a table, a lock for it must be acquired first. When the client is finished with a table, the lock on it can be released. It's possible to acquire and release locks explicitly by issuing LOCK TABLES and UNLOCK TABLES statements, but normally the server's lock manager automatically acquires locks as necessary and releases them when they no longer are needed.
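For example, a client can bracket a group of statements with explicit lock requests. This sketch uses a hypothetical member table; the point is only the locking idiom:

```sql
-- Acquire an exclusive write lock on the table, modify it, and release
-- the lock. Normally the server's lock manager does this automatically;
-- explicit locking is useful mainly for grouping several statements
-- together under a single lock.
LOCK TABLES member WRITE;
UPDATE member SET expiration = '2003-12-31' WHERE member_id = 14;
UNLOCK TABLES;
```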

A client performing a write operation must have a lock for exclusive access to the table. The table is in an inconsistent state while the operation is in progress because the data record is being deleted, added, or changed, and any indexes on the table may need to be updated to match. Allowing other clients to access the table while the table is in flux causes problems. It's clearly a bad thing to allow two clients to write to the table at the same time because that would quickly corrupt the table into an unusable mess. But it's not good to allow a client to read from an in-flux table, either, because the table might be changing right at the spot being read, and the results would be inaccurate.

A client performing a read operation must have a lock to prevent other clients from writing to the table so that the table doesn't change while the table is being read. However, the lock need not provide exclusive access for reading. The lock can allow other clients to read the table at the same time. Reading doesn't change the table, so there is no reason readers should prevent each other from accessing the table.

MySQL allows you to influence its scheduling policy by means of several query modifiers. One of these is the LOW_PRIORITY keyword for DELETE, INSERT, LOAD DATA, REPLACE, and UPDATE statements. Another is the HIGH_PRIORITY keyword for SELECT statements. The third is the DELAYED keyword for INSERT and REPLACE statements.
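Each modifier follows the initial statement keyword. Sketched with a hypothetical member table:

```sql
-- A write that defers to readers rather than taking precedence over them:
UPDATE LOW_PRIORITY member SET expiration = '2003-12-31' WHERE member_id = 14;

-- A read that slips in ahead of waiting writes:
SELECT HIGH_PRIORITY * FROM member WHERE member_id = 14;

-- An insert for which the server queues the row and returns immediately:
INSERT DELAYED INTO member (last_name, first_name) VALUES ('Brown', 'Marcia');
```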

The LOW_PRIORITY keyword affects scheduling as follows. Normally, if a write operation for a table arrives while the table is being read, the writer blocks until the reader is done because once a query has begun it will not be interrupted. If another read request arrives while the writer is waiting, the reader blocks, too, because the default scheduling policy is that writers have higher priority than readers. When the first reader finishes, the writer proceeds, and when the writer finishes, the second reader proceeds.

If the write request is a LOW_PRIORITY request, the write is not considered to have a higher priority than reads. In this case, if a second read request arrives while the writer is waiting, the second reader is allowed to slip in ahead of the writer. Only when there are no more readers is the writer allowed to proceed. One implication of this scheduling modification is that, theoretically, it's possible for LOW_PRIORITY writes to be blocked forever: as long as additional read requests arrive while previous ones are still in progress, the new requests are allowed to get in ahead of the LOW_PRIORITY write.

The HIGH_PRIORITY keyword for SELECT queries is similar. It allows a SELECT to slip in ahead of a waiting write, even if the write normally has higher priority.

The DELAYED modifier for INSERT acts as follows. When an INSERT DELAYED request arrives for a table, the server puts the rows in a queue and returns a status to the client immediately so that the client can proceed even before the rows have been inserted. If readers are reading from the table, the rows in the queue are held. When there are no readers, the server begins inserting the rows in the delayed-row queue. Every now and then, the server checks whether any new read requests have arrived and are waiting. If so, the delayed-row queue is suspended and the readers are allowed to proceed. When there are no readers left, the server begins inserting delayed rows again. This process continues until the queue is empty.

LOW_PRIORITY and DELAYED are similar in the sense that both allow row insertion to be deferred, but they are quite different in how they affect client operation. LOW_PRIORITY forces the client to wait until the rows can be inserted. DELAYED allows the client to continue and the server buffers the rows until it has time to process them.

INSERT DELAYED is useful if other clients may be running lengthy SELECT statements and you don't want to block waiting for completion of the insertion. The client issuing the INSERT DELAYED can proceed more quickly because the server simply queues the row to be inserted.

However, you should be aware of certain other differences between normal INSERT and INSERT DELAYED behavior. The client gets back an error if the INSERT DELAYED statement contains a syntax error, but other information that would normally be available is not returned. For example, you can't rely on getting the AUTO_INCREMENT value when the statement returns. Nor will you get a count of the number of duplicates on unique indexes. This happens because the insert operation returns a status before the operation actually has been completed. Another implication is that if rows from INSERT DELAYED statements are queued while waiting to be inserted, and the server crashes or is killed with kill -9, the rows are lost. This is not true for a normal kill -TERM termination; in that case, the server inserts the rows before exiting.
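The AUTO_INCREMENT caveat can be seen by comparing a delayed insert with the usual idiom. This sketch assumes a hypothetical member table whose member_id column is an AUTO_INCREMENT primary key:

```sql
INSERT DELAYED INTO member (last_name, first_name) VALUES ('Brown', 'Marcia');
-- The statement returns before the row is actually inserted, so the
-- following cannot be relied on to report the row's new member_id value
-- the way it would after a normal (non-DELAYED) INSERT:
SELECT LAST_INSERT_ID();
```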

The MyISAM handler does allow an exception to the general principle that readers block writers. This occurs under the condition that a MyISAM table has no holes in it (that is, it has no deleted rows), in which case, any INSERT statements must necessarily add rows at the end of the table rather than in the middle. Under such circumstances, clients are allowed to add rows to the table even while other clients are reading from it. These are known as concurrent inserts because they can proceed concurrently with retrievals without being blocked. If you use this feature, note the following:

  • Do not use the LOW_PRIORITY modifier with your INSERT statements. It causes the INSERT always to block until there are no readers, and thus prevents concurrent inserts from being performed.

  • Readers that need to lock the table explicitly but still want to allow concurrent inserts should use LOCK TABLES ... READ LOCAL rather than LOCK TABLES ... READ. The LOCAL keyword allows you to acquire a lock that allows concurrent inserts to proceed, because it applies only to existing rows in the table and does not block new rows from being added to the end.
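A reader that wants an explicit lock without shutting out concurrent inserts might proceed as follows (again using a hypothetical member table):

```sql
-- READ LOCAL locks only the rows that already exist; other clients may
-- still append new rows to the end of the table while the lock is held.
LOCK TABLES member READ LOCAL;
SELECT COUNT(*) FROM member;
UNLOCK TABLES;
```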

The scheduling modifiers did not appear in MySQL all at once. The following table lists the statements that allow modifiers and the version of MySQL in which each appeared. You can use the table to determine which capabilities your server has.

Statement Type Version of Initial Appearance

Locking Levels and Concurrency

The scheduling modifiers just discussed allow you to influence the default scheduling policy. For the most part, they were introduced to deal with issues that arise from the use of table-level locks, which is what the MyISAM and ISAM handlers use to manage table contention.

MySQL now has BDB and InnoDB tables, which implement locking at different levels and thus have differing performance characteristics in terms of contention management. The BDB handler uses page-level locks. The InnoDB handler uses row-level locks, but only as necessary. (In many cases, such as when only reads are done, InnoDB may use no locks at all.)
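The handler is chosen per table at creation time. A sketch, assuming the TYPE option used by MySQL servers of this era (table and column names are illustrative):

```sql
CREATE TABLE t1 (i INT) TYPE = MyISAM;   -- table-level locking
CREATE TABLE t2 (i INT) TYPE = BDB;      -- page-level locking
CREATE TABLE t3 (i INT) TYPE = InnoDB;   -- row-level locking
```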

The locking level used by a table handler has a significant effect on concurrency among clients. Suppose two clients each want to update a row in a given table. To perform the update, each client requires a write lock. For a MyISAM table, the handler will acquire a table lock for the first client, which causes the second client to block until the first one has finished. With a BDB table, greater concurrency can be achieved because the updates can proceed simultaneously, as long as both rows are not located within the same page. With an InnoDB table, concurrency is even higher; both updates can happen at the same time as long as both clients aren't updating the same row.

The general principle is that table locking at a finer level allows better concurrency, because more clients can be using a table at the same time if they use different parts of it. The practical implication is that different table types will be better suited for different query mixes:

  • ISAM and MyISAM are extremely fast for retrievals. However, the use of table-level locks can be a problem in environments with mixed retrievals and updates, especially if the retrievals tend to be long running. Under these conditions, updates may need to wait a long time before they can proceed.

  • BDB and InnoDB tables can provide better performance when there are many updates. Because locking is done at the page or row level rather than at the table level, the extent of the table that is locked is smaller. This reduces lock contention and improves concurrency.

Table locking does have an advantage over finer levels of locking in terms of deadlock prevention. With table locks, deadlock never occurs. The server can determine which tables are needed by looking at the query and lock them all ahead of time. With InnoDB and BDB tables, deadlock can occur because the handlers do not acquire all necessary locks at the beginning of a transaction. Instead, locks are acquired as they are determined to be necessary during the course of processing the transaction. It's possible that two queries will acquire locks and then try to acquire further locks that each depend on already-held locks being released. As a result, each client holds a lock that the other needs before it can continue. This results in deadlock, and the server must abort one of the transactions. For BDB tables, you may be able to help prevent deadlock by using LOCK TABLES to acquire table locks explicitly because the BDB handler sees such locks. That doesn't work for InnoDB, because the handler is not aware of locks set by LOCK TABLES.
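The classic deadlock pattern arises when two clients touch the same rows in opposite order. A sketch against a hypothetical InnoDB table t with rows id = 1 and id = 2 (client 2's interleaved statements are shown as comments):

```sql
-- Client 1 issues:
BEGIN;
UPDATE t SET c = 1 WHERE id = 1;   -- acquires the lock on row 1
-- Meanwhile, client 2 issues:
--   BEGIN;
--   UPDATE t SET c = 2 WHERE id = 2;   -- acquires the lock on row 2
-- Client 1 then issues:
UPDATE t SET c = 1 WHERE id = 2;   -- waits for client 2's row lock
-- Client 2 then issues:
--   UPDATE t SET c = 2 WHERE id = 1;   -- waits for client 1's row lock
-- Neither client can proceed; the server must abort one of the two
-- transactions so that the other can complete.
```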