MS DTC Architecture

In general, each Microsoft SQL Server will have an associated distributed transaction coordinator (MS DTC) on the same machine with it.

The MS DTC allows applications to extend transactions across two or more instances of MS SQL Server and participate in transactions managed by transaction managers that comply with the X/Open DTP XA standard.

The MS DTC will act as the primary coordinator for these distributed transactions. The specific job of the MS DTC is to enlist (include) and coordinate SQL Servers and remote servers (linked servers) that are part of a single distributed transaction.

SQL Server will automatically promote a local transaction to a distributed transaction when it encounters the remote server access in combination with an update request, whether or not you have explicitly started a distributed transaction. The MS DTC coordinates the execution of the distributed transaction at each participating datasource and makes sure the distributed transaction completes. It ensures that all updates are made permanent in all datasources (committed), or makes sure that all of the work is undone (rolled back) if it needs to be. At all times, the state of all datasources involved are kept intact. To guarantee that this is taken care of properly, the MS DTC manages each distributed transaction using the two-phase commit protocol.

Two-Phase Commit Protocol

An MS DTC service provides two-phase commit functionality based on its ability to act as a transaction manager across one or more resource managers. A SQL Server or other OLE DB datasource is considered a resource manager of its own data. A distributed transaction is made up of local transactions in each individual resource manager. Each resource manager must be able to commit or roll back its local transaction in coordination with all the other resource managers enlisted in the distributed transaction (as illustrated in Figure 32.7). That is the transaction manager's job (MS DTC in this case). This distributed transaction is referred to as a UOW. In fact, it will appear in the Transaction List portion of DTC and have a status and a unique UOW ID assigned to it.

Figure 32.7. MS DTC architecture.

graphics/32fig07.gif

A distributed transaction goes through the following steps:

The distributed transaction is started and is assigned a unique UOW ID by the controlling distributed transaction coordinator (the transaction manager) for the SQL Server.
Data modification statements are issued against any linked/remote servers available to SQL Server (as well as to SQL Server). These become local transactions at each linked/remote server and are controlled by the local resource manager (the DBMS engine in most cases).
The transaction server (MS DTC) enlists the appropriate servers into the distributed transaction.
Phase One (Prepare): The transaction manager sends a "prepare to commit" request to each resource manager. The resource managers, in turn, perform their local transaction commit processing to the point just before releasing the minimal locks on the affected resources. All of the resource managers communicate back to the transaction manager that they are "ready to commit."
Phase Two (Commit): If all the resource managers return an okay to their prepare requests, the transaction manager sends commit transaction commands to each of them. Then, each resource manager can do a final commit for the local transaction and release the locks on the held resources. The distributed transaction is complete. However, if any of the resource managers returns an error to the prepare request, the transaction manager will send rollback commands to each of the resource managers to undo all the local transactions.

If one of the enlisted SQL Servers is unable to communicate with the transaction server, the database involved is marked as suspect. When the transaction server is "visible" again, the affected server should be restarted so that the database and the in-doubt transaction can be recovered.

In general, an application can initiate a distributed transaction from SQL Server by doing the following:

Starting with a local transaction and then issuing a distributed query. The transaction will automatically be promoted to a distributed transaction that is controlled by MS DTC.
Issuing a BEGIN DISTRIBUTED TRANSACTION statement explicitly.
Starting with a local transaction and then issuing a remote procedure call (SQL Server option of REMOTE_PROC_TRANSACTIONS must be set "on"). Again, this will automatically be promoted to a distributed transaction.
Allowing a SQL Server connection to participate in a distributed transaction.

Data Transformation Services (DTS) uses functions offered by the MS DTC to include the benefits of distributed transactions to the DTS package developer. This adds significant data integrity to DTS package programming.

If you haven't already done so, start up the MS DTC service. It should be listed as a service under Microsoft SQL Manager, or it can be started via the Control Panel, Services option. Figure 32.8 shows this.

Figure 32.8. Starting the MS DTC service.

graphics/32fig08.jpg

Another quick way to determine whether your server can communicate with DTC is to open a query window and execute a BEGIN DISTRIBUTED TRANSACTION statement:

BEGIN DISTRIBUTED TRANSACTION 
Go
Server: Msg 8501, Level 16, State 3, Line 1
MSDTC on server 'C81124-C\DBARCH01' is unavailable.

The service was probably not started automatically and can be easily done so.

The MS DTC is the transaction manager for distributed transactions. It makes use of a log file to record the outcome of all transactions that have made use of its services.

The Importance of `MSDTC.LOG`

Never modify or delete the MSDTC.LOG file. If you delete this file, MS DTC has nowhere to write its activity and ultimately fails.

MSDTC.LOG Location

By default, the DTC log file is installed in the \System32\DTClog directory under WIN2000 or WINNT. If you want it somewhere else for performance and backup/recovery purposes, specify this location at install time. It's much easier to change the location at install time than to rewire it later. Plan ahead!

CAUTION

Before you apply service packs or upgrades, make sure that all in-doubt transactions are resolved. Microsoft reserves the right to change the format of the DTC log file between versions.

Executing Distributed Transactions

As a developer, you can change the way you code very slightly by using BEGIN DISTRIBUTED TRANSACTION instead of the usual BEGIN TRANSACTION (used for local transactions).

Earlier in this chapter, a business requirement of generating a weekly Customer Orders report was described, and the distributed query was coded to fulfill this requirement. You are now ready to turn this into a distributed transaction that will update the address information on SQL Server with the most recent values available from the Excel spreadsheet. You can simply create the distributed transaction as follows:

Begin Distributed Transaction 
Update CustomersPlus
set Address = b.Address,
City = b.City
FROM CustomersPlus AS a
INNER JOIN [ExcelSW]...[SWCustomers$] AS b
ON a.CustomerID = b.CustomerID
commit transaction
go

A quick peek at MS DTC via the MS DTC Client as this distributed transaction is being executed shows its uniquely assigned UOW ID and "active" status. For Windows 2000, this is available through Control Panel, Administrative Tools, Component Services, Transaction List. Figure 32.9 illustrates the active transaction list on a server.

Figure 32.9. MS DTC Transaction list.

graphics/32fig09.jpg

In general, you will want to try to limit the number of linked/remote servers in a single distributed transaction for performance reasons, locking reasons, and to limit the risk of being hit by a downed network. If you keep the number of linked/remote servers short, they will reward you with great durability.

Performance Monitoring and Troubleshooting

Obviously, because the transaction is more complicated and involves more components (servers, the network, and so on), things are going to go wrong.

The MS DTC Console provides you with a comprehensive set of statistics and information on the DTC service running on the server. It is one of the better ways to monitor some key areas:

General?Version information, coordinator name, and the ability to start and stop the MS DTC service.
Transactions?The currently active transactions and their state. From this window, you can also resolve a transaction by forcing a commit, abort, or forget (see Figure 32.10).
Figure 32.10. Resolving a transaction.
Statistics?The current number of active and in-doubt transactions, together with historical information on the total number committed, aborted, and so on.

You can only resolve a transaction if the transaction status is "in-doubt." The only other way to deal with this is to kill the process spid:

kill 64      /* the spid id of the initiating transaction */ 
go

If the distributed transaction status is "in-doubt," you can also issue a kill to the UOW and have it rolled back.

kill '5185e284-96a4-4529-91f4-27dcc766f9f8' with rollback 
go            /* the UOW of the distributed transaction */

The Transaction Statistic option is great for seeing current and aggregate distributed transaction quantities. As depicted in Figure 32.11, you can see the workload of the MS DTC and the response times associated with all distributed transactions being handled by this MS DTC.

Figure 32.11. Transaction statistics.

graphics/32fig11.jpg

If you have opened more than one connection to SQL Server (or other linked/remote Server) and submitted asynchronous updates (distributed transactions), one connection can become blocked (locked) by the other. This is said to be a distributed deadlock. To avoid this, make sure you have a query time-out for each connection and perhaps a lock time-out for each connection as well. An overall look at the reason for issuing asynchronous updates is in order as well. SQL Server 2000 cannot automatically detect a distributed deadlock. I have had to resort to kills to resolve this situation.

The MS DTC service is also sending application events to the event log. You can start here to investigate the reasons for this service not starting successfully. Figure 32.12 shows the Event Viewer and the MSDTC source information entries.

Figure 32.12. Event Viewer and MSDTC source.

graphics/32fig12.jpg

Make sure you have enough user connections for all servers involved. The ease and transparency of doing distributed queries/transactions often is overlooked from the remote server's point of view as far as available user connections.

Often overlooked is the bigger picture of distributed transactions and the impact on database backups and recovery. When you enter into supporting distributed transactions, you should also build a well-synchronized set of backup and recovery scripts. Logically related databases that are being updated via distributed transactions should be backed up and recovered together!

SQL Query Analyzer also provides a great picture and details of how a distributed query/transaction is executed. Use it extensively. Following is an example of the execution of the distributed transaction (the "Update statement" from earlier in this chapter). The Execution Plan option, as shown in Figure 32.13, has been chosen to show the cost and method of execution of all pieces of the distributed SQL statement.

Figure 32.13. Execution Plan for a Distributed Query.

graphics/32fig13.jpg

In addition, SQL Server 2000 has a mechanism for an OLE DB provider to return data distribution statistics that can be utilized by the query optimizer, increasing the chances of an optimal query plan.