Replication Agents

SQL Server utilizes replication agents to do different tasks during the replication process. These agents are constantly waking up at some frequency and fulfilling specific jobs. As you can see in Figure 22.18, several replication agent categories are listed under the Replication Monitor branch. Let's look at the main ones.

Figure 22.18. Various agent jobs.

graphics/22fig18.jpg

The Snapshot Agent

The snapshot agent is responsible for preparing the schema and initial data files of published tables and stored procedures, storing the snapshot on the distribution server and recording information about the synchronization status in the distribution database. Each publication will have its own snapshot agent that runs on the distribution server. It will take on the name of the publication within the publishing database, within the machine on which it executes ([Machine][Publishing database][Publication Name]).

Figure 22.18 shows what this snapshot agent looks like under the SQL Server Agent (Management > SQL Server Agent > Jobs) branch in Enterprise Manager. In addition, it also can be referenced from the Replication Monitor option (within the Replication Monitor > Agents > Snapshot Agents branch). You probably most often will use the Replication Monitor path to these agents!

It's worth noting that the snapshot agent might not even be used if the initialization of the subscriber's schema and data is done manually.

The Snapshot Agent

The snapshot agent is the process that ensures that both databases start on an even playing field. This process is known as synchronization. The synchronization process is performed whenever a publication has a new subscriber. Synchronization happens only one time for each new subscriber and ensures that database schema and data are exact replicas on both servers. After the initial synchronization, all updates are made via replication.

When a new server subscribes to a publication, synchronization is performed. When synchronization begins, a copy of the table schema is copied to a file with an .SCH extension. This file contains all the information necessary to create the table and any indexes on the tables, if they are requested. Next, a copy is made of the data in the table to be synchronized and written to a file with a .BCP extension. The data file is a BCP, or bulk copy file. Both files are stored in the temporary working directory on the distribution server.

After the synchronization process has started and the data files have been created, any inserts, updates, and deletes are stored in the distribution database. These changes will not be replicated to the subscription database until the synchronization process is complete.

When the synchronization process starts, only new subscribers are affected. Any subscriber that has been synchronized already and has been receiving modifications is unaffected. The synchronization set is applied to all servers that are waiting for initial synchronization. After the schema and data have been re-created, all transactions that have been stored in the distribution server are sent to the subscriber.

When you set up a subscription, it is possible to manually load the initial snapshot onto the server. This is known as manual synchronization. For extremely large databases, it is frequently easier to dump the database to tape and then reload the database on the sub-scription server. If you load the snapshot this way, SQL Server will assume that the databases already are synchronized and automatically will begin sending data modifications.

The Snapshot Agent Processing

Figure 22.19 shows the details of the snapshot agent execution for a typical push subscription.

Figure 22.19. Snapshot agent execution.

graphics/22fig19.jpg

Following is the sequence of tasks that are carried out by the snapshot agent:

The snapshot agent is initialized. This initialization can be immediate or at a designated time in your company's nightly processing window.
The agent will then connect to the publisher.
The agent then generates schema files with an .SCH file extension for each article in the publication. These schema files are written to a temporary working directory on the Distribution Server. These are the Create Table statements, and such, that will be used to create all objects needed on the subscription server side. They will only exist for the duration of the snapshot processing!
All the tables in the publication are then locked (held). The lock is required to ensure that no data modifications are made during the snapshot process.

The agent extracts a copy of the data in the publication and writes it into the temporary working directory on the distribution server. If all the subscribers are SQL Servers, then the data will be written using a SQL Server native format with a .BCP file extension. If you are replicating to databases other than SQL Server, the data will be stored in standard text files with the .TXT file extension. The .SCH file and the .TXT files/.BMP files are known as a synchronization set. Every table or article will have a synchronization set.

CAUTION

Please make sure that you have enough disk space on the drive that contains the temporary working directory. The snapshot data files will potentially be huge, which might be the most common reason for snapshot failure.

The agent will then execute the object creations and Bulk Copy Processing at the subscription server side in the order that they were generated (or skip the object creation part if the objects have already been created on the subscription server side and you have indicated this during setup). This will take a while. For this reason, it is best to do this in an off time so as not to impact the normal processing day. Network connectivity is critical here. Snapshots often fail at this point.
The snapshot agent will then post the fact that a snapshot has occurred and what articles/publications were part of the snapshot to the distribution database. This will be the only thing that is sent to the distribution database.
When all the synchronization sets have finished being executed, the agent releases the locks on all of the tables of this publication. The snapshot is now considered finished.

The Log Reader Agent

The log reader agent is responsible for moving transactions marked for replication from the transaction log of the published database to the distribution database. Each database published using transactional replication has its own log reader agent that runs on the distribution server. It will be easy to find because it takes on the name of the publishing database whose transaction log it is reading [Machine name][Publishing DB name] and REPL-LogReader category. Figure 22.18 also shows a Log Reader agent for the Northwind database.

After initial synchronization has taken place, the log reader agent begins to move transactions from the publication server to the distribution server. All actions that modify data in a database are logged to the transaction log in that database. Not only is this log used in the automatic recovery process, but it also is used in the replication process. When an article is created for publication and the subscription is activated, all entries about that article are marked in the transaction log. For each publication in a database, a log reader agent reads the transaction log and looks for any marked transactions. When the log reader agent finds a change in the log, it reads the changes and converts them to SQL statements that correspond to the action that was taken in the article. The SQL statements then are stored in a table on the distribution server waiting to be distributed to subscribers.

Because replication is based on the transaction log, several changes are made in the way the transaction log works. During normal processing, any transaction that has either been successfully completed or rolled back, is marked inactive. When you are performing replication, completed transactions are not marked inactive until the log reader process has read them and sent them to the distribution server.

It should be noted that truncating and fast bulk-copying into a table are non-logged processes. In tables marked for publication, you will not be able to perform non-logged operations unless you, temporarily, turn off replication on that table.

One of the major changes in the transaction log comes when you have the Truncate Log on Checkpoint database option turned on. When the Truncate Log on Checkpoint option is on, SQL Server truncates the transaction log every time a checkpoint is performed, which can be as often as every several seconds. The inactive portion of the log will not be truncated until the log reader process has read the transaction.

The Distribution Agent

A distribution agent moves transactions and snapshot jobs held in the distribution database out to the subscribers. This agent won't be created until a push subscription is defined. This distribution agent will take on the name of what the publication database is along with the subscriber information [machine name][publication DB name][subscriber machine name]. Figure 22.18 also shows a distribution agent for the Northwind database.

Those not set up for immediate synchronization share a distribution agent that runs on the distribution server. Pull subscriptions, to either snapshot or transactional publications, have a distribution agent that runs on the subscriber. Merge publications do not have a distribution agent at all. Rather, they rely on the merge agent, discussed next.

In transactional replication, the transactions have been moved into the distribution database, and the distribution agent either pushes out the changes to the subscribers or pulls them from the distributor, depending on how the servers were set up. All actions that change data on the publishing server are applied to the subscribing servers in the same order they were incurred. Figure 22.20 shows the latest history of the distribution agent and the successful delivery of a transaction.

Figure 22.20. Distribution agent activity.

graphics/22fig20.jpg

The Merge Agent

When dealing with merge publications, the merge agent moves and reconciles incremental data changes that occurred after the initial snapshot was created. Each merge publication has a merge agent that connects to the publishing server, and the subscribing server and updates both as changes are made. In a full merge scenario, the agent first uploads all changes from the subscriber where the generation is 0, or the generation is greater than the last generation sent to the publisher. The agent gathers the rows in which changes were made, and those rows without conflicts are applied to the publishing database.

A "conflict" can arise when changes are made at both the publishing server and the subscription server to a particular row(s) of data. A conflict resolver handles these conflicts. Conflict resolvers are associated with an article in the publication definition. These conflict resolvers are sets of rules or custom scripts that can handle any complex conflict situation that might occur. The agent then reverses the process by downloading any changes from the publisher to the subscriber. Push subscriptions have merge agents that run on the publication server, whereas pull subscriptions have merge agents that run on the subscription server. Snapshot and transactional publications do not use merge agents.

The Miscellaneous Agents

In Figure 22.21, you can see that several miscellaneous agents have been set up to do house cleaning around the replication configuration. These agents include the following:

Agent History Clean Up: distribution?Clears out agent history from the distribution database every 10 minutes (by default). Depending on the size of the distribution, you might want to vary the frequency of this agent.
Distribution Clean Up: distribution?Clears out replicated transactions from the distribution database every 72 hours by default. This agent is used for snapshot and transactional publications only. If the volume of transactions is high, the frequency of this agent will want to be adjusted downward so you don't have too large of a distribution database. However, the frequency of synchronization with subscribers drives this frequency adjustment.
Expired Subscription Clean Up?Detects and removes expired subscriptions from the published databases. As part of the subscription setup, an expiration date will be set. This agent usually runs once per day by default. You won't need to change this.
Reinitialize Subscriptions Having Data Validation Failures?This agent is manually invoked. It is not on a schedule, but it could be. It automatically will detect the subscriptions that failed data validation and mark them for reinitialization. This can then potentially lead to a new snapshot being applied to a subscriber that had data validation failures.
Replication Agents Checkup?Detects replication agents that are not actively logging history. This is critical because debugging replication errors is often dependent on an agent's history that has been logged.