MSSQL Replication

MSSQL allows you to replicate your databases locally, or to replicate to different servers over a LAN or via the internet. You may want to do this to create a copy of a database for development work, or you may want to maintain a remote copy of a database that is consistent with your local copy, for disaster recovery. You may also want to replicate to mobile devices so that mobile staff can work independently of the central office. SQL Server 2019 (not commercially available yet in September 2019) increases the maximum number of synchronous replicas to 5, up from 3 in SQL Server 2017. There is one primary replica, plus four synchronous secondary replicas.

back to top

Replication Databases

Some replication terms explained

The basic unit of replication is an Article which can be a table or part of a table.

A Publisher is the source server or database that holds the base data that needs to be sent to another server or database.

A Subscriber is the target server or database that receives data from a publisher.

The act of sending articles to another server or database is called Publication. The act of receiving articles from a publisher is called Subscription. If a Publisher send out unsolicited articles to subscribers, this is called Push Subscription. If a Subscriber logs into a Publisher and requests articles this is called Pull Subscription

The Distributor server manages the replication process, and can run stand-alone or as a combined Distributer / Publisher server. Replication information like copy jobs, pending transactions, replication status and history is held in the Distribution Database. The distribution database is used by the replication components of SQL Server, to store data including active transactions, snapshot jobs, synchronization status, and replication history information.

back to top

Types of Replication

There are three types of replication, Snapshot; Transactional and Merge. Transactional or Merge replication both start by applying a Snapshot to the Subscriber.

Snapshot replication is used to make a point-in-time copy of a database. You can schedule a snapshot so it refreshes the replica on a regular basis. Snapshot replication is useful on its own for data that does not change often, or small files.

Transactional replication will capture changes made to articles in the Publisher database and store them in the distribution database. They are then sent out to the Subscribers in the correct transactional order. Transactional replication relies on the Subscribers being constantly connected to the Publisher and is generally more appropriate for large, very active databases. SQL Server 2005 - 2012 supports transactional replication with updatable subscriptions, where updates at a subscriber can be sent back to the publisher, then propagated out to the other subscribers. This feature was removed in SQL Server 2016, so these subscriber databases should really be considered as read-only, as there is no way to get any updates back to the publisher.

Merge replication allows Subscriber users be more independent. They can work 'off-line' and update the Subscriber databases independently of the Publisher database. The Distributer then intercepts all updates and merges them together using a set of user created rules to handle collision updates. This sounds complicated and obviously has scope for data corruption, especially if a subscriber has been disconnected from the Publisher for a while. Potential conflict rules are; first update wins, treat a whole portion of data is a unit, merge data based on business logic. Merge replication is useful for mobile devices.

In SQL releases 2012 to 2106 transactional snapshot can publish data from Oracle to SQL, and also from SQL to Oracle. It can also publish data from DB2 to SQL but not all platforms are supported. This is a warning from Microsoft that this feature may be removed in future releases.

back to top

Backup and Recovery with Replication

The challenge with backup and recovery in a replicated environment is to make sure that the Publication database, the Distribution database and the Subscription databases all end up in a consistent state after a recovery. You need to backup your Production and Distribution databases, Subscriber databases are optional.

Publication Database

The data flow process is; data is updated in the Publication database and is then copied to the Distribution database where it is held until it is propagated out to all the Subscriber databases. If you restore the Publication database back to an earlier point, transactions may exist in the Distribution database that have been backed out from the Publication database

SQL Server has a parameter 'sync with backup' that ensures data consistency and this is called coordinated backups. The process that copies data out to the Distributer database is called the Log Reader Agent, which monitors the log for data changes and immediately propagates them to the distribution database. If you set parameter 'sync with backup' to 'true' the Log Reader Agent will not propagate any transactions until they have been backed up at the Publisher. This has three consequences

  1. You can guarantee that your backups are consistent with transactions passed to the Distributer
  2. You need to backup the publication database frequently
  3. It will take longer to replicate articles, as replication will wait for the next backup to complete.

Your choice is between performance and data integrity. If you want to check the setup of an existing database then execute the T-SQL DATABASEPROPERTYEX function at the Publisher. This will show the setting for the IsSyncWithBackup property of the publication database. A '1' means that coordinated backups are already enabled and a '0' means that they are not.

To ensure that your backups are consistent, execute the command

sp-replicationdboption '{publicationdatabasename}', 'sync with backup', 'true'.

If your performance is more important then you can run with 'sync with backup' set to 'false'. If you need to restore the Publication database, you would then have to reconfigure replication to resychronise all the databases, or if you want, you can force replication to run with inconsistent data. This is not recommended, but if you really want to do it,

Distribution database

In normal operation, the Log Reader Agent will scan the log on the Publication Database to find the next set of transactions that need to be replicated. It will then propagate the transaction out to the Distribution Database and once they are successfully committed it will tell the Publication Database. These transactions are then deleted from the log at the Publication Database and so cannot be propagated back out again if required.

This affects backup and restore of the Distribution database. If you back-level the Distribution database by restoring it, you cannot get it back up to date by re-applying transactions if the missing transactions have been deleted from the Publication database log.

To fix this, set the 'sync with backup' option to 'true' and also set trunc. log on chkpt. option of sp-dboption to false on the Distribution database. This will ensure that the Distribution database is backed up before the Publication database is allowed to truncate its logs. This will not delay replication, but it could increase the size of the Publication Database transaction log.

Subscription database

You do not have to backup Subscription databases. If you lose a Subscription database you can take a new snapshot then re-initialise the subscription.

If you want to be able to recover a subscription database, then change the minimum transaction retention period in the Distributor database from its default value of '0' to a value that is bigger than the interval between Subscription database backups. Then when you restore a Subscriber, the Distribution Agent will propagate any transactions that are missing from the Subscriber.

back to top

Installing SQL Replication

The process of installing replication involves the following steps.

These steps are explained in a bit more detail below, but there are a few pre-requisites to check first.
You need two SQL server instances on different Windows servers, preferrably some distance apart. The local server is called the Publisher server and the remote is called the Subscriber server. For illustration purposes, we will call ours LOCAL01 and REMOTE01.
These two servers must be connected by a network with sufficient bandwidth to handle the amount of changed data.
There must be enough free disk space for the SQL transaction log for the published database.
Replication is not supported on SQL Server instances that are more than two versions apart.
The processes below use SQL Server Management Studio (SSMS) and you must connect to the publisher and subscriber by using a login that is a member of the sysadmin fixed server role.

Create the Windows accounts for replication

You need to create Windows accounts on the local server for the following agents: Snapshot, Distribution, Logreader and Merge. As these are standard Windows functions the process is not explained in detail, you just create these 4 local Windows accounts to your own naming standards using 'Windows System Tools' at the Publisher server, and create accounts for Distribution and Merge at the Subscriber server. To help with the install discussion below, I'll use account names: repl_dist, repl_snap, repl_merge and repl_log.
The accounts must be a members of the db_owner fixed database role in the SQL replication Publisher, Distributor and Subscriber databases, and have read and write or modify permission on the replication snapshot share.

Create a snapshot folder

Create a snapshot folder that will be used to create and store the publication snapshot. The default location is 'C:\Program Files\Microsoft SQL Server\MSSQL.X\MSSQL\Data' but you can use any location with enough free space. In my experience, Windows Administrators do not like to see application data on the C: as it interferes with Windows System State backups. So create your folder, let's call it 'repldata' then right-click the folder and select 'Properties - Sharing - Advanced Sharing'.
In the 'Advanced Sharing' dialog box, select 'Share this Folder', and then select 'Permissions', then 'Add', then enter the Snapshot agent account name that you created above, as LOCAL01\repl_snap, the Merge account as LOCAL01\repl_merge and the Distribution account as LOCAL01\repl_dist.
After you add the three accounts, assign accounts repl_dist and repl_merge with READ permissions, and repl_snap with Full Control.

Select the repldata 'Properties' dialog box, then the 'Security' tab then select 'Edit'. then 'Add'.
In the 'Select Users, Computers, Service Accounts, or Groups' box, enter the Snapshot agent account name as as LOCAL01\repl_snap, the Merge account as LOCAL01\repl_merge and the Distribution account as LOCAL01\repl_dist, and for each account, select 'Check Names' and make sure that the permissions are set as above.

Make a note the Network Path for the share as you will need it later when you're configuring your snapshot folder.

Configure Distribution

Next, you need to configure the Distribution service at the publisher and set the required permissions on the publication and distribution databases. This process describes configuring a new, local distributor, with the distribution service running on the same machine as the Publisher. **Warning** It will overwrite any existing replication topology. Expand the server node in the SQL Server Management Studio, and right-click the Replication folder. If your server connection was defaulting to 'localhost' you will need to change it to the actual server name, LOCAL01 in our case. The Distribution Configuration Wizard should now start.

On the Distributor page, select 'LOCAL01 will act as its own Distributor', clink 'Next', and then the SQL Server will create a distribution database and log.
If the SQL Server Agent is not running, on the SQL Server Agent Start page, select 'Yes, configure the SQL Server Agent service to start automatically'.
Enter your path in the Snapshot folder box, We used \\LOCAL01\repldata, your path should match what you noted at the end of the Configure Snapshot section. Now select 'Next'.
You can accept most of the default values on the remaining pages of the wizard, for instance the name of the distribution database and the folders where the data and log files should be located.
You have a choice in the Wizard Actions, you can select the option to run immediately or you can create a script that can be executed at a later time.
Select Finish to enable distribution.

Now you need to set the database permissions. Expand 'Security' in the SQL Server Management Studio, right-click 'Logins', and then select 'New Login'
On the General page, select Search. Enter LOCAL01\repl_snap in the 'Enter the object name to select' box, select 'Check Names', and then select 'OK'.
On the User Mapping page, in the Users mapped to this login list, select both the distribution and user databases to be replicated.
In the database role membership list, select the db_owner role for the login for all the databases in the previous step.
Select OK to create the login.
Repeat steps 1-4 to create a login for the other local accounts (repl_dist, repl_logr, and repl_merge). These logins must also be mapped to users who are members of the db_owner fixed database role in the distribution and user databases.

Configure the Publisher

Once you’ve configured the Distributor, you can create a Publication.
In Object Explorer, locate the Replication folder, right-click 'Local Publication', and then click 'New Publication'. This will start the New Publication Wizard.
In the Publication Database page, select the user databases to be replicated and click Next.
On the Publication Type page, select 'Transactional publication'.
Now, on the Articles page, choose the articles should be part of this publication. You can restrict replication to specific tables or stored procedures.
Once you’re done with the object selection, Check the 'Show only checked articles' in the list option to list candidates of Publication.
You can also define filters that should be applied to your articles in the 'Filter Table Rows page', or go with the default values.
In the Snapshot Agent page specify when to run the Snapshot Agent. It can be run immediately or it can schedule to run at a later time.
Now, in the Agent Security page, clear the default security settings and enter the snapshot agent account you created above, repl_snap in our case. Select Security Settings for the Snapshot Agent. Enter LOCAL01\repl_snapshot in the Process account box, supply the password for this account, and then select OK. Then do the same for the logreader; LOCAL01\repl_logreader as the Process account for the Log Reader Agent.

In the complete Wizard Actions page, you’ve two options. You can create the Publication immediately or save the configuration in the script file to run at a later time. Type in a name for your publication, we will call ours Lascon01, and click 'Finish'.

Configure the Subscriber

Finally you need to create a subscription.
Connect to the Publisher in SSMS, expand the server node, and then expand the Replication folder. In the Local Publications folder, right-click Lascon01, the publication you created above, and then select 'New Subscriptions' to open the New Subscription Wizard.
On the Publication page, select 'Publication' and then click 'Next'
On the Distribution Agent location page, select 'Run all agents at the Distributor' and Click 'Next'
On the Subscribers page, select 'Add Subscriber', and then select 'Add SQL Server Subscriber' from the drop-down menu and the 'Connect to Server' dialog box will open.
Enter the subscriber instance name and then select 'Connect'.

After the subscriber SQL instance has been added, select the drop-down next to the instance name of your subscriber. Then select 'New Database' under Subscription Database and type in the database name and Click Ok.
The subscription database is created and registered to the subscriber. Now, click 'Next' and make sure that the account has db_owner permission on the newly created database.
On the Distribution Agent Security page, select the button on the right with the ellipsis (...) pattern. Type in the process account details and Click Ok then select Finish to accept the default values on the rest of the wizard.

Check it all works

Connect to the publisher in SMSS and expand the server node then expand the Replication folder.
In the Local Publications folder, expand the Lascon01 publication, right-click the subscription in the ProductReplica database, and then select 'View Synchronization Status' to see the current synchronisation status. If you insert, delete, or update the data in your user database you will (hopefully) see the data propagating between the servers.

It is important to know how much time it takes to propogate data out to the Subscriber. You can check this using tracer tokens
Connect to the publisher in SMSS and expand the server node, then right-click the Replication folder, and select 'Launch Replication Monitor'.
Expand a publisher group in the left pane, expand the publisher instance, and then select the Lascon01 publication.
Select the 'Tracer Tokens' tab then select 'Insert Tracer'.
This will display the elapsed time for the tracer token to pass between the Publisher to Distributor and the Distributor to Subscriber as well as the Total Latency.

back to top

MSSQL storage

Lascon latest major updates