Installing TSM backups on a Microsoft cluster

A Microsoft Cluster consists of a number of physical servers that are capable of hosting a series of resources. If any node in the cluster fails, all the resources hosted on that node fail over to another node in the cluster. For ease of management, resources are combined into groups, and failover acts at the group level. Resources can be things like server names and IP addresses, but the ones that matter for TSM are disks, file shares and TSM schedule resources.

Open up the Windows Cluster Administrator, and in the left-hand menu you will see Groups, Resources, Cluster Configuration, then a list of nodes, which are the physical servers that host the cluster. Let us assume we have a 2 node cluster, CLS001 and CLS002, and 5 SAN attached disks, DSK01-5. We also have 2 groups, group-a and group-b. In this example, group-a contains DSK01-3 and group-b contains DSK04-5. Make a note of the Cluster Name, as you need it to define the schedule services; ours is CL001.

Defining the Clients

For TSM, you then need 4 clients: 1 each for the physical nodes CLS001 and CLS002, to back up the local drives, and 1 for each of the groups, to back up the shared drives. You also need 4 schedule services, one for each client. Install TSM on both the physical nodes, and make sure it is installed on the same path on each server. The default location is C:\Program Files\tivoli\tsm\baclient. Just install and schedule standard TSM clients on the physical servers, but set the domains so they back up local disks only. Make sure they do not back up the cluster disks, DSK01-5.

For the cluster resources, you define two clients at the TSM server. If you look at each group through the Cluster Administrator, you will see that each group has a Network Name. It seems intuitively obvious (to me) that this is the best name to use for the TSM client name. The clients are defined with a standard Register Node command, with an extra parameter; ours are called CLABC01 and CLABC02.

Allocate a directory on one of the disks in each cluster group for your TSM configuration. It is best to use a standard name for this across all your clusters, as you will be typing it a lot; something like tsm, tsmconfig or tsmfiles. Suppose DSK01-3 are defined to the servers as f: g: h: and DSK04-5 are defined as i: j:. You then allocate a directory \tsm on the f: and the i: drives and put a dsm.opt file in each one. You will have your own standards for dsm.opt files, but they will look something like
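A minimal sketch of such a file for group-a, stored as f:\tsm\dsm.opt. The server address tsmserver.example.com, the port and the log file names are assumptions for illustration only; substitute your own values and site standards.

```text
* dsm.opt for cluster group-a (node CLABC01), kept on the shared f: drive
* TCPSERVERADDRESS and the log file names below are placeholders, not site values
NODENAME          CLABC01
COMMMETHOD        TCPIP
TCPSERVERADDRESS  tsmserver.example.com
TCPPORT           1500
PASSWORDACCESS    GENERATE
CLUSTERNODE       YES
DOMAIN            f: g: h:
SCHEDLOGNAME      f:\tsm\dsmsched.log
ERRORLOGNAME      f:\tsm\dsmerror.log
```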

Note that the clusternode option is set to YES, and the DOMAIN option picks out the 3 drives in that cluster group. The TSM schedule and error logs are allocated on the cluster disk, so they move between physical nodes too. Define another dsm.opt file for the other resource group, with the appropriate nodename, log file locations and domains, and place that file in the \tsm directory on the i: drive.

Defining the Schedule Services

Next, you need to install schedule services for the cluster nodes, and it is best to do this with a dsmcutil command. You must do this on both the physical nodes, for each cluster node, so here you would install 4 clustered schedule services. Start with the server that is hosting the cluster disks, navigate to the directory where you installed your client code, usually C:\Program Files\tivoli\tsm\baclient, then run these commands.

dsmcutil install SCHED /name:"TSM Scheduler Service - CLABC01" /clientdir:"c:\Program Files\tivoli\tsm\baclient" /optfile:f:\tsm\dsm.opt /node:CLABC01 /password:nodepassword /validate:yes /autostart:no /startnow:no /clusternode:yes /clustername:CL001

dsmcutil install SCHED /name:"TSM Scheduler Service - CLABC02" /clientdir:"c:\Program Files\tivoli\tsm\baclient" /optfile:i:\tsm\dsm.opt /node:CLABC02 /password:nodepassword /validate:yes /autostart:no /startnow:no /clusternode:yes /clustername:CL001

Next, fail the groups over to the other physical server using Cluster Administrator, then run the same 2 commands again. If the cluster groups are not both hosted on the same server, then fail over or adjust the way you run these commands as appropriate.

Defining the Cluster Services

Now you need to add a Windows cluster service resource to manage the TSM schedule resources. Again, start with the physical node that is hosting both resource groups, and open up the Cluster Administrator.

Right click on group-a, then select New -> Resource
On the first panel enter a name for this resource, which must be unique and should start with TSM so it's obvious what it is about, something like 'TSM SCHEDULE SERVICE FOR GROUP-A', and optionally add a description. The 'Resource Type' must be 'Generic Service' and the final 'Group' field should already be pre-filled with 'group-a'.
The next screen lists the possible owners, the physical servers that can host this group, in our case CLS001 and CLS002. Make sure all the physical servers are allocated.
The next screen is for Dependencies that must be available before the TSM service can start. For CLABC01, this will be the f: g: and h: drives.
Next you define the local service that you will start when this cluster service starts, which is the TSM scheduler. The name you specify here must exactly match the name that you used when defining the service, which was 'TSM Scheduler Service - CLABC01'.
Now you need to define the Registry key, which is SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes\CLABC01\TSM-server-name
Select OK, and the Cluster Resource will be created. Before you start it, right click on it, go into Properties, navigate to the 'Advanced' section and untick the 'Affect the group' box. If you leave this ticked and there is a problem with the TSM scheduler, then that could take the managed disks offline and affect customer service.
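
The GUI steps above can also be sketched from the command line with cluster.exe. Treat this as a rough outline under stated assumptions rather than a tested procedure: the disk resource names "Disk F:", "Disk G:" and "Disk H:" are hypothetical (use the resource names shown in Cluster Administrator), and TSM-server-name stands for your own TSM server name, as in the registry key above.

```bat
rem Create a Generic Service resource in group-a
cluster resource "TSM SCHEDULE SERVICE FOR GROUP-A" /create /group:"group-a" /type:"Generic Service"

rem Point it at the scheduler service installed earlier with dsmcutil
cluster resource "TSM SCHEDULE SERVICE FOR GROUP-A" /privproperties ServiceName="TSM Scheduler Service - CLABC01"

rem Make it depend on the group's disks (disk resource names are assumed)
cluster resource "TSM SCHEDULE SERVICE FOR GROUP-A" /adddependency:"Disk F:"
cluster resource "TSM SCHEDULE SERVICE FOR GROUP-A" /adddependency:"Disk G:"
cluster resource "TSM SCHEDULE SERVICE FOR GROUP-A" /adddependency:"Disk H:"

rem Replicate the TSM registry key between the physical nodes
cluster resource "TSM SCHEDULE SERVICE FOR GROUP-A" /addcheckpoints:"SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes\CLABC01\TSM-server-name"

rem RestartAction=1 restarts the resource on failure without affecting the group
cluster resource "TSM SCHEDULE SERVICE FOR GROUP-A" /properties RestartAction=1
```

Setting RestartAction to 1 corresponds to unticking the 'Affect the group' box in the GUI.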

Repeat this procedure for the other cluster group.

The new scheduler service is now associated with the cluster group. If the group is moved (failed) to the other nodes in the cluster, the service should correctly fail over between the cluster nodes and notify both cluster nodes of automatic password changes.

To restore data for a cluster group, navigate to \program files\tivoli\tsm\baclient\ and enter the command

dsm -optfile=f:\tsm\dsm.opt

This should load up a TSM GUI that points to the correct filespaces and data, and from there it's just a standard restore.