Selecting and Excluding Drives

You can select and exclude drives from scheduled backups by placing an entry in the DSM.OPT file on the client, with a line which looks like this for a Windows client

DOMAIN c: d:

The problem with this approach is that you need to remember to update the dsm.opt file whenever you add new drives. Alternatively, the line

DOMAIN ALL-LOCAL

will back up everything. A good variant, if you never want to back up your c: drive for example, is

DOMAIN ALL-LOCAL -c:

This means that all drives are backed up except the c: drive, and you do not need to change the dsm.opt file as drives are added or removed. Specific selection criteria for the Windows System Object are given below.
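The effect of the three DOMAIN variants above can be illustrated with a small Python sketch. This is only a toy model of the selection logic, not the TSM client's own parser: the function name resolve_domain and the list of local drives are assumptions made for illustration, and the sketch ignores the system state and any drive named in the statement that does not exist locally.

```python
def resolve_domain(statement, local_drives):
    """Return the local drives selected by a DOMAIN statement (toy model)."""
    tokens = statement.lower().split()
    if tokens and tokens[0] == "domain":
        tokens = tokens[1:]
    selected = set()
    excluded = set()
    for token in tokens:
        if token == "all-local":
            # ALL-LOCAL picks up every drive that currently exists
            selected.update(d.lower() for d in local_drives)
        elif token.startswith("-"):
            # a leading minus removes a drive from the domain
            excluded.add(token[1:])
        else:
            selected.add(token)
    wanted = selected - excluded
    return [d.lower() for d in local_drives if d.lower() in wanted]


drives = ["c:", "d:", "e:"]  # hypothetical client; e: was added later
print(resolve_domain("DOMAIN c: d:", drives))          # e: is missed
print(resolve_domain("DOMAIN ALL-LOCAL", drives))      # everything
print(resolve_domain("DOMAIN ALL-LOCAL -c:", drives))  # everything except c:
```

The first call shows the maintenance problem with a fixed drive list: the new e: drive is not selected until someone edits dsm.opt.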

However, the DOMAIN statement only takes effect when you are backing up at domain level, as a scheduled backup does. If you manually select a volume that is excluded in the dsm.opt file, then it will be backed up. For example,

dsmc incremental c: -subdir=yes

will back up the entire c: drive, even though it is excluded in the domain statement. This is not necessarily a bad thing. It means that you can exclude the c: drive from scheduled backups, but when you really want a backup, you can do it manually. This approach will also allow you to back up selected files or directories from an excluded disk

dsmc selective c:\tivoli\tsm\* -subdir=yes

If you absolutely never want to let anyone back up the c: drive in any circumstances, then use EXCLUDE statements, as these will always apply. To exclude an entire disk you need two statements, one to exclude the files in the root folder and one to exclude all the directories on the drive, like this

exclude c:\*
exclude.dir c:\*

So now if you try to backup with the command

dsmc selective c:\tivoli\tsm\* -subdir=yes

nothing will happen, because those files are always excluded.


Selective backups of Windows directories with embedded spaces

It is possible to select a number of directories from a Windows command line interface by listing them as in the example command below.

dsmc inc c:\dir1\* "d:\dir2\sub dir1\*" d:\dir3\ -subdir=yes

This command will incrementally back up all files in the directories
c:\dir1
d:\dir2\sub dir1\
d:\dir3
and any subdirectories underneath them. Note that 'sub dir1' contains an embedded space, so the second path must be enclosed in quotation marks, and the trailing asterisk is then required. Windows command-line parsing treats a \" combination as an escaped quotation mark rather than the end of the quoted string, but it will parse a \*" combination as expected.
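The quoting rule can be seen with Python's subprocess.list2cmdline, which builds a command line using the same rules as the Microsoft C runtime. This is only an illustration of the parsing rule, not something the TSM client itself uses. A path that ends in a backslash has that backslash doubled before the closing quotation mark, whereas the path ending in \* is quoted as is.

```python
import subprocess

# Path with an embedded space, ending in \*
args_with_asterisk = ["dsmc", "inc", "d:\\dir2\\sub dir1\\*", "-subdir=yes"]
# Same path, ending in a bare backslash
args_with_backslash = ["dsmc", "inc", "d:\\dir2\\sub dir1\\", "-subdir=yes"]

line_ok = subprocess.list2cmdline(args_with_asterisk)
line_escaped = subprocess.list2cmdline(args_with_backslash)

print(line_ok)       # dsmc inc "d:\dir2\sub dir1\*" -subdir=yes
print(line_escaped)  # dsmc inc "d:\dir2\sub dir1\\" -subdir=yes
```

The doubled backslash in the second line is needed precisely because a single \" would be read as an escaped quotation mark.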


Schedmode, Polling or Prompted

SCHEDMODE POLLING means that every now and again the client asks the server if there is a schedule waiting to be started
SCHEDMODE PROMPTED means that the server contacts the client when it is time to start a backup

POLLING seems to work best with Windows clients, and is used with a QUERYSCHEDPERIOD parameter that tells the client how often to contact the server to see if a backup is required. The typical settings shown below tell the client to contact the server every hour.

SCHEDMODE         POLLING
QUERYSCHEDPERIOD    1

SCHEDMODE POLLING must be used if a client is outside a firewall, because in polling mode it is always the client that opens the session to the server.

SCHEDMODE PROMPTED is best used if you want to tell a client which specific LAN address and port it needs to use for a backup, otherwise it will use the address it used for first contact, every time. By default, TSM uses port 1501. If you find you are having problems with schedules being missed with no apparent cause, it is possible that the server is trying to contact the client on the wrong address or port. You can force the server to use a specific IP address and port as shown below.

SCHEDMODE           PROMPTED
TCPCLIENTADDRESS    10.56.21.123
TCPCLIENTPORT       1501

Sometimes you will get backups missing due to port problems with the DSMCAD when you are running with SCHEDMODE=PROMPTED. Typically DSMCAD has to be recycled after each backup or backups will fail. You can check which port DSMCAD is using by recycling it then checking the dsmwebcl.log for an entry like:

(dsmcad) ANS3000I TCP/IP communications available on port XXXXX

DSMCAD should be listening for the server prompt on the port shown. You can check to see if it is listening by running a

netstat -an

command from an operating system command line, and you should see a listener on that port. Next, check that the TSM server can get to that port by opening an operating system command line from there, then running command

telnet client_ip_address port_no

If you get no messages, then the server is connecting OK. If you get errors then one possibility is that you are trying to get through a firewall, and you need that port opened up for both inbound/outbound communication.
Another option is to check that you can run a manual backup from the client. If this works then you could consider changing to schedmode polling.
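If telnet is not installed on the TSM server, the same reachability test can be sketched in Python. This is a minimal sketch that only checks whether a TCP connection can be opened; the function name port_is_open and the timeout value are assumptions for illustration, and a successful connection does not prove that DSMCAD is the process listening.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout or no route
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from the TSM server, port_is_open("10.56.21.123", 1501) would test the example address and port used above. A False result points to the same causes as a telnet error, such as a firewall blocking the port.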


Using the dsmcutil command

On a Windows client you can use the dsmcutil command to install and remove TSM scheduler services, which is much faster than using the TSM GUI Wizard. To define a standard node, use

dsmcutil inst /name:"TSM Scheduler Service - Z095XFSU1" /optfile:"C:\Program Files\Tivoli\TSM\baclient\dsm.opt" /node:z095XFSU1 /password:xxxx /autostart:yes /startnow:yes

The command should go to the opt file and get the sched and error log file names.
To remove a scheduler service, use

dsmcutil remove /name:"Tivoli Integrated Portal - TIPProfile-Port-16310"

You can also work with services using the Windows sc.exe program. For example,

sc query | FIND "TSM" /I

sc query lists the services, and the output is piped into a find command that keeps only the lines containing TSM. The /I switch means ignore case. Once you get the service name from the query command, you can use other sc commands on that service. If the service name contains spaces, enclose it in double quotes, for example

sc start "service-name"
sc stop "service-name"
sc delete "service-name"


Installing TSM backups on a Microsoft cluster

A Microsoft Cluster consists of a number of physical servers that are capable of hosting a series of resources. If any node in the cluster fails, all the resources hosted on that node fail over to another node in the cluster. For ease of management, resources are combined together into groups, and failover acts at the group level. Resources can be things like server names and IP addresses, but the ones that apply to TSM are disks, file shares and TSM schedule resources.

Open up the Windows Cluster Administrator, and on the left hand menu you will see Groups, Resources, Cluster Configuration, then a list of nodes, or physical servers that are hosting the cluster. Let us assume we have a 2 node cluster, CLS001 and CLS002, and 5 SAN attached disks, DSK01-5. We also have 2 groups, group-a and group-b. In this example, group-a contains DSK01-3 and group-b contains DSK04-5. Make a note of the Cluster Name, as you need it to define the schedule services. Ours is CL001.

Defining the Clients

For TSM, you then need 4 clients, 1 each for the physical nodes CLS001 and CLS002, to back up the local drives, and 1 for each of the groups, to back up the shared drives. You also need 4 schedule services, one for each client. Install TSM on both the physical nodes, and make sure it is installed on the same path on each server. The default location is C:\Program Files\tivoli\tsm\baclient. Just install and schedule standard TSM clients on the physical servers, but set the domains so they back up local disks only. Make sure they do not back up the cluster disks, DSK01-5.

For the cluster resources, you define two clients at the TSM server. If you look at each group through the Cluster Administrator, you will see that each group has a Network Name. It seems intuitively obvious (to me) that this is the best name to use for the TSM client name. The clients are just defined with a standard Register Node command, with an extra CLUSTERNODE parameter. Our clients are called CLABC01 and CLABC02.

Allocate a directory on one of the disks in each cluster group for your TSM configuration. It is best to use a standard name for this over all your clusters, as you will be typing it a lot, something like tsm, tsmconfig or tsmfiles. Suppose DSK01-3 are defined to the servers as f: g: h: and DSK04-5 are defined as i: j:. You then allocate a directory \tsm on the f: and the i: drives and put a dsm.opt file in each one. You will have your own standards for dsm.opt files, but the cluster-specific settings are as follows.

In the dsm.opt file for group-a, the CLUSTERNODE option is set to YES, and the DOMAIN option picks out the 3 drives in that cluster group. The TSM schedule and error logs are allocated on the cluster disk, so they move between physical nodes too. Define another dsm.opt file for the other resource group, with the appropriate nodename, log file locations and domains, and place that file on the i: drive.
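As a rough illustration of the settings just described, the Python sketch below renders a dsm.opt for one cluster group. It is a sketch under stated assumptions, not a copy of any site standard: the server address tsmserver.example.com, the log file names dsmsched.log and dsmerror.log, and the PASSWORDACCESS GENERATE line are assumptions, and your own option files will carry more settings than these.

```python
def cluster_dsm_opt(node, drives, config_dir, server_address):
    """Render the cluster-specific dsm.opt lines for one cluster group."""
    lines = [
        f"NODENAME          {node}",
        f"TCPSERVERADDRESS  {server_address}",      # assumed server address
        "PASSWORDACCESS    GENERATE",               # assumed, usual for schedulers
        "CLUSTERNODE       YES",
        f"DOMAIN            {' '.join(drives)}",    # only this group's drives
        f"SCHEDLOGNAME      {config_dir}\\dsmsched.log",  # logs on the cluster disk
        f"ERRORLOGNAME      {config_dir}\\dsmerror.log",
    ]
    return "\n".join(lines)


# group-a owns f: g: h: and keeps its TSM files in f:\tsm
print(cluster_dsm_opt("CLABC01", ["f:", "g:", "h:"], "f:\\tsm",
                      "tsmserver.example.com"))
```

Calling the function again with CLABC02, the i: and j: drives and i:\tsm gives the matching file for group-b.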

Defining the Schedule Services

Next, you need to install schedule services for the cluster nodes, and it is best to do this with a dsmcutil command. You must do this on both the physical nodes, for each cluster node, so here you would install 4 clustered schedule services. Start with the server that is hosting the cluster disks, and navigate to the directory where you installed your client code, usually C:\Program Files\tivoli\tsm\baclient then run these commands.

dsmcutil install SCHED /name:"TSM Scheduler Service - CLABC01" /clientdir:"c:\Program Files\tivoli\tsm\baclient" /optfile:f:\tsm\dsm.opt /node:CLABC01 /password:nodepassword /validate:yes /autostart:no /startnow:no /clusternode:yes /clustername:CL001

dsmcutil install SCHED /name:"TSM Scheduler Service - CLABC02" /clientdir:"c:\Program Files\tivoli\tsm\baclient" /optfile:i:\tsm\dsm.opt /node:CLABC02 /password:nodepassword /validate:yes /autostart:no /startnow:no /clusternode:yes /clustername:CL001

Next, fail the groups over to the other physical server using Cluster Administrator, then run the same 2 commands again. If the cluster groups are not both hosted on the same server, then fail over or adjust the way you run these commands as appropriate.
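Because the same commands have to be typed on every physical node, it can help to generate them rather than retype them. The Python sketch below builds the two dsmcutil command lines shown above from a small table of cluster nodes. The function name and the dictionary layout are assumptions for illustration; the dsmcutil switches are exactly those used in the commands above.

```python
def scheduler_install_command(node, optfile, cluster_name,
                              password="nodepassword",
                              client_dir=r"c:\Program Files\tivoli\tsm\baclient"):
    """Build the dsmcutil command that installs a clustered scheduler service."""
    return (
        f'dsmcutil install SCHED /name:"TSM Scheduler Service - {node}" '
        f'/clientdir:"{client_dir}" /optfile:{optfile} /node:{node} '
        f'/password:{password} /validate:yes /autostart:no /startnow:no '
        f'/clusternode:yes /clustername:{cluster_name}'
    )


# TSM node name -> dsm.opt location on that group's cluster disk
cluster_nodes = {"CLABC01": r"f:\tsm\dsm.opt", "CLABC02": r"i:\tsm\dsm.opt"}

for node, optfile in cluster_nodes.items():
    print(scheduler_install_command(node, optfile, "CL001"))
```

The printed lines can be pasted into a command prompt on each physical node once it is hosting the cluster groups.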

Defining the Cluster Services

Now you need to add a Windows cluster service resource to manage the TSM schedule resources. Again, start with the physical node that is hosting both resource groups, and open up the Cluster Administrator.

Right click on group-a then select New -> Resource
On the first panel enter a name for this resource, which must be unique and should start with TSM so it's obvious what it is about, something like 'TSM SCHEDULE SERVICE FOR GROUP-A', and optionally add a description. The 'Resource Type' must be 'Generic Service' and the final 'Group' field should already be pre-filled with 'group-a'.
The next screen lists the physical owners, or servers that can host this group, in our case CLS001 and CLS002. Make sure all the physical servers are allocated.
The next screen is for Dependencies that must be available before the TSM service can start. For CLABC01, this will be the f: g: and h: drives.
Next you define the local service that you will start when this cluster service starts, which is the TSM scheduler. The name you specify here must exactly match the name that you used when defining the service, which was 'TSM Scheduler Service - CLABC01'.
Now you need to define the Registry key, which is SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes\CLABC01\TSM-server-name
Select OK, and the Cluster Resource will be created, but before you start it, right click on it, go into properties, navigate to the 'parameters' section and untick the 'Affect the group' box. If you leave this ticked and there is a problem with the TSM scheduler, then that could take the managed disks offline and affect customer service.

Repeat this procedure for the other cluster group.

The new scheduler service is now associated with the cluster group. If the group is moved (failed) to the other nodes in the cluster, the service should correctly fail over between the cluster nodes and notify both cluster nodes of automatic password changes.

Restoring Windows Cluster files

The only different action required for a cluster restore is to determine which of the TSM clients you need to start up. A cluster can consist of ten or more real servers and tens of disk resources, so this might not be a trivial task. Log into the Windows server using the virtual cluster server name, and that will take you to the physical Windows server that is hosting this client.

Open a Windows Explorer session, and take a look at which disks are online. Hopefully, your naming standards will let you identify which disk you need to restore from quite easily. Then, if you have several disks grouped into a single cluster resource, you need to find which disk is hosting the TSM control files. Assuming you are following the same standards as in the cluster backup example above, and you want to recover a file from the g: drive, you will see that it is in the group-a cluster group, and the TSM control files for that group are in f:\tsm\

Now navigate to \program files\tivoli\tsm\baclient\ and enter the command

dsm -optfile=f:\tsm\dsm.opt

This should load up a TSM GUI that points to the correct filespaces and data, and now it's just a standard restore.


BACKING UP SYSTEM DATA

Backing up the Windows SystemState

The Windows System State contains all the data needed to recover the operating system from scratch. According to Microsoft, "System state is a collection of several key operating system elements and their files. These elements should always be treated as a unit by backup and restore operations." See the Windows section for more details about the System State. You back up the Windows System State either by using the command

backup systemstate

or by specifying a SYSTEMSTATE domain in the dsm.opt file. The ALL-LOCAL domain includes the system state. Exactly what you backup will depend on the release of Windows, and what Windows components are installed.

The systemstate is a special object type and requires special scheduling. If you are running a full incremental backup of a server, then the system state will be included. However if you want to be selective, then you must schedule a backup with an ACTION type of BACKUP, with SYSTEMSTATE in the OBJECTS field. In a selective backup, the systemstate must be backed up on its own with no other objects in the schedule.

A system state backup uses Volume Shadow Copy Service (VSS), where each operating system 'element' is represented by a Microsoft VSS writer of type 'VSS_UT_BOOTABLESYSTEMSTATE'. Exactly which system state writers will be used depends on the Windows operating system. The 'System Writer' will process most of the files needed for the system state, but other writers may include the 'Registry Writer', the 'WMI Writer', the 'Task Scheduler Writer', the 'COM+ REGDB Writer' and the 'ASR Writer'.
The backup process works like this:

  • TSM, acting as a VSS requester, queries VSS for the list of bootable system state writers
  • VSS requester queries each VSS writer for its metadata which includes the files that need to be backed up for that writer
  • The necessary snapshot(s) are created by the VSS provider
  • The data is backed up from the snapshots
  • The snapshots are released
  • The backup is complete

The IBM recommendation is that you use Open File Support for drive backups, and investigate and fix all 'cc=4' open file errors. Do not exclude files unless you are certain they are not needed for restore. Specifically, do not exclude ntuser.dat or usrclass.dat files.

Backing up the system state became much more of a challenge with Windows 2008 onwards, as the number of objects requiring backup was considerably higher: around 8,000 with Windows 2003, but maybe 80,000 with Windows 2008. This massive increase affected backup processing and TSM server housekeeping.

The first thing you will notice is that backups run for considerably longer, and will appear to hang for several hours. This is partly because an incremental systemstate backup needs to do a lot of work comparing client data with server data to decide what to backup. The other reason is that systemstate backups are 'grouped' and once the backup is complete the server will regroup the systemstate objects which can take a long time. While TSM is doing this, it holds the client session open, and will not mark the backup as complete until the regrouping is finished.
TSM server expiration will also take a long time, especially if you are retaining a lot of systemstate backup versions.

The first question to ask your server support people is, 'would they actually use a TSM backup to recreate a Windows system, or do they recreate from a standard build?'
If TSM systemstate restores are not required, there is no point in running backups.
If backups are required, then consider that we tend not to do system maintenance on servers every day, so systemstates are usually quite static. We also do not want to backlevel a server by several weeks. If we need to restore, then we usually want the last backup. Based on these facts, it seems reasonable to back up the systemstate just 2-3 times per week, and keep the retention period low. 2 weeks would be more than adequate. This low retention period would limit the impact of systemstate backups on the TSM server.

If you have a large number of Windows clients, then IBM has suggested the following strategy:

  • Split the Windows clients into 3 domains, assume Domain1, Domain2, Domain3.
  • Backup each domain twice per week, on separate nights
    • Domain1, Monday/Thursday
    • Domain2, Tuesday/Friday
    • Domain3, Wednesday/Saturday
  • Retain 2 weeks worth of backups, that is, 6 versions.
  • Run expiration by domain using the domain=xxx parameter
    • Expire domain3 systemstate on Monday/Thursday
    • Expire domain1 systemstate on Tuesday/Friday
    • Expire domain2 systemstate on Wednesday/Saturday

Running expiration like this means that it will not cause lock contention with the backups, because a domain is never expired on a night when it is being backed up.
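The rotation above can be written down as two small tables and checked mechanically. The Python sketch below is only a planning aid with made-up function names; it does not talk to a TSM server. It builds the nightly plan from the backup and expiration nights listed above and confirms that no domain is backed up and expired on the same night.

```python
BACKUP_NIGHTS = {
    "Domain1": ("Monday", "Thursday"),
    "Domain2": ("Tuesday", "Friday"),
    "Domain3": ("Wednesday", "Saturday"),
}
EXPIRE_NIGHTS = {
    "Domain3": ("Monday", "Thursday"),
    "Domain1": ("Tuesday", "Friday"),
    "Domain2": ("Wednesday", "Saturday"),
}


def nightly_plan():
    """Return {night: {"backup": domain, "expire": domain}}."""
    plan = {}
    for domain, nights in BACKUP_NIGHTS.items():
        for night in nights:
            plan.setdefault(night, {})["backup"] = domain
    for domain, nights in EXPIRE_NIGHTS.items():
        for night in nights:
            plan.setdefault(night, {})["expire"] = domain
    return plan


def has_contention(plan):
    """True if any night backs up and expires the same domain."""
    return any(entry["backup"] == entry["expire"] for entry in plan.values())


plan = nightly_plan()
for night, entry in plan.items():
    print(night, "backup", entry["backup"], "expire", entry["expire"])
print("contention:", has_contention(plan))
```

If you change the rotation, for example to add a fourth domain, rerunning the check shows whether the new tables still keep backup and expiration apart.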

To restrict the number of backups held, you bind the systemstate files to a management class that keeps relatively few versions. You achieve this with the following include statement in the dsm.opt file, or in an include/exclude file if you keep these separate

INCLUDE.SYSTEMSTATE ALL yourmgmtclassname


TSM 6.2.3 introduced the ability to take incremental systemstate backups. Incremental is the default option, but if you need to take full backups, this can be controlled using the SYSTEMSTATEBACKUPMETHOD option in the client options file (dsm.opt). The options are FULL, OPPORTUNISTIC and PROGRESSIVE.

As you would expect, FULL means backup all the files belonging to the system state.
OPPORTUNISTIC means that if one or more files have changed since the last backup, the entire system writer is backed up. If no files have changed, then the smaller writers like the registry are still backed up, but the huge system writer is not backed up again.
PROGRESSIVE is standard TSM incremental processing. That is, only those system writer files that have changed since the last system state backup will be backed up. This is the default.
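The difference between the three methods is easiest to see for the system writer, which holds most of the files. The Python sketch below models only what is described above; it is not how the TSM client is implemented, and the function name and file names are made up for illustration.

```python
def system_writer_files_to_send(method, all_files, changed_files):
    """Which system writer files a backup sends, per the descriptions above."""
    method = method.upper()
    if method == "FULL":
        # always send every file
        return sorted(all_files)
    if method == "OPPORTUNISTIC":
        # all or nothing: any change triggers the whole system writer
        return sorted(all_files) if changed_files else []
    if method == "PROGRESSIVE":
        # standard incremental: only the files that changed
        return sorted(set(changed_files) & set(all_files))
    raise ValueError(f"unknown SYSTEMSTATEBACKUPMETHOD value: {method}")


files = ["a.dll", "b.dll", "c.dll"]  # hypothetical system writer contents
print(system_writer_files_to_send("FULL", files, ["b.dll"]))
print(system_writer_files_to_send("OPPORTUNISTIC", files, ["b.dll"]))
print(system_writer_files_to_send("OPPORTUNISTIC", files, []))
print(system_writer_files_to_send("PROGRESSIVE", files, ["b.dll"]))
```

With one changed file out of three, FULL and OPPORTUNISTIC both send all three files, while PROGRESSIVE sends only the changed one.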


For systemstate backups to work, the Windows VSS writers must be working successfully. When they are not working, you typically see error messages like 'ANS1950E Backup using Microsoft volume shadow copy failed'. The error message text usually includes 'vss'.

To resolve these errors, first check that the Windows VSS service is in 'Manual' mode and can be started. Its normal state is 'Stopped', as TSM must be able to start it up with the correct set of parameters. Use the Windows command 'vssadmin list writers' to check the status of the writers.
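On a server with many writers, the vssadmin output is easier to scan with a small script. The Python sketch below parses text in the layout that 'vssadmin list writers' prints on an English-language system and lists the writers that are not stable or that report an error. The sample text is a trimmed, illustrative example rather than output captured from a real server, and the function names are assumptions.

```python
import re

# Trimmed, illustrative sample in the English 'vssadmin list writers' layout
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'Registry Writer'
   State: [7] Failed
   Last error: Timed out
"""


def writer_states(vssadmin_output):
    """Map each writer name to its state and last error."""
    writers = {}
    current = None
    for raw_line in vssadmin_output.splitlines():
        line = raw_line.strip()
        name = re.match(r"Writer name: '(.+)'", line)
        if name:
            current = name.group(1)
            writers[current] = {"state": "", "last_error": ""}
        elif current and line.startswith("State:"):
            writers[current]["state"] = line.split(":", 1)[1].strip()
        elif current and line.startswith("Last error:"):
            writers[current]["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Names of writers that are not Stable or that report an error."""
    return [name for name, info in writers.items()
            if "Stable" not in info["state"] or info["last_error"] != "No error"]


print(unhealthy_writers(writer_states(SAMPLE_OUTPUT)))  # ['Registry Writer']
```

To use it on a real server, save the output of vssadmin list writers to a file and pass the file contents to writer_states.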

Second, check that the userid that you use to run your backups has the correct permissions to be able to access the writers.

If these both look OK, then check with Microsoft Support for the latest hotfixes for VSS.

There is also a Microsoft utility, VSHADOW.EXE, that can be used to test and report on VSS writers. The Microsoft MSDN page 'VShadow Tool and Sample' describes the utility and tells you where the downloads are.

Another option is to test the VSS writers using a Microsoft tool called DiskShadow.exe. This operates outside of TSM and so is useful to check to see if a problem lies with TSM or with VSS.
Open up a Windows command prompt and start up diskshadow with the command

diskshadow /l c:\diskshadow.log

This will give you a diskshadow prompt, so to get the status of the writers run the commands

reset
list writers
list writers status
list writers metadata
list writers detailed
list providers
exit

Check the file c:\diskshadow.log, as that will hold the results of the commands, and will hopefully tell you if there are any errors.

You can also create a snapshot of the SystemState using diskshadow, which is independent of any snapshot created by TSM. Start diskshadow with a log file as before, for example c:\diskshadowsys.log, and run the following commands. If the system is on more than one disk, add the volumes one by one.

reset
set verbose on
set option differential
set context volatile
add volume c:
add volume d:
create
exit

Again, check c:\diskshadowsys.log and see if any errors occurred during the create phase. If you see any, then your issues are with VSS, not TSM.

Changing the Management class for System Objects

If you want to change the management class on System Objects, you need to add a line to dsm.opt. ALL system objects must be bound to the same management class.

include.systemobject ALL new-class-name

If the system object won't rebind to the new management class, try deleting the filespace, and it should rebind on the next backup. The command to do this from the server is

del fi nodename "SYSTEM OBJECT" nametype=unicode

Why would you want to do this? Well the system state is large, so if your default management class holds 40 backup versions say, then you will use lots of TSM backup space for the system state for every server. You can assign a management class to the system state that keeps fewer versions for less time, to save space.


How to restore the Windows System State

See the Windows section for details of what Microsoft call the System State

Using the command line, the command to restore the entire system state is simply

restore systemstate

It will restore ALL of the System state components that were backed up, but exactly what was backed up will depend on your Windows version and setup. It is possible to restore some of the system state components, but if you want to do that, it may be easier to use the TSM GUI.
To do this, open the GUI and click on the 'Restore' option. This will open up a restore window. Expand the directory tree and locate the system state node. Click on the '+' sign to display the available system state components. You can then select the individual system state components that you wish to restore.
