TSM Restore Hints


BASIC RESTORES

Date options on restores

Date options on restores can be a bit confusing. The first thing to remember, especially if you move between companies, is that TSM supports several date formats. The format is set with the DATEFORMAT option in the TSM options file (dsm.opt on the client). The default format is MM/DD/YYYY; other possibilities are DD-MM-YYYY, YYYY-MM-DD, DD.MM.YYYY and YYYY.MM.DD.

There are three different date options available for restores:

TODATE will restore all ACTIVE and INACTIVE files backed up on or BEFORE the date specified.
FROMDATE will restore all ACTIVE and INACTIVE files backed up on or AFTER the date specified.
PITDATE will restore only the files that were ACTIVE at the date specified.

To use these effectively, you have to consider when a backup was taken. For example, suppose you want to recover d:/mydir as it was on July 4th, 2011. The backups run overnight and usually complete by 06:00. The command to use would be:

res d:/mydir/* -pitdate=07/05/2011 -pittime=07:00:00 -subdir=yes

You specify 07:00 on July 5th to catch the backups of files changed on July 4th, and that will get your directory back to the state it was in on the evening of July 4th.
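
The date arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name pit_options is hypothetical, and it assumes nightly backups that finish before a known time the next morning and the default MM/DD/YYYY date format.

```python
from datetime import date, datetime, time, timedelta


def pit_options(target_day: date, backup_done: time = time(7, 0, 0)) -> str:
    """Build -pitdate/-pittime options for 'as it was at the end of target_day'.

    Assumption: the backup that captures target_day's changes runs overnight
    and has finished by backup_done on the following morning.
    Assumption: the client uses the MM/DD/YYYY date format.
    """
    cutoff = datetime.combine(target_day + timedelta(days=1), backup_done)
    return f"-pitdate={cutoff:%m/%d/%Y} -pittime={cutoff:%H:%M:%S}"


# The example from the text: directory as it was on July 4th, 2011.
print(pit_options(date(2011, 7, 4)))
```

If your site uses a different date format, the strftime pattern in the function would need to change to match it.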

PITDATE is slower than -FROMDATE and -TODATE, though it restores less data. That is because the -PITDATE option uses the 'no query' restore protocol, whereas -FROMDATE and -TODATE use a classic restore. Your copygroup retention parameters determine how far back you can go with PITDATE. If you want to be able to take a server or a directory back in time by up to 60 days, then you need to code:

VEREXISTS=NOLIMIT
VERDELETED=NOLIMIT
RETEXTRA=60
RETONLY=60


How to recover a file with spaces in the name

Enclose the file path in quotes. For example:

restore "\\server name\c$\PROGRAM FILES\COMMON FILES\file name.ext"

This applies in any circumstance where you have to specify files by name.


What is the difference between a no query restore and a standard restore?

TSM has two types of client restore: a standard (or 'classic') restore and a 'no query' restore. TSM automatically decides which type to use. A classic restore is used if the restore parameters specify any of 'latest', 'inactive', 'pick', 'fromdate' or 'todate'; a no query restore is used for an unrestricted wildcard source file specification.
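
As a rough illustration of that rule, the Python sketch below classifies a list of restore options. It models only the five options named above; the real client considers other conditions too, and the function name restore_method is hypothetical.

```python
# Options that, according to the text above, force a classic restore.
CLASSIC_TRIGGERS = {"latest", "inactive", "pick", "fromdate", "todate"}


def restore_method(options):
    """Return 'classic' or 'no query' for a list of dsmc-style option strings.

    Assumption: options are spelled out in full (for example '-inactive',
    not an abbreviation), since abbreviations are not expanded here.
    """
    names = {opt.lstrip("-").split("=", 1)[0].lower() for opt in options}
    return "classic" if names & CLASSIC_TRIGGERS else "no query"


print(restore_method(["-subdir=yes", "-pitdate=07/05/2011"]))
print(restore_method(["-subdir=yes", "-fromdate=07/01/2011"]))
```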

In a standard restore, the client queries the server for all objects that match the restore file specification. The server sends this information to the client, and the client sorts it so that tape mounts are optimized. However, getting the information from the server and then sorting it, before any data is actually restored, can take a long time. Once the restore list is built, TSM requests the files from the server in groups, where the group size is based on the TXNBYTELIMIT setting on the client and the TXNGROUPMAX setting on the server. This used to work fine, but it can be a real problem if the file system contains millions of files, because the list uses up client memory and can cause system paging and performance issues.
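
To picture how two limits bound a group, here is a simplified Python sketch that splits a list of file sizes into groups capped by a byte limit and a file-count limit. It is not TSM's actual algorithm: the function name group_files is hypothetical, and the units are plain bytes purely for illustration, whereas the real TXNBYTELIMIT is specified in kilobytes.

```python
def group_files(sizes, byte_limit, max_files):
    """Split file sizes into groups that respect both limits.

    byte_limit stands in for TXNBYTELIMIT and max_files for TXNGROUPMAX.
    A single file larger than byte_limit still gets a group of its own.
    """
    groups, current, current_bytes = [], [], 0
    for size in sizes:
        too_many = len(current) >= max_files
        too_big = current_bytes + size > byte_limit
        if current and (too_many or too_big):
            groups.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        groups.append(current)
    return groups


print(group_files([40, 40, 40, 10], byte_limit=100, max_files=3))
```

The point of the sketch is that the whole list has to exist in client memory before grouping can start, which is where the memory pressure described above comes from.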

A 'no query' restore lets the TSM server do the work: the client sends the restore file specification to the server, and the server starts restoring files as soon as it has found enough for a group, while at the same time building the list of eligible files for the next group. The most visible benefit of a no query restore is that data starts coming back from the server sooner than it does with a classic restore.

However, some Tivoli Storage Manager customers with large client file systems have reported performance problems when the no query restore (NQR) technique is used and only a few files qualify as restore candidates. In one case, it took the Tivoli Storage Manager server one hour to search through its database tables before eligible files were sent to the client. It is possible to force such a restore to use the classic method by specifying the "-latest" option as part of the restore command, or by selecting the 'Disable No Query Restore method' check box in the client GUI.


How do you exclude files from a restore?

It is possible to do this using an undocumented command. All the usual caveats for undocumented commands apply: test it before you use it for real, and if it goes wrong, don't expect to get any support.

The command is:

EXCLUDE.RESTORE file.name


Finding the tapes needed to restore a node

This is a common question and I've never seen a good answer. The following SQL query will show you all the backup tapes used by your node, but your restore will only need a subset of them. If anyone has a better answer I'd like to hear it.

select distinct node_name,volume_name,stgpool_name from volumeusage where node_name='xxxxx'

You need to change the xxxxx to your node name. The comparison is case sensitive, and TSM normally stores node names in upper case.
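
If you generate this query from a script, a small helper can take care of the case issue. The Python sketch below is an illustration only: the function name volume_query is hypothetical, it assumes node names are stored in upper case, and the character check is a deliberately conservative guard rather than an official rule.

```python
import re


def volume_query(node_name: str) -> str:
    """Return the VOLUMEUSAGE query for one node, with the name upper-cased."""
    node = node_name.strip().upper()
    # Conservative guard (assumption): reject anything that could break the quoting.
    if not re.fullmatch(r"[A-Z0-9_.\-+&]+", node):
        raise ValueError(f"unexpected characters in node name: {node_name!r}")
    return (
        "select distinct node_name,volume_name,stgpool_name "
        f"from volumeusage where node_name='{node}'"
    )


print(volume_query("win_server001"))
```

The resulting string can be pasted into an administrative command line session.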


ADVANCED OPTIONS

Using the GUI for PIT restores - the files are missing!

There are two ways to run point-in-time (PIT) restores: from the command line and from the GUI. However, if you use the GUI, you might find that it does not display any files for a given date, even though you know that backups exist.

The problem is down to the way the GUI works. It does not display a list of all available backups and versions for a client, as this would take too long to build and probably cause memory shortages on the client. You are initially presented with a list of available file spaces, and you have to drill down through the file spaces and directories to get a list of backups available in a given directory.

However, if there is no retained backup of a directory, then you cannot click on it to see the backups within it. Directory retention does not depend on the number of file backup versions kept; instead, directories are bound to the management class within the policy domain that has the highest 'Retain only version' (RETONLY) setting. It is possible that the management class (MC1) with the longest RETONLY setting will only keep one backup version, while you may have another class (MC2) that keeps twenty versions, but with a smaller RETONLY setting.

So consider the following scenario. You back up your directories to MC1 by default, but you bind all files in the /audit/ directory to the MC2 management class.

Apr 01, run your first backup. The directories all get the MC1 management class and all the audit files get the MC2 management class.

Apr 02-06, further backups of audit files, up to 6 backup versions retained, depending on how often the files change.

Apr 07, change the AUDIT directory and it gets backed up again. The first version of the directory expires, because MC1 keeps only one version.

Apr 08-12, more backups of audit files, up to 12 backup versions now retained.

Apr 13, decide to recover the AUDIT directory back to Apr 04 and run a PIT restore through the GUI. You can't see anything, because the directory backup from Apr 01 has expired, even though you know that backups of the files exist because MC2 keeps up to twenty versions.
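
The scenario can be modelled with a few lines of Python. This is a deliberately simplified sketch: it only counts versions and ignores the RETEXTRA and RETONLY day limits, the function name directory_visible is hypothetical, and the year 2011 is an arbitrary choice, since the scenario does not name one.

```python
from datetime import date


def directory_visible(dir_backups, versions_kept, pit_date):
    """True if a retained directory version was backed up on or before pit_date.

    dir_backups: dates on which the directory itself was backed up.
    versions_kept: how many directory versions the management class keeps.
    """
    if versions_kept < 1:
        raise ValueError("versions_kept must be at least 1")
    retained = sorted(dir_backups)[-versions_kept:]  # newest versions survive
    return any(backed_up <= pit_date for backed_up in retained)


audit_dir_backups = [date(2011, 4, 1), date(2011, 4, 7)]
print(directory_visible(audit_dir_backups, 1, date(2011, 4, 4)))  # MC1 keeps one version
print(directory_visible(audit_dir_backups, 2, date(2011, 4, 4)))  # a class keeping two
```

With one version kept, only the Apr 07 directory backup survives, so nothing is available for an Apr 04 point in time; keeping two versions makes the Apr 01 backup available again.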

The short-term solution is to use the command line, as this does not rely on the ability to display a directory before it can display its files and subdirectories.

If you want to use the GUI, then the long-term strategy is to make sure you keep enough backup versions of your directories by assigning them to a suitable management class. Set one up with VEREXISTS=NOLIMIT, VERDELETED=NOLIMIT, and RETEXTRA='n', where 'n' is the number of days that you must be able to go back with the GUI to do a PIT restore. You could set RETEXTRA to NOLIMIT of course, but that could be expensive in terms of database usage.


How do you recover files between 2 different clients?

The first thing to note is that the clients must both be the same platform. You cannot recover a file backed up from a UNIX box to a Windows box, for example. Be aware that if you try to access UNIX backup data from a Windows client using the virtualnode option below, you may prevent all access to the UNIX backup data.

By far the easiest way is to invoke your command line or GUI with the virtualnodename option. This does not disturb any of the existing client configuration and will not affect backups. Navigate to your TSM directory on the client you want to restore to, usually \program files\tivoli\tsm\baclient\ (Windows) or /usr/tivoli/tsm/client/ba/bin64/ (AIX), and run one of the following commands.

dsmc -virtualnodename=sourceclient
dsm -virtualnodename=sourceclient

The first command opens the TSM command line; the second opens the TSM GUI. The sourceclient is the name of the client that you want to restore files from, and TSM will always prompt you for its password. You will then be able to query all the backups on the source client, but when you run a restore, you must give TSM a target location on your current server, or it will restore to the original client.


Using Active-data pools to speed up restores

The most recent backup of any file is called the 'active' backup, and all older versions are 'inactive' backups. Many files are never changed after they are created and so are just backed up once by TSM. These older backups are stored on the original tapes, and as time goes by the active backups for a server or file system are mixed up with lots of inactive backups and are spread over loads of tapes. If you use tapes, this becomes a problem when you want to restore a file server or a large directory: TSM has to mount lots of tapes and scan through them to select the active files, which slows the restore right down.

Active-data pools are designed to fix this issue. They are storage pools that contain only active versions of client backup data. Newly created active backups are stored in active-data pools, and as older versions are deactivated they are removed during reclamation processing. Active-data pools can be disk-based FILE type pools or a dedicated tape pool. FILE type pools offer the fastest restore times, partly because client sessions can access the volumes concurrently. Tape active-data pools are still beneficial, because the restore does not have to continually position the tape between inactive files.

Active-data pools should only be used for nodes that need to be recovered quickly in a disaster.

There are a couple of restrictions:

  • Restoring a primary storage pool from an active-data pool might cause some or all inactive files to be deleted from the database if the server determines that an inactive file needs to be replaced but cannot find it in the active-data pool. As a best practice, and to protect your inactive data, you should therefore create a minimum of two storage pools: one active-data pool, which contains only active data, and one copy storage pool, which contains both active and inactive data. You can use the active-data pool volumes to restore critical client node data, and afterward you can restore the primary storage pools from the copy storage pool volumes.
  • The server will not attempt to retrieve client files from an active-data pool during a point-in-time restore. Point-in-time restores require both active and inactive file versions, and for efficiency TSM retrieves both active and inactive versions from the same storage pool rather than switching between storage pools.

There are two ways to start using an active-data pool: manually, using the COPY ACTIVEDATA command, or automatically, using the simultaneous-write function on the primary storage pool definition. In either case, TSM will only use the active-data pool if the data belongs to a node that is a member of a policy domain that specifies the active-data pool as the destination for active data.

Before you can run with either method, you need to define the active-data pool with a command something like this:

DEFINE STGPOOL ADPPOOL fileclass POOLTYPE=ACTIVEDATA MAXSCRATCH=1000

The domain must also specify an active-data pool, like this:

UPD DOMAIN domainname ACTIVEDESTINATION=ADPPOOL

Then, assuming this domain normally writes to a pool called BACKUPPOOL, add the active-data pool to it:

UPDATE STGPOOL BACKUPPOOL ACTIVEDATAPOOLS=ADPPOOL

Now you would want to get any existing active data copied into this pool by using the command:

COPY ACTIVEDATA BACKUPPOOL ADPPOOL

Then, under normal processing, active data will be copied into this pool when backups run, and files that go inactive will be removed. You might also want to schedule a weekly copy command, just to make sure all the active data continues to exist in the active-data pool as time goes by.


CLUSTER RESTORES

Restoring Windows Cluster files

The only different action required for a cluster restore is to determine which of the TSM clients you need to start up. A cluster can consist of ten or more real servers and tens of disk resources, so this might not be a trivial task. Log into the Windows server using the virtual cluster server name, and that will take you to the physical Windows server that is hosting this client.

Open a Windows Explorer session, and take a look at which disks are online. Hopefully, your naming standards will let you identify which disk you need to restore from quite easily. Then, if you have several disks grouped into a single cluster resource, you need to find which disk is hosting the TSM control files. Assuming you are following the same standards as illustrated in the TSM Windows Cluster Backup page, and you want to recover a file from the G: drive, you will see that it is on the group-a cluster, and the TSM control files for that group are in f:\tsm\


SYSTEM RESTORES

How to restore the Windows System State

See the Windows section for details of what Microsoft calls the System State, and TSM calls the System Object.

The command is simply

restore systemobject

It will restore ALL of the System objects which were backed up, but exactly what was backed up will depend on your Windows version and setup. See the TSM backup tips section for details.

The restore systemobject command is only valid for Windows 2000, XP, and Windows.NET.

If you just want to restore the Windows NT registry system object, use the command

restore registry entire

This command has a number of parameters which allow you to restore just part of the registry

To restore the active Windows NT eventlog, use the command

restore eventlog entire

If you are restoring a complete Windows system from a backupset on tape, then you need to restore one filespace at a time, as it is not possible to control the order in which the filespaces will be restored.

The backupset will typically contain both the system state and the system drive, usually the C: drive.
To restore the system drive, assuming it is the C: drive, use one of the following commands:

dsmc restore backupset c:\* -backupsetname=\\.\tape0 -loc=tape -su=yes -repl=all
   or
dsmc restore backupset %systemdrive%\* -backupsetname=\\.\tape0 -loc=tape -su=yes -repl=all

Substitute the tape name tape0 as appropriate.
For the system state, use the following command

dsmc restore backupset systemstate -backupsetname=\\.\tape0 -loc=tape

You cannot restore the Windows system state from a VMware snapshot. The reason is that the system state consists of more than just the files that are backed up by a snapshot; it also includes the registry.


Restoring a Windows 2003 server using TSM ASR

Recovery pre-requisites

The main physical requirement for bare-metal recovery (BMR) is that the new server must have the same number of disks as the original, and each disk must be the same size as, or larger than, the original disks.

To run the restore, you need:

  • An ASR diskette, as described in the Windows section
  • A Windows CD that is at the same operating system level, and with the same service packs, as the operating system was when you took the ASR backup
  • A TSM client CD that is at the same level as, or higher than, the one that was used for the backup. It must be 5.2.0 or above.
  • A network connection that supports DHCP (Dynamic Host Configuration Protocol, an Internet protocol for automating the configuration of computers that use TCP/IP)
  • The TSM node name and password for the original client

The recovery process

  1. Boot the server using the Windows install CD. You will need to reply to the prompt 'Press any key to boot from CD .. ' quite quickly.
  2. Once setup commences, you will see a message at the bottom of the screen that says 'Press F2 to run Automated System Recovery (ASR)'. Again, you need to press F2 quickly or the setup will continue.
  3. Insert the ASR diskette that you created with TSM into the floppy drive when you see the prompt 'Please insert the disk labeled Windows Automated System Recovery Disk'. TSM will actually have labeled this disk TSMASR.
  4. Windows will then check and reformat the disk partitions and the boot volume as necessary to match the original server setup, then copy some files across.
  5. When prompted by the message 'Insert the CD labeled TSMCLI into your CD-ROM drive', do just that. This will copy TSMCLI.EXE over.
  6. You will then be prompted to mount the TSMASR diskette again, so Windows can copy over three TSMASR files.
  7. You will be prompted to remove the TSMASR diskette, and then the server will reboot.
  8. You then need to put the Windows install CD back into the CD drive to allow the install to continue.
  9. ASR will install the TSM client, then ask you if you want to recover from a remote TSM server or a local backupset, and ask for the TSM userid and password.
  10. TSM will then recover the system disks from the TSM backup, the system will be rebooted, and the server will be back to its former state, except that you will have to run normal TSM restores for all the data and application drives.

Of course, all this depends on you having an up to date TSM ASR diskette. What if you didn't create one? If you are stuck, you could create an ASR diskette on another (working) machine following the instructions in the backup section. You will need to edit the tsmasr.opt and tsmasr.cmd files on the diskette to change the nodename from the working server to the one you want to recover.


TROUBLESHOOTING TSM RESTORES

ANS1314E File data currently unavailable on server

This message sometimes crops up with Oracle restores. Basically, it means that TSM was unable to access the data, not necessarily that the backup data does not exist. First, check whether you had problems with tape drives. The specific error messages can be many pages above or below the ANS1314E messages, so the best way to check is to search the activity log for the error message "ANR8779E unable to open drive". If you have a faulty tape drive, you will see a number of errors for that drive. Vary the drive offline and retry your restore.

If your drives are OK, then the next suspect is a faulty tape, but you need to know exactly which tape you were trying to use. The ANS1314E message should list the file that it could not get. You need to match that file to a tape, and one way to do it is to run the following commands. These commands assume that you are looking for a backup of a file called no.restore.txt, from a node called WIN_SERVER001. These commands can also be used to investigate archive recall problems. First you need to find the OBJECT_ID of the file you are after.

select object_id from backups where node_name='WIN_SERVER001' and ll_name='no.restore.txt'

That command will return an object ID, something like 123456789. Next, you need to run a show command which tells you which volume that object is on. Like all show commands, this command is unsupported. Substitute your own object ID.

show bfo 0 123456789

This command may return a tape volume name, like 'Volume Name: E12334', or it may return a super-bitfile reference, like 'Super-bitfile:0.12345'. In the second case you would re-run the show command with 12345 as the object, and it will return the disk file name.
Once you know which tape or disk file was required, you can run commands like 'query volume' to check the volume status, 'query content ... damaged=yes' to check that the backup is OK, or audit the volume to attempt to fix any problems.
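
If you capture the show output in a script, the two cases can be told apart with a small parser. The Python sketch below is based only on the two example strings quoted above; real output contains more lines and may differ between server levels, so treat the patterns as assumptions. The function name parse_show_bfo is hypothetical.

```python
import re


def parse_show_bfo(output: str):
    """Classify captured 'show bfo' output.

    Returns ('volume', name), ('super-bitfile', object_id) or ('unknown', None).
    The patterns match the example strings in the text, nothing more.
    """
    match = re.search(r"Volume Name:\s*(\S+)", output)
    if match:
        return ("volume", match.group(1))
    match = re.search(r"Super-bitfile:\s*\d+\.(\d+)", output)
    if match:
        # Re-run the show command with this ID to reach the volume name.
        return ("super-bitfile", match.group(1))
    return ("unknown", None)


print(parse_show_bfo("Volume Name: E12334"))
print(parse_show_bfo("Super-bitfile:0.12345"))
```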