Image Backups

The case for Image Backups

File systems that contain millions of files cause a number of problems for TSM.

  • Every backup of every file requires an entry in the TSM database, so one filespace with millions of files can account for 20% or more of the database capacity. Apart from the obvious space overhead, this also affects the time required to run tape reclamation and expire inventory.
  • Backups and Restores take a long time, not so much because of the amount of data involved, but because of the amount of activity required at the TSM database to handle all those files.
  • Backups of millions of files require a lot of client memory and backups can fail due to memory shortages. While memory efficient backup techniques can help here, even they can fail if a file space contains too many files.

An image backup will copy any filespace as a single blob, no matter how many files it contains. The advantages are that backups and restores of the filespace are fast, and you just get one entry in the database. The disadvantage is that you cannot restore individual files, just the whole filespace.

There are two scenarios where image backups can assist with problem filespaces.

  1. Recovery can be sped up by taking a weekly image backup alongside the normal daily incrementals. Recovery involves restoring the image, then 'rolling forward' by restoring the later incrementals with the -fromdate option.
  2. An image backup is ideal for those cases where you need to take a backup of a filespace, but the chances are you will never need to restore a file from it. An example would be to take a tax year end copy of a filespace, just in case the Inland Revenue comes calling.
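
As a rough sketch of the first scenario, the client commands might look something like the lines below. The E: drive and the date are examples only, the date must be written in whatever format your client's dateformat setting expects, and the options should be checked against the client manual for your TSM level.

Weekly image backup, and the normal daily incremental:

    dsmc backup image E:
    dsmc incremental E:

Recovery, restoring the image first and then rolling forward with files backed up since the image was taken:

    dsmc restore image E:
    dsmc restore E:\* -subdir=yes -replace=all -fromdate=03/02/2024

Note that a roll forward like this puts back new and changed files, but it will not remove files that were deleted after the image was taken. The restore image command also has -incremental and -deletefiles options that are intended to cover this, so they are worth a look in the manual if it matters to you.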

There are two kinds of image backup: static image and online image.

Static image requires exclusive use of a filespace, so it puts a lock on it at the start of the backup so no-one else can access it, and releases the lock at the end of the backup. This is pretty much unrealistic for any kind of production data, unless you can arrange a total outage while you run your backup.
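
For illustration only, a static image backup of an example E: drive could be requested from the command line by switching the snapshot provider off for that one run. This assumes a Windows client; check the snapshotproviderimage option in your client manual before relying on it.

    dsmc backup image E: -snapshotproviderimage=none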

Online image backups use snapshot technology. There are lots of ways to take snapshots, using either hardware or software, but the simplest answer seemed to be to use Microsoft's VSS (Volume Shadow Copy Service) if it is enabled. This is software-based copy-on-write technology that uses temporary space in the existing filespace.

Configuring Online Image Backups

Online Image Backup configuration requires three steps.

  1. Configure TSM for VSS Snapshot Support and Open File Support at the client. The easiest way is to use the TSM GUI wizard: select Utilities, then Setup Wizard, and you should see the configuration options. This is a very easy, next-next-next process.
  2. Next, you need some changes to your dsm.opt file. Add the following lines, which assume that you are going to use VSS snapshots and that you want an image backup of your E: drive.
    Snapshotproviderimage VSS
    Snapshotproviderfs VSS
    Domain.image E:
  3. Finally, define an image backup schedule on your TSM server, where the action is IMAGEBACKUP rather than the standard INCREMENTAL, and associate your client node with it.
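
As a sketch of step 3, the server administrative commands could look something like this. The policy domain STANDARD, the schedule name WEEKLY_IMAGE, the node name NODE1 and the timings are all made-up examples, so substitute your own.

    define schedule STANDARD WEEKLY_IMAGE action=imagebackup objects="E:" starttime=22:00 dayofweek=sunday period=1 perunits=weeks
    define association STANDARD WEEKLY_IMAGE NODE1

If you want to prove the client setup before waiting for the schedule, you can run an image backup by hand from the client. With no filespace named, the client should fall back on whatever you coded in domain.image.

    dsmc backup image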

And that's it. Run the schedule and you should see something like

If you want to bind your image backups to specific management classes, please be aware that wildcards do not work with image backups in the same way that they work with other include statements. You can assign all the image backups for this client to a management class with the statement

include.image * mgmtclass

However, if you want different image backups to use different management classes, you cannot pick them out with wildcards; you need to specify the full filespace name like this

include.image /usr/filesystem1 mgmtclass1
include.image /var/filesystem2 mgmtclass2

Restores

OK so far, but what about a restore? An image restore from either a static or an online image backup requires exclusive use of an empty file space. This should not be an issue if you are rebuilding a server. However, if the unlikely happens and you just need to retrieve a couple of files from an old backup, then you need to get some temporary disk space allocated and restore the whole image or, in other words, the whole filespace. Note that even though the backup above copied just 61.63GB, a restore needs the full 300GB.
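
To give an idea of what that looks like, here is a sketch that restores the image of an example E: drive to a temporary F: drive. Both drive letters are assumptions, F: must be at least as big as the original filespace, and anything already on F: will be overwritten.

    dsmc query image E:
    dsmc restore image E: F:

The query just confirms which image backups exist for the filespace before you commit to the restore.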

Once the restore completes you can see the full filespace with all its original files, so it is easy to copy over any that are required.

So this process could work well for large file spaces that hardly ever need files restored. The question you need to answer is this: is the saving in the TSM database worth the effort involved if you ever do need that restore?