TSM Database and Recovery Log

TSM Database version 6.1 and later

TSM Database version 5.5 and earlier

This information has been moved to a separate page as TSM 5.x is now out of support TSM 5.x

Basic database structure

TSM now uses a DB2 relational database. This means it can be much bigger than the old legacy database, does not need database audits and will automatically re-organise itself if required. Software database mirroring is no longer supported.
One thing to note is that you cannot have any other DB2 applications running on the same server as the one hosting your TSM database, though you can have multiple TSM instances on one host server. A TSM 'instance' is everything required to run a TSM server, including database, logs, storage pools etc.

The database uses DB2 relational consistency rules to prevent incorrect data from entering, and is self auditing. The database will also run automatic 'runstats' from time to time. This is a DB2 feature that optimises storage paths through the database to improve performance.
The database also uses relational indices, so it does not require special index tables to speed up SQL queries.

As the database is DB2, it is possible to connect to it directly, outside of TSM and run query commands. To do this in a Windows environment, follow this procedure:

Open a Windows command prompt.
Run the db2cmd command. This opens a new command window with the DB2 environment set up.
From this new window, run the following commands :
set DB2INSTANCE=server1
db2 connect to TSMDB1
db2 set schema TSMDB1

In a UNIX or Linux environment, the process is:

Log on with the instance user ID.
From a shell, run the following commands :
db2 connect to TSMDB1
db2 set schema TSMDB1

After you connect to the database, you can run DB2 select commands. For example to get the list of all table indexes that are used by the Tivoli Storage Manager server, run the following command :

db2 "select indname from syscat.indexes where indschema='TSMDB1'"

To get a list of indexes that are used by a specific table, use the tabname parameter to identify the table name. For example, to get a list of indexes that used by the BACKUP_OBJECTS table, run the following command :

db2 "select indname from syscat.indexes where indschema='TSMDB1' and tabname='BACKUP_OBJECTS'"

To get a list of table names, run the following command :

db2 "select tabname from syscat.tables where tabschema='TSMDB1' and type='V'"

back to top

Database sizing and tuning

The database can exist in up to 128 directories or 'containers' to use the correct DB2 term. The data is striped evenly across the directories and the containers do not require an initial format before they can be used. The Q DBSPACE output below shows a database striped over 3 containers.

                     File System(MB)  (MB)          AVAILABLE (MB)
/tsm/tivoli/dbdir001 102,144          6,919.22      98,224
/tsm/tivoli/dbdir002 102,144          6,919.22      98,224
/tsm/tivoli/dbdir003 102,144          6,919.22      98,224

When a TSM database is initially created on multiple file systems, that database will be spread equally over all the file systems. However if you add an extra file system to the database space using the 'extend dbs' command, DB2 will not rebalance the database to spread the data equally. This means that if some of the original file spaces were 100% full, they will still be 100% full after the new filespace is added and this could cause the TSM to stop.
If you are running TSM Server V6.2 or above you can rebalance the database dynamically using DB2 commands. I suggest that you look up the IBM technote about this, and also contact IBM for advice before trying this. However in outline the process goes like this:

At a quiet time, log into the DB2 command line with the DB2 administrator userid and run these commands

db2 connect to tsmdb1
db2 set schema tsmdb1
db2 alter tablespace [tablename] rebalance

The tables that IBM mention in the technote are SYSCATSPACE, USERSPACE1, LARGESPACE1, LARGEIDXSPACE1, IDXSPACE1, SYSTOOLSPACE. It can take about 30 minutes to rebalance a large tablespace. When all tablespaces have done the rebalance, execute the following commands to reduce the same list of tablespaces :

db2 alter tablespace [tablename] reduce max

After the rebalance/reduce, the file systems should be equally utilized and none should be full

The maximum database size for TSM 6.3 and TSM 7.1 is 4TB, and 2TB for TSM 6.2. The DB2 database is largely self tuning, so there is no requirement for DB2 tuning skills. A new parameter, DBMEMPERCENT, replaces the old BUFFPOOLSIZE. This set of buffers contains much more data than the old buffer so the recommendation is to set its size to unlimited. In fact, TSM/DB2 will try to change it to unlimited on startup.

The Deduplication effect on Database size

Deduplication reduces the amount of storage space used for backups, but this comes at the cost of increased database space as TSM must track the metadata needed to reconstitute deduplicated data. There is no exact formula for working out that database increase, but it is possible to estimate the database size impact.

Deduplication data is stored in 'chunks' of variable size, but the average size is about 100k. Each chunk uses about 490 bytes in the database, and then another 190 bytes if it is copied to a copy storage pool. Compare that with the normal overhead of 200 bytes needed to store metadata for every copy of every file backed up.
If you want to know how many files are stored on your TSM server, use the following select command

select sum(cast(num_files as bigint)) from occupancy where node_name is not null and filespace_id is not null

So it would appear reasonably easy to estimate your database size, all you need to know is backup file count and the number of bytes in your storage pools. However it is not quite as easy as this, as the first set of chunks that are created are called 'base chunks' and they are not automatically expired when that file is deleted, as other files may be linked to them. Worst case, and very unlikely, is that this effect could double the chunk count.

A starting point is to estimate your database size without deduplication, and to do this you use the formula
db_size = file_count * number_of_backup_copies * 200

Now to estimate the database size with deduplicaton, considering an enterprise that is managing 20TB of data, with the primary storage pool deduplicated, and a copy storage pool that is not deduplicated.
Estimated # of chunks created = 200,000,000 (20,000,000,000,000 / 100,000)
Database cost for storing chunks in primary pool = 98,000,000,000 (200,000,000 x 490)
Database cost for storing chunks in copy pool = 38 GB (200,000,000 x 190)
Initial estimated database cost for deduplication = 136 GB
Add in the worst case scenario for non-expiring base chunks then database cost is 272GB
If a copy storage pool is deduplicated, the same cost would apply as if it were stored in an additional primary pool.

IBM provides a pearl script that provides details on how your database handles deduplication content, once you have been running dedup for a while. Search the IBM site for tsm_dedup_stats.pl and you should find it.
It gives you information like database size broken down by rows in the large tables, the number of client nodes registered, by storage pool, the percentage of pool deduplicated and how many bytes deduplication is saving in each pool. It also gives you detailed deduplication information like chunk count and average chunk size.

back to top

Recovery log sizing and tuning

TSM has three recovery logs.
The Active log contains updates that have not been committed to disk yet and is used for roll-forward or roll-back in case of problems. Once a transaction is committed, the data is moved to the archive log. The default size for the Active log is 2GB and the size can be increased by increments of 512MB right up to 128GB.
The Archive log contains committed transaction data and is used for PIT recovery of the database. The Archive log is cleared out by a full database backup. However it retains all data updates applied right back to the second last backup, so you need to size your archive log with that in mind.
The Failover Archive log
TSM collectively calls these three logs the 'recovery log', but a DB2 DBA would just call them 'transaction logs'.

The log files form part of the TSM database, and unlike the legacy TSM database there is no need to create and format log volumes. The logmode is equivalent to legacy roll-forward. In DB2 terms, these are archive logs, not circular logs. This means that the log files can fill up, so log file management is still required. You can specify a failover log for the Archive log to help prevent this, but the Active log cannot failover and the size is fixed between 2GB and 512GB, so don't allocate all the space that you have available for the Active log, keep some in reserve for emergencies.
It is highly recommended that FailoverArchiveLog space be set aside for possible emergency use. You can use slower disks for FailoverArchiveLog space.

Prevention is better than cure, and the active log can become pinned by client sessions that are either hung or running very slowly. A pinned backup can result in the active log becoming full.
It is possible to automatically identify and terminate any hung or long-running client sessions by using the THROUGHPUTDATATHRESHOLD and THROUGHPUTTIMETHRESHOLD options. These are specified in the dsmserv.opt file and are used by the server to monitor the status of each client session. If after the amount of time specified by THROUGHPUTTIMETHRESHOLD, the average throughput is less than the value specified by the THROUGHPUTDATATHRESHOLD parameter, then the session will be cancelled by the server.
You code these parameters in dsmserv.opt like this:

throughputtimethreshold 100
throughputdatathreshold 60

This means that if a session is not achieving an average throughput of at least 60KB per second after running for 100 minutes, the server will cancel this client session. These values are for illustration purposes, you need to pick the values that work best for your site.

The archive logs should be automatically deleted after a full database backup completes, but in some cases this might not happen and then the archive logs will continue to grow until the filespace is full.
Run the DB2 command

get db cfg for tsmdb1

and check the output, which should have the default values set like this.

Number of database backups to retain (NUM_DB_BACKUPS) = 1
Recovery history retention (days) (REC_HIS_RETENTN) = 0
Auto deletion of recovery objects (AUTO_DEL_REC_OBJ) = ON

Once the number of backups specified by the NUM_DB_BACKUPS is reached, the old backups are marked as expired and will be deleted by the next database backup. The AUTO_DEL_REC_OBJ parameter specifies whether the database log files, backup images, and load copy images associated with those backups should be deleted at the same time. These parameters cannot be changed from within TSM by a TSM administrator and should not be changed from the DB2 command line, but left at the default values.

If you get a LOGREAD388 error, then the TSM server will crash. This error happens when a recovery log volume becomes corrupted. If this happens to you, your first option should be to check your operating system error logs. If they indicate that a file system has corruption errors or disk errors, get your system administrator to check out the operating system and see if the error can be fixed there.

Your next option is to check to see if your logs are mirrored, and if so then tell TSM to use the mirror copy. From an OS command line issue the command below and check the output

dsmserv display logv

this will tell you if the logs are mirrored if the command shows more than one copy for the log files. If they are mirrored, then add the option
in the dsmserv.opt file, and that will force the TSM server to sync from the other copy. If dsmserv.opt already contains a MIRRORREAD line then comment it out to make sure that your option is used. If the TSM server starts OK, then it has fixed the issue by copying the uncorrupted data over from the copy log, and all is well.

If this fails, then you will probably have to restore your database to the latest backup, but you might want to contact IBM for further advice first.

You may want to change the location of the achive logs, maybe because you have added a bigger file system and this is usually quite easy.

  1. Take a full database backup, not just because that is the starting point for any change, but also because a full backup of the TSM database will clear the archive logs down.
  2. Halt the TSM server
  3. Edit the dsmserv.opt, changing the values for ARCHLOGDirectory and ARCHFAILOVERLOGDIR to your new paths.
  4. Start the TSM server up again and the directory structure will be automatically generated in the new empty filesystem. There is no need to copy the archive logs over from the original location, because your full backup emptied them out.
  5. If you want to prove to yourself that this worked, then a 'query log f=d' command should show the new locations.

If you really need to, or even just want to, it is possible to update DB2 directly to change the log location. You should really ask your IBM support rep for advice before you start, but an outline process would go like.

  1. backup your database (full), devconfig and volhistory files.
  2. Halt your TSM Server
  3. Login to your DB2 command line and check the existing log locations with command 'db2 get db cfg for tsmdb1'
  4. Create your new archive log directory and paths, and make sure that the TSM instance user has read/write access to them.
  5. Update DB2 to use the new path structure with these 3 commands, substituting your paths as appropriate
    db2 update db cfg for tsmdb1 using OVERFLOWLOGPATH /tsm/new_archivelog/path/
    db2 update db cfg for tsmdb1 using LOGARCHMETH1 DISK:/tsm/new_archivelog/archmethpath/
    db2 update db cfg for tsmdb1 using FAILARCHPATH /tsm/new_archivelog/failarchpath/
  6. run the 'db2 get db cfg for tsmb1' command again and make sure your changes worked
  7. Edit your 'dsmserv.opt' and update the archive log path entry to match the changes you made to DB2
  8. Start your TSM Server up
  9. Run the 'q log f=d' command and check archive log location.

back to top

Activity, Schedule and Event logs

The TSM server stores information about server and client activity in three places; the Activity Log, The Events Log and the Summary Log. The Activity Log tells you everything that is happening on the TSM server, the Events log details all the activity that is triggered by the TSM server Scheduler, including both Admin and Client schedules. The Summary Log, as the name suggests, stores summary data of things like expiration, reclamation, backups and restores. Since TSM 6.0 all three of these logs are stored in the TSM database so they use database space and can cause problems if they hold too much data. To find out how many days worth of data they are holding, run the QUERY STATUS command and look for lines like

    Activity Log Retention Period: 5 Day(s)
    Event Record Retention Period: 10 Day(s)
Activity Summary Retention Period: 30 Day(s)

You can change these default entries using the SET command as shown below. The numbers are for illustration purposes and should not be considered as recommendations. Most people stick with the defaults. If you increase them, then monitor the impact on database size.

set ACTlogretention 10
Set EVentretention 12
Set SUMmaryretention 20

If you want to to know exactly how much data is held in the event and summary tables, you could try the following (rather complicated) queries that appeared in IBM tech notes in 2014. You need to run them from a DB2 command line, not the TSM command line, so you initialy connect to the TSM database then run the queries like this

db2 connect to tsmdb1
db2 set schema tsmdb1

db2 "select tu.name,cast(rows_in_table as bigint),cast(table_used_mb as bigint),cast(table_alloc_mb as bigint),cast(index_used_mb as bigint),cast(index_alloc_mb as bigint) -
from ( select substr(tabname,1,28) as name,bigint(card) as rows_in_table, bigint(float(t.npages)/(1024/(b.pagesize/1024))) as table_used_mb from syscat.tables t, syscat.tablespaces b where t.tbspace=b.tbspace and t.tabschema='TSMDB1' ) as tu, ( select substr(tabname,1,28) as name,bigint(sum(i.nleaf)*(b.pagesize/1024)/1024) as index_used_mb from syscat.indexes i, syscat.tablespaces b -
where i.tbspaceid=b.tbspaceid and i.tabschema='TSMDB1' -
group by tabname, pagesize ) as iu, ( select substr(tabname,1,28) as name, bigint(data_object_p_size/1024) as table_alloc_mb, bigint(index_object_p_size/1024) as index_alloc_mb from sysibmadm.admintabinfo ) as ta -
where tu.name=iu.name and tu.name=ta.name and tu.name='SCHEDULE_EVENT' with ur"

db2 "select tu.name,cast(rows_in_table as bigint),cast(table_used_mb as bigint),cast(table_alloc_mb as bigint),cast(index_used_mb as bigint),cast(index_alloc_mb as bigint) -
from ( select substr(tabname,1,28) as name,bigint(card) as rows_in_table, bigint(float(t.npages)/(1024/(b.pagesize/1024))) as table_used_mb from syscat.tables t, syscat.tablespaces b -
where t.tbspace=b.tbspace and t.tabschema='TSMDB1' ) as tu, ( select substr(tabname,1,28) as name,bigint(sum(i.nleaf)*(b.pagesize/1024)/1024) as index_used_mb -
from syscat.indexes i, syscat.tablespaces b where i.tbspaceid=b.tbspaceid and i.tabschema='TSMDB1' -
group by tabname, pagesize ) as iu, ( select substr(tabname,1,28) as name, bigint(data_object_p_size/1024) as table_alloc_mb, bigint(index_object_p_size/1024) as index_alloc_mb from sysibmadm.admintabinfo ) as ta -
where tu.name=iu.name and tu.name=ta.name and tu.name='ACTIVITY_SUMMARY' with ur"

back to top

Backing up the TSM Database

It is very important that your TSM database is regularily backed up using TSM commands, so it can be recovered if it gets corrupted. Most sites take 2 types of backup, an on-site copy to disk and an off-site copy to tape. The location of TSM database backup files are stored in the volume history, so it is essential at a minimum that you backup both the volume history file, and the TSM database. It is also a good idea to backup the device config file at the same time.
The shortest database backup command is simply

backup db devclass='classname'

Where the classname in the command points to a sequential access device class, either disk or tape. This will default to an incremental backup that will use scratch tapes and the command line will not hang while the backup runs. You can change these defaults by using parameters

Scratch=No means don't use scratch tapes and then you have to tell TSM which volumes to use with a VOLumenames='csv_volume_list' parameter

Wait=YES means the command will wait until the database backup completes before it moves on to the next task. This is typically used in automated server management scripts. The default is wait=no.

The TYPE parameter is used to decide what tpe of backup to take. Options are Type=Full, Type=Incremental or Type=DBSnapshot.

A Full backup will copy the entire TSM database and will then truncate the TSM server active and archive logs.
An Incremental backup will copy the changes that have been made to the TSM database between the current point in time and the last full database backup. The maximum number of incremental backups you can run between full backups is 32.
A DBsnapshot is also a full backup of the TSM database, but it runs independently of the full / incremental backup sequence and will not truncate the TSM database transaction logs.

BACKUP DEVCONFIG command backups up the device class definitions, library definitions, drive definitions and exit definitions. You must have this information available to restore the Tivoli Storage Manager database.
The command is simply


with an optional Filenames= parameter where you can tell TSM where to store the backups. If you don't use the Filenames parameter, then TSM looks for the filename that is contained in the server options file, under the DEVCONFIG option.

The BACKUP VOLHISTORY command saves history records for database backup and dump volumes, and information about lots of other volumes, for instance Volumes removed by using the DELETE VOLUME command or during reclamation of scratch volumes.
The command is simply

BAckup VOLHistory

with an optional Filenames= parameter where you can tell TSM where to store the backups. If you don't use the Filenames parameter, then TSM looks for the filename that is contained in the server options file, under the VOLUMEHISTORY option.

There is one thing to be aware of, if you start poking about in the db2diag.log file for the DB2 system that the TSM server runs on. You might see error messages like

MESSAGE : ADM10502W Health indicator "Database Backup Required" ("db.db_backup_req") is in state "Manual backup required" on "database" "tsm1.TSMDB1"

This is just the DB2 health monitor warning that the database config is set to AUTO_DB_BACKUP = OFF as DB2 thinks it should be managing database backups. TSM needs to manage its own database backups so it can update the volume history file when backups run, so AUTO_DB_BACKUP should be set to OFF and this is working as designed. As long as you are scheduling database backups through TSM, the ADM10502W messages can be ignored.

back to top

Restoring the TSM Database

The restore process will depend on what you need to achieve and where you are starting from. Three major scenarios exist;

  1. The database is broken and requires a simple restore up to the last committed transaction
  2. The database is corrupt and needs a point in time restore to a known good state
  3. The hosting server has crashed and a full DR restore is required

Taking these three in turn.

Restoring a server database to its most current state

Assuming the active and archive logs are intact, all backups exist and all the files and directories for the database still exist, the restore process is simply halt the server then from the O/S command line, issue the command

dsmserv restore db

Restoring a server database to a point in time

This is a little more complex. You need the latest full backup and the latest incremental backup from before the point in time. You also need the Server options file and the Volume history file, but note that the volume history file is wiped by the restore process, so take a copy of it before you start. If necessary, re-create your database and recovery log directories.
To restore the database to a backup that was created before 16:00 on August 23, 2014, enter:

dsmserv restore db todate=08/23/2014 totime=16:00

The server will then find the last full backup from the volume history file and restore it, then work its way through any incremental backups create before the date and time specified above. However, the problem is that as you have wound the database back to a previous time, it is now out of step with the data in the storage pools, as migration and expiration might have run after the restore point. Your old volume history file has a record of all volume usage, so take a look at it and pull out a list of all volumes that were changed after the point in time restore. What you are looking for is volume statuses of STGREUSE (updates) STGNEW (added volumes) and STGDELETE (removed volumes) - you can do this while you are waiting for the restore to run.

Run an AUDIT VOLUME command with the FIX=YES parameter against all your disk volumes, and all tape volumes marged as reused or deleted. If storage pool volumes were added after the backup you will need to add them again.

If the audit cannot find a backup file, it will be permanently lost and so copied again on the next backup. If the file exists in a copy storage pool, it will be marked as 'damaged', so you will need to run the RESTORE STGPOOL command to get those files restored from the copy. If you don't have any backups, the ultimate cleanup is to run DELETE VOLUME with the DISCARD=YES parameter, but this will delete your backup data.

You will also need to run an AUDIT LIBRARY command to fix any tape volume anomalies.

Disaster Recovery issues

A disaster recovery restore adds another layer of complexity. First you need to rebuild your server, hopefully from a SYSBAK or MKSYSB that puts all the relevant code back exactly as it was. If you do need to re-install TSM then note that the IBM manuals specifically state that 'The restore operation must be done by a Tivoli Storage Manager server at a code level that is the same as that on the server that was backed up - This means that the server code level can not change at all - not even by a patch level'.

You will probably then need to define your tape library, at least one drive, and paths using DEFINE LIBRARY, DEFINE DRIVE and DEFINE PATH commands. At this point you should have a TSM system ready for restore, so you can follow whichever of the 2 options above is appropriate.

back to top

Using DB2 commands on a TSM server

IBM's design model for TSM v6 and upwards is to store TSM metadata in a DB2 database, without the TSM administrators needing to know anything about DB2 and how to manage it. It is now becoming obvious that a TSM Administrator does need to know a bit of DB2, but it would always be wise to contact IBM for advice before running base DB2 cvommands. If you have a test 'sandpit' system that you can try things out on, then here are a few DB2 commands that might be useful.
On Windows, you start a DB2 command line from Start -> All Programs -> IBMDB2 -> Command Line Tools.


The TSM DB2 system is 'owned' by the userid that installed it, and normally only that userid has the administration authority needed to manage the DB2 database, including the ability to start the TSM service. However you can give access to another userid using DB2 commands. Open up a command line as the TSM instance owner by right clicking on it and taking the 'run as' option. You will need the instance owner userid and password to do this. Once you have the command line, type the following commands

db2==> connect to tsmdb1

db2==> grant dbadm on database to user TSM_ADMIN

Userid TSM_ADMIN can now be used to stop and start the TSM services

Recovering from a full archive log

Under tsm 6.x, the archive and active log directories can fill up, and if they do, the server will shut down. To prevent this, you need to make sure you trigger a FULL database backup once the archive log hits a threshold, but if the worst happens and the log files do fill up, you need a recovery process.

If this happens then you cannot use TSM commands to move the logs into bigger directories, as you cannot start TSM. What you need to do is create temporary logs elsewhere, then prune the archive log using native DB2 commands. However, remember that the archive log will hold enough information to wind back through the last 2 full backups, so you need to run 2 full backups to clear it down.

  1. Create a temporary directory large enough to hold your active logs. The dsmserv.opt file may contain the log sizes in the ACTIVELOGSIZE parameter, and if not, it will point to the physical log location.
  2. Open a DB2 command line and run the commands below to switch the logs to a new location

    Set db2instance=SERVER1
    db2 update db cfg for tsmdb1 using newlogpath path\to\new\logs

  3. 'Activate' the database to copy the log files with the following command, this command does not affect the original logs. This may take a while, and success will be indicated when you see a command prompt again.

    db2 activate db tsmdb1

  4. Now you need to back the database up to clear the logs out, and you need to do this on disk, so identify or create a directory with enough space to take a database backup then run the following DB2 commands.

    db2 backup db tsmdb1 to path\to\database\backup\directory

    The archive logs will start pruning once you see the 'Backup Successful' message, but this could take a while to appear if your database is large. Make a note of the backup timestamp, which will look something like 'The timestamp for this backup image is: 20120412130821'
  5. Find some more space, and run another full DB2 database backup with the command.

    db2 backup db tsmdb1 to path\to\another\database\backup\directory

    When this second backup completes, the archive log directory and original active log directory are empty of log files. Make a note of the backup timestamp again, let us call this one 20120412150425
  6. Now you need to delete the first backup using these commands - note how you use the timestamp from step 4.

    db2 connect to tsmdb1

  7. Point DB2 back to the original, empty active log in the original location, you will get this from the ACTIVELOGDIRECTORY parameter in dsmserv.opt


  8. Connect to the database again, and that will automatically start moving the active logs from the temporary location to original active log location, and again, this can take a while if the logs are big.

    db2 force application all
    db2 connect to tsmdb1

  9. Now you need to start the TSM server up and run a good backup. You need to start the server in the foreground to do this, so open a normal Windows command line and navigate to the server directory and run dsmserv. If you have more than one TSM server on this machine, you may need to use the -k option to get the right server. This will bring you up a TSM server command line. Disable your client sessions then take 2 full database backups. You need to know your backup device classes to be able to do this.

    Disable sessions
    Backup db type=full dev=your_db_devclass

  10. Delete the second DB2 database backup as follows, using the database timestamp that you recorded in step 5.


  11. Now you can halt your server in the foreground, and start it normally. Remember to enable sessions.

It is possible to query what is happening while a database recovery is in progress with the db2pd utility, a DB2 diagnostic tool that is provided with the TSM server installation code. You simply run this as a command from the shell prompt, like this:

tsm:~ # db2pd
db2pd> You are running db2pd in interactive mode.
db2pd> If you want command line mode, rerun db2pd with valid options.
db2pd> Type -h or -help for help.
db2pd> Type q to quit.

To check out what is happening with a database recovery, run

db2pd> -recovery -db tsmdb1


Runstats is used to optimise access paths through the TSM tables and should normally be set to run automatically as required. However if runstats starts automatically when the TSM is started up after a database upgrade, it can cause performance problems to the extent that no-one can log into the system.
To temporarily suspend auto runstats, before halting the TSM server for an upgrade, submit the following commands to the DB2 instance that is associated with the TSM server:

db2 connect to tsmdb1
db2 update db cfg for TSMDB1 using AUTO_RUNSTATS OFF

Now runstats will not start automatically when you restart TSM server. However you need runstats to keep your database optimised, so once you are happy that your TSM server is up and running, submit the following commands to the DB2 instance for your TSM server and Runstats will resume normal processing.

db2 connect to tsmdb1
db2 update db cfg for TSMDB1 using AUTO_RUNSTATS ON

If you have a corrupt TSM database, it is possible that it is just the database index that is corrupt and if so, it can be rebuilt without having to restore the whole database. You really should be involving IBM and following their advice before trying a rebuild, but it might be worth knowing how to tell if corruption in DB2 is with the Table or the Index.

The DB2 Database Analyzing and Reporting tool, db2dart, can help here, but note that it does require exclusive use of the database and can run for a long time. If you run the tool and check the output, it will tell you which pool is in error. Look for messages like 'Error: Extent number 136 in tablespace 3 is orphaned'. Then you need to grep the db2diag.log for "Obj: {pool:3;", and the output will look something like

Obj: {pool:3;obj:95;type:0} Parent={3;95}

The important part is the 'type'; type:0 is table, type:1 is index and if it is type:1 then the index can be rebuilt to fix the corruption issue, but as I said above, consult IBM first.


You can enter any DB2 command from the DB2 command line, including SQL queries and commands that update or delete the database, so be careful. Some of the query commands could be useful for investigating TSM problems

  • get instance -   returns the name of the TSM server
  • list active databases -   will show the TSM database name as known to DB2, and the path to it.
  • get dbm config -   shows the settings for the database configuration manager
  • db2start and db2stop -   obvious what these are! They should never be necessary as DB2 should be started automatically as part of the server startup, but if necessary, it can be done manually
  • The following commands require you to be logged in with adminstrator authority and connected to a database

  • get db cfg show detail -   shows the confguration parameters for the database
  • list tables -   for the connected database
  • describe table table_name -   lists the columns for the specified table

back to top

Automatic Database reorganisation issues

Reclaimable Storage

TSM should automatically tune and manage its database. However this might not always work. One problem that you might see is that your database grows a lot faster than you might have expected, even though you are running expiration and the number of objects in the database is not growing by much. Apart from consuming space, this growth can also cause performance problems.

DB2 should have a feature called 'reclaimable storage' which according to IBM can 'consolidate in-use extents below the high water mark and return unused extents in your table space to the system for reuse'. A problem can arise if you originally installed TSM 6.1 as that used DB2 9.5 tablespaces and they do not have the reclaimable storage feature. If you then upgraded your TSM to higher levels, you will still be using DB2 9.5. You could get in touch with IBM and obtain the scripts that will allow you to upgrade DB2 to release 9.7. The issue is that the scripts must be run while the server is halted and generally require ten to thirty hours to run, and sometimes much longer.

If you are not sure whether the database was created by installing Tivoli Storage Manager V6.1 or by installing a later version, issue the following SELECT command:

db2 "select cast(TBSP_NAME as char(30)), reclaimable_space_enabled from table(mon_get_tablespace('',-1)) where TBSP_NAME in ('USERSPACE1','IDXSPACE1', 'LARGESPACE1','LARGEIDXSPACE1')" > reclaimable_space.txt

In the output, the RECLAIMABLE_SPACE_ENABLED column displays a value of 0 for V6.1 databases, even if the server was later upgraded to V6.2 or later.

If you installed TSM 6.2 then things are a bit better, as that used DB2 9.7, and that release did include reclaimable storage. However even then you might still not be able to reuse freed up space, as if your workload is high, the database reorg utility can conflict with your regular housekeeping. Large deduplicated storage pools can be an especial problem.

How does TSM reclaimable storage work?

If you are running version or later, then online index reorganization is enabled by default, and a reorganization and cleanup process is run automatically on all tables. If you have the DISABLEINDEXREORG and DISABLEREORGCLEANUPINDEX options in your dsmserv.opt file, then these options will be silently ignored.

If your server level is earlier than, then you may see ANR3497W messages in the activity log

  • If a table requires reorganization, ANR3497W, "Reorganization is required on table 'table name'. The reason code is 1."
  • If the indexes for a table require reorganization, ANR3497W, "Reorganization is required on table 'table name'. The reason code is 2."
  • If an index reorganization with the CLEANUP PAGES ALL option is required on the BF_BITFILE_EXTENTS table, ANR3497W, "Reorganization is required on table BF_BITFILE_EXTENTS. The reason code is 3."

TSM checks via DB2 to see which tables and indexes need to be reorganised, then it works through the list and processes the tables one at a time. It can take a long time to reorganise a large table, possibly a month or more. The process will be paused at the end of the daily reorg window, then restarted when the next daily window starts. When each table reorg is complete, TSM will run a RUNSTATS agains the table to optimise its performance. Once all the tables and indexes are reorganised, TSM will then wait at least 20 days before it starts the process again.
Database reorganisation cannot run alongside database backup. If a backup is running, the reorg process will not start. If the reorg process is running, then every so often it will pause and allow a backup to run, if one is required or scheduled.
If table reorganization is in progress at the end of the reorganization window, it is paused until the reorganization window starts the next day, when reorganization is resumed.
If index reorganization is in progress at the end of the reorganization window, reorganization continues. It is not possible to pause index reorganizations.

How do I know if TSM reclaimable storage is running?

The easiest way is to check the server log for messages
ANR0293I and ANR0294I are issued when a table reorg starts and completes
ANR0317I and ANR0318I are issued when a table index reorg starts and completes
ANR0336I and ANR0337I are issued when a RUNSTATS starts and completes, when RUNSTATS is triggered at the end of a table or index reorg.
A more direct way is to run the DB2 reorgchk command. Open a DB2 command window and type

db2 reorgchk current statistics on table all >db2reorgchk.out

What control do I have over what reclaimable storage does?

You can run reorganisations online or offline. Online means run alongside other work, which might cause performance and contention issues. Offline means bring the server down to run the reorgs, but then the reorgs run 100 times faster. Neither of these options sounds attractive. In general, the more current your level of TSM, the better control you have over online reorgs. Some of the options below might not be valid for older releases of TSM.

The default position is that online reorganisation can run 24 hours a day, but you can use options REORGBEGINTIME hh:mm and REORGDURATION nn to control when reorganisation starts, and how long it runs for. The 'nn' parameter is hours. You should aim to schedule reorg activity at a time when server utilisation is low. Things to avoid are intense backup and archive processing, and expiration, migration, or reclamation.

ALLOWREORGTABLE and ALLOWREORGINDEX can be set to 'NO' or 'YES'. The default values are YES for Table reorg and NO for Index reorg. Index reorg really needs to run at a quiet time, and if you enable it you must set the server option DB_DB2_KEEPTABLELOCK NO as it can cause database deadlocks (this is the default for TSM v6.3 onwards). Also, unlike Table reorg, Index reorg cannot be paused but if necessary it can be cancelled from a DB2 command line. I suggest you discuss options with IBM before enabling ALLOWREORGINDEX and DB_DB2_KEEPTABLELOCK NO as there are several factors to consider.

If you see locking problems, for example error messages like ANR1880W: Server transaction was canceled because of a conflicting lock on table 'REPLICATING_OBJECTS', you can disable reorgs for selected tables or indexes. To do this you need to add parameters to the database options file, then halt and restart the server. The options are DISABLEREORGTABLE, DISABLEREORGIndex and DISABLEREORGCleanupindex. DISABLEREORGTABLE and DISABLEREORGIndex take a comma separated list of tables, while DISABLEREORGCleanupindex will only accept BF_BITFILE_EXTENTS. The following parameter are examples, you need to find out which tables are appropriate for your site. DISABLEREORGIndex and DISABLEREORGCleanupindex are deprecated from TSM 7.1.1.


IBM advises that online reorganization might not work on the following tables:

back to top

Investigating Problems with the Server Instance

The first place to start is the TSM Active log, but if you need to go deeper, then the DB2 logs can be useful. However finding those logs can be a challenge as the location can depend on the the OS platform or even the OS release level. The best way to be sure you have the correct log is to check the DIAGPATH variable in DB2.
Start up a DB2 command line, in Windows go to Start-> Programs-> IBM DB2-> Command Line tools-> Command Window and in UNIX, su - db2inst1 (db2inst1 is the default instance, if you change the instance name or have multiple instances, you need to su to the correct userid for your instance). You then type 'db2' to open the DB2 command line
From the db2 command line type db2=> get dbm cfg. The command produces a lot of output, look for the line like 'Diagnostic data directory path     (DIAGPATH) = /home/E1WT1/e1wt1/sqllib/db2dump    ' and this shows the path to the log files. If the DIAGPATH is blank, look for the default PATH directory instead.
'quit' gets you out of that DB2 command line

The DB2diag.log contains information like database backups, table reorganizations, memory management messages, start and stop of TSM server and hardware information logged at instance start time, as well as error and warning messages.

Sometimes when investigating TSM server problems, the DB2 terminology does not quite match TSM so the error messages in the DB2 logs can look a bit strange. For example, Tivoli Storage Manager refers to 'transactions' which DB2 calls to 'units of work' (UOW). Tivoli Storage Manager uses select statements where DB2 uses SQL and are also sometimes referred to as DML, or data manipulation language statements.

Another potentially useful file is the startup trace log dsmupgdx.trc, which is located in c:\program files\tivoli\tsm for Windows or /opt/tivoli/tsm/server for UNIX and Linux servers. If you get database startup problems it's always worth checking the file to see if any useful error messages exist.

Some TSM errors messages are very generic and more investigation is needed to pinpoint the problem. An example is 'ANR8503E A failure occurred in writing to volume'. To investigate this you need to enable a TSM server trace. To do this, enter the following commands from the TSM server command line

trace disable *
trace enable pvr addmsg
trace begin path/name/trace.out
redo the activity that created the error message
trace flush
trace end
trace disable *

Then open the trace file and look at the messages that were written out just before the error.

You might sometimes see an error message, 'ANR2981E Database backup terminated due to environment or setup issue related to file access - DB2 sqlcode -2033 sqlerrmc 106.' The messages in the DB2 log file are a bit confusing, and will be something like

MESSAGE : SQL2033N An error occurred while accessing TSM during the processing of a database utility. TSM reason code: "".
FUNCTION: DB2 UDB, database utilities, sqlubDeleteVendorImage, probe:558
MESSAGE : SQL2062N An error occurred while accessing media "". Reason code:"".
DATA #1 : String, 27 bytes
Error returned by sqluvdel.
DATA #2 : Vendor RC, PD_DB2_TYPE_VENDOR_RC, 4 bytes
Vendor RC=0x0000001A=26 -- see DB2 API Guide for meaning.
DATA #3 : Hexdump, 48 bytes
0x0A00060137109070 : 0000 006A 3133 3639 2031 3036 0000 0000 ...j1369 106....
0x0A00060137109080 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0A00060137109090 : 0000 0000 0000 0000 0000 0000 0000 0000 ................

A possible cause is that the DSMI variables are not defined correctly in the DB2 user profile. This is found in /sqllib/userprofile in UNIX/Linux systems. The script variables should be set as

export DSMI_CONFIG=/home/tsminst1/tsminst1/tsmdbmgr.opt
export DSMI_DIR=/usr/tivoli/tsm/client/api/bin64
export DSMI_LOG=/home/tsminst1/tsminst1

Sometimes after a problem, DB2 goes into crash recovery mode and the TSM server appears to be hung. DB2 crash recovery can take a long time. If your TSM server is hung, check your DB2 db2diag.log for a message like

MESSAGE : DIA2051W Forward phase of crash recovery has completed. Next
LSN is "0000259AD0276FD8".

Then log into DB2 as the instance owner and run the command 'db2 list utilities show detail'. If Crash Recovery is active, the message will tell you how much work it needs to do, so you can estimate how much longer it will take. Some of the sample output of this command is shown below.

Database Name = TSMDB1
Estimated Percentage Complete = 100
Phase Number = 1
Phase Number [Current] = 2
Total Work = 25937697602 bytes
Completed Work = 25769328527 bytes
Start Time = 10/17/2013 11:36:29.368810

back to top

Errors on Database Startup

Your first point of call should be to raise an incident with TSM support and get advice on your problem. However the following might help.

When trying to start up TSM the following error message can appear "ANR0170E - Error detected, database restart required", and you may see errors in the actlog a bit like

ANR0171I dbieval.c(874): Error detected on 3:2, database in evaluation mode.
ANR0170E dbieval.c(935): Error detected on 3:2, database restart required.
ANR0162W Supplemental database diagnostic information: -1:58031:-1034
([IBM][CLI Driver] SQL1034C The database is damaged. The application has been disconnected from the database. All applications processing the database have been stopped.

The resolution is to restart DB2 manually with the RESTART command. Open up a DB2 command line window as explained above then issue the following

set db2instance=db2inst1 (this is the default instance)
db2 force application all
db2 restart database db2inst1

You might have to run the restart command a few times before the issue is resolved. If this does not fix the problem you probably need to contact IBM Support, although you can use the db2dart command to run a database analysis. This generates a report file that would be useful for IBM support.

db2 force application all
db2dart db2inst1 /db

Another possible message is 'ANR0151W Database manager fails to start'. This problem can occur after a database crash, maybe due to a hardware issue. If the database will not start up with the dsmserv command and you see errors like

ANR0151W Database manager fails to start.

Improper shutdown of instance produced orphaned IPC resources

And you see entries like this in the db2diag.log

I779399E357 LEVEL: Severe
PID : 18375 TID : 140737134171936PROC : db2star2
INSTANCE: tsmserv NODE : 000
FUNCTION: DB2 UDB, base sys utilities, DB2StartMain, probe:580
Trace facility mutex unexpected lock error

The problem is that some Inter Process Communication (IPC) tasks are still active and are locking out resources. You need to cancel the DB2 instance then remove the IPC resoures. Log on with the Tivoli Storage Manager server instance user ID and run the following command to stop the DB2 instance

or if necessary,

Check to see if there are outstanding IPC resources that are associated with the instance

ipcs -a

Clean up any IPC resources that are associated with the instance

-bash-3.1$ ipclean
ipclean: Removing DB2 engine and client's IPC resources for tsmserv.

Verify whether there are outstanding IPC resources that were not removed by the ipclean command.

ipcs -a

If there are outstanding IPC resources associated to the instance, use the following commands to remove them

To remove a shared memory segment resource
ipcrm -m shmid

To remove a semaphore resource
ipcrm -s semid

To remove a message queue resource
ipcrm -q msgid

Once all outstanding IPC resources have been removed, you should be able to restart the Tivoli Storage Manager server.

On Windows, a TSM server might fail to start after an upgrade with the error
ANR0188E Fail to start the database manager. Database manager license not found
The DB2 License has probably expired. To check this, log onto windows with your TSM Server instance userid, open a command prompt and type 'DB2CLP'. This will open another DB2CLP command window, and from this window enter the command 'db2licm -l'. If the licence has expired you will see

Product name: "DB2 Enterprise Server Edition"
Expiry date: "Expired"
Product identifier: "db2ese"
Version information: "x.x"

The DB2 license file is called db2ese.lic and was shipped with your TSM server package. Its location will depend on where you extracted the server package, but assuming you extracted to a directory d:\tsm_extract\, you will find it in the following directory, where DB2_x.x corresponds to your DB2 version number.
Navigate your command line to that directory, then you can register the DB2 license again with the following command

db2licm -a db2ese.lic
LIC1402I License added successfully.

Run the 'db2licm -l' command again and now you should see

Product name: "DB2 Enterprise Server Edition"
License type: "Restricted"
Expiry date: "Permanent"
Product identifier: "db2ese"
Version information: "x.x"

and you should be able to start your TSM Server instance normally.

back to top

AIX Maximum Number of Processes

You may see a database backup failing on an AIX server with an error like 'ANR2968E Database backup terminated. DB2 sqlcode: -2033. DB2 sqlerrmc: 292' If this error is not corrected the recovery log will fill up and crash the server. You may also see a message like 'Insufficient AIX system resource' in the db2diag log file.

The API error code 192 means that the API was unable to 'fork' or create a process to do its database backup. AIX has a parameter called maxuproc which limits the maximum number of processes that a user is allowed per user, and this value should be increased.

To see what value is set, use the command

lsattr -l sys0 -E | grep maxuproc

and to change the value use the command below, selecting a value that is suitable for your server.

chdev -l sys0 -a maxuproc='2048'

back to top

Moving a TSM database hosted on Windows

If you run your TSM server under Windows, then by default, your TSM database will be installed on the C: This can be an issue as the C: is usually reserved for system data, and system administrators prefer that your application data be stored on a different drive. It is possible to move the database using the DB2 utility 'db2relocatedb'. However db2relocatedb will not actually move your database, you have to do that yourself. What it does is update the pointers within DB2 so it knows where the database has been moved to.

First, just confirm that the database is allocated on the C:. Run the 'db2cmd' command from a Windows command line to open a DB2 command line, then run 'db2 list database directory' and within the command listing you should see a line

Local database directory = C:

Now, just incase something goes wrong, run a full TSM database backup

Stop the TSM Server with the HALT command. This should stop DB2 as well, but check and make sure it is down.

Copy your data to the new drive with the Windows copy command. Assuming you are going to the G:, this would be
copy c:\server1\* g:\server1\*
Yes, you could move it, but if you use copy, the original data is still on the c: incase anything unforseen happens - nothing wrong with having 2 backout plans! Take a look at the copy, and make sure the data and directory structure is as you expected

The db2relocatedb utility needs a configuration file to tell it what you are moving. The configuration file would normally be created in the same directory as db2relocateddb runs from and is usually called reloc.cfg. However you can create the file anywhere, in which case you need to specify the absolute path in the command example below.
Assuming you are using the default TSM database name and database instance then in our case, the config file will contain


Create this file then run the db2relocatedb command pointing it to the configuration file location.

db2relocatedb -f reloc.cfg

Now start your TSM server up again and check that all is well. Rerun that 'db2 list database directory' command and check that the server is actually running from the g: Once you are certain all is well you can delete the data off the c:

back to top