Backing up DB2 on AIX with TSM

Unlike the other major databases, there is no TDP for DB2, DB2 uses the TSM API. This makes DB2 backups a bit easier to configure and also means you don't need to buy a TDP licence. This page discusses DB2 backups on AIX clients. The process will differ on other platforms.

Configuring the Backups

For the sake of simplifying the text, this section assumes that you are going to define a DB2 backup on an AIX server called QPRD10, running on a TSM server called TSM1.

Defining client nodes at the TSM server

You will already have a TSM client defined on TSM1 to do the file level backups for QPRD10 and that will be scheduled as a normal node. You can't use that for DB2, simply because you have to schedule a command or script for DB2, so you need a different client. So what you need to do is create a TSM client which we'll call QPRD10_DB2.
You will need a DB2 policy set and management class, with a backup copy group with the parms set to 1,0,0,0. This lets DB2 manage backup retention. You will also need an archive copy group defined for the DB2 log archiving, which is scheduled and managed by DB2. The archive copy group retentions will be specific to your site. Define your QPRD10_DB2 client to the TSM1 server with the DB2 policy set.

Client definitions

The dsm.opt file

You will already have a dsm.opt file for your standard backups, so you need to create another one called something like dsm_db2.opt which contains the following.

The dsm.sys file

The parameters that control an AIX client are contained in a file called dsm.sys. Within this file, add an extra stanza below the one that manages your standard backups, so your dsm.sys file looks something like

The two node names must match the nodes defined at the TSM server. The webports must be unique and are used for communication with the dsmcad. You could let them default, but that makes debugging harder. If you define unique log names for the DB2 stanza then that separates out the DB2 messages from the standard client and again it makes debugging easier.

DB2 environment variables

The API uses three environment variables to point DB2 to the right locations for the TSM parameters. These are DSMI_DIR, DSMI_CONFIG and DSMI_LOG. To check to see if these variables are set, you need to log onto the AIX server with the db2 userid, lets call it db2user, or sudo to it from the root id like this . su - db2user Note the '-' in the middle. If you miss this out you will see the environment variables for root. One way to check the variables is to use the command env |grep DSMI Another way is to use the 'set' command. If the variables are missing you need to set them with the export command, and add them to the .profile dataset for that userid. Paths and files below are examples of course; use your own paths. If you are using the C shell then the commands are setenv, not export.

export DSMI_DIR=/usr/tivoli/tsm/client/opt/bin64
export DSMI_LOG=/path to your log files
export DSMI_CONFIG=/usr/tivoli/tsm/client/ba/bin64/dsm_db2.opt

File permissions

The TSM1_QPRD10_DB2 stanza in the example above has two log files, /path/tsm.db2.dsmerror.log and /path/tsm.db2.dsmsched.log where /path is the path to your logs. TSM will create these using the root userid so you must change the access to the files so DB2 can update them. It's considered good practice to make all your TSM log files read/write to everyone with chmod 666.

DB2 must also be able to read its dsm.opt file and the dsm.sys file. In general you do not want just anyone to be able to update these files, so owner write and everyone else read; or chmod 644 is considered appropriate.

Scheduling

First, you need to define a different dsmcad for DB2 to allow you to pick up the DB2 option file. The easiest way to do this is to add a symbolic link that defines a 'virtual' dsmcad for DB2 and that points to the real dsmcad. Use a command something like this

ln -s /usr/tivoli/tsm/client/ba/bin64/dsmcad dsmcad_db2

Now you need to start your dsmcad up, so you use a command like this. Substitute your own paths and filenames

nohup "/usr/tivoli/tsm/client/ba/bin64/dsmcad_db2 -optfile=/usr/tivoli/tsm/client/ba/bin64/dsm_db2.opt" > /dev/null 2 > &1 &

You will also need symbolic links in the API directory, usually /usr/tivoli/tsm/client/api/bin64 to point to the dsm.sys and dsm.opt files in the ba/bin64 directory

Finally you need to schedule it from the TSM server. You need to know the path and name of the DB2 backup script that has hopefully been defined by your DBA. You also need to know if your DBA has included "su db2user" commands inside the script, in which case you run the command as root. Otherwise you run it as the db2 user as shown in the second example below - your DBA will supply you with the correct username. Assuming the script is in /usr/local/scripts/db2backup.ksh, the command is something like

/usr/local/scripts/db2backup.ksh
  or
su - db2user -c "/usr/local/scripts/db2backup.ksh"

In the second example, the '-' means run the command as db2user but don't inherit the environment settings from root. The "-c" means switch back to the root userid when the command completes.

DB2 Archiving

DB2 copies its archive logs off to TSM using the API, and as this is not controlled by TSM it is quite difficult to work out what is happening. There is usually nothing on either the TSM server or the TSM client logs to indicate that archiving is working, and usually no error messages if it fails. You need to define an archive copygroup in the DB2 management class, otherwise archiving will fail. You would do this using a TSM command like the one below, with suitable names that fit your site's conventions. Note that you specify RETVER=NOLIMIT, as the deletion of expired archives is controlled by DB2. For this to happen successfully you need to set archivedelete=yes on the client definition at the TSM server.

DEFINE COPYGROUP domain_name policyset_name mgmtclass_name TYPE=ARCHIVE DESTINATION=archive_stgpool_name RETVER=NOLIMIT

IBM recommends that you make archive logs go to a different storage pool that the DB2 database backups. You also need to tell DB2 to use the TSM interface, and while this would normally be done by your DBA, for completeness. the DB2 command is shown below. If you use the default management class for your archive copy group, then the ': mgmtclass_name' parameter is not required. A DB2 restart will be needed to pick the changes up.

db2 update db cfg for db_name using logarchmeth1 TSM: mgmtclass_name

You can find out if TSM is working by running the following select statement twice, an hour or so apart. If the number of archived objects is changing then that indicates that archiving is happening. Alternatively, you can just check the archive dates.

select class_name,count(class_name) from archives where node_name='your_db2_nodename' group by class_name
  or
select class_name,archive_date from archives where node_name='your_db2_nodename'

A more direct way is to look at the DB2 archive log, if you know where it is. There will be an archive log for every DB2 instance on each server. As far as I know, the path below is not site specific but is the DB2 installation default. If you don't know what your DB2 instance is called, take a look in db2home and pick the one that was updated most recently. Otherwise, try ps -ef |fgrep -i db2sysc to see what instances are active. The archive log can be found in -

/db2home/instance_name/sqllib/db2dump/db2diag.log

DB2 LAN free backup considerations

See the LAN free section for a general description of LAN free backups. LAN-free will speed up backups of DB2 databases, but is not suitable for smaller files like archive logs. To segregate the two, you need to define two TSM clients, one for backups, and one for archives. You would then need two different stanzas in the dsm.sys file (Unix) one for each client, and set up the backup client with LANFREE YES, but the archive client with LANFREE NO.
The backup data must be sent to a management class that is LAN free capable and this must be defined for every type of backup object. Recent releases of DB2, certainly from DB2 9.7 onwards, have several types of object, and the backups of each object must be directed to a LAN free management class.
The DB2 object names and paths are

Object DB2 object name
Full database recovery object /dbname/NODEnnnn/FULL_BACKUP.timestamp.seq_no
Incremental database recovery object /dbname/NODEnnnn/DB_INCR_BACKUP.timestamp.seq_no
Incremental delta database recovery object /dbname/NODEnnnn/DB_DELTA_BACKUP.timestamp.seq_no
Full table space recovery object /dbname/NODEnnnn/TSP_BACKUP.timestamp.seq_no
Incremental table space recovery object /dbname/NODEnnnn/TSP_INCR_BACKUP.timestamp.seq_no
Incremental delta table space recovery object/dbname/NODEnnnn/TSP_DELTA_BACKUP.timestamp.seq_no

where dbname is the database alias name, and NODEnnnn is the node number.
Corresponding TSM include statements are

include /*/NODE????/FULL_BACKUP.* LANFREE_MC
include /*/NODE????/DB_INCR_BACKUP.* LANFREE_MC
include /*/NODE????/DB_DELTA_BACKUP.* LANFREE_MC
include /*/NODE????/TSP_BACKUP.* LANFREE_MC
include /*/NODE????/TSP_INCR_BACKUP.* LANFREE_MC
include /*/NODE????/TSP_DELTA_BACKUP.* LANFREE_MC

If you use TSM to store your DB2 backup data, then you can use a utility called db2adutl to query, extract, verify, and delete the recovery objects. As it can delete backup objects, it is best run by a DBA, but you can find it on Linux and UNIX servers in the sqllib/adsm directory, and on Windows operating systems, it is located in sqllib\bin. For snapshots, use the db2acsutil utility located in sqllib/bin.

Some common problems

DB2 Script

Make sure your DB2 backup script is set as executable (chmod 755 is usually considered appropriate). If it is not executable you will get a error message like
"Scheduled Event 'xxxxx' failed return code 126."

TSM server password not set

If you don't set the TSM password for your DB2 stanza (by logging into it with dsmc -se=TSM1_QPRD10_DB2 and setting the password) then you will get an error like
ANS1503E Valid password not available for server 'TSM1_QPRD10_DB2'
The administrator for your system must run TSM and enter the password to store it locally

However I've seen this error happen when the server password definitely was set. The problem was that the filespace that was mounted with the nosuid parameter set, which meant that password worked for root only. Solution - get the filespace mount changed.

DB2 SQL2025N RC15 error

A DB2 backup is failing with an SQL2025N RC15 error. The would be accompanied with errors in the db2diag.log like: MESSAGE : Error in vendor support code at line: 1335 rc: 157.

The problem could be that TSM is using simultaneous write and this has hit a problem. Simultaneous Write is a TSM feature which is used to write backup data into both the primary and copy stgpools at the same time while a backup is running. Simultaneous Write can be set to recover from errors if a COPYCONTINUE parameter is set to 'YES'. If there is an error with a Copy Pool, then TSM will remove that failed pool from its in-memory list of copy pools and try to continue the backup operation without the simultaneous write. TSM then sends a RC=15 retry code back to the client, which basically tells it that the data write failed, and invites it abort the current transaction and then resend the data. The problem is that DB2 does not have a retry capability.

You would typically see this problem if the backups are going direct to tape, and not enough tape drives are available. The backup will require 2 tape drives for each active session and if enough drives are not available, this would cause an rc=157 error.

DB2 errors

RC 406

If the backup is not working but there is nothing in the tsm logs then you need to look at the DB2 logs. These are very wordy and difficult to decipher, an example is

If you can find a DB2 API guide and look up RC=11 it just tells you that DB2 has had an initialisation error. The real error is that 406 in the hex dump and this comes from the TSM API. So now you need the TSM API codes, and they can be found in http://publib.boulder.ibm.com/tividd/td/TSMC/GC32-0793-02/en_US/HTML/ansa0000139.htm
A return code 406 means that DB2 cannot find the dsm options file. There are at least three possible solutions
you have not allocated the dsm.opt file
There is no symbolic link from the api directory to the ba directory
The DB2 environment variable DSMI_CONFIG is not set correctly.

RC 106

A return code 106 is usually a permissions error on a file.
From the db2diag.log

The API return code 106 means DSM_RC_ACCESS_DENIED to a file. This is often the dsmerror log and DB2 needs write permission to that log. If you have an ERRORLOGNAME specified in the dsm.sys file, then DB2 writes errors to this file, otherwise it uses the value in the DSMI_LOG environment variable. To find out where this variable is pointing, use the following commands:

ps -elf | grep -i "db2sysc" | grep -i "<instance owner>"
ps eww <pid of db2sysc from the output above>

The output from the second command will return all the variables under which DB2 is running. If you change the DSMI_LOG variable the DB2 needs to be restarted to pick it up.

RC 0168

You might see an error message like 'ANR4588E A database backup configuration might be incorrect' in the TSM server or client log, associated with 'MESSAGE : Error in vendor support code at line: xxx rc: 168' in the DB2 log.
A return code 0168 means that DB2 cannot find the password file. Most likely you have not set the password for the TSM DB2 client, or just possibly, the file system is mounted with nosuid set.

RC 186

A similar DB2 error, 'TSM RC=0x000000BA=186' means that no archive copy group is present. You might see this error on a DB2 archive. The resolution is simply to define a suitable copy group into the correct management class.
The problem is that these errors are not always easy to spot. Sometimes all you get is messages like the extract from the DB2 archive log below, which just shows that it is having a problem with media. You need to dig into the DB2 log to find the TSM API return code above.

Problems with excludes

If you see an error like this (from Windows - other operating systems will differ a bit)

SQL2062N An error occurred when accessing media
"C:\PROGRA~1\SQLLIB|bin\db2tadsm.dll". Reason code: "185:

This means that there is an exclude statement somewhere which is excluding DB2 databases from backups. Check that the domain that contains your database file is included in the DB2 stanza, or that you are not using global exclude statements like exclude \...\*

Problems with DB2 Restores to an Alternate Machine

When working with DB2 restores to an alternate machine using the Tivoli Storage Manager client, some of the configuration must be set within the DB2 database. For example, the TSM_NodeName, TSM_Password and the TSM_OWNER.

The TSM_OWNER value is case sensitive. When this OWNER value is sent to the Tivoli Storage Manager Server during the restore, if it does not exactly match the OWNER value on the Tivoli Storage Manager Server for the backup, the restore will fail. Tivoli Storage Manager uses this OWNER value to determine if the user has permission to the object. Thus if the OWNER does not exactly match, the Tivoli Storage Manager Server will deny access to the object and the restore will fail with "object not found".

back to top