TSM Administration Hints

Node Replication

The TSM team introduced a new function in version 6.3, the ability to replicate TSM nodes, or client backups, between TSM servers. This means that you can have two active copies of your backups in two different locations, giving you an instant ability to recover servers in a disaster. You can also recover data from the replicated site to the live site.

The easiest way to manage this is to combine similar nodes into node groups, so you work with groups of nodes rather than hundreds of individual nodes.
Assuming you have two servers called PROD_SERVER and DR_SERVER with IP addresses 200.100.100.0 and 200.100.100.50, you would set replication up like this:

On the DR TSM server, define the production TSM server:

define server PROD_SERVER SERVERPA=password hla=200.100.100.0 lla=1500
set replserver PROD_SERVER

On the production TSM server, define the DR TSM server:

define server DR_SERVER SERVERPA=password hla=200.100.100.50 lla=1500
set replserver DR_SERVER

Create a TSM node group on both TSM servers and add some nodes to it

define nodegroup repl_group desc="Node Group for Replication"
define nodegroupmember repl_group node1
define nodegroupmember repl_group node2
define nodegroupmember repl_group node3

On the production TSM server, enable replication for each node:

update node node1 replstate=enabled
update node node2 replstate=enabled
update node node3 replstate=enabled

On the production TSM server, add an admin schedule to run the replication:

define schedule replicate_nodes type=admin cmd="replicate node repl_group wait=yes" desc="3 Hourly Node Replication Schedule" dayofweek=any period=3 perunits=hours active=yes starttime=00:00

Now all 3 nodes will replicate their backup data to the DR server every three hours. You can check the progress of replication with the Query Process command.

Note that if the nodes are deduplicated then the 'Amount Transferred' represents the deduplicated bytes sent. Also, if you keep issuing the Query Process command you will notice that the replication statistics keep changing. This is normal. When older releases of TSM started processes like replication, they would sit for a long time while they worked out what to do, which could be a bit worrying as it looked like the process had hung. Now, TSM batches the work up and starts processing as soon as it determines the first batch. A filespace can be in one of four states:

Identifying: the server is still examining the objects in the file space and has not yet dispatched any batches for replication.
Identifying and Replicating: the server is still examining objects (Identifying) but is also moving objects (Replicating).
Replicating: the server has finished examining the objects in the file space, so the identification counts will stop increasing.
Complete: the server has finished the file space.

Sometimes replication can time out due to problems with external network components such as switches, routers and firewalls. If you are running TSM 6.3.4.200 you could also investigate the KEEPALIVE function as described in IBM TechNote 1642715.

If you try to use the QUERY OCCUPANCY command to check that your replication is working, by looking to see if your target node has the same amount of data as your source node, then you will get confusing results. This is because the QUERY OCCUPANCY command includes data residing in copy storage pools and active-data storage pools, and this data is not replicated. You would need to use a SELECT command that just looks at the file count and data in the primary pools, something like this:

select sum(a.NUM_FILES) as "# of files", sum(a.LOGICAL_MB) as "Logical(MB)" from occupancy a, stgpools b where a.stgpool_name=b.stgpool_name and b.pooltype='PRIMARY'

By default the REPLICATE NODE command uses the "DATATYPE=ALL" parameter. If you use a different value for this parameter, then you will need to adjust your SELECT command by adding the 'type' column to the 'where' clause against the occupancy table.
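A shell sketch of that comparison, assuming you have captured the comma-delimited output of the SELECT from each server into a file. The file names, and the dsmadmc invocation shown in the comment, are illustrative only:

```shell
# In practice each file would be produced with something like:
#   dsmadmc -se=PROD_SERVER -id=youruserid -pa=yourpassword -comma -dataonly=yes \
#       -out=prod.csv "select sum(a.NUM_FILES), sum(a.LOGICAL_MB) from ..."
# Here we mock up the one-line result for demonstration.
printf '1523401,879234.50\n' > prod.csv
printf '1523401,879234.50\n' > dr.csv

# If the file count and logical MB match, replication has caught up.
if diff -q prod.csv dr.csv >/dev/null
then
    echo "source and target occupancy match"
else
    echo "occupancy differs - replication may be behind"
fi
```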

back to top


Using the TSM command line

UNIX Server Command Line

To start a TSM server command line in AIX, you use the dsmadmc command with an optional server name. The command line is useful if you want to do bulk updates, as you can then do them with a script. For example, if you want to register 10 new nodes, rather than go through the tedium of adding 10 nodes through a panel, it is much easier to set this up as 10 commands within a UNIX script, then just execute the script. A typical command could look like this:

dsmadmc -se=servername -id=youruserid -pa=yourpassword register node nodename nodepassword passe=9999 con=\"This server contact details\" do=policy-set clo=cloptset userid=none comp=no archdel=no backdel=no etc.

You obviously need to substitute your own values for the parameters, and note how special characters like quotes are escaped with a backslash.
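A minimal sketch of such a bulk-registration script, building the ten register commands into a TSM macro file. The node names, password and option values here are placeholders; substitute your own:

```shell
# Build a macro file containing ten register node commands.
# Review it, then run it with:
#   dsmadmc -se=servername -id=youruserid -pa=yourpassword macro register_nodes.mac
: > register_nodes.mac
for i in 01 02 03 04 05 06 07 08 09 10
do
    echo "register node node$i tempPassw0rd passe=9999 do=policy-set clo=cloptset userid=none comp=no archdel=no backdel=no" >> register_nodes.mac
done
cat register_nodes.mac
```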

Other dsmadmc options

The command can also be very useful if you want to start up a session to a TSM server in different modes. For example, if you want to run some queries to feed into an Excel spreadsheet or another program, and you do not want the screen titles displayed, you would use the following. -comma means comma-delimited output, and the -dataonly switch suppresses the headers.

dsmadmc -se=servername -dataonly=yes -comma

An alternative option is -tab for tab-delimited output. These commands can be usefully augmented with the -out parameter, which writes the console output into an external file. For example, if you start a server session as below, then the output of any commands you issue will be placed in sqlout.txt

dsmadmc -se=servername -dataonly=yes -comma -out="c:\program files\tivoli\tsm\sqlout.txt"

Three other useful options are

dsmadmc -con
dsmadmc -mount
dsmadmc -noconfirm

The first one starts a server session in console mode, so all the console messages automatically scroll past your terminal. This is useful if you are waiting for a process to complete, as it saves you from repeatedly typing q actlog or q pr commands.
The second option is similar, but instead of displaying all the console messages, it just displays tape mount messages. This one was intended for a media librarian who had to mount tapes manually.
The third option suppresses the 'do you really want to do this' type messages; for example, if you run a query that will produce a lot of output, you will see a message like 'Do you wish to proceed? (Yes (Y)/No (N))'. -noconfirm will suppress these messages, but obviously you need to use it with care as the messages can be very useful. For example, if you start to delete objects, the confirmation message gives you a second chance to check that you are not deleting items by accident.

dsmadmc with Storage Agents

If you are logged into a client that is running a Storage Agent called ClientA_Sta you would log into the agent with the command

dsmadmc -se=ClientA_Sta

and then you can run a limited command set from the Agent's command line.
To get out of the Storage Agent, type 'quit'. If you type 'halt', you will stop the storage agent and then exit.

Logging into a remote TSM server

With Windows servers, you can log into any remote TSM server, provided you have the dsmadmc command code loaded with your TSM client, and provided your firewall rules permit remote access. If you have a remote server on a Windows box with IP address 99.245.24.123 and with a DNS name of WSTSM01, you can start a session by navigating to \program files\tivoli\tsm\baclient and running either

dsmadmc -tcps=99.245.24.123
or
dsmadmc -tcps=WSTSM01

If you get a message like "dsmadmc not found" then you've got a basic client install and you need to install the server management component.
This works on the IP address of the Windows box that is hosting TSM. What if you are running more than one TSM server on that box? By default you will get TSM server 1. To reach the other servers you need to add a -tcpp switch that specifies the port of the other server:

dsmadmc -tcps=WSTSM01 -tcpp=1502

Client command line or GUI?

On the client side I used to prefer the command line, simply because you get much more control over backups and restores. In fact on a UNIX machine, the command line is probably the best option. If your client has multiple server stanzas, then you can invoke each stanza with the servername parameter. For example, if you define a stanza for Oracle RMAN and use a servername of oracle-backup in it, then you would start a client TSM session like this:

dsmc -se=oracle-backup

On Windows clients, I find the GUI by far the best for doing restores, and probably the command line for doing anything else, but it's all a matter of personal taste.

back to top


Running processes in parallel

The Administrator's Guide warns that running certain processes simultaneously might lead to process termination, and recommends avoiding the situation if possible. If you run data movement or delete operations, like pool migration, at the same time as data access operations like storage pool backup, they can contend, and one process may terminate early yet report completion without errors.
You need to run such processes serially to avoid this happening. Examples include running BACKUP STGPOOL alongside RECLAIM STGPOOL, and running COPY ACTIVEDATA alongside RECLAIM STGPOOL.

back to top


Using the dsmcutil command

On a Windows client you can use the dsmcutil command to add and remove scheduler services from TSM - much faster than using the TSM GUI wizard. To install a standard scheduler service, use

dsmcutil inst /name:"TSM Scheduler Service - Z095XFSU1" /optfile:"C:\Program Files\Tivoli\TSM\baclient\dsm.opt" /node:z095XFSU1 /password:xxxx /autostart:yes /startnow:yes

The command reads the schedule and error log file names from the options file.
To remove a scheduler service, use

dsmcutil remove /name:"Tivoli Integrated Portal - TIPProfile-Port-16310"

You can also work with services using the Windows sc.exe program. For example,

sc query | FIND "TSM" /I

sc query lists all services, then pipes the result into a find command to list only the services whose names contain TSM; the /I switch means ignore case. Once you have the service name from the query command, you can use other sc commands on that service, for example

sc start "Service-name"
sc stop "Service-name"
sc delete "Service-name"

back to top


Using ASNODE for backups and restores

TSM backups often use Proxy Nodes, where a 'master' client stores data on behalf of a number of normal clients. You often see this with VMware, GPFS and cluster backups. Backups are assigned to the proxy node by using the ASNODE parameter in a backup, either by using an -asnodename=proxynodename parameter in the schedule definition, or by starting a dsmc command line with the parameter '-asnode=proxynodename'. For this to work, someone must have already issued a 'grant proxynode' command that gives the agent node authority over the target node's backups.

If you use the asnodename option to back up a client, then be aware that parameters like TXNGROUPMAX apply to the proxy node, not the target node.
For example, suppose a target node NODE247 is related to PROXY1 as its agent node and you want backups to use a TXNGROUPMAX of 12288. If you set TXNGROUPMAX on NODE247 to 12288, but leave PROXY1 at the default of 4096, then the backup will only batch up 4096 objects in a transaction. You need to update the proxy node instead: 'update node PROXY1 TXNGROUPMAX=12288'

While scheduled TSM backups are normally taken using the root or an admin user, it is possible for any user to back up and restore files, provided that user can access those files. This is controlled by an OWNER field in the backups table, so if I ran a backup of my home drive with my CSISJA userid, the owner field would be set to OWNER: CSISJA. TSM would then know that I 'owned' those backups, and would let me restore them.

However, if I was to try that backup using the asnode parameter with command

dsmc selective /home/csisja/* -asnode=proxy1

then this process breaks down as the owner field is set to OWNER: *. If I then tried to query those backups from my CSISJA userid I would not see them and I would not be able to restore them as I was not the owner.

This is 'working as designed'. The reason given is that the purpose of the ASNODE parameter is to allow any client in a proxy node group to be able to access and restore the data from any other client in that group, and so individual file ownership is ignored. It should be possible to process these files using the fromnode and fromowner parameters as shown below

dsmc query backup /home/CSISJA/* -fromnode=proxy1 -fromowner="*"

back to top


Setting up an Enterprise Server

Controlling and managing several TSM servers with hundreds of clients can be a headache, if only when working out which client is managed by which server. TSM offers the Enterprise Configuration facility to make this easier. This lets you create and distribute settings and configurations from one configuration TSM server to several managed TSM servers.

You start with the TSM server that is going to be the configuration server.

From the Object view, select Server - Server Groups, then 'Define New Server Group' from the drop down menu. Give this group a meaningful name and description, add it, then select 'Server Group Members' and add the TSM servers that you want to be associated with this group.

On that server GUI, open the Operational View, choose the Configuration Manager Operations option and select 'Establish this server as a configuration manager', set the configuration manager drop-down to 'ON' and press the Finish button (or use the console command SET CONFIGMANAGER ON).

Next, you need to select the 'managed server operations' menu. The configuration manager works on profiles, which collect together the objects that are to be managed, so they can be managed with a single command.

You select the 'define a new Configuration Profile' option from the drop down menu, then you will be presented with a list of objects that can be managed by this profile. The available objects are:

  • Managed administrators
  • Managed policy domains
  • Managed command schedules
  • Managed scripts
  • Managed option sets
  • Managed server definitions
  • Managed server groups
  • Subscribers

You can then define which of these objects will be managed centrally. However, it is not quite that simple: if, for example, you decide to centrally manage all your command schedules, then they need to be identical on every server in the group. Installing an Enterprise Configuration is a good opportunity to clean up and standardise your TSM estate.

Once you have the configuration manager set up, you then need to log onto each of your managed servers and subscribe each one to the profile that you just created. Suppose you called it WINDOWS-PROD and your Configuration Manager server was called CONF-MAN; then you would issue the command

define subscription WINDOWS-PROD server=CONF-MAN

What does this all mean?

Suppose that you defined a server group called WINDOWS that contains all your Windows TSM servers. You can now send commands to all your Windows TSM servers by prefixing your command with WINDOWS:

WINDOWS: Q DB
WINDOWS: upd stg cartpool reclaim=40

Suppose you defined your WINDOWS-PROD profile to control administrators, client schedules and admin schedules. These are now all centrally managed by the CONF-MAN server, and you can only change them on your managed servers by applying the updates to CONF-MAN, then distributing the changes to the managed servers by issuing the command

NOTIFY SUBSCRIBERS PROFILE=*

from the CONF-MAN server. This both simplifies management and maintains consistency across all TSM servers.

back to top


Which IP ports does TSM need open on a firewall?

This is a list of listener ports that various Tivoli Storage Manager products expect other hosts to connect to, and may be useful in firewall setup. This applies to Tivoli Storage Manager version 6.2.2 and higher

Tivoli Storage Manager server configurable listener ports

Operating system ports which the Tivoli Storage Manager server might use

Tivoli Storage Manager client configurable listener ports

(*)WEBPORTS: If a firewall is used with the dsmcad scheduler and/or webclient, the WEBPORTS option MUST be manually set.

Tivoli Storage Manager Administration Center version 6.1 listens on http ports 9043 or 9044.

Tivoli Storage Manager Administration Center version 6.2 listens on http port 16310 and https port 16316. Connecting to 16310 simply redirects the connection to the https port.

back to top


To compress or not to compress?

Client compression will use up CPU cycles on your client server. However, once the data is compressed, it will use less network resource and will also reduce the I/O pressure on your TSM server. This used to be a big plus point in the days of restricted bandwidth. These days, with Gigabit Ethernet, it's probable that the CPU consumption on the client actually outweighs any benefit you might get from better network usage.
If your data is a good compression candidate (big Oracle databases can compress down to 20% of original size) then you might get a benefit from client compression. If your clients are CPU constrained, and you have a good network, then your backups will probably run slower with client compression.
You should also not use compression if your client file systems already compress files. If compression would make a file bigger than it was before, which happens if a file is already compressed, then the file transfer will fail and will be retried without compression. If this happens for lots of files you will suffer a real performance hit. You then have three options: don't use compression; set the COMPRESSALWAYS option to Yes, so that a file is compressed and transmitted even if it grows with compression; or just exclude problem file types from compression with EXCLUDE.COMPRESSION statements as explained below.
My impression is that most people run with client compression off these days.

If you are network constrained and want to try compression, then configure it for one client of a given type (a database server, for example) and monitor the backup times and data transfer rates to see if it speeds up your backups.
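As a back-of-envelope sanity check before you test, you can compare the network-bound time against the compression-bound time. All four figures below are invented for illustration; substitute your own measured numbers:

```shell
# data_gb: data to back up; net_mbs: network throughput in MB/s;
# comp_mbs: client compression throughput in MB/s; ratio: compressed/original size.
awk -v data_gb=500 -v net_mbs=100 -v comp_mbs=60 -v ratio=0.3 'BEGIN {
    data_mb = data_gb * 1024
    plain = data_mb / net_mbs                 # no compression: network-bound
    comp  = data_mb / comp_mbs                # compression stage (CPU-bound)
    send  = (data_mb * ratio) / net_mbs       # reduced data on the wire
    with  = (comp > send) ? comp : send       # stages overlap, the slower one dominates
    printf "no compression: %.0fs  with compression: %.0fs\n", plain, with
}'
```

With these made-up numbers compression loses (8533s against 5120s) because the client CPU, not the network, is the bottleneck, which matches the Gigabit Ethernet point above.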

back to top


Compression statistics

How do you get compression statistics for a given backup session?

From the client side, when the backup completes, look at the SCHEDLOG (normally dsmsched.log) on the client and page down to the last paragraph. This will give (among other things) compression statistics for the backup in the line 'Objects compressed by n%'.

From the server side, issue command 'q actlog' and look for the lines starting ANE4968I. These give you compression stats for each client.

Note that when you look at space occupied by backups, you see two numbers, %utilised and %reclaimable. These two numbers do not add up to 100%. This is because reclaimable space includes 'holes' within aggregates, whereas utilised space treats aggregates as intact units.

back to top


Selective Compression by Directory

If you want to compress some parts of a filespace but not others, maybe because some directories contain large files which should compress well, then try the following in the dsm.opt file

COMPRESSION ON
exclude.compression */.../*
include.compression */.../compressed-dir/.../*

back to top


Using Directory Container Storage Pools

IBM introduced a new type of storage pool in 2015 called the container storage pool. These pools were designed specifically to assist with data deduplication, as data can be deduplicated either at the client end or as it enters the TSM server. This is more efficient than having to deduplicate the data with a post-process.

Container storage pools come in two variants, Directory Containers and Cloud Containers.
Directory containers combine some of the features of disk and sequential pools and try to avoid the disadvantages of both. For example, there is no need to run reclamation, and the pool is no longer a fixed size. A disadvantage is that you need to learn a new command set to manage them, as traditional commands like BACKUP STGPOOL, EXPORT, IMPORT, GENERATE BACKUPSET, MOVE DATA, MOVE NODEDATA and MIGRATE STGPOOL do not work with containers.

Directory based storage pools are defined with the command

DEFINE STGPOOL poolname STGT=DIRECTORY

The DEFINE STGPOOL command has lots of new parameters for directory containers, mainly to do with using the Protect Storagepool function and the maximum size that the pool is allowed to grow to.
Some of the admin commands specific to container pools are described below.

The MOVE CONTAINER command does not move a container; it moves all the data from one container to another. It creates a new container for the move, so you must have enough free space in the pool to create a container the same size as the source container. Be aware that the QUERY STGPOOL command shows the percentage of free space within a storage pool, but this includes any free space within the containers. So if a pool is 100GB and QUERY STGPOOL shows that the pool is 75% utilised, that does not mean there is room for a new 25GB container.
Try the SHOW SDPPOOL command instead and look for the FsFreeSpace entry. This will show you how much free space exists in the file system.

There is no DELETE CONTAINER command; containers are deleted automatically once all of the data in a container expires or is moved out, and the REUSEDELAY period on the storage pool is exceeded.

There is no BACKUP STGPOOL command for directory-container pools; the PROTECT STGPOOL command is used instead. This command uses replication to copy the container data to a target server. You need to combine it with the REPLICATE NODE command to fully protect your backup data.
The PROTECT STGPOOL command should be run before the REPLICATE NODE command, as it can repair any damaged extents in the data and will make replication run faster.

If a container is damaged, you can use the AUDIT CONTAINER command to recover or remove data. The REPAIR STGPOOL command can be used to recover damaged data extents from a replication pair.

back to top


Using Cache on Disk Storage Pools

It is possible to speed up recoveries by keeping a backup copy on disk, after it has been migrated to tape. This is called disk caching. By default, caching is disabled on storage pools so the backups on disk are deleted as soon as they are migrated to tape. You need to enable disk cache by specifying CACHE=YES when you define or update a storage pool. Then the backup copy on disk will be kept until space is needed in the disk pool for new backups. Disk cache is useful for pools that have a high recovery rate.

The disadvantage of disk cache is that backups will take longer, as the backup operation has to clear space in the disk pool before it can copy a new backup to disk. Disk cache will also increase the size of the TSM database, as it needs to track two backup copies, one on disk and one on tape.

If you run a query stgpool command

Q stgpool backuppool

then in the output, the Pct Util (utilised space) value includes the space used by any cached copies of files in the storage pool, while the Pct Migr (migratable space) value does not include space occupied by cached copies.

Storage pool migration triggers on the "Pct Migr" value, not the "Pct Util" value shown in the query stgpool output. This can cause confusion, as a storage pool can appear to be full when most of the data is cached. You may then expect automatic migration processes to run, but they will not run until the "Pct Migr" threshold is reached.
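As an illustration of how far the two percentages can diverge (the figures below are invented): a 100GB pool holding 40GB of un-migrated data plus 50GB of cached, already-migrated copies looks nearly full but is barely migratable:

```shell
# Invented figures: 100GB pool, 40GB live (un-migrated) data, 50GB cached copies.
awk -v pool=100 -v live=40 -v cached=50 'BEGIN {
    printf "Pct Util: %.0f\n", (live + cached) / pool * 100   # counts cached copies
    printf "Pct Migr: %.0f\n", live / pool * 100              # ignores cached copies
}'
```

This prints Pct Util 90 but Pct Migr 40, so a migration threshold of, say, 70 would never fire even though the pool appears 90% full.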

It is not possible to duplex storage pools, that is, to write data to two pools simultaneously. If you want a second copy of a pool you need to schedule the BACKUP STGPOOL command

backup stgpool primary_pool_name backup_pool_name

This command only copies new data to the backup pool, that is, data written to the primary pool since the last backup command ran. If you want to know which volumes will be needed for a backup, you can run the command with the preview option.

backup stgpool primary_pool_name backup_pool_name preview=volumesonly

TSM will write out the required volume names to the activity log. Search for message ANR1228I to find them.
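If you capture the activity log to a file (for instance with the -out option described earlier), the volume names can be pulled out with grep. The message lines below are mocked up to show the idea; the real ANR1228I wording may differ slightly:

```shell
# Mocked-up activity log extract, standing in for:
#   dsmadmc -se=servername -id=youruserid -pa=yourpassword -out=actlog.txt "q actlog"
cat > actlog.txt <<'EOF'
ANR1228I Removable volume VOL001 is required for storage pool backup.
ANR1228I Removable volume VOL002 is required for storage pool backup.
ANR0985I Process 42 completed with success state.
EOF

# Print just the volume names from the ANR1228I messages.
grep 'ANR1228I' actlog.txt | awk '{print $4}'
```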

back to top


How TSM allocates space on disk and tape

The amount of physical space used by TSM when saving files depends on the block size used. A small file will always use a minimum of one full block. Random access disk pools use a 4K block size; sequential access files on both disk and tape use 256K blocks, so these are the minimum file sizes. This means that a 2K file will occupy 4K in a disk pool, then 256K when it gets migrated to tape. Quote from IBM: "This will be true for each object saved". Any object bigger than 4K will use a whole number of 4K blocks; for example a 21K object will use six 4K blocks, or 24K of disk space. The unused space at the end of the last block cannot be used for anything else.

LTO tapes have a minimum space requirement on top of this, called a dataset. For LTO1 and LTO2 this is approximately 400KB, and for LTO3 and LTO4 it is approximately 1.6MB. Data is written out to LTO when a transaction ends, and this does not necessarily correspond to a single object; it would be a concatenation of all the objects written in that transaction. So if a single 5K object is written to an LTO4 tape, that object will use the minimum dataset size of 1.6MB on the tape. However, if a number of objects totalling 2.4MB are streamed to the tape, for example as part of migration, then TSM will send them in ten 256K blocks and the LTO dataset would use 2,560K, the size of those ten blocks.
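The rounding described above can be sketched as a small shell helper. Sizes are in KB, and 1600KB stands in for the approximate 1.6MB LTO4 dataset minimum quoted above:

```shell
# Round an object size up to whole blocks. Usage: blocks_used <object_kb> <block_kb>
blocks_used() {
    echo $(( (($1 + $2 - 1) / $2) * $2 ))
}

blocks_used 2 4       # 2K file in a 4K random-access disk pool -> 4
blocks_used 21 4      # 21K object -> 24 (six 4K blocks)
blocks_used 2 256     # the same 2K file on sequential media -> 256

# LTO4 additionally enforces a ~1.6MB (here 1600KB) minimum dataset per transaction:
kb=$(blocks_used 5 256)
if [ "$kb" -lt 1600 ]; then kb=1600; fi
echo "$kb"            # a lone 5K object still costs 1600KB of tape
```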

back to top


Password handling in Unix shell scripts

How can you include a dsmadmc command in a UNIX shell script without advertising a name and password by including them in the script, i.e. 'dsmadmc -id=xxxxxx -pa=xxxxxx select ....'?

One way is to set up a 'password file' and have your shell script read it. You would do this by reading the userid and password into variables from the file, then using the variables in the command

user=$(head -1 /var/tmp/dsmaccess)
pass=$(tail -1 /var/tmp/dsmaccess)

dsmadmc -id="$user" -pa="$pass"

You then protect the password file so only root can read it.

If you only want to run queries from your scripts, then another way is to define a TSM userid, let's call it QUERY, which can only issue queries. You can then hardcode the userid and password in your scripts, as there is little security risk in others using it. The problem with this method is that if you have to change the QUERY password, you have to change all your scripts. Holding the details in a single file is the better option.
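Putting the pattern together - the file location and its layout (userid on line one, password on line two) are just conventions for this sketch:

```shell
# One-time setup: create the password file, readable by its owner only.
printf 'query_admin\nS3cretPw\n' > /var/tmp/dsmaccess
chmod 600 /var/tmp/dsmaccess

# In each script, read the credentials back instead of hardcoding them:
user=$(head -1 /var/tmp/dsmaccess)
pass=$(tail -1 /var/tmp/dsmaccess)
echo "would run: dsmadmc -id=$user -pa=<hidden> ..."
```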

back to top


MAXSESSIONS and MAXSCHEDSESSIONS

TSM limits the maximum number of sessions it will run at any time, to try to prevent performance issues caused by running too many tasks at once. This is controlled by the MAXSESSIONS server option, and the default value is quite low at 25 client sessions. If you have it set too low you will see errors like

ANR0429W Session xxx refused maximum server sessions exceeded.

You can update this server option without stopping and restarting the server by using a SETOPT command like this

SETOPT MAXSessions 100

The value you would use depends on how many sessions you are trying to run and how much resource you have available. There is no definite rule here; it very much depends on your environment. IBM states that 'the maximum value is limited only by available virtual storage size or communication resources'. So how do you know if you have increased it too far and are getting performance issues? Check that your database cache hit ratio is at least 98% by running the q db f=d command.

There are two types of session: scheduled and unscheduled. Unscheduled sessions include things like restores and retrieves. You can limit the number of sessions available to scheduled operations, to ensure that unscheduled operations can always run, using the SET MAXSCHEDSESSIONS command followed by a percentage of MAXSESSIONS. If maxschedsessions is set too low, it may be a better option to increase it rather than maxsessions.

You may often see the number of active client sessions exceeding the maxschedsessions value. A scheduled session is a type 5 session, a generated session is a type 4 session. When a node starts a scheduled backup to the storage manager server, only the initial session is opened as a type 5 session. All other sessions opened as part of the node's scheduled operation are of type 4. Only type 5 sessions count when calculating the number of scheduled sessions.
To find out what MAXSCHEDSESSIONS is set to, run the 'SHOW CSVARS' command. Output looks like

ANS8000I Server command: 'show csvars'
Max % Scheduled Sessions : 50
Max Scheduled Sessions : 100

The command output shows that there are 100 type 5 sessions available for scheduled backups. The 'SHOW SESSIONS' command will then tell you what type of sessions are active. Output typically looks like

ANS8000I Server command: 'show sessions'
Session 73568: Type=Node, Id=TSM_P14312PRW120_ORA
Platform=WinNT, NodeId=537, Owner=
SessType=5, Index=0, TermReason=0

Session 73867: Type=Node, Id=TSM_P14312PRW120_ORA
Platform=WinNT, NodeId=537, Owner=
SessType=4, Index=6, TermReason=0

Now count all the 'SessType=5' sessions in the SHOW SESSIONS output and see how close you are to the maxschedsessions limit. If you are still getting ANR0429W errors and you cannot increase maxsessions further, consider increasing the maxschedsessions limit; if you cannot do that, your only other option is to spread your backup schedules out to reduce the load at any one time.
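The counting can be scripted against a saved copy of the SHOW SESSIONS output; the two sessions below are mocked up from the example above:

```shell
# Mocked-up SHOW SESSIONS output, standing in for:
#   dsmadmc -se=servername -id=youruserid -pa=yourpassword -out=sessions.txt "show sessions"
cat > sessions.txt <<'EOF'
Session 73568: Type=Node, Id=TSM_P14312PRW120_ORA
SessType=5, Index=0, TermReason=0
Session 73867: Type=Node, Id=TSM_P14312PRW120_ORA
SessType=4, Index=6, TermReason=0
EOF

# Scheduled (type 5) sessions currently active:
grep -c 'SessType=5' sessions.txt
```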

back to top


Storage Pool Migration

TSM traditionally caches backup data in a disk pool to speed up backups, then moves that data off to a tape pool as part of its housekeeping routine. This data movement is called 'migration' and is carried out by TSM processes. However, you cannot run or control these processes directly; you manage them by changing the values of parameters on a storage pool: 'nextstgpool', 'highmig', 'lowmig', 'migprocess' and 'migdelay'.

It is blindingly obvious, but if you don't define a second storage pool in the NEXTstgpool parameter, migration will never work. Traditionally this is a tape pool, but you may want to use lower-tier disk.
HIghmig and LOwmig control the triggers that start and stop migration. TSM will start migration processes when the HIghmig threshold is reached, and stop them when pool occupancy gets down to the LOwmig threshold. If HIghmig=100, migration cannot start (it is effectively switched off).
MIGPRocesses controls how many processes can run in parallel, provided the other limits below do not come into play.
MIGDelay is the number of days that a file must exist in the primary storage pool before it can be migrated to the next pool.

The way TSM works out how many concurrent migration processes to run can be a bit confusing. The maximum number of processes cannot be more than the MIGPR parameter above for any one storage pool. One obvious limit, if you are migrating to tape, is the number of free tape drives. This means that it's not wise to run migration alongside other tape hungry housekeeping tasks like reclamation.

TSM will also run only one migration process per node in the storage pool, so if you only have a small number of nodes backing up to a pool, that will restrict the number of concurrent processes. TSM will start with the client that is using the most space, and migrate its largest filespace first. This is a particular issue for big Oracle databases, as the data is all owned by one node and so will be migrated by a single process.

If you want to clear out an old disk pool you can do this with migration commands. You could move the data to an existing disk pool, a new disk pool or to tape. If you are going to use a new storage pool, then you need to create that pool and add sufficient volumes to it. The process then would be:

Update your old storage pool so that the target pool is added as the next pool

UPDATE STGPOOL pool_name NEXTSTGPOOL=new_stgpool

Set the high migration threshold on the old pool to 100 to prevent any automatic migration processes from running, then migrate the data from the old storage pool to the new one by using a MIGRATE STGPOOL command with the low threshold set to 0

UPDATE STGPOOL pool_name HI=100
MIGRATE STGPOOL pool_name LO=0

Occasionally check the status of the old pool to see if it is empty

QUERY STGPOOL pool_name

Once the pool is empty, delete the volumes from the old pool, then the storage pool.

DELETE VOLUME volume_name
DELETE STGPOOL pool_name

back to top


Controlling the amount of data going to the TSM scheduling and error logs

The DSMSCHED.log and the DSMERROR.log are usually the first port of call when investigating problems. They are usually found in the CLIENT/BA/ or BACLIENT/ directory. TSM will update both these files every time it runs a scheduled backup and will record every backed up file. The problem is that if they are not controlled, the logs will quickly become too big to manage.

You have two parameters in your dsm.opt file that control the data held in these files: schedlogretention and errorlogretention. The default value for both is N, which means never prune the logs. Other options are

ERRORLOGRetention 7 D

which means keep the errors for 7 days then discard them, or

ERRORLOGRetention 7 S

which means after 7 days move the data to DSMERLOG.PRU. The schedlogretention syntax is the same, and you can choose however many days you want to keep your logs. You can also add a QUIET parameter in your DSM.OPT file, which will suppress most of the messages, but this is not recommended as you lose most of your audit trail.

A further pair of parameters was introduced with the 5.3 baclient,

schedlogmax nn
errorlogmax nn

These parameters cause the logs to wrap: when writing reaches the end of the file, it starts to overwrite the data at the beginning. The end of the current data is indicated by an 'END OF DATA' record. The nn value is the maximum size of the log in megabytes, in the range 0 to 2047. The default is 0, which means do not wrap the log.
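As a hedged example, the dsm.opt entries for log housekeeping might look like this; the 14 day and 100 MB values are just illustrations, pick values that suit your site:

```
* prune schedule log entries after 14 days
SCHEDLOGRETENTION 14 D
* prune error log entries after 14 days, saving them to dsmerlog.pru
ERRORLOGRETENTION 14 S
* alternatively, on a 5.3 or later client, wrap each log at 100 MB
* SCHEDLOGMAX 100
* ERRORLOGMAX 100
```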

back to top


Querying data transfer rates

How do you find out yesterday's data transfer rate for client sessions?

Use the following command. Note that it will display all client messages, not just data transfer messages

q actlog begind=today-2 begint=23:59 endd=today-1 endt=23:59 originator=client

Obviously, you can adjust the relative dates to get data from other days
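If you only want the transfer statistics rather than every client message, you can filter the activity log on message text with the search parameter. The search string below is an assumption based on the wording of the standard client summary messages, so adjust it to match what you see in your own activity log:

```
q actlog begind=today-2 begint=23:59 endd=today-1 endt=23:59 originator=client search="transfer rate"
```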

back to top


Encrypting the backup data

It is generally considered that TSM backup data is secure, as it cannot be read without a copy of the database. However, if you have a legal requirement for full data encryption then standard DES 56-bit encryption is available. When you turn on encryption, you will be prompted to create a unique key. Without this key, you won't be able to restore your data, so it is very important that you keep a copy of it somewhere other than the computer that is being backed up. If you forget the encryption key, the data cannot be restored or retrieved under any circumstances.

To enable encryption you add an encryptkey parameter to the dsm.opt file on the client, and add include.encrypt and exclude.encrypt statements as required. TSM will not encrypt anything unless at least one include.encrypt statement is present.

The encryptkey options are -

encryptkey prompt

With the prompt option you should see the following message every time you run a backup

User action is required, file requires an encryption key

And you need to provide the key twice

encryptkey save

You should only see the password prompt once, and the password is then saved in the tsm.pwd file
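As a hedged example, a client option file fragment that encrypts one directory tree with a saved key might look like this; the path is my own illustration:

```
ENCRYPTKEY save
include.encrypt d:\confidential\...\*
```

Remember that TSM will not encrypt anything unless at least one include.encrypt statement is present.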

The manual states that with the encryptkey option set to Prompt you will be prompted for the password on every backup or restore. In my experience, however, TSM appears to store this password on the local client, probably in a file called tivinv/tsmbacnw.sig, so you can run further backups or restores from that client without having to specify a password, and TSM encrypts or decrypts the data as required. If you delete that sig file, or presumably try to recover to a different client, you will be given a selection screen with the following options
1 Prompt for password
2 Skip this file
3 Skip all encrypted files
4 Abort the restore

Taking option 1, you will be prompted for the encryption password twice, then the restore runs as normal.

back to top


Specifying 2 or more servers from 1 client

The first question is, why would you want to address several servers from the same client? The answer is that you typically do this when you want to handle different parts of your client data differently, maybe for databases or maybe for shared resources in client clusters. In this case you would define some virtual TSM servers in the same dsm.sys file.

On a Windows client, if you want to define two TSM servers and be able to specify either server from the TSM client, add the following lines to the dsm.opt file

servername s2
tcpserveraddress ip-address or domain-name

You then have a 'primary' server which you pick up by default, and you can invoke the secondary server using

dsmadmc -se=s2 etc

On an AIX client you would define a different dsm.opt file for each server. For example, suppose you want a basic client to back up the rootvg, an Oracle client for database backups, and an HACMP client for the non-database data on the shared resource. You need three opt files, which for example could be defined like this
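As a hedged sketch (the server stanza names here are my own examples), the three opt files might contain:

```
* dsm.opt - standard rootvg backups
servername aix_standard

* dsm_Oracle.opt - TDP for Oracle database backups
servername aix_oracle

* dsm_HACMP.opt - HACMP shared resource backups
servername aix_hacmp
```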

Then in your dsm.sys file you would code
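A hedged sketch of the dsm.sys stanzas follows; the server names, node names, server address and log paths are all my own illustrations, so substitute your own:

```
SErvername aix_standard
   COMMMethod        TCPip
   TCPPort           1500
   TCPServeraddress  tsmserver.your.domain
   NODename          server01
   PASSWORDAccess    generate
   SCHEDLOGname      /tsm/logs/dsmsched_std.log
   ERRORLOGname      /tsm/logs/dsmerror_std.log

SErvername aix_oracle
   COMMMethod        TCPip
   TCPPort           1500
   TCPServeraddress  tsmserver.your.domain
   NODename          server01_ora
   PASSWORDAccess    generate
   SCHEDLOGname      /tsm/logs/dsmsched_ora.log
   ERRORLOGname      /tsm/logs/dsmerror_ora.log

SErvername aix_hacmp
   COMMMethod        TCPip
   TCPPort           1500
   TCPServeraddress  tsmserver.your.domain
   NODename          server01_hacmp
   PASSWORDAccess    generate
   SCHEDLOGname      /tsm/logs/dsmsched_hacmp.log
   ERRORLOGname      /tsm/logs/dsmerror_hacmp.log
```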

Note that the tcpserveraddress is the same for each 'server' and is the DNS name of the real TSM server. If you make each server stanza write to a different set of logs, that makes it easier to investigate issues. Each of the three nodenames is defined independently on the TSM server, so they can be scheduled independently. You would also define two symbolic links for the extra dsmcads, then start each one with its own option file, like this.

ln -s dsmcad /tivoli/tsm/client/ba/bin/dsmcad_oracle
ln -s dsmcad /tivoli/tsm/client/ba/bin/dsmcad_hacmp
/tivoli/tsm/client/ba/bin/dsmcad_oracle -optfile=/tivoli/tsm/client/ba/bin/dsm_Oracle.opt
/tivoli/tsm/client/ba/bin/dsmcad_hacmp -optfile=/tivoli/tsm/client/ba/bin/dsm_HACMP.opt

back to top


Which nodes are associated with each schedule

The command syntax is 'query association domain-name schedule-name'. You can query all associations by using

q assoc * *

If you want to query a specific schedule, but do not know the domain, use

q assoc * sched-name

back to top


Synchronising your TSM server with the OS

If your TSM server's clock is out of step with its hosting server, possibly due to a daylight savings change, you can easily resynchronise TSM with the OS by entering the following command at the admin command line

ACCEPT DATE

back to top


Canceling sessions and processes

To cancel server processes like migration you need to know the process number, which you get with the Q PROCESS command. Note, however, that if you cancel a migration process a new one will start, unless you also reset the pool thresholds.

CANCEL PROCESS process-number

To cancel tape mount requests use the command

CANCEL REQUEST request-number
CANCEL REQUEST ALL

To cancel active sessions you either need to know the session numbers, which you get with the Q SESSION command, or you can cancel all active sessions

CANCEL SESSION session-number

To prevent new sessions or processes from starting use

DISABLE SESSION client
DISABLE SESSION Server
DISABLE SESSION Admin

These commands just prevent new sessions from starting; all active sessions will run to completion unless you cancel them with the cancel command. The DISABLE SESSION SERVER command will stop new server-to-server sessions from starting, but it will not stop expiration or migration.

To do an emergency cancel of all sessions use the command

DISABLE SESSION ALL
CANCEL SESSION ALL

back to top


Investigating install problems

If you have problems when installing either a TSM server or the Admin Centre, both installs create a logs.zip file that can be used to work out what went wrong. logs.zip contains a collection of log files for the TSM server, the DB2 database and the Admin Centre. The location of this file depends on the operating system, and can be found in:

Windows
C:\IBM\AC\logs.zip ...Admin center logs
C:\Program Files\Tivoli\TSM\logs.zip ...TSM server logs

Unix/Linux:
/var/tivoli/tsm/logs.zip

back to top


How to configure an extra DSMCAD on Linux

How would you configure two dsmcad instances on a single Linux client? And why would you want to? To answer the second question, you often need two different dsmcad processes when working with database clients and server clusters, where you want the cluster resources to back up independently from the local disks. The VCS Cluster Backups page describes how to set up two dsmcads when backing up a cluster disk; this section describes how to set up a separate dsmcad to back up a database on a single Linux server.

To set up two dsmcads, follow the process below. The example names shown are ones used to back up an Oracle database, and are just my own examples; use names that are appropriate to your configuration.

Define two nodes on your TSM server, one for standard backups and one for the Oracle TDP. To make the examples below a bit more readable, we are going to use TSM server TSM1, with IP address 1.2.3.4 and port 1500. The nodes will be called linux001 and linux001_ora.

Create two dsm.opt files, one for each node
dsm.opt, the default node, should contain servername TSM1
dsm_ora.opt should contain servername TSM1_ora

Create two unique node stanzas in the dsm.sys file

In this example the TSM TDP for Oracle will select the databases for the second stanza, so the first stanza backs up all the disks with suitable excludes to avoid copying the database files. If you want two dsmcads for ordinary files, just use appropriate domain statements or include-exclude files to select the data.
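A hedged sketch of the two dsm.sys stanzas, using the names above, might look like this. The exclude statement is just an illustration of keeping database files out of the standard backup; use whatever matches your database layout:

```
SErvername TSM1
   COMMMethod        TCPip
   TCPPort           1500
   TCPServeraddress  1.2.3.4
   NODename          linux001
   PASSWORDAccess    generate
   EXCLUDE           /oradata/.../*

SErvername TSM1_ora
   COMMMethod        TCPip
   TCPPort           1500
   TCPServeraddress  1.2.3.4
   NODename          linux001_ora
   PASSWORDAccess    generate
```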

Generate the passwords for both node names

dsmc q sess
dsmc q sess -optfile=dsm_ora.opt

IBM supplies an rc.dsmcad init script, usually in /opt/tivoli/tsm/client/ba/bin. Make a copy of this script for the Oracle dsmcad

cp /opt/tivoli/tsm/client/ba/bin/rc.dsmcad /opt/tivoli/tsm/client/ba/bin/rc.dsmcad_linux001_ora

Edit rc.dsmcad_linux001_ora and change line

daemon $DSMCAD_BIN
To be
daemon $DSMCAD_BIN -optfile=/opt/tivoli/tsm/client/ba/bin/dsm_ora.opt

Create a symbolic link in /etc/init.d/ pointing to the new rc.dsmcad init script, so the extra dsmcad starts at boot time

ln -s /opt/tivoli/tsm/client/ba/bin/rc.dsmcad_linux001_ora /etc/init.d/dsmcad_linux001_ora

Register the new rc script with chkconfig

chkconfig --add dsmcad_linux001_ora

Test the configuration with the service command to make sure the script loads and starts without issue

service dsmcad_linux001_ora start

back to top


How to Configure a TSM Server on a Windows 2008 R2 Cluster

IBM recommends that you use the configuration wizard, but if you can't use it, here is how to configure a TSM server manually.

You need to run through the process below as a domain user, and you need a Windows 2008 R2 cluster with MSCS enabled, shared disk resources in a resource group, and a dedicated IP address with a corresponding DNS entry for the TSM server itself.

Install the server software

First, install the Tivoli Storage Manager server package on all cluster nodes. As always, we will make up some names to make the examples below more meaningful, but these are our names; you should use names that fit your site standards. The TSM instance will be called tsm_server1, the DB2 user account will be tsm_user1, and the instance location will be d:\tsm\tsm_server1.

Create a server instance on all cluster nodes using the db2icrt command. You will be prompted for the password for user ID tsm_user1; we will just use 'passw0rd'.

db2icrt -u tsm_user1 tsm_server1

Update DB2 Parameters

Change the default path for the database to the drive where the instance directory for the server is located. Open a DB2 command line from
Start - Programs - IBM DB2 - DB2TSM1 - Command Line Tools - Command Line Processor.
Enter quit to exit the command line processor; a window with a command prompt should now be open, with the environment properly set up to issue the commands in the next steps.
From the command prompt in that window, issue the following commands to set the environment variable for the server instance that you are working with, change the default drive, then set the DB2 code page

set db2instance=tsm_server1
db2 update dbm cfg using dftdbpath d:
db2set -i tsm_server1 DB2CODEPAGE=819

Create a new server options file as described in the Configuring server and client communications section of the manual, then on each node do the following steps:

  1. cd /d c:\windows\cluster
  2. regsvr32.exe /s /u TsmSvrRscExX64.dll
  3. cluster resourcetype "TSM Server" /delete
  4. copy ...\program files\tivoli\tsm\console\TsmSvrRscX64.dll c:\windows\cluster
  5. copy ...\program files\tivoli\tsm\console\TsmSvrRscExX64.dll c:\windows\cluster
  6. regsvr32.exe /s TsmSvrRscExX64.dll
  7. Create the TSM server service by issuing the following command:

sc create "TSM tsm_server1" binPath= "d:\program files\tivoli\tsm\server\dsmsvc.exe -k tsm_server1" start= demand obj= "test\tsm_user1" password= passw0rd

On the primary node only, issue the following command to create the cluster definitions for DB2. The command creates the definitions by reading a configuration file, which by default is called db2mscs.cfg and is located in the CFG subdirectory of the DB2 install directory.

db2mscs -f:db2mscs.cfg

The db2mscs.cfg file contains various parameters, which are explained in the DB2 documentation.

Now stop the db2 resource with the following command

cluster resource tsm_server1 /offline

Format the database

ONLY if you are setting up a new instance, format the TSM server database with the following command, which you run from the TSM server instance directory. Adjust the names shown to fit your site.

dsmserv -k tsm_server1 format dbdir=d:\tsm\db001 activelogsize=8192 activelogdirectory=e:\tsm\activelog archlogdirectory=f:\tsm\archlog archfailoverlogdirectory=g:\tsm\archfaillog mirrorlogdirectory=h:\tsm\mirrorlog

If you want to know more about the DSMSERV FORMAT command, check out the TSM Server Administration manual.

Copy the Registry Key

Export the registry key below and import it on all secondary nodes

HKEY_LOCAL_MACHINE\SOFTWARE\IBM\ADSM\CurrentVersion\Server\tsm_server1
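A hedged way to do the export and import from the command line is with reg.exe; the temporary file name is my own example:

```
rem on the primary node
reg export "HKEY_LOCAL_MACHINE\SOFTWARE\IBM\ADSM\CurrentVersion\Server\tsm_server1" c:\temp\tsm_server1.reg

rem on each secondary node, after copying the file across
reg import c:\temp\tsm_server1.reg
```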

Sort out the cluster resources

Issue the following commands to create the TSM server resource - again, substitute your names in the commands below

cluster resourcetype "TSM Server" /create /DLL:tsmsvrrscX64.dll /type:"TSM Server"
cluster /regadminext:"c:\windows\cluster\tsmsvrrscexX64.dll"
cluster resource "TSM tsm_server1 Server" /create /group:"TSM tsm_server1 Group" /type:"TSM Server"
cluster resource "TSM tsm_server1 Server" /adddep:"tsmclustsrv1"
cluster resource "TSM tsm_server1 Server" /adddep:"tsm_server1"

Add dependencies for each shared disk resource, starting with the disk where the instance is found:

cluster resource "TSM tsm_server1 Server" /adddep:"disk_name"
cluster resource "TSM tsm_server1 Server" /adddep:"disk_name"
.. repeat for all shared disk resources

Issue the following to set the location for the console log. If the TSM server resource will not come online you need this log to find out what the problem is.

cluster resource "TSM tsm_server1 Server" /privprop service="TSM tsm_server1" ServerKey="tsm_server1" ConsoleLogFile="\tsm\tsm_server1\console.log" LogToFile=1

Set password and start server

Run dsmsutil on all nodes, including the primary node, to set the $$_TSMDBMGR_$$ password.

"c:\program files\tivoli\tsm\server\dsmsutil.exe" UPDATEPW /NODE:$$_TSMDBMGR_$$ /PASSWORD:passw0rd /VALIDATE:NO /OPTFILE:"d:\tsm\tsm_server1\tsmdbmgr.opt"

Bring the cluster resource online with the following command.

cluster resource "TSM tsm_server1 Server" /online

Now log into your server and run some of your favourite commands to check that all is well.