TSM LAN free backups
Why would you want to use LAN free? Generally speaking, backing up and restoring over a SAN is much faster than over a LAN, but that really depends on how fast the SAN and the LAN are at your site. Another benefit is that LAN free avoids clogging up backup disk pools with large backups, as traditional TSM uses a front-end disk pool that is offloaded to tape each day. LAN free is also very suitable for big databases or anything else that sends data in large chunks, like image backups or VMware backups from snapshots. If you have a high capacity server, but that capacity consists of lots of small files, then you will get no advantage from backing up over a SAN.
Assuming you decide to install LAN free, this is how you would do it on AIX.
Connecting the Tape Drives
First you need to organise cabling up your tape drives. If your backups are critical you will need two fiber cards installed in each client for resilience. Then you need them cabled up to your SAN switches and zoned in so your client can access the tape drives. It is possible to rename your tape drives to something more meaningful than the rmtnn names that UNIX provides by default, but to do that you need to install the Atape driver, supplied free by IBM. Once you do this, you rename the drives with smitty, or with the chdev command as below. It is best to give your drives a name that contains the WWN, so that they are unique. When you rename them, several files will be created in the /dev directory. The device parameter in your path name in TSM must match these names. The chdev command in full is:
chdev -l rmt0 -a new_name=T-AA02450
Defining the Storage Agent
Define at the TSM Server
Use a command like the one below to define a storage agent to the TSM server. If you want to look at existing storage agents to see how they are defined, you will find them by using the 'q server' command or the 'other servers' tab in the GUI.
define server nnnnn serverpassword=password hla=x.xx.xx.xx lla=xxxx
- nnnnn is the name of the storage agent. Each agent name needs to be unique, so come up with a good naming standard. A simple standard that works is nodename-agent.
- password is the password for the storage agent and must match the password supplied in the client definition below.
- x.xx.xx.xx is the IP address (or DNS entry) of the client machine.
- xxxx is the port you use in the dsm.sys for the storage agent. The example below uses 1510.
Define the Tape Paths
To define the tape paths to the TSM server, use commands like
define path agent.name tape.name srctype=server desttype=drive library=LIBNAME device=/dev/tape-name
If you use alternate pathing, add a '0' onto the end of your tape name in the device parameter above.
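If you have a lot of drives, it can be easier to generate the define path commands than to type them. The sketch below is a minimal illustration only: the agent, drive and library names are hypothetical, it assumes the TSM drive name and the renamed /dev entry are the same string, and it spells the keywords out in full as srctype and desttype.

```python
def define_path_command(agent, drive, library, device_name, alternate_pathing=False):
    """Build one 'define path' command in the form shown above."""
    # With alternate pathing, a '0' is added to the end of the tape name
    # in the device parameter.
    device = f"/dev/{device_name}" + ("0" if alternate_pathing else "")
    return (
        f"define path {agent} {drive} srctype=server desttype=drive "
        f"library={library} device={device}"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for drive in ["T-AA02450", "T-AA02451"]:
        print(define_path_command("node1-agent", drive, "LIB1", drive))
```

The output is plain text, so you can review it before pasting it into an administrative session.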
Define at the Client
The Storage Agent is usually found in /opt/tivoli/tsm/StorageAgent/bin/ for TSM 6.1 clients and above. A Storage Agent is basically just a cut-down version of a TSM server with a reduced command set. Like a real server, it needs an options file, dsmsta.opt, which sets the tcpport for the agent (1510 in this example) and the timeout values discussed next.
You need to work out what timeout parameters are best for your site. Commtimeout is in seconds and idletimeout is in minutes, so a commtimeout of 7200 and an idletimeout of 120 both give two hours. If you are backing up Oracle databases with incremental RMAN, then Oracle can spend some time searching its catalog to work out what needs to be backed up. Without high timeout values the backup could fail, so two hours could be reasonable. However, this does mean that if you hit a problem you are locking out tape drives for a long time, so smaller values could be more appropriate for standard backups.
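Because the two timeouts use different units, it is easy to set one of them wrongly. The sketch below shows the conversion for a single time limit and prints the two option lines. It is only an illustration of the arithmetic, and a real dsmsta.opt will contain other options as well.

```python
from datetime import timedelta


def timeout_options(limit: timedelta) -> dict:
    """Express one time limit in the units each option expects."""
    total_seconds = int(limit.total_seconds())
    return {
        "COMMTIMEOUT": total_seconds,        # seconds
        "IDLETIMEOUT": total_seconds // 60,  # minutes
    }


def render_options(options: dict) -> str:
    """Render option/value pairs one per line, as in an options file."""
    return "\n".join(f"{name} {value}" for name, value in options.items())


if __name__ == "__main__":
    # Two hours, the value discussed above for long RMAN catalog searches.
    print(render_options(timeout_options(timedelta(hours=2))))
```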
Once you have an options file you create the storage agent with the command
dsmsta setstorageserver myname=nnnn mypassword=pppp myhladdress=nnn servername=TSM1 serverpassword=pppp hla=nnn lla=mmmm
This will create a file called devconfig.sta that contains the above details, with the passwords encrypted. The parameters in the command are
- myname is the name you call your storage agent, the same one you used to define the agent to the TSM server above.
- mypassword is the password for the storage agent and must match the password used when you defined the agent to the server.
- myhladdress is the TCP/IP address or DNS name of the client that is hosting this storage agent.
- servername is the name of the TSM server.
- serverpassword is the password for the TSM server.
- hla is the TCP/IP address that you use for communication with the TSM server, the same one that you use in the dsm.sys file. This address is not used on initial start up, so if you get this wrong, LAN free will appear to be working fine and the backups will work. However, the tape dismount will fail and the tape drive will go into dismount retry failure mode. The only way to free the drive up is to stop the storage agent on the client.
- lla is the port number that you use to access the TSM server, the tcpport parameter in the dsm.sys file.
Next you need to start your storage agent, which you can do by simply typing dsmsta from the command line. To stop the storage agent, you can log into it using 'dsmadmc -se=storageagentname' and then type halt from the command line. Alternatively, just use the UNIX kill command.
Changes to dsm.sys
To use LAN free backups you need to make a few changes to the dsm.sys file. First you need to add a stanza for the storage agent, which is used to connect to the storage agent with the dsmadmc command. The stanza needs a servername, a tcpserveraddress, a tcpport and a nodename, and the node name must not be the one you use for normal backups.
The tcpserveraddress could be the address of your real server, but the standard localhost IP address, 127.0.0.1, is used here, as it is just used for internal communication.
The tcpport number does not have to be 1510, but it must match the value you used in dsmsta.opt. I avoid 1500 as that is the server default and like to use 1501 upwards for the web ports, so it seems reasonable to standardise all the storage agents at 1510. You must make sure that you do not conflict with any port used by other software on your machine, so this standard might not be suitable for you.
Next you need to add some lines to your existing backup stanzas, including ENABLELANFREE YES and a lanfreetcpport.
The lanfreetcpport number must match the tcpport in dsmsta.opt and the tcpport in the storage agent stanza.
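The port matching rule above is easy to check mechanically. The sketch below parses dsm.sys-style text and reports any stanza whose port does not agree with the storage agent port. The stanza names, node names and addresses in the sample are hypothetical, and the parser is deliberately simple: it only recognises option names written out in full and treats lines starting with * as comments.

```python
SAMPLE = """\
* Hypothetical stanzas, for illustration only
servername node1-agent
   tcpserveraddress 127.0.0.1
   tcpport 1510
   nodename node1-sta

servername TSM1
   tcpserveraddress tsm1.example.com
   tcpport 1500
   nodename node1
   enablelanfree yes
   lanfreetcpport 1510
"""


def parse_stanzas(text):
    """Split dsm.sys-style text into {servername: {option: value}}."""
    stanzas, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):  # skip blanks and comments
            continue
        parts = line.split(None, 1)
        key = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "servername":
            current = stanzas.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return stanzas


def lanfree_port_problems(stanzas, agent_stanza, agent_port):
    """List every port that disagrees with the storage agent port."""
    expected = str(agent_port)
    problems = []
    agent = stanzas.get(agent_stanza, {})
    if agent.get("tcpport") != expected:
        problems.append(f"{agent_stanza}: tcpport is {agent.get('tcpport')}, expected {expected}")
    for name, options in stanzas.items():
        port = options.get("lanfreetcpport")
        if port is not None and port != expected:
            problems.append(f"{name}: lanfreetcpport is {port}, expected {expected}")
    return problems


if __name__ == "__main__":
    # 1510 stands for the tcpport value taken from dsmsta.opt.
    for problem in lanfree_port_problems(parse_stanzas(SAMPLE), "node1-agent", 1510):
        print(problem)
```

An empty result only means the ports agree with each other. It does not prove that the storage agent is actually listening on that port.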
Finally you need to create a management class that writes direct to tape. The backup section discusses how to do this.
It can be difficult to troubleshoot LAN free when it stops working as there are so many parameters to check. The following list might help.
Look for the obvious issues first.
- Do you have ENABLELANFREE YES in your client option files?
- Is the storage agent started and running?
- Do all the IP addresses and port numbers match in the dsm.opt file, the LAN free server definition on the TSM server and the client?
- Are you picking up a 'direct to tape' management class?
- Are your tape drives and paths online?
- Do you have any errors on the TSM server or client logs?
If that looks OK, then
- Try running the 'validate LANfree' command on the TSM server and see if it thinks everything should work.
- Stop your storage agent and run it in the foreground, by typing dsmsta from a command line. Now try running a backup from a different window and watch the storage agent to see whether it makes any attempt to mount a tape, or tries to mount one but gets problems.
- Run a client trace while trying a backup, and check the trace output for 'bLanFreeDest'. If this is 'FALSE' then you are not picking up a valid LAN free management class. The key is that the management class must write direct to tape. If it is going to disk, or Centera then LAN free will not work. Also, the storage pool must not be configured for simultaneous write operations.
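Client trace files can be very large, so a small filter helps with the last check in the list above. The sketch below pulls out every line that mentions bLanFreeDest and reports whether it says TRUE or FALSE. The exact layout of a trace line is an assumption here, so the code only looks for those words on the same line, and the file name in the usage example is hypothetical.

```python
def lanfree_dest_values(trace_lines):
    """Return 'TRUE' or 'FALSE' for each trace line mentioning bLanFreeDest."""
    values = []
    for line in trace_lines:
        if "bLanFreeDest" not in line:
            continue
        upper = line.upper()
        # Assumption: the value appears as the word TRUE or FALSE on the same line.
        if "FALSE" in upper:
            values.append("FALSE")
        elif "TRUE" in upper:
            values.append("TRUE")
    return values


if __name__ == "__main__":
    import sys

    # Usage: python check_trace.py client_trace.out  (hypothetical file name)
    with open(sys.argv[1], errors="replace") as trace:
        found = lanfree_dest_values(trace)
    if "FALSE" in found:
        print("bLanFreeDest is FALSE: check that the management class writes direct to tape")
    elif found:
        print("bLanFreeDest is TRUE on every matching line")
    else:
        print("No bLanFreeDest lines found in the trace")
```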
If all this fails, you might have to stop and start your TSM server, as faulty LAN free definitions may be stuck in the server cache and a restart would be needed to flush them.
It is possible for a LAN free session to seem to work fine until it gets to the end and fails to dismount the tape. It does produce an error message - 'ANR0456W Session rejected for server server name - the server name at High level address, Low level address does not match', but this error can be hard to find among all the other messages on the TSM server log.
The problem is that the TSM server does not attempt to communicate back to the storage agent until the backup is complete. It does this to verify that all is finished before it releases the tape mount, and if it cannot get that verification, it may not release the tape and a manual dismount may be required.
This problem is most likely caused by the HLA (High Level Address) of the storage agent, as defined on the TSM server, not matching the IP address of the actual storage agent. You would correct this with the 'update server' command:
update server servername hla=x.xx.xx.xx
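Since the ANR0456W message is easy to miss in a busy log, it can help to filter for it. The sketch below assumes you have saved the TSM server activity log to a plain text file, and the file name in the usage example is hypothetical.

```python
def find_messages(log_lines, message_id="ANR0456W"):
    """Return (line_number, text) for every log line containing message_id."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(log_lines, start=1)
        if message_id in line
    ]


if __name__ == "__main__":
    import sys

    # Usage: python find_messages.py actlog.txt  (hypothetical file name)
    with open(sys.argv[1], errors="replace") as log:
        for number, text in find_messages(log):
            print(f"{number}: {text}")
```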
Problems with Mountpoints
Sometimes a LAN free backup will fail with ANR0539W Transaction failed for session ... This node has exceeded its maximum number of mount points.
The most likely cause is that the resourceutilization on the client is higher than the maxnummp (maximum number of mount points).
Check the RESOURCEUTIL option in the client options file, dsm.opt (Windows) or dsm.sys (Unix), and check the MAXNUMMP for the node on the TSM server with
QUERY NODE 'nodename' F=D
If the RESOURCEUTILIZATION is greater than MAXNUMMP, either reduce RESOURCEUTILIZATION or increase MAXNUMMP. The default MAXNUMMP is 1.
If the RESOURCEUTIL is less than or equal to MAXNUMMP, then it is possible that the storage agent has not released mounts it held from prior sessions. Check the Reset Drives (RESETDRIVES) option for the library with the following TSM server command:
QUERY LIBRARY F=D
By default, for a shared library, RESETDRIVES should be set to YES. If it is set to NO, IBM recommends that you update the library with
UPDATE LIBRARY libraryname RESETDRIVES=YES
When a shared library is used, the library manager does a heartbeat check to see if library clients are still using resources. It is possible on a busy server that the library manager will receive the heartbeat from the library client but not process it prior to resetting the drives. If a client is done with the drive but the manager still thinks the drive is in use, with RESETDRIVES=YES the drive will get reset without manual intervention from an administrator.
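The mount point checks above can be summarised as a short decision function. This is only a sketch of the reasoning in this section, and you have to read the three input values yourself from the client options file, QUERY NODE and QUERY LIBRARY.

```python
def diagnose_mount_points(resource_utilization, maxnummp, resetdrives):
    """Suggest the next step for an ANR0539W mount point failure.

    resource_utilization: RESOURCEUTILIZATION from the client options file
    maxnummp:             MAXNUMMP for the node, from QUERY NODE ... F=D
    resetdrives:          True if the library shows RESETDRIVES=YES
    """
    if resource_utilization > maxnummp:
        return "Reduce RESOURCEUTILIZATION or increase MAXNUMMP."
    if not resetdrives:
        return "Update the shared library to RESETDRIVES=YES."
    return "Settings look consistent; check for mounts held over from prior sessions."


if __name__ == "__main__":
    # Example values only: a client using 4 sessions against the default MAXNUMMP of 1.
    print(diagnose_mount_points(resource_utilization=4, maxnummp=1, resetdrives=True))
```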