Virtual Tape

This section covers both z/OS (IBM Mainframe) and Open Systems virtual tape.

The Problems with Physical Tape

Filling large tapes. Mainframes traditionally use tape for all kinds of data: backups, large GDGs, 'ML2' DFHSM migrated data, and perhaps files that are simply too large to fit on an 8.5GB Mod9 disk volume. The catch is that while a physical tape can typically hold up to 5TB native, it is quite hard to fill one of these tapes unless you have a specialist application such as DFHSM or TSM, which are designed to pack lots of small files onto a big tape. By contrast, Open Systems operating systems tend to use tapes just for backups, and as they are backing up large disks, they can fill tapes reasonably easily.

Full tapes develop holes. Unless all the files you store on a tape are going to expire on the same day, a tape that was initially 100% full of required data will steadily become less full as time goes by. As files expire they leave 'holes' on the tape, and unlike disk, these holes cannot be filled with new data. Over time the tape holds less and less active data, yet it cannot be scratched until the last active file has expired.
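A small simulation makes the point. The sketch below is purely illustrative, with made-up file counts, sizes, and expiry dates: a tape starts 100% full of files with mixed expiry dates, the active fraction falls as files expire, but the volume cannot be scratched until the very last file has gone.

```python
import random

# Hypothetical illustration: a tape full of files with random expiry
# dates. Expired files leave unusable 'holes', so the active fraction
# falls, but the tape stays allocated until the last file expires.
random.seed(1)

files = [(random.randint(1, 365),    # expiry day (day 1 to 365)
          random.uniform(1, 50))     # file size in GB
         for _ in range(200)]

total_gb = sum(size for _, size in files)

for day in (0, 90, 180, 270, 365):
    active_gb = sum(size for expiry, size in files if expiry > day)
    scratchable = all(expiry <= day for expiry, _ in files)
    print(f"day {day:3d}: {100 * active_gb / total_gb:5.1f}% active, "
          f"{'can' if scratchable else 'cannot'} be scratched")
```

On day 0 the tape is 100% active; by day 365 it holds no active data at all and can finally be returned to scratch, having carried a shrinking payload the whole year.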

Shortage of tape drives. There are always points in the day when you do not have enough physical drives, and other times when you are hardly using any. It does not make economic sense to buy lots of real tape drives to cover peak demand if those drives will sit unused for most of the day.

Retention of long term, inactive data. We have requirements to keep certain data for long periods of time. Tape might seem the ideal medium for these files, but unless you try reading them from time to time, how do you know the data is still intact?

Duplexing data offsite for DR. In a disaster you need all your data at the recovery site, including all primary data stored on tape. The only safe way to achieve this is to duplex the data as it is written, but duplexing tape can be difficult unless your application provides duplexing facilities. FDRABR and DFHSM do; most applications will not create two copies at write time, so you have to copy the tape later.

Recovery speeds. Generally speaking, reading data off a tape is as fast as reading it from disk. The problem is that before you can start to read the data, the tape has to be located and mounted in a drive, then wound forward to the correct position. This does not take as long as it used to, but it can still be significant for a service-critical restore.

Virtual Tape

Virtual tape solves these problems, as explained in this page set. The tapes are virtual, so it does not matter how much data is on them. The drives are virtual, so you can have hundreds of them. If you have a physical tape back end, the physical tapes are recycled regularly, which gets rid of any holes and proves that the tapes can still be read. Holes are not an issue with a disk based solution, as the space can simply be re-used. Offsite tapes can be duplexed at source, but see the relevant page for details. Recovery from VTE systems is fast, and it can be fast from VTA solutions too if the data is still in the disk cache.
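The recycling process mentioned above can be sketched in a few lines. This is an illustrative model, not any vendor's implementation: all names, the 30% threshold, and the tape sizes are assumptions. Tapes whose active content has fallen below the threshold have that content copied to a fresh volume, and the old, holey tapes go back to scratch.

```python
# Hypothetical sketch of background space reclamation ('recycling') in a
# tape-backed VTS: poorly-utilised physical tapes are consolidated onto
# a new tape and then freed for reuse. Threshold and capacity are
# illustrative assumptions.
TAPE_CAPACITY_GB = 5000
RECLAIM_THRESHOLD = 0.30   # recycle any tape that is < 30% active

def reclaim(tapes, threshold=RECLAIM_THRESHOLD):
    """tapes: dict of tape_id -> list of active file sizes in GB.
    Returns (files to re-write to a new tape, tape ids freed)."""
    consolidated, scratched = [], []
    for tape_id, active_files in tapes.items():
        utilisation = sum(active_files) / TAPE_CAPACITY_GB
        if utilisation < threshold:
            consolidated.extend(active_files)  # re-written elsewhere
            scratched.append(tape_id)          # old tape freed for reuse
    return consolidated, scratched

tapes = {"T00001": [100, 200],      # 6% active  -> reclaim
         "T00002": [2000, 1500],    # 70% active -> leave alone
         "T00003": [50]}            # 1% active  -> reclaim

moved, freed = reclaim(tapes)
print(f"re-wrote {sum(moved)} GB, freed tapes {freed}")
```

Because every recycle pass physically reads the active files, it doubles as a regular read-verification of the media, which is the point made above about proving that tapes can still be read.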

Some virtual tape systems do not use real tapes at all, so there are two fundamental types of virtual tape:

  1. Tape Virtualisation, or Virtual Tape Augmented (VTA), where applications write data to virtual volumes on virtual drives and this data is later consolidated onto real volumes on real drives.
  2. Tape Elimination, or Virtual Tape Elimination (VTE), where applications write data to virtual volumes on virtual drives and this data is stored permanently on disk. This type of virtualisation is often combined with data de-duplication.

Some people claim a third type, Virtual Tape Hybrid (VTH), where you have a disk only solution, with the option to add a tape library with real backend tape drives if you want them.

Does Virtual Tape have a future? Some say that the integration between backup products and de-duplication now makes it more cost effective to store backup data permanently on cheaper SAS or SATA disk, especially on disk subsystems where hard drives can spin down when they are not being read, saving power. The argument runs: why use a VTS to make a hard drive emulate tape, when the data can simply be stored on disk using more efficient block sizes? Most backup products now support disk to disk, or D2D, backups with de-duplication, so on this view a VTS that just uses disk storage adds nothing. It is a compelling argument, but there are no immediate signs that disk based VTS devices are disappearing. In fact, a pure disk based VTS makes a lot of sense for a small to medium business.
The advantages are less clear for large enterprises, as they tend to have ferocious growth rates. It is relatively easy and cheap to cope with growth by adding more physical tape slots to a library, but more expensive to add several extra terabytes of disk storage. Also, VTLs are designed to cope with large amounts of data, which takes the management strain away from the rest of your hardware.

Another emerging strategy is to recognise that backups and archives are two different solutions to two different problems. Backups tend to have short retention, say 30 days, while archives have long retention, measured in years. Disks are not really suitable for long term retention; tapes are arguably a better solution. So a workable strategy could be to use D2D for backups and a tape based VTS for archives. A clear advantage of a VTS over native tape here is that the physical tapes can be recycled regularly, proving that they can be read, and faulty tapes can be replaced if they are duplexed.
One final innovation is to offload all this archived file management and let someone else look after it. However, data in the Cloud is still stored on disk and tape; the difference is that someone else is looking after it.
