Tape Futures

There have been a few announcements over the years that tape is dead, but the technology is still very much alive in 2016 and seems to be going places. Tape technology continues to advance, and it's possible to fit more data into the same space. Older LTO tapes use Metal Particle (MP) technology and these particles are magnetised to represent data. The particle size has been steadily decreasing, from 100 nm in LTO1 to about 35 nm in LTO6, but that technology seems to have hit a signal-to-noise ratio limit.
LTO7 uses Barium Ferrite (BaFe) particles, as originally used in IBM 3592 cartridges. To put it in simple terms, BaFe particles are better magnets than MP, and BaFe is an oxide rather than a metal alloy so it does not demagnetize over time. This means that the BaFe particles can be made smaller than MP and so even more data can be packed onto a tape.
Some of the reasons given for tape's imminent demise were:

  • It can be quite difficult to fill up a high capacity tape in a reasonable time. Take a 6TB LTO7 tape. It takes nearly 6 hours to fill, writing at 300MB/s. This is true, but it is a consequence of the high data capacity stored, rather than a limitation of the technology. It is only a problem when recycling tapes where some of the data on the tape has expired, for example DFHSM archives or TSM backups.
  • An associated problem occurs when the data on a tape expires at different rates, and again DFHSM and TSM are good examples of this. Much of the space on the tape is occupied by expired data, but that space cannot be reused. The solution to this is to 'recycle', that is to write all the required data off onto a new tape, and then re-use the old tape. If you recycle at 50%, then up to 2TB of capacity is wasted on 4TB tapes. This problem, of course, is little different for several smaller tapes, as up to 50% of capacity is still wasted.
  • Getting a specific piece of data off a large tape can take a while (I've waited 40 minutes to recover a critical file from the end of a 3480). This is partly because a tape was read sequentially from start to finish. This problem has been fixed with linear serpentine tapes, where it is only necessary to read down the track that holds the data, rather than the whole tape.
  • Only one task can use a tape at a time. This causes problems when several people want to recall migrated data which is stored on one tape. This is still true, though LTFS promises to make it less of an issue.
  • The cost differential between disk and tape is eroding. True, but a TB of data on tape is still significantly cheaper than disk, and once data is on tape, there is little power required to maintain it. This means that the 'green' environmental cost of tape is a lot less than disk.
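The fill-time and recycle figures quoted in the list above are simple arithmetic, and a short sketch makes it easy to rerun them for other tape generations. The function names here are illustrative only, and decimal units (1TB = 1,000,000MB) are assumed.

```python
def fill_time_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours needed to write capacity_tb at a sustained rate_mb_s.

    Assumes decimal units (1 TB = 1,000,000 MB) and that the drive
    streams at full speed for the whole write.
    """
    capacity_mb = capacity_tb * 1_000_000
    return capacity_mb / rate_mb_s / 3600


def max_wasted_tb(capacity_tb: float, recycle_threshold: float) -> float:
    """Upper bound on space held by expired data just before recycling.

    recycle_threshold is the fraction of the tape that must be expired
    before the tape qualifies for recycling (0.5 means 'recycle at 50%').
    """
    return capacity_tb * recycle_threshold


# 6TB LTO7 tape written at 300MB/s
print(f"{fill_time_hours(6, 300):.1f} hours")
# 4TB tape recycled at 50%
print(f"{max_wasted_tb(4, 0.5):.1f} TB")
```

With these assumptions the LTO7 example comes out a little under the six hours quoted, and the 4TB example gives the 2TB upper bound on wasted space.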

Tape was traditionally used for backup, and while disk based backups are available, explosive data growth makes them relatively expensive, even with de-duplication and compression. A suitable backup storage strategy seems to be disk-disk-tape, or DDT. The reason for this is that 95% of restore requests are made within 14 days of a backup, so it might be worth keeping backups on disk for the first two weeks, but after that, tape is suitable and cheap.
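The two-week rule can be written down as a very small placement policy. This is only a sketch: the 14 day window comes from the restore statistic above, while the function name and the 'disk'/'tape' labels are made up for illustration and do not correspond to any particular backup product.

```python
from datetime import date, timedelta

# Keep backups on disk for the period in which most restores happen.
DISK_RETENTION = timedelta(days=14)


def backup_tier(backup_date: date, today: date) -> str:
    """Return where a backup of the given age should live."""
    age = today - backup_date
    return "disk" if age <= DISK_RETENTION else "tape"


print(backup_tier(date(2016, 1, 1), date(2016, 1, 10)))  # 9 days old
print(backup_tier(date(2016, 1, 1), date(2016, 2, 1)))   # 31 days old
```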

Tape is also useful for long term archives, and Active Archives. Tape has a number of advantages over disk when used for long term archive. Cost is the obvious one, but less obvious is that since tape reliability has improved by 700% in the last decade, it is now more reliable than disk for long term retention.

Active Archive is a relatively new idea where a number of companies are trying to establish standards for accessing data on different types of storage. The difference between Active Archive and HSM is that whereas with HSM, when a migrated file is required it is recalled back to primary storage, with Active Archive, files are accessed from wherever they reside in the storage hierarchy.
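The difference between the two access models can be shown with a toy catalog that records which tier each file sits on. Everything here is hypothetical: the catalog layout, the tier names and both function names are invented for the sketch, and no real HSM or Active Archive product works on an in-memory dictionary.

```python
def hsm_read(catalog: dict, name: str) -> bytes:
    """HSM style: recall the file to primary storage, then read it."""
    tier, data = catalog[name]
    if tier != "primary":
        catalog[name] = ("primary", data)  # the recall changes where the file lives
    return data


def active_archive_read(catalog: dict, name: str) -> bytes:
    """Active Archive style: read the file from whichever tier holds it."""
    _tier, data = catalog[name]
    return data  # the file stays where it is


catalog = {"report.dat": ("tape", b"old data")}
active_archive_read(catalog, "report.dat")
print(catalog["report.dat"][0])  # still on tape
hsm_read(catalog, "report.dat")
print(catalog["report.dat"][0])  # now recalled to primary
```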

IBM is driving a new initiative to introduce a common file system for tape, called LTFS. This has obvious advantages for Active Archive, as it will make it easier to access individual files on tape. It will also have benefits for long term archiving, as currently most applications use their own data storage formats. If a universal format is agreed, then it should be much easier to read 30 year old data. There is no guarantee that any current backup application would be available 30 years from now to read a proprietary tape format, but if it supports an open source piece of software, like LTFS, then the chance is much higher that the LTFS drivers will still be available.

The improvement that comes with LTFS is that the media is mounted and read by the operating system instead of the application. There have been other attempts at introducing open tape formats, but because IBM has released LTFS as open source and because it is used at a file system level, it has a better chance of adoption than previous solutions, which were vendor proprietary.

LTFS combines well with LTO5 and above. LTO5 introduced an ability to carve a tape up into two media partitions. LTFS has a directory that details the contents of a tape. This can be placed in Partition 0, which is small and can be quickly read to allow anyone to see what is on the tape. Partition 1 is larger and is used to hold the real data.
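The two-partition idea can be illustrated with a toy model. This is not the LTFS on-tape format, and the class and field names are invented for the sketch. It only shows why a small directory partition lets you list a tape's contents without reading the data partition.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IndexEntry:
    name: str
    start_block: int   # first block of the file in the data partition
    block_count: int   # number of consecutive blocks the file occupies


@dataclass
class ToyTape:
    # Partition 0: small, holds only the directory of what is on the tape.
    index_partition: Dict[str, IndexEntry] = field(default_factory=dict)
    # Partition 1: large, holds the real data as an append-only run of blocks.
    data_partition: List[bytes] = field(default_factory=list)

    def write_file(self, name: str, blocks: List[bytes]) -> None:
        start = len(self.data_partition)
        self.data_partition.extend(blocks)
        self.index_partition[name] = IndexEntry(name, start, len(blocks))

    def list_files(self) -> List[str]:
        # Only the index partition is consulted here.
        return sorted(self.index_partition)

    def read_file(self, name: str) -> bytes:
        entry = self.index_partition[name]
        end = entry.start_block + entry.block_count
        return b"".join(self.data_partition[entry.start_block:end])


tape = ToyTape()
tape.write_file("a.txt", [b"he", b"llo"])
tape.write_file("b.bin", [b"xyz"])
print(tape.list_files())
print(tape.read_file("a.txt"))
```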

Not only would an LTFS export option be ideal for situations where large volumes of data need to be shared with others, but it could be seen as a best practice for archive data.

Another potential use for LTFS/LTO5+ is for large data transfers to the cloud. One of the issues with the cloud is that the internet is not geared up for transferring terabytes of data. Customers who require large data uploads frequently do this by shipping magnetic disks to a cloud data centre for upload. An LTO5 tape that could be read anywhere with an LTFS driver would be an easier and cheaper option.

Where does Virtual tape fit in? Virtual tape consists of virtual tape drives which write to a disk buffer. The disk buffer is then flushed out to tape when enough data has been stored. The Virtual Tape section explains this in detail. Virtual tape can be used to fill tapes, and can even allow concurrent access to 'tape' data when it is in the disk cache. However, the downside is that the cost ratio of disk to tape is not as good with virtual tape.
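The buffer-then-flush behaviour can be sketched as follows. The class name, the threshold parameter and the in-memory lists are assumptions made for illustration, and real virtual tape systems have far more elaborate cache management than this.

```python
from typing import List, Tuple


class VirtualTapeBuffer:
    """Toy disk buffer that flushes to 'tape' once enough data has built up."""

    def __init__(self, flush_threshold_gb: float) -> None:
        self.flush_threshold_gb = flush_threshold_gb
        self.cached: List[Tuple[str, float]] = []   # (dataset name, size in GB)
        self.tape_writes: List[List[str]] = []      # one entry per flush

    def cached_gb(self) -> float:
        return sum(size for _name, size in self.cached)

    def write(self, name: str, size_gb: float) -> None:
        # Writes land on disk first, so many small writes can be batched.
        self.cached.append((name, size_gb))
        if self.cached_gb() >= self.flush_threshold_gb:
            self.flush()

    def flush(self) -> None:
        # Everything in the buffer goes to tape in one streaming write.
        if self.cached:
            self.tape_writes.append([name for name, _size in self.cached])
            self.cached = []


buf = VirtualTapeBuffer(flush_threshold_gb=100)
buf.write("backup1", 40)
buf.write("backup2", 40)
print(len(buf.tape_writes))   # nothing flushed yet
buf.write("backup3", 40)
print(buf.tape_writes)        # all three written to tape together
```

While data is still in the buffer it can be read by more than one task, which is the concurrent access benefit mentioned above.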
