
Why use virtual tape for Open Systems?

Open Systems tape usage is almost entirely confined to backup and recovery. Some backup products, TSM for example, work just like a VTS: they back up data to a disk cache, then stage it off to tape later. SATA disk is so cheap that backup to disk is a viable option, and it will speed up recovery. Many companies are using the Cloud to store backups, and while that data does not end up stored in fluffy white stuff, the physical storage is not your problem anymore. So is there a requirement for Virtual Tape in the Open Systems arena?


Backups traditionally went to tape. This was disruptive to applications and tended to be slow, and recoveries could also be slow due to the time needed to locate, mount and search a tape. Tape is a serial medium; only one backup job can use a tape at a time, and it can be difficult to use up tape capacity effectively. On the plus side, it is cheap and it is easy to take data off-site for disaster recovery. Direct tape backups are suitable for moderate quantities of data where a reasonable overnight window is available.

The fastest way to take a backup is to take a disk snap-copy. It is possible to back up terabytes of data this way in a few minutes. The initial copy happens by creating pointers to the old data. This is explained in detail in the snap copy section. If requested, the snap copy software can continue to copy the data under the covers until a full physical copy of the data exists, but this operation can take several hours. Once a full copy is complete, it is possible to snap copy the data back again for fast recovery. The advantage of this process is the ability to take very fast backups of large amounts of data almost non-disruptively. The disadvantage is the cost of maintaining two sets of disks, plus the cost of snap copy software. Another disadvantage is that only one backup is possible, unless you purchase another set of disks for each backup. Thin provisioning does bring the cost down, but tape is still cheaper.
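The pointer-based initial copy described above can be sketched in a few lines. This is only a toy model, assuming a copy-on-write scheme in which an old block is preserved the first time it is overwritten; the text does not say which mechanism any particular snap copy product uses, and the `Volume` class and its method names are hypothetical.

```python
class Volume:
    """Toy block volume used to illustrate a pointer-based snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data, one entry per block
        self.snapshots = []          # each snapshot maps block index -> preserved old data

    def snap(self):
        # Taking the snapshot copies no data, which is why it is near-instant.
        snapshot = {}
        self.snapshots.append(snapshot)
        return snapshot

    def write(self, index, data):
        # Copy-on-write: preserve the old block once per snapshot, then overwrite.
        for snapshot in self.snapshots:
            snapshot.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snapshot, index):
        # Unchanged blocks are still read from the live volume via the "pointer".
        return snapshot.get(index, self.blocks[index])


vol = Volume([b"a", b"b", b"c"])
snap = vol.snap()
vol.write(1, b"B")
print(vol.blocks[1], vol.read_snapshot(snap, 1))
```

In this model the snapshot only consumes space for blocks that change after it is taken, which is also why a background full copy is needed before the snapshot can survive loss of the original disks.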

   


You can fix that last issue by snap copying data to disk, then copying the snap data off to tape at your leisure. FDRinstant is an example of this. Once the data is copied off to tape, the second-copy disks are available for another backup. This method can cope with large amounts of data quickly and can store several versions of backups, but it is expensive.

A final way is to back up the data to disk instead of tape. Cheap SATA disks can be used to make the process economical. This is not the same as snap-copy; it requires a backup product that can recognise the SATA drives as valid backup media. It also does not require a full set of disks, just enough space to hold a de-duplicated backup. Unlike tape, it is possible to multi-stream backups to the same set of disks, and also to run multiple restores from a set of disks. But what if your backup and recovery product does not support disk, or you do not want to expend the effort needed to switch your product from tape to disk? This is where a virtual tape system comes in. A Virtual Tape system actually writes data to disk, but it looks just like tape drives to your backup applications.

There are essentially three types of virtual tape library: those that eliminate tape by replacing it entirely with disk; those that augment tape by using a disk cache to virtualise the real tapes; and hybrid products that give you the option to be either disk only or to use back-end tape. From here on, these types are called VTE (Virtual Tape Elimination), VTA (Virtual Tape Augmented) and VTH (Virtual Tape Hybrid) products. Most Open Systems virtual tape systems are VTE architecture.

An advantage of virtual tape is the ability to configure multiple virtual libraries. With a single library, all hosts see one communal library. A single library is easy to manage, tends to performance tune itself and uses media more efficiently. The downside of a single library is that backups from multiple hosts are mixed together on one tape, and that may not be what you want.

Multiple virtual libraries are more flexible. You can keep backups from individual hosts on separate tapes. Multiple libraries are useful if:

  • You run more than one backup application, as you can then assign a virtual library to each backup application.
  • You have multiple SAN fabrics, as you can then configure one virtual library per SAN.
  • Your backup application does not handle SAN sharing, as you can then assign a library to each host.
  • The licensing agreement for your backup application is based on drive usage or back-end storage and you need to limit available drives or storage. If you dedicate virtual libraries to some hosts, you can optimise license fees.



Open Systems Virtual Tape Suppliers

There are a few suppliers of open systems virtual tape; the ones below are a sample. A word about VTL capacity: vendors usually quote a 'logical capacity', by which they mean the capacity you would need to hold the same data without deduplication. That is fine, except that the deduplication ratios they use to calculate it are often very optimistic.
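To see how much a quoted 'logical capacity' depends on the assumed ratio, the arithmetic can be written out as below. The function names and the sample figures (100 TB of disk, 20:1 versus 5:1) are illustrative assumptions, not vendor numbers.

```python
def logical_capacity_tb(physical_tb: float, dedup_ratio: float) -> float:
    """Logical (pre-deduplication) capacity implied by a physical capacity and a ratio."""
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return physical_tb * dedup_ratio


def physical_needed_tb(logical_tb: float, dedup_ratio: float) -> float:
    """Physical disk needed to hold a given amount of backup data at a given ratio."""
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return logical_tb / dedup_ratio


# The same 100 TB of physical disk, quoted at an optimistic and a modest ratio.
print(logical_capacity_tb(100, 20))   # logical capacity the brochure might quote
print(logical_capacity_tb(100, 5))    # what you get if your data only reaches 5:1
print(physical_needed_tb(2000, 5))    # disk needed for the brochure figure at 5:1
```

It is usually safer to size a VTL from the second function, using a ratio measured on your own data, than to rely on the logical figure.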

Hitachi Protection Platform S2750

The Hitachi Protection Platform for virtual tape was formerly called Sepaton, which is "No Tapes" spelled backward. No surprise then that a Hitachi / Sepaton VTL is VTE architecture, a disk-only device which can emulate either tape devices or Symantec's OST disk devices.
The hardware consists of a number of HP Proliant servers running a 64-bit Linux kernel. Each server component is called a node, and the nodes are coupled together into a grid solution with DeltaScale software to provide automatic performance tuning and failover. Nodes can be added to boost performance as required, and each node delivers 10TB/h, so with the maximum of 8 nodes a Sepaton S2100 can handle a backup throughput of 80TB/h.
Disk storage is provided by an HDS VSP that uses 4TB disks, configured as RAID6. New disk shelves can be added for capacity uplifts, up to a maximum capacity of 8 PB usable.
Host connectivity uses 4*16Gb Fibre Channel or 4*10Gb Ethernet, with 2*16Gb Fibre Channel connections to the disk storage.
Backup data can be segregated into storage pools to separate different kinds of data. Space reclamation is also managed by DeltaStor and runs continuously.
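The node arithmetic quoted above (10TB/h per node, 8 nodes maximum) can be turned into a rough backup-window estimate. This sketch assumes throughput scales linearly with node count, which is what the quoted figures imply but which real workloads may not achieve; the function names are hypothetical.

```python
def grid_throughput_tb_per_hour(nodes: int,
                                per_node_tb_per_hour: float = 10.0,
                                max_nodes: int = 8) -> float:
    """Aggregate throughput of a grid, assuming linear scaling per node."""
    if not 1 <= nodes <= max_nodes:
        raise ValueError(f"nodes must be between 1 and {max_nodes}")
    return nodes * per_node_tb_per_hour


def backup_window_hours(data_tb: float, nodes: int) -> float:
    """Hours needed to ingest data_tb of backup data with the given node count."""
    return data_tb / grid_throughput_tb_per_hour(nodes)


print(grid_throughput_tb_per_hour(8))   # the quoted maximum configuration
print(backup_window_hours(40, 2))       # 40 TB nightly backup on a 2-node grid
```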

DeltaStor deduplication software runs concurrently with backups and Sepaton claims that it can provide deduplication ratios up to 100:1 as it processes parallel streams.
ContentAware software checks backup data for type (Word, Excel etc.) and so picks out data likely to be good deduplication candidates. DeltaRemote software is used to maintain an offsite copy of the data, transmitting only changed data.

Dell EMC Data Domain

A VTE solution, Data Domain storage was designed to be 'the storage of last resort'; that is, the designers concentrated on ensuring that data was always valid and available, rather than making any concessions to boost performance. Extensive checksum processes guarantee that backup data is the same as that sent from the server. The architecture is 'log structured file', so updated data is always written to a new location. Contrast this with standard RAID architectures, which require that when a RAID stripe is being updated, some old data must be loaded into cache to recalculate parity, so there is always some possibility of losing older data after a power failure. Data Domain eliminates that risk, and the RAID6 implementation protects static data from 2 device failures.
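The 'write to a new location, verify with a checksum' idea can be illustrated with a toy append-only store. This is a sketch of the general log-structured technique, not of Data Domain's implementation: the in-memory list stands in for disk, CRC32 is an assumed checksum, and the class name is hypothetical.

```python
import zlib


class LogStructuredStore:
    """Toy append-only store: updates never overwrite earlier records."""

    def __init__(self):
        self.log = []     # append-only list of (payload, checksum) records
        self.index = {}   # key -> position of the latest record for that key

    def put(self, key: str, payload: bytes) -> None:
        # New and updated data always goes to the end of the log.
        self.log.append((payload, zlib.crc32(payload)))
        self.index[key] = len(self.log) - 1

    def get(self, key: str) -> bytes:
        payload, checksum = self.log[self.index[key]]
        # Verify on read so silent corruption is detected rather than returned.
        if zlib.crc32(payload) != checksum:
            raise IOError(f"checksum mismatch for {key!r}")
        return payload


store = LogStructuredStore()
store.put("backup.img", b"version 1")
store.put("backup.img", b"version 2")
print(store.get("backup.img"), len(store.log))
```

Because the first record is never modified in place, an interrupted update can at worst lose the new record, not damage the old one.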

The smaller DD2200 can scale up to 860 TB with a transfer rate of 5.6 TB/h. The larger DD9800 can hold up to 43.2 PB and will handle normal backup rates up to 31 TB/h. These rates can be more than doubled with Data Domain Boost, which is a performance enhancing add-on that offloads some of the deduplication processing onto the clients. Data Domain can also use a Cloud tier which significantly extends its capacity.

EMC claim inline deduplication ratios of between 20 and 30:1 but this is very dependent on the type of data being copied.
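Why the ratio depends so heavily on data type can be shown with a small experiment. The sketch below assumes fixed-size chunks identified by a SHA-256 hash, which is a simplification chosen for illustration; the text does not describe the chunking method any vendor uses, and `dedup_ratio` is a hypothetical helper.

```python
import hashlib


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Ratio of total chunks to unique chunks for fixed-size chunking."""
    if not data:
        raise ValueError("data must not be empty")
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    unique = {hashlib.sha256(chunk).digest() for chunk in chunks}
    return len(chunks) / len(unique)


# Highly repetitive data: 100 identical 4 KiB chunks.
repetitive = b"A" * 4096 * 100
# Data with no repeated chunks: each 4 KiB chunk carries a different counter value.
distinct = b"".join(i.to_bytes(4, "big") * 1024 for i in range(100))

print(dedup_ratio(repetitive))
print(dedup_ratio(distinct))
```

Repeated full backups of mostly unchanged file systems behave like the first case; compressed or encrypted data behaves more like the second.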

FalconStor VTL

FalconStor VTL with deduplication is a disk-based backup solution. A FalconStor VTL can be bought as a software-only option, as an integrated appliance on Dell, IBM or Hitachi hardware (servers plus storage) or as a gateway function on top of existing storage. The gateway function is appropriate for large-scale enterprise solutions and consists of multiple FalconStor VTLs in a cluster, often backed by an HDS USP storage subsystem, supporting up to 2PB capacity in a RAID6 configuration. A single-node FalconStor VTL can achieve aggregate backup speeds of 20TB/hour, and it can handle up to 160TB/hour of backup data with the maximum 8-node cluster.

The FalconStor SIR deduplication engine offers a choice of deduplication options, including inline, post-process, concurrent deduplication, or no deduplication at all.
When used with high-speed protocols such as 8Gb Fibre Channel (FC) and 10GbE iSCSI, FalconStor VTL can sustain deduplication rates of over 5TB/hour per node, and scale up to a sustained deduplication rate of 20TB/hour as cluster nodes are added.

The FalconStor VTL supports a variety of protocols, such as FC, iSCSI, and NDMP, and it provides a plug-in component for Symantec NetBackup and Backup Exec Media Servers that works with the Symantec OST API to integrate with NetBackup. FalconStor VTL supports up to 250,000 Symantec OST images per node and also offers a Fibre Channel (FC) SAN target for Symantec OST.

Enterprise-level virtual/physical tape management includes auto tape caching, tape consolidation, tape encryption, tape duplication and tape shredding.

IBM TS7650G

The TS7650G Gateway consists of 3958-DD5 hardware combined with ProtecTIER Deduplication software.
The TS7650G can be configured to run in three different modes: VTL, OST, or FSI, but only one of these modes can be configured at a time on a 3958-DD5. Clustering support is offered for VTL and OST modes.

The VTL mode scales up to 1 PB of physical storage and includes Fibre Channel ports for host and server connectivity, emulation of up to 16 virtual tape libraries and 256 virtual tape drives in a stand-alone configuration, or up to 512 virtual tape drives in a dual-node cluster.
Tape library emulation includes the IBM TS3500 with IBM LTO Ultrium 3 tape drives, the Quantum P3000 with DLT7000 or IBM LTO Ultrium 2 tape drives and the IBM DTC VTF 0100 virtual tape libraries with DLT7000 or IBM LTO Ultrium 2 tape drives.
OST configuration includes 10 Gb or 1 Gb connection on dual port or quad port adapters for NetBackup host connectivity.

The TS7650G Gateway uses inline data deduplication powered by HyperFactor™ technology and is designed for mid-sized companies.
ProtecTIER FSI implements Common Internet File System (CIFS) or Network File System (NFS), which can provide shares or exports for the hosts as backup and recovery storage.
ProtecTIER FSI is supported for backup images that are produced by backup applications, but it is not intended to be used as primary storage with deduplication.

Fujitsu ETERNUS CS800

Fujitsu has two open systems tape solutions. The ETERNUS CS800 is open systems only, while the ETERNUS CS8000 supports open systems and mainframes.

The ETERNUS CS800 is a deduplication backup appliance designed for small and midsized environments. The backend storage is disk only, with between 8 and 352 TB of usable capacity, and Fujitsu claims it can achieve up to 13 TB/hour in conventional inline target mode, without restricting backup applications or requiring additional software installation. Deduplication and replication software are included as standard. It emulates up to 160 DLT or LTO drives and can run disk backups with simultaneous NAS, VTL and OST interfaces. There is an 'export to tape' option for long-term retention.

The ETERNUS CS8000 is discussed in detail on the mainframe page, but in short it supports major Unix and Windows operating systems as well as major tape libraries. It combines VTL and the NAS option to consolidate backup, archiving, compliant archiving and second-tier file storage in one appliance.

Cybernetics' iSAN V2 Series VTL

While Cybernetics specialises in IBM iSeries servers, the iSAN V2 VTL is also designed to provide virtual tape functionality to pSeries, Windows, Linux, Unix and Mac servers. The Cybernetics VTL is a disk-based appliance with the option to offload archives to physical tape if required. The appliance uses a cache to initially store the data, which assists when the library is performing parallel operations. The iSAN V2 comes in 4 models: the V8, V12, V16 and V24. The numbers relate to the number of bays used. Cybernetics gives you the option to install SSD-only VTL models, or a mix-and-match model with SAS, SATA and SSD disks. All tape drives and libraries are supported for remote offload, and it is also possible to offload to removable USB or eSATA devices.

Connectivity can be via SCSI, SAS, HD SAS, iSCSI (1GbE/10GbE/40GbE) and FC (2G/4G/8G/16G) and multiple connections can be made to one VTL simultaneously. An older AS/400 can connect via SCSI while a newer iSeries or Power Server could connect via SAS, HD SAS and/or FC.

Deduplication is post-process and performed at the byte level. Cybernetics claims dedup ratios averaging 50:1 for Windows, and as high as 700:1 for IBM midrange devices. iSAN uses AES 256 encryption, with the option to encrypt data at rest on disks and offline data on tape. It is possible to replicate data between two iSAN devices, in which case encryption is enabled by default. Replication also uses WAN-optimising compression to save on bandwidth between devices.

It is possible to manage multiple VTLs from a single web browser window, and Cybernetics states that all their engineers are fully trained in the hardware and software; none of them are 'script readers'.
