Storage Futures

Storage management can, in a simplistic way, be broken into two disciplines: providing capacity for your customers to store their data, and providing backup and recovery services for that data in case it gets damaged. The first discipline is usually discussed in depth by anyone trying to predict the future of storage, while the second is often neglected. So, let's look at storage provision first.

FLASH

Flash storage continues to grow at the expense of spinning disk, while all those alternatives that were going to replace NAND flash, like memristors and even 3D XPoint, are still watching from the sidelines.
Seagate has produced a 60 TB NAND drive in a 3.5-inch form factor, and Samsung is making a 15 TB V-NAND drive in a smaller, 2.5-inch form factor. Both Samsung's 32 TB and Seagate's 60 TB SSDs will go into mass production sometime in 2017. These high-density SSDs are closing the purchase cost gap per terabyte with spinning disk, and when you factor in the savings in space, power and cooling, the total cost of ownership is even closer.

However, there are signs that traditional NAND storage is starting to run out of options to cram more capacity into the same space and improve performance. It looks like a new technology, phase-change RAM (PRAM), could replace NAND. At a very simplistic level, it works by switching chalcogenide glass between two states that have different electrical resistance: the amorphous state has a high resistance and represents a binary 0, while the crystalline state has a low resistance and represents a 1. The advantages of PRAM are that it is more stable than NAND, it can write data much faster, and it does not suffer as much write degradation as NAND does. Also, PRAM can change a single bit rather than a whole block, and it can change that bit faster. In other words, it is getting closer to RAM performance and usability.
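To make the read side of that concrete, here is a minimal Python sketch of how a controller might translate a measured cell resistance into a bit; the threshold figure is purely illustrative and not taken from any real device.

    # Illustrative only: map PRAM cell resistance readings to bit values.
    # The 10 kilo-ohm threshold is a made-up figure for this sketch.
    RESISTANCE_THRESHOLD_OHMS = 10_000

    def read_bit(resistance_ohms: float) -> int:
        """High resistance (amorphous) reads as 0, low resistance (crystalline) as 1."""
        return 0 if resistance_ohms >= RESISTANCE_THRESHOLD_OHMS else 1

    # A PRAM cell can be rewritten individually, unlike NAND which erases whole blocks.
    cells = [25_000.0, 1_200.0, 900.0, 18_000.0]   # hypothetical readings in ohms
    print([read_bit(r) for r in cells])            # -> [0, 1, 1, 0]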

3D XPOINT

3D XPoint was to be released in 2016 but is now planned for sometime in 2017. Invented by Intel and Micron, it is claimed to be 10 times faster and 10 times denser than NAND. Intel is not pitching it as a replacement for either flash storage or RAM, but as something that sits in between the two. They claim that the technology is not phase change, but I have seen a very persuasive argument that it could be just that. The name 3D XPoint (pronounced "cross point") refers to the crossbar structure of the wiring: layers of parallel wires, with each layer running at right angles to the one above. At each intersection of the wires there is a very small column consisting of a memory cell and a selector cell, which is used to allow read and write access to the memory cell. Access is controlled by varying the voltage the cell receives via the wires. As 3D XPoint does not require transistors to store data, it avoids most of the issues found in NAND chips.
The unknown quantity at present is price, but we can realistically expect 3D XPoint to be more expensive than NAND, for a while at least.
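To picture the crossbar layout described above, here is a small Python sketch that models a 3D XPoint-style array as cells addressed by (layer, row wire, column wire). It is a conceptual toy of the addressing scheme, not a description of Intel's or Micron's actual implementation.

    # Conceptual sketch: a crosspoint array addressed by selecting one wire in each
    # of two perpendicular wire layers, with a stack of such layer pairs.
    class CrosspointArray:
        def __init__(self, layers: int, rows: int, cols: int):
            # One memory cell sits at every intersection of a row wire and a column wire.
            self.cells = {}          # (layer, row, col) -> stored value
            self.dims = (layers, rows, cols)

        def write(self, layer: int, row: int, col: int, value: int) -> None:
            # In the real device a write voltage on the two crossing wires flips
            # the selected cell; here we simply record the value.
            self.cells[(layer, row, col)] = value

        def read(self, layer: int, row: int, col: int) -> int:
            # A lower read voltage on the same pair of wires senses the cell state.
            return self.cells.get((layer, row, col), 0)

    array = CrosspointArray(layers=2, rows=128, cols=128)
    array.write(0, 10, 42, 1)
    print(array.read(0, 10, 42))   # -> 1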

NVMe

One of the problems with producing faster storage technology is that the devices can process data faster than the existing comms channels that serve them, so the comms channels become a bottleneck. There are two angles to consider here: the interfaces that exist within a server or device, and the external connections and channels.

Internally, NVMe is an alternative to the old SCSI protocol for transferring data between hosts and peripheral storage. The NVMe specification was released in 2011 and was designed to be faster than existing protocols by streamlining the register interface and command set, reducing the CPU overhead imposed by the I/O stack. NVMe was designed specifically to support the high I/O rates demanded by faster storage technology such as PCI Express SSDs.
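Much of that streamlining comes from NVMe's paired submission and completion queues, which let many commands be in flight with very little per-command overhead. The Python sketch below is a loose conceptual model of one queue pair, not the real register-level interface.

    from collections import deque

    # Loose conceptual model of an NVMe submission/completion queue pair.
    # Real NVMe queues live in host memory and are rung via doorbell registers;
    # this sketch only illustrates the command flow.
    class QueuePair:
        def __init__(self, depth: int = 64):
            self.depth = depth
            self.submission = deque()
            self.completion = deque()

        def submit(self, command_id: int, opcode: str, lba: int, blocks: int) -> None:
            if len(self.submission) >= self.depth:
                raise RuntimeError("submission queue full")
            self.submission.append({"id": command_id, "op": opcode,
                                    "lba": lba, "blocks": blocks})

        def device_process(self) -> None:
            # The device drains the submission queue and posts completions.
            while self.submission:
                cmd = self.submission.popleft()
                self.completion.append({"id": cmd["id"], "status": "success"})

    qp = QueuePair()
    qp.submit(1, "read", lba=0, blocks=8)
    qp.submit(2, "write", lba=128, blocks=8)
    qp.device_process()
    print([c["id"] for c in qp.completion])   # -> [1, 2]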
NVMe is not common yet in storage systems, but it should take off once off-the-shelf NVMe devices support enterprise capabilities such as hot plug and dual port.
If a single-channel NVMe device is not fast enough, the M.2 device interface allows a device to connect over two or four PCIe lanes. Micron intends to produce compact M.2 storage products in a future release.
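As a rough feel for what the extra lanes buy, the sketch below multiplies an approximate PCIe 3.0 per-lane throughput (about 985 MB/s after encoding overhead) by the lane count; the figures are ballpark estimates, not vendor specifications.

    # Rough bandwidth arithmetic for an M.2 NVMe device on PCIe 3.0.
    # ~985 MB/s usable per lane is an approximation after 128b/130b encoding overhead.
    PCIE3_LANE_MB_PER_S = 985

    for lanes in (1, 2, 4):
        print(f"x{lanes}: ~{lanes * PCIE3_LANE_MB_PER_S / 1000:.1f} GB/s")
    # x1: ~1.0 GB/s, x2: ~2.0 GB/s, x4: ~3.9 GB/s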

FIBRE CHANNEL, ETHERNET and NVMe-oF

Externally, the battle between Fibre Channel and Ethernet is ongoing. Ethernet dominates storage for file-based unstructured data, while block-based structured data usually requires Fibre Channel SANs. However, file-based unstructured data is growing much faster than block-based data, and Ethernet tends to be the standard for cloud systems.
Because Ethernet is starting to dominate the market, there are barely a handful of FC networking companies left, and they all now support Ethernet as well. 32-gig FC is now available (that's a terabyte in about four minutes!) and next year the big storage vendors will probably need it to handle the performance of big, all-SSD storage subsystems. Last year we saw early 32-gig FC products hit the market, including switches from Brocade and Cisco and adapters from Broadcom and QLogic. Adoption is expected to pick up in 2017, when storage array vendors support 32-gig.
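That back-of-the-envelope claim is easy to check: taking 32 Gbit/s at face value, a terabyte moves in roughly four minutes, while the usable 32GFC line rate of about 3,200 MB/s stretches that to a little over five. A quick sketch of the arithmetic:

    # Quick check of the "terabyte in about 4 minutes" claim for 32-gig FC.
    TERABYTE_BYTES = 1_000_000_000_000

    nominal_bytes_per_s = 32e9 / 8          # 32 Gbit/s taken at face value = 4 GB/s
    usable_bytes_per_s = 3_200 * 1_000_000  # ~3,200 MB/s usable throughput for 32GFC

    print(f"nominal: {TERABYTE_BYTES / nominal_bytes_per_s / 60:.1f} minutes")  # ~4.2
    print(f"usable:  {TERABYTE_BYTES / usable_bytes_per_s / 60:.1f} minutes")   # ~5.2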
The specification for NVMe over Fabrics (NVMe-oF) was only finalised in June 2016, so this technology is a bit further away. It replaces the local PCIe transport with a network fabric and extends the distance over which NVMe hosts and NVMe storage devices can connect, enabling NVMe to run over data centre fabrics.

DISK and TAPE

Magnetic disk (HDD) and magnetic tape are the old stalwarts of storage technology. Have they reached the end of the road as far as innovation and development are concerned? Will they slowly fade away and be replaced by solid state technology and the cloud? Well, disk drives will certainly continue to lose ground to flash drives, especially as flash becomes more cost comparable. However, HDD is still roughly 10x cheaper than SSD per terabyte, and tape is still roughly 10x cheaper than HDD, so they both still have a market. It is probable that drives combining shingled recording and a helium fill will still be used for large-capacity archive storage for a few more years.
Western Digital has announced a 12 TB helium-filled HDD, planned for release in early 2017, and a 14 TB HDD that will use Shingled Magnetic Recording (SMR), planned for late 2017.
The next technology change will be heat-assisted magnetic recording (HAMR), not due until 2018 or later. HAMR, combined with bit-patterned media recording, should increase areal densities by another order of magnitude and so keep HDD cost effective.

Magnetic tape does seem to be falling behind, with the current LTO-7 tape holding just 6 TB of raw storage capacity, though there is a roadmap to LTO-10 with a 48 TB capacity. Magnetic tape is still the best product for cold archive data, as the slower access is offset by the cheaper price per GB.

The CLOUD

The vendors like to pitch the Cloud as some kind of fuzzy new data storage, where your data is held almost by magic and is always available from anywhere you can get an internet connection. Of course we storage professionals know that data in the cloud is stored on SSD, HDD and magnetic tape, just the same as any other data!

A lot of smaller businesses are now using the cloud to reduce their storage costs, often via SaaS applications.
Industries like finance and healthcare, and some public sector clients, operate in very stringent regulatory environments. Because of these concerns, large multinational companies have traditionally stored information on-premises. However, large public cloud providers such as Microsoft Azure have built data centers in several different countries and have the certified resilience and security arrangements that should satisfy those regulations. This means that the Cloud should now be suitable for most companies.
Despite this, most large enterprises tend to pursue a hybrid strategy and keep a significant amount of their storage capacity on site. Typically they keep mission-critical data in-house and use the cloud for lower-priority data like backups and long-term archives.

BACKUP and RECOVERY

What is the future for backup and recovery services? Applications can span several terabytes, and while it is possible to back this up using traditional methods from snapshots, it would take several hours to recover an application from tape. That is pretty much unacceptable for most of today's businesses. For me, the future seems to be snapshots. The EMC DMX3 storage subsystem can snapshot a whole application with a single command, maintain up to 256 of those snapshots for each application, and restore the source volumes from any one of those snapshots. Of course, most of the time you don't want to restore the whole application, just a few files, so you mount the relevant snapshot on a different server and copy over the files that you want.
If you lose the whole subsystem you lose the source data and all the snapshots, so to fix that you need a second site with remote synchronous mirroring between the two. This is not the future, of course; you can do all this now. I think the future is that backup and recovery applications will start to recognise that they do not need to move data about to create backups, but can use snapshots and mirrors as backup datastores. The role of the software would then be to manage all that hardware and maintain the necessary catalogs that refer to the backups contained in all these snapshots, so the storage manager can easily work out what backups are available and recover from them with simple commands.
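As an illustration of what such catalog-driven software might look like, here is a minimal Python sketch of a snapshot catalog that records which snapshots exist for each application and answers "what can I restore from?". The class and method names are my own invention, not any vendor's API.

    from dataclasses import dataclass, field
    from datetime import datetime

    # Hypothetical sketch of a snapshot catalog: the backup software only records
    # metadata about array-side snapshots instead of copying the data itself.
    @dataclass
    class SnapshotRecord:
        application: str
        snapshot_id: str
        taken_at: datetime
        location: str            # e.g. "array-A" or "mirror-site-B"

    @dataclass
    class BackupCatalog:
        records: list = field(default_factory=list)

        def register(self, record: SnapshotRecord) -> None:
            self.records.append(record)

        def available_restores(self, application: str) -> list:
            """List recovery points for an application, newest first."""
            points = [r for r in self.records if r.application == application]
            return sorted(points, key=lambda r: r.taken_at, reverse=True)

    catalog = BackupCatalog()
    catalog.register(SnapshotRecord("payroll", "snap-0041",
                                    datetime(2017, 1, 3, 2, 0), "array-A"))
    catalog.register(SnapshotRecord("payroll", "snap-0042",
                                    datetime(2017, 1, 4, 2, 0), "mirror-site-B"))
    for point in catalog.available_restores("payroll"):
        print(point.snapshot_id, point.taken_at, point.location)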

We may see standard backups using the cloud as a longer-term repository, with on-site snapshots retained for large application restores or user-error restores from recent backups. As more important data is moved to the cloud, it too needs to be protected. Cloud-to-cloud backup, where data is copied from one cloud service to another, will be important in 2017. Backup vendors will need to add cloud-to-cloud capabilities to satisfy this requirement; specifically, they will need tools to back up and restore applications within the cloud.

CONTAINER VIRTUALISATION

A new product has appeared in the last year or so that may well bring some new storage challenges. Docker containers are based on virtualisation, but while VMware virtualises complete servers with their operating systems, Docker just virtualises complete applications on the same operating system. Containers themselves have been around for a while; they share the operating system and hardware, and so are much more efficient than virtual servers running on hypervisors. Docker containers seem to be taking off and should be watched in 2017.

Docker is mainly used by developers and testers, as they can easily, quickly and cheaply create several copies of applications, all running under the same operating system. This used to be restricted to Linux, but containers can now run on Windows too. The good news is that Docker does not require a complete set of data for each application copy; it uses an overlay file system to implement a copy-on-write process, so all copies of an application share the same base set of data and only store independent copies of any updated data. If a Docker container is deleted, then by default all changed data belonging to that copy is deleted too.
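The copy-on-write idea itself is simple, and the Python sketch below models it with two dictionaries: a shared read-only base layer and a per-container writable layer. This is only a conceptual toy, not how Docker's overlay driver is actually implemented.

    # Conceptual toy of copy-on-write layering: a shared read-only base plus a
    # per-container writable layer that only holds changed files.
    class ContainerFilesystem:
        def __init__(self, base_layer: dict):
            self.base = base_layer     # shared by every container using this image
            self.upper = {}            # this container's private changes

        def read(self, path: str) -> str:
            # Reads fall through to the base layer unless the file was modified.
            return self.upper.get(path, self.base.get(path, ""))

        def write(self, path: str, data: str) -> None:
            # Writes never touch the shared base; they land in the private layer.
            self.upper[path] = data

    image = {"/etc/app.conf": "mode=default", "/bin/app": "<binary>"}
    c1, c2 = ContainerFilesystem(image), ContainerFilesystem(image)
    c1.write("/etc/app.conf", "mode=test")
    print(c1.read("/etc/app.conf"))   # -> mode=test    (private change)
    print(c2.read("/etc/app.conf"))   # -> mode=default (base is untouched)
    # Deleting a container discards its upper layer, i.e. all of its changes.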

However, a Docker container can use persistent storage resources in the form of Docker volumes and data containers, and these can be shared between containers. The problem here is that if volumes are shared, there is no data integrity protection, so file locking needs to be managed by the containers themselves. At present Docker does not provide snapshots or replication, so backup and recovery have to be done at the host level. Also, all Docker volumes are stored by default in the /var/lib/docker directory, which can become a capacity and performance bottleneck.
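Since the containers themselves have to arbitrate access to a shared volume, something like advisory file locking is the usual answer. The Python sketch below uses fcntl.flock on a file in a shared volume; the /data/shared path is just an example mount point I have assumed, not a Docker default, and the approach only works where the containers see the same underlying filesystem.

    import fcntl

    # Hedged sketch: cooperating containers serialise writes to a shared volume
    # with an advisory lock. "/data/shared" is a hypothetical volume mount point.
    SHARED_FILE = "/data/shared/orders.log"

    def append_record(record: str) -> None:
        with open(SHARED_FILE, "a") as f:
            fcntl.flock(f, fcntl.LOCK_EX)     # block until we hold the exclusive lock
            try:
                f.write(record + "\n")
                f.flush()
            finally:
                fcntl.flock(f, fcntl.LOCK_UN) # always release, even on error

    append_record("order-1234 accepted")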