Storage Futures

Coronavirus

So far this year, the biggest change in our lives has been the Coronavirus. May I offer my deepest sympathy to anyone who has lost family, friends or loved ones to this virus.
It seems that we are starting to get control of the disease in South East Asia, Australasia and Europe, while things are not looking so good in the Americas, Africa and India. Hopefully we will beat the virus everywhere in the coming months. However, the next problem will be how to cope with the aftermath of the virus, as the world economy is taking a beating. How could this affect the data storage industry?

One major change over the last few months has been the number of people working from home. There is a definite perception that this will continue in future and that a fully staffed office will be a thing of the past. I've been able to work from home for some years now. I'm lucky enough to have a large garden, and have built a big office shed in there, with power and heating, and I pick up the wireless broadband from the house. With a setup like this, when I go out of the house with my coffee in the morning, I'm going to work. I'd never get that feeling working in the living room with a laptop balanced on my knee. While I get more work done at home, I do miss the company and banter of an office, and like to get into the office once a week if I can. The point is, if this is to become the new normal, you need to organise an office space for yourself if possible. You will save on transport costs and help the environment too.

The worldwide lockdown that has been enforced to control the virus has stalled the economy, and while our leaders are trying to get it started again, it will be depressed for several months to come. There will be an impact on the industry, with less money for upgrades and more pressure on cost savings. The onus will be on us to work smarter and keep costs down, while still providing a world-class service.

The other drivers for change in 2020 continue to be data growth, cost containment and regulatory compliance. The disruptors that are changing the storage world are:
Hardware management focus is changing from storage to data. We are moving away from the traditional simple data storage systems to intelligent systems that are data aware. They have built-in analytics functions that provide real-time information on data and performance.
The shift from spinning disk to flash storage continues, with storage systems either all-flash or hybrid systems that automatically move active data to flash storage. The cost differential between spinning disk and flash continues to erode and is now 10x or less, making flash very economical for performance-demanding workloads.
The market is moving from proprietary hardware systems to software-based systems that use commodity hardware. This simplifies data migration and in fact removes the need for forklift upgrades. Software-based storage can run onsite, in the cloud, or a bit of both.
Data management focus is moving from managing data to managing files. As the sheer number of files grows, legacy storage management systems struggle to cope. Modern storage management products are morphing to find ways to manage the billions of files created in today's digital and Internet-connected world.

The storage management discipline can be split up into three areas:

  1. adding more capacity for your customers to store their data
  2. installing products and processes for managing that data, for example backup and recovery services
  3. all-singing, all-dancing solutions where you buy storage as a service, with the capacity and management rolled in together

Adding Capacity

The biggest upcoming change for capacity provision will be storage automation. An authorised user will be able to request and obtain (and pay for) capacity on demand, with automation doing all the provisioning in the background. The requestor will select the storage service they need based on some simple criteria, like capacity, resilience, cost and performance, with the data management, mirroring and backups handled automatically. This could well happen in a local data center, and is already happening for applications based in the Cloud.
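To make that concrete, here is a minimal sketch of what such a self-service request might look like; the service class names, cost figures and mapping rules are invented for illustration, not taken from any real product.

    from dataclasses import dataclass

    @dataclass
    class StorageRequest:
        capacity_gb: int
        performance: str        # "high", "standard" or "archive"
        resilience: str         # "mirrored" or "single-site"
        max_cost_per_gb: float  # what the requestor is prepared to pay

    def select_service(req: StorageRequest) -> str:
        """Map the simple user criteria onto a pre-defined service class."""
        if req.performance == "high" and req.max_cost_per_gb >= 0.30:
            tier = "all-flash"
        elif req.performance == "archive":
            tier = "object-archive"
        else:
            tier = "hybrid"
        # mirroring and backup policy are applied automatically in the background
        protection = "sync-mirror" if req.resilience == "mirrored" else "daily-snapshot"
        return f"{tier}/{protection}"

    request = StorageRequest(capacity_gb=500, performance="high",
                             resilience="mirrored", max_cost_per_gb=0.40)
    print(select_service(request))   # -> all-flash/sync-mirror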

If you decide to purchase capacity yourself rather than using the Cloud, then you need to consider the trade-off between capacity, performance and cost. High speed storage usually comes in smaller increments and is more expensive. The trick is to make sure that the data that needs the best performance is on that high speed storage, while older data can sit on cheaper media. The future of primary storage is flash disk connected by NVMe.
Flash storage continues to be the favourite for fast access, and is rapidly replacing hard disk drives. Every vendor supplies all-Flash arrays now, and most will supply hybrid mixed disk and flash systems, but there are very few storage systems now that only host hard disk drives. 3D NAND will continue to push prices down in 2020 as 96-layer devices slowly enter the market. With multi-terabyte Flash drives already in production, the disk drive future looks very uncertain. However, it looks like Flash storage itself is approaching the limits of its development, and new technologies are planned to supplement it.

Traditional SAN and NAS arrays were designed for a world without virtualization. Storage subsystems are designed to recognise sequential IO and pre-load the next set of data into a cache to speed up retrieval. However, once you have a single physical server that is hosting hundreds of independent virtual machines, all sharing a limited number of LUNs, then these caching algorithms break down. All the IO from the VMs is merged, causing the so-called 'IO Blender effect', where IO operations from multiple VMs are mixed together and sent to the storage subsystem. IO which was once sequential gets mixed up with other IO and effectively becomes random. This can be a killer for physical disk, although Flash storage can cope better with blended IO. We can expect to see new storage subsystems optimised for flash storage and virtualisation in 2020.
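As a rough illustration of why the blending matters, the little simulation below (with invented figures) interleaves the purely sequential IO streams of many VMs and measures how little sequentiality survives in the merged stream that the array actually sees.

    # A toy simulation of the 'IO Blender' effect: each VM issues purely
    # sequential block addresses, but once the hypervisor merges the streams
    # onto shared LUNs, almost no sequentiality is left.
    import random

    def vm_stream(start_block, length):
        return [start_block + i for i in range(length)]

    def sequential_fraction(stream):
        seq = sum(1 for a, b in zip(stream, stream[1:]) if b == a + 1)
        return seq / (len(stream) - 1)

    # 100 VMs, each working through its own region of the address space
    vms = [vm_stream(start_block=i * 1_000_000, length=1000) for i in range(100)]
    print(sequential_fraction(vms[0]))            # 1.0 - each VM is fully sequential

    # interleave the streams as the hypervisor merges them
    merged = [blk for group in zip(*vms) for blk in random.sample(group, len(group))]
    print(round(sequential_fraction(merged), 3))  # close to 0.0 - effectively random
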
Storage Acceleration products can assist with the IO Blender effect. They involve the use of specialized hardware to offload various storage-related operations from the host system. They enable a block storage subsystem to create a full copy of a data store without the need for a host ESXi system to first read or write data. This reduces the amount of traffic on the storage network, as well as the amount of work that the host system must perform to create a data copy for tasks such as vMotion.

The fastest storage out there is DRAM, on the PC motherboard. DRAM is fast, volatile and expensive. Storage class memory (SCM) products lie somewhere between DRAM and flash. They are typically slower and cheaper than DRAM, but faster and more expensive than flash. There are a number of SCM products on the horizon, and one which is available now: Intel's Optane. Optane uses phase-change memory, or PRAM. At a very simplistic level, this works by changing the state of chalcogenide glass, as the two states have different electrical resistance. Optane drives became available in 2019, but sizes are small at 280GB or 480GB. They are currently used as a high speed cache between the processor and storage, whether Flash or HDD. Optane has some way to go to supplant Flash, but we can expect it to be more prevalent in 2020, and hopefully the cost will come down too.

Other potential SCM products are aimed closer to DRAM, but are worth mentioning briefly.
Nanotube RAM
NRAM uses Carbon Nano Tubes (CNTs) in a fabric, where the resistive state of the fabric can be high or low, and so represent a binary '0' or '1'. The CNT fabric is formed on top of a silicon substrate that contains the transistors and diodes that are needed to read and alter the resistive states.
Nantero from Massachusetts has developed NRAM products, including a multi-Gb DDR4-compatible nonvolatile standalone memory product and a standalone chip designed as a cache for SSDs or HDDs. The storage is non-volatile, which removes the need for battery backup.

Resistive RAM
Resistive Random-Access Memory, or ReRAM, is claimed to be about 1000 times faster and to use 1000 times less power than Flash. One proponent of ReRAM is the Israeli company Weebit. They use two metallic layers with a Silicon Oxide layer sandwiched between them. Weebit then applies a positive forming voltage to each cell, which forms a conducting filament through the SiO layer. This low-resistance state can then be switched high, and reverted to low again, by applying negative and positive voltages to the cell as required.
Weebit Nano announced in August 2019 that they had signed a Letter-of-Intent with the Chinese firm XTX Technology to co-operate in investigating ways in which XTX can use Weebit’s technology in its products.

Spin Transfer Torque RAM
STT-RAM, or MRAM, is about as fast and expensive as DRAM, but its advantage is that it is persistent, so the data therein is preserved at power-down. The technology uses three layers: a lower magnetic 'fixed layer', a metallic 'tunnel junction', and an upper magnetic 'free layer'. Electrons have an angular momentum property called 'spin', which can be 'spin up' or 'spin down'. In a normal beam of electrons these two states cancel out, so the beam is unpolarised. If a current is passed through the fixed layer, this current can become spin polarised. If this current is then directed through the tunnel layer to the free layer, it can flip the magnetic orientation of this layer.
This is more than just theory. Everspin from Arizona produces STT-RAM products, including the 1Gb EMD4E001G.

'Over the Horizon' Technologies

There are plans to store data on DNA strands, which would be at molecular level. This promises very high capacity density, but if it ever becomes a mainstream storage product, then a lot more development is needed. CATALOG Technologies is developing an implementation that uses standard DNA building blocks, or pre-made DNA molecules, which they say is a faster and cheaper way of building a data block than assembling the molecules individually as required. One to watch, but unlikely to surface until the late 2020s.

Quantum storage is another long-term possibility: data stored at the atomic level using light particles, with instant remote mirroring, and almost hack-proof because it uses a quantum key for encryption.
Quantum computers exist now (IBM has one online), but they are some way from being a commercial proposition yet.

NVMe

One of the problems with producing faster technology is that the storage devices can process data faster than the existing comms channels that serve them, so the comms channels become a bottleneck. I'm certainly hearing this from Mainframe performance experts, who tell me that Flash drives have resolved the problems with disk performance, but the FICON channels are now the problem. There are two angles to consider here, the interfaces that exist within a server or device, and the external connections and channels.

Internally, PCI Express, or PCIe, provides the fast interface, and NVMe is an alternative to the old SCSI protocol for transferring data between hosts and peripheral storage. NVMe was designed specifically to support the high IO rates demanded by PCIe-connected Flash storage, and it is generally accepted that NVMe drives will eventually supersede SATA SSDs.
There are physical limits to how fast you can send a signal down a wire, but what you can do is use more wires in parallel. The M.2 PCIe device interface allows a device to connect to 2 or 4 PCIe lanes, and as this scales up it should cope with the fastest Flash storage transfer rates, but this is a motherboard connection. It should resolve the comms issues inside the storage device, but will not help with the external channels.
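Some rough arithmetic shows why the lane count matters. The per-lane figures below are the commonly quoted theoretical rates for PCIe 3.0 and 4.0, set against the SATA III ceiling; real-world throughput will always be somewhat lower.

    # Rough interface arithmetic, assuming commonly quoted per-lane figures:
    # ~0.985 GB/s per PCIe 3.0 lane and ~1.97 GB/s per PCIe 4.0 lane,
    # against SATA III's ~0.6 GB/s ceiling.
    PCIE3_PER_LANE_GBS = 0.985
    PCIE4_PER_LANE_GBS = 1.97
    SATA3_GBS = 0.6

    for lanes in (2, 4):
        print(f"M.2 PCIe 3.0 x{lanes}: ~{PCIE3_PER_LANE_GBS * lanes:.1f} GB/s")
        print(f"M.2 PCIe 4.0 x{lanes}: ~{PCIE4_PER_LANE_GBS * lanes:.1f} GB/s")
    print(f"SATA III ceiling:  ~{SATA3_GBS:.1f} GB/s")
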
The next stage is NVMe over Fabrics (NVMe-oF), which is intended to improve data transfer between host computers and target storage systems. At the moment, NVMe-oF transports include remote direct memory access (RDMA) over Converged Ethernet (RoCE) and Fibre Channel (NVMe-FC). All-flash array vendors such as Kaminario, Pure Storage and Western Digital's Tegile moved to use NVMe-oF as a back-end fabric in 2019. The expectation is that NVMe will expand further in 2020, making inroads into storage systems, servers, and SAN fabrics. Initially it is expected that NVMe-oF will just be used on a small scale, connecting components in maybe a couple of racks, but eventually it will probably supplant SCSI as the connectivity protocol of choice.

DISK and TAPE

Magnetic disk is not completely dead; two technology improvements exist which could extend its life a bit. Heat-assisted magnetic recording (HAMR) and microwave-assisted magnetic recording (MAMR) both use techniques to persuade the magnetic domains to change polarity faster, by using a laser (HAMR) or a microwave generator (MAMR) in the write head. This allows the data density on a disk to improve by a factor of two or four, so reducing the cost per terabyte.
Magnetic tape does seem to be falling behind, with the current LTO-8 tape holding just 12TB raw storage capacity, though there is a roadmap to LTO-10 with a 48TB capacity. Magnetic tape is still the best product for cold archive data, as the slower access is offset by the cheaper price per TB. Tape backups are reliable and inexpensive, and once the data is written to tape, it cannot be altered in situ, making it the best solution for recovery from ransomware attacks.

Blockchain Storage

Imagine a scenario where your data is stored on dozens of individual storage units around the globe, accessible via the internet, but with no central control point. The starting point for understanding this is a blockchain.

A blockchain is a distributed ledger or database that records transactions between two or more parties and maintains details about each transaction, where each transaction is added to the ledger in chronological order. The data is stored as a series of blocks and each block references the preceding block to form an interconnected chain. This looks like a single point of failure (SPOF) at first sight, but the ledger is distributed across multiple nodes, with each node maintaining a complete copy. As every node has a copy of the ledger, and has full access to add and verify blocks, there is no need for a central authority or third-party verification service. Blockchain also allows parties to trace assets back to their origin, and allows two or more parties who don't know each other to safely interact in a digital environment and exchange value without the need for a centralized authority.
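As a toy illustration of the chaining idea, the sketch below builds a few blocks where each one stores the hash of its predecessor, so any attempt to rewrite history is immediately detectable. It deliberately omits consensus, signing and everything else a real blockchain needs.

    # A toy chain: each block records the hash of the previous block, so
    # altering any block breaks verification. Not a real blockchain.
    import hashlib, json, time

    def block_hash(block):
        body = {k: block[k] for k in ("prev_hash", "timestamp", "payload")}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def make_block(prev_hash, payload):
        block = {"prev_hash": prev_hash, "timestamp": time.time(), "payload": payload}
        block["hash"] = block_hash(block)
        return block

    def verify(chain):
        return all(
            blk["hash"] == block_hash(blk) and blk["prev_hash"] == chain[i - 1]["hash"]
            for i, blk in enumerate(chain) if i > 0
        )

    chain = [make_block("0" * 64, "genesis")]
    chain.append(make_block(chain[-1]["hash"], "payment: A to B"))
    chain.append(make_block(chain[-1]["hash"], "payment: B to C"))
    print(verify(chain))              # True
    chain[1]["payload"] = "tampered"  # rewrite history in one node's copy
    print(verify(chain))              # False: block 1 no longer matches its hash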

So a blockchain can be used to store data in a distributed and geographically dispersed way, where the data is stored across blockchain nodes. The basic process goes like this (a minimal sketch follows the list):

  1. the storage subsystem breaks the data up into blocks, or 'shards'
  2. it then encrypts each shard, based on a key provided by the data owner
  3. it generates a unique hash value for each shard, and records it in the shard metadata and the ledger
  4. it creates redundant copies of each shard, with the number of copies and locations controlled by the data owner
  5. a P2P network then distributes the replicated shards to several storage nodes, which can be distributed either regionally or globally for data resilience. It is expected that the nodes will be owned by various organisations or even individuals, who will lease out the storage space. However, the data will be spread between several storage owners, so only the content owners have full access to all their data
  6. finally, the storage subsystem records the transactions in the blockchain ledger, then synchronises the ledger over all the participating nodes
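The sketch below walks through those steps in simplified form. The XOR 'encryption' is only a placeholder for a real cipher, and the node names and shard size are invented; it is meant to show the shape of the pipeline rather than a working product.

    # A minimal sketch of the shard / encrypt / hash / replicate pipeline.
    import hashlib
    from itertools import cycle

    def shard(data: bytes, shard_size: int):
        return [data[i:i + shard_size] for i in range(0, len(data), shard_size)]

    def pseudo_encrypt(piece: bytes, key: bytes) -> bytes:
        # placeholder only: XOR with a hash-derived keystream, not real encryption
        keystream = hashlib.sha256(key).digest()
        return bytes(b ^ k for b, k in zip(piece, cycle(keystream)))

    def store(data: bytes, key: bytes, nodes, copies=3, shard_size=4096):
        ledger = []
        for seq, piece in enumerate(shard(data, shard_size)):
            sealed = pseudo_encrypt(piece, key)
            digest = hashlib.sha256(sealed).hexdigest()
            placed = [nodes[(seq + c) % len(nodes)] for c in range(copies)]
            ledger.append({"shard": seq, "hash": digest, "nodes": placed})
        return ledger   # in a real system this is synchronised to every node

    nodes = ["node-eu-1", "node-us-1", "node-ap-1", "node-eu-2"]
    for entry in store(b"x" * 10000, key=b"owner-secret", nodes=nodes):
        print(entry)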

Enterprise blockchains today take a practical approach and implement only some of the elements of a complete blockchain by making the ledger independent of individual applications and participants and replicating the ledger across a distributed network to create an authoritative record of significant events. Everyone with permissioned access sees the same information, and integration is simplified by having a single shared blockchain. Consensus is handled through more traditional private models. Gartner predicts that by 2021, most private and permissioned blockchain uses will be replaced by ledger DBMS products.
We might see blockchain storage taking off in 2020, although there are some reservations about internet bandwidth. The advantage of using blockchain storage should be that because it is based on blockchain technology, it is verifiable, traceable, tamper-proof and controlled by the data owner.

The CLOUD

The Cloud can be defined as a network of remote servers hosted on the Internet to store, manage, and process data, rather than a local server or a personal computer. A lot of smaller businesses are now using the cloud to reduce their storage costs, often using SaaS applications. A lot of the pressure to move to the Cloud comes from CEOs or other senior management, rather than being driven by the IT department. One reason for this pressure is that as data moves to the Cloud, the cost moves from CapEx to OpEx, which is always preferred by accountants.

As the business usage of the Cloud matures, consistent management of data between Clouds and data movement between Clouds will become critical. Most companies now use more than one Cloud provider, but it is important to make sure that specific applications are held on the most cost-effective Cloud platform.
Many companies have on-premises private clouds, but your biggest challenge might be getting your data back out of a public Cloud if you need to. If you use a Cloud provider, find out what options you have to extract your data. What you need is a storage management tool that can transparently move data from on-premises configurations to public clouds and across private cloud deployments. You can then benefit from the performance advantages of a private cloud, but also the savings public clouds drive for backup and archival data.
The other requirement is policy-based data management with a common set of rules for data retention, protection and access control over different Clouds. However the problem with implementing this is that different Cloud providers use different semantics and formats for their cloud object stores.
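One way to picture such a policy layer is a single, provider-neutral policy definition with a thin adapter per Cloud. The sketch below is purely illustrative; the provider names and option fields are made up, not real cloud APIs.

    # A sketch of one common data policy rendered into provider-specific
    # settings; 'cloud-a', 'cloud-b' and their option names are invented.
    from dataclasses import dataclass

    @dataclass
    class DataPolicy:
        retention_days: int
        versioning: bool
        encrypt_at_rest: bool
        allowed_regions: tuple

    def render_for_provider(policy: DataPolicy, provider: str) -> dict:
        """Translate one neutral policy into a provider's own semantics."""
        settings = {
            "retention_days": policy.retention_days,
            "versioning": policy.versioning,
            "encryption": "provider-managed" if policy.encrypt_at_rest else "none",
            "regions": list(policy.allowed_regions),
        }
        # each provider expresses retention differently, so an adapter is needed
        if provider == "cloud-a":
            settings["lifecycle_rule"] = f"expire-after-{policy.retention_days}d"
        elif provider == "cloud-b":
            settings["object_lock_days"] = policy.retention_days
        return settings

    policy = DataPolicy(retention_days=2555, versioning=True,
                        encrypt_at_rest=True, allowed_regions=("eu-west", "eu-central"))
    for provider in ("cloud-a", "cloud-b"):
        print(provider, render_for_provider(policy, provider))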

Enter multi-cloud data management products like Vizion.ai, Swiftstack, Scality Zenko, NooBaa, Rubrik Polaris GPS and Cohesity Helios. These products are intended to fix the issues above. They provide a mixture of: a single namespace for the stored files or objects, search and analysis facilities, global policy management and migration facilities. You would need to check out individual products to see what each one delivers.
One thing multi-cloud data management tools could do better is to make performance recommendations, and recommendations for tiering and usage levels across cloud providers. That analytical element may be the next step in multi-cloud data management.
Distributed cloud refers to the distribution of public cloud services to locations outside the cloud provider’s physical data centers, but which are still controlled by the provider. In distributed cloud, the cloud provider is responsible for all aspects of cloud service architecture, delivery, operations, governance and updates. The evolution from centralized public cloud to distributed public cloud ushers in a new era of cloud computing. Distributed cloud allows data centers to be located anywhere. This solves both technical issues like latency and also regulatory challenges like data sovereignty. It also offers the benefits of a public cloud service alongside the benefits of a private, local cloud.

Getting data in and out of the cloud can take some time: seconds for small amounts of data, and hours for Big Data. This is beginning to become a problem, especially for the Internet of Things. Enter Edge computing and Fog computing. You can wait a second or two for a response from Amazon Echo, but we want a device like a driverless car to respond instantly. For this to happen you need processing power and storage on the device itself, or at the 'edge'.

If you consider the internet to be a bit like a spider's web, then the Cloud would be the computing and storage 'spider' in the center, and the web extends out to all the connected things. The idea behind Edge computing is that storage and processing is provided at the 'edge' of the web, to reduce the amount of raw data that would need to be passed over the web, and speed up processing for the things. Now pardon me for being a cynic, but is that not just the way things used to work before the Cloud came along? Could it be that the cloud was over-hyped, and cannot provide fast enough response times for many applications? But rather than admit they got things wrong, the vendors and planners have to come up with a new term for computing outside the Cloud, so let's call it the Edge.
Fog computing provides the same functionality, but could be a little nearer the users than the Edge, or it could be the same as the Edge; as yet there is no agreed definition. Edge devices can be anything from small, low-cost cluster hardware in an SME to server farms with clustering and large-scale storage networks in a very large corporation.
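A simple way to see the data-reduction argument is an edge device that summarises its own raw sensor readings locally and only forwards a compact summary, or an immediate alert, to the central cloud. The thresholds and figures below are invented for illustration.

    # A toy example of edge pre-processing: summarise a minute of raw sensor
    # readings locally rather than shipping every reading to the cloud.
    from statistics import mean

    RAW_READINGS_PER_MINUTE = 600   # e.g. a 10Hz sensor
    ALERT_THRESHOLD = 90.0          # invented limit for this example

    def summarise(window):
        """Reduce one minute of raw readings to a few numbers."""
        return {"min": min(window), "max": max(window), "avg": round(mean(window), 2)}

    def edge_process(window):
        summary = summarise(window)
        if summary["max"] > ALERT_THRESHOLD:
            return {"type": "alert", **summary}    # forward immediately
        return {"type": "summary", **summary}      # forward on a slow schedule

    window = [20.0 + (i % 7) * 0.5 for i in range(RAW_READINGS_PER_MINUTE)]
    print(edge_process(window))
    print(f"Data reduction: {RAW_READINGS_PER_MINUTE} readings -> 1 message")
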
Gartner states that by 2023 there could be more than 20 times as many smart devices at the edge of the network as in conventional IT roles. Edge processing is expected to grow further in 2020, which will mean that companies must provision and manage data storage for these devices. If your cloud, public or private, spans multiple cities or countries, then this could be a challenge.

Containers

Containers have been around for several years, but I suspect that increased home working will make them more prevalent. So what is a Data Container? "A container is an application, including all its dependencies, libraries and other binaries, and the configuration files needed to run it, bundled into a single package that can be moved, in total, from one computing environment to another."

So imagine that you are a developer working from home on a Windows 10 laptop. You are developing a critical application that will eventually run on a Windows 2016 server. The machines are running slightly different operating systems, with different fix levels, so how can you be certain that when you upload your changes to the server, they will work correctly? This is a critical application; you can't just try it and see what happens. This is where a container comes in. You can develop that application in a container on your laptop, and be certain that when you upload that container to the server, it will work in exactly the same way. The container can be moved to a testing server, a production server, or a VM in the Cloud and it will behave in exactly the same way.
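If you want to get a feel for that workflow from Python, the Docker SDK (pip install docker) is one way in. The sketch below assumes a running Docker daemon and a Dockerfile in the current directory; the image tag and port mapping are just examples, not a recommended setup.

    # A sketch using the Docker SDK for Python; assumes a running Docker
    # daemon and a Dockerfile in the current directory.
    import docker

    client = docker.from_env()

    # build the application image from the local Dockerfile
    image, build_logs = client.images.build(path=".", tag="myapp:dev")

    # run the same image locally, exactly as it will run on the target server
    container = client.containers.run("myapp:dev", detach=True,
                                      ports={"8080/tcp": 8080})
    print(container.short_id, container.status)

    # the identical image can later be pushed to a registry and started on a
    # test server, a production server, or a cloud VM with the same behaviour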

Data containers are not the same as server virtualisation. A VM is a complete virtual server, including the operating system, whereas several Containers can share the same operating system. This means that Containers use resources more efficiently, at the cost of some loss in security. Some Containers use transient storage, so when a Container is deleted, all its data is deleted too. Others use persistent storage in various formats.
So how do we manage these containers? How about backup and recovery? Containers might be an excellent idea for DevOps, but what if they start becoming the next best thing for production? The problem is that in the original design, Containers were only meant to hold transient data, and the concept of persistent data has been bolted on later. This gives us problems with identifying the data and working out which application it belongs to, but as Containers become ever more prevalent, it is an issue that we will have to resolve securely.

Data Management Products

Backup and Recovery

What is the future for backup and recovery services? Applications can span several terabytes, and while it is possible to back this up using traditional methods from snapshots, it would take several hours to recover an application from tape. That is pretty much unacceptable for most of today's businesses. For me, the future seems to be snapshots. The EMC DMX3 storage subsystem can snapshot a whole application with a single command, maintain up to 256 of those snapshots for each application, and restore the source volumes from any one of those snapshots. Of course, most of the time you don't want to restore the whole application, just a few files. So you mount the relevant snapshot on a different server and copy over the files that you want.
If you lose the whole subsystem you lose the source data and all the snapshots, so to fix that you need a second site with remote synchronous mirroring between the two. This is not the future of course; you can do all this now. I think the future is that backup and recovery applications will start to recognise that they do not need to move data about to create backups, but can use snapshots and mirrors as backup datastores. The role of the software would then be to manage all that hardware and maintain the catalogs that refer to the backups contained in all these snapshots, so the storage manager can easily work out what backups are available and recover from them with simple commands.
These snapshots and replicas can be used as read only data for development and testing, and even for updates once they are no longer required for backups. This means the extra capacity required for these copies is not an overhead, but can be used as an asset.
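The sort of catalog such software would keep might look like the sketch below: it moves no data at all, it simply records which snapshot covers which application at which point in time, so a restore point can be found with a simple query. The structure and names are invented for illustration.

    # A sketch of a snapshot catalog: metadata only, no data movement.
    from datetime import datetime, timedelta

    catalog = []   # one entry per snapshot taken

    def record_snapshot(application, snapshot_id, taken_at):
        catalog.append({"app": application, "snapshot": snapshot_id, "taken_at": taken_at})

    def restore_candidates(application, before=None):
        """List snapshots for an application, newest first, optionally before a time."""
        hits = [e for e in catalog
                if e["app"] == application and (before is None or e["taken_at"] <= before)]
        return sorted(hits, key=lambda e: e["taken_at"], reverse=True)

    now = datetime.now()
    for hours_ago in (48, 24, 12, 1):
        record_snapshot("payroll", f"snap-{hours_ago:03d}", now - timedelta(hours=hours_ago))

    # which backups could recover payroll to a point before yesterday's batch run?
    for entry in restore_candidates("payroll", before=now - timedelta(hours=20)):
        print(entry["snapshot"], entry["taken_at"].isoformat(timespec="minutes"))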

We may see standard backups using the cloud as a longer-term repository, with on-site snapshots retained for large application restores or user-error restores from recent backups. As more important data is moved to the cloud, it too needs to be protected. Cloud-to-cloud backup, where data is copied from one cloud service to another cloud, will be important in 2020. Backup vendors will need to add cloud-to-cloud capabilities to satisfy this requirement. Specifically, they will need to add tools to back up and restore applications within the cloud.
Beware of vendors who tell you to just move your data to the cloud, and then backup and recovery is sorted as the data is replicated between dispersed data centres. Remember that if the data is replicated in real time (synchronous), then deletes and data corruption will be replicated too. Ask your cloud provider how they cope with this, and also how they cope with recovering versions of data from previous days.

Ransomware protection

Ransomware is malware picked up from an infected email attachment or website. It then encrypts your data and demands money for the decryption key. Ransomware attacks such as WannaCry and Petya have been big news in the last year or so. Victim organisations have two choices: pay the ransom or take a lot of downtime while fixing the problem. Many companies emphasise education, informing all employees of the risks and warning them not to open unsolicited attachments.
However, Backup and Recovery vendors are now adding ransomware protection to their products and this will continue in 2020. Your backup and recovery product can help in various ways; by detecting suspicious application behavior before files are corrupted, with ransomware monitoring and detection tools, or by using predictive analytics to determine the probability that ransomware is operating on a server. Companies that are doing this now include Acronis, Druva, Unitrends and Quorum and more will surely follow in 2020. Of course, don't overlook tape. Tapes are 'write once read many', so once a tape backup is created, it cannot be encrypted by an outside agency.
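To give a flavour of how detection might work, here is a toy heuristic: flag activity when far more files than usual are being rewritten and the rewritten data looks like random bytes, as encrypted files do. The thresholds are invented and real products are far more sophisticated.

    # A toy ransomware heuristic: unusually high file-change rate combined
    # with high-entropy (random-looking, i.e. encrypted) writes.
    import math, os
    from collections import Counter

    def entropy(data: bytes) -> float:
        """Shannon entropy in bits per byte (approaches 8.0 for encrypted data)."""
        if not data:
            return 0.0
        counts = Counter(data)
        return -sum((n / len(data)) * math.log2(n / len(data)) for n in counts.values())

    def looks_like_ransomware(files_changed_per_min, sample: bytes,
                              normal_rate=20, entropy_limit=7.5) -> bool:
        return files_changed_per_min > 10 * normal_rate and entropy(sample) > entropy_limit

    print(looks_like_ransomware(500, os.urandom(4096)))         # True: mass, random-looking writes
    print(looks_like_ransomware(15, b"Dear customer, " * 300))  # False: normal activity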

Metadata Intelligence

Metadata Intelligence, the process of using metadata to manage data, is being touted as an exciting new way to get on top of managing your data. Of course, Mainframes have been using metadata like this for 30 years or more; the point is that Windows is starting to catch up. Metadata lets you see when a file was last opened, and with this information you can keep current data on fast flash storage and move older data off onto cheaper storage.
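A minimal sketch of that idea: walk a filesystem, look at the last-access metadata, and flag anything untouched for a given period as a candidate to move to cheaper storage. The 90-day cut-off and the directory path are arbitrary examples.

    # A sketch of metadata-driven tiering based on last-access times.
    import os, time

    def tiering_candidates(root, days_idle=90):
        cutoff = time.time() - days_idle * 86400
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                except OSError:
                    continue
                last_used = max(st.st_atime, st.st_mtime)   # last open or change
                if last_used < cutoff:
                    yield path, st.st_size

    total = 0
    for path, size in tiering_candidates("/data/projects"):   # example path
        total += size
        # a real system would move the file to a cheaper tier and leave a stub
    print(f"{total / 1e9:.1f} GB eligible to move off flash")
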
The EU has recently introduced the General Data Protection Regulation (GDPR) legislation, which dictates how personal data must be stored, processed and deleted when the 'right to be forgotten' applies. Metadata Intelligence will help manage this, as data can be automatically stored and deleted based on pre-determined rules.
The requirement to store data securely will mean that data copies must be geographically dispersed, especially for long term archived data.

Artificial Intelligence, or AI, links into this. It is often used to detect ransomware viruses, but it can also be used to analyse your estate and make intelligent recommendations. An example of a product suite would be Igneous Systems, with DataDiscover to analyse and record your data, DataProtect to back it up, and Data Flow to move it round the system as required. You can also use Imanis Data Management Platform 4.0's SmartPolicies to generate backup schedules which are based on a desired recovery point objective as set by the user.
When storing data, we normally use two storage tiers, maybe flash storage and HDD, with tape as a possible third tier. The problem is that the investigation and coding needed to manually manage three tiers is not trivial. However with AI doing all the work, it becomes possible to add and manage even more tiers, to get the optimum balance between performance and cost for different classes of data.

In more general terms, AI will be used for a wide range of processes in the future, and all these processes will need lots of data, which must be stored securely and be accessible with the best of performance, especially if AI is working in real time.

The various disks, disk arrays, switches and other bits of the storage estate generate lots of data describing the current health of the product. Predictive Storage Analytics is about continuously analysing all those data points, to predict the future behaviour of the storage estate. The theory is that this can include pinpointing potential developing problems, such as defective cables, drives and network cards, then alerting support staff, with a precisely located problem and a recommended solution. One of my least favourite error messages goes something like 'An unidentified System Error has occurred'. I'm not sure how that would be pinpointed.
Predictive Storage Analytics would also be able to monitor storage pools, cache, CPU and channel utilisation and recommend capacity requirements now and in the future.
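The simplest form of this is trend analysis on capacity. The sketch below fits a straight line to a week of pool-usage samples and estimates when the pool will fill; the figures are made up for illustration.

    # A sketch of simple predictive analytics: extrapolate pool growth.
    def days_until_full(daily_used_tb, capacity_tb):
        n = len(daily_used_tb)
        xs = range(n)
        mean_x, mean_y = (n - 1) / 2, sum(daily_used_tb) / n
        slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, daily_used_tb))
                 / sum((x - mean_x) ** 2 for x in xs))      # TB added per day
        if slope <= 0:
            return None                                     # pool is not growing
        return (capacity_tb - daily_used_tb[-1]) / slope

    usage = [410, 412, 415, 419, 422, 426, 431]   # last week's samples, in TB
    remaining = days_until_full(usage, capacity_tb=500)
    print(f"Pool growing ~{(usage[-1] - usage[0]) / 6:.1f} TB/day, "
          f"full in about {remaining:.0f} days")
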
We can expect to see manual data management tasks reduced in 2020 through the addition of machine learning and automated service-level management.

Hyperconverged Infrastructure

HCI is designed to reduce data center complexity and increase scalability. A workable definition of Hyper Converged Infrastructure could be "HCI is a single system framework that combines storage, computing and networking". HCI platforms typically run on standard, off-the-shelf servers and include software-defined storage, a hypervisor for virtualized computing and virtualized networking. Several hypervisor nodes can be clustered together to create pools of shared compute and storage resources and they can include pre-configured monitoring, backups, networking and storage configuration. Extra resources can be dynamically allocated as needed, without requiring system downtime.

HCI would typically be introduced as part of a data center modernization project, and would provide a company with the scalability and cost benefits of a public cloud infrastructure without having to give up the control that comes from having hardware on their own premises. HCI can be hardware based, with an integrated HCI appliance from a single vendor, or it can be software based and so hardware-agnostic.

A hardware approach will use commodity components and will be supported by a single vendor. The advantage of this approach is that you get an infrastructure that should be easier to manage and more flexible. The disadvantage is that you are locked into a single supplier. These HCI systems were initially targeted at general-purpose workloads with fairly predictable resource requirements such as virtual desktop infrastructure. They are now used for more unpredictable applications, such as Oracle and SQL servers, file and print services, and web servers.

A software-based solution lets you deploy HCI on your own technology. There are obvious initial cost savings with this approach, and also potential ongoing savings, as you can negotiate upgrade deals with several suppliers. The downside, of course, is some loss in simplicity. HCI software vendors include Maxta and VMware (vSAN).

Composable infrastructure is an alternative to HCI. Unlike HCI, which uses a hypervisor to manage the virtual resources, composable infrastructure uses APIs and management software to recognize and aggregate all physical resources into virtual pools, and to provision, or compose, the end IT products. It didn't quite live up to the hype in 2019, but it may take off in 2020 as more vendors put out composable infrastructure products.

Issues with installing a hyper-converged infrastructure are:
