Different types of RAID protection

What is RAID? The concept of RAID, or Redundant Array of Independent Disks, was originally discussed in a Berkeley paper by Patterson, Gibson and Katz. The idea is that instead of writing data block by block to a single disk, the data is spread over several disks, or spindles. This gives performance benefits, as data is read off several spindles at once, and availability benefits, as extra parity data can be generated and stored, so that the data will still be available if one or more disks are lost.

Parity is a means of adding extra data, so that if one of the bits of data is lost, it can be recreated from the parity. For example, suppose a four-bit value consists of the bits 1011. The total number of '1's in the value is odd, so we make the parity bit a 1. The stored value then becomes 10111. Suppose the third bit is lost, so the value reads 10?11. We know from the last bit that there should be an odd number of '1's in the data, but the number of recognisable '1's is even, so the missing bit must be a '1'. This is a very simplistic explanation. In practice, disk parity is calculated on blocks of data using XOR hardware functions.

The advantage of parity is that it is possible to recover data from errors. The disadvantage is that more storage space is required. In enterprise disk subsystems, spare disks called 'dynamic spares' are kept ready, so that when a disk fails, a dynamic spare is automatically swapped in and the contents of the failed disk are rebuilt onto it from the remaining data and the parity data.
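The parity idea can be sketched in a few lines of Python. This is only an illustration, not how a disk controller is implemented: the helper name `xor_blocks` and the sample blocks are made up for the example, and real subsystems do this work in hardware. It shows the 1011 example first, then the same XOR trick applied to whole blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# The four-bit example from the text: the parity bit is the XOR of the data bits.
bits = [1, 0, 1, 1]
parity_bit = reduce(lambda a, b: a ^ b, bits)

# Lose the third bit, then recover it by XORing the surviving bits with the parity bit.
surviving_bits = [bits[0], bits[1], bits[3]]
recovered_bit = reduce(lambda a, b: a ^ b, surviving_bits + [parity_bit])

# The same idea on blocks: three made-up data blocks and one parity block.
data = [b"\x0b\x10", b"\xa5\x5a", b"\xff\x00"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] has failed and rebuild it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, XORing the surviving blocks with the parity block returns exactly the block that was lost, whichever one it was.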

There has been some speculation in recent years that RAID is no longer relevant. This is based on the fact that disks are now much bigger than they were when RAID was invented, and so it takes much longer to rebuild a failed disk onto a dynamic spare. Why is this important? Until the data is rebuilt a RAID5 array has no protection, so if another disk failed, all the data would be lost. With physical disk sizes of 16TB or more, that means in a RAID5 6+P+S configuration, 96TB of data could be lost. In that same configuration, the disk controller would have to read 96TB of data to rebuild the missing 16TB, and that could take days if the system is busy. Also, while the controller is performing this recovery, performance will be affected when data on the rest of the RAID group is accessed. Most enterprise vendors now insist on RAID6 or RAID1 for large disks.
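The arithmetic behind those figures can be sketched as follows. The function names are hypothetical, and the 50 MB/s rebuild rate is purely an assumed figure for a busy system, not a measured or vendor-quoted value; the point is only to show how the 96TB and "days" figures arise.

```python
def rebuild_exposure(disk_tb, data_disks, parity_disks=1):
    """Capacity at risk and data to be read when one disk in the group fails."""
    surviving = data_disks + parity_disks - 1
    return {
        "usable_tb": disk_tb * data_disks,  # data lost if a second disk fails (RAID5)
        "read_tb": disk_tb * surviving,     # read from every surviving disk
        "write_tb": disk_tb,                # written to the dynamic spare
    }

def rebuild_hours(disk_tb, rate_mb_per_s):
    """Hours to write one disk's worth of data at an assumed sustained rate."""
    seconds = disk_tb * 1_000_000 / rate_mb_per_s  # 1 TB = 1,000,000 MB (decimal)
    return seconds / 3600

exposure = rebuild_exposure(disk_tb=16, data_disks=6)   # RAID5 6+P+S
hours = rebuild_hours(disk_tb=16, rate_mb_per_s=50)     # assumed rate
```

At the assumed rate the rebuild takes roughly 89 hours, or a little under four days, during which a RAID5 group has no remaining protection.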
On the plus side, RAID performance is evolving. Access to the disks does get faster as disks get bigger, and the hardware functions that are used to rebuild the data are constantly being improved. This means that as each new generation of disks arrives, the rebuild performance of RAID controllers improves and keeps pace with the increase in drive capacity.
The conclusion is that RAID is not dead yet, nor is it likely to be for some time.

   


So which RAID configuration is best? RAID1 is simple to implement, performs well and is probably the best solution for small configurations, especially home PCs. RAID6 is usually preferred for enterprise subsystems, especially if they use large disks.
RAID1 can only tolerate one disk failure, but as the RAID protection can be restored by reading just one disk, the risk of data loss is low, especially if the disks are relatively small. The issue with RAID1 is that only half the installed capacity is usable.
RAID5 can also only tolerate one disk failure, and a rebuild can take some time for large disks, which increases the chance that a second disk will fail during the rebuild and all the data will be lost.
RAID6 can tolerate two disk failures, so when a disk fails, two more need to fail during the rebuild time before data is lost. The RAID overhead depends on how many disks are in the RAID rank. The overhead is 25% for an 8-disk array, as two of the eight disks' worth of capacity holds parity.
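The overhead and fault-tolerance figures above can be compared with a short calculation. This is a sketch under simple assumptions: the function name `raid_summary` is made up, RAID1 is treated as mirrored pairs, and RAID5 and RAID6 are assumed to give up one and two disks' worth of capacity to parity respectively.

```python
def raid_summary(level, disks):
    """Return (guaranteed tolerated failures, capacity overhead as a fraction)."""
    if level == "RAID1":
        redundant, tolerated = disks // 2, 1   # every disk has a mirror copy
    elif level == "RAID5":
        redundant, tolerated = 1, 1            # one disk's worth of parity
    elif level == "RAID6":
        redundant, tolerated = 2, 2            # two disks' worth of parity
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return tolerated, redundant / disks

for level, disks in [("RAID1", 2), ("RAID5", 8), ("RAID6", 8)]:
    tolerated, overhead = raid_summary(level, disks)
    print(f"{level} with {disks} disks: survives {tolerated} failure(s), "
          f"{overhead:.1%} overhead")
```

For an 8-disk rank this gives 12.5% overhead for RAID5 and 25% for RAID6, against 50% for a RAID1 mirror.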

The various types of RAID are explained below. In the diagrams, the square box represents the controller and the cache. Blue and yellow blocks represent data and red blocks represent parity. For simplicity, the dynamic diagrams show each IO as a single RAID block. In practice, RAID blocks are a fixed size, so IOs are split into RAID blocks as appropriate. The RAID striping and parity are usually generated by ASICs.
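As a rough illustration of how an IO is split into fixed-size RAID blocks, the sketch below cuts a payload into blocks and places them round-robin across the data disks. The function name `split_io`, the tiny block size, the round-robin placement and the zero padding of the last block are all assumptions made for the example; real controllers do this in hardware, and their layouts differ between RAID levels and vendors.

```python
def split_io(payload, block_size, data_disks):
    """Split one IO into fixed-size blocks and assign each to a data disk."""
    placements = []
    for n, offset in enumerate(range(0, len(payload), block_size)):
        block = payload[offset:offset + block_size]
        block = block.ljust(block_size, b"\x00")   # pad a short final block (simplification)
        placements.append((n % data_disks, block)) # round-robin across the data disks
    return placements

# A 10-byte IO, 4-byte RAID blocks, three data disks.
layout = split_io(b"abcdefghij", block_size=4, data_disks=3)
```

Here the IO becomes three RAID blocks, one on each data disk, and the parity block for that stripe would be calculated across all three.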


   
