A Guide to RAID

Section 1

RAID Overview

What is RAID?

Redundant Array of Independent Disks (RAID) is a storage technology that combines several physical hard disks into a logical drive with better performance and reliability than the individual units. It increases the speed of storing and accessing data while reducing the risk of data loss and downtime.

RAID, originally known as redundant array of inexpensive disks, was developed by Randy Katz, David Patterson, and Garth Gibson in 1987. The three scientists from the University of California, Berkeley, were trying to address challenges that often resulted in data losses. Today, their creation, which has since been improved and enhanced, enables organizing data across multiple disks and reconstructing missing information after a hardware failure of one or more disks.

Although traditionally designed for servers, RAID is also used in workstations, storage-intensive computers and other applications that require data security, high transfer speeds, and large storage capacities. Typical applications where fast read and write operations for large files are important include video editing, CAD, graphic design, etc.

A RAID configuration achieves one or a combination of the following benefits:

  • Improving the data read/write performance, hence providing faster transfers.
  • Replicating data across two or more disks to increase redundancy and prevent data loss in case of a disk failure.
  • Combining multiple disk drives to provide a larger capacity.

How Does RAID Work?

RAID is a technology for configuring and supporting various combinations of physical hard drives, with the aim of improving reliability, performance, and capacity. It consists of multiple physical disks and a controller to configure and manage them.

There are different RAID schemes to spread or replicate data across different member disks. Each of the configurations provides a unique balance between capacity, performance, and resilience. Generally, the three main concepts are striping, mirroring, and parity. Each of these has its merits and limitations, but can be combined for better performance.

Striping spreads the data evenly across multiple physical disks, mirroring replicates data on two or more disks, and parity uses the raw data to calculate and store parity information for error correction. Striping improves performance because information is written or read on several disks simultaneously, while mirroring allows the data to be accessed from the remaining good drives in case of a disk failure.
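
The three concepts can be illustrated with a minimal Python sketch. It assumes XOR as the parity function, which is a common choice for single-parity schemes; the helper name xor_blocks and the sample data are hypothetical and chosen only for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (the assumed parity calculation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = b"RAIDDEMO"

# Striping: split the data into equal chunks, one per data disk.
disk0, disk1 = data[:4], data[4:]

# Mirroring: a second disk holds an identical copy of the first.
mirror_of_disk0 = bytes(disk0)

# Parity: a third disk stores the XOR of the data chunks.
parity = xor_blocks([disk0, disk1])

# If disk0 fails, its contents can be recomputed from disk1 and the parity.
rebuilt_disk0 = xor_blocks([disk1, parity])
print(rebuilt_disk0 == disk0)  # prints True
```

The same XOR call serves both to create the parity and to rebuild a lost chunk, which is why a single parity block can cover the loss of any one disk but not two.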

When Should I Use RAID?

RAID is ideal for high-reliability applications and those requiring larger storage or high data transfer speeds. All websites and critical online and offline applications should use RAID to improve performance and to prevent data loss or downtime.

Most modern servers use fast SSDs and hence may not require further performance improvements. However, there is still a need to add redundancy to ensure the reliability and availability of the website in case of a disk failure. For servers using older, slower drives, it may be necessary to use a RAID level that combines performance improvement and data redundancy. For further reading on HDD and SSD, read our SSD vs. HDD hosting guide.

Almost all physical servers used for shared hosting, VPS, or dedicated hosting have disk drives that operate in a RAID setup. Usually, part of the array's capacity is reserved for parity, which is extra information calculated from the stored data that helps in recovering the data if one of the disks fails.

RAID on dedicated, VPS or shared servers increases the server performance and data redundancy. However, it does not eliminate the need for an offsite data backup, just in case of a virus attack or disaster.

Generally, most providers use RAID for both their servers and their backup systems. This increases the level of data protection and the speed of recovering data in case of a problem with any of the disks in the server or backup storage.

Although RAID was initially designed for servers, individuals and data-intensive users such as video and audio editors can use it to improve the read and write operations.

Using RAID With a RAID Controller

A RAID controller is a hardware device or software driver for configuring and managing hard drives in an array. It provides an interface for combining the physical disks and presenting them to the OS as a single logical unit.

A hardware RAID controller is a physical device that is either integrated on the motherboard or available as an add-on PCI or PCI Express expansion card. In hardware RAID, the controller runs everything and has its own processor and memory. Controllers are designed to support specific disk interfaces and RAID levels. For example, there are separate controllers for SCSI, SATA, SAS, or SSD drives, and these are not interchangeable.

Some hardware controllers have an additional cache that speeds up read and write operations and, when battery-backed, helps avoid data loss in case of a power outage. The advantages of hardware RAID are better performance, support for booting from the array, and a better abstraction. However, hardware controllers are more expensive, and there is a risk of vendor lock-in since most of them use proprietary firmware.

A software-based RAID uses the operating system and existing hardware such as the computer's CPU and standard SAS, IDE or SATA controllers. This is more flexible, less costly and available in most server and desktop operating systems. However, the installation is often tied to a specific operating system and may not be compatible with others. Since it uses the computer's processing power and memory, it may degrade the server performance. Other limitations can include the inability to boot from the RAID array and a lack of support for hot swapping unless a compatible hardware controller is used.

Section 2

RAID Levels

What are RAID Levels?

A RAID level refers to the technique of distributing, organizing and managing data across multiple disks in an array. Each level has different fault tolerance, data redundancy, and performance properties, and the choice depends on requirements or goals as well as cost. Some levels provide more data protection while others offer a greater performance improvement.

Generally, all RAID arrays are classified as standard, non-standard or nested levels depending on their configuration and the type and level of improvement they offer.

Standard RAID levels rely on basic and simple configurations. These include the original levels, one through five, plus two others (0 and 6) that were added later. Other levels beyond these are defined as non-standard. However, level 0 is sometimes not considered as RAID since it does not offer redundancy.

A nested or hybrid RAID combines a standard RAID level that provides redundancy with a RAID 0 to improve data transfer performance. This level requires more drives, higher quality hardware controllers, and more powerful computers. Some low-cost controllers and software drivers do not support nested RAID. This makes it more expensive to implement and better suited to large businesses and enterprises.

Non-standard RAID levels are those that do not rely on basic architectures or methods used in traditional RAID levels. Some of these are proprietary and only used for certain applications. These provide higher levels of performance and are usually suitable for specific applications.

Standard RAID Levels

Standard RAID levels are based on simple and basic hardware configurations, and are ideal for a wide range of businesses and individuals. The standard levels are RAID 0, 1, 2, 3, 4, 5, and 6. Each of these provides a unique combination of redundancy and performance.

While levels 1, 5, and 6 provide some degree of fault tolerance, level 0 doesn’t but offers the fastest performance. RAID 1 is the most reliable in data security while level 5 provides the best balance between performance, fault tolerance, and reliability.

 

RAID 0

RAID 0 uses block-level striping to spread data across multiple physical disks. It has the fastest I/O performance since it writes small, different parts of a file to multiple disks simultaneously, and reads them back from those disks in the same way.

It requires a minimum of two physical drives and provides the maximum disk space, which is the total of the individual device capacities. However, it does not offer any data redundancy or fault tolerance, and is best for organizations looking for performance. A failure in any of the disks in a RAID 0 array results in complete loss of data, including data saved in the good drives.

RAID 0 is best for applications that process non-critical data but require high performance.

Diagram of a RAID 0 setup
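
The block striping described above can be sketched as a simple round-robin mapping. This is a minimal illustration that assumes one block per stripe unit; the function name raid0_locate is hypothetical and does not correspond to any particular controller's API.

```python
def raid0_locate(logical_block, num_disks):
    """Return (disk index, block offset on that disk) for a logical block.

    Assumes simple round-robin striping with one block per stripe unit.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    return logical_block % num_disks, logical_block // num_disks

# Consecutive logical blocks land on different disks, so they can be
# read or written in parallel.
for block in range(6):
    disk, offset = raid0_locate(block, num_disks=3)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Because every disk holds a slice of every large file, the sketch also shows why losing a single disk leaves the remaining data unusable.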

RAID 1

RAID 1 mirrors data on two or more disks without parity. The level requires at least two drives and total usable space equals the size of a single disk.

All the disks have identical copies of data. In case of a disk failure, the system continues to use the existing disk or disks in good working condition.

RAID 1 provides better data redundancy and is ideal for applications where data availability is critical. This is a simple technology with basic fault tolerance but no write performance improvement, since every write must go to each disk in the mirror.

Diagram of a RAID 1 setup

RAID 2

RAID 2 uses bit-level striping with parity, compared to the block striping in RAID 0. Additionally, it uses a Hamming code for error detection and correction, and was therefore intended for disks without built-in error checking. Since most modern disks have this feature, the level is rarely used. In addition, it requires extra disks to store the parity information. Effective disk capacity is often quoted as n-1, where n is the number of disks, although a Hamming-code layout generally needs more than one parity disk.

RAID 2 works like RAID 0 but combines bit-level striping with an error protection mechanism that guards against data loss due to corruption. This is resource intensive and not widely used.

Diagram of a RAID 2 setup
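
The Hamming code mentioned above can be sketched in a few lines. This example assumes the classic Hamming(7,4) code, in which four data bits are protected by three parity bits; with one bit per disk, that would correspond to four data disks and three parity disks. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode four data bits as a codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(codeword):
    """Fix at most one flipped bit; return (data bits, 1-based error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # the syndrome points at the bad bit
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                          # simulate one corrupted bit
data, position = hamming74_correct(word)
print(data, position)                 # prints [1, 0, 1, 1] 5
```

Unlike plain parity, the syndrome identifies which bit is wrong, so the error can be corrected without knowing in advance which disk misbehaved.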

RAID 3

RAID 3 uses byte-level striping with parity for rebuilding data. It requires a minimum of three drives, of which one stores the parity information. The level has high data transfer rates for large files since data is accessed in parallel, but is slower on small files.

This level performs better for long sequential data transfers such as video but not in applications where there are many requests, such as a database. If the parity disk fails, the data itself remains intact, but the array loses its redundancy until the parity is recomputed on a replacement disk. The level is not used much, and its usable capacity is n-1.

Diagram of a RAID 3 setup

RAID 4

RAID 4 is similar to RAID 3 but uses block-level striping. It combines block-level striping across multiple disks with a dedicated parity disk. The level requires a minimum of three disks, of which one is reserved for parity information. Because striping is at block level, each drive can serve a read request independently. Write operations are slower, however, since every write must also update the parity information on the dedicated disk.

This is ideal for sequential data access. However, the parity disk may slow down write operations. The level is rarely used.

Diagram of a RAID 4 setup

RAID 5

RAID 5 has block-level striping along with distributed parity. This is a cost-effective, all-round configuration that balances redundancy, performance, and storage capacity.

Striping improves the read I/O performance while parity is important for reconstructing data in case of a disk failure. However, the level cannot survive multiple disk failures and takes longer to rebuild data since the process involves reading the data and parity from each of the remaining drives. It requires a minimum of three disks and has a usable space of n-1 disks.

RAID 5 level is suitable for applications and file servers with limited storage devices.

Diagram of a RAID 5 setup
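
A minimal sketch of striping with distributed parity follows. It assumes XOR parity and a parity position that moves one disk to the left on each stripe; real controllers use various rotation orders, so the layout here is an assumption, and the function names are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def raid5_layout(chunks, num_disks):
    """Arrange equally sized data chunks into stripes with rotating parity.

    Returns a list of stripes; each stripe is a list with one block per disk.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    per_stripe = num_disks - 1
    if len(chunks) % per_stripe:
        raise ValueError("pad the data to a whole number of stripes")
    stripes = []
    for row, start in enumerate(range(0, len(chunks), per_stripe)):
        data = list(chunks[start:start + per_stripe])
        parity_disk = (num_disks - 1 - row) % num_disks  # assumed rotation
        stripes.append(data[:parity_disk] + [xor_blocks(data)] + data[parity_disk:])
    return stripes

def raid5_rebuild(stripes, failed_disk):
    """Recompute every block of one failed disk from the surviving disks."""
    return [
        xor_blocks([blk for disk, blk in enumerate(stripe) if disk != failed_disk])
        for stripe in stripes
    ]

stripes = raid5_layout([b"AA", b"BB", b"CC", b"DD"], num_disks=3)
lost = [stripe[1] for stripe in stripes]      # contents of disk 1
print(raid5_rebuild(stripes, failed_disk=1) == lost)  # prints True
```

The rebuild has to read every surviving disk for every stripe, which is the reason rebuilds take longer as disks grow.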

RAID 6

RAID 6 uses block striping like RAID 5 but with dual distributed parity. The two blocks of parity information provide additional redundancy and fault tolerance. This level can survive two concurrent disk failures. However, it is expensive, requiring at least four drives while giving a usable space of n-2 disks.

It is more reliable and common in SATA environments and applications such as disk-based backups and data archives where there is a need for long data retention. It is also suitable for environments where data availability is more important than performance.

Drawbacks of level 6 include the additional disk for the double parity information as well as being more complex to implement than level 5. Due to the dual parity, the write and restore speeds are slower.

Diagram of a RAID 6 Setup
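
The usable-capacity figures quoted for the standard levels can be collected in one small helper. This sketch assumes equally sized disks and uses the minimum disk counts given in this guide; RAID 2 is left out because its overhead depends on the code used. The names LEVELS and usable_capacity_tb are hypothetical.

```python
# Per level: (minimum number of disks, disks' worth of space lost to redundancy).
# n is the number of equally sized disks in the array.
LEVELS = {
    0: (2, lambda n: 0),       # striping only, no redundancy
    1: (2, lambda n: n - 1),   # every disk holds the same data
    3: (3, lambda n: 1),       # one dedicated parity disk
    4: (3, lambda n: 1),       # one dedicated parity disk
    5: (3, lambda n: 1),       # one disk's worth of distributed parity
    6: (4, lambda n: 2),       # two disks' worth of distributed parity
}

def usable_capacity_tb(level, num_disks, disk_size_tb):
    """Usable capacity in TB for a standard RAID level."""
    minimum, overhead = LEVELS[level]
    if num_disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    return (num_disks - overhead(num_disks)) * disk_size_tb

for level in (0, 1, 5, 6):
    print(f"RAID {level}: {usable_capacity_tb(level, 4, 2.0)} TB usable from 4 x 2 TB")
```

With four 2 TB disks, the same hardware yields anywhere from 2 TB (RAID 1) to 8 TB (RAID 0), which makes the trade-off between capacity and redundancy concrete.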

Nested (Hybrid) RAID Levels

A nested RAID is a combination of a level that provides redundancy and a RAID 0 that increases performance. This may use RAID arrays or individual disks. Usually, the best combination is having RAID 0 on top of a redundant array since fewer disks will need regenerating in case of a disk failure.

The nested levels provide better performance and higher fault tolerance. However, they require complex configurations and more drives, and for the mirror-based levels the effective capacity is only half the installed disk space. They are also expensive and have limited scalability.

Common levels include 0+1, 1+0 (10), 0+3, 3+0 (30), 0+5, 5+0 (50), and 6+0 (60).

 

RAID 0+1

RAID 0+1 combines RAID 0 and 1 to provide redundancy and improve performance. The process starts by striping the data across multiple disks, which increases performance, followed by mirroring for data redundancy.

RAID 0+1 requires a minimum of four physical hard drives and is a complex configuration that provides high performance and fault tolerance. It can survive more than one disk failure within the same striped set, provided the other striped set remains fully intact.

This level requires disks in multiples of two but the total usable capacity is usually half the total disk space. In addition, it is more costly and not easily scalable.

Nested RAID 01 configuration

Hybrid RAID 01 configuration

RAID 1+0

RAID 1+0 or RAID 10 starts by mirroring data before striping it across the mirrored arrays. The approach makes it more redundant, reliable, and efficient than RAID 0+1. It requires a minimum of four drives and can survive multiple concurrent disk failures as long as none of the mirrors loses all its disks.

RAID 1+0 has better fault tolerance, data redundancy and rebuild performance than RAID 0+1. However, it is very expensive and, just like 0+1, has limited scalability. The level is ideal for organizations looking for high performance and data security. The usable capacity is half the total installed disk space.

Diagram of a RAID 1+0 Setup
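
The difference in fault tolerance between RAID 1+0 and RAID 0+1 can be checked by enumerating failures. The sketch below assumes two-way mirrors, mirror pairs of adjacent disks for RAID 1+0, and for RAID 0+1 two striped sets where a set counts as lost as soon as any of its disks fails. The disk numbering and function names are hypothetical.

```python
from itertools import combinations

def raid10_survives(failed, num_disks):
    """Mirror pairs are (0,1), (2,3), ...; every pair must keep one disk."""
    return all(not {a, a + 1} <= failed for a in range(0, num_disks, 2))

def raid01_survives(failed, num_disks):
    """Two striped sets mirror each other; one set must be fully intact."""
    half = num_disks // 2
    set_a, set_b = set(range(half)), set(range(half, num_disks))
    return not (failed & set_a) or not (failed & set_b)

def survivable_pairs(check, num_disks):
    """Count the two-disk failures an array survives, out of all possible pairs."""
    pairs = list(combinations(range(num_disks), 2))
    return sum(check(set(pair), num_disks) for pair in pairs), len(pairs)

print(survivable_pairs(raid10_survives, 4))  # prints (4, 6)
print(survivable_pairs(raid01_survives, 4))  # prints (2, 6)
```

Under these assumptions, a four-disk RAID 1+0 survives four of the six possible double failures, while RAID 0+1 survives only two.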

RAID 0+3

This is also referred to as RAID 53 and consists of a RAID 0 array striped into a RAID 3 array. In addition, it has a dedicated parity array that is striped across disks.

The level combines high data transfer rates with the fault tolerance offered by the RAID 3 segments, and has excellent performance with both sequential and random reads and writes.

Unfortunately, the level is more complex and expensive since it requires more drives, and it requires disks whose spindles are synchronized. This might limit the choice of disks to use.

Diagram of a RAID 0+3 Setup

RAID 5+0

RAID 5+0 or RAID 50 combines the distributed parity of RAID 5 with the striping of RAID 0. It consists of two or more RAID 5 arrays across which the data is striped. Requiring a minimum of six physical disks, it has improved data protection and write performance, as well as faster rebuilds, compared to RAID 5. It is hence ideal for applications where high availability is important.

A single drive failure only affects the RAID 5 array that contains it, so it degrades performance less than it would in a single large RAID 5. In addition, the level can withstand multiple drive failures as long as each is in a different RAID 5 array; for example, up to four failures in a set of four arrays. However, it requires a sophisticated RAID controller.

Diagram of a RAID 5+0 Setup
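
The capacity and failure rules for RAID 5+0 can be sketched as follows. The example assumes equally sized RAID 5 sub-arrays and numbers the disks so that disk i belongs to sub-array i // disks_per_array; both the numbering and the function names are hypothetical.

```python
from collections import Counter

def raid50_usable_disks(num_arrays, disks_per_array):
    """Disks' worth of usable space: each RAID 5 sub-array gives up one disk to parity."""
    if num_arrays < 2 or disks_per_array < 3:
        raise ValueError("RAID 50 needs at least two RAID 5 arrays of three disks")
    return num_arrays * (disks_per_array - 1)

def raid50_survives(failed_disks, disks_per_array):
    """Data survives while no RAID 5 sub-array has lost more than one disk."""
    losses = Counter(disk // disks_per_array for disk in failed_disks)
    return all(count <= 1 for count in losses.values())

print(raid50_usable_disks(2, 3))             # prints 4
print(raid50_survives({0, 3}, 3))            # prints True
print(raid50_survives({0, 1}, 3))            # prints False
```

The minimum six-disk configuration therefore tolerates at most two failures, and only when they fall in different sub-arrays.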

JBOD RAID N+N

JBOD (Just a Bunch Of Disks) combines several disks and presents them to the OS as a single drive with larger capacity but without redundancy. Unlike the RAID levels, this arrangement also allows the individual drives to be accessed separately. This is not really a RAID level but simply an arrangement.

The JBOD consists of several standard disks which may have different sizes. The total capacity is the sum of the individual disks and can be increased by just adding an extra drive. Just like RAID 0, it has no parity that would add write overhead; unlike RAID 0, however, it does not stripe data, so the speed of each transfer is limited to that of the disk holding the data. It also does not have data protection, and each disk is a potential point of failure. It is therefore best suited to applications requiring larger storage for non-critical data.

Diagram of a JBOD disk Setup
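
When JBOD disks are presented as a single drive, logical blocks are typically laid out one disk after another. The following sketch assumes such a simple concatenation; the function name jbod_locate and the disk sizes are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

def jbod_locate(logical_block, disk_sizes):
    """Map a logical block to (disk index, block offset) in a concatenated volume.

    disk_sizes lists the capacity of each disk in blocks; sizes may differ.
    """
    ends = list(accumulate(disk_sizes))       # first block past each disk
    if not 0 <= logical_block < ends[-1]:
        raise IndexError("block outside the volume")
    disk = bisect_right(ends, logical_block)
    start = ends[disk] - disk_sizes[disk]     # first logical block on that disk
    return disk, logical_block - start

sizes = [100, 250, 50]                        # three disks of different sizes
print(sum(sizes))                             # total capacity: prints 400
print(jbod_locate(99, sizes))                 # prints (0, 99)
print(jbod_locate(100, sizes))                # prints (1, 0)
```

Neighbouring blocks stay on the same disk until it is full, which is why this arrangement adds capacity but does not parallelize transfers the way striping does.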

Nonstandard RAID Levels

The non-standard RAID levels rely on architectures or algorithms different from those in a standard RAID. Some are based on open source systems while others rely on proprietary technologies and are only offered by certain vendors for specific applications.

Those using proprietary hardware and software may not be compatible with systems from other manufacturers. Examples include Pure Storage's RAID-3D and Dell EMC's XtremIO Data Protection (XDP).

Non-standard RAID levels can provide better performance and fault tolerance than the standard levels. They are used for specialized applications that require more availability and reliability than the standard levels can offer.

 

RAID 3D

This is a proprietary RAID scheme developed by Pure Storage that uses flash drives instead of hard disks. It is used to prevent data loss in case of a component failure in the flash storage. Due to the faster transfer speeds of solid state drives, the array has a high I/O performance. If RAID 3D detects a device failure, which often causes I/O delays, it rebuilds the data from the other devices within the same parity group.

 

Enhanced RAID 1E

RAID 1 Enhanced (RAID 1E) combines mirroring and striping of data across several disks. It is similar to RAID 1 but adds striping, and it can use an odd number of disks, with a minimum of three drives. RAID 1E mirrors each complete stripe of data to a different stripe within the set of disks, and is sometimes referred to as a mirrored stripe. Due to the mirroring, this level has good data redundancy.

Diagram of a RAID 1E Setup

Enhanced RAID 5E

RAID 5E is a variant of RAID 5 with an additional hot spare drive. The hot spare is usually active, waiting for another drive to fail. Once a failure occurs, the hot spare becomes available for rebuilding data. RAID 5E requires a minimum of four disks and has better performance than traditional RAID 5. However, it is not possible to share the spare drive between arrays. In addition, it suffers from slow rebuilds.

Diagram of a RAID 5E Setup

Section 3

Pros and Cons of RAID

Benefits of Using RAID

The benefits of a RAID system vary according to the level. An array may increase the performance, resilience or data redundancy, but the level of improvement varies according to the type of configuration and number of disks. Generally, an array will provide one or more benefits, but not all the maximums at the same time.

  • Preventing data loss in case of a disk failure: A RAID with data redundancy provides better continuity of business operations. In such a system, a disk failure does not interfere with the applications or data access since the server will use the remaining good disks. In addition, replacing a faulty disk in a hot-swappable RAID array does not require shutting down or interrupting operations. More redundant disks provide a better fault tolerance level.
  • Improving read/write speeds, and hence the performance of servers or computers such as workstations for video editing and other data-intensive applications. However, this will depend on the RAID level and number of physical drives.
  • Increasing the storage capacity using simple and cheaper disks: This is more cost-effective than buying a large single drive.
  • Increasing fault tolerance through the use of multiple disks.
  • Reduced costs and improved reliability: By using several less expensive, smaller disks, the array allows increasing the capacity at a lower cost than acquiring a single high-capacity drive.

Disadvantages of Using RAID

Although there are different RAID levels to address various data storage needs, the technology is vulnerable to a number of failures that can result in data losses or downtimes. The disadvantages include:

  • Since the RAID drives are usually inside a server within the same data centre, a disaster can damage the drives or the entire array, hence potentially destroying all the data. Other systems such as continuous data protection (CDP) store the data on remote drives, hence adding an extra protection layer in case of a disaster.
  • RAID storage contains the current version of data, which ensures easier rebuilding in case of failure. However, it is not possible to recover an older version of a file, for example after a virus attack, erroneous altering of files or malicious edits.
  • With larger drive capacities, RAID suffers from lengthy rebuild times whenever one or more disks fail. If other disks fail before a rebuild is complete, all the data may become unrecoverable. Long rebuilds also increase the downtime.
  • Implementing a RAID array is expensive since it requires several disks. For RAID levels offering redundancy, it is not possible to use the full capacity. The usable space is smaller than the total installed capacity.
  • Complex and not transferable. Although hardware-controlled arrays or RAID boxes are transferable, software-based RAID arrays are not.
  • Requires IT skills and familiarity with the technologies. As such, organizations may need to spend more money to train their staff or hire third-party service providers, especially to rebuild data or troubleshoot malfunctions.

Conclusion

RAID will continue offering performance and data protection benefits for several more years. However, it requires new strategies to make it more effective and compatible with emerging technologies and needs. Currently, there are critical storage requirements that are beyond existing RAID technologies.

Some manufacturers are already using new approaches to meet the growing and changing needs, and also address modern disk technologies and limitations. For example, instead of using RAID 0 to improve performance, modern systems can utilize DRAM, flash caches, automated storage tiering (AST) and other technologies such as wide striping.

Today's disks, such as SSDs, are larger and faster. This reduces the need to stripe data for performance improvement. However, larger drives face the challenge of longer rebuild times, which can range from 4 hours to several days for a 2TB hard drive.
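
A rough lower bound on rebuild time is simply the drive capacity divided by the sustained rebuild rate. The sketch below uses decimal units (1 TB = 1,000,000 MB); the two rates are illustrative assumptions, not measurements, and the function name rebuild_hours is hypothetical.

```python
def rebuild_hours(capacity_tb, rate_mb_per_s):
    """Estimate the hours needed to rewrite a whole drive at a sustained rate."""
    capacity_mb = capacity_tb * 1_000_000     # decimal units: 1 TB = 1,000,000 MB
    return capacity_mb / rate_mb_per_s / 3600

# An otherwise idle array rebuilding at an assumed 140 MB/s:
print(f"{rebuild_hours(2, 140):.1f} hours")   # prints 4.0 hours

# A busy array where the rebuild is throttled to an assumed 10 MB/s:
print(f"{rebuild_hours(2, 10):.1f} hours")    # prints 55.6 hours
```

Since the estimate scales linearly with capacity, a drive ten times larger takes ten times longer to rebuild at the same rate, which widens the window in which a second failure can occur.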

As such, organizations handling large amounts of data, such as at the petabyte scale, will require different strategies. These should aim at making RAID more effective while enabling it to compete with existing and upcoming alternatives such as erasure coding and continuous data protection (CDP).

Erasure coding starts by breaking data into fragments; it then expands and encodes them with redundant data pieces. These are then stored on different storage media and in different locations. The technology has little storage overhead compared to traditional RAID, and it requires less time to reconstruct data. However, it is processor intensive and has higher latency compared to RAID.

Going forward, one approach is to retain the data protection provided by RAID-based physical storage and then virtualize it. Such an arrangement creates a virtual volume that does not depend on a specific hardware configuration. Replicating such volumes in different locations decreases the risk of a complete failure in case of a disaster or other critical failure.