Definition
RAID (today expanded as Redundant Array of Independent Disks) is a data storage virtualization technology that combines multiple physical drives into one or more logical units for redundancy, improved performance, or both. Essentially, it lets us increase overall performance and enhance the reliability of stored data.
History
Originally, RAID stood for Redundant Array of Inexpensive Disks: arrays of cheaper drives were proposed as a replacement for the SLED (Single Large Expensive Disk), the previously used approach that, as the name implies, relied on one large disk, with redundancy added to improve reliability. However, because “inexpensive” implied low cost, industry vendors soon replaced it with “independent”.
Although products using multiple disks had existed earlier, the term RAID was coined by David Patterson, Garth Gibson and Randy Katz, who introduced it in their paper “A Case for Redundant Arrays of Inexpensive Disks (RAID)”, published in 1988.
Basics
As mentioned above, a RAID array joins multiple physical disks into one or more logical units and distributes data across the drives. Different RAID levels, or schemes, organize stored data using one of several techniques (striping, mirroring, parity, or a combination of them), each striking a different balance among reliability, availability, performance, and capacity, depending on the requirements. The number following the word “RAID” identifies the level, that is, the configuration used to store the data.
Here are the three basic RAID techniques:
- Striping: divides data into blocks and spreads them across the drives.
- Mirroring: stores identical copies of the data on different drives at the same time.
- Parity: stores redundant information computed from the data blocks, so that a missing block can be recalculated and the system keeps running if a drive fails or data goes missing.
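Parity is probably the least intuitive of the three, so here is a minimal Python sketch of the idea. It assumes the simplest form of parity, a byte-wise XOR over equal-length blocks; the helper name `xor_blocks` and the block contents are made up for illustration, and real controllers work on much larger blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks as they might sit on three drives (hypothetical contents).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Pretend the drive holding the second block failed:
# XOR-ing the surviving blocks with the parity block gives the lost block back.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'BBBB'
```

This works because XOR-ing a value with itself cancels it out, so everything except the missing block disappears from the result.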
RAID Levels
Initially, there were only five RAID levels. Over the years, however, many organizations have built on them to create their own non-standard configurations and nested (hybrid) arrays that meet specialized needs. To take a closer look at the differences among the available options, check www.salvagedata.com/raid-configuration.
Below you can find four of the most common RAID levels.
RAID 0. Consists of striping; distributes stored data across two or more disks, with no mirroring, parity, or redundancy. RAID 0 doesn’t provide fault tolerance: if any one of the drives fails, all data in the array is lost and the whole array stops functioning.
RAID 0 offers great performance and has no overhead from parity calculations, but it provides no redundancy. Best used when speed is vital and reliability is secondary.
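To make striping more concrete, below is a small sketch that models drives as Python lists and deals blocks out round-robin. The function names `stripe` and `read_back`, the block size, and the sample data are assumptions chosen for illustration, not how any particular controller is implemented.

```python
def stripe(data: bytes, num_drives: int, block_size: int):
    """Split data into blocks and deal them round-robin across the drives."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        drives[block_index % num_drives].append(data[offset:offset + block_size])
    return drives

def read_back(drives):
    """Reassemble the byte stream by interleaving blocks from all drives."""
    out = []
    for row in range(max(len(d) for d in drives)):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)

drives = stripe(b"ABCDEFGHIJ", num_drives=2, block_size=2)
print(drives)             # [[b'AB', b'EF', b'IJ'], [b'CD', b'GH']]
print(read_back(drives))  # b'ABCDEFGHIJ'
```

Each drive holds only every second block, so losing either list leaves gaps throughout the data. That is why a single failed drive takes the whole RAID 0 array with it.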
RAID 1. Consists of mirroring; data is written identically to two or more drives. By creating a “mirrored set”, RAID 1 provides fault tolerance: the array continues to operate as long as at least one drive is functioning (meaning the system can still access data from the remaining disks). When a faulty disk is replaced with a new one, the data is copied onto it and the array is rebuilt.
RAID 1 improves read performance but not write performance. It also has lower usable capacity: because all data is written twice, only half of the total drive volume is available in a two-drive mirror. Best when data redundancy and reliability are crucial.
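The behavior described above can be sketched with a toy model in which each drive is a Python dictionary. The class `MirroredSet` and its methods are hypothetical names used only for this illustration.

```python
class MirroredSet:
    """Toy RAID 1 model: writes go to every drive, reads use any healthy one."""

    def __init__(self, num_drives: int = 2):
        # Each drive maps block number -> bytes; None marks a failed drive.
        self.drives = [dict() for _ in range(num_drives)]

    def write(self, block_no: int, data: bytes) -> None:
        for drive in self.drives:
            if drive is not None:
                drive[block_no] = data

    def read(self, block_no: int) -> bytes:
        for drive in self.drives:
            if drive is not None:
                return drive[block_no]
        raise IOError("all drives in the mirror have failed")

    def fail(self, index: int) -> None:
        self.drives[index] = None

    def replace(self, index: int) -> None:
        """Rebuild a replaced drive by copying from a surviving one."""
        source = next(d for d in self.drives if d is not None)
        self.drives[index] = dict(source)

mirror = MirroredSet()
mirror.write(0, b"payroll")
mirror.fail(0)
survivor_copy = mirror.read(0)  # still served by the second drive
mirror.replace(0)               # the new drive is filled from the survivor
```

Note that the sketch stores every block on both drives, which is exactly where the capacity cost of mirroring comes from.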
RAID 5. Consists of striping with distributed parity; requires at least 3 drives to operate. If a single drive fails, subsequent reads are calculated from the remaining data and the distributed parity, so no data is lost.
RAID 5 is one of the most common options, as it offers both speed (since data is read from multiple disks) and good reliability, and is therefore considered one of the more secure configurations. Best used for file and application servers that have a limited number of data drives.
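Here is a rough sketch of what “distributed” parity means: the parity block moves to a different drive on each stripe, so no single drive becomes a parity bottleneck. The rotation rule used below is only one possible layout chosen for illustration, parity is assumed to be a plain XOR, and the number of data blocks is assumed to be a multiple of the data drives per stripe. The names `build_raid5` and `rebuild_drive` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def build_raid5(data_blocks, num_drives):
    """Lay blocks out in stripes, rotating the parity position per stripe."""
    per_stripe = num_drives - 1  # one slot in every stripe is used for parity
    stripes = []
    for start in range(0, len(data_blocks), per_stripe):
        chunk = data_blocks[start:start + per_stripe]
        stripe_no = start // per_stripe
        # Assumed rotation: parity starts on the last drive and moves left.
        parity_drive = (num_drives - 1 - stripe_no) % num_drives
        row = list(chunk)
        row.insert(parity_drive, xor_blocks(chunk))
        stripes.append(row)
    return stripes

def rebuild_drive(stripes, failed):
    """Recompute each block of the failed drive from the other drives."""
    return [
        xor_blocks([blk for i, blk in enumerate(row) if i != failed])
        for row in stripes
    ]

blocks = [b"D0", b"D1", b"D2", b"D3", b"D4", b"D5"]
stripes = build_raid5(blocks, num_drives=3)
# Column 1 of every stripe is what drive 1 held; rebuild it from drives 0 and 2.
recovered = rebuild_drive(stripes, failed=1)
```

In this layout, `recovered` matches the original contents of drive 1 whether a given stripe stored data or parity there, since XOR treats both the same way.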
RAID 6. Consists of striping with dual distributed parity; RAID 6 requires at least 4 drives and provides higher redundancy and reliability than RAID 5, as it adds a second parity block to each stripe. Offers increased read performance, but write performance is slower than in RAID 5.
Double parity ensures fault tolerance for up to two failed drives, which makes larger RAID groups more practical, specifically for high-availability systems (although drive failures may affect throughput). Best suited for large file storage and/or application servers.
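The trade-off between capacity and fault tolerance across the four levels above can be summarized with some simple arithmetic. The sketch below assumes identical drives and, for RAID 1, that every drive holds a full copy of the data; the function name `raid_summary` is made up for this example.

```python
def raid_summary(level: int, num_drives: int, drive_size_tb: float):
    """Return (usable capacity in TB, number of drive failures tolerated)."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} drives")
    if level == 0:
        return num_drives * drive_size_tb, 0        # all space, no redundancy
    if level == 1:
        return drive_size_tb, num_drives - 1        # every drive is a full copy
    if level == 5:
        return (num_drives - 1) * drive_size_tb, 1  # one drive's worth of parity
    return (num_drives - 2) * drive_size_tb, 2      # RAID 6: two drives' worth

for level in (0, 5, 6):
    print(level, raid_summary(level, num_drives=4, drive_size_tb=2.0))
# 0 (8.0, 0)
# 5 (6.0, 1)
# 6 (4.0, 2)
```

With four 2 TB drives, moving from RAID 0 to RAID 5 to RAID 6 gives up one more drive's worth of space at each step in exchange for surviving one more failure.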
Conclusions
Using RAID arrays can be exceedingly helpful in a number of situations, as they cover multiple needs in an affordable and dependable way. The most important of these is keeping your sensitive personal or crucial business-related information from getting lost due to a drive failure or similar accidents you don’t want to deal with.