
Triple-parity RAID


kureshii:
Article from ACM Queue: Triple-parity RAID and Beyond

Choice excerpts:

How much longer will current RAID techniques persevere? The RAID levels were codified in the late 1980s [...] RAID-6, double-parity RAID, was not described in Patterson, Gibson, and Katz's original 1988 paper but was added in 1993 in response to the observation that as disk arrays grow, so too do the chances of a double failure.

While bit error rates have nearly kept pace with the growth in disk capacity, throughput has not been given its due consideration when determining RAID reliability. [...] When RAID systems were developed in the 1980s and 1990s, reconstruction times were measured in minutes. The trend for the past 10 years is quite clear regardless of the drive speed or its market segment: the time to perform a RAID reconstruction is increasing exponentially as capacity far outstrips throughput. [...] The time to repair a failed drive is increasing, and at the same time the lengthening duration of a scrub means that errors are more likely to be encountered during the repair.


[Figure: Time required to repopulate a failed disk in a RAID array]



The common questions:
How is this relevant to home use?
It isn't, at the moment anyway. But RAID 5 came about in the 1980s, and almost nobody then expected it would one day be deployed for home use. The consumer market generally benefits from a trickle-down effect from technologies first deployed in enterprise environments.

Isn't triple-parity RAID overkill? Most of us haven't even needed double-parity RAID yet...
If you have to ask, of course it is overkill for you. But that doesn't mean nobody else needs it.

That's just silly. Why don't people use multiple RAID 5 or RAID 6 volumes instead of a single triple-parity storage volume?
Why don't you use a cluster of Pentium 1s instead of a dual-core or quad-core machine?



Anyway, the point of posting this article is to ask a question. The following paragraph in the article caught my attention:


--- Quote ---A recurring theme in computer science is that algorithms can be specialized for small fixed values, but are then generalized to scale to an arbitrary value. A common belief in the computer industry had been that double-parity RAID was effectively that generalization, that it provided all the data reliability that would ever be needed. RAID-6 is inadequate, leading to the need for triple-parity RAID, but that, too, if current trends persist, will become insufficient. Not only is there a need for triple-parity RAID, but there's also a need for efficient algorithms that truly address the general case of RAID with an arbitrary number of parity devices.
--- End quote ---

What that means is that the RAID 5 and RAID 6 algorithms are not scalable with regard to parity blocks. You can scale the number of disks in a RAID 5 or 6 array, but those arrays will always use exactly 1 and 2 blocks of parity data respectively in every stripe (the set of corresponding blocks across the disks).

RAID 5 is a single-parity algorithm, and RAID 6 is a double-parity algorithm. Currently, triple-parity RAID-Z does not use a generic algorithm either. Perhaps in time we'll finally have an efficient, robust n-parity RAID algorithm that works for arbitrary values of n (I hesitate to say all values of n, because then one has to consider absurdly large numbers that won't be practical even for enterprise use).
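To make the fixed-parity point concrete, here's a toy sketch of RAID 5-style single parity (the function names and tiny block sizes are mine; real implementations XOR large blocks with vectorized code). No matter how many data disks you add, each stripe still carries exactly one parity block:

```python
# Toy RAID 5-style single parity: one XOR parity block per stripe,
# regardless of how many data disks participate.

def xor_parity(data_blocks):
    """Compute the single parity block P = D0 ^ D1 ^ ... ^ Dn-1."""
    parity = bytearray(len(data_blocks[0]))
    for block in data_blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def rebuild_block(surviving_blocks, parity):
    """Recover the one missing data block: XOR the parity with the survivors."""
    return xor_parity(list(surviving_blocks) + [parity])

stripe = [b"AAAA", b"BBBB", b"CCCC"]      # toy 4-byte blocks on 3 data disks
p = xor_parity(stripe)
assert rebuild_block(stripe[1:], p) == stripe[0]   # disk 0 "failed"
```

This tolerates exactly one failure per stripe; a second failure during the (ever-lengthening) rebuild is unrecoverable, which is the article's whole concern.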

This got me thinking. I'm not intimately familiar with the mathematical details of the RAID 5 and RAID 6 algorithms, apart from the fact that they use parity and Reed-Solomon codes for redundancy, so I don't know how computationally intensive it is to generate the parity blocks for each of these two algorithms.

But assuming that we do have a robust n-parity RAID algorithm, how would the computational load scale with n? Whether for a computation-efficient or a memory-efficient algorithm, what would an optimistic estimate of the scaling function be?
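For a rough sense of scaling: in the usual Reed-Solomon framing, each of the n parity blocks is a different linear combination of the data blocks over GF(2^8), so encoding cost grows roughly linearly in n (about n multiply-accumulates per data byte). Here is a hedged sketch; the function names and the particular coefficient matrix are my own, and whether a given matrix actually tolerates any n simultaneous failures is precisely the hard part the article alludes to:

```python
# Hedged sketch of an n-parity Reed-Solomon-style encoder over GF(2^8).
# Parity row j applies coefficient alpha^(i*j) to data disk i, so encoding
# cost is n GF multiply-accumulates per data byte: linear in n.

def gf_mul(a, b, poly=0x11d):   # carry-less multiply mod x^8+x^4+x^3+x^2+1
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def gf_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r

def encode_parity(data_blocks, n_parity, alpha=2):
    """Return n_parity parity blocks; row j uses coefficients alpha^(i*j)."""
    size = len(data_blocks[0])
    parities = []
    for j in range(n_parity):
        p = bytearray(size)
        for i, block in enumerate(data_blocks):
            c = gf_pow(alpha, i * j)        # coefficient for this disk/row
            for k, byte in enumerate(block):
                p[k] ^= gf_mul(c, byte)
        parities.append(bytes(p))
    return parities

stripe = [b"AB", b"CD", b"EF"]
p = encode_parity(stripe, 3)    # "triple parity"; row 0 is plain XOR (RAID 5's P)
assert p[0] == bytes(a ^ b ^ c for a, b, c in zip(*stripe))
```

So an optimistic answer for encoding is O(n) per byte of data; reconstruction after failures additionally involves inverting an n x n-ish matrix, which is cheap relative to the bulk data work for the small n anyone would deploy.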

I find such questions quite interesting, and although I don't have the time to read up on this in detail, if anyone has interesting insights I'll be glad to hear them.

nstgc:
I'm far from an expert, but that doesn't make much sense to me. If the problem is errors that accumulate during reconstruction times, which are getting exponentially longer, then nested arrays should take care of that. I remember going over the calculations once for the failure rate of a nested system versus an n-parity system. I can't remember exactly what I came up with, but I concluded that a nested system was better. I did not take reconstruction time into consideration, however. If, as you said, the size of drives is increasing at a great rate, then it seems the best thing to do is not to use RAID 5/6 at all for the top level.

The problem with a nested array is the hardware to run it: with more controllers, more things can go wrong. If you have a duplex of two nested arrays (let's say a large RAID 6 array of small RAID 5s, with a spare RAID 5 array), wouldn't that take care of it? The chances of that many HDDs dying seem very slim, but the chances of enough HDDs plus a controller (which may be harder to detect in advance) seem like a real possibility.

Again, I've only looked at a single piece of the problem, but that is my opinion.

[edit] Another disclaimer -- my memory isn't perfect.

[edit2] On second thought, simply doubling already costly equipment doesn't seem very cost-effective. What if the spare drive were an SSD? The array itself should be able to recover and read the parity at a respectable rate, but I can see how copying it onto that spare would be time-consuming. This can be alleviated with a faster drive. You wouldn't need a lot of them, just enough to replace one hard drive, or perhaps one array (on a lower level).

[edit4] edit3 was wrong. The chance of failure for a 6-member RAID 5 array of 50k-hour-MTBF HDDs that already has one dead drive is 8.2% over a 24-hour period, not 1.7%; over 4 hours it is 1.3%, not 0.27%.
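For what it's worth, here is a template for that kind of back-of-the-envelope check under the usual exponential failure model, where P(a drive fails within t hours) = 1 - exp(-t/MTBF) and drives are assumed independent. The MTBF figure, member count, and independence assumption are all inputs you'd want to revisit, so treat this as the method rather than a definitive figure:

```python
# Sanity-check template, assuming an exponential failure model with
# independent drives. All parameters are the assumptions quoted above.
import math

def p_any_failure(n_drives, mtbf_hours, window_hours):
    """Probability that at least one of n drives fails within the window."""
    p_one = 1 - math.exp(-window_hours / mtbf_hours)
    return 1 - (1 - p_one) ** n_drives

# 6-member RAID 5 with one drive already dead: 5 survivors at risk.
print(100 * p_any_failure(5, 50_000, 24))   # % chance of data loss in 24 h
```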

halfelite:
You could always run triple-parity RAID-Z. Unless you are running 12+ drives at home, I see no reason for triple parity. And even then, running an offline RAID 0 system as a backup gives you more failure protection than just adding more parity, and the cost-effectiveness is not that bad.

Proin Drakenzol:
I could see needing n-parity RAID first for military applications where speed of reconstitution through the use of hot-swappable drives is more important than cost efficiency or even data transfer efficiency. However a corporation is probably better off using a cheaper solution involving nested and alternating active/inactive RAIDs than investing in n-parity capability.

nstgc:

--- Quote from: halfelite on December 22, 2009, 08:40:55 PM ---You could always run triple-parity RAID-Z. Unless you are running 12+ drives at home, I see no reason for triple parity. And even then, running an offline RAID 0 system as a backup gives you more failure protection than just adding more parity, and the cost-effectiveness is not that bad.

--- End quote ---

The discussion isn't on home use.


--- Quote from: Proin Drakenzol on December 22, 2009, 10:46:31 PM ---I could see needing n-parity RAID first for military applications where speed of reconstitution through the use of hot-swappable drives is more important than cost efficiency or even data transfer efficiency. However a corporation is probably better off using a cheaper solution involving nested and alternating active/inactive RAIDs than investing in n-parity capability.

--- End quote ---

Wouldn't higher parity be more cost-effective? Isn't it more a software limitation than a hardware limitation? Once the software is developed, wouldn't the price scale more favourably than a nested array? While I agree that a nested RAID is a better solution, it doesn't seem as if it's necessarily more cost-effective. I don't know. I've only dealt with non-nested RAIDs at the level of a home computer. (A four-member RAID 5 using a dedicated card with its own XOR processor. The damn card died! Redundancy is great, but it doesn't do any good if the controller is shit.)

One thing I don't understand is why you can't just use something like par2 (an archive-repair program that uses Reed-Solomon coding), but with drives. It scales very well.
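The par2 idea can be sketched in miniature: treat the k data "drives" as defining a polynomial, publish extra evaluations of it as parity "drives", and any k surviving evaluations recover everything. This toy uses the prime field GF(257) for readability (par2 itself works over GF(2^16)), and all the names here are illustrative:

```python
# Toy par2-style erasure code over GF(257): k data points determine a
# degree-(k-1) polynomial; any k of the k+m published evaluations
# reconstruct any lost point by Lagrange interpolation.
P = 257  # small prime field; real par2 uses GF(2^16), but the idea is the same

def lagrange_at(points, x):
    """Interpolate the polynomial through `points` and evaluate it at x (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # den^-1 via Fermat
    return total

data = [10, 20, 30, 40]                         # "drive" contents at x = 0..3
pts = list(enumerate(data))
shares = [(x, lagrange_at(pts, x)) for x in range(6)]   # 4 data + 2 parity
survivors = [s for s in shares if s[0] not in (1, 4)]   # lose 2 of 6 drives
assert lagrange_at(survivors, 1) == 20          # rebuild lost drive 1
```

Adding one more parity "drive" just means one more evaluation, which is why the coding itself scales so gracefully; the RAID-specific pain is doing this online, per stripe, at disk bandwidth, with writes updating every parity row.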
