RAID: Part 6 – Wrap-Up

Finally the end – what a long, wordy trip it has been.  If you waded through all 5 posts, awesome!

As a final post, I wanted to attempt to bring all of the high points together and draw some contrasts between the RAID types I’ve discussed.  My goal with this post is less about the technical minutiae and more about providing some strong direction to equip readers to make informed decisions.

Does Any of This Matter?

I always spend some time asking myself this question as I dive further and further down the rabbit hole on topics like this.  It is certainly possible to interact with storage without understanding the details of RAID.  However, I am a firm believer that you should understand it.  RAID is the foundation on which everything is built.  It is used in almost every storage platform out there.  It dictates behavior.  Making a smart choice here can save you money or waste it.  It can improve storage performance or cripple it.

I also like the idea that understanding the building blocks can later empower you to understand even more concepts.  For instance, if you’ve read through this series, you understand mirroring, striping, and parity.  Pop quiz: what would a RAID5/0 look like?

[Diagram: RAID5/0 – RAID5 groups acting as members of a top-level RAID0 stripe]

Pretty neat that even without me describing it in detail, you can understand a lot about how this RAID type would function.  You’d know the failure capabilities and the write penalties of the individual RAID5 members.  And you’d know that the configuration couldn’t survive the loss of either RAID5 set as a whole because of the top-level striping.  And let’s say that I told you the strip size of the RAID5 group was 64KB, and that the strip size of the RAID0 config was 256MB.  Believe it or not, this is a pretty accurate description of a 10-disk VNX2 storage pool from a single-tier RAID5 perspective.
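To make the quiz concrete, here is a toy address-mapping sketch of that layout – two 4+1 RAID5 groups striped together, with the 64KB and 256MB strip sizes mentioned above.  The numbers and layout are illustrative assumptions only; it deliberately ignores parity rotation and the real internals of a VNX2 pool.

```python
# A toy address-mapping sketch for the RAID5/0 quiz above: two RAID5 groups
# striped at a 256MB granularity, 64KB strips inside each group. Illustration
# only - it ignores parity rotation and real pool internals.
TOP_STRIP = 256 * 1024 * 1024      # RAID0 strip across the RAID5 groups
R5_STRIP  = 64 * 1024              # strip within a RAID5 group
R5_GROUPS = 2                      # e.g. two 4+1 groups in a 10-disk pool
DATA_DISKS = 4                     # data-bearing disks per 4+1 group

def locate(offset):
    group = (offset // TOP_STRIP) % R5_GROUPS   # which RAID5 set (RAID0 layer)
    within = offset % TOP_STRIP                 # offset inside that 256MB slice
    strip = within // R5_STRIP
    disk = strip % DATA_DISKS                   # which data disk (RAID5 layer)
    return group, disk

print(locate(0))                    # (0, 0): first 64KB -> group 0, disk 0
print(locate(64 * 1024))            # (0, 1): next strip rotates within group 0
print(locate(256 * 1024 * 1024))    # (1, 0): past 256MB -> the second RAID5 group
```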

Again, to me this is part of the value – when fancy new things come out, the fundamental building blocks are often the same.  If you understand the functionality of the building block, then you can extrapolate the functionality of many things.  And if I give you a new storage widget to look at, you’ll instantly understand certain things about it based on the underlying RAID configuration.  It puts you in a much better position than just memorizing that RAID5 is “parity.”

Okay, I’m off my soapbox!

Workload – Read

  • RAID1/0 – Great
  • RAID5 – Great
  • RAID6 – Great

I’ve probably hammered this home by now, but when we are looking at largely read workloads (or just the read portion of any workload) the RAID type is mostly irrelevant from a performance perspective in non-degraded mode.  But as with any blanket statement, there are caveats.  Here are some things to keep in mind.

  • Your read performance will depend almost entirely on the underlying disks (ignoring sequential reads and prefetching).  I’m not talking about the obvious flash vs NLSAS; I’m talking about RAID group sizing.  As a general statement, RAID1/0 performs identically to RAID5 for pure read workloads, but an 8-disk RAID1/0 is going to outperform a 4+1 RAID5 simply because it has more spindles servicing reads.
  • Ask the question and do tests to confirm: does your storage platform round-robin reads between mirror pairs in RAID1/0?  If not (and not all controllers do), your RAID1/0 read performance is going to be constrained to half of the spindles.  Following the previous bullet, our 8-disk RAID1/0 would then be outperformed by a 4+1 RAID5 in reads because only 4 of the 8 spindles are actually servicing read requests (see the sketch below).
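To put some rough numbers on those two bullets, here is a minimal sketch.  The ~180 random read IOPS per spindle figure is an assumption for illustration, not a vendor spec, and it assumes reads spread evenly across whichever spindles actually serve them.

```python
# Hypothetical back-of-the-envelope read throughput comparison.
DISK_READ_IOPS = 180  # assumed per-spindle figure

def read_iops(total_disks, active_fraction=1.0):
    """Aggregate read IOPS given how many spindles actually serve reads."""
    return total_disks * active_fraction * DISK_READ_IOPS

raid5_4p1   = read_iops(5)         # all 5 spindles serve reads
raid10_rr   = read_iops(8)         # controller round-robins across both mirrors
raid10_norr = read_iops(8, 0.5)    # only the "primary" side serves reads

print(f"4+1 RAID5:              {raid5_4p1:.0f} IOPS")
print(f"8-disk RAID1/0 (RR):    {raid10_rr:.0f} IOPS")
print(f"8-disk RAID1/0 (no RR): {raid10_norr:.0f} IOPS")
# 4+1 RAID5: 900; RAID1/0 with round-robin: 1440; without: 720
```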

Workload – Write

  • RAID1/0 – Great (write penalty of 2)
  • RAID5 – Okay (write penalty of 4)
  • RAID6 – Bad (write penalty of 6)

Writes are where the RAID types start to diverge pretty dramatically due to the vastly different write penalties between them.  Yet, once again, people sometimes draw the wrong conclusion from the general idea that RAID1/0 is more efficient at writes than RAID6.

  • The underlying disk structure is still dramatically important.  A lot of people seem to focus on “workload isolation,” meaning e.g. with a database that I would put the data on RAID5 and the transaction logs on RAID1/0.  This is a great idea from a design perspective when starting with a blank slate.  However, what if the RAID5 pool I’m working with is 200 disks and I only have 4 disks for RAID1/0?  In that case I’m pretty much a lock to have better success dropping logs into the RAID5 pool because there are WAY more spindles to support the I/O.  There are a lot of variables here around the workload, but the point I’m trying to make is that you should look at all the parts as a whole when making these decisions.
  • If your write workload is large block sequential, take a look at RAID5 or RAID6 over RAID1/0 – you will typically see much more efficient I/O in these cases (a rough sketch follows this list).  However, make sure you do proper analysis so you don’t end up with heavy small block random writes on RAID6.
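As a rough illustration of why large block sequential writes favor parity RAID, here is a hedged sketch of back-end cost per host write – small random writes counted in back-end I/Os (the familiar write penalties) versus ideal, aligned full-stripe writes counted in bytes written to disk per host byte.  The group sizes are assumptions for illustration only.

```python
# A rough sketch (not vendor math) contrasting small random writes with
# ideal full-stripe sequential writes. "data_disks" is the data-bearing
# disk count in a parity group (e.g. 4 in a 4+1).

def small_random_ios(raid_type):
    """Back-end I/Os generated by one small random host write."""
    return {"RAID1/0": 2, "RAID5": 4, "RAID6": 6}[raid_type]

def full_stripe_overhead(raid_type, data_disks=4):
    """Bytes written to disk per host byte for an aligned full-stripe write."""
    if raid_type == "RAID1/0":
        return 2.0                               # every byte mirrored
    parity = {"RAID5": 1, "RAID6": 2}[raid_type]
    return (data_disks + parity) / data_disks    # data + parity, no read-modify-write

for rt in ("RAID1/0", "RAID5", "RAID6"):
    print(rt, small_random_ios(rt), "IOs per small write,",
          round(full_stripe_overhead(rt), 2), "bytes written per host byte (full stripe)")
# RAID1/0: 2 IOs, 2.0x;  RAID5: 4 IOs, 1.25x;  RAID6: 6 IOs, 1.5x
```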

Going back and re-reading some of my previous posts, I feel like I may have given the impression that I don’t like RAID1/0, or that I don’t see value in it.  That is certainly not the case, so I want to draw out an example of when you need RAID1/0 without question: when we see a “lot” of small block random writes and don’t need excessive amounts of capacity.  What is a “lot”?  Good question.  Typically the breaking point is around a 30-40% write ratio.

Given that a SAS drive should only be asked to support around 180 IOPS, let’s crunch some numbers for an imaginary 10,000 front-end IOPS workload.  How many spindles do we need to support the workload at specific read/write ratios?  (I will do another blog post on the specifics of these calculations; a rough sketch of the math also follows the table.)

Read/Write Ratio    RAID1/0 disk count    RAID5 disk count    RAID6 disk count
90% / 10%           62                    73                  78
75% / 25%           70                    98                  123
60% / 40%           78                    125                 167

So, at lighter write percentages, the difference in RAID type doesn’t matter as much.  But as we already learned, RAID1/0 is the most efficient at back-end writes, and this becomes glaringly apparent at the 60/40 split.  In fact, I need more than twice as many spindles if I choose RAID6 instead of RAID1/0 to support the workload.  Twice the hardware up front, and then twice the power suckers and heat producers sitting in your data center for years.
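For the curious, here is a minimal sketch of the arithmetic behind the table, assuming back-end IOPS = reads + (writes × write penalty) and ~180 IOPS per spindle.  It lands on or very near the figures above; the small differences in a couple of cells presumably come from rounding to practical RAID group sizes.

```python
import math

DISK_IOPS = 180
WRITE_PENALTY = {"RAID1/0": 2, "RAID5": 4, "RAID6": 6}

def spindles_needed(front_end_iops, read_fraction, raid_type):
    reads = front_end_iops * read_fraction
    writes = front_end_iops * (1 - read_fraction)
    back_end_iops = reads + writes * WRITE_PENALTY[raid_type]  # disk-facing IOPS
    return math.ceil(back_end_iops / DISK_IOPS)

for read_pct in (90, 75, 60):
    counts = {rt: spindles_needed(10_000, read_pct / 100, rt) for rt in WRITE_PENALTY}
    print(f"{read_pct}% read / {100 - read_pct}% write:", counts)
```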

Capacity Factor

  • RAID1/0 – Bad (50% penalty)
  • RAID5 – Great (generally ~20% penalty or less)
  • RAID6 – Great (generally ~25% penalty or less)

Capacity is a pretty straightforward thing so I’m not going to belabor the point – you need some amount of usable capacity, and you can very quickly calculate how many disks you need for the different RAID types (a quick sketch of that math follows the bullets below).

  • You can get more or less capacity out of RAID5 or 6 by adjusting RAID group size, though remember the protection caveats.
  • Remember that in some cases (for instance, storage pools on an EMC VNX) a choice of RAID type today locks you in on that pool forever.  By this I mean: if someone talks you into RAID1/0 today and it isn’t needed, not only is it needlessly expensive today, but as you add capacity to that pool it stays needlessly expensive for years.
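If it helps, here is a quick, hedged sketch of that disk-count math for a usable-capacity target.  The drive size, target, and group sizes (4+1, 6+2) are illustrative assumptions, not recommendations.

```python
# How many disks of a given size do I need to hit a usable-capacity target?
import math

def disks_for_capacity(usable_tb, disk_tb, raid_type, group_data=4, group_parity=1):
    if raid_type == "RAID1/0":
        usable_per_disk = disk_tb * 0.5                         # 50% mirroring penalty
        return math.ceil(usable_tb / usable_per_disk / 2) * 2   # whole mirror pairs
    # Parity RAID: each group of (data + parity) disks yields data_disks of capacity
    group_total = group_data + group_parity
    usable_per_group = disk_tb * group_data
    return math.ceil(usable_tb / usable_per_group) * group_total

need, disk = 100, 1.2   # 100 TB usable from 1.2 TB drives (illustrative numbers)
print("RAID1/0:  ", disks_for_capacity(need, disk, "RAID1/0"))   # 168 disks
print("RAID5 4+1:", disks_for_capacity(need, disk, "RAID5", 4, 1))  # 105 disks
print("RAID6 6+2:", disks_for_capacity(need, disk, "RAID6", 6, 2))  # 112 disks
```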

Protection Factor

  • RAID1/0 – Lottery! (meaning, there is a lot of random chance here)
  • RAID5 – Good
  • RAID6 – Great

As we’ve discussed, the types vary in protection factor as well.

  • Because of RAID1/0’s lottery factor on losing the 2nd disk, the only thing we can state for certain is that RAID1/0 and RAID6 are better than RAID5 from a protection standpoint.  By that I mean, a 2nd simultaneous disk failure will invalidate a RAID1/0 set only if it happens to be the other half of the already-degraded mirror pair; there is a decent chance it won’t be.  For RAID5, a 2nd simultaneous failure will invalidate the set every time.
  • Remember that RAID1/0 is much better behaved in a degraded and rebuild scenario than RAID5 or 6.  If you are planning on squeezing every ounce of performance out of your storage and can’t tolerate the performance hit of a degraded or rebuilding parity group, RAID1/0 is probably a better choice.  Although I will say that I don’t recommend running a production environment that close to the edge!
  • You can squeeze extra capacity out of RAID5 and 6 by increasing the RAID group size, but keep it within sane limits.  Don’t forget the extra trouble you can have from a fault domain and degraded/rebuild standpoint as the RAID group size gets larger.
  • Finally, remember that RAID is not a substitute for backups.  RAID will do the best it can to protect you from physical failures, but it has limits and does nothing to protect you from logical corruption.

Summary

I think I’ve established that there are a lot of factors to consider when choosing a RAID type.  At the end of the day, you want to satisfy requirements while saving money.  In that vein, here are some summary thoughts.

If you have a very transactional database, or are looking into VDI, RAID1/0 is probably going to be very appealing from a cost perspective because these workloads tend to be IOPS constrained with a heavy write percentage.  On the other hand, less transactional databases, application storage, and file storage tend to be capacity constrained with a low write percentage.  In these cases RAID5 or 6 is going to look better.

In general the following RAID types are a good fit in the following disk tiers, for the following reasons:

  • EFD (a.k.a. Flash or SSD) – RAID5.  Response time here is not really an issue; instead you want to squeeze as much usable capacity as possible out of them, ’cause these puppies are pricey!  RAID5 does that for us.
  • SAS (a.k.a. FC) – RAID5 or RAID1/0.  The choice here hinges on write percentage.  RAID6 on these guys is typically a waste of space and added write penalty; they rebuild fast enough that RAID5 is acceptable.  Note – as these disks get larger and larger this may shift towards RAID1/0 or RAID6 due to rebuild times or even UBEs, but these are enterprise grade drives with a drastically lower UBE rate.
  • NLSAS (a.k.a. SATA) – RAID6.  Please use RAID6 for these disks.  As previously stated, they need the added protection of the extra parity, and you should be able to justify the cost.

Again, this is just in general, and I can’t overstate the need for solid analysis.

Hopefully this has been accurate and useful. I really enjoyed writing this up and hope to continue producing useful (and accurate!) material in the future.

RAID: Part 3 – RAID 1/0

So, if you have been following along dear reader, we are now up to speed on several things.  We have discussed mirroring (and RAID1, which leverages it) and striping (and RAID0, which leverages that).  We have also discussed RAID types using some familiar and standard terminology which will allow us to compare and contrast the versions moving forward.

Now, on to the big dog of RAID – RAID 1/0.  This is called “RAID one zero” and “RAID ten,” and sometimes “RAID one plus zero” (and indicated as RAID 1+0).  I have never heard it called “RAID one slash zero” but perhaps somebody somewhere does that also.  All of these things are referring to the same thing, and RAID ten is the most common term for it.

Why do we need RAID1/0?

In this section I wanted to ask a sometimes overlooked question – what are the problems with RAID0 and RAID1 that cause people to need something else?

If you know about RAID0 (or even better, if you read Part 2) you should have an excellent idea of its failings.  Just to reiterate, the problem with RAID0 is that it only leverages striping, and striping only provides a performance enhancement.  It provides nothing in the way of protection, hence any disk that fails in a RAID0 set will invalidate the entire set.  RAID0 is the ticking time bomb of the storage world.

RAID1’s problems aren’t quite as obvious as the “one disk failure = worst day ever” of RAID0, but once again let’s go back to Part 1 and look at the benefits of RAID I listed:

  1. Protection – RAID (except RAID0) provides protection against physical failures.  Does RAID1 provide that?  Absolutely – RAID1 can survive a single disk failure.  Check box checked.
  2. Capacity – RAID also provides a benefit of capacity aggregation.  Does RAID1 provide that?  Not at all.  RAID1 provides no aggregate capacity or aggregate free space benefit because there are always exactly two disks in a RAID1 pair, and the usable capacity penalty is 50%.  Whether my RAID1 set uses 600GB drives or 3TB drives, I get no aggregation beyond the idea of just splitting a disk up into logical partitions…which can be done on a single disk without RAID in the first place.
  3. Performance – RAID provides a performance benefit since it is able to leverage additional physical spindles.  Does RAID1 provide that?  The answer is yes…sort of.  It does provide two spindles instead of one, which fits the established definition.  However, there are some caveats.  There isn’t a performance boost on writes because of the 2:1 write penalty (both spindles are used for every single write).  There is a performance boost on reads because the controller can effectively round-robin read requests back and forth between the disks.  But, and it’s a BIG but, there are only two spindles.  There are only ever going to be two spindles.  Unlike a RAID0 set, which can have as many disks as I’m willing to risk my data over, a RAID1 set is performance bound to exactly two spindles.

Essentially the problem with the mirrored pair is just that – there are only ever going to be two physical disks.

By now it may have become obvious, but RAID0 and RAID1 are almost polar opposites.  RAID1’s benefit lies mostly around protection, and RAID0’s benefit is performance and capacity.  RAID1 is the stoic peanut butter, and RAID0 is the delicious jelly.  If only there was a way to leverage them both….

What is RAID1/0?

RAID1/0 is everything you wanted out of RAID0 and RAID1. It is the peanut butter and jelly sandwich.  (Note: please do not attempt to combine your storage array with peanut butter or jelly.  Especially chunky peanut butter.  And even more especiallyer chunky jelly)

Essentially RAID1/0 looks like a combination of RAID1 and RAID0, hence the label.  More accurately, it is a combination of mirroring and striping in that order.  RAID1/0 replaces the individual disks of a RAID0 stripe set with RAID1 mirror pairs.  It is also important to understand what RAID1/0 is and what it is not.  It is true that it leverages the good things out of both RAID types, but it also still maintains the bad things of both RAID types. This will become apparent as we dive into it.

[Diagram: eight-disk RAID1/0 – four RAID1 mirrored pairs (green) striped together at the top level (orange)]

This is a busy image, but bear with me as I break it down.

  • This is an eight-disk RAID1/0 configuration, and (similar to the Part 2 examples) we are writing A, B, C, D to it.  For simplicity’s sake we ignore write order and just go alphabetically.
  • The orange and green help indicate what is happening at their particular parts of the diagram
  • The physical disks themselves (the black boxes) are in mirrored pairs that should hopefully be familiar by now (indicated by the green boxes and plus signs).  This is the same RAID1 config that I’ve covered previously.
  • The weirdness picks up at the orange part.  The orange box indicates that we are striping across every mirrored pair.  This is identical to the RAID0 configuration, except that the physical disks of the RAID0 config have been replaced with these RAID1 pairs.

This is what is meant by RAID1/0.  First comes RAID1 – we build mirrored pairs.  Then comes RAID0 – we stripe data across the members, which happen to be those mirrored pairs.  It may help to think about RAID1/0 as RAID0 with an added level of protection at the member level (since we know RAID0 provides no protection otherwise).

As the host writes A,B,C,D, the diagram indicates where the data will land, but let’s cover the order of operations.

  1. The host writes A to the RAID1/0 set
  2. A is intercepted by the RAID controller.  The particular strip it is targeted for is identified.
  3. The strip is recognized to be on a mirrored pair, and due to the mirror configuration the write is split.
  4. A lands on both disks that make up the first member of the RAID0 set.
  5. Once the write is confirmed on both disks, the write is acknowledged back to the host as completed
  6. The host writes B to the RAID1/0 set
  7. B is intercepted by the RAID controller.  The particular strip it is targeted for is identified.  Due to the mirror configuration the write is split.
  8. B lands on both disks that make up the second member of the RAID0 set.
  9. Once the write is confirmed on both disks, the write is acknowledged back to the host as completed
  10. The host writes C to the RAID1/0 set
  11. etc.
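For the code-minded, here is a toy sketch of that order of operations – not any vendor’s controller logic, just the striping-then-mirroring decision in miniature.  The strip size and disk names are made up for illustration.

```python
# Toy RAID1/0 write path: pick the stripe member for the target strip,
# split the write across both disks of that mirrored pair, and ack only
# once both copies have landed.

STRIP_SIZE = 64 * 1024          # assumed strip size for illustration
pairs = [("disk0", "disk1"), ("disk2", "disk3"),
         ("disk4", "disk5"), ("disk6", "disk7")]   # four RAID1 members

def disk_write(disk, offset, data):
    print(f"{len(data)} bytes -> {disk} @ {offset}")

def write(lba_offset, data):
    strip_index = lba_offset // STRIP_SIZE
    member = pairs[strip_index % len(pairs)]    # RAID0 striping across pairs
    for disk in member:                         # RAID1 mirroring: split the write
        disk_write(disk, lba_offset, data)
    return "ack"                                # acknowledged after both copies land

write(0, b"A" * STRIP_SIZE)            # lands on disk0 and disk1
write(STRIP_SIZE, b"B" * STRIP_SIZE)   # lands on disk2 and disk3
```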

Hopefully this gives an accurate, comprehensible version of the how’s of RAID1/0.  Now, let’s look at RAID1/0 using the same terminology we’ve been using.

From a usable capacity perspective, RAID1/0 maintains the same penalty as RAID1.  Because every member is a RAID1 pair, and every RAID1 pair has a 50% capacity penalty, it stands to reason that RAID1/0 also has a 50% capacity penalty as a whole.  No matter how many members are in a RAID1/0 group, the usable capacity penalty is always 50%.

The write penalty is a similar tune.  Because every member is a RAID1 pair, and every RAID1 pair has a 2:1 write penalty, RAID1/0 also has a write penalty of 2:1.  Again no matter how many members are in the set, the write penalty is always 2:1.

RAID1/0 reminds me of the Facts of Life. You know, you take the good, you take the bad?  RAID1/0 is a leap up from RAID0 and RAID1, but it doesn’t mean that we’ve gotten rid of their problems.  It is better to think that we’ve worked around their problems.  The same usable capacity penalty exists, but now I have the ability to aggregate capacity by putting more and more members into a RAID1/0 configuration.  The same write penalty exists, but again I can now add more spindles to the RAID1/0 configuration for a performance boost.
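A quick sketch of that last point, assuming ~180 IOPS per spindle and a purely random small-block write workload: the 2:1 penalty never changes, but every mirrored pair added raises the ceiling.

```python
# "The penalty stays 2:1, but the spindle count scales."
DISK_IOPS = 180
WRITE_PENALTY = 2          # every host write becomes two disk writes

for pairs in (1, 2, 4, 8):
    disks = pairs * 2
    host_write_iops = disks * DISK_IOPS / WRITE_PENALTY
    print(f"{pairs} pair(s) ({disks} disks): ~{host_write_iops:.0f} host write IOPS")
# 1 pair: ~180, 2 pairs: ~360, 4 pairs: ~720, 8 pairs: ~1440
```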

The protection factor is weird, but still a combination of the two.  How many disk failures can a RAID1/0 set survive?  The answer is, it depends.  There is still striping on the outer layer, and by now we have beaten the dead horse enough to know that RAID0 can’t lose any physical disks.  It is a little clearer, especially for this transition, to put it this way: RAID0 can’t survive any member failures, and in traditional RAID0 the members are physical disks.  In this respect RAID1/0 is the same: RAID1/0 can’t survive any member failures.  The difference is that a member is now made up of two physical disks that are protecting each other.  So can a RAID1/0 set lose a disk and continue running?  Absolutely – RAID1/0 can always survive one physical disk failure.

…But, can it survive two?  This is where it gets questionable.  If the second disk failure is the other half of the mirrored pair, the data is toast.  Just as toast as if RAID0 had lost one physical disk since the effect is the same.  But what if it doesn’t lose that specific disk?  What if it loses a disk that is part of another RAID1 pair?  No problem, everything keeps running.  In fact, in our example, we can lose 4 disks like this and keep running:

[Diagram: RAID1/0 surviving four disk failures – one disk lost from each mirrored pair]

You can lose as many as half of the disks in the RAID1/0 set and continue running, just as long as they are the right disks.  Again, if we lose two disks like this, ’tis a bad day:

[Diagram: RAID1/0 data loss – both disks of the same mirrored pair have failed]

So there are a few rules about the protection of RAID1/0:

  • RAID1/0 can always survive a single disk failure
  • RAID1/0 can survive multiple disk failures, so long as the disk failures aren’t within the same mirrored pair
  • With RAID1/0, data loss can occur with as few as two disk failures (if they are part of the same mirror pair) and is guaranteed to occur at (n/2)+1 failures, where n is the total disk count in the RAID1/0 set (see the sketch below for how likely the two-failure case is).
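To put a rough number on the lottery, here is a small sketch of the odds that a second random failure is fatal, given that one disk in an n-disk RAID1/0 set has already died.  This is just the naive count – it ignores rebuild windows and correlated failures.

```python
# Odds that a second random failure kills an already-degraded RAID1/0 set.
def second_failure_fatal_odds(n_disks):
    # exactly one of the remaining n-1 disks is the dead disk's mirror partner
    return 1 / (n_disks - 1)

for n in (4, 8, 16):
    print(f"{n}-disk RAID1/0: {second_failure_fatal_odds(n):.0%} chance "
          f"the second failure is fatal")
# 4 disks: 33%, 8 disks: 14%, 16 disks: 7% - versus 100% for RAID5
```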

Degraded and rebuild concepts are identical to RAID1 because the striping portion provides no protection and no rebuild ability.

  • Any mirror pair in degraded mode will see a write performance increase (splitting writes is no longer necessary) and potentially a read performance decrease.  Other mirror pairs continue to operate as normal.
  • Any mirror pair in rebuild mode will see a heavy performance penalty.  Other mirror pairs continue to operate as normal with no performance penalty.

Why not RAID0/1?

This is one of my favorite interview questions, and if you are interviewing with me (or at places I’ve been) this might give you a free pass on at least one technical question.  I picked it up from a colleague of mine and have used it ever since.

Why not RAID0/1?  Or is there even a concept of RAID0/1?  Would it be the same as RAID1/0?

It does exist, and it is extremely similar on the surface.  The only difference is the order of operations: RAID1/0 is mirrored, then striped, and RAID0/1 is striped, then mirrored.  This seemingly minor difference in theory actually manifests as a very large difference in practice.

[Diagram: RAID0/1 – two RAID0 stripe sets mirrored against each other]

Most things about RAID0/1 are identical to RAID1/0 (like performance and usable capacity), with one notable exception – what happens during disk failure?

I covered the failure process of RAID1/0 above so I won’t rehash that.  For RAID0/1, remember that any failure of a RAID0 member invalidates the entire set.  So, what happens when the top left disk in RAID0/1 fails?  Yep, the entire top RAID0 set fails, and now it is effectively running as RAID0 using only the bottom set.

This has two implications.  The more severe is that once RAID0/1 has lost a disk, the surviving stripe set is carrying the data alone, so any second failure on that side is fatal – the only two-disk failures it can ride out are ones where both land in the already-failed set.  The other is that if a disk failed and a hot spare was available (or the bad disk was swapped out with a good one), the rebuild involves the entire stripe set rather than just a single mirror partner.

It would be possible to design a RAID controller to get around this.  It could recognize that a valid copy of each member still exists in the other stripe set and keep running from there.  But then essentially what it is doing is trying to make RAID0/1 behave like RAID1/0.  Why not just use RAID1/0 instead?  That is why RAID1/0 is a common implementation and RAID0/1 is not.
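A tiny sketch to make the contrast concrete, using the eight-disk layouts pictured above (the disk numbering is mine): count how many of the possible two-disk failure combinations leave an intact copy of the data, treating RAID0/1 the way it is described above – once a stripe set loses any disk, you are effectively running on the other set alone.

```python
# Count which two-disk failure combinations each layout can survive.
from itertools import combinations

DISKS = range(8)
R10_PAIRS = [(0, 1), (2, 3), (4, 5), (6, 7)]     # mirror pairs, striped
R01_SETS  = [(0, 1, 2, 3), (4, 5, 6, 7)]         # stripe sets, mirrored

def raid10_survives(failed):
    # survives as long as no mirror pair has lost both disks
    return all(not set(p) <= set(failed) for p in R10_PAIRS)

def raid01_survives(failed):
    # survives as long as at least one whole stripe set is untouched
    return any(not set(s) & set(failed) for s in R01_SETS)

def count_survivable(survives):
    return sum(survives(f) for f in combinations(DISKS, 2))

total = len(list(combinations(DISKS, 2)))        # 28 possible two-disk failures
print(f"RAID1/0 survives {count_survivable(raid10_survives)}/{total} two-disk failures")
print(f"RAID0/1 survives {count_survivable(raid01_survives)}/{total} two-disk failures")
# RAID1/0: 24/28; RAID0/1: 12/28 (only when both failures hit the same stripe set)
```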

Wrap Up

In Part 4 I’m going to cover parity and hopefully RAID5 and 6, and then I’ll provide some notes to bring the entire discussion together.  However, I wanted to include some thoughts about RAID1/0 in case someone stumbled on this and had some specific questions or issues related to performance, simply because I’ve seen this a lot.

RAID1/0 performs more efficiently than other RAID types from a write perspective only.  A lot of people seem to think that RAID1/0 is “the fastest one,” and hence should always be used for performance applications.  This is demonstrably untrue.  As I’ve stated previously, there is no such thing as a read penalty for any RAID type.  If your application is entirely or mostly read oriented, using RAID1/0 instead of RAID5 or 6 does nothing but cost you money in the form of usable capacity.  And yes, there are workloads with enormous performance requirements that are 100% read.

RAID1/0 has a massive usable capacity penalty.  If you are protecting data with RAID1/0, you need to purchase twice as much raw storage as the data requires.  If you are replicating that data like-for-like, you need to purchase four times as much.  Additionally, sometimes your jumping-off point locks you into a RAID type, so a decision to use RAID1/0 today may impact the cost of storage for years to come.  I can’t emphasize this point enough – RAID1/0 is extremely expensive and not always needed.

I like to think of people who always demand RAID1/0 like the people who might bring a Ferrari when asked to “bring your best vehicle.”  But it turns out, I needed to tow a trailer full of concrete blocks up a mountain.  Different vehicles are the best at different things…just like RAID types.  We need to fully understand the requirements before we bring the sports car.

If you are having performance problems, or more likely someone is telling you they are having performance problems, jumping from RAID5 to RAID1/0 may not do a thing for you.  It is important to do a detailed analysis of the ENTIRE storage environment and figure out the best-fit solution.  You don’t want to be that guy who advocated a couple hundred thousand dollars of storage purchases when it turned out the problem was a host misconfiguration.