
RAID Theory

By Ashton Mills
10:52 May 19, 2008 | 4 Comments
Tags: RAID
On that note, we have to talk about the RAID hardware component. First and foremost, let’s be clear: the onboard RAID controllers that you find on NVIDIA- and Intel-based motherboards are not hardware RAID. Don’t be fooled by the fact the feature is provided by a ‘chip’ on the motherboard – this is little more than disk translation firmware. In truth, onboard RAID controllers – often called FakeRAID controllers – do all their work in their driver in the OS. In other words, your CPU is the RAID ‘hardware’.

Proper hardware RAID controllers are external plug-in cards that utilise their own onboard processors, and sometimes memory as well. They are optimised to manage RAID devices as well as to provide for more complex RAID schemes like RAID 5 and above. Yes, many FakeRAID controllers also support RAID 5, but they do it rather poorly and won’t give you good performance – if you’re determined to use RAID 5, buy a proper hardware solution. For onboard controllers, stick to RAID 0, RAID 1, or RAID 0+1.

Of course, external RAID cards are also expensive whereas onboard motherboard RAID is technically free. As always, you get what you pay for. Still, while they won’t work miracles you can build some pretty beefy arrays with onboard controllers and, more importantly, software RAID.

What’s software RAID? Both Windows and Linux come with the capability to build arrays through the OS itself. If you’re wondering what the difference is between driver-based FakeRAID controllers and software RAID, seeing as they both use the CPU, the answer is not a lot – although, as we’ll also cover later in the benchmarks, they can perform quite differently (see, we think of everything!).


Stripe science
Now onto the meat of this feature – the all-important stripe size. Quick primer for those scratching their heads: the stripe size is the size of each chunk of data written to one drive before the array moves on to the next drive. If you have a 96k file to write on a three-drive array with a stripe size of 64k, it will be striped across two drives. If the array used a stripe size of 32k, it would span all three drives. Having files striped across more drives is conducive to better throughput, but this doesn’t necessarily translate to better performance – it all comes back to the workload.
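The 96k example can be checked with a few lines of Python. This is only a rough sketch: the function name `drives_spanned` is made up for illustration, and it assumes the file starts exactly on a stripe boundary and that stripes are handed out to the drives in simple round-robin order.

```python
import math

def drives_spanned(file_size_kb, stripe_size_kb, num_drives):
    """Number of drives a file touches in a RAID 0 array.

    Assumes the file begins on a stripe boundary (an illustrative
    simplification, real files can start mid-stripe).
    """
    if file_size_kb <= 0:
        return 0
    # How many stripe-sized chunks the file is cut into.
    stripes_needed = math.ceil(file_size_kb / stripe_size_kb)
    # A file can never touch more drives than the array has.
    return min(stripes_needed, num_drives)

print(drives_spanned(96, 64, 3))  # 2
print(drives_spanned(96, 32, 3))  # 3
```

With a 64k stripe the 96k file is cut into two chunks and lands on two drives; with a 32k stripe it is cut into three and lands on all three.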

Unfortunately some enthusiasts new to RAID will select the ‘recommended’ stripe size in their FakeRAID controller, usually 128k. Conventional wisdom states that a stripe size of 128k and above is ideal for large file transfer performance, and indeed this is true – but only because it’s assumed the files are so large as to spread across multiple disks with the given stripe size. What actually gives better throughput, however, are smaller stripes – the smaller they are, the more files will be striped across the drives, and in more stripes (that is, more files spanning more drives).

When it comes to seek performance rather than throughput, larger stripes are actually better. Why? Because small files will reside on one disk only, freeing up the other disks for the next I/O request. Smart RAID algorithms can use this to parallelise requests across the disks, each serving up separate complete files.
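To make the parallelism argument concrete, here is a small Python sketch that works out which drives a read touches and whether two reads could be served at the same time. The function names and the sample offsets are hypothetical, and the round-robin stripe layout is an assumption; a real controller's scheduling is more involved than a simple overlap test.

```python
def disks_touched(offset_kb, size_kb, stripe_size_kb, num_drives):
    """Set of drive indices a read of size_kb at offset_kb touches.

    Assumes a plain RAID 0 layout where stripe number s lives on
    drive s % num_drives (illustrative, not any specific controller).
    """
    first_stripe = offset_kb // stripe_size_kb
    last_stripe = (offset_kb + size_kb - 1) // stripe_size_kb
    return {s % num_drives for s in range(first_stripe, last_stripe + 1)}

def can_run_in_parallel(req_a, req_b, stripe_size_kb, num_drives):
    """True if two (offset_kb, size_kb) reads use no drive in common."""
    a = disks_touched(*req_a, stripe_size_kb, num_drives)
    b = disks_touched(*req_b, stripe_size_kb, num_drives)
    return a.isdisjoint(b)

# Two 16k files on a two-drive array, stored at offsets 0k and 128k.
file_a = (0, 16)
file_b = (128, 16)

print(can_run_in_parallel(file_a, file_b, 128, 2))  # True
print(can_run_in_parallel(file_a, file_b, 8, 2))    # False
```

With 128k stripes each small file sits on its own drive, so both reads can proceed at once. With 8k stripes each file is spread over both drives, so the second read has to queue behind the first.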

So those of you thinking you’ll jump to the low stripe size of 8k or 16k – yes your throughput scores will skyrocket, but your random access scores will plummet. Use a large stripe size of 128k or 256k and you’ll ensure great seek performance scores, but your throughput won’t be taking full advantage of the striped nature of your array.

This is why a stripe size of 64k is often recommended as the best ‘balance’. And, in fact, it does perform very well for both types of workload. However there’s one more factor you need to consider – the number of drives in the array.

If, for example, you set up a four-drive RAID 0 array with a stripe size of 128k, a good chunk of the files on your system will never benefit from being striped, or will only be striped across two or three drives. If throughput is your priority, if you’re building for video streaming or benchmarking for example, then a good rule of thumb is that the more drives you use, the smaller you need your stripe size to be to ensure files are striped across the drives in the array.
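The rule of thumb can be sketched in Python as a search for the largest stripe size that still spreads a file of a given size over every drive. The helper name, the list of candidate stripe sizes, and the 256k "typical file" are all assumptions picked for illustration, and files are again assumed to start on a stripe boundary.

```python
import math

# Stripe sizes mentioned in the article, in kilobytes.
COMMON_STRIPE_SIZES_KB = (8, 16, 32, 64, 128, 256)

def largest_full_span_stripe(typical_file_kb, num_drives,
                             sizes=COMMON_STRIPE_SIZES_KB):
    """Largest stripe size at which a file of typical_file_kb is cut
    into at least num_drives chunks, or None if no size manages it."""
    candidates = [s for s in sizes
                  if math.ceil(typical_file_kb / s) >= num_drives]
    return max(candidates) if candidates else None

print(largest_full_span_stripe(256, 2))  # 128
print(largest_full_span_stripe(256, 4))  # 64
```

For the same 256k file, doubling the array from two drives to four halves the largest stripe size that still reaches every drive, which is the trend the rule of thumb describes.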

And yes, this means of course there is a ceiling to the effectiveness of drives in a RAID 0 array – you can’t keep throwing drives at it and expect to see a linear increase in performance. For this reason, RAID 0 arrays are optimal between two and four drives, but if you go for four drives you might as well use RAID 0+1 and gain both speed and redundancy.

In summary smaller stripe sizes of 32k, 16k, and even 8k can offer greater performance for sustained throughput, but this comes at the cost of seek performance (if this is important to you). Similarly 128k and 256k stripes allow for much greater parallelism of the drives, but throughput can suffer on files that could otherwise be striped.

As a result the accepted norm for a balanced value is 64k – and this is exactly what Windows will use if you create a ‘striped’ RAID volume. Unfortunately, Windows won’t let you change this so it’s the only option if you use software RAID.

So that’s plenty of theory; let’s see how it works in practice. To see just what impact stripe size, array size, seek vs throughput, and software RAID vs FakeRAID all have on performance, we did the only thing a sane Atomican can do – we benchmarked to buggery.

 
This article appeared in the May, 2008 issue of Atomic.

4 Comments
Fat_Bodybuilder
Sep 17, 2008 10:31 AM
Is parallelism a real word? :P

Nice work ;-)
osama_bin_athlon
Sep 17, 2008 8:15 PM
er, HDD's are cheaper than ever......under $80 for a 500G (Maxtor 500G $78 @ MSY, for instance), how's that expensive?
Goonit
Sep 20, 2008 9:28 AM
Wow, I've been under the impression it would make a huge difference to load times, doesn't seem the case at all.

One newer generation hard drive is better than 2 older in RAID. :)

Atomic always answers the questions we ponder,
Thanks for the article.

Fat_Bodybuilder
Sep 21, 2008 7:44 PM
Remember that this is covering RAID0, which is not really RAID at all.

And this is a very old article, HDDs were still a little expensive, then.