Author Topic: RAID Boxes  (Read 4106 times)

Offline per

  • Member
  • Posts: 114
Re: RAID Boxes
« Reply #20 on: August 04, 2009, 08:42:07 PM »
Hardware RAID cards are mostly a total waste if you use ZFS; ZFS does not really use them at all (the only gain is the write cache, but adding an SLC SSD for the intent log is better, really).

My home RAID can do 650 MB/second streaming read, and more than 100 MB/second doing 100% random access.
While seeding the 100 or so torrents I'm currently seeding, it's using less than 1% of its I/O capacity.
Considering that a Gbit network can only handle a little more than 100 MB/second, it's sort of good enough. :-)

It contains 15 drives in a 3x5 stripe/RAID-5 configuration (plus an SSD for the OS and cache), using two rather cheap 8-port PCI Express SATA controllers.

And I really think that ZFS is extremely easy to set up compared to any of the alternatives, though I have been a Unix system administrator since '92.

On a separate note, when you have more than 3 or so drives, you really need to have some kind of redundancy.

If a single drive has a mean time between failures of 5-10 years (which is more or less what I have noticed), you are likely to see about one failure per year with four drives.
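That estimate is easy to sanity-check. A rough back-of-the-envelope sketch (assuming independent drives and a constant failure rate, which real drives only approximate):

```python
# Rough expected-failure estimate for a small array.
# Assumes independent drive failures and a constant failure rate;
# real drives follow a bathtub curve, so treat this as a ballpark.

mtbf_years = 5   # per-drive mean time between failures (pessimistic end)
drives = 4

# Expected number of drive failures per year across the whole array
failures_per_year = drives / mtbf_years
print(failures_per_year)  # 0.8 -- close to one failure a year
```

With the optimistic 10-year MTBF the same arithmetic gives 0.4 failures per year, still enough to want redundancy.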

Offline K7IA

  • Member
  • Posts: 884
  • :)
Re: RAID Boxes
« Reply #21 on: August 04, 2009, 08:43:20 PM »
RAID, imo, still sounds like an acronym specific to corporate IT environments and benchmark enthusiasts. I always tried to stay away from it in my personal environment, just because I thought it made simple things more difficult.

It's a widely adopted technology in corporate environments and has numerous benefits. An array of small 15k RPM SCSI HDDs was considered a beast several years ago, but rich companies would eventually go RamSan anyway.

Things I don't like about RAID:
 - the need for consistent HDD sizes in the array when adding new drives
 - removing an HDD when it's needed elsewhere
 - recovery

HDD sizes are getting ridiculously bigger every day; terabyte disks are everywhere and the gigabyte is the new kilobyte.

My recommendation: always prefer accessing physical drives directly, don't merge them, and stay away from RAID logical drives. If you don't like drive letters, you can easily mount drives into NTFS folders instead.

By the way, I still don't understand why some clever guy hasn't come up with an OS-embedded tool that lets us apply secure data storage (through parity or something else) to a single folder or file with one click. I don't want everything to be duplicated or parity-checked. Why should I deal with the whole hard disk?

ps. I liked the design of the device though :)

Offline per

  • Member
  • Posts: 114
Re: RAID Boxes
« Reply #22 on: August 04, 2009, 08:50:36 PM »
Quote
RAID, imo, still sounds like an acronym specific to corporate IT environments and benchmark enthusiasts. I always tried to stay away from it in my personal environment, just because I thought it made simple things more difficult.

Well, it's supposed to become standard in Mac OS X rather soon (ZFS).

Quote
Things I don't like about RAID:
 - the need for consistent HDD sizes in the array when adding new drives

Not a problem with ZFS, really. Just start a new raid-stripe with the size of the new drives.

Quote
- removing an HDD when it's needed elsewhere

If it's _a_ single HDD, that's not a problem either, provided you have redundancy enabled. You will lose the redundancy, though (actually, with ZFS you only lose redundancy for the old files; new files will be written with redundancy across the new, lower number of drives).
 
Quote
- recovery

Hm? Like restoring from backup? Well, yes, that takes a while when you have 30+ TB of storage. Not that I have a backup... :-)

Quote
By the way, I still don't understand why some clever guy hasn't come up with an OS-embedded tool that lets us apply secure data storage (through parity or something else) to a single folder or file with one click. I don't want everything to be duplicated or parity-checked. Why should I deal with the whole hard disk?

Included in ZFS.. :-) (for folders, not files)

And the pure cool-factor of having an actual server is not to be ignored.

And the convenience of not having all the noisy drives in the room you are sitting in..

« Last Edit: August 04, 2009, 08:54:25 PM by per »

Offline K7IA

  • Member
  • Posts: 884
  • :)
Re: RAID Boxes
« Reply #23 on: August 04, 2009, 09:08:57 PM »
^ per, are you from the ZFS promotion group at Sun? :)

OK, ZFS is a very good FS, but if you are using Windows you need a dedicated device for it running Sun Solaris or another supported OS.

Maybe we can use it with those small-footprint HW virtual machines in the future; I doubt Microsoft will implement it :P

Offline per

  • Member
  • Posts: 114
Re: RAID Boxes
« Reply #24 on: August 04, 2009, 09:15:57 PM »
Quote
^ per, are you from the ZFS promotion group at Sun? :)

OK, ZFS is a very good FS, but if you are using Windows you need a dedicated device for it running Sun Solaris or another supported OS.

Maybe we can use it with those small-footprint HW virtual machines in the future; I doubt Microsoft will implement it :P

Yes, but the original subject was about a RAID box. :)

And, no, I'm not from Sun; if I were, I would not be using cheap generic PC hardware. ;)

Offline K7IA

  • Member
  • Posts: 884
  • :)
Re: RAID Boxes
« Reply #25 on: August 04, 2009, 09:48:05 PM »

Quote
Yes, but the original subject was about a RAID box. :)

And, no, I'm not from Sun; if I were, I would not be using cheap generic PC hardware. ;)

But it's not about building your own RAID box. People are commenting on the possible configurations of a RAID setup, and whether those options are available. Unless of course Tatsujin is a hardware guru who intends to build a box from scratch :)

Cheap generic PC hardware? The setup looks perfectly clean and nice to me; just put a Cray logo on the box :)



Offline kureshii

  • Former Staff
  • Member
  • Posts: 4485
  • May typeset edited light novels if asked nicely.
Re: RAID Boxes
« Reply #26 on: August 04, 2009, 11:30:33 PM »
Quote
I understand how parity works on RAID 2-5. But can anyone explain how the double parity in RAID 6 works? I've done some reading on Reed-Solomon codes, and while I can understand most of the math, I can't seem to put the pieces together.  ???
To the best of my knowledge, it is just taking RAID 5 one step further in terms of data redundancy.

While in RAID 5 you have one additional parity block being generated, in RAID 6 you have two. The parity blocks are distributed across all disks instead of being stored on a separate parity disk.

For each stripe of data you have 2 parity blocks, which gives you two "layers" of redundancy. In RAID 5, if you lose a disk, you still have an effectively "complete" copy of all your data (thanks to single-parity striping). In RAID 6, if you lose 2 disks, you still have an effectively "complete" copy of all your data (double-parity striping). In either case, once you lose one more disk, your data is no longer complete, and you will be unable to fully recover it.

Once you replace the dead disks, the missing blocks on those disks can be recalculated from the blocks on the other disks (belonging to the same data stripe). So you can treat RAID 6 as a more robust buffer against disk failure; it is usually used only in arrays of 6 or more disks, since in such arrays the chances of a second disk failing while you are still rebuilding after the first disk crash are higher (than in a 5-or-fewer-disk array). That second failure is, of course, bad news for a RAID 5 setup.

I don't know the fine details of the algorithm used, you'll have to google that :)
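For anyone who wants a feel for the math, here is a toy sketch of the usual P+Q construction: P is plain XOR (as in RAID 5), and Q is a Reed-Solomon-style syndrome over GF(2^8). The generator g = 2 and the reduction polynomial 0x11d are assumptions borrowed from common software implementations; a given controller may do things differently, and real arrays apply this per byte across whole blocks rather than to single bytes.

```python
# Toy RAID-6 style P+Q parity over GF(2^8), one byte per "disk".
# Assumptions: generator g = 2, reduction polynomial 0x11d
# (common in software implementations; controllers may differ).

POLY = 0x11d

def gf_mul(a, b):
    """Carry-less multiply of two bytes, reduced modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_inv(a):
    return gf_pow(a, 254)  # a^255 == 1 for a != 0, so a^254 == a^-1

def pq(data):
    """P = XOR of all data bytes, Q = XOR of g^i * data[i]."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q

def recover_two(data, i, j, p, q):
    """Solve for two erased data bytes i and j using P and Q."""
    px, qx = p, q
    for k, d in enumerate(data):
        if k not in (i, j):
            px ^= d                        # px = d_i ^ d_j
            qx ^= gf_mul(gf_pow(2, k), d)  # qx = g^i*d_i ^ g^j*d_j
    gi, gj = gf_pow(2, i), gf_pow(2, j)
    di = gf_mul(qx ^ gf_mul(gj, px), gf_inv(gi ^ gj))
    return di, px ^ di

data = [0x12, 0xa5, 0x3c, 0x77]   # four data "disks"
p, q = pq(data)
# Pretend disks 1 and 3 died; rebuild both bytes:
assert recover_two(data, 1, 3, p, q) == (0xa5, 0x77)
```

Losing one data disk plus Q degenerates to the RAID-5 case (plain XOR against P); losing one data disk plus P can be solved with Q alone.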

----------
By the way, one thing that doesn't seem to have been mentioned yet is what is commonly called "silent data corruption" (a related, parity-specific hazard is the RAID-5 "write hole"). A quick google search sufficiently illustrates the issue. Not to bring undue attention to it or to overstate its danger, since it occurs on individual-disk setups as well, but it should be pointed out that RAID 5 is a data redundancy solution, not a protection against data corruption.

Some might feel that "silent data corruption" is too strong/scary a term for it, since it's not like the RAID actively corrupts your data while pretending to be all fine and well; it just doesn't tell you if data is corrupted due to bad RAM or other causes (there's no way for it to know unless programmed to do so). This is probably of little concern to most readers, so just take what you want out of these two paragraphs.
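ZFS tackles exactly this by keeping a checksum with every block pointer and verifying it on read (it uses fletcher or SHA-256 checksums). The idea can be sketched in a few lines; this is only a toy illustration, not ZFS's actual on-disk format:

```python
# Toy end-to-end block checksum: detect silent corruption that
# a bare RAID layer would happily pass through.
import hashlib

def write_block(data: bytes) -> dict:
    """Store a block alongside a checksum of its contents."""
    return {"data": data, "checksum": hashlib.sha256(data).digest()}

def read_block(block: dict) -> bytes:
    """Verify the checksum on read; fail loudly if the data rotted."""
    if hashlib.sha256(block["data"]).digest() != block["checksum"]:
        raise IOError("silent corruption detected")
    return block["data"]

block = write_block(b"important bits")
assert read_block(block) == b"important bits"   # clean read passes

block["data"] = b"important bitz"               # flip a "bit" on disk
# read_block(block) would now raise instead of returning bad data
```

With redundancy underneath (mirror or RAID-Z), the filesystem can go one step further and repair the bad copy from a good one.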

----------
I have a home server build coming up in another month or so, already have a RAID-Z setup planned for it :) Just a pity that RAID-Z and RAID-Z2 pools cannot be dynamically expanded by adding new devices just yet...

For Tatsujin: If you have the cash to fork out for them, pre-assembled NASes are the fastest way to get started on a robust RAID setup. Manufacturers like Synology and QNAP (to name a couple) have such devices for the consumer market, although they definitely are nothing short of pricey.

Right now I'm on a Synology CS-407 (discontinued higher-end version of the CS407e), it's a dream to use. For the casual user who doesn't like taking time to set things up, everything is accessible through the main web-management interface, which is really slick and convenient. Time to having a 3-disk RAID-5 up and running was 10 min of putting disks in, a few clicks in a browser, about half a day for it to initialise (it's using a 500MHz ARM processor with 128MB RAM). Expanding that to a 4-disk array took another half a day.

Those who like using shells can even activate terminal services (although doing so voids your warranty, but Synology is cool about that; they even have a section on the forums for users to discuss software/hardware hacks) for SSH fun. It runs a slim BusyBox distro, and new ARM-compiled packages are available via ipkg.

Of course, without doubt, the better value-for-money proposition is always to re-purpose an old PC :)
« Last Edit: August 05, 2009, 02:40:24 AM by kureshii »

Offline bcr123

  • Member
  • Posts: 1171
  • Blah Blah Blah.. Woof.
    • Nothing Really
Re: RAID Boxes
« Reply #27 on: August 05, 2009, 01:21:01 AM »
Buy.com has the 4 drive (4x500GB) Buffalo linkstation on special right now for example:

buffalo-linkstation-quad-2tb

Online halfelite

  • Member
  • Posts: 1153
Re: RAID Boxes
« Reply #28 on: August 05, 2009, 03:13:07 AM »
Quote
Hardware RAID cards are mostly a total waste if you use ZFS; ZFS does not really use them at all (the only gain is the write cache, but adding an SLC SSD for the intent log is better, really).

My home RAID can do 650 MB/second streaming read, and more than 100 MB/second doing 100% random access.
While seeding the 100 or so torrents I'm currently seeding, it's using less than 1% of its I/O capacity.
Considering that a Gbit network can only handle a little more than 100 MB/second, it's sort of good enough. :-)

It contains 15 drives in a 3x5 stripe/RAID-5 configuration (plus an SSD for the OS and cache), using two rather cheap 8-port PCI Express SATA controllers.

And I really think that ZFS is extremely easy to set up compared to any of the alternatives, though I have been a Unix system administrator since '92.

On a separate note, when you have more than 3 or so drives, you really need to have some kind of redundancy.

If a single drive has a mean time between failures of 5-10 years (which is more or less what I have noticed), you are likely to see about one failure per year with four drives.

I think people with no experience will have a tough time with ZFS. My only problem with ZFS is that it's on OpenSolaris, lol; the hacked-in versions on other distros don't please me. In the next 3 years, though, file systems will take off and get very good, almost replacing hardware RAID cards. btrfs is one I have been watching.

Offline Talapus

  • Member
  • Posts: 358
Re: RAID Boxes
« Reply #29 on: August 05, 2009, 12:11:44 PM »
Quote
...on the math and links this as its source. It might be worth a read.

That's exactly what I needed. I understood how XOR parity worked, but I was confused as to how you could generate an independent parity bit that could combine with the XOR parity to reproduce a second lost data bit. That article lays it out much more nicely than what I was reading before. It's still not intuitive in my mind, but the math works out.

Thanks  :D

Offline houkouonchi

  • Member
  • Posts: 249
    • http://xevious.homeip.net
Re: RAID Boxes
« Reply #30 on: August 14, 2009, 09:26:37 PM »
Quote
Hardware RAID cards are mostly a total waste if you use ZFS; ZFS does not really use them at all (the only gain is the write cache, but adding an SLC SSD for the intent log is better, really).

My home RAID can do 650 MB/second streaming read, and more than 100 MB/second doing 100% random access.
While seeding the 100 or so torrents I'm currently seeding, it's using less than 1% of its I/O capacity.
Considering that a Gbit network can only handle a little more than 100 MB/second, it's sort of good enough. :-)

It contains 15 drives in a 3x5 stripe/RAID-5 configuration (plus an SSD for the OS and cache), using two rather cheap 8-port PCI Express SATA controllers.

And I really think that ZFS is extremely easy to set up compared to any of the alternatives, though I have been a Unix system administrator since '92.

On a separate note, when you have more than 3 or so drives, you really need to have some kind of redundancy.

If a single drive has a mean time between failures of 5-10 years (which is more or less what I have noticed), you are likely to see about one failure per year with four drives.

100% random access writing and reading at what chunk size? There is no way you can get 650 megabytes/sec with the randomness/small chunks of torrents.

Also, what are you using to measure I/O load? iowait is broken on OpenSolaris. ZFS is all fine and dandy used as a NAS/file server, but it doesn't really work for local storage in my case, as I can't run everything I want under Solaris/OpenSolaris, so ZFS isn't really an option for me. In that case a hardware RAID controller works great, and it gives me around 830 megabytes/sec reads and 750 megabytes/sec writes (in RAID 6), which also happens to be faster than I have ever seen on any ZFS box (especially in writes).

Offline per

  • Member
  • Posts: 114
Re: RAID Boxes
« Reply #31 on: August 16, 2009, 04:49:59 PM »
 
Quote
Also what are using to measure io load? iowait is broken on opensolaris. ZFS is all fine and dandy using as a NAS/file-server but it doesn't really work for local storage as I can't run everything I want to using solaris/open solaris so ZFS isn't really an option for me. In that case a hardware raid controller works great and it gives me around 830 megabytes/sec reads and 750 megabytes/sec writes (in raid6) which also happens to be faster than I have ever seen on any ZFS box (especially in writes).
Bonnie++ on a 100 GB ZFS filesystem, 4 KB blocksize.

And a 120 GB SSD as cache.

Offline boxer4

  • Member
  • Posts: 280
  • Yes, EJ205
Re: RAID Boxes
« Reply #32 on: August 23, 2009, 12:09:34 AM »
I think I posted this before somewhere but can't find it.

To think about parity and the XOR function, here is a very simple, 1-bit case: think of a two-switch, 3-way light system, like a stairway light with a switch at the bottom and one at the top.  If you flip either switch, the light turns off if it was on, and on if it was off.  Think of the light bulb as the 'parity'.

Now think of a disk failure as covering up one switch or the light bulb (NOT as the light bulb burning out).  (Depending on how your house was wired or the switches were installed, the details can vary...) Because each switch flips the state of the light bulb, if you know how the circuit is assembled, then knowing the position of one of the two switches (up or down) and whether the light is on or off lets you deduce the position of the other switch -- unambiguously.  Same for the light bulb: if you know the positions of both switches, you can tell whether the light should be on or off.

for this example house, say

Switch 1   Switch 2   Lamp
down       down       off
down       up         on
up         down       on
up         up         off

There are no other possible 'legal' states this system can be in... e.g. down-down-ON is impossible (the light spontaneously turned on for no reason???).

Because you can now work out whether the unknown switch is up or down, or whether the unknown light was on or off, you can replace that one bit of missing information, and that's basically how data is recovered from RAID.

If you think of your disks as billions of little switches and light bulbs, 8 per byte, you can see how you can recover the data if you lose a disk.
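The switch analogy maps straight onto code. A minimal sketch of plain XOR parity as used in RAID 5 (the block contents here are made up for illustration):

```python
# XOR parity, as in RAID 5: the parity block is the XOR of all
# data blocks, so any one missing block is the XOR of the rest.

def xor_blocks(blocks):
    """Byte-wise XOR of equal-sized blocks."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data = [b"disk", b"one!", b"two."]   # three equal-sized data blocks
parity = xor_blocks(data)

# "Lose" disk 1; rebuild it from the survivors plus the parity block
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == b"one!"
```

Two simultaneously missing blocks need a second, independent equation, which is exactly what the Q parity in RAID 6 adds.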

I run software RAID 5 (Linux MD) on my home machine with 4 disks.  Still trying to switch over from my 4x120GB disk array to my 4x500GB hot-swappable array...  (That day will come when WoW is stable on Linux, I suppose...)