I have to make a few decisions here regarding a new backup array that I would like to place on my Arch box.
I have a 2TB drive, which is not enough space, so I need to span (JBOD) either 2x750GB or 2x500GB drives with the 2TB, using either the mobo settings on a Gigabyte GA-M61PM-S2 mobo http://www.gigabyte.com/products/produc … id=2373#sp or directly from the shell. RAID 5 would be nice, but since the two smaller drives are less than half of the largest, this is currently out of the question.
Data integrity is of number-one importance.
Speed of file access is not a concern, since only a backup process will be using this drive. I have been looking into both ZFS (FUSE) and BTRFS as choices. I have a few questions about this so I can set up my Arch box as a formidable backup box.
Is it even smart to use these filesystems on RAID?
I understand that to fully utilize ZFS it needs to be isolated from hardware RAID channels to preserve its integrity guarantees; is the same true for software RAID?
Is BTRFS mature enough for this use?
Last edited by wolfdogg (2012-06-24 20:24:07)
I am looking into a similar problem right now.
My server with 4x2TB drives needs to be secured against silent corruption.
The current setup is RAID 5 + LVM + ext4 (i.e. no data checksums).
My main alternatives are ZFS (fuse) and BTRFS.
I have tested disk failures on a system where I built RAID 10/RAID-Z arrays, filled them with data, and crashed one disk (mkreiserfs on one of the partitions).
Both work pretty nicely, but I have found a serious problem with BTRFS: if the filesystem is pretty full, it is unable to replace a broken disk in the array, and I cannot release the broken disk. A full BTRFS seems to be quite problematic, and I have found no documented limit. I do realize that the balance command will need to rewrite every file on the system, so there may need to be enough free space for the biggest file there is (some VirtualBox drives in my case -> 10 GB).
The btrfs test was:
* Create raid10 system on /dev/sda5 /dev/sda6 /dev/sda7 /dev/sda8
* Fill with data.
* Remove some data to make a small space
* Destroy sda8 (mkreiserfs /dev/sda8)
* Mount degraded
* Add sda9 to the array (you must add a new disk before the old one is removed).
Then things failed:
* Cannot remove sda8 (fails with an unhelpful error)
* Cannot balance; disk is full.
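The BTRFS test steps above might look roughly like this as shell commands. The original commands were not posted, so the exact flags, the mount point, and the fill step are assumptions; the device names are the ones from the list:

```shell
# Build a btrfs raid10 array across the four partitions
mkfs.btrfs -m raid10 -d raid10 /dev/sda5 /dev/sda6 /dev/sda7 /dev/sda8
mount /dev/sda5 /mnt/test

# ... fill with data, then delete some files to leave a little free space ...

# Simulate a disk failure by clobbering one member
umount /mnt/test
mkreiserfs /dev/sda8

# Remount degraded and try to swap in a replacement
mount -o degraded /dev/sda5 /mnt/test
btrfs device add /dev/sda9 /mnt/test
btrfs device delete /dev/sda8 /mnt/test   # this is the step that failed
btrfs filesystem balance /mnt/test        # and this one: disk is full
```

These commands are destructive, so only run them against scratch partitions as in the test described here.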
The ZFS test was:
* Create raidz system on /dev/sda5 /dev/sda6 /dev/sda7 /dev/sda8
* Fill with data.
* Remove some data to make a small space (2.6GB)
* Destroy sda7 (mkreiserfs /dev/sda7)
* export, import and scrub to detect the problem
* Replace sda7 with sda9
* Wait for resilvering to finish; it takes a reasonable amount of time, and repopulating a disk is expected to take a while.
Nothing went wrong. I checked 200 files in an audiobook with md5sum; no errors found.
I expect my server to be filled up right in time for the first disk to crash, and I like the extra space gained using RAID-Z compared to RAID 10. I will probably go for ZFS-FUSE on my server.
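The ZFS test steps above translate into something like the following. The pool name and the exact sequence are assumptions (the post lists the steps but not the commands); the device names are the ones from the list:

```shell
# Build a raidz (single-parity) pool across the four partitions
zpool create tank raidz /dev/sda5 /dev/sda6 /dev/sda7 /dev/sda8

# ... fill with data, then delete ~2.6 GB to leave some free space ...

# Simulate a disk failure by clobbering one member
mkreiserfs /dev/sda7

# Export/import forces ZFS to reopen the devices; scrub detects the damage
zpool export tank
zpool import tank
zpool scrub tank
zpool status tank          # should show sda7 as faulted

# Swap in the spare partition and let it resilver
zpool replace tank /dev/sda7 /dev/sda9
zpool status tank          # watch resilver progress
```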
BR
Erik
That info is very helpful. So you were able to reproduce the problem of not being able to remove disks from the array using BTRFS? Or was it a one-time event? It sounds like you have tried it from many angles. If my interpretation of what you did is correct, the main test was to destroy the partition table on one drive, then rebuild from the stripe, correct? RAID-Z sounds like the way I want to go: RAID 5-like, but omitting the write-hole possibility.
So RAID-Z and ZFS-FUSE sound like a really good solution then. I just read this article http://superuser.com/questions/9991/is- … or-example and it seems like we're not the only ones talking about it. An even better article is this one: https://blogs.oracle.com/bonwick/entry/raid_z .
Edit: I found another good article on this: http://forums.freenas.org/archive/index.php/t-221.html
Last edited by wolfdogg (2012-06-24 20:14:32)
OK, I was able to create the ZFS pool on one drive; I am trying to figure out how to set this up, so I created a discussion here: https://wiki.archlinux.org/index.php/Us … hard_drive
Now that I have one drive set up with ZFS, I'm thinking I need to look into RAID-Z. I can't really find enough documentation on how to set it up in Arch, but I do see enough docs to make it look like something I would definitely want. I want to go with RAID-Z1, as opposed to RAID-Z2, because it's just a backup drive, as opposed to the only copy of something.
So can someone steer me in the right direction? I believe what I need to do is set up the array first, then create the zpool, but I can also see an instance where it might be possible to create the filesystem as I had, then use LVM to span the next drive into the array; is this possible? Actually, I'm not sure what to expect if I create a filesystem on the drive, because when I zpooled it, there was instantly an active filesystem on it. Does this mean that the drive was previously formatted with a different filesystem (ext4, I suspect), or how does this work right away after zpooling if not? I'm not sure if I need to look at LVM next, or the RAID array.
EDIT:
After reading this http://www.unixconsult.org/zfs_vs_lvm.html I notice a couple of things that might be a problem. Is LVM not possible on ZFS? And the worse of the two: my drives are one 2TB and two 500GBs, so I don't think I can use RAID-Z now, or that will get me only 1TB of data. I believe I need to span them JBOD-style, or equivalent, because I need more than 2TB of space. I already have them JBOD-spanned in my NVIDIA motherboard settings, but Linux doesn't recognize this. I'm assuming I need to disregard the mobo array settings and delete the array there.
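The capacity worry can be checked with a little arithmetic: RAID-Z1, like RAID 5, truncates every member to the size of the smallest disk and spends one disk's worth of space on parity, while a plain concatenated/striped pool yields the full sum. A quick sketch with the drive sizes from this thread:

```shell
# Disk sizes in GB: one 2 TB drive and two 500 GB drives
big=2000
small=500
n=3

# raidz1: every member is truncated to the smallest disk,
# and one disk's worth of space goes to parity
raidz1_usable=$(( (n - 1) * small ))
echo "raidz1 usable: ${raidz1_usable} GB"      # 1000 GB

# plain striped/concatenated pool (no redundancy): sum of all disks
stripe_usable=$(( big + small + small ))
echo "stripe usable: ${stripe_usable} GB"      # 3000 GB
```

So with these drives, RAID-Z1 would indeed leave only about 1TB usable, which is why spanning looks necessary here.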
Last edited by wolfdogg (2012-06-24 20:20:40)
Well, I have spent most of the day on it. On a side note, I did end up deleting the array from the mobo settings, since it's redundant.
I zeroed all partition tables on the 3 drives and then set them to GUID (GPT) using gdisk.
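A sketch of that wipe-and-relabel step. The post doesn't show the exact commands, so this uses sgdisk (gdisk's scriptable companion) and placeholder device names:

```shell
# Destroy existing MBR/GPT structures and write a fresh empty GPT
# on each member disk (device names are placeholders)
for dev in /dev/sdb /dev/sdc /dev/sdd; do
    sgdisk --zap-all "$dev"   # wipe all partition table structures
    sgdisk --clear "$dev"     # write a new, empty GPT label
done
```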
I was able to set up the raidz using zpool, and the datasets using zfs, but I'm a bit confused about drive sizes and the actual formatting of the drives. I'm not sure whether the drives are automatically formatted to leave the maximum size available unless otherwise specified, or whether it's just a partition table write to each, because everything happens relatively fast. Do I have to format these things at any point before or after creating a zpool? I have tried to create a volume, but it didn't seem to work.
# zpool status
pool: pool
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Sun Jun 24 16:11:48 2012
config:
NAME STATE READ WRITE CKSUM
pool ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
disk/by-id/ata-ST2000DM001-9YN164_W1E07E0G ONLINE 0 0 0
disk/by-id/ata-WDC_WD5000AADS-00S9B0_WD-WCAV93947658 ONLINE 0 0 0
disk/by-id/ata-WDC_WD5000AADS-00S9B0_WD-WCAV93917591 ONLINE 0 0 0 4K resilvered
errors: No known data errors
# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
pool 1.36T 289K 1.36T 0% 1.00x ONLINE -
# zpool get all pool
NAME PROPERTY VALUE SOURCE
pool size 1.36T -
pool capacity 0% -
pool altroot - default
pool health ONLINE -
pool guid 3763231657625009273 default
pool version 23 default
pool bootfs - default
pool delegation on default
pool autoreplace off default
pool cachefile - default
pool failmode panic local
pool listsnapshots off default
pool autoexpand on local
pool dedupditto 0 default
pool dedupratio 1.00x -
pool free 1.36T -
pool allocated 289K -
Obviously the ST2000DM001 is the 2TB drive. I did enable autoexpand to see if that would help with my problem, and I did enable failmode=panic for testing purposes.
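For reference, the two property tweaks mentioned above (visible as "local" in the `zpool get all` output) would have been set like this; the pool name is the one from the output:

```shell
# Grow the pool automatically if a member device gets bigger
zpool set autoexpand=on pool

# Panic the kernel on catastrophic pool I/O failure (useful for testing,
# rarely what you want in production; the default is "wait")
zpool set failmode=panic pool

# Verify both properties
zpool get autoexpand,failmode pool
```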
[solaris zfs reference http://docs.huihoo.com/opensolaris/sola … uide/html/ ]
Last edited by wolfdogg (2012-06-25 23:14:02)
OK, I figured it out. Basically, I had to create the pool with no raid type. Then, when I created the dataset and checked the status, it showed the full size, or at least over 2.36TB. I can't figure out why it's not larger, since it should be the full 3TB. Is 2.36TB the proper size for one 2TB and two 500GB drives? Once I shared the pool's dataset using Samba, I was able to use the array across the network.
Details on how I set this up are in 'my talk' on this website for my user. I will create a wiki page sooner or later, once I make sure things are working to par.
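The working setup described here might be reconstructed roughly as follows. The dataset name is an assumption, the disk IDs are the ones from the earlier `zpool status` output, and whether `sharesmb` actually works under zfs-fuse (as opposed to pointing an smb.conf share at the mountpoint) isn't stated in the post:

```shell
# Pool with no raid type: the three disks are simply concatenated/striped,
# so the full raw capacity is available (minus metadata overhead),
# but there is NO redundancy -- losing any one disk loses the pool
zpool create pool \
    /dev/disk/by-id/ata-ST2000DM001-9YN164_W1E07E0G \
    /dev/disk/by-id/ata-WDC_WD5000AADS-00S9B0_WD-WCAV93947658 \
    /dev/disk/by-id/ata-WDC_WD5000AADS-00S9B0_WD-WCAV93917591

# Dataset to hold the backups (name is hypothetical)
zfs create pool/backup

# Share it over the network -- either via the ZFS property,
# or by exporting /pool/backup in smb.conf
zfs set sharesmb=on pool/backup
```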