
#26 2015-04-19 19:15:15

kerberizer
Member
From: Sofia, BG
Registered: 2014-02-01
Posts: 25

Re: [solved] ZFS on Linux does not detect pools on LVM

Lockheed wrote:

Thanks for this exhaustive clarification.

You're most welcome, but keep in mind that I'm just an ordinary system administrator, so take my words with a grain of salt.

What is your opinion on a RAID1 mirror made on one physical disk with two partitions? Does it make sense?
My limited understanding is that it keeps 100% read speed, loses 50% write speed, and in exchange protects against bit rot. But the last time I discussed it, the idea was universally criticised, while that supposed benefit was neither addressed nor refuted.

Well, I guess you could say this is a poor man's solution to bit rot, indeed. But it's just that: a poor man's solution that's most likely much more trouble than it's worth...
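For reference, the setup you're describing would be created roughly like this (the pool name and partition paths are made up, and both partitions sit on the same physical disk, of course):

    # a "mirror" of two partitions on the SAME physical disk --
    # it can heal bit rot on scrub, but a dead drive takes both halves
    zpool create tank mirror /dev/sda1 /dev/sda2

    # verify the layout
    zpool status tank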

On the other hand, if bit rot is not a real-life problem, then there is no point to it on today's drives, which already contain protection against such issues.

...Well, my impression is that you're overestimating the importance of bit rot, and underestimating the importance of drive failures. While I don't have any statistical information at hand, common sense tells me that you're much more likely to suffer catastrophic damage to your data due to drive problems (not just complete drive failure, but also head crashes that produce massive areas of damaged magnetic surface, etc).

If you still insist on using only one disk (sometimes, that's simply not a matter of choice after all), I strongly suggest that you use a single partition (or the whole disk) dedicated to ZFS and set the 'copies' property to '2' or even '3' for those datasets that are important to you.

But perhaps more important is another thing: what's the purpose of that data? If you consider it an online backup, then I'd say that it's best to simply have an offline backup as well, preferably more than one, in different physical locations and perhaps even on different types of media (some small but especially crucial things I even keep printed in hex on paper -- laugh at me, if you'd like). While you can use the built-in data integrity facilities of ZFS and Btrfs, you could achieve basically the same results with the various available tools that do file checksumming. In this case, you don't really need redundancy, although it won't hurt either (except your pocket, to an extent, of course). If, on the other hand, you will be serving the data, then redundancy obviously is an advantage, because it can protect you from downtime. But still, having the same type of backup as above is the best thing to do.
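Just to illustrate the file-checksumming approach I mentioned: plain sha256sum is enough for the job (the paths here are hypothetical):

    # build a manifest of checksums for everything under /data
    find /data -type f -exec sha256sum {} + > /root/data.sha256

    # later, verify the files against the manifest;
    # only mismatches and read errors are printed
    sha256sum --quiet -c /root/data.sha256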

I suppose this answer is rather vague and trivial common sense, but it's hard to be more specific with the information at hand.


“Don't climb the mountain to conquer it. Climb it to conquer yourself.”


#27 2015-04-19 20:10:47

Lockheed
Member
Registered: 2010-03-16
Posts: 1,521

Re: [solved] ZFS on Linux does not detect pools on LVM

kerberizer wrote:

...Well, my impression is that you're overestimating the importance of bit rot, and underestimating the importance of drive failures.

Well, I was using BTRFS as my root filesystem, where the filesystem itself seemed to corrupt data. Hence putting it on RAID1 and scrubbing regularly seemed like a good idea. I still do that on some systems and have had no problems since. Hence I thought it might be a good idea with ZFS, too.

you're much more likely to suffer catastrophic damage to your data due to drive problems (not just complete drive failure, but also head crashes that produce massive areas of damaged magnetic surface, etc).

Right. But this is precisely what RAID1 could help with. The disk is a 2 TiB HDD with 2 or 3 platters. If platter 1 is damaged, the copy of the data is still on another platter.

If you still insist on using only one disk (sometimes, that's simply not a matter of choice after all), I strongly suggest that you use a single partition (or the whole disk) dedicated to ZFS and set the 'copies' property to '2' or even '3' for those datasets that are important to you.

This is exactly the problem. My server is a Zotac AQ01 with only one SATA port.

This is a very interesting thing you mentioned here, but I'm not sure I understood it.
Does it mean ZFS can work as if in RAID1 for only selected datasets (subvolumes? folders?), while keeping the rest of the partition as a single copy? And then, during a scrub, if data is corrupted elsewhere, ZFS will inform me, but if it is corrupted in those selected datasets, it will seamlessly recover it from its "RAID1" copy while serving me the correct data?

Other than that, I will have the most important 500 GiB of this drive backed up to the Copy service. So that will help with off-site backup. But in general, this 2 TiB disk is to serve as off-site backup for my other machines.


#28 2015-04-20 00:54:28

kerberizer
Member
From: Sofia, BG
Registered: 2014-02-01
Posts: 25

Re: [solved] ZFS on Linux does not detect pools on LVM

Lockheed wrote:

Well, I was using BTRFS as my root filesystem, where the filesystem itself seemed to corrupt data. Hence putting it on RAID1 and scrubbing regularly seemed like a good idea. I still do that on some systems and have had no problems since. Hence I thought it might be a good idea with ZFS, too. [...] Right. But this is precisely what RAID1 could help with. The disk is a 2 TiB HDD with 2 or 3 platters. If platter 1 is damaged, the copy of the data is still on another platter.

I'm afraid you might be oversimplifying the problem. Bad things can happen to data in many different ways and in many different places. We could talk about the need for ECC memory, because bit rot might actually be more common or more dangerous there, we could go on about the need for checksumming the data that goes through different buses and so on, and so on. If you are serious about it, there's plenty of information on the Internet, but I'm afraid it's exactly the sheer volume of it that's a problem on its own. And we certainly cannot hope to cover it in a forum thread -- not to mention that I'm certainly not the best person to talk about such matters.

Just a brief example: long gone are the days of the ST-506 interface, when you could be pretty sure you knew the correct physical geometry of the hard drive. Nowadays, guessing that one of your partitions sits on one platter and the other on another is just a shot in the dark. While there is certainly some logic behind the layout, you can never know it for sure (except, perhaps, if you disassemble the drive and run some very complex tests on it), and that kinda defeats the whole idea of "platter" pseudo-redundancy. And I'm not even sure why you think the probability of damage entirely confined to one platter is that high. In the head crash example, it's not just the direct physical damage to the magnetic surface that matters: more problems can actually arise from the debris that gets thrown around, and the last thing you want inside a hard drive is such particles flying about. Again, you'd better talk to experts in the area if you want to know more about how hard drives can and are likely to fail.

This is exactly the problem. My server is a Zotac AQ01 with only one SATA port.

This is a very interesting thing you mentioned here, but I'm not sure I understood it.
Does it mean ZFS can work as if in RAID1 for only selected datasets (subvolumes? folders?), while keeping the rest of the partition as a single copy? And then, during a scrub, if data is corrupted elsewhere, ZFS will inform me, but if it is corrupted in those selected datasets, it will seamlessly recover it from its "RAID1" copy while serving me the correct data?

A dataset in ZFS is more or less what Btrfs calls a subvolume; typically, you create a dataset as part of the filesystem hierarchy to hold a certain type of data: there are many examples around.
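For instance, a minimal hierarchy might look like this (the pool and dataset names are just examples):

    # datasets nest like directories and inherit properties from their parent
    zfs create tank/backup
    zfs create tank/backup/laptop
    zfs create tank/media

    # list the hierarchy
    zfs list -r tank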

What I would do in your case is give the whole disk to ZFS and, on the datasets that will keep the important data, set the 'copies' property to either '2' or '3' (I'm almost sure you have to do this before you actually copy the data there). In the event that some part of the disk holding this data turns out to be defective, ZFS will notice the corruption and supply the data from the other copies -- provided, of course, that they haven't been damaged either (it's still one drive). So, yes, that would be your 2-way or even 3-way mirror, broadly speaking. A better analogy might be manually creating second, third, etc. copies of your data on the same disk; ZFS just does it in a transparent way, with the added benefit of automagically knowing when data becomes corrupt.

Here's what man zfs says about it:

copies=1 | 2 | 3

           Controls the number of copies of data stored for this dataset. These copies are in addition to any redundancy provided by the pool, for example, mirroring or RAID-Z. The copies are stored on different disks, if possible. The space used by multiple copies is charged to the associated file and dataset, changing the used property and counting against quotas and reservations.

           Changing this property only affects newly-written data. Therefore, set this property at file system creation time by using the -o copies=N option.

So, yes, the last paragraph confirms what I just wrote about having to set the property before writing the data.
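Putting it all together, it would look roughly like this (the device path and names are, of course, hypothetical -- triple-check the device before handing it to zpool create):

    # give the whole disk to ZFS
    zpool create tank /dev/disk/by-id/ata-EXAMPLE-SERIAL

    # the important data gets two copies; set at creation time,
    # since 'copies' only affects newly-written data
    zfs create -o copies=2 tank/important

    # a scrub verifies all data; in tank/important it can also
    # repair from the extra copy, elsewhere it can only report
    zpool scrub tank
    zpool status -v tank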

Other than that, I will have the most important 500 GiB of this drive backed up to the Copy service. So that will help with off-site backup. But in general, this 2 TiB disk is to serve as off-site backup for my other machines.

Seems reasonable. The only thing I could possibly add is that there is no such thing as over-backing-up. ;-)


“Don't climb the mountain to conquer it. Climb it to conquer yourself.”


#29 2015-04-20 05:36:14

Lockheed
Member
Registered: 2010-03-16
Posts: 1,521

Re: [solved] ZFS on Linux does not detect pools on LVM

Again, many thanks for that explanation. I am now better equipped to make an informed decision.

One more question: as I understand it, ZFS is also a volume manager. If I give the whole disk to ZFS and create a pool on it, can I later carve out a part of that disk as a volume (visible under /dev/) formatted with another filesystem?
And are the overheads in such a case greater than when doing it with LVM?


#30 2015-04-20 19:47:19

kerberizer
Member
From: Sofia, BG
Registered: 2014-02-01
Posts: 25

Re: [solved] ZFS on Linux does not detect pools on LVM

Lockheed wrote:

One more question: as I understand it, ZFS is also a volume manager. If I give the whole disk to ZFS and create a pool on it, can I later carve out a part of that disk as a volume (visible under /dev/) formatted with another filesystem?

Yes, that would be a ZVOL.
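A rough sketch, with made-up names:

    # carve out a 100 GiB block device from the pool
    zfs create -V 100G tank/vol0

    # it appears under /dev/zvol/; format and mount as usual
    mkfs.ext4 /dev/zvol/tank/vol0
    mount /dev/zvol/tank/vol0 /mnt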

And are the overheads in such a case greater than when doing it with LVM?

I don't have any first-hand experience, but I wouldn't be surprised if they are, and not just marginally. Don't forget, however, that you get the benefits of data integrity, transparent compression, sparse reservation and, depending on the setup, redundancy. ZFS snapshots might also be more convenient, but I've never done snapshots on LVM, so I don't know.
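By the way, the sparse reservation I mentioned works like this (again, the names are made up):

    # a sparse ("thin") ZVOL: the 100 GiB are not reserved upfront,
    # space is taken from the pool only as data is actually written
    zfs create -s -V 100G tank/vol1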


“Don't climb the mountain to conquer it. Climb it to conquer yourself.”

