In the past, I experienced problems when copying ISO images of Linux live systems to USB thumb drives using `dd`: the live systems reported checksum errors or did not boot at all. As it turned out, the source ISO images were fine, but the copies on the USB thumb drives were broken. Writing the whole image again with `dd` did not help: now other spots of the image on the drive were broken, while the previous ones were repaired. Buying new USB thumb drives did not help either. That is why I started developing `ddpolymerase`, a tool which rewrites only those blocks of a destination file (or block device) that do not match the corresponding blocks of a source file (or block device), proofreads the result, and re-repairs where necessary.
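To illustrate the idea, here is a much simplified shell sketch, not the actual implementation (device name and block size are placeholders; the real tool additionally handles caching and repeats the proofreading):

```
#!/bin/bash
# Compare source and destination block by block and rewrite only
# the blocks that differ, instead of rewriting the whole image.
src=image.iso dst=/dev/sdX bs=$((1024 * 1024))
blocks=$(( ( $(stat -c %s "$src") + bs - 1 ) / bs ))
for (( i = 0; i < blocks; i++ )); do
    if ! cmp -s <(dd if="$src" bs="$bs" skip="$i" count=1 2>/dev/null) \
                <(dd if="$dst" bs="$bs" skip="$i" count=1 2>/dev/null)
    then
        # rewrite only this block, in place
        dd if="$src" of="$dst" bs="$bs" skip="$i" seek="$i" count=1 \
           conv=notrunc,fsync 2>/dev/null
    fi
done
```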
EDIT: You can also use `ddpolymerase` to copy and verify files and block devices. See the manpage for details.
Last edited by tokidev (2022-02-08 21:51:19)
Offline
I released a new version (0.1.1) containing several bug fixes.
In my previous post, I wrote that I "started" developing `ddpolymerase`. Actually, I consider it full-featured. I currently do not plan to add more features, but I am open to contributions, suggestions and bug reports.
Offline
While ddpolymerase seems a useful tool in general, and kudos to you for developing it(!), check whether the problem in this particular scenario is not elsewhere. You may save yourself a lot of pain.
Does the damage occur only if you unplug the USB stick from the port? Unplugging should be preceded by the eject command to ensure proper flushing and powering down of the device. Otherwise some data may remain unwritten or get corrupted. Messages in the kernel journal (or dmesg) will indicate when the USB stick is powered down: its capacity changes to 0.
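For example (sdX stands for the actual device):

```
sync                  # flush anything still buffered
eject /dev/sdX        # properly stop and power down the stick
journalctl -k | tail  # the log should show its capacity dropping to 0
```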
I wonder if I will live to see the times when people stop associating dd with disk operations and often breaking things due to mistyped or misunderstood options. Does the damage persist if you use tools suitable for writing images, like cp? Or pv or cat with redirection? If not… well, you know the culprit. While dd may be used for that purpose, the risk of misuse is high, the reasons to call that command are scarce, and anyone using it looks like someone trying to drive a nail with a screwdriver.
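For the record, the sane ways of writing an image (device name illustrative, root required):

```
cp image.iso /dev/sdX && sync
pv image.iso > /dev/sdX && sync   # same, with a progress bar
cat image.iso > /dev/sdX && sync
```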
If neither of the above explains the damage, the USB stick itself seems to be broken, either from wear or because it reports fake capacity. In either case it should no longer be trusted. I do understand some people have no money for a new one, and that in such a case reliability may not be an issue, but in general it’s a bad idea to continue using it. If that is the situation here, ddpolymerase may be doing nothing more than shifting the damage around until the error is not noticed while you boot the ISO.
Also, good use of the “nocache” iflag there. Something many people miss in such scenarios.
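For those unfamiliar with it: `nocache` asks the kernel to drop its cached pages for a file, so a later verification read actually hits the medium instead of the page cache. A minimal example (image and device names illustrative):

```
# advise the kernel to drop the cache for the device…
sudo dd if=/dev/sdX iflag=nocache count=0 status=none
# …then compare only as many bytes as the image is long
sudo cmp -n "$(stat -c %s image.iso)" image.iso /dev/sdX
```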
Last edited by mpan (2022-02-08 23:59:23)
Sometimes I seem a bit harsh — don’t get offended too easily!
Offline
@mpan:
Thank you for the feedback.
Testing via cat, pv and cp is a good idea. To be done.
In the past, I experienced shifting damage: some "bad blocks" (in ddpolymerase parlance) appeared in the first pass and shifted slightly around during the following passes. And they never vanished. That was a particularly bad USB stick.
But in most cases a single run with ddpolymerase --copy-first succeeds (although it sometimes needs a second pass to repair some blocks). So, for the purpose of copying ISO images to USB sticks, those sticks work quite fine, and there is no need to replace them in most cases. But I can try to detect other reasons for the writing errors.
Using eject is good advice. I have never used it on USB sticks before. But not using it does not seem to be the (main) source of the writing errors.
Offline
I often need to prepare USB drives with ISO files and then use them on machines that are difficult to get physical access to in the first place, sometimes only for the live system to tell me that it must be broken because of a checksum mismatch. Even writing twice in a row using plain dd does not always work.
So I looked around and found tips like using cmp, but basically none of those tips bother bypassing caches, so I figured they could not be trusted. And even if I were to find a difference, what was I supposed to do other than rewrite another whole image to the drive, just to get more errors and wear the drive out even faster? Not having a tool that performs this kind of verification sucks big time, and I am certainly not the first one to ask this kind of question.
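For reference, a cache-honest manual check would have to look something like this, dropping the caches before the read-back (names illustrative):

```
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches   # make sure reads hit the medium
sudo cmp -n "$(stat -c %s image.iso)" image.iso /dev/sdX
```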
Enter ddpolymerase. From what I understand, it connects multiple dd instances to ensure that the destination file actually is a copy of the source file. That might sound trivial but my experience so far tells me that it's not. And I am not the first one here, either.
Anyway, my few tests with ddpolymerase tell me that it works perfectly in my case. It even comes with a nice progress bar and defaults better suited to today's drives, like a larger block size, so there is no need to figure out a good one myself. I simply plug in the drive, pick the example in the man page that suits my case, and ddpolymerase does just the Right Thing™. Nice!
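For comparison, plain dd needs all of that spelled out by hand, and its default block size of 512 bytes is painfully slow on flash media (values illustrative):

```
sudo dd if=image.iso of=/dev/sdX bs=4M conv=fsync status=progress
```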
Does the damage persist if you use tools suitable for writing images, like cp? Or pv or cat with redirection? If not… well, you know the culprit.
In the end, it does not matter how the wrong bits got onto the drive, and tools like cp or cat won't proofread the result either.
If neither of the above explains the damage, the USB stick itself seems to be broken, either from wear or because it reports fake capacity. In either case it should no longer be trusted.
Even if ddpolymerase finds bad blocks, and even if they really are caused by a bad medium, at least you will know right away, instead of continuing to trust the device just because the live system you boot afterwards happens not to check the bad sectors on startup.
By the way, I think I found the reason why some of my USB drives ended up not passing checksum tests, purely by accident. I run badblocks on new drives to make sure they are fine. It so happened that I plugged another new drive into a nearby USB slot (maybe sharing a bus with the first one) to run a second badblocks on it at the same time. The first badblocks was in the middle of its verification when I started the second one, and it suddenly saw bad blocks everywhere! The effect disappeared as soon as I stopped the second process. So I restarted both badblocks runs on different USB slots, everything worked fine, and both drives and their respective cables were perfectly okay. These observations indicate that the USB connection within the computer (hardware or software/driver/firmware) is probably the issue here, and that badblocks cannot be trusted either. Go figure!
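(For reference, what I run on new drives is just the non-destructive read-only scan, one stick at a time from now on:)

```
sudo badblocks -sv /dev/sdX   # -s shows progress, -v reports errors found
```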
@tokidev: It would be great to see ddpolymerase connect directly to the files instead of through dd as it might make things run faster. Also, after a quick glance, I could not find anything in the source code trying to re-align to physical sectors in case there is some kind of misalignment. (Not sure whether this is actually a thing, but I imagine something like dmsetup on a part of a device with a 512 byte offset compared to its 4096 byte physical sector size.)
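To make that misalignment idea concrete, a purely hypothetical device-mapper setup shifting everything by one 512-byte sector relative to the 4096-byte physical sectors could look like this:

```
# linear target table format: start length linear device offset
# (all values in 512-byte sectors)
size=$(( $(sudo blockdev --getsz /dev/sdX) - 1 ))
echo "0 $size linear /dev/sdX 1" | sudo dmsetup create misaligned-demo
```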
Also, thanks for the fullblock iflag, I apparently missed it whenever I read the man page of dd.
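In case anyone else missed it too: fullblock matters whenever dd reads from a pipe, where short reads are normal; without it, count= or conv=sync can silently drop or pad data. For example:

```
zcat image.iso.gz | sudo dd of=/dev/sdX bs=1M iflag=fullblock conv=fsync
```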
Offline
In the end, it does not matter how the wrong bits got onto the drive, and tools like cp or cat won't proofread the result either.
That assumes the target device is broken. That assumption was not made in my response; it was only brought up after that fragment. The logic in what I wrote covers three possibilities: valid data written but never stored on the medium → wrong data written → valid data written but stored with errors. The first two must be eliminated to boost the likelihood of the third. For the first one, the tool used bears little significance, as the issue lies elsewhere. Once that is eliminated, but before device corruption is considered, one must be certain the damage is not introduced by the tool used. cp, cat and pv with redirection/teeing can’t introduce it, while dd can, and dd is notorious for being used wrong: both due to its obscure semantics and due to poor advice. Only after those two explanations become unlikely is a broken storage medium worth considering. Otherwise one risks discarding a perfectly fine device.
Sometimes I seem a bit harsh — don’t get offended too easily!
Offline
@itektur: Nice to hear that you like ddpolymerase already. You are not alone with that issue: I know other people who try to boot from USB sticks and run into the same problem.
@tokidev: It would be great to see ddpolymerase connect directly to the files instead of through dd as it might make things run faster. Also, after a quick glance, I could not find anything in the source code trying to re-align to physical sectors in case there is some kind of misalignment. (Not sure whether this is actually a thing, but I imagine something like dmsetup on a part of a device with a 512 byte offset compared to its 4096 byte physical sector size.)
I use dd for good reasons, but in the future I could try to incorporate the appropriate functionality of dd directly, as long as things do not become too complicated.
You are right, currently I do not care about misalignment. Maybe this will be solved in a future version.
Thank you for the links!
Offline
New version (0.2.0) published: Changelog
It is thoroughly tested via automated use case tests. However, there is no warranty!
I am open to feedback, contributions, suggestions and bug reports.
Offline