I'm having issues with re-formatting an external hard drive using dm-crypt. It was previously formatted with TrueCrypt/NTFS, which I used as a shared backup drive between Windows and Arch. At some point, it stopped being able to mount, which I attributed to allowing Windows to "fix" it after improper dismount (e.g. a hard kill).
I decided to re-format with ext4 and only use it from Arch, but now I'm wondering if I may have a hardware issue with the drive. I've tried a lot more (like going through the full zero write after mounting the drive as a temporary dm-crypt device), but here's the condensed version to illustrate the problem.
system info
This is on a fresh boot. Just adding that as I've had issues with kernel modules after updating if a new kernel comes through. A fresh boot removes that potential issue.
$ uname -a
Linux arch_840 4.0.3-1-ARCH #1 SMP PREEMPT Wed May 13 15:38:47 CEST 2015 x86_64 GNU/Linux
$ lsmod | grep dm_
dm_crypt 28672 2
dm_mod 98304 5 dm_crypt
$ lsmod |grep xts
xts 16384 2 serpent_sse2_x86_64,twofish_x86_64_3way
gf128mul 16384 2 lrw,xts
smartctl status
Figured I should check the drive. There's a lot of old age and pre-fail warnings, but this post would seem to suggest I'm okay?
# smartctl -A /dev/sdb
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-4.0.3-1-ARCH] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 0
2 Throughput_Performance 0x0026 252 252 000 Old_age Always - 0
3 Spin_Up_Time 0x0023 090 089 025 Pre-fail Always - 3330
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 703
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0
8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 3707
10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 104
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 734
191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age Always - 17
192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0
194 Temperature_Celsius 0x0002 064 053 000 Old_age Always - 24 (Min/Max 16/47)
195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 252 252 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 252 252 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0036 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 3
223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 104
225 Load_Cycle_Count 0x0032 079 079 000 Old_age Always - 214068
Disk info, delete existing partition, new MBR, create new partition
# fdisk /dev/sdb
Welcome to fdisk (util-linux 2.26.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): p
Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x76d37b6d
Device Boot Start End Sectors Size Id Type
/dev/sdb1 63 976768064 976768002 465.8G 83 Linux
Command (m for help): d
Selected partition 1
Partition 1 has been deleted.
Command (m for help): o
Created a new DOS disklabel with disk identifier 0x2cd60f13.
Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-976773167, default 2048):
Last sector, +sectors or +size{K,M,G,T,P} (2048-976773167, default 976773167):
Created a new partition 1 of type 'Linux' and of size 465.8 GiB.
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
trying to format with cryptsetup
I had a bunch of custom options, but other Arch posts suggested just trying the default, which is what I've done here. It fails with the same error as when I try to pass a cipher, key size, etc. ("Command failed with code 5: IO error while encrypting keyslot.").
# truecrypt -v --debug luksFormat /dev/sdb1
bash: truecrypt: command not found
[root@arch_840 jwhendy]# cryptsetup -v --debug luksFormat /dev/sdb1
# cryptsetup 1.6.6 processing "cryptsetup -v --debug luksFormat /dev/sdb1"
# Running command luksFormat.
# Locking memory.
# Installing SIGINT/SIGTERM handler.
# Unblocking interruption on signal.
WARNING!
========
This will overwrite data on /dev/sdb1 irrevocably.
Are you sure? (Type uppercase yes): YES
# Allocating crypt device /dev/sdb1 context.
# Trying to open and read device /dev/sdb1.
# Initialising device-mapper backend library.
# Timeout set to 0 miliseconds.
# Iteration time set to 1000 miliseconds.
# Interactive passphrase entry requested.
Enter passphrase:
Verify passphrase:
# Formatting device /dev/sdb1 as type LUKS1.
# Crypto backend (gcrypt 1.6.3) initialized.
# Detected kernel Linux 4.0.3-1-ARCH x86_64.
# Topology: IO (512/0), offset = 0; Required alignment is 1048576 bytes.
# Checking if cipher aes-xts-plain64 is usable.
# Using userspace crypto wrapper to access keyslot area.
# Generating LUKS header version 1 using hash sha1, aes, xts-plain64, MK 32 bytes
# KDF pbkdf2, hash sha1: 996745 iterations per second.
# Data offset 4096, UUID 181fed4d-42f2-4f0f-8b70-cb7ba459e25f, digest iterations 121625
# Updating LUKS header of size 1024 on device /dev/sdb1
# Key length 32, device size 976771120 sectors, header size 2050 sectors.
# Reading LUKS header of size 1024 from device /dev/sdb1
# Key length 32, device size 976771120 sectors, header size 2050 sectors.
# Adding new keyslot -1 using volume key.
# Calculating data for key slot 0
# KDF pbkdf2, hash sha1: 1008246 iterations per second.
# Key slot 0 use 492307 password iterations.
# Using hash sha1 for AF in key slot 0, 4000 stripes
# Updating key slot 0 [0x1000] area.
# Using userspace crypto wrapper to access keyslot area.
IO error while encrypting keyslot.
# Releasing crypt device /dev/sdb1 context.
# Releasing device-mapper backend.
# Unlocking memory.
Command failed with code 5: IO error while encrypting keyslot.
Anything that touches the drive also tends to hang at this point. For example, fdisk -l prints the /dev/sda partitions immediately, then hangs instead of printing the /dev/sdb info, and eventually quits without ever printing it.
Any suggestions on where to look/how to troubleshoot? I found some possibly related posts, but nothing that looks promising:
- Impossible to crypt the drive using cryptsetup (fixed by rebooting)
- cryptsetup fails to open Udev cookie 0xd4d94f5 (semid 0) waiting for z (no responses; the hang after seems similar)
There are a few scattered references to cryptsetup 1.6.6 having issues. I downloaded 1.6.4-1, 1.6.5-1 and 1.6.5-2 from ARM to try, but wanted to post this in the meantime in case something stuck out.
Last edited by jwhendy (2015-05-29 16:01:40)
Had a very similar situation lately, improper dismount, tried everything you did, to no success.
The strange thing was, I could format it to ext4 and mount the disk. I would then write a few files to it and it would stop. After a check I found the disk changed from read/write to read only. I tried all suggestions found, no luck.
Also, that I/O error (code 5) suggests it can't read or write to the keyslot area, I guess.
You could try and format the disk ext4, use it and see if you can read and write to it, otherwise it might be bricked.
anything in dmesg?
does the disk pass a smartctl -t long self-test?
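For reference, a minimal sketch of how the long self-test could be started and its result checked. The commented smartctl invocations use the real -t long and -l selftest options; the helper name selftest_failed and the /dev/sdX placeholder are assumptions made for illustration, and the helper only looks for the word "failure" in the log entries it is given on stdin.

```shell
#!/usr/bin/env bash
# Start an extended (long) self-test, then read the log once it has finished:
#   smartctl -t long /dev/sdX
#   smartctl -l selftest /dev/sdX

# selftest_failed: hypothetical helper. Reads `smartctl -l selftest` output
# on stdin and succeeds (exit 0) if any logged test reports a failure.
selftest_failed() {
    # Log entries start with "# <num>"; a failed one contains "failure"
    # (for example "Completed: read failure").
    grep -E '^# *[0-9]+ ' | grep -qi 'failure'
}
```

It could be used as: smartctl -l selftest /dev/sdX | selftest_failed && echo "self-test reported a failure"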
@qinohe I thought of that and the other day started formatting with mkfs.ext4; unfortunately, it was at work and I had to leave before I could let it finish. It had been chugging along a good few hours, and I was surprised it would take that long. I was able to format it with ext4 using Windows 7 (I dual boot) with the MiniTool Partition Wizard but I didn't use it like that before trying to solve the cryptsetup issue again.
This last time around, I was getting unresponsive behavior. I think I need to reboot each time I try something with cryptsetup, as any commands related to that drive seem to hang afterwards (fdisk, umount, eject, mkfs, or trying cryptsetup again). Perhaps I'll just let it cook overnight with mkfs and see if I can at least have an unencrypted, but functional drive.
One interesting tidbit is that even though cryptsetup fails, when I've tried to issue mkfs afterward, it asks me to confirm that I want to format the disk since it has a LUKS header... so something appears to have been written. Is it possible the header is causing some issues? I don't know much about the structure of a disk (like what range the MBR resides in, what constitutes a header, etc.) but have been wondering if there's some way to start really, really clean with the disk. Like I'd just bought it -- something appears to be lingering around from previous efforts?
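On the layout question: the MBR occupies the first 512-byte sector of the whole disk (/dev/sdb), while a LUKS1 header starts at byte 0 of the partition (/dev/sdb1) and begins with the six magic bytes "LUKS" 0xBA 0xBE. The mkfs prompt is consistent with that signature having been written before the keyslot write failed. A rough sketch of checking for it by hand follows; has_luks_magic is a hypothetical helper name, and cryptsetup isLuks or wipefs are the proper tools for this.

```shell
#!/usr/bin/env bash
# has_luks_magic: hypothetical helper. Succeeds if the file or block device
# given as $1 starts with the 6-byte LUKS magic: "LUKS" 0xBA 0xBE.
# Reading a real block device normally requires root.
has_luks_magic() {
    local magic
    # Dump the first 6 bytes as lowercase hex without offsets or spaces.
    magic=$(head -c 6 -- "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "4c554b53babe" ]
}
```

Example: has_luks_magic /dev/sdX1 && echo "LUKS signature present"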
@frostschutz I'll check tomorrow. That's a good question. Just checked journalctl and here are some of the errors that appear; unfortunately, I wasn't watching so I can't tell you what matches up with what command:
May 23 09:32:22 arch_840 systemd-udevd[7784]: inotify_add_watch(7, /dev/sdb1, 10) failed: No such file or directory
May 23 09:32:22 arch_840 kernel: usb 3-4: stat urb: status -108
### there's lots like this; like 10 in a row with various sector values listed
May 23 09:32:19 arch_840 kernel: Buffer I/O error on dev sdb1, logical block 61341696, lost async page write
May 23 09:32:19 arch_840 kernel: blk_update_request: I/O error, dev sdb, sector 490735616
### there's also a bunch like this, from tag #0 -> #29 (not colored red, so not sure they're errors?)
May 23 09:32:19 arch_840 kernel: sd 2:0:0:0: [sdb] tag#0 CDB: opcode=0x2a 2a 00 1d 07 bc 10 00 04 00 00
May 23 09:32:18 arch_840 kernel: sd 2:0:0:0: [sdb] tag#0 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD OUT
I paged down quite a ways and those seem like the unique messages when I search the journal for "sdb". Anything stand out? I will say that the same sector numbers appeared in multiple blocks of the third error type listed, so that makes me wonder if something is genuinely wrong with the disk. I'll post the output of the full smartctl scan when I hopefully run it tomorrow.
Thanks for chiming in!
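As a cross-check on the journal excerpt, the "Buffer I/O error" and "blk_update_request" lines appear to describe the same location. The sketch below assumes 4096-byte blocks (eight 512-byte sectors; the block size is not stated in the log) and the partition start of sector 2048 from the fdisk session; block_to_sector is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# block_to_sector: hypothetical helper. Converts a logical block number on a
# partition to an absolute 512-byte sector on the whole disk.
#   $1 = logical block, $2 = block size in bytes, $3 = partition start sector
block_to_sector() {
    echo $(( $1 * ($2 / 512) + $3 ))
}

# Values from the journal excerpt; 4096-byte blocks are an assumption.
block_to_sector 61341696 4096 2048   # prints 490735616
```

The result matches the sector in the blk_update_request line. Opcode 0x2a in the CDB line is the SCSI WRITE(10) command, which fits the "lost async page write" wording: these are failed writes.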
This last time around, I was getting unresponsive behavior. I think I need to reboot each time I try something with cryptsetup, as any commands related to that drive seem to hang afterwards (fdisk, umount, eject, mkfs, or trying cryptsetup again). Perhaps I'll just let it cook overnight with mkfs and see if I can at least have an unencrypted, but functional drive.
Let it cook overnight? That doesn't seem right; it should be done in a matter of moments.
Or how old is that machine?
One interesting tidbit is that even though cryptsetup fails, when I've tried to issue mkfs afterward, it asks me to confirm that I want to format the disk since it has a LUKS header... so something appears to have been written. Is it possible the header is causing some issues? I don't know much about the structure of a disk (like what range the MBR resides in, what constitutes a header, etc.) but have been wondering if there's some way to start really, really clean with the disk. Like I'd just bought it -- something appears to be lingering around from previous efforts?
Start clean, with something like
dd if=/dev/zero of=/dev/sdX
Wish you luck with it, but my guess is that you're trying to reanimate an already dead disk.
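A slightly more defensive sketch of the same "start clean" step, for anyone copying it. The function name zero_device, the bs=1M block size and the /dev/sdX placeholder are assumptions made for illustration; wipefs -a (from util-linux) removes the signatures it recognises before the full overwrite.

```shell
#!/usr/bin/env bash
# zero_device: hypothetical wrapper around the dd command above.
# Refuses anything that is not a block device, then overwrites it with zeros.
# DESTRUCTIVE: every byte on the target is lost.
zero_device() {
    local dev=$1
    if [ ! -b "$dev" ]; then
        echo "refusing: '$dev' is not a block device" >&2
        return 1
    fi
    # Drop known signatures (LUKS, filesystem, partition table) first.
    wipefs -a "$dev" || return 1
    # bs=1M is only a throughput choice; dd stops with "No space left on
    # device" when it reaches the end of the disk, which is expected here.
    dd if=/dev/zero of="$dev" bs=1M
    sync
}
```

Run as root, for example: zero_device /dev/sdX. On a disk that is already throwing write errors, dd will likely abort at the first bad sector rather than finish.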
You should investigate the cause of those I/O errors; no point doing anything else with it, unless you enjoy data loss.
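One possible starting point is to pull the failing sector numbers out of the kernel log and see whether they repeat. This is only a sketch: failed_sectors is a hypothetical helper, and it matches the "I/O error, dev sdb, sector N" wording seen in the excerpt above, which other kernel versions may phrase differently.

```shell
#!/usr/bin/env bash
# failed_sectors: hypothetical helper. Reads kernel log text on stdin and
# prints the unique sector numbers from "I/O error, dev <dev>, sector N"
# lines for the device name given as $1 (e.g. sdb), sorted numerically.
failed_sectors() {
    sed -n "s/.*I\/O error, dev $1, sector \([0-9][0-9]*\).*/\1/p" | sort -nu
}
```

Example: journalctl -k | failed_sectors sdb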
Thanks to both of you. I googled mkfs.ext4 times and now see it shouldn't be taking that long. I'm running a smartctl selftest right now (estimate of ~2hrs to complete), and will give the zero overwrite a try. I'll post back with the smartctl info when it completes. I'm tending to agree that this may be looking more and more like a dead disk. This has never happened to me before... perhaps I'm just reluctant to admit defeat.
Assuming this means it's dying (read failure)?
# smartctl -l selftest /dev/sdb
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-4.0.4-2-ARCH] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 60% 3714 524232704
Yup. If you zero the drive completely with dd, mayhap it will reallocate some sectors, but such a drive is no longer trustworthy for important data.
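To see whether a full overwrite did trigger reallocation, the three sector-related attributes can be compared before and after. A minimal sketch, assuming the ten-column smartctl -A layout shown in the first post; sector_health is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# sector_health: hypothetical helper. Reads `smartctl -A` output on stdin and
# prints "<attribute name> <raw value>" for the three sector-related
# attributes: 5 (reallocated), 197 (pending), 198 (offline uncorrectable).
sector_health() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2, $10 }'
}
```

Example: smartctl -A /dev/sdX | sector_health. In the table from the first post all three raw values were still 0, so on this drive the self-test log was the more telling signal.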
Good to know. I guess I'll be getting a new drive. Bummer to have spent so much time assuming I was doing cryptsetup wrong to find out it's this! Just for comparison, I checked a different drive today and get this from the short test:
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 88 -
So clearly a difference in the status. Thanks for all the assistance -- will update title (as I think it would be better for this to match hits around drive health and not people looking to solve cryptsetup issues) and mark it solved.