#1 2015-08-06 22:56:42

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Issues with internal/external SSDs (momentary freezes and dismounts)

Greetings,


I'm having issues with both an internal M.2 SATA SSD (SanDisk SD6PP4M-256G-1006) and an SSD inside an external SATA/USB enclosure (Samsung 850 EVO 120G). My apologies if these are unrelated and should be split; for now, I'll post the errors together. I've googled the errors but haven't found a definitive cause or resolution. Here is each message along with what I found for it:

- exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen: nothing stands out to me (some posts suggest blacklisting modules, but their errors don't really match mine)
- failed command: READ DMA EXT: all the posts I find feature a {DRDY ERR} status in the logs, whereas mine shows only { DRDY }
- device reported invalid CHS sector 0: suggestions are to try replacing cables, a possible drive failure, or libata.force=noncq (done long ago)
- blk_update_request I/O error: suggestions are to set the BIOS to AHCI or improve drive cooling (my BIOS is already set to AHCI, and this drive is external anyway), noncq (again, already done), or a failing drive
- Synchronize Cache(10) failed: one report of a faulty USB stick, plus a suggestion to run ntfsfix

I should note that this computer is less than two months old, and the Samsung 850 EVO is maybe 3-4 weeks old. I can run smartctl again if the output is really desired, but I highly doubt both drives are failing: both passed the long self-test within the past week. I'm applying TRIM via the fstrim systemd service, have turned off UAS for the external drive (per this method) because it was causing some real unrecoverable freezes on heavy writes (e.g. an rsync restoration of a backup), and am passing libata.force=noncq via syslinux.

I'm looking for suggestions on how to troubleshoot further. The momentary freezes are just a minor annoyance, but they make me wonder if something deeper is wrong with my setup. The auto-dismounts are more concerning: the drive is encrypted with TrueCrypt to share data with Windows, and I'm prompted to run CHKDSK when I boot into Windows, which indicates the dismounts are not clean. I'd hate to shorten a nice drive's lifespan or lose my data due to bad dismounts on an encrypted drive. Thanks for any suggestions.


Internal SATA SSD

Occasionally, I get momentary system freezes. Applications such as Chrome or Thunar become unresponsive and recover after perhaps 5-10 seconds. I see errors like this via dmesg:

[ 2311.211151] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 2311.211163] ata7.00: failed command: READ DMA EXT
[ 2311.211166] ata7.00: cmd 25/00:08:f8:ad:23/00:00:18:00:00/e0 tag 0 dma 4096 in
[ 2311.211168] ata7.00: status: { DRDY }
[ 2311.211170] ata7: hard resetting link
[ 2311.528811] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 2311.529530] ata7.00: configured for UDMA/100
[ 2311.529534] ata7.00: device reported invalid CHS sector 0
[ 2311.529542] ata7: EH complete
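
For anyone trying to read the "cmd" and "SStatus" lines above, here is a minimal Python sketch that decodes them. It assumes the field order libata prints for a taskfile (CMD/FEAT:NSECT:LBAL:LBAM:LBAH/HOB_FEAT:HOB_NSECT:HOB_LBAL:HOB_LBAM:HOB_LBAH/DEVICE) and the standard SATA SStatus layout (DET, SPD, IPM nibbles). The function names are made up for illustration and are not part of any tool.

```python
import re

# Assumed libata log layout:
#   cmd CMD/FEAT:NSECT:LBAL:LBAM:LBAH/HOB_FEAT:HOB_NSECT:HOB_LBAL:HOB_LBAM:HOB_LBAH/DEVICE
TASKFILE_RE = re.compile(
    r"cmd ([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})"
)


def decode_taskfile(line):
    """Return the command byte, 48-bit LBA and sector count from a 'cmd' line."""
    m = TASKFILE_RE.search(line)
    if m is None:
        raise ValueError("no ATA taskfile found in line")
    command = int(m.group(1), 16)
    _feat, nsect, lbal, lbam, lbah = (int(x, 16) for x in m.group(2).split(":"))
    _hfeat, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m.group(3).split(":")
    )
    # The "HOB" (high order byte) fields carry the upper 24 bits of the LBA.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    sectors = (hob_nsect << 8) | nsect
    return {"command": command, "lba": lba, "sectors": sectors}


def decode_sstatus(value):
    """Split the SStatus register into its DET, SPD and IPM fields."""
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}


if __name__ == "__main__":
    tf = decode_taskfile("ata7.00: cmd 25/00:08:f8:ad:23/00:00:18:00:00/e0 tag 0 dma 4096 in")
    print(hex(tf["command"]), tf["lba"], tf["sectors"] * 512)
    print(decode_sstatus(0x133))
```

Command 0x25 is READ DMA EXT, and 8 sectors of 512 bytes agrees with the "dma 4096 in" part of the log line. In SStatus 133 (hex), SPD = 3 corresponds to the 6.0 Gbps link speed the kernel reports.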

External SSD via USB enclosure

For the external drive, the primary issue is that it sometimes gets automatically dismounted across a suspend/resume cycle:

[ 3708.988034] sd 8:0:0:0: [sdc] Synchronizing SCSI cache
[ 3712.426381] sd 8:0:0:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[ 3712.426386] sd 8:0:0:0: [sdc] CDB: opcode=0x28 28 00 00 18 08 18 00 00 08 00
[ 3712.426388] blk_update_request: I/O error, dev sdc, sector 1574936
[ 3712.426412] sd 8:0:0:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[ 3712.426414] sd 8:0:0:0: [sdc] CDB: opcode=0x28 28 00 00 18 08 20 00 00 80 00
[ 3712.426415] blk_update_request: I/O error, dev sdc, sector 1574944
[ 3712.459668] sd 8:0:0:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[ 3712.459671] sd 8:0:0:0: [sdc] CDB: opcode=0x28 28 00 00 18 08 28 00 00 78 00
[ 3712.459673] blk_update_request: I/O error, dev sdc, sector 1574952
[ 3712.493056] sd 8:0:0:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[ 3712.493057] sd 8:0:0:0: [sdc] CDB: opcode=0x28 28 00 00 18 08 10 00 00 08 00
[ 3712.493059] blk_update_request: I/O error, dev sdc, sector 1574928
[ 3712.493090] EXT4-fs error (device sdc1): __ext4_get_inode_loc:3922: inode #49153: block 196610: comm updatedb: unable to read itable block
[ 3712.493090] sd 8:0:0:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[ 3712.493092] sd 8:0:0:0: [sdc] CDB: opcode=0x28 28 00 00 18 08 a0 00 00 78 00
[ 3712.493093] blk_update_request: I/O error, dev sdc, sector 1575072
[ 3712.526414] sd 8:0:0:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[ 3712.526417] sd 8:0:0:0: [sdc] CDB: opcode=0x28 28 00 00 18 08 a8 00 00 70 00
[ 3712.526418] blk_update_request: I/O error, dev sdc, sector 1575080
[ 3712.559818] sd 8:0:0:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
[ 3712.559821] sd 8:0:0:0: [sdc] CDB: opcode=0x2a 2a 00 00 00 08 00 00 00 08 00
[ 3712.559823] blk_update_request: I/O error, dev sdc, sector 2048
[ 3712.559826] Buffer I/O error on dev sdc1, logical block 0, lost sync page write
[ 3712.770644] blk_update_request: I/O error, dev sdc, sector 0
[ 3712.772025] sd 8:0:0:0: [sdc] Synchronizing SCSI cache
[ 3712.772059] sd 8:0:0:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
[ 3712.800110] EXT4-fs (sdc1): previous I/O error to superblock detected
[ 3712.800119] Buffer I/O error on dev sdc1, logical block 0, lost sync page write
[ 3714.217677] sd 9:0:0:0: [sdc] 234441648 512-byte logical blocks: (120 GB/111 GiB)
[ 3714.269634] sd 9:0:0:0: [sdc] Write Protect is off
[ 3714.269637] sd 9:0:0:0: [sdc] Mode Sense: 10 00 00 00
[ 3714.323579] sd 9:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 3714.582912]  sdc: sdc1 sdc2
[ 3714.714528] sd 9:0:0:0: [sdc] Attached SCSI disk
[ 6145.434861] sd 9:0:0:0: [sdc] Synchronizing SCSI cache
[ 6145.436353] sd 9:0:0:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
[ 6146.928105] sd 10:0:0:0: [sdc] 234441648 512-byte logical blocks: (120 GB/111 GiB)
[ 6146.980435] sd 10:0:0:0: [sdc] Write Protect is off
[ 6146.980438] sd 10:0:0:0: [sdc] Mode Sense: 10 00 00 00
[ 6147.032799] sd 10:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 6147.289711]  sdc: sdc1 sdc2
[ 6147.420406] sd 10:0:0:0: [sdc] Attached SCSI disk
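
As a cross-check on the log above, here is a small Python sketch that decodes the 10-byte CDBs and the hostbyte values. The opcode and hostbyte tables are deliberately partial and cover only the values seen in this log (names as in the SCSI command set and the Linux SCSI layer). The function name is hypothetical.

```python
# Partial tables: only the values that appear in the log above.
HOST_BYTES = {0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x07: "DID_ERROR"}
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb10(cdb_hex):
    """Decode a 10-byte READ(10)/WRITE(10) CDB printed as space-separated hex."""
    cdb = bytes.fromhex("".join(cdb_hex.split()))
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcode = cdb[0]
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return OPCODES.get(opcode, hex(opcode)), lba, length


if __name__ == "__main__":
    print(decode_cdb10("28 00 00 18 08 18 00 00 08 00"))
    print(decode_cdb10("2a 00 00 00 08 00 00 00 08 00"))
    print(HOST_BYTES[0x07], HOST_BYTES[0x01])
```

The decoded LBAs match the sector numbers in the blk_update_request lines (1574936 for the first read, 2048 for the write). hostbyte=0x07 is a generic host-side error, while hostbyte=0x01 on the Synchronize Cache lines means the device was no longer connected when the command was issued.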

The following is rarer, but it literally just happened while I was writing this post: I noticed that /dev/sdc1 wasn't mounted, so I guess the problem isn't even limited to suspend/resume cycles. The drive was re-enumerated as /dev/sdd with these messages:

[24512.804156] CPU: 2 PID: 2109 Comm: umount Tainted: G        W       4.1.3-1-ARCH #1
[24512.804157] Hardware name: Hewlett-Packard HP ZBook 15 G2/2253, BIOS M70 Ver. 80.08 04/13/2015
[24512.804158]  0000000000000000 0000000035188f7b ffff880666103c48 ffffffff815866ee
[24512.804160]  0000000000000000 ffff880666103ca0 ffff880666103c88 ffffffff81078c9a
[24512.804161]  0000000000000002 ffff88003629ee10 ffff88003629ebe8 ffff880697029470
[24512.804162] Call Trace:
[24512.804167]  [<ffffffff815866ee>] dump_stack+0x4c/0x6e
[24512.804170]  [<ffffffff81078c9a>] warn_slowpath_common+0x8a/0xc0
[24512.804171]  [<ffffffff81078d25>] warn_slowpath_fmt+0x55/0x70
[24512.804175]  [<ffffffff81187928>] ? pcpu_free_area+0xf8/0x1c0
[24512.804176]  [<ffffffff8120dfaf>] __mark_inode_dirty+0x43f/0x460
[24512.804178]  [<ffffffff81215b5c>] __set_page_dirty+0x7c/0xe0
[24512.804179]  [<ffffffff81215ce1>] mark_buffer_dirty+0x61/0x100
[24512.804186]  [<ffffffffa045c7da>] ext4_commit_super+0x16a/0x220 [ext4]
[24512.804191]  [<ffffffffa045d635>] ext4_put_super+0xe5/0x350 [ext4]
[24512.804195]  [<ffffffff811e3096>] generic_shutdown_super+0x76/0x100
[24512.804196]  [<ffffffff811e3457>] kill_block_super+0x27/0x80
[24512.804198]  [<ffffffff811e37c9>] deactivate_locked_super+0x49/0x80
[24512.804199]  [<ffffffff811e3c3c>] deactivate_super+0x6c/0x80
[24512.804202]  [<ffffffff812016d3>] cleanup_mnt+0x43/0xa0
[24512.804203]  [<ffffffff81201782>] __cleanup_mnt+0x12/0x20
[24512.804206]  [<ffffffff81095be4>] task_work_run+0xd4/0xf0
[24512.804210]  [<ffffffff81015d25>] do_notify_resume+0x75/0x80
[24512.804212]  [<ffffffff8158c17c>] int_signal+0x12/0x17
[24512.804213] ---[ end trace e5732c8c03bb3333 ]---
[24512.804217] Buffer I/O error on dev sdc1, logical block 0, lost sync page write
[24512.997248] usb 4-5: new SuperSpeed USB device number 12 using xhci_hcd
[24513.102015] usb 4-5: UAS is blacklisted for this device, using usb-storage instead
[24513.102018] usb-storage 4-5:1.0: USB Mass Storage device detected
[24513.102274] usb-storage 4-5:1.0: Quirks match for vid 059b pid 0070: 800000
[24513.102450] scsi host17: usb-storage 4-5:1.0
[24514.115744] scsi 17:0:0:0: Direct-Access     OEM      Ext Hard Disk    0000 PQ: 0 ANSI: 5
[24514.220071] sd 17:0:0:0: [sdd] 234441648 512-byte logical blocks: (120 GB/111 GiB)
[24514.273519] sd 17:0:0:0: [sdd] Write Protect is off
[24514.273532] sd 17:0:0:0: [sdd] Mode Sense: 10 00 00 00
[24514.325158] sd 17:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[24514.585779]  sdd: sdd1 sdd2
[24514.716827] sd 17:0:0:0: [sdd] Attached SCSI disk
[24898.474928] usb 4-5: USB disconnect, device number 12
[24898.475419] sd 17:0:0:0: [sdd] Synchronizing SCSI cache
[24898.475442] sd 17:0:0:0: [sdd] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
[24898.781391] usb 4-5: new SuperSpeed USB device number 13 using xhci_hcd
[24898.883740] usb 4-5: UAS is blacklisted for this device, using usb-storage instead
[24898.883743] usb-storage 4-5:1.0: USB Mass Storage device detected
[24898.883801] usb-storage 4-5:1.0: Quirks match for vid 059b pid 0070: 800000
[24898.883814] scsi host18: usb-storage 4-5:1.0
[24899.896979] scsi 18:0:0:0: Direct-Access     OEM      Ext Hard Disk    0000 PQ: 0 ANSI: 5
[24900.000215] sd 18:0:0:0: [sdd] 234441648 512-byte logical blocks: (120 GB/111 GiB)
[24900.054667] sd 18:0:0:0: [sdd] Write Protect is off
[24900.054672] sd 18:0:0:0: [sdd] Mode Sense: 10 00 00 00
[24900.107710] sd 18:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[24900.367604]  sdd: sdd1 sdd2
[24900.498937] sd 18:0:0:0: [sdd] Attached SCSI disk
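
To see how often the enclosure drops off the bus, one option is to pull the "Attached SCSI disk" events out of saved dmesg output and look at the gaps between them. The following is only a rough Python sketch; the regex assumes the exact line format shown above, and the function names are invented for illustration.

```python
import re

# Matches lines like: [24514.716827] sd 17:0:0:0: [sdd] Attached SCSI disk
ATTACH_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] sd (\d+):\d+:\d+:\d+: \[(sd[a-z]+)\] Attached SCSI disk"
)


def attach_events(dmesg_text):
    """Return (timestamp, scsi_host, device) for every attach message."""
    return [(float(ts), int(host), dev) for ts, host, dev in ATTACH_RE.findall(dmesg_text)]


def reattach_gaps(events):
    """Seconds of uptime between consecutive attach events."""
    times = [ts for ts, _host, _dev in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    sample = (
        "[24514.716827] sd 17:0:0:0: [sdd] Attached SCSI disk\n"
        "[24898.474928] usb 4-5: USB disconnect, device number 12\n"
        "[24900.498937] sd 18:0:0:0: [sdd] Attached SCSI disk\n"
    )
    events = attach_events(sample)
    print(events)
    print(reattach_gaps(events))
```

A SCSI host number that increases on each attach (17, then 18 above) indicates the device was torn down and re-enumerated rather than simply resumed.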

System details

This is an HP ZBook 15 G2 running Arch Linux x86_64.

Miscellaneous details for reference:

$ lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor DRAM Controller (rev 06)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 04)
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
00:16.3 Serial controller: Intel Corporation 8 Series/C220 Series Chipset Family KT Controller (rev 04)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 04)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d4)
00:1c.4 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #5 (rev d4)
00:1c.6 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #7 (rev d4)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation QM87 Express LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation GK106GLM [Quadro K2100M] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK106 HDMI Audio Controller (rev a1)
3b:00.0 SATA controller: Marvell Technology Group Ltd. 88SS9183 PCIe SSD Controller (rev 14)
3c:00.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
3d:01.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
3d:02.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
3d:03.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
3e:00.0 Network controller: Intel Corporation Wireless 7260 (rev 6b)
60:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5249 PCI Express Card Reader (rev 01)


$ lsusb
Bus 002 Device 002: ID 8087:8000 Intel Corp. 
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:8008 Intel Corp. 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 011: ID 059b:0070 Iomega Corp. eGo Portable Hard Drive  ### this is the external enclosure I'm using. Was a 500GB HDD which died, and I repurposed the enclosure
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 004: ID 04f2:b477 Chicony Electronics Co., Ltd 
Bus 003 Device 003: ID 138a:003f Validity Sensors, Inc. VFS495 Fingerprint Reader
Bus 003 Device 002: ID 0781:5571 SanDisk Corp. Cruzer Fit
Bus 003 Device 006: ID 8087:07dc Intel Corp. 
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub


$ cat /etc/fstab 
# UUID=9ad12135-5057-4966-9dbf-77da6438b32c
/dev/mapper/root   	/         	ext4      	rw,relatime,data=ordered,discard,noatime	0 1

# samsung boot
# /dev/sdc1           	/boot     	ext2      	rw,relatime	0 2 
UUID=26584098-cc33-49ad-a0d3-39344a4c9b56           	/boot     	ext2      	rw,relatime	0 2

# samsung vault
# vault
/dev/mapper/vault  /mnt/vault  ntfs-3g  defaults,uid=1000,gid=1000,dmask=022,fmask=133,noauto,noatime  0 0

$ cat /boot/syslinux/syslinux.cfg
LABEL arch-uuid
    MENU LABEL arch-uuid
    LINUX ../vmlinuz-linux
    APPEND root=/dev/mapper/root cryptdevice=UUID=5efd2b85-7d45-46f4-8407-1f08cca9847f:root:allow-discards crypto=sha512:aes-xts-plain64:512:: libata.force=noncq rw
    INITRD ../intel-ucode.img,../initramfs-linux.img


$ cat /etc/modprobe.d/ignore_uas.conf 
options usb-storage quirks=0x059b:0x0070:u
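
For reference, the quirk line above is just the vendor and product ID from lsusb plus the "u" flag (ignore UAS). Here is a minimal Python sketch of that mapping; the function name is hypothetical, and the 0x prefixes simply mirror the file shown above.

```python
import re

# lsusb prints the IDs as "ID vvvv:pppp" (four hex digits each).
LSUSB_ID_RE = re.compile(r"ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})")


def uas_quirk_line(lsusb_line):
    """Build a modprobe.d line telling usb-storage to ignore UAS for one device."""
    m = LSUSB_ID_RE.search(lsusb_line)
    if m is None:
        raise ValueError("no 'ID vvvv:pppp' field found")
    vid, pid = m.group(1).lower(), m.group(2).lower()
    return f"options usb-storage quirks=0x{vid}:0x{pid}:u"


if __name__ == "__main__":
    line = "Bus 004 Device 011: ID 059b:0070 Iomega Corp. eGo Portable Hard Drive"
    print(uas_quirk_line(line))
```

Whether the quirk took effect can be confirmed from dmesg: the "UAS is blacklisted for this device, using usb-storage instead" and "Quirks match for vid 059b pid 0070" lines in the logs above are the confirmation.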

$ journalctl -u fstrim
-- Logs begin at Thu 2015-07-16 00:02:11 CDT, end at Thu 2015-08-06 17:45:33 CDT. --
Jul 16 23:12:37 arch_zbook systemd[1]: Starting Discard unused blocks...
Jul 16 23:12:46 arch_zbook systemd[1]: Started Discard unused blocks.
-- Reboot --
Jul 20 10:31:47 arch_zbook systemd[1]: Starting Discard unused blocks...
Jul 20 10:32:04 arch_zbook systemd[1]: Started Discard unused blocks.
-- Reboot --
Jul 27 09:08:14 arch_zbook systemd[1]: Starting Discard unused blocks...
Jul 27 09:08:30 arch_zbook systemd[1]: Started Discard unused blocks.
-- Reboot --
Aug 03 08:43:13 arch_zbook systemd[1]: Starting Discard unused blocks...
Aug 03 08:43:31 arch_zbook systemd[1]: Started Discard unused blocks.

Please let me know if further information would be helpful!

#2 2016-01-04 21:51:46

edhex
Member
Registered: 2016-01-04
Posts: 1

Re: Issues with internal/external SSDs (momentary freezes and dismounts)

Were you able to fix this problem on the HP ZBook 15?

#3 2016-01-05 16:44:51

jwhendy
Member
Registered: 2010-04-01
Posts: 621

Re: Issues with internal/external SSDs (momentary freezes and dismounts)

@edhex: I forgot about this post! I didn't get any replies, and when the problem kept happening I started a new thread here. You can read through that one if you'd like; the most helpful posts are probably the later ones. So you know, my most successful fix so far has been appending "pci=nomsi" to the kernel line in my syslinux.cfg. Since then I haven't had any freezes or any messages like the ones above. That thread has links to the kernel's error documentation, where I noted that this option is recommended for errors with "(timeout)" in them.

I plan to leave it like this for a couple of weeks, and if I still have no issues I'll try the other, potentially less all-encompassing options (like the ones I reference in this post on that thread).
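
In case it helps, here is a rough Python sketch of what "append pci=nomsi to the kernel line" amounts to. It works on the text of a syslinux.cfg and leaves reading and writing the file to you; the function name is made up, and it assumes the kernel parameters live on lines starting with APPEND, as in the config in the first post.

```python
def add_kernel_param(cfg_text, param):
    """Append `param` to every APPEND line that does not already contain it."""
    out = []
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("APPEND ") and param not in stripped.split():
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    cfg = (
        "LABEL arch-uuid\n"
        "    LINUX ../vmlinuz-linux\n"
        "    APPEND root=/dev/mapper/root libata.force=noncq rw\n"
    )
    print(add_kernel_param(cfg, "pci=nomsi"), end="")
```

After a reboot, the parameter should appear in /proc/cmdline if it was picked up.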

Hope that helps!
