I have a 2TB SATA drive that I used in a desktop computer (as a server). The drive is encrypted with LUKS and uses XFS as its filesystem. Unfortunately that computer died, so I figured I'd migrate everything over to a laptop instead. I put the 2TB SATA drive in a USB drive adapter, and it worked as expected.
That lasted until a few hours later, when I suddenly couldn't use the drive any longer. It always happens after the drive has been idle for a little while.
dmesg shows the following relevant information:
[ 81.300065] usb 1-2: new high speed USB device number 2 using ehci_hcd
[ 81.516917] usbcore: registered new interface driver uas
[ 81.538799] Initializing USB Mass Storage driver...
[ 81.538966] scsi2 : usb-storage 1-2:1.0
[ 81.539424] usbcore: registered new interface driver usb-storage
[ 81.539429] USB Mass Storage support registered.
[ 82.548407] scsi 2:0:0:0: Direct-Access WDC WD20 EARS-00J2GB0 PQ: 0 ANSI: 2
[ 82.551630] sd 2:0:0:0: Attached scsi generic sg2 type 0
[ 82.552000] sd 2:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[ 82.554255] sd 2:0:0:0: [sdb] Write Protect is off
[ 82.554262] sd 2:0:0:0: [sdb] Mode Sense: 38 00 00 00
[ 82.554266] sd 2:0:0:0: [sdb] Assuming drive cache: write through
[ 82.557630] sd 2:0:0:0: [sdb] Assuming drive cache: write through
[ 82.593632] sdb: unknown partition table
[ 82.597254] sd 2:0:0:0: [sdb] Assuming drive cache: write through
[ 82.597262] sd 2:0:0:0: [sdb] Attached SCSI disk
[ 104.697278] XFS (dm-1): Mounting Filesystem
[ 105.023577] XFS (dm-1): Ending clean mount
[ 4249.281149] usb 1-2: USB disconnect, device number 2
[ 4440.332930] INFO: task khubd:370 blocked for more than 120 seconds.
[ 4440.332935] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4440.332939] khubd D 00000000 0 370 2 0x00000000
[ 4440.332947] dea97d74 00000046 c1030357 00000000 00000001 00000003 5d6764c7 000003dd
[ 4440.332956] 00000096 ddd02180 dea97d04 c10303a4 00000000 00000096 dea97d18 c1527480
[ 4440.332965] ddc84c38 c1527480 dec06480 de38f8e0 de2a0d20 00000000 00024592 00000000
[ 4440.332974] Call Trace:
[ 4440.332987] [<c1030357>] ? __wake_up_common+0x47/0x70
[ 4440.332993] [<c10303a4>] ? __wake_up_locked+0x24/0x30
[ 4440.332998] [<c10318ab>] ? cpuacct_charge+0x5b/0x70
[ 4440.333006] [<c11bd4b4>] ? rb_insert_color+0xc4/0x100
[ 4440.333014] [<c13487d5>] schedule_timeout+0x265/0x2e0
[ 4440.333022] [<c103f883>] ? check_preempt_wakeup+0x123/0x1b0
[ 4440.333028] [<c1036f72>] ? check_preempt_curr+0x72/0x90
[ 4440.333033] [<c1347527>] wait_for_common+0x97/0x120
[ 4440.333038] [<c10410a0>] ? try_to_wake_up+0x350/0x350
[ 4440.333043] [<c13475c7>] wait_for_completion+0x17/0x20
[ 4440.333048] [<c106390e>] kthread_stop+0x3e/0x130
[ 4440.333058] [<e00e6178>] release_everything+0x28/0xa0 [usb_storage]
[ 4440.333064] [<e00e620f>] usb_stor_disconnect+0x1f/0x30 [usb_storage]
[ 4440.333089] [<e0f9d64c>] usb_unbind_interface+0x3c/0x140 [usbcore]
[ 4440.333096] [<c1264271>] __device_release_driver+0x51/0xb0
[ 4440.333101] [<c12642f4>] device_release_driver+0x24/0x40
[ 4440.333106] [<c1263e6a>] bus_remove_device+0x5a/0x80
[ 4440.333110] [<c1261dc8>] device_del+0x108/0x170
[ 4440.333121] [<e0f9b691>] usb_disable_device+0x91/0x1a0 [usbcore]
[ 4440.333130] [<e0f94d1a>] usb_disconnect+0x9a/0x120 [usbcore]
[ 4440.333141] [<e0f967dc>] hub_thread+0x86c/0xf60 [usbcore]
[ 4440.333146] [<c1347a25>] ? schedule+0x275/0x9c0
[ 4440.333151] [<c1040efc>] ? try_to_wake_up+0x1ac/0x350
[ 4440.333157] [<c1063fa0>] ? abort_exclusive_wait+0x80/0x80
[ 4440.333161] [<c1031b4e>] ? complete+0x4e/0x60
[ 4440.333172] [<e0f95f70>] ? usb_remote_wakeup+0x40/0x40 [usbcore]
[ 4440.333176] [<c10638bd>] kthread+0x6d/0x80
[ 4440.333181] [<c1063850>] ? kthread_worker_fn+0x160/0x160
[ 4440.333187] [<c134b8be>] kernel_thread_helper+0x6/0x10
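To put a number on how long the drive stayed attached in this particular log, the timestamps can be subtracted directly. The following is only a small sketch: `find_event_times` is a hypothetical helper name, and the sample text is three lines copied from the dmesg output above.

```python
import re

# Matches the "[ seconds.micros] message" prefix that dmesg prints.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_event_times(dmesg_text, needles):
    """Return {needle: first timestamp in seconds} for each needle found."""
    found = {}
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip wrapped or untimestamped lines
        stamp, message = float(match.group(1)), match.group(2)
        for needle in needles:
            if needle in message and needle not in found:
                found[needle] = stamp
    return found


# Three lines copied from the log above.
SAMPLE = """\
[ 105.023577] XFS (dm-1): Ending clean mount
[ 4249.281149] usb 1-2: USB disconnect, device number 2
[ 4440.332930] INFO: task khubd:370 blocked for more than 120 seconds.
"""

times = find_event_times(SAMPLE, ["Ending clean mount", "USB disconnect"])
uptime_minutes = (times["USB disconnect"] - times["Ending clean mount"]) / 60
print(f"disk stayed attached for about {uptime_minutes:.0f} minutes after mounting")
```

Comparing this interval across several failures would show whether the disconnect follows a fixed timeout or varies with activity.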
At first I figured the drive was suspending, so I tried playing with the settings in /sys/bus/usb/devices/.../power/. To my knowledge the defaults are set to never suspend, and dmesg doesn't say anything about suspending any drives (should it?), so I doubt this is the case. Also, I can still hear that the drive is spinning.
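A quick way to double-check those power settings is to dump them for every USB device at once. This is a rough sketch, not a definitive tool: `read_usb_power_settings` is a made-up name, and the attribute names `control`, `autosuspend` and `autosuspend_delay_ms` are assumptions that depend on the kernel version (older kernels used a file called `level` instead), so missing files are simply skipped.

```python
import os


def read_usb_power_settings(sysfs_root="/sys/bus/usb/devices"):
    """Collect the power/* attributes that exist for each USB device."""
    settings = {}
    if not os.path.isdir(sysfs_root):
        return settings
    for device in sorted(os.listdir(sysfs_root)):
        power_dir = os.path.join(sysfs_root, device, "power")
        values = {}
        # Assumed attribute names; which of them exist varies by kernel.
        for name in ("control", "autosuspend", "autosuspend_delay_ms"):
            path = os.path.join(power_dir, name)
            try:
                with open(path) as handle:
                    values[name] = handle.read().strip()
            except OSError:
                continue  # attribute missing or unreadable on this kernel
        if values:
            settings[device] = values
    return settings


if __name__ == "__main__":
    for device, values in read_usb_power_settings().items():
        print(device, values)
```

On kernels that have `power/control`, a value of `on` should mean the device is never autosuspended, while `auto` lets the kernel suspend it when idle.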
I have tried changing the USB cable but I got the same error.
So I figure, it's either a hardware problem (the USB drive adapter is bad) or it's the kernel -- I've read through some mailing lists that older kernels had some problems with USB devices disconnecting.
What I would like to try is to disable ehci_hcd and try to use a different module, though I need a little help with what module and how I would go about doing this. (I see in the lspci a reference to uhci, maybe try that?).
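One way to take ehci_hcd out of the picture without rebuilding anything is to unbind it from the EHCI controller through sysfs. The sketch below only works out what would have to be written where; it does not touch the system. The function names are hypothetical, the driver directory name `ehci_hcd` is taken from the dmesg output in this post (other kernels may name it differently), and the `0000:` PCI domain prefix is an assumption.

```python
import os


def ehci_pci_addresses(lspci_text):
    """Return PCI addresses of the lines lspci labels as EHCI controllers."""
    addresses = []
    for line in lspci_text.splitlines():
        if "USB" in line and "EHCI" in line:
            addresses.append(line.split()[0])
    return addresses


def unbind_request(short_address, driver="ehci_hcd",
                   sysfs_pci="/sys/bus/pci/drivers"):
    """Return (path, payload) for a sysfs unbind; nothing is written here."""
    full_address = "0000:" + short_address  # assumes PCI domain 0000
    return os.path.join(sysfs_pci, driver, "unbind"), full_address


# Two lines copied from the lspci output in this post.
SAMPLE = """\
00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
"""

for address in ehci_pci_addresses(SAMPLE):
    path, payload = unbind_request(address)
    print(f"as root: write {payload!r} to {path}")
```

With EHCI unbound, the ports should fall back to the UHCI companion controllers, which means USB 1.1 speeds. That would be painfully slow for a 2TB disk, so this is more of a diagnostic than a fix.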
If anyone has other ideas I could try, that would also be greatly appreciated.
Thanks in advance.
lspci
00:00.0 Host bridge: Intel Corporation Mobile 915GM/PM/GMS/910GML Express Processor to DRAM Controller (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 03)
00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev d3)
00:1e.2 Multimedia audio controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) AC'97 Audio Controller (rev 03)
00:1f.0 ISA bridge: Intel Corporation 82801FBM (ICH6M) LPC Interface Bridge (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
02:01.0 CardBus bridge: Texas Instruments PCI1510 PC card Cardbus Controller
02:03.0 Network controller: Broadcom Corporation BCM4318 [AirForce One 54g] 802.11g Wireless LAN Controller (rev 02)
02:08.0 Ethernet controller: Intel Corporation 82562ET/EZ/GT/GZ - PRO/100 VE (LOM) Ethernet Controller Mobile (rev 03)
lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 002: ID 04fc:0c25 Sunplus Technology Co., Ltd SATALink SPIF225A
Last edited by Canute (2011-07-29 13:36:55)
Have you tried smartmontools to check whether this is a hardware issue? Might be worth a try.
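For a disk behind a USB bridge, smartctl often needs to be told the bridge type. Something like this sketch builds the command line; `smartctl_command` is a made-up helper, `/dev/sdb` comes from the dmesg output above, and `usbsunplus` is my guess at the right `-d` type for the Sunplus bridge shown in lsusb, so treat it as an assumption.

```python
def smartctl_command(device, bridge_type=None, full_report=False):
    """Build a smartctl argument list; the command is not executed here."""
    argv = ["smartctl"]
    if bridge_type:
        argv += ["-d", bridge_type]  # tells smartctl how to talk through the bridge
    argv.append("-a" if full_report else "-H")  # -a: everything, -H: health only
    argv.append(device)
    return argv


print(" ".join(smartctl_command("/dev/sdb", bridge_type="usbsunplus")))
```

If the health check passes and the attributes look sane, the adapter or the kernel becomes the more likely suspect.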
I think I may have fixed it, or rather, it has worked for the last 8 hours, which is twice as long as the longest it ran before.
Through Ubuntu's bug tracker I found out that there appears to be a bug in the kernel (#32432). From what I read on the bugzilla, that bug appears to affect only 2.6.27-2.6.38 (is that right?), and I was using 2.6.39, so I was really unsure whether it applied to my kernel.
Nevertheless I built my own kernel with the help of ABS. I used Linux version 2.6.38, the standard Arch Linux patch for 2.6.38 and the patch located here.