#1 2011-07-22 11:48:55

Xyne
Administrator/PM
Registered: 2008-08-03
Posts: 6,965

Death of a USB device... natural causes or foul play?

I've had a little USB flash-drive MP3 player for years. The last time I mounted it (mount -o umask=0 ...) to replace some songs, the songs copied correctly, but umount hung when I tried to unmount it.

After about 5 minutes I killed umount and confirmed that the device was no longer mounted. After unplugging it, I turned it on to check it, but it wouldn't get past the initial logo screen.

I can no longer mount the device. Here's the dmesg output from multiple attempts:

[  101.066671] usb 6-2: new full speed USB device number 2 using uhci_hcd
[  101.515930] usbcore: registered new interface driver uas
[  101.520972] Initializing USB Mass Storage driver...
[  101.521079] scsi10 : usb-storage 6-2:1.0
[  101.521176] usbcore: registered new interface driver usb-storage
[  101.521182] USB Mass Storage support registered.
[  102.903338] usb 6-2: reset full speed USB device number 2 using uhci_hcd
[  103.020004] usb 6-2: device descriptor read/64, error -71
[  103.240002] usb 6-2: device descriptor read/64, error -71
[  103.450003] usb 6-2: reset full speed USB device number 2 using uhci_hcd
[  103.620002] usb 6-2: device descriptor read/64, error -71
[  103.840000] usb 6-2: device descriptor read/64, error -71
[  104.050005] usb 6-2: reset full speed USB device number 2 using uhci_hcd
[  104.463350] usb 6-2: device not accepting address 2, error -71
[  104.570003] usb 6-2: reset full speed USB device number 2 using uhci_hcd
[  104.983340] usb 6-2: device not accepting address 2, error -71
[  104.983365] usb 6-2: USB disconnect, device number 2
[  105.090003] usb 6-2: new full speed USB device number 3 using uhci_hcd
[  105.260002] usb 6-2: device descriptor read/64, error -71
[  105.480006] usb 6-2: device descriptor read/64, error -71
[  105.690003] usb 6-2: new full speed USB device number 4 using uhci_hcd
[  105.806673] usb 6-2: device descriptor read/64, error -71
[  106.080003] usb 6-2: device descriptor read/64, error -71
[  106.290002] usb 6-2: new full speed USB device number 5 using uhci_hcd
[  106.703342] usb 6-2: device not accepting address 5, error -71
[  106.810006] usb 6-2: new full speed USB device number 6 using uhci_hcd
[  107.223340] usb 6-2: device not accepting address 6, error -71
[  107.223350] hub 6-0:1.0: unable to enumerate USB device on port 2
[  847.596593] usb 6-2: new full speed USB device number 7 using uhci_hcd
[  847.781385] scsi11 : usb-storage 6-2:1.0
[  869.493259] usb 6-2: reset full speed USB device number 7 using uhci_hcd
[  879.739928] usb 6-2: reset full speed USB device number 7 using uhci_hcd
[  895.993259] usb 6-2: reset full speed USB device number 7 using uhci_hcd
[  896.243260] usb 6-2: reset full speed USB device number 7 using uhci_hcd
[  906.489922] usb 6-2: reset full speed USB device number 7 using uhci_hcd
[  906.636310] scsi 11:0:0:0: Device offlined - not ready after error recovery
[ 1240.346582] usb 6-2: USB disconnect, device number 7
[ 1242.973223] usb 6-1: new full speed USB device number 8 using uhci_hcd
[ 1243.141696] scsi12 : usb-storage 6-1:1.0
[ 1265.439888] usb 6-1: reset full speed USB device number 8 using uhci_hcd
[ 1275.693217] usb 6-1: reset full speed USB device number 8 using uhci_hcd
[ 1291.943223] usb 6-1: reset full speed USB device number 8 using uhci_hcd
[ 1292.193219] usb 6-1: reset full speed USB device number 8 using uhci_hcd
[ 1302.439883] usb 6-1: reset full speed USB device number 8 using uhci_hcd
[ 1302.586484] scsi 12:0:0:0: Device offlined - not ready after error recovery

I get the same messages on a different system.
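For anyone comparing logs like this across machines, here is a minimal sketch that tallies the error codes per port from dmesg-style lines. The function name and the regular expression are my own assumptions, written against the line format shown above; on Linux, error -71 corresponds to EPROTO (protocol error).

```python
import errno
import re
from collections import Counter

# Matches lines such as:
# [  103.020004] usb 6-2: device descriptor read/64, error -71
# (pattern is an assumption based on the log format pasted above)
ERROR_LINE = re.compile(r"\]\s+usb (?P<port>[\d.-]+): .*error (?P<code>-\d+)")


def count_usb_errors(lines):
    """Count (port, errno name) pairs found in dmesg-style lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            number = -int(match.group("code"))
            # Fall back to the bare number if this platform has no name for it.
            name = errno.errorcode.get(number, str(number))
            counts[(match.group("port"), name)] += 1
    return counts
```

Feeding it the output of dmesg split into lines gives one counter entry per port and error name, which makes it easier to see whether two systems really fail in the same way.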

I've tried removing the ehci-hcd module as recommended in the Linux USB FAQ and in several threads concerning similar issues. I've even taken the device apart to see if anything was broken, fried or disconnected, but I didn't notice anything.

Before I give up and accept that it's an irreparable hardware failure, is there anything that I should try?

I should also mention that I had swapped some songs fairly recently before this, so if a kernel change is to blame, it was probably introduced in 2.6.39.3. I found the following USB-related entries in the kernel changelog, which may be relevant:

commit b8680d130d565da6eb07567bd6ff20b73f747498
Author: Alan Stern <stern@rowland.harvard.edu>
Date:   Wed Jul 6 17:03:45 2011 -0400

    USB: additional regression fix for device removal
    
    commit ca5c485f55d326d9a23e4badd05890148aa53f74 upstream.
    
    Commit e534c5b831c8b8e9f5edee5c8a37753c808b80dc (USB: fix regression
    occurring during device removal) didn't go far enough.  It failed to
    take into account that when a driver claims multiple interfaces, it may
    release them all at the same time.  As a result, some interfaces can
    get released before they are unregistered, and we deadlock trying to
    acquire the bandwidth_mutex that we already own.
    
    This patch (as1478) handles this case by setting the "unregistering"
    flag on all the interfaces before removing any of them.
    
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Tested-by: Éric Piel <eric.piel@tremplin-utc.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


commit 042fe1a2805b175b318db0fa10cf8c9df192fa7c
Author: Alan Stern <stern@rowland.harvard.edu>
Date:   Fri Jul 1 16:43:02 2011 -0400

    USB: fix regression occurring during device removal
    
    commit e534c5b831c8b8e9f5edee5c8a37753c808b80dc upstream.
    
    This patch (as1476) fixes a regression introduced by
    fccf4e86200b8f5edd9a65da26f150e32ba79808 (USB: Free bandwidth when
    usb_disable_device is called).  usb_disconnect() grabs the
    bandwidth_mutex before calling usb_disable_device(), which calls down
    indirectly to usb_set_interface(), which tries to acquire the
    bandwidth_mutex.
    
    The fix causes usb_set_interface() to return early when it is called
    for an interface that has already been unregistered, which is what
    happens in usb_disable_device().
    
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

commit 8c603fc5c6608bac3e3df537f8f4a70a24e4edd0
Author: Alan Stern <stern@rowland.harvard.edu>
Date:   Wed Jun 15 16:29:16 2011 -0400

    USB: don't let the hub driver prevent system sleep
    
    commit cbb330045e5df8f665ac60227ff898421fc8fb92 upstream.
    
    This patch (as1465) continues implementation of the policy that errors
    during suspend or hibernation should not prevent the system from going
    to sleep.
    
    In this case, failure to turn on the Suspend feature for a hub port
    shouldn't be reported as an error.  There are situations where this
    does actually occur (such as when the device plugged into that port
    was disconnected in the recent past), and it turns out to be harmless.
    There's no reason for it to prevent a system sleep.
    
    Also, don't allow the hub driver to fail a system suspend if the
    downstream ports aren't all suspended.  This is also harmless (and
    should never happen, given the change mentioned above); printing a
    warning message in the kernel log is all we really need to do.
    
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

commit c4c3048b2bda6562bcdf5507bb9c6c248a87f675
Author: Alan Stern <stern@rowland.harvard.edu>
Date:   Wed Jun 15 16:27:43 2011 -0400

    USB: don't let errors prevent system sleep
    
    commit 0af212ba8f123c2eba151af7726c34a50b127962 upstream.
    
    This patch (as1464) implements the recommended policy that most errors
    during suspend or hibernation should not prevent the system from going
    to sleep.  In particular, failure to suspend a USB driver or a USB
    device should not prevent the sleep from succeeding:
    
    Failure to suspend a device won't matter, because the device will
    automatically go into suspend mode when the USB bus stops carrying
    packets.  (This might be less true for USB-3.0 devices, but let's not
    worry about them now.)
    
    Failure of a driver to suspend might lead to trouble later on when the
    system wakes up, but it isn't sufficient reason to prevent the system
    from going to sleep.
    
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

commit a96e5158f0cab04b29e9236f04214c056efe3a04
Author: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Date:   Sun Jun 5 23:22:22 2011 -0700

    USB: Free bandwidth when usb_disable_device is called.
    
    commit fccf4e86200b8f5edd9a65da26f150e32ba79808 upstream.
    
    Tanya ran into an issue when trying to switch a UAS device from the BOT
    configuration to the UAS configuration via the bConfigurationValue sysfs
    file.  Before installing the UAS configuration, set_bConfigurationValue()
    calls usb_disable_device().  That function is supposed to remove all host
    controller resources associated with that device, but it leaves some state
    in the xHCI host controller.
    
    Commit 0791971ba8fbc44e4f476079f856335ed45e6324
        usb: allow drivers to use allocated bandwidth until unbound
    added a call to usb_disable_device() in usb_set_configuration(), before
    the xHCI bandwidth functions were invoked.  That commit fixed a bug, but
    also introduced a bug that is triggered when a configured device is
    switched to a new configuration.
    
    usb_disable_device() goes through all the motions of unbinding the drivers
    attached to active interfaces and removing the USB core structures
    associated with those interfaces, but it doesn't actually remove the
    endpoints from the internal xHCI host controller bandwidth structures.
    
    When usb_disable_device() calls usb_disable_endpoint() with reset_hardware
    set to true, the entries in udev->ep_out and udev->ep_in will be set to
    NULL.  Usually, when the USB core installs a new configuration,
    usb_hcd_alloc_bandwidth() will drop all non-NULL endpoints in udev->ep_out
    and udev->ep_in before adding any new endpoints.  However, when the new
    UAS configuration was added, all those entries were null, so none of the
    old endpoints in the BOT configuration were dropped.
    
    The xHCI driver blindly added the UAS configuration endpoints, and some of
    the endpoint addresses overlapped with the old BOT configuration
    endpoints.  This caused the xHCI host to reject the Configure Endpoint
    command.  Now that the xHCI driver code is cleaned up to reject a
    double-add of active endpoints, we need to fix the USB core to properly
    drop old endpoints in usb_disable_device().
    
    If the host controller driver needs bandwidth checking support, make
    usb_disable_device() call usb_disable_endpoint() with
    reset_hardware set to false, drop the endpoints from the xHCI host
    controller, and then call usb_disable_endpoint() again with
    reset_hardware set to true.
    
    The first call to usb_disable_endpoint() will cancel any pending URBs and
    wait on them to be freed in usb_hcd_disable_endpoint(), but will keep the
    pointers in udev->ep_out and udev->ep_in intact.  Then
    usb_hcd_alloc_bandwidth() will use those pointers to know which endpoints
    to drop.
    
    The final call to usb_disable_endpoint() will do two things:
    
    1. It will call usb_hcd_disable_endpoint() again, which should be harmless
    since the ep->urb_list should be empty after the first call to
    usb_disable_endpoint() returns.
    
    2. It will set the entries in udev->ep_out and udev->ep_in to NULL, and call
    usb_hcd_disable_endpoint().  That call will have no effect, since the xHCI
    driver doesn't set the endpoint_disable function pointer.
    
    Note that usb_disable_device() will now need to be called with
    hcd->bandwidth_mutex held.
    
    This should be backported to kernels as old as 2.6.32.
    
    Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
    Reported-by: Tanya Brokhman <tlinder@codeaurora.org>
    Cc: ablay@codeaurora.org
    Cc: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
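
As far as I can tell from the two "regression during device removal" entries, the problem was a self-deadlock: usb_disconnect() takes bandwidth_mutex and then, indirectly, reaches usb_set_interface(), which tries to take the same mutex again. Below is a rough user-space sketch of that pattern and of the early-return fix. It is not kernel code: the Python function names are hypothetical stand-ins, and a timeout replaces the real hang so the sketch can report the problem instead of blocking forever.

```python
import threading

# Non-reentrant lock, standing in for the kernel's bandwidth_mutex.
bandwidth_mutex = threading.Lock()


def set_interface(unregistered, early_return):
    """Hypothetical stand-in for usb_set_interface(); True if it completed."""
    if early_return and unregistered:
        # The fix described in the changelog: return early for an
        # interface that has already been unregistered.
        return True
    # A timeout is used so the sketch reports the deadlock instead of hanging.
    if not bandwidth_mutex.acquire(timeout=0.1):
        return False  # the caller already holds the lock
    try:
        return True
    finally:
        bandwidth_mutex.release()


def disconnect(early_return):
    """Hypothetical stand-in for usb_disconnect(): lock, then call down."""
    with bandwidth_mutex:
        return set_interface(unregistered=True, early_return=early_return)
```

Calling disconnect(early_return=False) reproduces the blocked second acquisition, while disconnect(early_return=True) completes because the locked section is skipped. Whether any of this is related to my hung umount is a separate question.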

Does anyone know if any of those changes could kill a USB flash drive?


Linux fun fact #3487: if you remove the uhci-hcd module, your USB keyboard and mouse stop working :P (luckily I still have an old non-USB keyboard)

Last edited by Xyne (2011-07-22 11:49:37)



#2 2011-07-22 17:55:27

StenM
Member
Registered: 2011-07-03
Posts: 33

Re: Death of a USB device... natural causes or foul play?

I had a similar problem with a USB memory stick a while ago. In the end, and I don't know why, simply formatting it under WinXP fixed it.
S-

Last edited by StenM (2011-07-22 17:56:32)
