#1 2018-03-11 20:59:44

nrz
Member
Registered: 2016-07-24
Posts: 19

[SOLVED] Kernel RCU crash - Seagate 0bc2:3312

I have a Seagate SRD00F2 3 TB external HDD connected to a USB 3 port on an MSI X99A RAIDER motherboard with 16 GB of RAM installed. After copying ~15.95 GB from the external HDD, the following happens (start looking from 20:05:41):

Mar 11 19:58:56 arch-box kernel: usb 5-2: new SuperSpeed USB device number 5 using xhci_hcd
Mar 11 19:58:56 arch-box kernel: usb 5-2: New USB device found, idVendor=0bc2, idProduct=3312
Mar 11 19:58:56 arch-box kernel: usb 5-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Mar 11 19:58:56 arch-box kernel: usb 5-2: Product: Expansion Desk
Mar 11 19:58:56 arch-box kernel: usb 5-2: Manufacturer: Seagate 
Mar 11 19:58:56 arch-box kernel: usb 5-2: SerialNumber: NA4K8DMB
Mar 11 19:58:56 arch-box kernel: scsi host7: uas
Mar 11 19:58:56 arch-box kernel: scsi 7:0:0:0: Direct-Access     Seagate  Expansion Desk   0740 PQ: 0 ANSI: 6
Mar 11 19:58:56 arch-box kernel: sd 7:0:0:0: [sdd] Spinning up disk...
Mar 11 19:59:02 arch-box kernel: .
Mar 11 19:59:04 arch-box kernel: .
Mar 11 19:59:05 arch-box kernel: .
Mar 11 19:59:06 arch-box kernel: .
Mar 11 19:59:07 arch-box kernel: .
Mar 11 19:59:08 arch-box kernel: .
Mar 11 19:59:08 arch-box kernel: .
Mar 11 19:59:08 arch-box kernel: ready
Mar 11 19:59:08 arch-box kernel: sd 7:0:0:0: [sdd] 732566645 4096-byte logical blocks: (3.00 TB/2.73 TiB)
Mar 11 19:59:08 arch-box kernel: sd 7:0:0:0: [sdd] Write Protect is off
Mar 11 19:59:08 arch-box kernel: sd 7:0:0:0: [sdd] Mode Sense: 2b 00 10 08
Mar 11 19:59:08 arch-box kernel: sd 7:0:0:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
Mar 11 19:59:08 arch-box kernel:  sdd: sdd1
Mar 11 19:59:08 arch-box kernel: sd 7:0:0:0: [sdd] Attached SCSI disk
Mar 11 20:00:49 arch-box kernel: EXT4-fs (dm-11): mounted filesystem with ordered data mode. Opts: (null)
Mar 11 20:02:52 arch-box kernel: sd 7:0:0:0: [sdd] tag#1 uas_eh_abort_handler 0 uas-tag 2 inflight: IN 
Mar 11 20:02:52 arch-box kernel: sd 7:0:0:0: [sdd] tag#1 CDB: Read(10) 28 00 22 b4 70 a0 00 00 20 00
Mar 11 20:02:52 arch-box kernel: scsi host7: uas_eh_device_reset_handler start
Mar 11 20:02:52 arch-box kernel: usb 5-2: reset SuperSpeed USB device number 5 using xhci_hcd
Mar 11 20:02:52 arch-box kernel: scsi host7: uas_eh_device_reset_handler success
Mar 11 20:05:41 arch-box kernel: sd 7:0:0:0: [sdd] tag#0 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN 
Mar 11 20:05:41 arch-box kernel: sd 7:0:0:0: [sdd] tag#0 CDB: Read(10) 28 00 23 3b 0c e0 00 00 20 00
Mar 11 20:05:41 arch-box kernel: sd 7:0:0:0: [sdd] tag#1 uas_eh_abort_handler 0 uas-tag 2 inflight: CMD IN 
Mar 11 20:05:41 arch-box kernel: sd 7:0:0:0: [sdd] tag#1 CDB: Read(10) 28 00 23 3b 0c c0 00 00 20 00
Mar 11 20:05:41 arch-box kernel: WARNING: CPU: 3 PID: 33 at kernel/rcu/tree.c:2792 rcu_do_batch.isra.28+0x231/0x250
Mar 11 20:05:41 arch-box kernel: Modules linked in: veth nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo br_netfilter fuse xt_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables devlink ip6table_filter ip6_tables tun overlay xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_tcpudp xt_recent xt_mark xt_comment iptable_filter xt_addrtype iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat ip_vs nf_conntrack libcrc32c crc32c_generic 8021q mrp bridge stp llc arc4 nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi nvidia(PO) iwlmvm mac80211 snd_hda_codec_realtek snd_hda_codec_generic intel_rapl iTCO_wdt iTCO_vendor_support mxm_wmi iwlwifi drm_kms_helper x86_pkg_temp_thermal intel_powerclamp drm coretemp kvm_intel cfg80211 kvm joydev input_leds mousedev led_class snd_hda_intel
Mar 11 20:05:41 arch-box kernel:  snd_hda_codec snd_hda_core agpgart snd_hwdep irqbypass rfkill snd_pcm ipmi_devintf intel_cstate e1000e ipmi_msghandler intel_uncore syscopyarea snd_timer sysfillrect intel_rapl_perf sysimgblt fb_sys_fops mei_me snd mei i2c_i801 soundcore pcspkr ptp lpc_ich shpchp pps_core rtc_cmos wmi evdev mac_hid crypto_user ip_tables x_tables ext4 crc16 mbcache jbd2 fscrypto algif_skcipher af_alg uas usb_storage hid_generic usbhid hid sd_mod dm_crypt dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel ahci aes_x86_64 ehci_pci crypto_simd libahci glue_helper xhci_pci libata xhci_hcd ehci_hcd cryptd scsi_mod usbcore usb_common
Mar 11 20:05:41 arch-box kernel: CPU: 3 PID: 33 Comm: rcuc/3 Tainted: P           O     4.15.7-1-ARCH #1
Mar 11 20:05:41 arch-box kernel: Hardware name: MSI MS-7885/X99A RAIDER (MS-7885), BIOS P.50 07/19/2016
Mar 11 20:05:41 arch-box kernel: RIP: 0010:rcu_do_batch.isra.28+0x231/0x250
Mar 11 20:05:41 arch-box kernel: RSP: 0018:ffffa17c81a4fe50 EFLAGS: 00010002
Mar 11 20:05:41 arch-box kernel: RAX: ffffffffffffd800 RBX: ffff9dd36f2e2380 RCX: 00000000002e2801
Mar 11 20:05:41 arch-box kernel: RDX: 0000000000000001 RSI: ffffa17c81a4fe50 RDI: ffff9dd36f2e23b8
Mar 11 20:05:41 arch-box kernel: RBP: ffff9dd36f2e23b8 R08: 000023a910807ee0 R09: ffffffff8d0ee1f9
Mar 11 20:05:41 arch-box kernel: R10: ffffa17c8192fdf0 R11: 0000000000000001 R12: 0000000000000246
Mar 11 20:05:41 arch-box kernel: R13: ffffffff8e054250 R14: ffffffffffffffe3 R15: ffff9dd36c908000
Mar 11 20:05:41 arch-box kernel: FS:  0000000000000000(0000) GS:ffff9dd36f2c0000(0000) knlGS:0000000000000000
Mar 11 20:05:41 arch-box kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 11 20:05:41 arch-box kernel: CR2: 00007f307850a0e0 CR3: 000000012800a003 CR4: 00000000003606e0
Mar 11 20:05:41 arch-box kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 11 20:05:41 arch-box kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 11 20:05:41 arch-box kernel: Call Trace:
Mar 11 20:05:41 arch-box kernel:  ? rcu_cpu_kthread+0x49/0x2d0
Mar 11 20:05:41 arch-box kernel:  ? sort_range+0x20/0x20
Mar 11 20:05:41 arch-box kernel:  rcu_cpu_kthread+0x10d/0x2d0
Mar 11 20:05:41 arch-box kernel:  ? sort_range+0x20/0x20
Mar 11 20:05:41 arch-box kernel:  smpboot_thread_fn+0x19e/0x240
Mar 11 20:05:41 arch-box kernel:  kthread+0x113/0x130
Mar 11 20:05:41 arch-box kernel:  ? kthread_create_on_node+0x70/0x70
Mar 11 20:05:41 arch-box kernel:  ret_from_fork+0x35/0x40
Mar 11 20:05:41 arch-box kernel: Code: 48 83 6c 24 18 01 e9 e8 fe ff ff 48 3b 15 d0 5f f6 00 0f 8f 6b ff ff ff 48 8b 05 d3 5f f6 00 48 89 83 b0 00 00 00 e9 58 ff ff ff <0f> 0b eb 8d 0f 0b e9 65 fe ff ff e8 8f cc f8 ff 0f 1f 44 00 00 
Mar 11 20:05:41 arch-box kernel: ---[ end trace 1dc3fbeec661ea24 ]---
# ps -o pid,ppid,comm,args 33
  PID  PPID COMMAND         COMMAND
   33     2 rcuc/3          [rcuc/3]

As a result, every operation on the external HDD (sdd1) stalls: for instance, ls or umount just blocks forever.

Does anyone know what's going on and what I can do about it?

# uname -a
Linux arch-box 4.15.7-1-ARCH #1 SMP PREEMPT Wed Feb 28 19:01:57 UTC 2018 x86_64 GNU/Linux
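
For anyone reading the log above: the "CDB: Read(10) ..." lines say which blocks the aborted reads were asking for. Here is a minimal sketch of decoding them, assuming the standard READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and the 4096-byte logical block size the kernel reported for sdd. decode_read10 is a hypothetical helper name, not an existing tool.

```shell
# Decode a READ(10) CDB copied from a "CDB: Read(10) ..." kernel log line.
# Usage: decode_read10 <logical_block_size> <10 hex bytes>
# Assumed layout: byte 0 = opcode 0x28, bytes 2-5 = LBA, bytes 7-8 = length in blocks.
decode_read10() {
    block_size=$1
    shift
    [ "$#" -eq 10 ] || { echo "expected 10 CDB bytes" >&2; return 1; }
    [ "$1" = "28" ] || { echo "not a READ(10) opcode: $1" >&2; return 1; }
    lba=$(( 0x$3$4$5$6 ))      # bytes 2-5, big-endian
    blocks=$(( 0x$8$9 ))       # bytes 7-8, big-endian
    echo "lba=$lba blocks=$blocks offset_bytes=$(( lba * block_size )) length_bytes=$(( blocks * block_size ))"
}

# First aborted command from the log (tag#1 at 20:02:52):
decode_read10 4096 28 00 22 b4 70 a0 00 00 20 00
```

For that first aborted command this gives LBA 582250656 and 32 blocks, i.e. a 128 KiB read roughly 2.38 TB into the raw device. The aborted reads are ordinary sequential-looking requests, which fits a transport problem rather than a bad request.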

Last edited by nrz (2018-03-19 21:16:25)

#2 2018-03-11 21:11:46

loqs
Member
Registered: 2014-03-06
Posts: 17,321

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

#3 2018-03-12 07:19:39

nrz
Member
Registered: 2016-07-24
Posts: 19

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

Yes, it seems the old trick still works, or at least I was able to copy ~150 GB of sample data at ~100 MB/s:

$ cat /etc/modprobe.d/Seagate_SRD00F2.conf 
options usb-storage quirks=0x0bc2:0x3312:u
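
A minimal sketch of how such a line can be generated and then checked, assuming the vendor and product IDs come from lsusb. uas_ignore_line is a hypothetical helper name; the trailing "u" flag is the one used above to make usb-storage ignore UAS for that device.

```shell
# Build a modprobe.d line that tells usb-storage to ignore UAS for one device.
# Arguments: vendor ID and product ID (4 hex digits each, as printed by lsusb).
uas_ignore_line() {
    printf 'options usb-storage quirks=0x%s:0x%s:u\n' "$1" "$2"
}

uas_ignore_line 0bc2 3312

# After the module has been reloaded (or after a reboot), two read-only checks:
#   cat /sys/module/usb_storage/parameters/quirks   # should list the entry
#   lsusb -t                                        # the disk should show Driver=usb-storage, not Driver=uas
```

If usb_storage is loaded from the initramfs, the file probably needs to be included there as well before it takes effect at boot.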

Interestingly, in the thread you referred to, Hans de Goede (author of the uas module) mentioned that he had added a quirk for 0bc2:3312 to the module. Presumably these changes did not propagate to the Arch kernel?

#4 2018-03-13 01:17:32

nrz
Member
Registered: 2016-07-24
Posts: 19

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

I was too quick to confirm that there's a fix; unfortunately, it seems there's none:

...
Mar 12 21:03:09 arch-box kernel:  ? commit_timeout+0x10/0x10 [jbd2]
Mar 12 21:03:09 arch-box kernel:  kthread+0x113/0x130
Mar 12 21:03:09 arch-box kernel:  ? kthread_create_on_node+0x70/0x70
Mar 12 21:03:09 arch-box kernel:  ? do_syscall_64+0x18a/0x190
Mar 12 21:03:09 arch-box kernel:  ret_from_fork+0x35/0x40
Mar 12 21:05:12 arch-box kernel: INFO: task jbd2/dm-10-8:4069 blocked for more than 120 seconds.
Mar 12 21:05:12 arch-box kernel:       Tainted: P        W  O     4.15.7-1-ARCH #1
Mar 12 21:05:12 arch-box kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 12 21:05:12 arch-box kernel: jbd2/dm-10-8    D    0  4069      2 0x80000000
Mar 12 21:05:12 arch-box kernel: Call Trace:
Mar 12 21:05:12 arch-box kernel:  ? __schedule+0x24b/0x8c0
Mar 12 21:05:12 arch-box kernel:  ? bit_wait+0x50/0x50
Mar 12 21:05:12 arch-box kernel:  schedule+0x32/0x90
Mar 12 21:05:12 arch-box kernel:  io_schedule+0x12/0x40
Mar 12 21:05:12 arch-box kernel:  bit_wait_io+0xd/0x50
Mar 12 21:05:12 arch-box kernel:  __wait_on_bit+0x44/0x80
Mar 12 21:05:12 arch-box kernel:  ? submit_bio+0x6c/0x140
Mar 12 21:05:12 arch-box kernel:  out_of_line_wait_on_bit+0x91/0xb0
Mar 12 21:05:12 arch-box kernel:  ? bit_waitqueue+0x30/0x30
Mar 12 21:05:12 arch-box kernel:  jbd2_journal_commit_transaction+0xfc9/0x18b0 [jbd2]
Mar 12 21:05:12 arch-box kernel:  ? __update_idle_core+0x20/0xb0
Mar 12 21:05:12 arch-box kernel:  ? kjournald2+0xc0/0x270 [jbd2]
Mar 12 21:05:12 arch-box kernel:  kjournald2+0xc0/0x270 [jbd2]
Mar 12 21:05:12 arch-box kernel:  ? __wake_up_common+0x74/0x120
Mar 12 21:05:12 arch-box kernel:  ? wait_woken+0x80/0x80
Mar 12 21:05:12 arch-box kernel:  ? commit_timeout+0x10/0x10 [jbd2]
Mar 12 21:05:12 arch-box kernel:  kthread+0x113/0x130
Mar 12 21:05:12 arch-box kernel:  ? kthread_create_on_node+0x70/0x70
Mar 12 21:05:12 arch-box kernel:  ? do_syscall_64+0x18a/0x190
Mar 12 21:05:12 arch-box kernel:  ret_from_fork+0x35/0x40
Mar 12 21:07:15 arch-box kernel: INFO: task jbd2/dm-10-8:4069 blocked for more than 120 seconds.
Mar 12 21:07:15 arch-box kernel:       Tainted: P        W  O     4.15.7-1-ARCH #1
Mar 12 21:07:15 arch-box kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 12 21:07:15 arch-box kernel: jbd2/dm-10-8    D    0  4069      2 0x80000000
Mar 12 21:07:15 arch-box kernel: Call Trace:
Mar 12 21:07:15 arch-box kernel:  ? __schedule+0x24b/0x8c0
Mar 12 21:07:15 arch-box kernel:  ? bit_wait+0x50/0x50
Mar 12 21:07:15 arch-box kernel:  schedule+0x32/0x90
Mar 12 21:07:15 arch-box kernel:  io_schedule+0x12/0x40
Mar 12 21:07:15 arch-box kernel:  bit_wait_io+0xd/0x50
Mar 12 21:07:15 arch-box kernel:  __wait_on_bit+0x44/0x80
Mar 12 21:07:15 arch-box kernel:  ? submit_bio+0x6c/0x140
Mar 12 21:07:15 arch-box kernel:  out_of_line_wait_on_bit+0x91/0xb0
Mar 12 21:07:15 arch-box kernel:  ? bit_waitqueue+0x30/0x30
Mar 12 21:07:15 arch-box kernel:  jbd2_journal_commit_transaction+0xfc9/0x18b0 [jbd2]
Mar 12 21:07:15 arch-box kernel:  ? __update_idle_core+0x20/0xb0
Mar 12 21:07:15 arch-box kernel:  ? kjournald2+0xc0/0x270 [jbd2]
Mar 12 21:07:15 arch-box kernel:  kjournald2+0xc0/0x270 [jbd2]
Mar 12 21:07:15 arch-box kernel:  ? __wake_up_common+0x74/0x120
Mar 12 21:07:15 arch-box kernel:  ? wait_woken+0x80/0x80
Mar 12 21:07:15 arch-box kernel:  ? commit_timeout+0x10/0x10 [jbd2]
Mar 12 21:07:15 arch-box kernel:  kthread+0x113/0x130
Mar 12 21:07:15 arch-box kernel:  ? kthread_create_on_node+0x70/0x70
Mar 12 21:07:15 arch-box kernel:  ? do_syscall_64+0x18a/0x190
Mar 12 21:07:15 arch-box kernel:  ret_from_fork+0x35/0x40
Mar 12 21:22:04 arch-box kernel: perf: interrupt took too long (23170 > 21965), lowering kernel.perf_event_max_sample_rate to 8000

Any more ideas?
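
The hung-task reports above show jbd2/dm-10-8 in state D (uninterruptible sleep, waiting on I/O). A small sketch for listing every task stuck in that state, assuming procps ps output with the state in the second column. filter_d_state is a hypothetical helper name.

```shell
# Keep only lines whose second column (process state) starts with "D",
# i.e. tasks in uninterruptible sleep. Reads "pid stat ..." lines on stdin.
filter_d_state() {
    awk '$2 ~ /^D/ { print }'
}

# Typical use on a live system:
#   ps -eo pid,stat,wchan:32,comm | filter_d_state
# With SysRq enabled, root can also ask the kernel to dump blocked tasks to the log:
#   echo w > /proc/sysrq-trigger
```

If ls, umount and the jbd2 thread all show up here at once, that suggests they are all waiting on the same stalled device rather than on separate problems.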

#5 2018-03-13 08:36:36

seth
Member
Registered: 2012-09-03
Posts: 51,017

#6 2018-03-13 08:55:58

nrz
Member
Registered: 2016-07-24
Posts: 19

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

Yes, I have ext4 layered on top of cryptsetup.

I'll try downgrading to v4.15.3 to see if that makes any difference.

#7 2018-03-13 09:27:17

loqs
Member
Registered: 2014-03-06
Posts: 17,321

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

You could try 4.16-rc5 or apply https://git.kernel.org/pub/scm/linux/ke … 2bcd8e305b to 4.15.9
Edit: the patch is also queued for 4.15.10 https://git.kernel.org/pub/scm/linux/ke … fbff07ef47
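
A quick sketch for checking whether the running kernel is at least 4.15.10, assuming GNU sort with -V (version sort) is available. version_at_least is a hypothetical helper name, and the comparison ignores everything after the first "-" in the release string.

```shell
# Succeeds if version $1 is at least version $2, using GNU sort -V ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

running=$(uname -r)                     # e.g. 4.15.7-1-ARCH
if version_at_least "${running%%-*}" 4.15.10; then
    echo "kernel ${running} is at or past 4.15.10"
else
    echo "kernel ${running} predates 4.15.10"
fi
```

This only compares version numbers; it cannot tell whether a distribution kernel carries the patch as a backport.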

Last edited by loqs (2018-03-13 19:11:30)

#8 2018-03-19 07:43:51

ezacaria
Member
Registered: 2007-12-10
Posts: 113

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

I see this on the eSATA port (the motherboard is an ASUS Rampage IV) but not on the USB 3 port (which has no UAS support). I am using a Seagate disk, "ST2000LM 003 HN-M201RAD", in an enclosure with both eSATA and USB 3.

As a workaround, perhaps your motherboard has a USB 3 port without UAS support?

#9 2018-03-19 09:29:04

loqs
Member
Registered: 2014-03-06
Posts: 17,321

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

@ezacaria Is the issue not fixed with 4.15.10 on your system?

#10 2018-03-19 19:18:39

ezacaria
Member
Registered: 2007-12-10
Posts: 113

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

Thanks for pointing that out. It seems that the 4.15.10 kernel became available between my previous system update and today's post. I am glad to report that the eSATA port's behaviour is seemingly back to normal :)

#11 2018-03-19 21:04:21

nrz
Member
Registered: 2016-07-24
Posts: 19

Re: [SOLVED] Kernel RCU crash - Seagate 0bc2:3312

I just moved around over 70 GB with no problem. Before v4.15.10 the HDD would lock up at ~30 GB. So the kernel upgrade has resolved it for USB 3 too.
