Problem:
Random system freezes on USB device/dock disconnect. Not reproducible on demand yet (it doesn't freeze every time).
...
Feb 14 23:30:35 <hostname> kernel: usb 3-5: USB disconnect, device number 14
Feb 14 23:30:36 <hostname> kernel: BUG: kernel NULL pointer dereference, address: 0000000000000030
Feb 14 23:30:36 <hostname> kernel: #PF: supervisor write access in kernel mode
Feb 14 23:30:36 <hostname> kernel: #PF: error_code(0x0002) - not-present page
Feb 14 23:30:36 <hostname> kernel: PGD 0 P4D 0
Feb 14 23:30:36 <hostname> kernel: Oops: 0002 [#1] PREEMPT SMP NOPTI
Feb 14 23:30:36 <hostname> kernel: CPU: 0 PID: 7 Comm: kworker/0:0 Not tainted 6.1.11-arch1-1 #1 a4e1ab564378dba05cc0d5c9f99dce3dc67f88f0
Feb 14 23:30:36 <hostname> kernel: Hardware name: LENOVO 21BWS37K00/21BWS37K00, BIOS N3MET11W (1.10 ) 12/07/2022
Feb 14 23:30:36 <hostname> kernel: Workqueue: kacpi_notify acpi_os_execute_deferred
Feb 14 23:30:36 <hostname> kernel: RIP: 0010:queue_work_on+0x19/0x50
Feb 14 23:30:36 <hostname> kernel: Code: ff e9 c5 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 53 9c 58 0f 1f 40 00 48 89 c3 fa 0f 1f 44 00 00 <f0> 48 0f ba 2a 00 73 15 31 c9 80 e7 02 74 06 fb 0f 1f >
Feb 14 23:30:36 <hostname> kernel: RSP: 0000:ffffbd07400ffe38 EFLAGS: 00010006
Feb 14 23:30:36 <hostname> kernel: RAX: 0000000000000206 RBX: 0000000000000206 RCX: 0000000000000000
Feb 14 23:30:36 <hostname> kernel: RDX: 0000000000000030 RSI: ffff9f5a80051000 RDI: 0000000000000140
Feb 14 23:30:36 <hostname> kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000080380026
Feb 14 23:30:36 <hostname> kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9f61bf43ba00
Feb 14 23:30:36 <hostname> kernel: R13: 0000000000000000 R14: ffff9f5a80212b40 R15: ffff9f5a86468c98
Feb 14 23:30:36 <hostname> kernel: FS: 0000000000000000(0000) GS:ffff9f61bf400000(0000) knlGS:0000000000000000
Feb 14 23:30:36 <hostname> kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 14 23:30:36 <hostname> kernel: CR2: 0000000000000030 CR3: 00000004d8210001 CR4: 0000000000f70ef0
Feb 14 23:30:36 <hostname> kernel: PKRU: 55555554
Feb 14 23:30:36 <hostname> kernel: Call Trace:
Feb 14 23:30:36 <hostname> kernel: <TASK>
Feb 14 23:30:36 <hostname> kernel: ucsi_acpi_notify+0xac/0xc0 [ucsi_acpi 9c6a23f21d3ec74e5573e3e6b769395c9b0898ad]
Feb 14 23:30:36 <hostname> kernel: acpi_ev_notify_dispatch+0x4b/0x63
Feb 14 23:30:36 <hostname> kernel: acpi_os_execute_deferred+0x17/0x30
Feb 14 23:30:36 <hostname> kernel: process_one_work+0x1c4/0x380
Feb 14 23:30:36 <hostname> kernel: worker_thread+0x51/0x390
Feb 14 23:30:36 <hostname> kernel: ? rescuer_thread+0x3b0/0x3b0
Feb 14 23:30:36 <hostname> kernel: kthread+0xdb/0x110
Feb 14 23:30:36 <hostname> kernel: ? kthread_complete_and_exit+0x20/0x20
Feb 14 23:30:36 <hostname> kernel: ret_from_fork+0x1f/0x30
Feb 14 23:30:36 <hostname> kernel: </TASK>
Feb 14 23:30:36 <hostname> kernel: Modules linked in: tls authenc echainiv esp4 ccm rfcomm snd_seq_dummy snd_hrtimer snd_seq cmac algif_hash algif_skcipher af_alg r8153_ecm cdc_ether usbnet r8152 mii snd_usb_audio snd_usbmi>
Feb 14 23:30:36 <hostname> kernel: snd_soc_core snd_compress ac97_bus iwlmvm intel_tcc_cooling snd_pcm_dmaengine x86_pkg_temp_thermal i915 intel_powerclamp snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi hid_multitouch c>
Feb 14 23:30:36 <hostname> kernel: i2c_hid int3400_thermal intel_hid acpi_tad acpi_pad wmi acpi_thermal_rel sparse_keymap mac_hid crypto_user fuse ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6>
Feb 14 23:30:36 <hostname> kernel: CR2: 0000000000000030
Feb 14 23:30:36 <hostname> kernel: ---[ end trace 0000000000000000 ]---
Feb 14 23:30:36 <hostname> kernel: RIP: 0010:queue_work_on+0x19/0x50
Feb 14 23:30:36 <hostname> kernel: Code: ff e9 c5 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 53 9c 58 0f 1f 40 00 48 89 c3 fa 0f 1f 44 00 00 <f0> 48 0f ba 2a 00 73 15 31 c9 80 e7 02 74 06 fb 0f 1f >
Feb 14 23:30:36 <hostname> kernel: RSP: 0000:ffffbd07400ffe38 EFLAGS: 00010006
Feb 14 23:30:36 <hostname> kernel: RAX: 0000000000000206 RBX: 0000000000000206 RCX: 0000000000000000
Feb 14 23:30:36 <hostname> kernel: RDX: 0000000000000030 RSI: ffff9f5a80051000 RDI: 0000000000000140
Feb 14 23:30:36 <hostname> kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000080380026
Feb 14 23:30:36 <hostname> kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9f61bf43ba00
Feb 14 23:30:36 <hostname> kernel: R13: 0000000000000000 R14: ffff9f5a80212b40 R15: ffff9f5a86468c98
Feb 14 23:30:36 <hostname> kernel: FS: 0000000000000000(0000) GS:ffff9f61bf400000(0000) knlGS:0000000000000000
Feb 14 23:30:36 <hostname> kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 14 23:30:36 <hostname> kernel: CR2: 0000000000000030 CR3: 00000004d8210001 CR4: 0000000000f70ef0
Feb 14 23:30:36 <hostname> kernel: PKRU: 55555554
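For reading the oops above: the "#PF: error_code(0x0002)" line can be decoded bit by bit. A minimal Python sketch, assuming the usual x86 page-fault error code layout (bit 0 = protection violation vs. not-present page, bit 1 = write, bit 2 = user access, bit 4 = instruction fetch). The function name is made up for illustration.

```python
# Decode the low bits of an x86 page-fault error code, as printed in the
# "#PF: error_code(...)" line of a kernel oops.
# Assumption: standard x86 bit layout; higher bits are ignored here.
PF_PROT = 1 << 0   # 0: not-present page, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access,      1: write access
PF_USER = 1 << 2   # 0: supervisor,       1: user access
PF_INSTR = 1 << 4  # 1: fault was an instruction fetch


def decode_pf_error_code(code):
    """Return a small dict describing the fault (hypothetical helper)."""
    return {
        "cause": "protection violation" if code & PF_PROT else "not-present page",
        "access": "write" if code & PF_WRITE else "read",
        "privilege": "user" if code & PF_USER else "supervisor",
        "instruction_fetch": bool(code & PF_INSTR),
    }


if __name__ == "__main__":
    # Value taken from the log in this post.
    print(decode_pf_error_code(0x0002))
```

For 0x0002 this yields a supervisor write to a not-present page, which agrees with the two "#PF:" lines in the log. CR2 and RDX are both 0x30 there, so the write went to a small offset from a NULL pointer.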
Spec:
- 6.1.11-arch1-1
- GNOME Wayland
- LENOVO ThinkPad T16 Gen 1 / 12th Gen Intel(R) Core(TM) i7-1260P
Any ideas?
Edit: removed "on update", since the problem existed before too
Last edited by dcy3rka (2023-02-23 07:30:10)
Offline
Looks a lot like https://bugs.archlinux.org/task/75666
Unresolved, though.
Edit:
debian, https://groups.google.com/g/linux.debia … xZu0?pli=1
ubuntu/gnome, https://gitlab.gnome.org/GNOME/gnome-sh … ssues/6001
Does the undocking condition match what you are seeing?
Edit: and in case this is a regression, what was your previous kernel version and does it happen w/ the lts kernel as well?
Last edited by seth (2023-02-15 16:46:07)
Online
After some days of testing and analyzing I can say the following:
- It's hard to reproduce. On some devices it happens quite often, on others never (maybe because of different usage patterns)
- It's always connected to unplugging a dock or USB device
- Our first freeze like this happened on Jan 5, 2023, but other reports say they have been affected since kernel 5.19
And yes, your links report the same issue. Here is another one from a Debian mailing list:
https://groups.google.com/g/linux.debia … xZu0?pli=1
Offline
The output in #1 is copied and pasted from the pager (please don't do that in general, it truncates long lines)
Can you please post the output of
lsmod
?
Edit: ideally also more context of the oops.
https://ask.fedoraproject.org/t/system- … rage/31881
https://bbs.archlinux.org/viewtopic.php?id=238513
https://bugs.archlinux.org/task/59096
On a guess: blacklist thinkpad_acpi, https://wiki.archlinux.org/title/Kernel … acklisting
Last edited by seth (2023-02-21 15:12:14)
Online
Oh sorry, I overlooked that...
Here is my lsmod output:
# lsmod
Module Size Used by
tls 135168 0
authenc 16384 2
echainiv 16384 2
esp4 32768 2
ccm 20480 0
rfcomm 94208 4
snd_seq_dummy 16384 0
snd_hrtimer 16384 1
snd_seq 94208 7 snd_seq_dummy
cmac 16384 3
algif_hash 16384 1
algif_skcipher 16384 1
af_alg 36864 6 algif_hash,algif_skcipher
r8153_ecm 16384 0
cdc_ether 24576 1 r8153_ecm
usbnet 57344 2 r8153_ecm,cdc_ether
snd_usb_audio 397312 2
r8152 143360 1 r8153_ecm
snd_usbmidi_lib 45056 1 snd_usb_audio
mii 16384 2 usbnet,r8152
snd_rawmidi 49152 1 snd_usbmidi_lib
snd_seq_device 16384 2 snd_seq,snd_rawmidi
snd_ctl_led 24576 0
snd_soc_skl_hda_dsp 24576 4
snd_soc_intel_hda_dsp_common 20480 1 snd_soc_skl_hda_dsp
snd_soc_hdac_hdmi 45056 1 snd_soc_skl_hda_dsp
snd_sof_probes 24576 0
bnep 32768 2
snd_hda_codec_hdmi 86016 1
snd_hda_codec_realtek 172032 1
snd_hda_codec_generic 98304 1 snd_hda_codec_realtek
uvcvideo 163840 8
videobuf2_vmalloc 20480 1 uvcvideo
btusb 65536 0
btrtl 28672 1 btusb
videobuf2_memops 20480 1 videobuf2_vmalloc
btbcm 24576 1 btusb
videobuf2_v4l2 40960 1 uvcvideo
btintel 45056 1 btusb
videobuf2_common 86016 4 videobuf2_vmalloc,videobuf2_v4l2,uvcvideo,videobuf2_memops
btmtk 16384 1 btusb
videodev 319488 7 videobuf2_v4l2,uvcvideo,videobuf2_common
bluetooth 937984 36 btrtl,btmtk,btintel,btbcm,bnep,btusb,rfcomm
mc 77824 9 videodev,snd_usb_audio,videobuf2_v4l2,uvcvideo,videobuf2_common
ecdh_generic 16384 2 bluetooth
crc16 16384 1 bluetooth
nft_reject_inet 16384 1
nf_reject_ipv4 16384 1 nft_reject_inet
nf_reject_ipv6 20480 1 nft_reject_inet
nft_reject 16384 1 nft_reject_inet
nft_limit 16384 1
snd_soc_dmic 16384 1
snd_sof_pci_intel_tgl 16384 0
snd_sof_intel_hda_common 221184 1 snd_sof_pci_intel_tgl
nft_ct 24576 2
soundwire_intel 57344 1 snd_sof_intel_hda_common
nf_conntrack 184320 1 nft_ct
soundwire_generic_allocation 16384 1 soundwire_intel
soundwire_cadence 45056 1 soundwire_intel
nf_defrag_ipv6 24576 1 nf_conntrack
nf_defrag_ipv4 16384 1 nf_conntrack
snd_sof_intel_hda 20480 1 snd_sof_intel_hda_common
snd_sof_pci 24576 2 snd_sof_intel_hda_common,snd_sof_pci_intel_tgl
snd_sof_xtensa_dsp 20480 1 snd_sof_intel_hda_common
snd_sof 339968 3 snd_sof_pci,snd_sof_intel_hda_common,snd_sof_probes
snd_sof_utils 20480 1 snd_sof
snd_soc_hdac_hda 28672 1 snd_sof_intel_hda_common
iwlmvm 532480 0
snd_hda_ext_core 36864 3 snd_sof_intel_hda_common,snd_soc_hdac_hdmi,snd_soc_hdac_hda
snd_soc_acpi_intel_match 69632 2 snd_sof_intel_hda_common,snd_sof_pci_intel_tgl
joydev 28672 0
snd_soc_acpi 16384 2 snd_soc_acpi_intel_match,snd_sof_intel_hda_common
soundwire_bus 126976 3 soundwire_intel,soundwire_generic_allocation,soundwire_cadence
intel_tcc_cooling 16384 0
mac80211 1314816 1 iwlmvm
snd_soc_core 393216 8 soundwire_intel,snd_sof,snd_sof_intel_hda_common,snd_soc_hdac_hdmi,snd_soc_hdac_hda,snd_sof_probes,snd_soc_dmic,snd_soc_skl_hda_dsp
nf_tables 286720 29 nft_ct,nft_reject_inet,nft_limit,nft_reject
snd_compress 28672 2 snd_soc_core,snd_sof_probes
ac97_bus 16384 1 snd_soc_core
x86_pkg_temp_thermal 20480 0
mousedev 24576 0
snd_pcm_dmaengine 16384 1 snd_soc_core
nfnetlink 20480 1 nf_tables
libarc4 16384 1 mac80211
intel_powerclamp 20480 0
coretemp 20480 0
snd_hda_intel 61440 0
snd_intel_dspcfg 36864 3 snd_hda_intel,snd_sof,snd_sof_intel_hda_common
i915 3481600 40
snd_intel_sdw_acpi 20480 2 snd_sof_intel_hda_common,snd_intel_dspcfg
snd_hda_codec 188416 8 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec_realtek,snd_soc_intel_hda_dsp_common,snd_soc_hdac_hda,snd_sof_intel_hda,snd_soc_skl_hda_dsp
iTCO_wdt 16384 0
hid_multitouch 32768 0
kvm_intel 393216 0
snd_hda_core 118784 11 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_ext_core,snd_hda_codec,snd_hda_codec_realtek,snd_soc_intel_hda_dsp_common,snd_sof_intel_hda_common,snd_soc_hdac_hdmi,snd_soc_hdac_hda,snd_sof_intel_hda
mei_hdcp 24576 0
mei_pxp 20480 0
intel_pmc_bxt 16384 1 iTCO_wdt
nxp_nci_i2c 20480 0
processor_thermal_device_pci 16384 0
drm_buddy 20480 1 i915
nxp_nci 16384 1 nxp_nci_i2c
iTCO_vendor_support 16384 1 iTCO_wdt
intel_rapl_msr 20480 0
iwlwifi 491520 1 iwlmvm
pmt_telemetry 16384 0
processor_thermal_device 20480 1 processor_thermal_device_pci
snd_hwdep 16384 2 snd_usb_audio,snd_hda_codec
think_lmi 40960 0
kvm 1146880 1 kvm_intel
pmt_class 16384 1 pmt_telemetry
firmware_attributes_class 16384 1 think_lmi
wmi_bmof 16384 0
irqbypass 16384 1 kvm
rapl 16384 0
intel_cstate 20480 0
thinkpad_acpi 184320 0
intel_uncore 217088 0
psmouse 212992 0
pcspkr 16384 0
ucsi_acpi 16384 0
processor_thermal_rfim 16384 1 processor_thermal_device
snd_pcm 172032 14 snd_hda_codec_hdmi,snd_hda_intel,snd_usb_audio,snd_hda_codec,soundwire_intel,snd_sof,snd_sof_intel_hda_common,snd_soc_hdac_hdmi,snd_compress,snd_soc_core,snd_sof_utils,snd_hda_core,snd_pcm_dmaengine
ttm 94208 1 i915
vfat 24576 1
ledtrig_audio 16384 3 snd_ctl_led,snd_hda_codec_generic,thinkpad_acpi
nci 86016 2 nxp_nci,nxp_nci_i2c
processor_thermal_mbox 16384 2 processor_thermal_rfim,processor_thermal_device
intel_lpss_pci 28672 0
mei_me 57344 2
i2c_i801 45056 0
spi_nor 118784 0
typec_ucsi 53248 1 ucsi_acpi
drm_display_helper 212992 1 i915
platform_profile 16384 1 thinkpad_acpi
snd_timer 49152 3 snd_seq,snd_hrtimer,snd_pcm
intel_lpss 16384 1 intel_lpss_pci
processor_thermal_rapl 20480 1 processor_thermal_device
nfc 143360 2 nci,nxp_nci
cfg80211 1126400 3 iwlmvm,iwlwifi,mac80211
fat 98304 1 vfat
i2c_hid_acpi 16384 0
int3400_thermal 20480 0
cec 81920 2 drm_display_helper,i915
typec 90112 1 typec_ucsi
intel_rapl_common 32768 2 intel_rapl_msr,processor_thermal_rapl
snd 131072 32 snd_ctl_led,snd_hda_codec_generic,snd_seq,snd_seq_device,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_usb_audio,snd_usbmidi_lib,snd_hda_codec,snd_hda_codec_realtek,snd_sof,snd_timer,snd_soc_hdac_hdmi,snd_compress,thinkpad_acpi,snd_soc_core,snd_pcm,snd_rawmidi
mtd 94208 1 spi_nor
int3403_thermal 20480 0
e1000e 331776 0
i2c_smbus 20480 1 i2c_i801
mei 176128 5 mei_hdcp,mei_pxp,mei_me
video 65536 2 thinkpad_acpi,i915
idma64 20480 0
thunderbolt 401408 0
intel_gtt 28672 1 i915
roles 16384 1 typec_ucsi
igen6_edac 32768 0
intel_vsec 20480 0
i2c_hid 40960 1 i2c_hid_acpi
soundcore 16384 2 snd_ctl_led,snd
int340x_thermal_zone 20480 2 int3403_thermal,processor_thermal_device
rfkill 32768 8 iwlmvm,nfc,bluetooth,thinkpad_acpi,cfg80211
acpi_thermal_rel 16384 1 int3400_thermal
intel_hid 28672 0
acpi_tad 20480 0
acpi_pad 24576 0
wmi 45056 3 video,wmi_bmof,think_lmi
sparse_keymap 16384 1 intel_hid
mac_hid 16384 0
crypto_user 24576 0
fuse 176128 5
ip_tables 36864 0
x_tables 57344 1 ip_tables
btrfs 1941504 1
blake2b_generic 20480 0
libcrc32c 16384 3 nf_conntrack,btrfs,nf_tables
crc32c_generic 16384 0
xor 24576 1 btrfs
raid6_pq 122880 1 btrfs
dm_crypt 61440 1
cbc 16384 0
encrypted_keys 28672 1 dm_crypt
trusted 53248 2 encrypted_keys,dm_crypt
asn1_encoder 16384 1 trusted
tee 36864 1 trusted
hid_cmedia 16384 0
usbhid 73728 0
dm_mod 192512 3 dm_crypt
crct10dif_pclmul 16384 1
crc32_pclmul 16384 0
serio_raw 20480 0
crc32c_intel 24576 2
atkbd 36864 0
polyval_clmulni 16384 0
polyval_generic 16384 1 polyval_clmulni
libps2 20480 2 atkbd,psmouse
gf128mul 16384 1 polyval_generic
vivaldi_fmap 16384 1 atkbd
ghash_clmulni_intel 16384 0
sha512_ssse3 53248 1
nvme 61440 2
aesni_intel 393216 10
crypto_simd 16384 1 aesni_intel
nvme_core 208896 3 nvme
spi_intel_pci 16384 0
xhci_pci 24576 0
cryptd 24576 6 crypto_simd,ghash_clmulni_intel
spi_intel 32768 1 spi_intel_pci
xhci_pci_renesas 24576 1 xhci_pci
nvme_common 24576 1 nvme_core
i8042 49152 0
serio 28672 6 serio_raw,atkbd,psmouse,i8042
I will test with blacklisted thinkpad_acpi.
Offline
Nope, with blacklisted thinkpad_acpi the freeze is also present
Offline
What kernel was in use before the update? If you downgrade only the kernel packages to the pre-update versions is the issue still present?
Offline
Did you ensure thinkpad_acpi wasn't loaded despite being blacklisted (it gets pulled by some other modules)?
Did you use the /bin/true method? Or module_blacklist=thinkpad_acpi?
Online
@loqs: I thought it started with an update, but that seems to be wrong. Some devices had freezes before, too. Unfortunately, I cannot reproduce it on demand; it happens randomly.
@seth: I used the /etc/modprobe.d/*.conf variant and checked after reboot with lsmod if thinkpad_acpi was loaded.
Offline
Which means the test was not in effect.
use the /bin/true method … or module_blacklist=thinkpad_acpi
As long as "lsmod | grep thinkpad_acpi" shows that it's loaded, it remains the potential offender.
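One caveat with the grep: it also matches lines where thinkpad_acpi only shows up in the "Used by" column of another module. A minimal Python sketch that checks the first column only, assuming the three-or-four-column lsmod layout posted earlier in this thread. The function names are hypothetical, and the sample text is cut down from that lsmod output.

```python
# Check whether a module is loaded by exact match on the first column of
# lsmod output, so "Used by" mentions do not count as a hit.
def loaded_modules(lsmod_text):
    """Return the set of module names found in lsmod output."""
    names = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and a pasted "# lsmod" prompt.
        if not fields or fields[0] in ("Module", "#"):
            continue
        names.add(fields[0])
    return names


def is_loaded(module, lsmod_text):
    """True if `module` itself is loaded (dashes are treated as underscores)."""
    return module.replace("-", "_") in loaded_modules(lsmod_text)


# Lines copied from the lsmod output in this thread.
SAMPLE = """\
Module                  Size  Used by
thinkpad_acpi         184320  0
ucsi_acpi              16384  0
platform_profile       16384  1 thinkpad_acpi
typec_ucsi             53248  1 ucsi_acpi
"""

if __name__ == "__main__":
    print(is_loaded("thinkpad_acpi", SAMPLE))
```

On a live system the text would come from running lsmod (or reading /proc/modules, whose first column is also the module name).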
Online
Sorry for the ambiguity. I checked with "lsmod | grep thinkpad_acpi" if it was loaded and it was not.
Offline
You could address the immediate cause and blacklist ucsi_acpi, but idk how the xhci stack will react to that, so ideally do this via the kernel commandline editor of your bootloader.
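To confirm after reboot that the parameter really reached the kernel, the command line can be checked for module_blacklist=. A rough Python sketch, assuming the comma-separated module_blacklist=a,b form. The helper name and the sample command line are made up for illustration; on a real system the string would be the content of /proc/cmdline.

```python
# Collect the modules named in module_blacklist= on a kernel command line.
def blacklisted_on_cmdline(cmdline):
    """Return the set of modules listed in module_blacklist= tokens.

    Assumption: if the parameter appears more than once, this sketch simply
    collects all of them; how the kernel treats repeats is not checked here.
    """
    modules = set()
    for token in cmdline.split():
        if token.startswith("module_blacklist="):
            value = token.split("=", 1)[1]
            modules.update(name for name in value.split(",") if name)
    return modules


# Hypothetical command line, not taken from the affected machine.
EXAMPLE_CMDLINE = "root=/dev/mapper/root rw quiet module_blacklist=ucsi_acpi"

if __name__ == "__main__":
    print("ucsi_acpi" in blacklisted_on_cmdline(EXAMPLE_CMDLINE))
```

This only shows that the blacklist was requested; whether the module stayed out still has to be checked against lsmod.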
Online
The same bug is described here and should be fixed upstream:
https://bugzilla.kernel.org/show_bug.cgi?id=217106
https://lore.kernel.org/stable/09ae066c … hat.com/T/
Offline
Looks like the commits are queued here https://git.kernel.org/pub/scm/linux/ke … =usb-linus waiting for the next usb pull to mainline.
https://git.kernel.org/pub/scm/linux/ke … 68fb182764
https://git.kernel.org/pub/scm/linux/ke … e8ede6a22c
https://git.kernel.org/pub/scm/linux/ke … 030dc6934d
Edit:
Two of the fixes are now queued for 6.2.9
https://git.kernel.org/pub/scm/linux/ke … 4a18a320ee
https://git.kernel.org/pub/scm/linux/ke … 4a18a320ee
Last edited by loqs (2023-03-28 21:40:31)
Offline
It looks like the problem is solved in 6.2.9-arch1-1
Which contains:
usb: ucsi_acpi: Increase the command completion timeout
usb: ucsi: Fix NULL pointer deref in ucsi_connector_change()
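For checking several machines, a small Python sketch that compares a kernel release string against 6.2.9. It only encodes what this thread reports (problem seen on 6.1.11-arch1-1, solved in 6.2.9-arch1-1); other branches such as LTS may carry the fixes under different version numbers, which this check does not know about. The function names are hypothetical.

```python
import re


def kernel_version_tuple(release):
    """Parse '6.2.9-arch1-1' into (6, 2, 9); a missing patch level counts as 0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) if part else 0 for part in match.groups())


def at_least(release, minimum=(6, 2, 9)):
    """True if `release` is at or above `minimum` (default from this thread)."""
    return kernel_version_tuple(release) >= minimum


if __name__ == "__main__":
    for release in ("6.1.11-arch1-1", "6.2.9-arch1-1"):
        print(release, at_least(release))
```

On a running system the release string is what `uname -r` prints, or `platform.release()` from Python.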
Offline