
#1 2023-01-22 13:32:20

fancieux
Member
Registered: 2023-01-22
Posts: 6

Laptop crash after suspending: kernel NULL pointer dereference

With the standard linux kernel, my laptop crashes every time it resumes from suspend. The corresponding journal output follows.

Jan 22 20:17:29 Archlinux kernel: BUG: kernel NULL pointer dereference, address: 0000000000000078
Jan 22 20:17:29 Archlinux kernel: #PF: supervisor read access in kernel mode
Jan 22 20:17:29 Archlinux kernel: #PF: error_code(0x0000) - not-present page
Jan 22 20:17:29 Archlinux kernel: PGD 0 P4D 0 
Jan 22 20:17:29 Archlinux kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Jan 22 20:17:29 Archlinux kernel: CPU: 2 PID: 178 Comm: kworker/2:2 Tainted: G           OE      6.1.7-arch1-1 #1 a2d6f1dcaa775aaae1f25aaf758ae968e3493665
Jan 22 20:17:29 Archlinux kernel: Hardware name: LENOVO 81HX/LNVNB161216, BIOS 6UCN53WW(V4.08) 09/26/2018
Jan 22 20:17:29 Archlinux kernel: Workqueue: events_long ucsi_resume_work [typec_ucsi]
Jan 22 20:17:29 Archlinux kernel: RIP: 0010:ucsi_resume_work+0x32/0x80 [typec_ucsi]
Jan 22 20:17:29 Archlinux kernel: Code: 00 55 31 c9 31 d2 53 48 8b b7 a0 00 00 00 48 89 fb 48 83 ef 38 48 83 ce 05 e8 aa f6 ff ff 85 c0 0f 88 32 20 00 00 48 8b 5b f8 <48> 83 7b 78 00 74 38 48 8d 6b 10 48 89 ef e8 eb 33 75 ca 31 c9 48
Jan 22 20:17:29 Archlinux kernel: RSP: 0000:ffffb5598107be80 EFLAGS: 00010246
Jan 22 20:17:29 Archlinux kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
Jan 22 20:17:29 Archlinux kernel: RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff8ad506619cb8
Jan 22 20:17:29 Archlinux kernel: RBP: ffff8ad669eb2840 R08: 0000000000000001 R09: 0000000000000000
Jan 22 20:17:29 Archlinux kernel: R10: 0000000000000002 R11: 0000000000000000 R12: ffff8ad669eb8b00
Jan 22 20:17:29 Archlinux kernel: R13: 0000000000000000 R14: ffff8ad5019f7840 R15: ffff8ad506619c40
Jan 22 20:17:29 Archlinux kernel: FS:  0000000000000000(0000) GS:ffff8ad669e80000(0000) knlGS:0000000000000000
Jan 22 20:17:29 Archlinux kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 22 20:17:29 Archlinux kernel: CR2: 0000000000000078 CR3: 0000000203c10001 CR4: 00000000003706e0
Jan 22 20:17:29 Archlinux kernel: Call Trace:
Jan 22 20:17:29 Archlinux kernel:  <TASK>
Jan 22 20:17:29 Archlinux kernel:  process_one_work+0x1c4/0x380
Jan 22 20:17:29 Archlinux kernel:  worker_thread+0x51/0x390
Jan 22 20:17:29 Archlinux kernel:  ? rescuer_thread+0x3b0/0x3b0
Jan 22 20:17:29 Archlinux kernel:  kthread+0xdb/0x110
Jan 22 20:17:29 Archlinux kernel:  ? kthread_complete_and_exit+0x20/0x20
Jan 22 20:17:29 Archlinux kernel:  ret_from_fork+0x1f/0x30
Jan 22 20:17:29 Archlinux kernel:  </TASK>
Jan 22 20:17:29 Archlinux kernel: Modules linked in: nft_chain_nat xt_REDIRECT nf_nat nf_conntrack xt_mark nft_compat nf_tables libcrc32c nfnetlink xt_TPROXY nf_tproxy_ipv6 nf_tproxy_ipv4 snd_soc_avs nf_defrag_ipv6 snd_soc_hda_codec nf_defrag_ipv4 snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match ccm snd_soc_acpi algif_aead snd_hda_codec_hdmi snd_soc_core cbc intel_tcc_cooling snd_hda_codec_conexant snd_compress x86_pkg_temp_thermal snd_hda_codec_generic intel_powerclamp crct10dif_pclmul ledtrig_audio des_generic ac97_bus hid_logitech_hidpp snd_pcm_dmaengine libdes crc32_pclmul bnep polyval_clmulni ecb snd_hda_intel iTCO_wdt polyval_generic gf128mul ath10k_pci snd_intel_dspcfg joydev ghash_clmulni_intel algif_skcipher snd_intel_sdw_acpi sha512_ssse3 intel_pmc_bxt snd_hda_codec cmac mousedev ath10k_core iTCO_vendor_support serio_raw aesni_intel ath md4 snd_hda_core atkbd crypto_simd 8021q libps2 snd_hwdep mei_pxp mei_hdcp btusb intel_rapl_msr garp
Jan 22 20:17:29 Archlinux kernel:  hid_logitech_dj vivaldi_fmap cryptd snd_pcm algif_hash btrtl mrp mac80211 btbcm stp rapl uvcvideo af_alg coretemp llc libarc4 r8169 intel_cstate videobuf2_vmalloc btintel snd_timer realtek videobuf2_memops intel_uncore btmtk wmi_bmof intel_wmi_thunderbolt processor_thermal_device_pci_legacy snd i2c_i801 videobuf2_v4l2 mdio_devres processor_thermal_device i2c_smbus soundcore ucsi_acpi cfg80211 libphy bluetooth videobuf2_common processor_thermal_rfim mei_me intel_lpss_pci typec_ucsi vfat processor_thermal_mbox intel_lpss mei processor_thermal_rapl i2c_hid_acpi fat idma64 videodev i2c_hid ecdh_generic typec intel_xhci_usb_role_switch intel_rapl_common mc usbhid intel_pch_thermal intel_soc_dts_iosf roles elan_i2c ideapad_laptop sparse_keymap platform_profile rfkill int3403_thermal int340x_thermal_zone i8042 serio int3400_thermal acpi_thermal_rel soc_button_array acpi_pad mac_hid vmmon(OE) vmw_vmci pkcs8_key_parser dm_multipath crypto_user fuse bpf_preload ip_tables x_tables
Jan 22 20:17:29 Archlinux kernel:  ext4 crc32c_generic crc16 mbcache jbd2 dm_mod nvme nvme_core crc32c_intel xhci_pci nvme_common xhci_pci_renesas i915 drm_buddy intel_gtt video wmi drm_display_helper cec ttm
Jan 22 20:17:29 Archlinux kernel: CR2: 0000000000000078
Jan 22 20:17:29 Archlinux kernel: ---[ end trace 0000000000000000 ]---
Jan 22 20:17:29 Archlinux kernel: RIP: 0010:ucsi_resume_work+0x32/0x80 [typec_ucsi]
Jan 22 20:17:29 Archlinux kernel: Code: 00 55 31 c9 31 d2 53 48 8b b7 a0 00 00 00 48 89 fb 48 83 ef 38 48 83 ce 05 e8 aa f6 ff ff 85 c0 0f 88 32 20 00 00 48 8b 5b f8 <48> 83 7b 78 00 74 38 48 8d 6b 10 48 89 ef e8 eb 33 75 ca 31 c9 48
Jan 22 20:17:29 Archlinux kernel: RSP: 0000:ffffb5598107be80 EFLAGS: 00010246
Jan 22 20:17:29 Archlinux kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
Jan 22 20:17:29 Archlinux kernel: RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff8ad506619cb8
Jan 22 20:17:29 Archlinux kernel: RBP: ffff8ad669eb2840 R08: 0000000000000001 R09: 0000000000000000
Jan 22 20:17:29 Archlinux kernel: R10: 0000000000000002 R11: 0000000000000000 R12: ffff8ad669eb8b00
Jan 22 20:17:29 Archlinux kernel: R13: 0000000000000000 R14: ffff8ad5019f7840 R15: ffff8ad506619c40
Jan 22 20:17:29 Archlinux kernel: FS:  0000000000000000(0000) GS:ffff8ad669e80000(0000) knlGS:0000000000000000
Jan 22 20:17:29 Archlinux kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 22 20:17:29 Archlinux kernel: CR2: 0000000000000078 CR3: 0000000203c10001 CR4: 00000000003706e0

With the current linux-lts kernel there is no such problem, and I can suspend and resume normally. I have no idea what to do about it, since I can't make sense of anything in the bug report. Can someone help me? Do I need to provide any more information?

Offline

#2 2023-01-22 16:42:16

attila123
Member
Registered: 2012-04-20
Posts: 11

Re: Laptop crash after suspending: kernel NULL pointer dereference

I am also interested in this topic. My new work laptop, a ThinkPad T14 Gen 2 Intel, also sometimes shows a black screen after coming back from sleep. It happened just now with the standard 6.1.7-arch1-1 kernel.
I also sometimes see those 'kernel NULL pointer dereference' messages when I go back and check the kernel messages from previous boots (see below).
It may be more frequent with the 'linux' kernel than with 'linux-lts', but I do not have enough samples to draw a conclusion.
Seven boots back (see below), linux-lts (5.15.88-2-lts) also hit the NULL pointer dereference on my laptop.

I found https://unix.stackexchange.com/question … on-dmesg-0 which helped me to check kernel messages for previous boots.

TL;DR:
The number of boots you can look back on can be listed with the following.
journalctl --list-boots

Current boot : journalctl -o short-precise -k
Last boot : journalctl -o short-precise -k -b -1
Two boots prior : journalctl -o short-precise -k -b -2
<and so on>
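
A minimal sketch of how the per-boot commands above could be combined to see which function faulted in each boot. The helper name `faulting_symbols` is made up for illustration and is not part of journalctl; it only filters log text on stdin, and it assumes the `kernel: RIP: <segment>:<symbol>` layout seen in the logs in this thread.

```shell
# Print the unique faulting symbols (taken from the "RIP:" lines)
# found in kernel log text read from stdin.
faulting_symbols() {
    sed -n 's/.*kernel: RIP: [0-9a-f]*:\([^ ]*\).*/\1/p' | sort -u
}

# Usage (assumes a persistent journal; checks the current and two previous boots):
# for b in 0 -1 -2; do
#     echo "boot $b:"
#     journalctl -o short-precise -k -b "$b" | faulting_symbols
# done
```

Boots without an oops simply print nothing under their heading.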

For me, this is the message with 6.1.7-arch1-1. Ignore 't470' in the hostname; that is my previous laptop, but this happened on the ThinkPad T14 Gen 2 Intel.

Jan 21 15:31:01.678447 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 21 15:31:01.678765 t470 kernel: #PF: supervisor read access in kernel mode
Jan 21 15:31:01.679039 t470 kernel: #PF: error_code(0x0000) - not-present page
Jan 21 15:31:01.680318 t470 kernel: PGD 0 P4D 0 
Jan 21 15:31:01.680450 t470 kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Jan 21 15:31:01.680491 t470 kernel: CPU: 5 PID: 495 Comm: irq/61-RAYD0001 Tainted: G           OE      6.1.7-arch1-1 #1 a2d6f1dcaa775aaae1f25aaf758ae968e3493665
Jan 21 15:31:01.680520 t470 kernel: Hardware name: LENOVO 20W1S2JS34/20W1S2JS34, BIOS N34ET53W (1.53 ) 08/31/2022
Jan 21 15:31:01.680549 t470 kernel: RIP: 0010:raydium_i2c_irq+0x4c/0x1b0 [raydium_i2c_ts]
Jan 21 15:31:01.680576 t470 kernel: Code: f3 0f b6 4e 66 48 8b 56 58 48 8b 3b 8b 76 60 e8 da fc ff ff 41 89 c4 85 c0 0f 85 50 01 00 00 0f b6 43 64 48 8b 6b 58 0f b6 d0 <44> 0f b7 44 05 00 4>
Jan 21 15:31:01.680606 t470 kernel: RSP: 0018:ffff96f40091fe78 EFLAGS: 00010246
Jan 21 15:31:01.680634 t470 kernel: RAX: 00000000000000fe RBX: ffff8a9005e977a8 RCX: 0000000000000000
Jan 21 15:31:01.680666 t470 kernel: RDX: 00000000000000fe RSI: 000000000003b606 RDI: ffff8a9003cf8400
Jan 21 15:31:01.680697 t470 kernel: RBP: 0000000000000010 R08: ffff8a934f673770 R09: 0000000000000000
Jan 21 15:31:01.680736 t470 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Jan 21 15:31:01.680764 t470 kernel: R13: ffff8a9014c02300 R14: ffffffff8c7208b0 R15: ffff8a9008e46740
Jan 21 15:31:01.680791 t470 kernel: FS:  0000000000000000(0000) GS:ffff8a934f740000(0000) knlGS:0000000000000000
Jan 21 15:31:01.680825 t470 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 21 15:31:01.680853 t470 kernel: CR2: 000000000000010e CR3: 0000000240210003 CR4: 0000000000770ee0
Jan 21 15:31:01.680881 t470 kernel: PKRU: 55555554
Jan 21 15:31:01.680908 t470 kernel: Call Trace:
Jan 21 15:31:01.680938 t470 kernel:  <TASK>
Jan 21 15:31:01.680969 t470 kernel:  irq_thread_fn+0x20/0x60
Jan 21 15:31:01.681000 t470 kernel:  irq_thread+0xfb/0x1c0
Jan 21 15:31:01.681030 t470 kernel:  ? irq_thread_fn+0x60/0x60
Jan 21 15:31:01.681062 t470 kernel:  ? irq_thread_check_affinity+0xd0/0xd0
Jan 21 15:31:01.681094 t470 kernel:  kthread+0xdb/0x110
Jan 21 15:31:01.681120 t470 kernel:  ? kthread_complete_and_exit+0x20/0x20
Jan 21 15:31:01.681147 t470 kernel:  ret_from_fork+0x1f/0x30
Jan 21 15:31:01.681174 t470 kernel:  </TASK>
Jan 21 15:31:01.681201 t470 kernel: Modules linked in: rfcomm ccm cmac algif_hash algif_skcipher af_alg xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptab>
Jan 21 15:31:01.681321 t470 kernel:  polyval_generic ac97_bus intel_rapl_msr gf128mul snd_pcm_dmaengine ghash_clmulni_intel sha512_ssse3 think_lmi snd_hda_intel aesni_intel firmware_attribu>
Jan 21 15:31:01.681401 t470 kernel:  int3403_thermal int340x_thermal_zone video btrfs wmi intel_hid int3400_thermal blake2b_generic sparse_keymap xor raid6_pq acpi_thermal_rel libcrc32c acp>
Jan 21 15:31:01.681434 t470 kernel: CR2: 000000000000010e
Jan 21 15:31:01.681458 t470 kernel: ---[ end trace 0000000000000000 ]---
Jan 21 15:31:01.681492 t470 kernel: RIP: 0010:raydium_i2c_irq+0x4c/0x1b0 [raydium_i2c_ts]
Jan 21 15:31:01.681516 t470 kernel: Code: f3 0f b6 4e 66 48 8b 56 58 48 8b 3b 8b 76 60 e8 da fc ff ff 41 89 c4 85 c0 0f 85 50 01 00 00 0f b6 43 64 48 8b 6b 58 0f b6 d0 <44> 0f b7 44 05 00 4>
Jan 21 15:31:01.681542 t470 kernel: RSP: 0018:ffff96f40091fe78 EFLAGS: 00010246
Jan 21 15:31:01.681566 t470 kernel: RAX: 00000000000000fe RBX: ffff8a9005e977a8 RCX: 0000000000000000
Jan 21 15:31:01.681590 t470 kernel: RDX: 00000000000000fe RSI: 000000000003b606 RDI: ffff8a9003cf8400
Jan 21 15:31:01.681617 t470 kernel: RBP: 0000000000000010 R08: ffff8a934f673770 R09: 0000000000000000
Jan 21 15:31:01.681655 t470 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Jan 21 15:31:01.681691 t470 kernel: R13: ffff8a9014c02300 R14: ffffffff8c7208b0 R15: ffff8a9008e46740
Jan 21 15:31:01.681714 t470 kernel: FS:  0000000000000000(0000) GS:ffff8a934f740000(0000) knlGS:0000000000000000
Jan 21 15:31:01.681738 t470 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 21 15:31:01.681762 t470 kernel: CR2: 000000000000010e CR3: 0000000240210003 CR4: 0000000000770ee0
Jan 21 15:31:01.681785 t470 kernel: PKRU: 55555554
Jan 21 15:31:01.681808 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 21 15:31:01.681877 t470 kernel: #PF: supervisor write access in kernel mode
Jan 21 15:31:01.682186 t470 kernel: #PF: error_code(0x0002) - not-present page
Jan 21 15:31:01.682363 t470 kernel: PGD 0 P4D 0 
Jan 21 15:31:01.682538 t470 kernel: Oops: 0002 [#2] PREEMPT SMP NOPTI
Jan 21 15:31:01.682623 t470 kernel: CPU: 5 PID: 495 Comm: irq/61-RAYD0001 Tainted: G      D    OE      6.1.7-arch1-1 #1 a2d6f1dcaa775aaae1f25aaf758ae968e3493665
Jan 21 15:31:01.682660 t470 kernel: Hardware name: LENOVO 20W1S2JS34/20W1S2JS34, BIOS N34ET53W (1.53 ) 08/31/2022
Jan 21 15:31:01.682687 t470 kernel: RIP: 0010:mutex_lock+0x1d/0x30
Jan 21 15:31:01.682714 t470 kernel: Code: 00 00 be 02 00 00 00 e9 51 f8 ff ff 90 f3 0f 1e fa 0f 1f 44 00 00 53 48 89 fb 2e 2e 2e 31 c0 31 c0 65 48 8b 14 25 c0 0b 02 00 <f0> 48 0f b1 13 75 0>
Jan 21 15:31:01.682743 t470 kernel: RSP: 0018:ffff96f40091fe58 EFLAGS: 00010246
Jan 21 15:31:01.682770 t470 kernel: RAX: 0000000000000000 RBX: 0000000000000a69 RCX: 00000000000001b0
Jan 21 15:31:01.682796 t470 kernel: RDX: ffff8a90046c2700 RSI: 0000000000001cc9 RDI: 0000000000000a69
Jan 21 15:31:01.682823 t470 kernel: RBP: ffff8a90046c2700 R08: ffff8a90000412d0 R09: 000000008020001f
Jan 21 15:31:01.682850 t470 kernel: R10: 0000000000000003 R11: ffffffff8e4cb828 R12: 0000000000000009
Jan 21 15:31:01.682887 t470 kernel: R13: ffff8a90083fa101 R14: 0000000000000a69 R15: 0000000000000a89
Jan 21 15:31:01.682912 t470 kernel: FS:  0000000000000000(0000) GS:ffff8a934f740000(0000) knlGS:0000000000000000
Jan 21 15:31:01.682937 t470 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 21 15:31:01.682960 t470 kernel: CR2: 0000000000000a69 CR3: 0000000240210003 CR4: 0000000000770ee0
Jan 21 15:31:01.682988 t470 kernel: PKRU: 55555554
Jan 21 15:31:01.683012 t470 kernel: Call Trace:
Jan 21 15:31:01.683040 t470 kernel:  <TASK>
Jan 21 15:31:01.683065 t470 kernel:  perf_event_exit_task+0x41/0x2a0
Jan 21 15:31:01.683096 t470 kernel:  do_exit+0x35c/0xae0
Jan 21 15:31:01.683126 t470 kernel:  ? task_work_run+0x5a/0x90
Jan 21 15:31:01.683152 t470 kernel:  ? do_exit+0x34c/0xae0
Jan 21 15:31:01.683179 t470 kernel:  ? make_task_dead+0x55/0x60
Jan 21 15:31:01.683205 t470 kernel:  ? rewind_stack_and_make_dead+0x17/0x20
Jan 21 15:31:01.683232 t470 kernel:  </TASK>
Jan 21 15:31:01.683260 t470 kernel: Modules linked in: rfcomm ccm cmac algif_hash algif_skcipher af_alg xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptab>
Jan 21 15:31:01.683376 t470 kernel:  polyval_generic ac97_bus intel_rapl_msr gf128mul snd_pcm_dmaengine ghash_clmulni_intel sha512_ssse3 think_lmi snd_hda_intel aesni_intel firmware_attribu>
Jan 21 15:31:01.683427 t470 kernel:  int3403_thermal int340x_thermal_zone video btrfs wmi intel_hid int3400_thermal blake2b_generic sparse_keymap xor raid6_pq acpi_thermal_rel libcrc32c acp>
Jan 21 15:31:01.683461 t470 kernel: CR2: 0000000000000a69
Jan 21 15:31:01.683489 t470 kernel: ---[ end trace 0000000000000000 ]---
Jan 21 15:31:01.683512 t470 kernel: RIP: 0010:raydium_i2c_irq+0x4c/0x1b0 [raydium_i2c_ts]
Jan 21 15:31:01.683537 t470 kernel: Code: f3 0f b6 4e 66 48 8b 56 58 48 8b 3b 8b 76 60 e8 da fc ff ff 41 89 c4 85 c0 0f 85 50 01 00 00 0f b6 43 64 48 8b 6b 58 0f b6 d0 <44> 0f b7 44 05 00 4>
Jan 21 15:31:01.683567 t470 kernel: RSP: 0018:ffff96f40091fe78 EFLAGS: 00010246
Jan 21 15:31:01.683591 t470 kernel: RAX: 00000000000000fe RBX: ffff8a9005e977a8 RCX: 0000000000000000
Jan 21 15:31:01.683616 t470 kernel: RDX: 00000000000000fe RSI: 000000000003b606 RDI: ffff8a9003cf8400
Jan 21 15:31:01.683639 t470 kernel: RBP: 0000000000000010 R08: ffff8a934f673770 R09: 0000000000000000
Jan 21 15:31:01.683663 t470 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Jan 21 15:31:01.683687 t470 kernel: R13: ffff8a9014c02300 R14: ffffffff8c7208b0 R15: ffff8a9008e46740
Jan 21 15:31:01.683713 t470 kernel: FS:  0000000000000000(0000) GS:ffff8a934f740000(0000) knlGS:0000000000000000
Jan 21 15:31:01.683737 t470 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 21 15:31:01.683757 t470 kernel: CR2: 0000000000000a69 CR3: 0000000240210003 CR4: 0000000000770ee0
Jan 21 15:31:01.683780 t470 kernel: PKRU: 55555554
Jan 21 15:31:01.683808 t470 kernel: Fixing recursive fault but reboot is needed!
Jan 21 15:31:01.683887 t470 kernel: BUG: scheduling while atomic: irq/61-RAYD0001/495/0x00000000
Jan 21 15:31:01.684021 t470 kernel: Modules linked in: rfcomm ccm cmac algif_hash algif_skcipher af_alg xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptab>
Jan 21 15:31:01.684068 t470 kernel:  polyval_generic ac97_bus intel_rapl_msr gf128mul snd_pcm_dmaengine ghash_clmulni_intel sha512_ssse3 think_lmi snd_hda_intel aesni_intel firmware_attribu>
Jan 21 15:31:01.684109 t470 kernel:  int3403_thermal int340x_thermal_zone video btrfs wmi intel_hid int3400_thermal blake2b_generic sparse_keymap xor raid6_pq acpi_thermal_rel libcrc32c acp>
Jan 21 15:31:01.684137 t470 kernel: CPU: 5 PID: 495 Comm: irq/61-RAYD0001 Tainted: G      D    OE      6.1.7-arch1-1 #1 a2d6f1dcaa775aaae1f25aaf758ae968e3493665
Jan 21 15:31:01.684162 t470 kernel: Hardware name: LENOVO 20W1S2JS34/20W1S2JS34, BIOS N34ET53W (1.53 ) 08/31/2022
Jan 21 15:31:01.684186 t470 kernel: Call Trace:
Jan 21 15:31:01.684209 t470 kernel:  <TASK>
Jan 21 15:31:01.684235 t470 kernel:  dump_stack_lvl+0x48/0x60
Jan 21 15:31:01.684259 t470 kernel:  __schedule_bug.cold+0x4b/0x57
Jan 21 15:31:01.684285 t470 kernel:  __schedule+0xe8d/0x12a0
Jan 21 15:31:01.684312 t470 kernel:  do_task_dead+0x43/0x50
Jan 21 15:31:01.684338 t470 kernel:  make_task_dead.cold+0x51/0xab
Jan 21 15:31:01.684365 t470 kernel:  rewind_stack_and_make_dead+0x17/0x20
Jan 21 15:31:01.684397 t470 kernel: RIP: 0000:0x0
Jan 21 15:31:01.684422 t470 kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
Jan 21 15:31:01.684450 t470 kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
Jan 21 15:31:01.684476 t470 kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 21 15:31:01.684508 t470 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 21 15:31:01.684536 t470 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Jan 21 15:31:01.684557 t470 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jan 21 15:31:01.684584 t470 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jan 21 15:31:01.684610 t470 kernel:  </TASK>

Last edited by attila123 (2023-01-22 16:56:28)

Offline

#3 2023-01-22 17:23:39

attila123
Member
Registered: 2012-04-20
Posts: 11

Re: Laptop crash after suspending: kernel NULL pointer dereference

Also some "fun" statistics: I ran `journalctl -o short-precise -k -b all | grep NULL` on my old ThinkPad T470 for the past 170 boots since May 2 2021, and it had zero results for this NULL pointer dereference error.
On this new T14 Gen 2 Intel (I think it's Tiger Lake, so 11th gen Intel), however, it found 13 occurrences since Jan 10:

$ journalctl -o short-precise -k -b all | grep NULL
Jan 10 13:48:16.545471 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 10 13:48:16.560922 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 12 09:32:00.189451 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 12 09:32:00.197823 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 16 09:13:21.536894 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 16 09:13:21.545711 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 16 18:37:22.576113 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 16 18:37:22.585825 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 18 19:19:05.939491 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 18 21:04:45.603106 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 18 21:04:45.620137 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 21 15:31:01.678447 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 21 15:31:01.681808 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
$ journalctl -o short-precise -k -b all | grep NULL | wc -l
13
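
As a rough sketch, the same grep output could also be broken down per faulting address instead of only counted in total. `tally_null_addresses` is a hypothetical helper name; it assumes the address is the last field of each "BUG:" line, as in the output above.

```shell
# Count "kernel NULL pointer dereference" lines per faulting address.
# Reads kernel log text from stdin, prints "<count> <address>" per address.
tally_null_addresses() {
    awk '/kernel NULL pointer dereference/ { count[$NF]++ }
         END { for (addr in count) print count[addr], addr }' | sort -k2
}

# Usage:
# journalctl -o short-precise -k -b all | tally_null_addresses
```

For the 13 lines listed above this gives 7 hits for 000000000000010e and 6 for 0000000000000a69.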

Offline

#4 2023-01-22 19:32:52

attila123
Member
Registered: 2012-04-20
Posts: 11

Re: Laptop crash after suspending: kernel NULL pointer dereference

Offline

#5 2023-01-23 06:17:09

fancieux
Member
Registered: 2023-01-22
Posts: 6

Re: Laptop crash after suspending: kernel NULL pointer dereference

I am very sorry to tell you that we may not have the same problem. I noticed a line in my bug report that said:

Jan 22 20:17:29 Archlinux kernel: Workqueue: events_long ucsi_resume_work [typec_ucsi]

And I noticed that on every startup the journal contains a line like:

Jan 22 20:16:57 Archlinux kernel: ucsi_acpi USBC000:00: PPM init failed (-16)

I tried blacklisting the module `typec_ucsi`, and it works: I can now suspend and resume normally under both the linux and linux-lts kernels.
But the `PPM init failed (-16)` error above appears under both kernels, and only the newer kernel crashes on resume after suspend. Does someone have any clues?
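
The post does not say how the module was blacklisted. One common way is a modprobe.d file, sketched below. The helper name `blacklist_lines` and the file name in the usage line are made up for illustration. Note that a `blacklist` line only stops alias-based autoloading, so the module can still be pulled in as a dependency of another module; listing `ucsi_acpi` (which appears in the module list of the oops) as well is an assumption, not something tested in this thread.

```shell
# Emit one modprobe.d "blacklist" line for each module name given as an argument.
blacklist_lines() {
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done
}

# Usage (hypothetical file name; needs root, takes effect on the next boot):
# blacklist_lines typec_ucsi ucsi_acpi | sudo tee /etc/modprobe.d/no-ucsi.conf
```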

Offline

#6 2023-01-23 21:17:22

loqs
Member
Registered: 2014-03-06
Posts: 15,623

Re: Laptop crash after suspending: kernel NULL pointer dereference

@fancieux have you tried the suggestions from the linked bug report?  Such as trying the latest mainline kernel or bisecting the kernel?

Offline

#7 2023-01-24 05:34:15

fancieux
Member
Registered: 2023-01-22
Posts: 6

Re: Laptop crash after suspending: kernel NULL pointer dereference

I tried linux 6.2-rc4, which I downloaded from an unofficial user repo. The bug still exists.
Since linux 6.2-rc5 was only released yesterday, the linux-mainline package in that repo has not been built yet. I'm not going to compile it on my computer, as that would be a waste of time; a compiled version may be released later today, and I can try it then. But I suspect that the situation will not change.
I tried downgrading the linux package: suspend and resume work normally in 6.0.12 and earlier. In 6.1.1 and later, the same bug appears as now.
In version 6.1.0, however, the system freezes directly after suspending, and journalctl does not capture anything after:

Jan 24 12:54:06 Archlinux systemd[1]: Starting System Suspend...
Jan 24 12:54:06 Archlinux systemd-sleep[2642]: Entering sleep state 'suspend'...
Jan 24 12:54:06 Archlinux kernel: PM: suspend entry (deep)
Jan 24 12:54:06 Archlinux kernel: Filesystems sync: 0.012 seconds

Should I proceed with bisecting the git repo? It may take a long time.

Offline

#8 2023-01-24 06:26:23

loqs
Member
Registered: 2014-03-06
Posts: 15,623

Re: Laptop crash after suspending: kernel NULL pointer dereference

You can find some prebuilt bisection kernels at https://bugs.archlinux.org/task/77109#comment214325 and the link from there.

Offline

#9 2023-01-24 14:27:13

fancieux
Member
Registered: 2023-01-22
Posts: 6

Re: Laptop crash after suspending: kernel NULL pointer dereference

git bisect start
# status: waiting for both good and bad commits
# good: [4fe89d07dcc2804c8b562f6c7896a45643d34b2f] Linux 6.0
git bisect good 4fe89d07dcc2804c8b562f6c7896a45643d34b2f
# status: waiting for bad commit, 1 good commit known
# bad: [830b3c68c1fb1e9176028d02ef86f3cf76aa2476] Linux 6.1
git bisect bad 830b3c68c1fb1e9176028d02ef86f3cf76aa2476
# good: [33e591dee915832c618cf68bb1058c8e7d296128] Merge tag 'phy-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
git bisect good 33e591dee915832c618cf68bb1058c8e7d296128
# good: [de492c83cae0af72de370b9404aacda93dafcad5] prandom: remove unused functions
git bisect good de492c83cae0af72de370b9404aacda93dafcad5
# bad: [c4d25ce6e9de47f6d9fb6cc1a34b47ce5f0a46ab] Merge tag 'usb-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
git bisect bad c4d25ce6e9de47f6d9fb6cc1a34b47ce5f0a46ab
# good: [8636df94ec917019c4cb744ba0a1f94cf9057790] Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
git bisect good 8636df94ec917019c4cb744ba0a1f94cf9057790
# good: [ca4582c286aa4465f9d1a72bef34b04ee907d42e] Revert "mfd: syscon: Remove repetition of the regmap_get_val_endian()"
git bisect good ca4582c286aa4465f9d1a72bef34b04ee907d42e
# good: [e3493d682516e2b7ef69587ddf91b0371a1511d0] Merge tag 'drm-fixes-2022-10-28' of git://anongit.freedesktop.org/drm/drm
git bisect good e3493d682516e2b7ef69587ddf91b0371a1511d0
# good: [576e61cea1e4b66f52f164dee0edbe4b1c999997] Merge tag 's390-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
git bisect good 576e61cea1e4b66f52f164dee0edbe4b1c999997
# good: [c6e0e874a8fa055b6b2f536c282a523b9439b209] Merge tag 'block-6.1-2022-10-28' of git://git.kernel.dk/linux
git bisect good c6e0e874a8fa055b6b2f536c282a523b9439b209
# good: [28b7bd4ad25f7dc662a84636a619e61c97ac0e06] Merge tag '6.1-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
git bisect good 28b7bd4ad25f7dc662a84636a619e61c97ac0e06
# bad: [5aed5b7c2430ce318a8e62f752f181e66f0d1053] xhci: Remove device endpoints from bandwidth list when freeing the device
git bisect bad 5aed5b7c2430ce318a8e62f752f181e66f0d1053
# good: [48ed32482c4100069d0c0eebdc6b198c6ae5f71f] usb: gadget: aspeed: Fix probe regression
git bisect good 48ed32482c4100069d0c0eebdc6b198c6ae5f71f
# bad: [19905240aef0181d1e6944070eb85fce75f75bcd] usb: gadget: uvc: limit isoc_sg to super speed gadgets
git bisect bad 19905240aef0181d1e6944070eb85fce75f75bcd
# bad: [4e3a50293c2b21961f02e1afa2f17d3a1a90c7c8] usb: typec: ucsi: acpi: Implement resume callback
git bisect bad 4e3a50293c2b21961f02e1afa2f17d3a1a90c7c8
# good: [99f6d43611135bd6f211dec9e88bb41e4167e304] usb: typec: ucsi: Check the connection on resume
git bisect good 99f6d43611135bd6f211dec9e88bb41e4167e304
# first bad commit: [4e3a50293c2b21961f02e1afa2f17d3a1a90c7c8] usb: typec: ucsi: acpi: Implement resume callback
4e3a50293c2b21961f02e1afa2f17d3a1a90c7c8 is the first bad commit
commit 4e3a50293c2b21961f02e1afa2f17d3a1a90c7c8
Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Date:   Fri Oct 7 13:09:51 2022 +0300

    usb: typec: ucsi: acpi: Implement resume callback

    The ACPI driver needs to resume the interface by calling
    ucsi_resume(). Otherwise we may fail to detect connections
    and disconnections that happen while the system is
    suspended.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=210425
    Fixes: a94ecde41f7e ("usb: typec: ucsi: ccg: enable runtime pm support")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
    Link: https://lore.kernel.org/r/20221007100951.43798-3-heikki.krogerus@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 drivers/usb/typec/ucsi/ucsi_acpi.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
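
For anyone repeating this, a small sketch of how the result can be pulled out of such a log. `first_bad_commit` is a hypothetical helper name; it relies only on the `# first bad commit: [<hash>]` comment line shown in the log above.

```shell
# Extract the hash of the first bad commit from a git bisect log read from stdin.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]*\)\].*/\1/p'
}

# Usage (inside the kernel git tree, once the bisection has finished):
# git bisect log | first_bad_commit
#
# The bisection itself would have been started roughly like this,
# assuming the v6.0 and v6.1 release tags instead of the raw hashes:
# git bisect start
# git bisect good v6.0
# git bisect bad v6.1
```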

What's next? Should I report directly upstream?

Offline

#10 2023-01-24 14:40:02

loqs
Member
Registered: 2014-03-06
Posts: 15,623

Re: Laptop crash after suspending: kernel NULL pointer dereference

Yes, please report it upstream on the mailing list or bugzilla: https://www.kernel.org/doc/html/latest/ … ssues.html
Edit:
Possibly the same issue as https://bugzilla.kernel.org/show_bug.cgi?id=216697 or https://bugzilla.kernel.org/show_bug.cgi?id=216706, which has a patch you could try; that fix was already applied in https://github.com/torvalds/linux/commi … 5f03cc42eb.
6.2-rc5 contained https://github.com/torvalds/linux/commi … 3374a1c7e5 which might be worth trying.

Last edited by loqs (2023-01-24 14:53:51)

Offline

#11 2023-01-24 15:27:57

fancieux
Member
Registered: 2023-01-22
Posts: 6

Re: Laptop crash after suspending: kernel NULL pointer dereference

Just tried 6.2-rc5, and it seems that, as Alexander Chernaev said in https://bugzilla.kernel.org/show_bug.cgi?id=216697#c9, that patch doesn't work either.

Offline

#12 2023-01-24 16:16:12

fancieux
Member
Registered: 2023-01-22
Posts: 6

Re: Laptop crash after suspending: kernel NULL pointer dereference

I added a comment here: https://bugzilla.kernel.org/show_bug.cgi?id=216697#c11. Is this OK? Do I need to open a new bug report?

Offline

#13 2023-01-24 18:51:18

loqs
Member
Registered: 2014-03-06
Posts: 15,623

Re: Laptop crash after suspending: kernel NULL pointer dereference

You could open a new bug report including your kernel OOPS and bisection result or wait for Heikki Krogerus to respond to your comment first.  I do not know which would be more appropriate.

Offline
