
#26 2021-08-02 15:52:22

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,410

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Check the output of

efibootmgr -uv

to find out which entry is actually used. It's unlikely that these have any lasting or interfering impact as long as you consistently end up on the kernel you actually select (i.e. the output of uname -a and pacman -Q linux{,-lts} matches what you expect).
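
A rough way to automate that comparison, sketched in Python. It assumes the Arch convention that the package version uses "." where `uname -r` uses "-" (for example `5.14.9.arch2-1` versus `5.14.9-arch2-1`) and that flavoured kernels append a suffix such as `-lts` to the running version. The helper names and the sample `pacman -Q` line are illustrative and not taken from this thread.

```python
import re

# Assumption: flavoured kernels append one of these suffixes to `uname -r`.
FLAVOUR_SUFFIXES = ("lts", "zen", "hardened")


def normalize_kernel_version(version: str) -> str:
    """Reduce a kernel or package version string to a comparable dotted form."""
    version = version.strip()
    # Drop a trailing flavour suffix such as "-lts".
    version = re.sub(r"-(%s)$" % "|".join(FLAVOUR_SUFFIXES), "", version)
    # Assumption: `uname -r` uses "-" where the package version uses ".".
    return version.replace("-", ".")


def running_matches_installed(uname_r: str, pacman_q_line: str) -> bool:
    """Compare `uname -r` output with one line of `pacman -Q <pkg>` output."""
    _package, package_version = pacman_q_line.split()
    return normalize_kernel_version(uname_r) == normalize_kernel_version(package_version)


# "5.14.9-arch2-1" appears later in this thread; the pacman line is assumed.
print(running_matches_installed("5.14.9-arch2-1", "linux 5.14.9.arch2-1"))
```

If the two disagree after an update, the usual explanation is that the running kernel is simply older than the installed package until the next reboot, so a mismatch alone does not prove the wrong boot entry is being used.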

Online

#27 2021-08-02 17:27:38

halogene
Member
Registered: 2013-05-29
Posts: 47

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Thank you. It seems that the entry in use is the one with the latest changes, i.e. the one generated by the last install. I had assumed efibootmgr would update it automatically. Yes, GRUB selects the kernel correctly.

Offline

#28 2021-08-02 20:06:52

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Did you at any point test "intel_idle.max_cstate=1"?

Online

#29 2021-08-03 08:47:01

halogene
Member
Registered: 2013-05-29
Posts: 47

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Yes, by adding it to the boot line via GRUB (selecting the 5.13 kernel entry from the advanced options, pressing "e", adding it before "loglevel=3 quiet", then booting the entry with <ctrl><x>). I still get the CPU hangups, sometimes even before the password prompt for opening the LUKS-encrypted root partition. :oS
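
Since the parameter is typed in by hand at the GRUB prompt, it can be worth confirming after boot that it actually reached the kernel. A minimal sketch, assuming the contents of /proc/cmdline are passed in as a string; the sample line below is made up for illustration, and quoted values containing spaces are not handled.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def parse_cmdline(cmdline: str) -> Dict[str, List[str]]:
    """Split a kernel command line into {key: [values]}; bare flags get ""."""
    params = defaultdict(list)
    for token in cmdline.split():
        # Split on the first "=" only, so "root=UUID=..." keeps its value whole.
        key, _, value = token.partition("=")
        params[key].append(value)
    return dict(params)


def has_param(cmdline: str, key: str, value: Optional[str] = None) -> bool:
    """True if `key` is present (and, if given, set to `value`)."""
    values = parse_cmdline(cmdline).get(key)
    if values is None:
        return False
    return value is None or value in values


# Hypothetical command line; on a real system read it from /proc/cmdline.
sample = "BOOT_IMAGE=/vmlinuz-linux root=/dev/mapper/root rw intel_idle.max_cstate=1 loglevel=3 quiet"
print(has_param(sample, "intel_idle.max_cstate", "1"))
```

On a live system the input would come from `open("/proc/cmdline").read()`.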

Offline

#30 2021-10-17 10:11:06

beryllium
Member
Registered: 2021-10-17
Posts: 7

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

I'm not entirely sure I have the exact same problem as the OP, but it seems like it.

Running the same hardware, also using LUKS encrypted partition.
Also seeing various soft lockup messages, sometimes before the LUKS passphrase prompt, sometimes after.
(I'm also seeing suspend issues, where suspend sometimes doesn't seem to kick in, or only after a delay, but I'm not sure yet whether these are related.)

First lockup in my current session:

[    1.647311] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.14
[    1.647317] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.647321] usb usb1: Product: xHCI Host Controller
[    1.647324] usb usb1: Manufacturer: Linux 5.14.9-arch2-1 xhci-hcd
[    1.647326] usb usb1: SerialNumber: 0000:00:14.0
[    1.647517] hub 1-0:1.0: USB hub found
[    1.647699] hub 1-0:1.0: 12 ports detected
[    1.650117] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.14
[    1.650124] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.650128] usb usb2: Product: xHCI Host Controller
[    1.650131] usb usb2: Manufacturer: Linux 5.14.9-arch2-1 xhci-hcd
[    1.650134] usb usb2: SerialNumber: 0000:00:14.0
[    1.650299] hub 2-0:1.0: USB hub found
[    1.650387] hub 2-0:1.0: 6 ports detected
[    1.682061] cryptd: max_cpu_qlen set to 1000
[    1.688695] tsc: Refined TSC clocksource calibration: 2304.014 MHz
[    1.688706] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x213604fb50d, max_idle_ns: 440795292814 ns
[    2.102704] random: crng init done
[   28.125486] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [migration/0:17]
[   28.125547] Modules linked in: crypto_simd tpm_crb cryptd xhci_pci xhci_pci_renesas tpm_tis tpm_tis_core i8042 tpm serio rng_core
[   28.125560] CPU: 0 PID: 17 Comm: migration/0 Not tainted 5.14.9-arch2-1 #1 3d250f0857a0255dbbcb433ce1895c81c4740764
[   28.125566] Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
[   28.125567] Stopper: multi_cpu_stop+0x0/0x110 <- stop_machine_cpuslocked+0x173/0x1c0
[   28.125578] RIP: 0010:stop_machine_yield+0x2/0x10
[   28.125583] Code: 2b 14 25 28 00 00 00 75 0d 4c 8b 65 f8 c9 c3 b8 fe ff ff ff eb e3 e8 1d fd 94 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 90 <c3> 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 41 57 41
[   28.125586] RSP: 0000:ffffb45840117e70 EFLAGS: 00000246
[   28.125590] RAX: 0000000000000000 RBX: ffffb45840127d48 RCX: 0000000000000000
[   28.125592] RDX: 0000000000000002 RSI: 0000000000000140 RDI: ffffffffbddfb360
[   28.125594] RBP: ffffb45840127d6c R08: ffff89539061f6f0 R09: 0000000000000004
[   28.125596] R10: 0000000000000213 R11: 0000000000000007 R12: 0000000000000001
[   28.125598] R13: ffffffffbddfb360 R14: 0000000000000000 R15: 0000000000000001
[   28.125600] FS:  0000000000000000(0000) GS:ffff895390600000(0000) knlGS:0000000000000000
[   28.125603] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.125605] CR2: 0000559f0666b0c8 CR3: 0000000100066006 CR4: 00000000003706f0
[   28.125608] Call Trace:
[   28.125611]  multi_cpu_stop+0x9b/0x110
[   28.125617]  ? stop_machine_yield+0x10/0x10
[   28.125622]  cpu_stopper_thread+0x90/0x140
[   28.125627]  smpboot_thread_fn+0xd5/0x1c0
[   28.125632]  ? smpboot_register_percpu_thread+0xf0/0xf0
[   28.125637]  kthread+0x12f/0x160
[   28.125640]  ? set_kthread_struct+0x40/0x40
[   28.125643]  ret_from_fork+0x1f/0x30
[   28.128813] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [migration/1:23]
[   28.128862] Modules linked in: crypto_simd tpm_crb cryptd xhci_pci xhci_pci_renesas tpm_tis tpm_tis_core i8042 tpm serio rng_core
[   28.128875] CPU: 1 PID: 23 Comm: migration/1 Tainted: G             L    5.14.9-arch2-1 #1 3d250f0857a0255dbbcb433ce1895c81c4740764
[   28.128880] Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
[   28.128881] Stopper: multi_cpu_stop+0x0/0x110 <- stop_machine_cpuslocked+0x173/0x1c0
[   28.128891] RIP: 0010:rcu_momentary_dyntick_idle+0x24/0x40
[   28.128897] Code: 00 00 00 00 00 90 48 c7 c0 40 e5 02 00 65 c6 05 05 e2 d0 43 00 65 48 03 05 e9 11 cf 43 ba 04 00 00 00 f0 0f c1 90 20 01 00 00 <83> e2 02 74 0e 65 48 8b 3c 25 c0 7b 01 00 e9 79 ff ff ff 0f 0b eb
[   28.128900] RSP: 0000:ffffb4584017fe70 EFLAGS: 00000206
[   28.128903] RAX: ffff89539066e540 RBX: ffffb45840127d48 RCX: 0000000000000000
[   28.128905] RDX: 000000003ec74636 RSI: 0000000000000140 RDI: ffffffffbddfb360
[   28.128907] RBP: ffffb45840127d6c R08: ffff89539065f6f0 R09: 0000000000000000
[   28.128909] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
[   28.128911] R13: ffffffffbddfb360 R14: 0000000000000000 R15: 0000000000000001
[   28.128913] FS:  0000000000000000(0000) GS:ffff895390640000(0000) knlGS:0000000000000000
[   28.128916] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.128918] CR2: 0000557bdcb62008 CR3: 0000000102c78004 CR4: 00000000003706e0
[   28.128921] Call Trace:
[   28.128923]  multi_cpu_stop+0xb9/0x110
[   28.128929]  ? stop_machine_yield+0x10/0x10
[   28.128934]  cpu_stopper_thread+0x90/0x140
[   28.128939] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [swapper/2:0]
[   28.128939]  smpboot_thread_fn+0xd5/0x1c0
[   28.128944]  ? smpboot_register_percpu_thread+0xf0/0xf0
[   28.128996] Modules linked in: crypto_simd
[   28.128997]  kthread+0x12f/0x160
[   28.128999]  tpm_crb cryptd
[   28.129000]  ? set_kthread_struct+0x40/0x40
[   28.129003]  xhci_pci xhci_pci_renesas tpm_tis tpm_tis_core i8042
[   28.129004]  ret_from_fork+0x1f/0x30
[   28.129008]  tpm serio rng_core
[   28.129012] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G             L    5.14.9-arch2-1 #1 3d250f0857a0255dbbcb433ce1895c81c4740764
[   28.129017] Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
[   28.129018] RIP: 0010:__do_softirq+0x79/0x2b4
[   28.129026] Code: 81 67 2c ff f7 ff ff be 00 01 00 00 e8 a0 22 2d ff c7 44 24 10 0a 00 00 00 65 66 c7 05 4e ce 02 43 00 00 fb 66 0f 1f 44 00 00 <b8> ff ff ff ff 49 c7 c3 c0 60 c0 bd 41 0f bc c7 89 c5 83 c5 01 74
[   28.129028] RSP: 0018:ffffb458401e0fa0 EFLAGS: 00000246
[   28.129031] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000001f
[   28.129033] RDX: 000000000000000c RSI: 0000000037a7279d RDI: fffffffe11b26e53
[   28.129036] RBP: ffffb45840137df8 R08: 0000000034f932b2 R09: 0000000002856b2f
[   28.129038] R10: 0000000002850080 R11: 0000000000000003 R12: 0000000000000001
[   28.129040] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000080
[   28.129042] FS:  0000000000000000(0000) GS:ffff895390680000(0000) knlGS:0000000000000000
[   28.129045] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.129047] CR2: 00007f1e7c7eb2d4 CR3: 000000016be10001 CR4: 00000000003706e0
[   28.129050] Call Trace:
[   28.129052]  <IRQ>
[   28.129055]  irq_exit_rcu+0xa9/0xc0
[   28.129060]  sysvec_apic_timer_interrupt+0x72/0x90
[   28.129066]  </IRQ>
[   28.129068]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[   28.129072] RIP: 0010:cpuidle_enter_state+0xc7/0x380
[   28.129078] Code: 8b 3d 55 5e 5e 43 e8 18 68 8a ff 49 89 c5 0f 1f 44 00 00 31 ff e8 39 75 8a ff 45 84 ff 0f 85 da 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 d6 4c 2b 2c 24 48 8d 04 52 48 8d
[   28.129080] RSP: 0018:ffffb45840137ea8 EFLAGS: 00000246
[   28.129082] RAX: ffff8953906ad700 RBX: 0000000000000001 RCX: 000000000000001f
[   28.129084] RDX: 0000000000000000 RSI: 0000000037a7279d RDI: 0000000000000000
[   28.129086] RBP: ffff8953906b7f00 R08: 0000000034ca006d R09: 0000000000000008
[   28.129088] R10: 0000000000000007 R11: 0000000000000006 R12: ffffffffbdd49240
[   28.129090] R13: 0000000034ca006d R14: 0000000000000001 R15: 0000000000000000
[   28.129094]  ? cpuidle_enter_state+0xb7/0x380
[   28.129099]  cpuidle_enter+0x29/0x40
[   28.129103]  do_idle+0x1e1/0x270
[   28.129108]  cpu_startup_entry+0x19/0x20
[   28.129112]  secondary_startup_64_no_verify+0xc2/0xcb
[   28.129639] clocksource: Switched to clocksource tsc
[   28.138523] AVX2 version of gcm_enc/dec engaged.
[   28.139465] AES CTR mode by8 optimization enabled
[   28.160610] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
[   28.196586] device-mapper: uevent: version 1.0.3
[   28.196711] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22) initialised: dm-devel@redhat.com
[   28.219812] audit: type=1130 audit(1634377993.037:8): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-ask-password-console comm="systemd" exe="/init" hostname=? addr=? terminal=? res=success'
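
The jump from 2.1s to 28.1s in the log above lines up with the "stuck for 26s" message. A small sketch for spotting such jumps in a longer dmesg dump; the function name and the 5-second threshold are arbitrary choices for illustration.

```python
import re

# Matches the "[   28.125486]" style timestamp at the start of a dmesg line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def timestamp_gaps(lines, min_gap=5.0):
    """Return (gap_seconds, line_before, line_after) for each jump >= min_gap."""
    gaps = []
    previous_time = None
    previous_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a timestamp
        current_time = float(match.group(1))
        if previous_time is not None and current_time - previous_time >= min_gap:
            gaps.append((round(current_time - previous_time, 6), previous_line, line))
        previous_time, previous_line = current_time, line
    return gaps


# Two consecutive lines copied from the log above.
sample = [
    "[    2.102704] random: crng init done",
    "[   28.125486] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [migration/0:17]",
]
for gap, before, after in timestamp_gaps(sample):
    print(f"{gap:.3f}s of silence before: {after}")
```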

Second lockup:

[   42.369362] acpi PNP0C14:01: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[   42.369505] acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[   42.369581] acpi PNP0C14:04: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[   42.453724] FAT-fs (nvme0n1p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[   42.466054] input: Ideapad extra buttons as /devices/pci0000:00/0000:00:1f.0/PNP0C09:00/VPC2004:00/input/input8
[   42.517396] ideapad_acpi VPC2004:00: DYTC interface is not available
[   68.128915] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [systemd-udevd:477]
[   68.129299] Modules linked in: ideapad_laptop platform_profile sparse_keymap vfat rfkill fjes(-) pcc_cpufreq(-) int3403_thermal fat wmi int340x_thermal_zone int3400_thermal acpi_thermal_rel acpi_pad acpi_tad video ext4 crc16 mbcache jbd2 squashfs loop drm crypto_user fuse ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq dm_crypt cbc encrypted_keys trusted asn1_encoder tee usbhid dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel serio_raw ghash_clmulni_intel atkbd libps2 aesni_intel crypto_simd tpm_crb cryptd xhci_pci xhci_pci_renesas tpm_tis tpm_tis_core i8042 tpm serio rng_core
[   68.129322] CPU: 2 PID: 477 Comm: systemd-udevd Tainted: G             L    5.14.9-arch2-1 #1 3d250f0857a0255dbbcb433ce1895c81c4740764
[   68.129325] Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
[   68.129325] RIP: 0010:__do_softirq+0x79/0x2b4
[   68.129329] Code: 81 67 2c ff f7 ff ff be 00 01 00 00 e8 a0 22 2d ff c7 44 24 10 0a 00 00 00 65 66 c7 05 4e ce 02 43 00 00 fb 66 0f 1f 44 00 00 <b8> ff ff ff ff 49 c7 c3 c0 60 c0 bd 41 0f bc c7 89 c5 83 c5 01 74
[   68.129330] RSP: 0000:ffffb458401e0fa0 EFLAGS: 00000246
[   68.129331] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000001f
[   68.129332] RDX: 0000000000000073 RSI: 00000000378e22c4 RDI: fffffffe12be1c6a
[   68.129333] RBP: ffffb45841ec3dc8 R08: 00000009e3efb5dc R09: 7fffffffffffffff
[   68.129334] R10: 00000009e66e0200 R11: 00000009e66f1f6e R12: 0000000000000000
[   68.129334] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000200
[   68.129335] FS:  00007fa4fc419a40(0000) GS:ffff895390680000(0000) knlGS:0000000000000000
[   68.129336] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   68.129337] CR2: 00005600db33f000 CR3: 0000000143866002 CR4: 00000000003706e0
[   68.129338] Call Trace:
[   68.129339]  <IRQ>
[   68.129341]  irq_exit_rcu+0xa9/0xc0
[   68.129343]  sysvec_apic_timer_interrupt+0x72/0x90
[   68.129346]  </IRQ>
[   68.129346]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[   68.129348] RIP: 0010:preempt_schedule_irq+0x37/0x60
[   68.129350] Code: a9 ff ff ff 7f 75 44 9c 58 0f 1f 44 00 00 f6 c4 02 75 38 65 48 8b 1c 25 c0 7b 01 00 65 ff 05 90 11 35 43 fb 66 0f 1f 44 00 00 <bf> 01 00 00 00 e8 df e5 ff ff fa 66 0f 1f 44 00 00 65 ff 0d 71 11
[   68.129351] RSP: 0000:ffffb45841ec3e78 EFLAGS: 00000202
[   68.129352] RAX: 0000000000000046 RBX: ffff89508408c000 RCX: 000000000000080b
[   68.129352] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb45841ec3e88
[   68.129353] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[   68.129354] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   68.129354] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   68.129356]  asm_sysvec_reschedule_ipi+0x12/0x20
[   68.129357] RIP: 0010:exit_to_user_mode_prepare+0xb6/0x170
[   68.129359] Code: e3 02 75 64 fa 66 0f 1f 44 00 00 65 48 8b 04 25 c0 7b 01 00 48 8b 18 f7 c3 0e 30 02 00 0f 84 6f ff ff ff fb 66 0f 1f 44 00 00 <f6> c3 08 74 c3 e8 00 eb 99 00 f6 c7 10 74 be 4c 89 e7 e8 43 dc 10
[   68.129360] RSP: 0000:ffffb45841ec3f30 EFLAGS: 00000202
[   68.129360] RAX: 0000000000000000 RBX: 0000000000000228 RCX: 000000000000001f
[   68.129361] RDX: ffff895040b7c000 RSI: 0000000000000000 RDI: ffffb45841ec3f58
[   68.129362] RBP: ffff89508408c000 R08: 0000000000000000 R09: 0000000000000000
[   68.129362] R10: 0000000000000019 R11: 0000000000000000 R12: ffffb45841ec3f58
[   68.129363] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   68.129364]  ? asm_common_interrupt+0x8/0x40
[   68.129366]  irqentry_exit_to_user_mode+0x5/0x10
[   68.129367]  asm_common_interrupt+0x1e/0x40
[   68.129368] RIP: 0033:0x5600d9286ffb
[   68.129370] Code: 48 8d 56 08 48 89 fb 41 89 ce 48 89 54 24 30 48 89 44 24 10 48 85 c0 0f 84 68 02 00 00 48 8b 44 24 10 48 89 43 20 48 8b 48 10 <48> 89 48 08 48 8d 44 24 40 48 89 44 24 18 48 8d 44 24 60 48 89 0c
[   68.129371] RSP: 002b:00007ffea1b13ee0 EFLAGS: 00000206
[   68.129372] RAX: 00005600db1bc0c0 RBX: 00005600db199fb0 RCX: 00005600db1bc0f0
[   68.129372] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   68.129373] RBP: 0000000000000000 R08: 00000000fffffffe R09: 0000000000000000
[   68.129373] R10: d6745100d788f26b R11: 0000000000000002 R12: 00005600db1be6d0
[   68.129374] R13: 0000000000000000 R14: 0000000000000009 R15: 00005600db2941a0
[   68.196235] kauditd_printk_skb: 76 callbacks suppressed
[   68.196237] audit: type=1130 audit(1634378033.014:93): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-tmpfiles-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   68.218494] audit: type=1334 audit(1634378033.034:94): prog-id=29 op=LOAD
[   68.223402] audit: type=1127 audit(1634378033.040:95): pid=514 uid=0 auid=4294967295 ses=4294967295 msg=' comm="systemd-update-utmp" exe="/usr/lib/systemd/systemd-update-utmp" hostname=? addr=? terminal=? res=success'
[   68.226706] audit: type=1130 audit(1634378033.044:96): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-update-utmp comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   68.234093] audit: type=1130 audit(1634378033.050:97): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-journal-catalog-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   68.541509] audit: type=1130 audit(1634378033.357:98): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-timesyncd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   68.652864] Linux agpgart interface v0.103
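
To compare boots it may help to reduce each log to one row per watchdog report. A sketch that pulls the CPU number, stall duration, task name and PID out of the "soft lockup" lines; the regular expression is written against the message format seen in the logs above and is not guaranteed for other kernel versions.

```python
import re

# Format as seen above: "... soft lockup - CPU#2 stuck for 26s! [systemd-udevd:477]"
LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)


def parse_lockups(lines):
    """Return one dict per soft lockup report found in `lines`."""
    reports = []
    for line in lines:
        match = LOCKUP.search(line)
        if match:
            cpu, seconds, comm, pid = match.groups()
            reports.append(
                {"cpu": int(cpu), "seconds": int(seconds), "comm": comm, "pid": int(pid)}
            )
    return reports


# Lines copied from the two logs above.
sample = [
    "[   28.125486] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [migration/0:17]",
    "[   28.128813] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [migration/1:23]",
    "[   28.128939] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [swapper/2:0]",
    "[   68.128915] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [systemd-udevd:477]",
]
for report in parse_lockups(sample):
    print(report)
```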

I have installed intel-ucode, made sure it's loaded by rEFInd and it seems to be working:

root@yoga /boot # journalctl -k --grep=microcode
-- Journal begins at Sat 2019-12-21 16:30:34 CET, ends at Sun 2021-10-17 11:36:08 CEST. --
Oct 16 11:52:46 archlinux kernel: microcode: sig=0x806ec, pf=0x4, revision=0xea
Oct 16 11:52:46 archlinux kernel: microcode: Microcode Update Driver: v2.2.

Also used my Windows install to have the latest BIOS version applied.

uname -a

Linux yoga 5.14.9-arch2-1 #1 SMP PREEMPT Fri, 01 Oct 2021 19:03:20 +0000 x86_64 GNU/Linux

/proc/cmdline

\\vmlinuz-linux rd.luks.name=f9f7f469-cb86-4b8f-b513-1979e315d9b9=crypt root=UUID=bae3e4da-c406-49f0-bb23-3bdea0e50a87 rootflags=subvol=/arch/@ initrd=\intel-ucode.img initrd=\initramfs-linux.img

/proc/cpuinfo (only first CPU shown)

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 142
model name	: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
stepping	: 12
microcode	: 0xea
cpu MHz		: 800.019
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust sgx bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
vmx flags	: vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple pml ept_mode_based_exec
bugs		: spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit srbds
bogomips	: 4601.60
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

Please let me know if I need to provide more info!

Last edited by beryllium (2021-10-17 10:12:22)

Offline

#31 2021-10-17 13:03:29

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Looks like https://lore.kernel.org/lkml/CACkBjsZa6 … l.com/T/#u
There's also https://bbs.archlinux.org/viewtopic.php?id=258829 and 5.8 lines up w/ the bisection result in the lkml thread.

Online

#32 2021-10-17 17:37:31

beryllium
Member
Registered: 2021-10-17
Posts: 7

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Thanks for the super fast reply!

But to be honest, I find 'my' backtraces a bit hard to understand.
First one seems to have to do with bringing up SMP, but perhaps it's normal to see SMP/APIC interrupt stuff in these kinds of lockups.
The second one is even more 'abstract' (to me), so I'm not sure whether the lkml post is similar to the issues I'm facing. It may very well be.

That forum post seems to be specific to using NFS, so it's probably unrelated, although perhaps there is some kind of common root cause.

Offline

#33 2021-10-17 21:03:26

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

lkml wrote:

It looks like something broken in the kernel recently and now instead
of diagnosing a stall on one CPU, it diagnoses it as a stall in
smp_call_function on another CPU. This produces a large number of
actionable crash reports.

Sounds as if CPU stalls are diagnosed all over the place, which fits the "pattern" … and makes it pretty hard to pin them down.
From the context and you both having the same HW, my gut suggests blacklisting tpm.
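
One way to try that, sketched under assumptions: take the tpm-related names from a "Modules linked in:" line and build a `module_blacklist=` kernel parameter for a one-off test boot. The post above does not say how to blacklist, so using this parameter is a choice made here, the helper names are illustrative, and whether it helps is untested.

```python
def linked_modules(line):
    """Return the module names from a 'Modules linked in:' log line."""
    _, _, tail = line.partition("Modules linked in:")
    # Strip load-state markers such as "(+)" or "(-)" that follow some names.
    return [name.split("(")[0] for name in tail.split()]


def blacklist_parameter(modules, prefix="tpm"):
    """Build a module_blacklist= parameter for modules named prefix or prefix_*."""
    selected = sorted(
        name for name in set(modules)
        if name == prefix or name.startswith(prefix + "_")
    )
    return "module_blacklist=" + ",".join(selected) if selected else ""


# Line copied from the first log earlier in this thread.
sample = (
    "[   28.125547] Modules linked in: crypto_simd tpm_crb cryptd xhci_pci "
    "xhci_pci_renesas tpm_tis tpm_tis_core i8042 tpm serio rng_core"
)
print(blacklist_parameter(linked_modules(sample)))
# prints: module_blacklist=tpm,tpm_crb,tpm_tis,tpm_tis_core
```

The resulting string would be appended to the kernel command line for a test boot. Anything that depends on the TPM would stop working while it is in place.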

Online

#34 2021-10-18 19:31:26

beryllium
Member
Registered: 2021-10-17
Posts: 7

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Yeah, it's getting really annoying: because of some (probably unrelated) issue, my laptop apparently won't survive a day in standby (or wakes up in the middle of it, not sure yet). So once again I came back to find the battery completely drained.
It then took me 4 boots before I finally managed to get a graphical login.

The boots get stuck at wildly different points though, in order of decreasing frequency:
1) shortly after the LUKS passphrase
2) shortly before the passphrase
3) after I type my login password in LightDM, but before Cinnamon loads (in this case it never seems to continue; I waited 4 minutes. I can switch to the text console and back, and the mouse moves, but it doesn't seem to get unstuck)
4) even before anything on the framebuffer starts to show (i.e. I only see the lines from rEFInd, then black for ~20s, then suddenly lines from systemd start to appear)

The latest crash again seems to point to being busy with ACPI stuff, although based on your comments, perhaps that's just a bogus trace?

[    0.722351] NET: Registered PF_PACKET protocol family
[    1.011272] ata1: SATA link down (SStatus 4 SControl 300)
[    1.637810] tsc: Refined TSC clocksource calibration: 2303.997 MHz
[    1.637814] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2135f518237, max_idle_ns: 440795271980 ns
[    3.499440] random: crng init done
[   28.144445] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [migration/0:17]
[   28.144450] Modules linked in:
[   28.144452] CPU: 0 PID: 17 Comm: migration/0 Not tainted 5.14.12-arch1-1 #1 67368bca17a1c518e2f20656bc1c93aa65e7e6fe
[   28.144455] Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
[   28.144455] Stopper: multi_cpu_stop+0x0/0x110 <- stop_machine_cpuslocked+0x173/0x1c0
[   28.144461] RIP: 0010:stop_machine_yield+0x2/0x10
[   28.144464] Code: 2b 14 25 28 00 00 00 75 0d 4c 8b 65 f8 c9 c3 b8 fe ff ff ff eb e3 e8 8d fb 94 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 90 <c3> 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 41 57 41
[   28.144465] RSP: 0018:ffff9e0640117e70 EFLAGS: 00000246
[   28.144467] RAX: 0000000000000000 RBX: ffff9e0640127d48 RCX: 0000000000000000
[   28.144468] RDX: 0000000000000002 RSI: 0000000000000140 RDI: ffffffff979fafa0
[   28.144469] RBP: ffff9e0640127d6c R08: ffff8cde9061f6f0 R09: 0000000000000004
[   28.144470] R10: 0000000000000092 R11: 0000000000000007 R12: 0000000000000001
[   28.144471] R13: ffffffff979fafa0 R14: 0000000000000000 R15: 0000000000000001
[   28.144472] FS:  0000000000000000(0000) GS:ffff8cde90600000(0000) knlGS:0000000000000000
[   28.144473] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.144475] CR2: 00007fc574b482d4 CR3: 0000000134a10002 CR4: 00000000003706f0
[   28.144476] Call Trace:
[   28.144478]  multi_cpu_stop+0x9b/0x110
[   28.144481]  ? stop_machine_yield+0x10/0x10
[   28.144483]  cpu_stopper_thread+0x90/0x140
[   28.144485]  smpboot_thread_fn+0xd5/0x1c0
[   28.144488]  ? smpboot_register_percpu_thread+0xf0/0xf0
[   28.144491]  kthread+0x12f/0x160
[   28.144492]  ? set_kthread_struct+0x40/0x40
[   28.144494]  ret_from_fork+0x1f/0x30
[   28.144498] fbcon: Taking over console
[   28.147778] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [migration/1:23]
[   28.147782] Modules linked in:
[   28.147784] CPU: 1 PID: 23 Comm: migration/1 Tainted: G             L    5.14.12-arch1-1 #1 67368bca17a1c518e2f20656bc1c93aa65e7e6fe
[   28.147786] Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
[   28.147787] Stopper: multi_cpu_stop+0x0/0x110 <- stop_machine_cpuslocked+0x173/0x1c0
[   28.147791] RIP: 0010:stop_machine_yield+0x0/0x10
[   28.147794] Code: 65 48 2b 14 25 28 00 00 00 75 0d 4c 8b 65 f8 c9 c3 b8 fe ff ff ff eb e3 e8 8d fb 94 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 <f3> 90 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 41
[   28.147795] RSP: 0000:ffff9e064017fe70 EFLAGS: 00000246
[   28.147796] RAX: 0000000000000000 RBX: ffff9e0640127d48 RCX: 0000000000000000
[   28.147798] RDX: 0000000000000002 RSI: 0000000000000140 RDI: ffffffff979fafa0
[   28.147799] RBP: ffff9e0640127d6c R08: ffff8cde9065f6f0 R09: 0000000000000000
[   28.147800] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
[   28.147800] R13: ffffffff979fafa0 R14: 0000000000000000 R15: 0000000000000001
[   28.147801] FS:  0000000000000000(0000) GS:ffff8cde90640000(0000) knlGS:0000000000000000
[   28.147803] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.147804] CR2: 0000000000000000 CR3: 0000000134a10001 CR4: 00000000003706e0
[   28.147805] Call Trace:
[   28.147806]  multi_cpu_stop+0x9b/0x110
[   28.147809]  ? stop_machine_yield+0x10/0x10
[   28.147811]  cpu_stopper_thread+0x90/0x140
[   28.147813]  smpboot_thread_fn+0xd5/0x1c0
[   28.147816]  ? smpboot_register_percpu_thread+0xf0/0xf0
[   28.147818]  kthread+0x12f/0x160
[   28.147819]  ? set_kthread_struct+0x40/0x40
[   28.147821]  ret_from_fork+0x1f/0x30
[   28.147928] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u16:1:9]
[   28.147932] Modules linked in:
[   28.147933] CPU: 2 PID: 9 Comm: kworker/u16:1 Tainted: G             L    5.14.12-arch1-1 #1 67368bca17a1c518e2f20656bc1c93aa65e7e6fe
[   28.147935] Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
[   28.147936] Workqueue: events_unbound async_run_entry_fn
[   28.147939] RIP: 0010:__do_softirq+0x79/0x2b4
[   28.147943] Code: 81 67 2c ff f7 ff ff be 00 01 00 00 e8 60 23 2d ff c7 44 24 10 0a 00 00 00 65 66 c7 05 4e ce 42 69 00 00 fb 66 0f 1f 44 00 00 <b8> ff ff ff ff 49 c7 c3 c0 60 80 97 41 0f bc c7 89 c5 83 c5 01 74
[   28.147944] RSP: 0000:ffff9e06401e0fa0 EFLAGS: 00000246
[   28.147946] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000002a97981e
[   28.147947] RDX: 0000000000000024 RSI: 000000002a97981e RDI: ffffffffffcd259a
[   28.147948] RBP: ffff9e06400d79e8 R08: 000000002a979842 R09: ffff8cdb40b500c0
[   28.147949] R10: 0000000000000280 R11: 0000000000000003 R12: 0000000000000000
[   28.147950] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000280
[   28.147951] FS:  0000000000000000(0000) GS:ffff8cde90680000(0000) knlGS:0000000000000000
[   28.147952] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.147953] CR2: 0000000000000000 CR3: 0000000134a10001 CR4: 00000000003706e0
[   28.147954] Call Trace:
[   28.147955]  <IRQ>
[   28.147957]  irq_exit_rcu+0xa9/0xc0
[   28.147959]  sysvec_apic_timer_interrupt+0x72/0x90
[   28.147962]  </IRQ>
[   28.147963]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[   28.147965] RIP: 0010:acpi_ut_update_object_reference+0x11e/0x210
[   28.147969] Code: de e8 74 f7 ff ff 4c 89 ef eb e7 48 8b 55 18 44 89 e8 48 8b 3c c2 48 85 ff 75 0b 41 ff c5 44 39 6d 2c 77 e7 eb 27 0f b6 47 09 <ff> c8 83 f8 02 77 09 89 de e8 42 f7 ff ff eb e1 48 89 e2 89 de e8
[   28.147970] RSP: 0000:ffff9e06400d7a98 EFLAGS: 00000282
[   28.147971] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000040
[   28.147972] RDX: ffff8cdb41869880 RSI: 0000000000000000 RDI: ffff8cdb4189dca8
[   28.147973] RBP: ffff8cdb4189d2d0 R08: 0000000000000000 R09: 0000000000000000
[   28.147974] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   28.147975] R13: 0000000000000000 R14: ffff8cdb40e18800 R15: 0000000000000000
[   28.147978]  acpi_ex_resolve_node_to_value+0x2a4/0x4cf
[   28.147985]  acpi_ex_resolve_to_value+0x3ba/0x468
[   28.147986]  acpi_ds_evaluate_name_path+0xb5/0x16e
[   28.147989]  ? acpi_ut_trace_ptr+0x25/0x66
[   28.147991]  acpi_ds_exec_end_op+0xcc/0x6ff
[   28.147993]  acpi_ps_parse_loop+0x7eb/0x8c3
[   28.147995]  acpi_ps_parse_aml+0x1aa/0x547
[   28.147997]  acpi_ps_execute_method+0x203/0x2bf
[   28.147999]  acpi_ns_evaluate+0x34a/0x4e7
[   28.148001]  acpi_evaluate_object+0x184/0x3ac
[   28.148003]  acpi_battery_get_state+0x93/0x230
[   28.148006]  acpi_battery_update+0x64/0x2b0
[   28.148008]  acpi_battery_add+0xc8/0x120
[   28.148010]  acpi_device_probe+0x44/0x160
[   28.148013]  really_probe+0x1f2/0x3f0
[   28.148017]  __driver_probe_device+0xfe/0x180
[   28.148019]  driver_probe_device+0x1e/0x90
[   28.148021]  __driver_attach+0xc0/0x1c0
[   28.148024]  ? __device_attach_driver+0xe0/0xe0
[   28.148026]  ? __device_attach_driver+0xe0/0xe0
[   28.148028]  bus_for_each_dev+0x86/0xd0
[   28.148031]  bus_add_driver+0x12b/0x1e0
[   28.148033]  driver_register+0x8f/0xe0
[   28.148035]  acpi_battery_init_async+0x26/0x5e
[   28.148038]  async_run_entry_fn+0x2d/0x130
[   28.148040]  process_one_work+0x1e0/0x3b0
[   28.148042]  worker_thread+0x50/0x3b0
[   28.148044]  ? process_one_work+0x3b0/0x3b0
[   28.148046]  kthread+0x12f/0x160
[   28.148047]  ? set_kthread_struct+0x40/0x40
[   28.148048]  ret_from_fork+0x1f/0x30
[   28.148296] clocksource: Switched to clocksource tsc
[   28.148332] Console: switching to colour frame buffer device 240x67
[   28.150268] microcode: sig=0x806ec, pf=0x4, revision=0xea
[   28.150382] microcode: Microcode Update Driver: v2.2.
[   28.150387] IPI shorthand broadcast: enabled
[   28.150399] sched_clock: Marking stable (28141195023, 8971306)->(28124997901, 25168428)
[   28.151034] registered taskstats version 1
[   28.154140] Loading compiled-in X.509 certificates
[   28.158259] Loaded X.509 cert 'Build time autogenerated kernel key: 1d8db0e7a2fe4f8548eed7c91ad7c927176169f6'
[   28.158870] zswap: loaded using pool lz4/z3fold
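
To see at a glance what each CPU was executing when the watchdog fired, the kernel-mode RIP lines can be grouped per report. A sketch, assuming kernel-mode frames show up as `RIP: 0010:<symbol>+0x...` as they do in the trace above. When several CPUs print at the same time the lines can interleave, so the grouping is only a first approximation.

```python
import re

LOCKUP = re.compile(r"soft lockup - CPU#(\d+)")
# Kernel-mode frames carry the 0010 code segment; symbols may contain dots.
KERNEL_RIP = re.compile(r"RIP: 0010:([A-Za-z0-9_.]+)\+0x")


def rip_symbols_by_report(lines):
    """Return [(cpu, [symbols...])], attaching each RIP to the latest report."""
    reports = []
    for line in lines:
        lockup = LOCKUP.search(line)
        if lockup:
            reports.append((int(lockup.group(1)), []))
            continue
        rip = KERNEL_RIP.search(line)
        if rip and reports:
            reports[-1][1].append(rip.group(1))
    return reports


# Lines copied from the trace above.
sample = [
    "[   28.144445] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [migration/0:17]",
    "[   28.144461] RIP: 0010:stop_machine_yield+0x2/0x10",
    "[   28.147928] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u16:1:9]",
    "[   28.147939] RIP: 0010:__do_softirq+0x79/0x2b4",
    "[   28.147965] RIP: 0010:acpi_ut_update_object_reference+0x11e/0x210",
]
for cpu, symbols in rip_symbols_by_report(sample):
    print(cpu, symbols)
```

The first symbol in each list is where the timer interrupt that triggered the watchdog report was being handled; later ones are the interrupted contexts, which is where the ACPI battery code shows up for CPU#2.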

It's followed by another stall shortly after:

[   45.170803] Adding 20971516k swap on /swap/swapfile.  Priority:-2 extents:1 across:20971516k FS
[   45.188360] loop0: detected capacity change from 0 to 112528
[   45.188362] loop2: detected capacity change from 0 to 50616
[   45.188363] loop1: detected capacity change from 0 to 111904
[   45.191847] loop3: detected capacity change from 0 to 55384
[   45.213254] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[   45.436952] EXT4-fs (nvme0n1p7): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[   45.467554] FAT-fs (nvme0n1p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[   45.583386] acpi PNP0C14:01: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[   45.583730] acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[   45.583804] acpi PNP0C14:04: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[   45.718678] input: Ideapad extra buttons as /devices/pci0000:00/0000:00:1f.0/PNP0C09:00/VPC2004:00/input/input8
[   45.781693] ideapad_acpi VPC2004:00: DYTC interface is not available
[   72.147860] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [systemd-udevd:485]
[   72.148276] Modules linked in: syscopyarea(+) intel_ishtp(+) intel_pch_thermal(+) processor_thermal_rapl ideapad_laptop sysfillrect intel_rapl_common platform_profile sysimgblt i2c_hid_acpi intel_soc_dts_iosf fb_sys_fops sparse_keymap pcc_cpufreq(-) roles mac_hid fjes(-) i2c_hid rfkill wmi int3400_thermal acpi_thermal_rel int3403_thermal int340x_thermal_zone video acpi_tad acpi_pad vfat fat ext4 crc16 mbcache jbd2 squashfs loop crypto_user drm fuse ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq dm_crypt cbc encrypted_keys trusted asn1_encoder tee usbhid dm_mod serio_raw atkbd libps2 crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd tpm_crb xhci_pci xhci_pci_renesas tpm_tis i8042 tpm_tis_core tpm serio rng_core
[   72.148303] CPU: 2 PID: 485 Comm: systemd-udevd Tainted: G             L    5.14.12-arch1-1 #1 67368bca17a1c518e2f20656bc1c93aa65e7e6fe
[   72.148305] Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
[   72.148306] RIP: 0010:__do_softirq+0x79/0x2b4
[   72.148310] Code: 81 67 2c ff f7 ff ff be 00 01 00 00 e8 60 23 2d ff c7 44 24 10 0a 00 00 00 65 66 c7 05 4e ce 42 69 00 00 fb 66 0f 1f 44 00 00 <b8> ff ff ff ff 49 c7 c3 c0 60 80 97 41 0f bc c7 89 c5 83 c5 01 74
[   72.148311] RSP: 0018:ffff9e06401e0fa0 EFLAGS: 00000246
[   72.148313] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000001f
[   72.148313] RDX: 000000000000002c RSI: 00000000378e3da1 RDI: fffffffd9b3f1d76
[   72.148314] RBP: ffff9e0641ee35b8 R08: 0000000aa765824e R09: 7fffffffffffffff
[   72.148315] R10: 0000000aa6d70200 R11: 0000000aa6dc941c R12: 0000000000000000
[   72.148316] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000200
[   72.148316] FS:  00007f3f33d96a40(0000) GS:ffff8cde90680000(0000) knlGS:0000000000000000
[   72.148318] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   72.148318] CR2: 00007f3f3322a44b CR3: 0000000104212004 CR4: 00000000003706e0
[   72.148319] Call Trace:
[   72.148320]  <IRQ>
[   72.148321]  irq_exit_rcu+0xa9/0xc0
[   72.148324]  sysvec_apic_timer_interrupt+0x72/0x90
[   72.148327]  </IRQ>
[   72.148327]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[   72.148329] RIP: 0010:finish_task_switch.isra.0+0xb1/0x290
[   72.148331] Code: 08 48 89 da 48 8b 1b 4c 89 ff 48 c7 02 00 00 00 00 ff d0 0f 1f 00 48 85 db 75 e2 4c 89 ff e8 a6 fd ff ff fb 66 0f 1f 44 00 00 <65> 48 8b 04 25 c0 7b 01 00 0f 1f 44 00 00 4d 85 ed 74 21 65 48 8b
[   72.148332] RSP: 0018:ffff9e0641ee3660 EFLAGS: 00000282
[   72.148333] RAX: ffff8cdb40ce6000 RBX: 0000000000000000 RCX: 0000000000000002
[   72.148334] RDX: 0000000000000000 RSI: 0000000000000005 RDI: ffff8cde906ad700
[   72.148334] RBP: ffff9e0641ee3688 R08: 0000000000000002 R09: 0000000000000000
[   72.148335] R10: 0000000000000000 R11: 0000000000000002 R12: ffff8cdb40ce6000
[   72.148335] R13: ffff8cdb449b9540 R14: 0000000000000001 R15: ffff8cde906ad700
[   72.148337]  __schedule+0x33b/0x1530
[   72.148340]  ? btrfs_verify_level_key+0xca/0x110 [btrfs 8e400393aeea02ff06606243150fa6b07ebc3d4e]
[   72.148364]  preempt_schedule_irq+0x41/0x60
[   72.148365]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[   72.148367] RIP: 0010:btrfs_search_slot+0x815/0x990 [btrfs]
[   72.148381] Code: 8b 54 24 10 8b 44 24 0c 85 d2 0f 84 37 fd ff ff 41 83 46 40 01 e9 2d fd ff ff 4c 89 e7 e8 83 b2 ff ff 49 89 c6 e9 ef f8 ff ff <4c> 8b 24 24 44 8b 44 24 4c 4d 89 ef e9 94 f8 ff ff 4d 89 ef 45 89
[   72.148382] RSP: 0018:ffff9e0641ee3808 EFLAGS: 00000246
[   72.148383] RAX: 00000000fffffff5 RBX: 0000000000000001 RCX: 0000000000000003
[   72.148384] RDX: 0000000000000004 RSI: ffff9e0641ee36c7 RDI: 0000000000000000
[   72.148384] RBP: 0000000000000000 R08: 0000000000001000 R09: ffff8cde906ae1e8
[   72.148385] R10: 0000000001325cbb R11: 0000000000000000 R12: 0000000000000001
[   72.148386] R13: ffff8cdb8b92ccb0 R14: ffff8cdb8b92ccb4 R15: 00000000000000dd
[   72.148388]  ? btrfs_search_slot+0x22a/0x990 [btrfs 8e400393aeea02ff06606243150fa6b07ebc3d4e]
[   72.148402]  btrfs_lookup_csum+0x73/0x150 [btrfs 8e400393aeea02ff06606243150fa6b07ebc3d4e]
[   72.148418]  btrfs_lookup_bio_sums+0x22c/0x530 [btrfs 8e400393aeea02ff06606243150fa6b07ebc3d4e]
[   72.148434]  btrfs_submit_data_bio+0x10c/0x210 [btrfs 8e400393aeea02ff06606243150fa6b07ebc3d4e]
[   72.148451]  submit_one_bio+0x44/0x70 [btrfs 8e400393aeea02ff06606243150fa6b07ebc3d4e]
[   72.148471]  extent_readahead+0x3c0/0x3f0 [btrfs 8e400393aeea02ff06606243150fa6b07ebc3d4e]
[   72.148491]  read_pages+0xb7/0x2b0
[   72.148494]  page_cache_ra_unbounded+0x1a0/0x210
[   72.148497]  filemap_get_pages+0x106/0x600
[   72.148499]  filemap_read+0xb9/0x350
[   72.148501]  new_sync_read+0x14f/0x1e0
[   72.148503]  vfs_read+0xf3/0x180
[   72.148505]  ksys_read+0x67/0xe0
[   72.148506]  do_syscall_64+0x59/0x80
[   72.148508]  ? syscall_exit_to_user_mode+0x23/0x40
[   72.148509]  ? do_syscall_64+0x69/0x80
[   72.148510]  ? syscall_exit_to_user_mode+0x23/0x40
[   72.148511]  ? do_syscall_64+0x69/0x80
[   72.148513]  ? irqtime_account_irq+0x38/0xb0
[   72.148514]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   72.148516] RIP: 0033:0x7f3f347ab762
[   72.148517] Code: 48 8b 15 69 98 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb ba 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[   72.148518] RSP: 002b:00007ffdbd591c08 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   72.148519] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3f347ab762
[   72.148520] RDX: 0000000000000006 RSI: 00007ffdbd591c40 RDI: 000000000000000f
[   72.148521] RBP: 00007ffdbd591c40 R08: 000055ca4fb8e100 R09: 00007f3f3478da60
[   72.148521] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000f
[   72.148522] R13: 00007ffdbd591c40 R14: 0000000000000006 R15: 000055ca4faa59b0
[   72.149009] clocksource: timekeeping watchdog on CPU2: Marking clocksource 'tsc' as unstable because the skew is too large:
[   72.149011] clocksource:                       'acpi_pm' wd_nsec: 0 wd_now: 94de67 wd_last: aceb18 mask: ffffff
[   72.149012] clocksource:                       'tsc' cs_nsec: 22994562358 cs_now: 2c3478b1e4 cs_last: 1fdea722c4 mask: ffffffffffffffff
[   72.149014] clocksource:                       'tsc' is current clocksource.
[   72.149227] tsc: Marking TSC unstable due to clocksource watchdog
[   72.149447] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[   72.149447] sched_clock: Marking unstable (72140478594, 8969952)<-(72124278621, 25168428)
[   72.150365] clocksource: Checking clocksource tsc synchronization from CPU 5 to CPUs 0-1,4.
[   72.150564] clocksource: Switched to clocksource acpi_pm
[   72.169536] kauditd_printk_skb: 80 callbacks suppressed
[   72.169537] audit: type=1130 audit(1634583294.971:94): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=dbus comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   72.172631] audit: type=1334 audit(1634583294.975:95): prog-id=30 op=LOAD
[   72.173175] audit: type=1334 audit(1634583294.975:96): prog-id=31 op=LOAD
[   72.173717] audit: type=1334 audit(1634583294.975:97): prog-id=32 op=LOAD
[   72.209221] Linux agpgart interface v0.103
[   72.221188] audit: type=1130 audit(1634583295.021:98): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-logind comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   72.236946] audit: type=1130 audit(1634583295.038:99): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=NetworkManager comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   72.261801] audit: type=1130 audit(1634583295.065:100): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sshd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   72.265737] audit: type=1334 audit(1634583295.068:101): prog-id=33 op=LOAD
[   72.266181] audit: type=1334 audit(1634583295.068:102): prog-id=34 op=LOAD
[   72.266618] audit: type=1334 audit(1634583295.068:103): prog-id=35 op=LOAD

(The btrfs stuff is probably because I indeed uncleanly aborted the last stuck boot.)

In the boot before that, I saw another line scroll by: intel_ish_ipc 0000:00:13.0: [ishtp-ish]: Timed out waiting for FW-initiated reset
This apparently also led to an OOPS.

Full context below:

Oct 18 20:51:50 yoga kernel: intel-lpss 0000:00:19.0: enabling device (0004 -> 0006)
Oct 18 20:51:50 yoga kernel: mei_me 0000:00:16.0: enabling device (0000 -> 0002)
Oct 18 20:51:50 yoga kernel: cfg80211: Loading compiled-in X.509 certificates for regulatory database
Oct 18 20:51:50 yoga kernel: cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
Oct 18 20:51:50 yoga kernel: platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
Oct 18 20:51:50 yoga kernel: cfg80211: failed to load regulatory.db
Oct 18 20:51:59 yoga kernel: intel_ish_ipc 0000:00:13.0: [ishtp-ish]: Timed out waiting for FW-initiated reset
Oct 18 20:51:59 yoga kernel: intel_ish_ipc 0000:00:13.0: ISH: hw start failed.
Oct 18 20:52:14 yoga kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:111]
Oct 18 20:52:14 yoga kernel: Modules linked in: snd_timer(+) typec_ucsi(+) intel_spi(+) processor_thermal_device_pci_legacy(+) drm_kms_helper(+) fjes(-) cfg80211 snd processor_thermal_device typec spi_nor cec processor_thermal_rfim mei_me i2c_i801 ideapad_laptop platform_profile int3400_thermal sparse_keymap roles>
Oct 18 20:52:14 yoga kernel:  xhci_pci_renesas tpm_tis i8042 tpm_tis_core serio tpm rng_core
Oct 18 20:52:14 yoga kernel: CPU: 2 PID: 111 Comm: kworker/2:1 Not tainted 5.14.12-arch1-1 #1 67368bca17a1c518e2f20656bc1c93aa65e7e6fe
Oct 18 20:52:14 yoga kernel: Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
Oct 18 20:52:14 yoga kernel: Workqueue: events fw_reset_work_fn [intel_ish_ipc]
Oct 18 20:52:14 yoga kernel: RIP: 0010:__do_softirq+0x79/0x2b4
Oct 18 20:52:14 yoga kernel: Code: 81 67 2c ff f7 ff ff be 00 01 00 00 e8 60 23 2d ff c7 44 24 10 0a 00 00 00 65 66 c7 05 4e ce 02 7e 00 00 fb 66 0f 1f 44 00 00 <b8> ff ff ff ff 49 c7 c3 c0 60 c0 82 41 0f bc c7 89 c5 83 c5 01 74
Oct 18 20:52:14 yoga kernel: RSP: 0000:ffffbb3e801e0fa0 EFLAGS: 00000246
Oct 18 20:52:14 yoga kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000001f
Oct 18 20:52:14 yoga kernel: RDX: 000000000000006c RSI: 00000000378e3a78 RDI: fffffffe1f534c7d
Oct 18 20:52:14 yoga kernel: RBP: ffffbb3e8083bc78 R08: 00000002dd99eaa3 R09: 7fffffffffffffff
Oct 18 20:52:14 yoga kernel: R10: 00000002dd0c0200 R11: 00000002dd0f249e R12: 0000000000000000
Oct 18 20:52:14 yoga kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000200
Oct 18 20:52:14 yoga kernel: FS:  0000000000000000(0000) GS:ffff8ef310680000(0000) knlGS:0000000000000000
Oct 18 20:52:14 yoga kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 18 20:52:14 yoga kernel: CR2: 000055af0caf6030 CR3: 0000000308210006 CR4: 00000000003706e0
Oct 18 20:52:14 yoga kernel: Call Trace:
Oct 18 20:52:14 yoga kernel:  <IRQ>
Oct 18 20:52:14 yoga kernel:  irq_exit_rcu+0xa9/0xc0
Oct 18 20:52:14 yoga kernel:  sysvec_apic_timer_interrupt+0x72/0x90
Oct 18 20:52:14 yoga kernel:  </IRQ>
Oct 18 20:52:14 yoga kernel:  asm_sysvec_apic_timer_interrupt+0x12/0x20
Oct 18 20:52:14 yoga kernel: RIP: 0010:_raw_spin_unlock_irqrestore+0x25/0x30
Oct 18 20:52:14 yoga kernel: Code: 00 00 00 00 00 0f 1f 44 00 00 c6 07 00 0f 1f 40 00 f7 c6 00 02 00 00 75 0a 65 ff 0d 75 ba 34 7e 74 0a c3 fb 66 0f 1f 44 00 00 <eb> ed e8 4b 85 53 ff c3 0f 1f 00 0f 1f 44 00 00 65 ff 05 54 ba 34
Oct 18 20:52:14 yoga kernel: RSP: 0000:ffffbb3e8083bd28 EFLAGS: 00000206
Oct 18 20:52:14 yoga kernel: RAX: 0000000000000001 RBX: ffff8eefc3d64028 RCX: ffff8eefc3d66170
Oct 18 20:52:14 yoga kernel: RDX: ffff8eefc3a29a40 RSI: 0000000000000297 RDI: ffff8eefc3d66180
Oct 18 20:52:14 yoga kernel: RBP: 0000000000000297 R08: ffff8eefc3a29a40 R09: ffff8eefc3a29a40
Oct 18 20:52:14 yoga kernel: R10: ffffbb3e8083bdd4 R11: 0000000000000002 R12: ffff8eefc3d66180
Oct 18 20:52:14 yoga kernel: R13: ffff8eefc3a29e40 R14: 0000000000000000 R15: 0000000000000000
Oct 18 20:52:14 yoga kernel:  write_ipc_from_queue.isra.0+0x1b3/0x280 [intel_ish_ipc 1c95de89f29e9cbb530b0f479cf1b975f88fd747]
Oct 18 20:52:14 yoga kernel:  write_ipc_to_queue+0x119/0x180 [intel_ish_ipc 1c95de89f29e9cbb530b0f479cf1b975f88fd747]
Oct 18 20:52:14 yoga kernel:  ipc_send_mng_msg+0xa4/0x130 [intel_ish_ipc 1c95de89f29e9cbb530b0f479cf1b975f88fd747]
Oct 18 20:52:14 yoga kernel:  fw_reset_work_fn+0x10d/0x240 [intel_ish_ipc 1c95de89f29e9cbb530b0f479cf1b975f88fd747]
Oct 18 20:52:14 yoga kernel:  process_one_work+0x1e0/0x3b0
Oct 18 20:52:14 yoga kernel:  worker_thread+0x50/0x3b0
Oct 18 20:52:14 yoga kernel:  ? process_one_work+0x3b0/0x3b0
Oct 18 20:52:14 yoga kernel:  kthread+0x12f/0x160
Oct 18 20:52:14 yoga kernel:  ? set_kthread_struct+0x40/0x40
Oct 18 20:52:14 yoga kernel:  ret_from_fork+0x1f/0x30
Oct 18 20:52:14 yoga kernel: clocksource: timekeeping watchdog on CPU2: Marking clocksource 'tsc' as unstable because the skew is too large:
Oct 18 20:52:14 yoga kernel: clocksource:                       'acpi_pm' wd_nsec: 0 wd_now: 701b5e wd_last: 8821a6 mask: ffffff
Oct 18 20:52:14 yoga kernel: clocksource:                       'tsc' cs_nsec: 22995001941 cs_now: 17bac8aa44 cs_last: b64e6fc5e mask: ffffffffffffffff
Oct 18 20:52:14 yoga kernel: clocksource:                       'tsc' is current clocksource.
Oct 18 20:52:14 yoga kernel: tsc: Marking TSC unstable due to clocksource watchdog
Oct 18 20:52:14 yoga kernel: BUG: unable to handle page fault for address: ffffbb3e80384034
Oct 18 20:52:14 yoga kernel: #PF: supervisor read access in kernel mode
Oct 18 20:52:14 yoga kernel: #PF: error_code(0x0000) - not-present page
Oct 18 20:52:14 yoga kernel: PGD 100000067 P4D 100000067 PUD 1001b7067 PMD 100dce067 PTE 0
Oct 18 20:52:14 yoga kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Oct 18 20:52:14 yoga kernel: CPU: 2 PID: 111 Comm: kworker/2:1 Tainted: G             L    5.14.12-arch1-1 #1 67368bca17a1c518e2f20656bc1c93aa65e7e6fe
Oct 18 20:52:14 yoga kernel: Hardware name: LENOVO 81TC/LNVNB161216, BIOS BNCN43WW 05/21/2021
Oct 18 20:52:14 yoga kernel: Workqueue: events fw_reset_work_fn [intel_ish_ipc]
Oct 18 20:52:14 yoga kernel: RIP: 0010:fw_reset_work_fn+0x11d/0x240 [intel_ish_ipc]
Oct 18 20:52:14 yoga kernel: Code: 89 41 38 b9 04 00 00 00 48 8d 54 24 04 be 04 00 00 00 48 89 ef e8 b3 fb ff ff bb d0 07 00 00 48 8b 85 40 22 00 00 48 83 c0 34 <8b> 00 83 e0 03 83 f8 03 41 0f 95 c4 75 6e 48 8b 85 40 22 00 00 48
Oct 18 20:52:14 yoga kernel: RSP: 0000:ffffbb3e8083be68 EFLAGS: 00010282
Oct 18 20:52:14 yoga kernel: RAX: ffffbb3e80384034 RBX: 00000000000007d0 RCX: ffff8eefc3d66170
Oct 18 20:52:14 yoga kernel: RDX: 0000000000000000 RSI: 0000000000000297 RDI: ffff8eefc3d66180
Oct 18 20:52:14 yoga kernel: RBP: ffff8eefc3d64028 R08: ffff8eefc3a29a40 R09: ffff8eefc3a29a40
Oct 18 20:52:14 yoga kernel: R10: ffffbb3e8083bdd4 R11: 0000000000000002 R12: ffff8eefc3d66180
Oct 18 20:52:14 yoga kernel: R13: ffff8ef3106b2900 R14: 0000000000000000 R15: 0000000000000000
Oct 18 20:52:14 yoga kernel: FS:  0000000000000000(0000) GS:ffff8ef310680000(0000) knlGS:0000000000000000
Oct 18 20:52:14 yoga kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 18 20:52:14 yoga kernel: CR2: ffffbb3e80384034 CR3: 0000000308210006 CR4: 00000000003706e0
Oct 18 20:52:14 yoga kernel: Call Trace:
Oct 18 20:52:14 yoga kernel:  process_one_work+0x1e0/0x3b0
Oct 18 20:52:14 yoga kernel:  worker_thread+0x50/0x3b0
Oct 18 20:52:14 yoga kernel:  ? process_one_work+0x3b0/0x3b0
Oct 18 20:52:14 yoga kernel:  kthread+0x12f/0x160
Oct 18 20:52:14 yoga kernel:  ? set_kthread_struct+0x40/0x40
Oct 18 20:52:14 yoga kernel:  ret_from_fork+0x1f/0x30
Oct 18 20:52:14 yoga kernel: Modules linked in: snd_timer(+) typec_ucsi(+) intel_spi(+) processor_thermal_device_pci_legacy(+) drm_kms_helper(+) fjes(-) cfg80211 snd processor_thermal_device typec spi_nor cec processor_thermal_rfim mei_me i2c_i801 ideapad_laptop platform_profile int3400_thermal sparse_keymap roles>
Oct 18 20:52:14 yoga kernel:  xhci_pci_renesas tpm_tis i8042 tpm_tis_core serio tpm rng_core
Oct 18 20:52:14 yoga kernel: CR2: ffffbb3e80384034
Oct 18 20:52:14 yoga kernel: ---[ end trace 5d4ee3906635d056 ]---
Oct 18 20:52:14 yoga kernel: RIP: 0010:fw_reset_work_fn+0x11d/0x240 [intel_ish_ipc]
Oct 18 20:52:14 yoga kernel: Code: 89 41 38 b9 04 00 00 00 48 8d 54 24 04 be 04 00 00 00 48 89 ef e8 b3 fb ff ff bb d0 07 00 00 48 8b 85 40 22 00 00 48 83 c0 34 <8b> 00 83 e0 03 83 f8 03 41 0f 95 c4 75 6e 48 8b 85 40 22 00 00 48
Oct 18 20:52:14 yoga kernel: RSP: 0000:ffffbb3e8083be68 EFLAGS: 00010282
Oct 18 20:52:14 yoga kernel: RAX: ffffbb3e80384034 RBX: 00000000000007d0 RCX: ffff8eefc3d66170
Oct 18 20:52:14 yoga kernel: RDX: 0000000000000000 RSI: 0000000000000297 RDI: ffff8eefc3d66180
Oct 18 20:52:14 yoga kernel: RBP: ffff8eefc3d64028 R08: ffff8eefc3a29a40 R09: ffff8eefc3a29a40
Oct 18 20:52:14 yoga kernel: R10: ffffbb3e8083bdd4 R11: 0000000000000002 R12: ffff8eefc3d66180
Oct 18 20:52:14 yoga kernel: R13: ffff8ef3106b2900 R14: 0000000000000000 R15: 0000000000000000
Oct 18 20:52:14 yoga kernel: FS:  0000000000000000(0000) GS:ffff8ef310680000(0000) knlGS:0000000000000000
Oct 18 20:52:14 yoga kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 18 20:52:14 yoga kernel: CR2: ffffbb3e80384034 CR3: 0000000308210006 CR4: 00000000003706e0
Oct 18 20:52:14 yoga kernel: intel_rapl_common: Found RAPL domain package
Oct 18 20:52:14 yoga kernel: intel-spi 0000:00:1f.5: w25q128 (16384 Kbytes)
Oct 18 20:52:14 yoga kernel: intel_rapl_common: Found RAPL domain dram
Oct 18 20:52:14 yoga kernel: Creating 1 MTD partitions on "0000:00:1f.5":
Oct 18 20:52:14 yoga kernel: 0x000000000000-0x000001000000 : "BIOS"
Oct 18 20:52:14 yoga systemd[1]: Started Network Time Synchronization.
Oct 18 20:52:14 yoga audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-timesyncd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 18 20:52:14 yoga systemd[1]: Reached target System Initialization.

There was another boot where I got I/O errors being reported on my SSD (haven't seen any of these before), but unfortunately, those logs weren't saved (which seems to make sense), and I forgot to take a picture.

There are also all kinds of warnings about the TSC being wrong (and that being detected and 'fixed'), not sure if that's harmless.

I'll google a bit on disabling TPM to see if that helps, thanks for the tip!

Offline

#35 2021-10-18 20:14:09

beryllium
Member
Registered: 2021-10-17
Posts: 7

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Disabling TPM doesn't seem to help :(

I disabled the TPM in my BIOS (by disabling the Intel Platform Trust Technology option).
The first boot worked like a charm.

Second and third boots failed just as before with that same "intel_ish_ipc 0000:00:13.0: [ishtp-ish]: Timed out waiting for FW-initiated reset" message.
It didn't continue to boot at all, Caps Lock stopped working, couldn't reboot with Ctrl-Alt-Del.

Fourth boot worked like a charm, but it seems that's just one of the lucky boots I tend to get...

I suppose I need to dig into that intel_ish_ipc stuff, but I still need to get some actual work done, so that's for tomorrow. Any tips are appreciated though :)

Offline

#36 2021-10-18 21:08:37

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

my laptop apparently won't survive a day in standby (or wakes up in the middle of it, not sure yet)

Also used my Windows install to have the latest BIOS version applied.

3rd link below…

I disabled the TPM in my BIOS

Is the tpm module still loaded ("lsmod | grep -i tpm")?

Online

#37 2021-10-18 23:22:42

halogene
Member
Registered: 2013-05-29
Posts: 47

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Oh, something happened on this thread! Sorry for getting back to this so late. I tried blacklisting tpm, but it still gets loaded because it is part of a dependency chain (dm_crypt -> encrypted-keys -> trusted -> tpm). Is it safe to blacklist it using "install tpm /bin/true" in /etc/modprobe.d/blacklist.conf, even though other modules depend on it? Just asking, as I would like to avoid having to USB-boot and chroot again...

@beryllium: you can install linux-lts; so far that brings the system up nicely for me (only with the LTS kernel, though).

Offline

#38 2021-10-19 06:01:38

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Do you use the TPM to store the keys?
https://wiki.archlinux.org/title/Truste … _with_LUKS

You'll have the module in the initramfs, however, so it'll load from there, AND:
If you have two kernels and the tpm module is in the initramfs of the LTS kernel (and without a modprobe option to void it), you'll be able to boot that kernel no matter what.
(Well, hopefully - better safe than sorry and have a grml key on your keyring ;-)

Online

#39 2021-10-19 12:30:34

halogene
Member
Registered: 2013-05-29
Posts: 47

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

I did not configure TPM to store the keys; in fact, TPM was disabled in BIOS / UEFI from the very start of the installation. The tpm module gets loaded because of the "encrypt" hook in mkinitcpio.conf, which loads dm-crypt and, as a consequence of the dependency chain, the tpm module as well.

Having said that, I apologize for not being quite as acquainted with the kernel loading process as I maybe should be. As far as I understand, the generation of the initramfs is controlled by mkinitcpio.conf. I have only specified ext4 as "module" in mkinitcpio.conf; however, the tpm module will be loaded as a dependency of the "encrypt" hook, so I gather the tpm module will be included in the initramfs of both the LTS and the current kernel. If I am now faking the loading of the tpm module via "install tpm /bin/true" in a modprobe.d configuration file, then this would, to my understanding, affect both kernel versions. So my question remains whether the system can even boot up if I prevent the tpm module from being loaded, given that other modules that I require depend on it?

Offline

#40 2021-10-19 14:27:38

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

The tpm module gets loaded because of the "encrypt" hook in mkinitcpio.conf

The point was that if you used the TPM to store the key, tpm would not be optional ;-)

If I am now faking the loading of the tpm module via "install tpm /bin/true" in a modprobe.d configuration file, then this would, to my understanding, affect both kernel versions.

Yes and no.
The initramfs will have copies of the modules and configurations that are loaded before the system root is mounted, so if the initramfs is what loads the tpm module, whatever changes you make to the configuration on the root partition are irrelevant (unless you reload the module) - this is what allows you to decrypt the root partition in the first place.
The configuration on the root partition gets picked up by mkinitcpio, but you do not have to re-create the initramfs for all installed kernels and can just update it for a selected kernel.

I don't expect the encryption module to fail if tpm isn't loaded (tpm is an optional kernel feature), but can't guarantee it either. It could just bail if a function cannot be resolved.
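
A minimal sketch of how one might check which initramfs actually carries the tpm module, and rebuild only one of them. It assumes the default Arch image paths (/boot/initramfs-linux.img and /boot/initramfs-linux-lts.img) and the stock "linux" preset; initramfs_has_module is a hypothetical helper name, not part of mkinitcpio.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a module file named <name>.ko
# (optionally compressed, e.g. <name>.ko.zst) appears in the file
# listing read from stdin. $1 = module name, e.g. tpm.
initramfs_has_module() {
    grep -q "/$1\.ko"
}

# Usage (not run here; the image paths are assumptions):
#   lsinitcpio /boot/initramfs-linux.img     | initramfs_has_module tpm && echo "tpm in linux image"
#   lsinitcpio /boot/initramfs-linux-lts.img | initramfs_has_module tpm && echo "tpm in lts image"
#
# Rebuild the image of a single kernel only, leaving the other untouched:
#   mkinitcpio -p linux
```

Rebuilding just one preset keeps the other kernel's image as a known-good fallback while experimenting with the modprobe.d change.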

Online

#41 2021-10-19 19:06:31

beryllium
Member
Registered: 2021-10-17
Posts: 7

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

seth wrote:

3rd link below…

Thanks, but that's about filesystems still being mounted, and thus one should not remount them read/write while in Linux.
Would it also explain it randomly awaking from standby?

seth wrote:

Is the tpm module still loaded ("lsmod | grep -i tpm")?

Yes, the module itself is still loaded, but the actual hardware driver isn't:

martin@yoga ~ $ lsmod | grep -i tpm
tpm                    90112  1 trusted
rng_core               16384  1 tpm

I double-checked that with `cat /sys/class/tpm/tpm0/device/description` which now gives me No such file, and previously showed me the version of the module (IIRC).

So I suppose that should be 'disabled enough' for our purposes?

halogene wrote:

you can install linux-lts, that brings up the system nicely for me up to now (with lts kernel though).

That's good to know, thanks! I didn't dare before, given that I had troubles on older kernels before, but I suppose 'older' is already new enough for my system by now ;)

Offline

#42 2021-10-19 20:08:34

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

that's about filesystems still being mounted

I'm not sure whether the article is sufficiently explicit, but if you're running one OS while another one is hibernating, all sorts of seemingly random shit can happen - notably anything involving ACPI (i.e. further sleep/hibernation).

Would it also explain it randomly awaking from standby?

It might not even be random, but rather Windows having placed a wakeup call to run some updates.

This is absolutely non-optional.
If you want to run two OS on the same machine, it can be only one at a time, and that means the other one must not be hibernating. For Windows that implies disabling the "fast start" behavior (because that's just hibernation in disguise).

If that's currently not the case, please fix this first and see what of your issues remain. (You want to cleanly reboot either OS at least once while the other one is not hibernating - twice won't hurt you ;-)

Online

#43 2021-10-21 19:01:34

pm3840
Member
Registered: 2013-12-16
Posts: 34

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

My Arch install in VirtualBox has the same issue from time to time: soft lockup.
The worst case for me: / was remounted read-only, and on the next bootup fsck had to be run manually.
Sleep/wake triggers it.

Offline

#44 2021-10-21 19:52:17

beryllium
Member
Registered: 2021-10-17
Posts: 7

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

seth wrote:

This is absolutely non-optional.

Ok, I'll disable it and check again, thanks!

Offline

#45 2022-01-20 21:40:19

halogene
Member
Registered: 2013-05-29
Posts: 47

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

So 5.10 LTS has been replaced by 5.15, and now rebooting is something of a gamble. Most of the time the system gets stuck completely. Sometimes it boots up DESPITE throwing up soft lockup errors on the way. Then sometimes this produces weird side effects, like components not working, or the login manager loading but login not working. I have verified that fast boot is off in the Windows installation (it already was all along), and Windows is not hibernating either. No idea what to do, since the Arch Linux USB boot medium won't start properly either. I ran a complete memtest with no errors. As a next step I'll try a current boot medium of another Linux distro to find out whether it is the kernel after all or whether it can somehow be attributed to Arch Linux (I sure hope it can't; I think I've been using Arch Linux for work for about 10 years).

Offline

#46 2022-01-20 21:51:36

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

seth wrote:

Did you at any point test "intel_idle.max_cstate=1"?

I have installed intel-ucode, made sure it's loaded by rEFInd and it seems to be working:

The following grep does not actually suggest that the microcode was updated - which could be normal for new CPUs.
Has this meanwhile changed? https://wiki.archlinux.org/title/Microc … ed_on_boot

Sometimes it boots up DESPITE throwing up soft lockup errors on the way.

Do you have a system journal for such a boot? (Or ideally more than one)
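
One possible way to pull that information out of the journal, as a rough sketch: the helper below condenses the "soft lockup" lines of a kernel log into CPU, duration and task. lockup_summary is a hypothetical name, and the pattern is only modelled on the watchdog lines posted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: for every soft lockup line in the kernel log text
# read from stdin, print "<cpu> <seconds> <task>".
lockup_summary() {
    sed -n 's/.*soft lockup - CPU#\([0-9][0-9]*\) stuck for \([0-9][0-9]*\)s! \[\(.*\)\]$/\1 \2 \3/p'
}

# Usage (not run here):
#   journalctl --list-boots                # find the boot of interest
#   journalctl -k -b -1 | lockup_summary   # kernel messages of the previous boot
#   journalctl -k -b -1 > boot.txt         # full kernel log to attach or paste
```

Fed the line "watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [systemd-udevd:485]" from earlier in the thread, it prints "2 26 systemd-udevd:485".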

Online

#47 2022-01-20 21:56:40

beryllium
Member
Registered: 2021-10-17
Posts: 7

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

halogene wrote:

So 5.10 LTS has been replaced by 5.15.

Oh, thanks for the heads up on the new LTS! I'll stay away from kernel updates for a while then...

I was also still having all kinds of lockup issues, also with fastboot disabled.
And together with the wakeup stuff that quickly became hair-pulling madness ;)

The spurious wakeup from suspend stuff I was/am having actually appeared to be two-fold:
- it doesn't actually suspend because fuse prevents it from entering suspend, so it basically wakes up again immediately. The annoying thing is that the LED stays 'glowing/blinking', which suggests that it is still suspended.
- it went into 's2idle' mode, which apparently doesn't really save much power (drains the battery in 12 hours or so). Adding mem_sleep_default=deep to the kernel parameters fixed that.
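
For the second point, a small sketch of how the active suspend mode can be read back. /sys/power/mem_sleep lists the available modes and puts the active one in square brackets; active_mem_sleep is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the active suspend mode, i.e. the bracketed
# entry of a /sys/power/mem_sleep style line read from stdin.
active_mem_sleep() {
    sed -n 's/.*\[\([a-z0-9_]*\)\].*/\1/p'
}

# Usage (not run here):
#   active_mem_sleep < /sys/power/mem_sleep
# With mem_sleep_default=deep on the kernel command line, the file is
# expected to read "s2idle [deep]" on hardware that offers both modes.
```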

I still have another weird thing: the power button seems to work once to get it into suspend, but after that pressing it just kills power completely after waiting ~10s. That's probably for another thread though.

Offline

#48 2022-01-24 08:53:29

halogene
Member
Registered: 2013-05-29
Posts: 47

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

I remember I tested the cstate limiter, but will test again. Just so I test this right, it would be sufficient to edit the GRUB menu line during bootup from within GRUB and add it to the kernel parameters, right?
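
A hedged sketch of how to confirm, after booting, that a parameter added at the GRUB prompt really reached the kernel. cmdline_has is a hypothetical helper name; /proc/cmdline and the intel_idle sysfs parameter file are the standard kernel interfaces.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the exact parameter $1 is present on the
# kernel command line read from stdin. Matching whole words means that
# "intel_idle.max_cstate=1" does not match "intel_idle.max_cstate=10".
cmdline_has() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Usage (not run here):
#   cmdline_has intel_idle.max_cstate=1 < /proc/cmdline && echo "parameter is active"
#   cat /sys/module/intel_idle/parameters/max_cstate   # value the driver actually uses
```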

When testing for microcode updates I get

Jan 23 18:07:38 datenfalter kernel: microcode: sig=0x806ec, pf=0x4, revision=0xea
Jan 23 18:07:38 datenfalter kernel: microcode: Microcode Update Driver: v2.2.

The CPU is Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz. So I gather there just are no updates for my CPU... intel-ucode is installed. On Intel's website I found only an updated driver for the integrated graphics of that chipset.
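
A rough sketch for telling the two cases apart. It assumes that kernels of this era report an early microcode load with a line containing "updated early"; that wording is an assumption here, and microcode_updated_early is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel log text on stdin contains a
# microcode line reporting an early update. The "updated early" wording is
# an assumption about the kernel message, not taken from this thread.
microcode_updated_early() {
    grep -q 'microcode.*updated early'
}

# Usage (not run here):
#   journalctl -k -b 0 | microcode_updated_early \
#       && echo "microcode was updated at boot" \
#       || echo "no early update reported (firmware may already ship this revision)"
```

The two microcode lines quoted above contain no such message, so the helper would report no early update for that boot.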

What output would you like to see? Here, for example, is some interesting stuff from the current boot (output of "sudo journalctl -b 0"). I can upload an entire journal to pastebin if that helps.

Jan 23 18:07:38 datenfalter kernel: [Firmware Bug]: TSC ADJUST differs within socket(s), fixing all errors
Jan 23 18:07:38 datenfalter kernel:  #2 #3 #4 #5 #6 #7
Jan 23 18:07:38 datenfalter kernel: ------------[ cut here ]------------
Jan 23 18:07:38 datenfalter kernel: WARNING: CPU: 7 PID: 124 at arch/x86/kernel/cpu/sgx/main.c:428 ksgxd+0x1d6/0x1f0
Jan 23 18:07:38 datenfalter kernel: Modules linked in:
Jan 23 18:07:38 datenfalter kernel: CPU: 7 PID: 124 Comm: ksgxd Not tainted 5.15.15-1-lts #1 3d281467c2a7b0dccedf86312492605ff493e1c7
Jan 23 18:07:38 datenfalter kernel: Hardware name: LENOVO 81TD/LNVNB161216, BIOS BNCN43WW 05/21/2021
Jan 23 18:07:38 datenfalter kernel: RIP: 0010:ksgxd+0x1d6/0x1f0
Jan 23 18:07:38 datenfalter kernel: Code: ff e9 f5 fe ff ff 48 89 df e8 e6 00 0d 00 84 c0 0f 84 cb fe ff ff 31 ff e8 47 01 0d 00 84 c0 0f 85 a1 fe ff ff e9 b7 fe ff ff <0f> 0b >
Jan 23 18:07:38 datenfalter kernel: RSP: 0000:ffffb9d50060fed0 EFLAGS: 00010283
Jan 23 18:07:38 datenfalter kernel: RAX: ffffb9d50054d5d0 RBX: ffffffffa1c618f0 RCX: 0000000000000000
Jan 23 18:07:38 datenfalter kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 23 18:07:38 datenfalter kernel: RBP: ffff93eec1d656c0 R08: 0000000000000000 R09: 0000000000000000
Jan 23 18:07:38 datenfalter kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff93eec1170980
Jan 23 18:07:38 datenfalter kernel: R13: ffffb9d50008fd78 R14: 0000000000000000 R15: ffff93eec1318000
Jan 23 18:07:38 datenfalter kernel: FS:  0000000000000000(0000) GS:ffff93f2107c0000(0000) knlGS:0000000000000000
Jan 23 18:07:38 datenfalter kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 23 18:07:38 datenfalter kernel: CR2: 0000000000000000 CR3: 0000000316610001 CR4: 00000000003706e0
Jan 23 18:07:38 datenfalter kernel: Call Trace:
Jan 23 18:07:38 datenfalter kernel:  <TASK>
Jan 23 18:07:38 datenfalter kernel:  ? __sgx_sanitize_pages.constprop.0+0x190/0x190
Jan 23 18:07:38 datenfalter kernel:  kthread+0x124/0x150
Jan 23 18:07:38 datenfalter kernel:  ? set_kthread_struct+0x50/0x50
Jan 23 18:07:38 datenfalter kernel:  ret_from_fork+0x1f/0x30
Jan 23 18:07:38 datenfalter kernel:  </TASK>
Jan 23 18:07:38 datenfalter kernel: ---[ end trace bffcacaa03b8ab6f ]---
Jan 23 18:07:38 datenfalter kernel: acpi PNP0C14:01: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
Jan 23 18:07:38 datenfalter kernel: acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
Jan 23 18:07:38 datenfalter kernel: acpi PNP0C14:04: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
Jan 23 18:07:38 datenfalter kernel: input: Ideapad extra buttons as /devices/pci0000:00/0000:00:1f.0/PNP0C09:00/VPC2004:00/input/input3
Jan 23 18:07:39 datenfalter kernel: ideapad_acpi VPC2004:00: DYTC interface is not available
Jan 23 18:08:02 datenfalter kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [swapper/2:0]
Jan 23 18:08:02 datenfalter kernel: Modules linked in: ucsi_acpi(+) ttm typec_ucsi intel_gtt typec ideapad_laptop roles sparse_keymap platform_profile i2c_hid_acpi rfkill wmi pcc>
Jan 23 18:08:02 datenfalter kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G        W         5.15.15-1-lts #1 3d281467c2a7b0dccedf86312492605ff493e1c7
Jan 23 18:08:02 datenfalter kernel: Hardware name: LENOVO 81TD/LNVNB161216, BIOS BNCN43WW 05/21/2021
Jan 23 18:08:02 datenfalter kernel: RIP: 0010:__do_softirq+0x79/0x298
Jan 23 18:08:02 datenfalter kernel: Code: 81 67 2c ff f7 ff ff be 00 01 00 00 e8 e0 ab 2d ff c7 44 24 10 0a 00 00 00 65 66 c7 05 4e 09 63 5d 00 00 fb 66 0f 1f 44 00 00 <b8> ff ff>
Jan 23 18:08:02 datenfalter kernel: RSP: 0000:ffffb9d5001b0fa0 EFLAGS: 00000246
Jan 23 18:08:02 datenfalter kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 23 18:08:02 datenfalter kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 23 18:08:02 datenfalter kernel: RBP: ffffb9d500117df8 R08: 0000000000000000 R09: 0000000000000000
Jan 23 18:08:02 datenfalter kernel: R10: 0000000000000080 R11: 0000000000000000 R12: 0000000000000001
Jan 23 18:08:02 datenfalter kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000080
Jan 23 18:08:02 datenfalter kernel: FS:  0000000000000000(0000) GS:ffff93f210680000(0000) knlGS:0000000000000000
Jan 23 18:08:02 datenfalter kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 23 18:08:02 datenfalter kernel: CR2: 00007fdaf827b060 CR3: 0000000108402001 CR4: 00000000003706e0
Jan 23 18:08:02 datenfalter kernel: Call Trace:
Jan 23 18:08:02 datenfalter kernel:  <IRQ>
Jan 23 18:08:02 datenfalter kernel:  irq_exit_rcu+0x9b/0xc0
Jan 23 18:08:02 datenfalter kernel:  sysvec_apic_timer_interrupt+0x72/0x90
Jan 23 18:08:02 datenfalter kernel:  </IRQ>
Jan 23 18:08:02 datenfalter kernel:  <TASK>
Jan 23 18:08:02 datenfalter kernel:  asm_sysvec_apic_timer_interrupt+0x12/0x20
Jan 23 18:08:02 datenfalter kernel: RIP: 0010:cpuidle_enter_state+0xc7/0x360
Jan 23 18:08:02 datenfalter kernel: Code: 8b 3d f5 f5 b2 5d e8 e8 08 7f ff 49 89 c5 0f 1f 44 00 00 31 ff e8 89 16 7f ff 45 84 ff 0f 85 08 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6>
Jan 23 18:08:02 datenfalter kernel: RSP: 0000:ffffb9d500117ea8 EFLAGS: 00000246
Jan 23 18:08:02 datenfalter kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
Jan 23 18:08:02 datenfalter kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 23 18:08:02 datenfalter kernel: RBP: ffff93f2106bc000 R08: 0000000000000000 R09: 0000000000000000
Jan 23 18:08:02 datenfalter kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa3947860
Jan 23 18:08:02 datenfalter kernel: R13: 00000006920e102a R14: 0000000000000001 R15: 0000000000000000
Jan 23 18:08:02 datenfalter kernel:  ? cpuidle_enter_state+0xb7/0x360
Jan 23 18:08:02 datenfalter kernel:  cpuidle_enter+0x29/0x40
Jan 23 18:08:02 datenfalter kernel:  do_idle+0x1e9/0x280
Jan 23 18:08:02 datenfalter kernel:  cpu_startup_entry+0x19/0x20
Jan 23 18:08:02 datenfalter kernel:  secondary_startup_64_no_verify+0xc2/0xcb
Jan 23 18:08:02 datenfalter kernel:  </TASK>
Jan 23 18:08:30 datenfalter kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [swapper/2:0]
Jan 23 18:08:30 datenfalter kernel: Modules linked in: acpi_cpufreq(-) intel_ish_ipc mc(+) processor_thermal_device btbcm processor_thermal_rfim btintel processor_thermal_mbox bluetooth processor_thermal_rapl intel_rapl_common ecdh_gene>
Jan 23 18:08:30 datenfalter kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G        W    L    5.15.15-1-lts #1 3d281467c2a7b0dccedf86312492605ff493e1c7
Jan 23 18:08:30 datenfalter kernel: Hardware name: LENOVO 81TD/LNVNB161216, BIOS BNCN43WW 05/21/2021
Jan 23 18:08:30 datenfalter kernel: RIP: 0010:__do_softirq+0x79/0x298
Jan 23 18:08:30 datenfalter kernel: Code: 81 67 2c ff f7 ff ff be 00 01 00 00 e8 e0 ab 2d ff c7 44 24 10 0a 00 00 00 65 66 c7 05 4e 09 63 5d 00 00 fb 66 0f 1f 44 00 00 <b8> ff ff ff ff 49 c7 c3 c0 60 80 a3 41 0f bc c7 89 c5 83 c5 01 74
Jan 23 18:08:30 datenfalter kernel: RSP: 0018:ffffb9d5001b0fa0 EFLAGS: 00000246
Jan 23 18:08:30 datenfalter kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Jan 23 18:08:30 datenfalter kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 23 18:08:30 datenfalter kernel: RBP: ffffb9d500117df8 R08: 0000000000000000 R09: 0000000000000000
Jan 23 18:08:30 datenfalter kernel: R10: 0000000000000080 R11: 0000000000000000 R12: 0000000000000001
Jan 23 18:08:30 datenfalter kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000080
Jan 23 18:08:30 datenfalter kernel: FS:  0000000000000000(0000) GS:ffff93f210680000(0000) knlGS:0000000000000000
Jan 23 18:08:30 datenfalter kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 23 18:08:30 datenfalter kernel: CR2: 000055a3b30fe000 CR3: 0000000102a98006 CR4: 00000000003706e0
Jan 23 18:08:30 datenfalter kernel: Call Trace:
Jan 23 18:08:30 datenfalter kernel:  <IRQ>
Jan 23 18:08:30 datenfalter kernel:  irq_exit_rcu+0x9b/0xc0
Jan 23 18:08:30 datenfalter kernel:  sysvec_apic_timer_interrupt+0x72/0x90
Jan 23 18:08:30 datenfalter kernel:  </IRQ>
Jan 23 18:08:30 datenfalter kernel:  <TASK>
Jan 23 18:08:30 datenfalter kernel:  asm_sysvec_apic_timer_interrupt+0x12/0x20
Jan 23 18:08:30 datenfalter kernel: RIP: 0010:cpuidle_enter_state+0xc7/0x360
Jan 23 18:08:30 datenfalter kernel: Code: 8b 3d f5 f5 b2 5d e8 e8 08 7f ff 49 89 c5 0f 1f 44 00 00 31 ff e8 89 16 7f ff 45 84 ff 0f 85 08 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 14 01 00 00 49 63 d6 4c 2b 2c 24 48 8d 04 52 48 8d
Jan 23 18:08:30 datenfalter kernel: RSP: 0018:ffffb9d500117ea8 EFLAGS: 00000246
Jan 23 18:08:30 datenfalter kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
Jan 23 18:08:30 datenfalter kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jan 23 18:08:30 datenfalter kernel: RBP: ffff93f2106bc000 R08: 0000000000000000 R09: 0000000000000000
Jan 23 18:08:30 datenfalter kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa3947860
Jan 23 18:08:30 datenfalter kernel: R13: 0000000c829efd92 R14: 0000000000000001 R15: 0000000000000000
Jan 23 18:08:30 datenfalter kernel:  ? cpuidle_enter_state+0xb7/0x360
Jan 23 18:08:30 datenfalter kernel:  cpuidle_enter+0x29/0x40
Jan 23 18:08:30 datenfalter kernel:  do_idle+0x1e9/0x280
Jan 23 18:08:30 datenfalter kernel:  cpu_startup_entry+0x19/0x20
Jan 23 18:08:30 datenfalter kernel:  secondary_startup_64_no_verify+0xc2/0xcb
Jan 23 18:08:30 datenfalter kernel:  </TASK>
Jan 23 18:08:30 datenfalter kernel: mc: Linux media interface: v0.10
Jan 23 18:08:30 datenfalter kernel: intel_rapl_common: Found RAPL domain package
Jan 23 18:08:30 datenfalter kernel: intel_rapl_common: Found RAPL domain dram
Jan 23 18:08:31 datenfalter kernel: cfg80211: Loading compiled-in X.509 certificates for regulatory database
Jan 23 18:08:31 datenfalter kernel: cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
Jan 23 18:08:31 datenfalter kernel: platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
Jan 23 18:08:31 datenfalter kernel: cfg80211: failed to load regulatory.db
Jan 23 18:08:31 datenfalter kernel: videodev: Linux video capture interface: v2.00
Jan 23 18:08:31 datenfalter kernel: mei_me 0000:00:16.0: enabling device (0000 -> 0002)
Jan 23 18:08:31 datenfalter kernel: Intel(R) Wireless WiFi driver for Linux
Jan 23 18:08:31 datenfalter kernel: iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-66.ucode failed with error -2
Jan 23 18:08:31 datenfalter kernel: iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-65.ucode failed with error -2
Jan 23 18:08:31 datenfalter kernel: iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-64.ucode failed with error -2
Jan 23 18:08:31 datenfalter kernel: iwlwifi 0000:00:14.3: api flags index 2 larger than supported by driver

Offline

#49 2022-01-24 09:27:19

halogene
Member
Registered: 2013-05-29
Posts: 47

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Update: I tested the intel_idle.max_cstate=1 setting by adding it to the "linux" line in GRUB's boot entry editor. I figured that should be the correct place, but just to make sure I also tried a couple of boots with it placed on the lines before that one. Unfortunately, I can't get the system up with this setting. After 6 unsuccessful boots with the current kernel I went back to the 5.15 LTS kernel, and after a couple of tries I'm up and running again.
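
For anyone wanting to confirm that a parameter edited in at the GRUB prompt actually reached the kernel, here is a minimal sketch that looks it up in a kernel command line such as the contents of /proc/cmdline. The helper names kernel_param and running_kernel_param are made up for illustration, and the parsing assumes plain whitespace-separated tokens (no quoted values containing spaces).

```python
from pathlib import Path
from typing import Optional


def kernel_param(cmdline: str, name: str) -> Optional[str]:
    """Look up `name` in a kernel command line string.

    Returns the value as a string, "" for a bare flag such as `quiet`,
    or None if the parameter is absent. Hypothetical helper; assumes
    tokens are separated by whitespace and values are not quoted.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            # If a parameter is given twice, keep the last occurrence.
            value = val if sep else ""
    return value


def running_kernel_param(name: str) -> Optional[str]:
    """Same lookup against the command line of the running kernel (Linux only)."""
    return kernel_param(Path("/proc/cmdline").read_text(), name)
```

On a boot that does come up, running_kernel_param("intel_idle.max_cstate") returning "1" would indicate the edit was picked up; None would mean the parameter never made it onto the command line.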

Offline

#50 2022-01-24 16:03:11

seth
Member
Registered: 2012-09-03
Posts: 49,948

Re: [solved] Kernel 5.11 -> "soft lockup", system doesn't boot up

Possibly related to https://bbs.archlinux.org/viewtopic.php?id=273451 - can you boot with nomodeset?

Online
