#1 2024-07-13 21:00:17

ScottE
Member
Registered: 2024-07-13
Posts: 12

[SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

After upgrading to linux-lts 6.6.39-1 today I started encountering watchdog lockups at boot:

: running early hook [udev]
: starting systemd-udevd version 256.2-1-arch
: running hook [udev]
: Triggering uevents...
[   34.055126] watchdog: Watchdog detected hard LOCKUP on cpu 10
[   34.524386] watchdog: Watchdog detected hard LOCKUP on cpu 12
[   35.580115] watchdog: Watchdog detected hard LOCKUP on cpu 4
[   38.755618] watchdog: Watchdog detected hard LOCKUP on cpu 1
[   66.433586] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[   66.433602] rcu: o10 -... 0: (1 GPs behind) idle=ad24/1/0x4000000000000000 softirq=113/114 fqs=5994
[   66.433622] rcu: o12 -... 0: (1 GPs behind) idle=bb84/1/0x4000000000000002 softirq=115/115 fqs=5994
[   66.433642] rcu: o(detected by 5, t=18002 jiffies, g =- 875, q=1394 ncpus=16)

[Warning: OCRd from a photo, but all looks correct]

I am able to isolate this to the linux-lts update as I can reproduce this by taking my running system on linux-lts 6.6.38-1 and only upgrading the kernel package to 6.6.39-1 (via ZFS rootfs snapshots). Trying mainline linux 6.9 kernel is not currently an option as I rely on zfs-dkms for root.

I did try noacpi, nomodeset, iommu=off to no avail.

Ryzen 2700X, ASRock X370 Taichi - been running Linux perfectly fine since 2018 on this hardware until this kernel version. Two other Intel based systems upgraded fine with an otherwise similar configuration/installation.

I'll try again when 6.6.40 is out, but figured I'd post this in the off chance someone else has seen something similar.

Last edited by ScottE (2024-07-15 22:17:17)

Offline

#2 2024-07-13 21:04:52

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 553
Website

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

This sounds like a possible regression, which should be bisected and reported to the stable team (which maintains the tree for linux-lts) :)

Are you confident in doing that on your own or should I supply you some prebuilt images?
The bisection will be roughly 6 steps, so it's not a lot of stuff to test.
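For readers wondering where a step count like that comes from: bisection halves the range of candidate commits with every test boot, so the number of boots grows with about log2 of the number of commits. Below is a minimal sketch of the idea, assuming a linear history. The commit count (64) and the position of the first bad commit (30) are hypothetical and not taken from the stable tree, and real `git bisect` picks its midpoints with more care than this.

```python
from typing import Callable, List, Tuple


def bisect_first_bad(n_commits: int, is_bad: Callable[[int], bool]) -> Tuple[int, List[int]]:
    """Find the first bad commit among commits 1..n_commits.

    Commit 0 stands for the known-good tag and commit n_commits for the
    known-bad tag. Returns the index of the first bad commit and the list
    of indices that had to be tested along the way.
    """
    lo, hi = 0, n_commits      # lo is known good, hi is known bad
    tested = []
    while hi - lo > 1:
        mid = (lo + hi) // 2   # build and boot the commit halfway between
        tested.append(mid)
        if is_bad(mid):
            hi = mid           # regression is at mid or earlier
        else:
            lo = mid           # regression is after mid
    return hi, tested


# Hypothetical example: 64 commits between the tags, regression at commit 30.
first_bad, tested = bisect_first_bad(64, lambda i: i >= 30)
print("first bad commit index:", first_bad)
print("test boots needed:", len(tested), tested)
```

With these made-up numbers the search needs six test boots, which is why a range of a few dozen to a hundred-odd commits stays manageable.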

Offline

#3 2024-07-13 21:20:50

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

gromit wrote:

Are you confident in doing that on your own or should I supply you some prebuilt images?

Thanks - I appreciate the offer. It's been close to a decade since I've built my own kernels, it's just not something I've had to do recently, so I'd appreciate some help with images to help bisect. Thanks again!

Offline

#4 2024-07-13 21:53:14

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 553
Website

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Please test the following image and report whether it works or not:

sudo pacman -U https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-v6.6.38.r68.ge536e6e-1-x86_64.pkg.tar.zst

Offline

#5 2024-07-13 23:04:20

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

I had some issues getting dkms to build, but I rebooted anyway. The system won't boot without a root filesystem, but this hang happens early on, before filesystems are mounted, so I think that's OK, if not ideal.

Result: The system hung on v6.6.38.r68.ge536e6e-1 with a hard lockup.

Offline

#6 2024-07-13 23:26:03

loqs
Member
Registered: 2014-03-06
Posts: 17,916

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

ScottE wrote:

I had some issues getting dkms to build, but I rebooted anyway, knowing the system won't boot without a root filesystem,

You can install the matching headers with:

sudo pacman -U https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-v6.6.38.r68.ge536e6e-1-x86_64.pkg.tar.zst https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-headers-v6.6.38.r68.ge536e6e-1-x86_64.pkg.tar.zst

Offline

#7 2024-07-13 23:26:20

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 553
Website

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Please try the following

sudo pacman -U https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-v6.6.38.r34.g3f25b5f-1-x86_64.pkg.tar.zst

Offline

#8 2024-07-13 23:48:33

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Thanks on the headers, loqs - I should have guessed that.

Result for v6.6.38.r34.g3f25b5f-1: Hard lockup.

Offline

#9 2024-07-14 00:06:26

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 553
Website

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Please test the following:

sudo pacman -U https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-v6.6.38.r17.g855ae72-1-x86_64.pkg.tar.zst https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-headers-v6.6.38.r17.g855ae72-1-x86_64.pkg.tar.zst

Offline

#10 2024-07-14 00:15:16

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Result for v6.6.38.r17.g855ae72-1: Good boot - Linux 6.6.38-1-lts-00017-g855ae72c2031-dirty #1 SMP PREEMPT_DYNAMIC Sat, 13 Jul 2024 23:54:19 +0000 x86_64 GNU/Linux
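A note on reading the test package names: they appear to follow the `git describe` pattern, i.e. the base tag, the number of commits after that tag, and an abbreviated commit hash. The uname string above (`00017-g855ae72c2031`) is consistent with that reading, but it is an assumption and not something the packager stated. A small sketch that splits a name into those parts (the function and field names are made up for illustration):

```python
import re

# Matches e.g. "v6.6.38.r17.g855ae72" anywhere in a package file name.
VERSION_RE = re.compile(r"v(?P<tag>\d+\.\d+\.\d+)\.r(?P<count>\d+)\.g(?P<sha>[0-9a-f]+)")


def parse_bisect_version(name: str) -> dict:
    """Split a bisection package name into tag, commit count and short hash."""
    m = VERSION_RE.search(name)
    if m is None:
        raise ValueError(f"no bisect-style version found in {name!r}")
    return {
        "tag": "v" + m["tag"],                  # release the build is based on
        "commits_after_tag": int(m["count"]),   # position inside the bisect range
        "short_sha": m["sha"],                  # abbreviated commit hash
    }


print(parse_bisect_version("linux-lts-v6.6.38.r17.g855ae72-1-x86_64.pkg.tar.zst"))
```

Sorting results by `commits_after_tag` makes it easy to see which part of the range is still untested.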

Offline

#11 2024-07-14 07:00:00

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 553
Website

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Please try the following:

sudo pacman -U https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-headers-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst

Offline

#12 2024-07-14 13:10:24

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Looks like there's an issue with the linux-lts-headers signature for v6.6.38.r25.gaf19067-1:

error: failed to read signature file: /var/cache/pacman/pkg/linux-lts-headers-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst.sig
error: '/var/cache/pacman/pkg/linux-lts-headers-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst': unexpected error

Signature size is 0:

-rw-r--r-- 1 root root  25758640 Jul 13 17:24 /var/cache/pacman/pkg/linux-lts-headers-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst
-rw-r--r-- 1 root root         0 Jul 13 23:57 /var/cache/pacman/pkg/linux-lts-headers-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst.sig
-rw-r--r-- 1 root root 134429059 Jul 13 17:25 /var/cache/pacman/pkg/linux-lts-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst
-rw-r--r-- 1 root root       566 Jul 13 23:56 /var/cache/pacman/pkg/linux-lts-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst.sig

Offline

#13 2024-07-14 14:49:39

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 553
Website

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Yeah you are right ... Seems like something went wrong when creating the signature. I have now fixed the issue, but you may need to delete the packages from the cache (something like "sudo rm /var/cache/pacman/pkg/linux-lts-headers-v6.6.38.r25.gaf19067-1-x86_64.pkg.tar.zst{,.sig}"). Afterwards you can just re-use the pacman command from above.
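If it is unclear which cached files are affected, something like the following can list zero-byte signature files before anything is deleted. This is only a sketch: the helper name is made up, the default path is the cache directory shown in the error messages above, and it only reports files without removing them.

```python
from pathlib import Path


def empty_signatures(cache_dir: str = "/var/cache/pacman/pkg") -> list:
    """Return all zero-byte *.sig files in the given package cache directory."""
    cache = Path(cache_dir)
    return sorted(
        p for p in cache.glob("*.sig")
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    for sig in empty_signatures():
        # A truncated signature and its package both need to be re-downloaded.
        print(sig, "->", sig.with_suffix(""))
```

Each reported `.sig` file and the package next to it are the candidates for the `rm` command above.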

Offline

#14 2024-07-14 15:28:51

nicoadamo
Member
From: Chile
Registered: 2011-03-23
Posts: 14

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Sorry for the lack of details (I haven't been able to get more information; if someone gives me some guidelines, I could help), but I think this is also related to linux-lts 6.6.39-1: when plugging in an external hard drive through a USB cable (encrypted drive [1]), the system freezes completely. If the drive is plugged in while booting, SDDM never gets loaded. If I unplug the drive and boot from scratch, I'm able to log in, but if I connect the drive, the system freezes and becomes totally unusable. After removing linux-lts and installing linux (currently 6.9.9-arch1-1), I was able to get the system usable again.

[1]https://wiki.archlinux.org/index.php/Dm … ile_system

Offline

#15 2024-07-14 15:44:59

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

v6.6.38.r25.gaf19067-1: Good boot

nicoadamo wrote:

If the drive is plugged in while booting, SDDM never gets loaded. If I unplug the drive and boot from scratch, I'm able to log in, but if I connect the drive, the system freezes and becomes totally unusable.

This is very interesting. The system where I'm experiencing the lockup has 2 external USB drives. Once I've completed the bisect, unplugging the external drives will be a good test to confirm the same issue. Thanks for replying with a potential lead.

Last edited by ScottE (2024-07-14 15:46:03)

Offline

#16 2024-07-14 15:51:06

nicoadamo
Member
From: Chile
Registered: 2011-03-23
Posts: 14

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

You're welcome! I'm glad it was useful!

Offline

#17 2024-07-14 18:15:03

cryptearth
Member
Registered: 2024-02-03
Posts: 820

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

ScottE wrote:

Trying mainline linux 6.9 kernel is not currently an option as I rely on zfs-dkms for root.

Is https://github.com/archzfs/archzfs an option for you? I used to use the archzfs repo, but as the auto-build often lags behind I started to build ZFS myself. It works without issues for my pool.

Offline

#18 2024-07-14 19:32:24

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

cryptearth wrote:

Is https://github.com/archzfs/archzfs an option for you?

The honest answer is that I don't know. I tend to stick with LTS kernels as a general preference for stability, especially on my servers, double especially with ZFS root.

I tried to figure out how to do bisect builds of the kernel myself, but couldn't find the right, current, source tree for the arch version of linux-lts, and after a couple of hours of trying different things it exceeded my effort:reward ratio. :-)

If this is related to the USB disk issue, I expect it will be resolved soon anyway: https://bugzilla.kernel.org/show_bug.cgi?id=219039 - I think I'll re-apply 6.6.39 and unplug the USB drives to see what happens - if it boots then I don't think there's much point in completing the bisect search, given this being a known issue.

Offline

#19 2024-07-14 19:33:49

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 553
Website

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Please test:

sudo pacman -U https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-v6.6.38.r29.gc727e46-1-x86_64.pkg.tar.zst https://pkgbuild.com/\~gromit/linux-bisection-kernels/linux-lts-headers-v6.6.38.r29.gc727e46-1-x86_64.pkg.tar.zst

Offline

#20 2024-07-14 19:45:55

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

I'm certain that this is the USB issue: after unplugging the drives, 6.6.39-1 boots just fine.

gromit - I greatly appreciate your help in working through this bisect, but I think I'm going to call this one for now - given all the evidence that this is related to a known issue. Thank you so much for your time in building packages for me!

[Edit: Let me see if I can reproduce this on my test mule, rather than keep disrupting my server, to continue the bisect. gromit, I did see your comment in the other thread about finishing the bisect on this one for confirmation.]

Last edited by ScottE (2024-07-14 19:50:28)

Offline

#21 2024-07-14 21:03:55

cryptearth
Member
Registered: 2024-02-03
Posts: 820

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

ScottE wrote:
cryptearth wrote:

The honest answer is that I don't know. I tend to stick with LTS kernels as a general preference for stability, especially on my servers, double especially with ZFS root.

Oh, I see - good point then.
For me I use ZFS for an 8x 3TB raidz2 pool but a regular single NVMe SSD for root and home (there's not much stored on home which isn't either easily obtainable by just downloading from official sources or has at least one copy on the ZFS pool, so there's no point in moving home onto the ZFS pool).
As for root on ZFS: the instructions have changed quite a lot, and relying on another distribution which comes with ZFS in the install media isn't the real true Arch way for me.
As for DKMS vs. version-specific: I guess from a technical point of view it's the same whether you use DKMS or a package built for the current kernel version.
Important: ZFS currently only supports kernels up to 6.8; support for 6.9 and the upcoming 6.10 is still being worked out, so I guess sticking to LTS is a good idea for Arch.

Offline

#22 2024-07-14 21:43:51

loqs
Member
Registered: 2014-03-06
Posts: 17,916

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

This is the bisection log to reach 9a24eb8010c2dc6a2eba56e3eb9fc07d14ffe00a which matches your results so far:

$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [2ced7518a03d002284999ed8336ffac462a358ec] Linux 6.6.39
git bisect bad 2ced7518a03d002284999ed8336ffac462a358ec
# status: waiting for good commit(s), bad commit known
# good: [2928631d5304b8fec48bad4c7254ebf230b6cc51] Linux 6.6.38
git bisect good 2928631d5304b8fec48bad4c7254ebf230b6cc51
# bad: [e536e6efa65f447a7611b4fb07ede1a9c895f8ea] e1000e: Fix S0ix residency on corporate systems
git bisect bad e536e6efa65f447a7611b4fb07ede1a9c895f8ea
# bad: [3f25b5f1635449036692a44b771f39f772190c1d] net: dsa: mv88e6xxx: Correct check for empty list
git bisect bad 3f25b5f1635449036692a44b771f39f772190c1d
# good: [855ae72c20310e5402b2317fc537d911e87537ef] drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc
git bisect good 855ae72c20310e5402b2317fc537d911e87537ef
# good: [af19067bd58f0f6f90eb6c604babffb55c2d6a00] media: dw2102: Don't translate i2c read into write
git bisect good af19067bd58f0f6f90eb6c604babffb55c2d6a00
# good: [c727e46f0cc8bd81788bb29dac9a0a45f2dfa2eb] Input: ff-core - prefer struct_size over open coded arithmetic
git bisect good c727e46f0cc8bd81788bb29dac9a0a45f2dfa2eb
# bad: [ff6b26be13032c5fbd6b6a0b24358f8eaac4f3af] wifi: mt76: replace skb_put with skb_put_zero
git bisect bad ff6b26be13032c5fbd6b6a0b24358f8eaac4f3af
# bad: [9a24eb8010c2dc6a2eba56e3eb9fc07d14ffe00a] usb: xhci: prevent potential failure in handle_tx_event() for Transfer events without TRB
git bisect bad 9a24eb8010c2dc6a2eba56e3eb9fc07d14ffe00a
# first bad commit: [9a24eb8010c2dc6a2eba56e3eb9fc07d14ffe00a] usb: xhci: prevent potential failure in handle_tx_event() for Transfer events without TRB

linux-lts-6.6.39-1 with 9a24eb8010c2dc6a2eba56e3eb9fc07d14ffe00a reverted:
linux-lts-6.6.39-1.1-x86_64.pkg.tar.zst/linux-lts-headers-6.6.39-1.1-x86_64.pkg.tar.zst
Edit:
linux-lts-6.6.39-1 with the proposed fix from https://bugzilla.kernel.org/show_bug.cgi?id=219039#c6 applied:
linux-lts-6.6.39-1.2-x86_64.pkg.tar.zst/linux-lts-headers-6.6.39-1.2-x86_64.pkg.tar.zst.
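
For anyone who wants to pull the verdicts out of a bisection log like the one above without reading it line by line, here is a minimal sketch. It relies only on the `# good:`, `# bad:` and `# first bad commit:` comment lines visible in the log. The function name is made up, and the sample is shortened to three lines copied from the log.

```python
import re

# Comment lines written by git bisect, e.g. "# bad: [<40 hex chars>] subject"
LINE_RE = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$")

SAMPLE_LOG = """\
# good: [c727e46f0cc8bd81788bb29dac9a0a45f2dfa2eb] Input: ff-core - prefer struct_size over open coded arithmetic
# bad: [ff6b26be13032c5fbd6b6a0b24358f8eaac4f3af] wifi: mt76: replace skb_put with skb_put_zero
# first bad commit: [9a24eb8010c2dc6a2eba56e3eb9fc07d14ffe00a] usb: xhci: prevent potential failure in handle_tx_event() for Transfer events without TRB
"""


def parse_bisect_log(text: str):
    """Return (verdicts, first_bad) parsed from `git bisect log` output.

    verdicts is a list of (verdict, sha, subject) tuples in log order;
    first_bad is a (sha, subject) tuple, or None if the bisect is unfinished.
    """
    verdicts = []
    first_bad = None
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip the "git bisect ..." commands and status comments
        kind, sha, subject = m.groups()
        if kind == "first bad commit":
            first_bad = (sha, subject)
        else:
            verdicts.append((kind, sha, subject))
    return verdicts, first_bad


verdicts, first_bad = parse_bisect_log(SAMPLE_LOG)
for kind, sha, subject in verdicts:
    print(f"{kind:4} {sha[:7]} {subject}")
print("first bad:", first_bad)
```

Separately, saving the log to a file (without the `$ git bisect log` prompt line) and passing it to `git bisect replay` in a checkout of the stable tree should restore the same bisection state.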

Last edited by loqs (2024-07-14 22:10:07)

Offline

#23 2024-07-14 23:05:47

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

Good news! System where I've been having this issue boots fine with this 6.6.39-1.2-lts version and with USB disks plugged back in. I'm comfortable enough with this proof that I don't see a reason to continue down the bisect tree (which is good as I was unable to repro the issue on my test mule system and testing on my home server was disruptive). Thank you for providing a test build with the proposed fix, I appreciate the time and effort!

Offline

#24 2024-07-15 17:05:29

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

I retested with 6.6.40.1-lts from the testing repository and all is good there too.

Offline

#25 2024-07-15 20:13:28

ScottE
Member
Registered: 2024-07-13
Posts: 12

Re: [SOLVED] Watchdog lockup at boot with linux-lts 6.6.39-1

This topic can be marked SOLVED with 6.6.40.1-lts moving to release repos. Thank you!

Last edited by ScottE (2024-07-15 20:15:02)

Offline
