#1 2025-02-28 22:43:45

Afmer
Member
Registered: 2023-09-21
Posts: 23

Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

Good afternoon, everyone. I have a server with a Supermicro X9DRi-LN4+/X9DR3-LN4+ motherboard running Arch Linux. After upgrading the kernel to Linux 6.13.4.arch1-1 and rebooting, the video stream in the IPMI KVM disconnects constantly, which looks like a blinking monitor. I'm fairly sure the kernel upgrade is the cause, since it is the only thing I upgraded. At the same time, strange errors started appearing in dmesg. I'm not sure what the problem might be, so I've provided as much information about my system as possible.

[afmer@Sparkle-Server ~]$ sudo uname -a
[sudo] password for afmer: 
Linux Sparkle-Server 6.13.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 22 Feb 2025 00:37:05 +0000 x86_64 GNU/Linux
[afmer@Sparkle-Server ~]$ sudo dmesg | grep -i ipmi
[    4.710794] IPMI message handler: version 39.2
[    4.723700] ipmi device interface
[    4.734999] ipmi_si: IPMI System Interface driver
[    4.735018] ipmi_si dmi-ipmi-si.0: ipmi_platform: probing via SMBIOS
[    4.735022] ipmi_platform: ipmi_si: SMBIOS: io 0xca2 regsize 1 spacing 1 irq 0
[    4.735025] ipmi_si: Adding SMBIOS-specified kcs state machine
[    4.736652] ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xca2, slave address 0x20, irq 0
[    4.747503] ipmi_si dmi-ipmi-si.0: The BMC does not support clearing the recv irq bit, compensating, but the BMC needs to be fixed.
[    4.771648] ipmi_si dmi-ipmi-si.0: IPMI message handler: Found new BMC (man_id: 0x002a7c, prod_id: 0x0626, dev_id: 0x20)
[    4.892721] ipmi_si dmi-ipmi-si.0: IPMI kcs interface initialized
[afmer@Sparkle-Server ~]$ journalctl -xe | grep -i ipmi
journalctl -xe | grep -i video
Mar 01 01:14:31 Sparkle-Server syncthing[1023]: [PIHHV] INFO: Ready to synchronize "Saved Video" (6nw86-wuxge) (receiveonly)
Mar 01 01:14:33 Sparkle-Server syncthing[1023]: [PIHHV] INFO: Completed initial scan of receiveonly folder "Saved Video" (6nw86-wuxge)
[afmer@Sparkle-Server ~]$ lspci -v | grep -i VGA
08:01.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a) (prog-if 00 [VGA controller])
	DeviceName: Onboard Matrox VGA
[afmer@Sparkle-Server ~]$ sudo dmesg | grep -i error
[    1.989692] RAS: Correctable Errors collector initialized.
[    4.705046] ioatdma 0000:00:04.0: channel error register unreachable
[    4.705049] ioatdma 0000:00:04.0: channel enumeration error
[    4.705344] ioatdma 0000:00:04.1: channel error register unreachable
[    4.705346] ioatdma 0000:00:04.1: channel enumeration error
[    4.708450] ioatdma 0000:00:04.2: channel error register unreachable
[    4.708453] ioatdma 0000:00:04.2: channel enumeration error
[    4.711519] ioatdma 0000:00:04.3: channel error register unreachable
[    4.711522] ioatdma 0000:00:04.3: channel enumeration error
[    4.713761] ioatdma 0000:00:04.4: channel error register unreachable
[    4.713764] ioatdma 0000:00:04.4: channel enumeration error
[    4.717575] ioatdma 0000:00:04.5: channel error register unreachable
[    4.717576] ioatdma 0000:00:04.5: channel enumeration error
[    4.719213] ioatdma 0000:00:04.6: channel error register unreachable
[    4.719216] ioatdma 0000:00:04.6: channel enumeration error
[    4.720622] ioatdma 0000:00:04.7: channel error register unreachable
[    4.720625] ioatdma 0000:00:04.7: channel enumeration error
[    4.721305] ioatdma 0000:80:04.0: channel error register unreachable
[    4.721307] ioatdma 0000:80:04.0: channel enumeration error
[    4.721413] ioatdma 0000:80:04.1: channel error register unreachable
[    4.721414] ioatdma 0000:80:04.1: channel enumeration error
[    4.721484] ioatdma 0000:80:04.2: channel error register unreachable
[    4.721486] ioatdma 0000:80:04.2: channel enumeration error
[    4.721583] ioatdma 0000:80:04.3: channel error register unreachable
[    4.721585] ioatdma 0000:80:04.3: channel enumeration error
[    4.722525] ioatdma 0000:80:04.4: channel error register unreachable
[    4.722527] ioatdma 0000:80:04.4: channel enumeration error
[    4.722628] ioatdma 0000:80:04.5: channel error register unreachable
[    4.722630] ioatdma 0000:80:04.5: channel enumeration error
[    4.723252] ioatdma 0000:80:04.6: channel error register unreachable
[    4.723255] ioatdma 0000:80:04.6: channel enumeration error
[    4.723652] ioatdma 0000:80:04.7: channel error register unreachable
[    4.723654] ioatdma 0000:80:04.7: channel enumeration error
[afmer@Sparkle-Server ~]$ lsmod | grep ipmi
ipmi_si                98304  0
ipmi_devintf           20480  0
ipmi_msghandler        94208  2 ipmi_devintf,ipmi_si
[afmer@Sparkle-Server ~]$ lspci | grep -i bridge
00:00.0 Host bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 DMI2 (rev 04)
00:01.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 1a (rev 04)
00:01.1 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 1b (rev 04)
00:02.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 2a (rev 04)
00:03.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 3a (rev 04)
00:11.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express Virtual Root Port (rev 06)
00:1c.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express Root Port 1 (rev b6)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a6)
00:1f.0 ISA bridge: Intel Corporation C600/X79 series chipset LPC Controller (rev 06)
80:00.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port in DMI2 Mode (rev 04)
80:01.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 1a (rev 04)
80:02.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 2a (rev 04)
80:03.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 3a (rev 04)
[afmer@Sparkle-Server ~]$ 

Journal output:

Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.0: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.0: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.1: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.1: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.2: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.2: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.2: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.3: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.3: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.3: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.4: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.4: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.4: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.5: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.5: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.5: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.6: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.6: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.6: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.7: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.7: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.0: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.0: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.0: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.1: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.1: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.1: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.2: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.2: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.2: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.3: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.3: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.3: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.4: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.4: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.4: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.5: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.5: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.5: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.6: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.6: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.6: Intel(R) I/OAT DMA Engine init failed
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.7: channel error register unreachable
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.7: channel enumeration error
Mar 01 01:14:26 Sparkle-Server kernel: ioatdma 0000:80:04.7: Intel(R) I/OAT DMA Engine init failed
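
The KVM video stream comes from the onboard Matrox VGA controller listed above, so kernel messages from the graphics driver may be more telling than the IPMI or ioatdma ones. The following is only a sketch of how to find the driver bound to the controller and filter the kernel log for it. The driver name mgag200 is an assumption here (it is the usual DRM driver for Matrox G200 server chips), and both helper names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any package.

# Print the kernel driver bound to a PCI device, given `lspci -k` output on stdin.
driver_in_use() {
    awk -F': ' '/Kernel driver in use:/ { print $2; exit }'
}

# Print only the log lines on stdin that mention the given driver (case-insensitive).
driver_messages() {
    grep -i -- "$1"
}

# Intended use on the affected machine (08:01.0 is the VGA address shown by lspci above):
#   lspci -k -s 08:01.0 | driver_in_use
#   journalctl -k -b | driver_messages mgag200   # mgag200 is an assumption
```

Both helpers read from stdin, so they can also be run against saved lspci or journal output from another boot.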

#2 2025-02-28 22:48:39

Afmer
Member
Registered: 2023-09-21
Posts: 23

Re: Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

The previous kernel was Linux 6.11.8-arch1-2.

#3 2025-03-01 09:19:38

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 1,028

Re: Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

Could you try the previous versions from the Arch Linux Archive?

sudo pacman -U https://archive.archlinux.org/packages/l/linux/linux-6.12.arch1-1-x86_64.pkg.tar.zst
sudo pacman -U https://archive.archlinux.org/packages/l/linux/linux-6.13.arch1-1-x86_64.pkg.tar.zst

#4 2025-03-06 13:29:46

Afmer
Member
Registered: 2023-09-21
Posts: 23

Re: Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

gromit wrote:

Could you try the previous versions from the Arch Linux Archive?

sudo pacman -U https://archive.archlinux.org/packages/l/linux/linux-6.12.arch1-1-x86_64.pkg.tar.zst
sudo pacman -U https://archive.archlinux.org/packages/l/linux/linux-6.13.arch1-1-x86_64.pkg.tar.zst

I checked both kernel versions and the IPMI problem occurs on both of them. I don't know whether it matters, but I used downgrade to switch kernel versions.

#5 2025-03-06 13:35:50

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 1,028

Re: Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

sudo pacman -U https://archive.archlinux.org/packages/l/linux/linux-6.11.arch1-1-x86_64.pkg.tar.zst

And 6.11 works?

Using "downgrade" also works of course!
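
Whichever tool installs the package, it is worth confirming after the reboot that the intended kernel is actually the one running, otherwise a test result gets attributed to the wrong version. A minimal sketch, assuming the expected release string is known; the helper name is made up, and the example string is the one uname reported in the first post.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the running kernel release equals the expected string.
running_kernel_is() {
    [ "$(uname -r)" = "$1" ]
}

# Example value taken from the uname output in the first post; adjust per test.
expected="6.13.4-arch1-1"

if running_kernel_is "$expected"; then
    echo "testing the intended kernel: $expected"
else
    echo "running $(uname -r) instead of $expected; check the boot loader entry"
fi
```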

#6 2025-03-06 15:59:50

Afmer
Member
Registered: 2023-09-21
Posts: 23

Re: Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

gromit wrote:
sudo pacman -U https://archive.archlinux.org/packages/l/linux/linux-6.11.arch1-1-x86_64.pkg.tar.zst

And 6.11 works?

Using "downgrade" also works of course!

I'm currently on Linux kernel version 6.11.8-arch1-2 and everything works fine. I also installed Linux 6.11.arch1-1, and it worked without problems as well.

#7 2025-03-06 16:03:26

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 1,028

Re: Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

This could be a kernel regression, which should be bisected and reported to the upstream kernel developers.

Are you comfortable doing the bisection on your own, or do you need some help?
If you want, we can also provide pre-built kernel images for you to test (which greatly speeds up testing).

Good info to get you started is:
- https://docs.kernel.org/admin-guide/rep … sions.html
- https://wiki.archlinux.org/title/Kernel … egressions
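
The core of the bisection those guides describe is a handful of git bisect commands. The sketch below runs them on a throwaway repository so it can be tried safely; the repository, the status file and the position of the bad commit are all invented for illustration. For the kernel, the good and bad endpoints would be the v6.11 and v6.12 tags, and each step is judged by hand (build, boot, watch the IPMI KVM, then run git bisect good or git bisect bad) rather than by a script.

```shell
#!/bin/sh
# Toy demonstration of git bisect. The repository, the "status" file and the
# position of the bad commit are all made up for illustration.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Bisect Demo"
git config commit.gpgsign false

# Create ten commits; from commit 7 onwards the "regression" is present.
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -ge 7 ]; then echo bad > status; else echo good > status; fi
    echo "$i" > counter
    git add status counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done

good=$(git rev-list --max-parents=0 HEAD)  # oldest commit, known good (think v6.11)
bad=$(git rev-parse HEAD)                  # newest commit, known bad (think v6.12)

# Give git the bad and good endpoints, then let it test each candidate.
# Exit status 0 from the test command marks a commit good, 1 marks it bad.
git bisect start "$bad" "$good" >/dev/null
git bisect run sh -c 'grep -q good status' >/dev/null 2>&1

# After the run, refs/bisect/bad points at the first bad commit.
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad_subject"

git bisect reset >/dev/null 2>&1
```

git bisect run only works when the test can be automated; with a symptom that has to be observed on a remote console, the manual good/bad marking is the realistic route.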

Additionally it would be good to see if the latest mainline kernel is affected:

sudo pacman -U https://pkgbuild.com/~gromit/linux-bisection-kernels/linux-mainline-6.14rc5-1-x86_64.pkg.tar.zst

Note that this installs the kernel as linux-mainline, so you need to configure your boot loader to boot it (for example via grub-mkconfig -o ... or by writing a systemd-boot loader entry).

#8 2025-03-08 21:01:48

Afmer
Member
Registered: 2023-09-21
Posts: 23

Re: Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

gromit wrote:

This could be a kernel regression, which should be bisected and reported to the upstream kernel developers.

Are you comfortable doing the bisection on your own, or do you need some help?
If you want, we can also provide pre-built kernel images for you to test (which greatly speeds up testing).

Good info to get you started is:
- https://docs.kernel.org/admin-guide/rep … sions.html
- https://wiki.archlinux.org/title/Kernel … egressions

Additionally it would be good to see if the latest mainline kernel is affected:

sudo pacman -U https://pkgbuild.com/~gromit/linux-bisection-kernels/linux-mainline-6.14rc5-1-x86_64.pkg.tar.zst

Note that this installs the kernel as linux-mainline, so you need to configure your boot loader to boot it (for example via grub-mkconfig -o ... or by writing a systemd-boot loader entry).


It took me all day, but it's done! I ran the bisection and got the following result.

[afmer@Furina-PC linux-torvalds]$ git bisect good
d6460bd52c27fde97d6a73e3d9c7a8d747fbaa3e is the first bad commit
commit d6460bd52c27fde97d6a73e3d9c7a8d747fbaa3e
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Thu Jul 18 12:44:14 2024 +0200

    drm/mgag200: Add dedicated variables for blanking fields
    
    Represent fields for horizontal and vertical blanking with <hblkstr>,
    <hblkend>, <vblkstr> and <vblkend>. Aligns the code with the Matrox
    programming manuals.
    
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240718104551.575912-5-tzimmermann@suse.de

 drivers/gpu/drm/mgag200/mgag200_mode.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)
[afmer@Furina-PC linux-torvalds]$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [98f7e32f20d28ec452afb208f9cffc08448a2652] Linux 6.11
git bisect good 98f7e32f20d28ec452afb208f9cffc08448a2652
# status: waiting for bad commit, 1 good commit known
# bad: [adc218676eef25575469234709c2d87185ca223a] Linux 6.12
git bisect bad adc218676eef25575469234709c2d87185ca223a
# bad: [509d2cd12a10d057fdf72f565b930f9a81140d59] Merge tag 'Smack-for-6.12' of https://github.com/cschaufler/smack-next
git bisect bad 509d2cd12a10d057fdf72f565b930f9a81140d59
# good: [7b17f5ebd5fc5e9275eaa5af3d0771f2a7b01bbf] Merge tag 'soc-dt-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect good 7b17f5ebd5fc5e9275eaa5af3d0771f2a7b01bbf
# good: [54450af662369efbd4cb438ce7b553dfffa00f07] Merge tag 'parisc-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect good 54450af662369efbd4cb438ce7b553dfffa00f07
# bad: [f0b7dcf25834afd17df316367dfe5d4c890c713c] drm/amd/display: Wait for all pending cleared before full update
git bisect bad f0b7dcf25834afd17df316367dfe5d4c890c713c
# bad: [3f53d7e442197b7e7d56b470b02dfd37a8bc5c46] Merge tag 'drm-intel-gt-next-2024-08-23' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
git bisect bad 3f53d7e442197b7e7d56b470b02dfd37a8bc5c46
# bad: [91dae758bdb854367bf0811d97acb84e791764d9] Merge tag 'drm-misc-next-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
git bisect bad 91dae758bdb854367bf0811d97acb84e791764d9
# bad: [def122b64e37daa39774d4afa433ad42b8a5eaf3] drm/nouveau/nvif: remove client version
git bisect bad def122b64e37daa39774d4afa433ad42b8a5eaf3
# good: [754c9129b9494b2b058add1d1a627fb3c9466a03] drm/mgag200: Use hexadecimal register indeces
git bisect good 754c9129b9494b2b058add1d1a627fb3c9466a03
# bad: [aa48c30f096bc10a583c2294d87713f2802986c2] dt-bindings: display: panel: Document Densitron DMT028VGHMCMI-1D TFT on ILI9806E DSI TCON
git bisect bad aa48c30f096bc10a583c2294d87713f2802986c2
# bad: [7e33fc2ff6754b5ff39b11297f713cd0841d9962] drm/panic: Add missing static inline to drm_panic_is_enabled()
git bisect bad 7e33fc2ff6754b5ff39b11297f713cd0841d9962
# bad: [02fa62d41c8abff945bae5bfc3ddcf4721496aca] drm/stm: ltdc: reset plane transparency after plane disable
git bisect bad 02fa62d41c8abff945bae5bfc3ddcf4721496aca
# bad: [d6460bd52c27fde97d6a73e3d9c7a8d747fbaa3e] drm/mgag200: Add dedicated variables for blanking fields
git bisect bad d6460bd52c27fde97d6a73e3d9c7a8d747fbaa3e
# good: [e8f834b559621d634a939381caf99a024e272211] drm/mgag200: Use adjusted mode values for CRTCs
git bisect good e8f834b559621d634a939381caf99a024e272211
# first bad commit: [d6460bd52c27fde97d6a73e3d9c7a8d747fbaa3e] drm/mgag200: Add dedicated variables for blanking fields
[afmer@Furina-PC linux-torvalds]$ 

What should I do next?

#9 2025-03-12 08:57:43

gromit
Package Maintainer (PM)
From: Germany
Registered: 2024-02-10
Posts: 1,028

Re: Intermittent video signal after kernel upgrade to Linux 6.13.4.arch1-1

Have you already reported the bug as described in https://docs.kernel.org/admin-guide/rep … sions.html ? If you want some help or want to share a draft with me, feel free to do so!
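
As a starting point for such a draft, the facts already collected in this thread could be assembled along these lines. This is only a sketch: the helper name and the layout are made up, the recipients are deliberately left out (the reporting guide explains how to find them), and the "[REGRESSION]" subject prefix and the "#regzbot introduced:" line follow what the kernel's regression-reporting documentation suggests as far as I recall it, so please check them against the guide.

```shell
#!/bin/sh
# Hypothetical helper: print a plain-text regression report draft.
# Arguments: <first bad commit> <last good version> <first bad version>
draft_report() {
    cat <<EOF
Subject: [REGRESSION] drm/mgag200: intermittent video signal on IPMI KVM since $3

Hardware: Supermicro X9DRi-LN4+/X9DR3-LN4+, onboard Matrox G200eW (WPCM450) VGA
Last known good: $2
First known bad: $3
First bad commit (git bisect): $1
  drm/mgag200: Add dedicated variables for blanking fields

Symptom: the IPMI KVM video stream drops out constantly, like a blinking monitor.

#regzbot introduced: $1
EOF
}

# Values taken from the bisect log earlier in this thread.
draft_report d6460bd52c27fde97d6a73e3d9c7a8d747fbaa3e v6.11 v6.12
```

Attaching the full git bisect log and the dmesg output of a bad boot to the report would save the developers a round of questions.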
