#1 2019-03-09 19:33:31

Swiggles
Member
Registered: 2014-08-02
Posts: 266

[SOLVED] Above 4G decoding fails to boot on Kernel 5.0

Hello Arch community!

I have a recent problem with the current 5.0 kernel where the system won't boot if I enable above 4G decoding in the UEFI settings.
Downgrading the kernel to 4.20.13 works perfectly fine, but for now I have switched over to the LTS kernel (4.19.27-1-lts) so I don't have to run a partially upgraded system.

Is this a regression? A known issue?

My specs:

AMD Ryzen Threadripper 2950X
AsRock X399 Taichi (3.50 firmware)
4*16GB Samsung M391A2K43BB1-CRC (ECC)
Sapphire Radeon RX Vega 64 Nitro+ (main GPU, PCIe slot 1)
Sapphire Radeon RX 590 Nitro+ (secondary GPU for VMs and other purposes, PCIe slot 4)
3*500GB Samsung 970 Evo M.2 2280 (LVM mixed RAID)

No patches are applied to the kernel or any system-critical software, so it's a stock Arch Linux install with GNOME 3 in X11 mode. No ignored packages, no partial upgrades.

I would expect this setup to work just fine, because all the hardware is fairly new, supports UEFI and 64-bit mode, and it was working before the upgrade to 5.0.

LANG=C journalctl -b-1 (Oops starting at line 2094)

➜ grep '^[^#]' /etc/mkinitcpio.conf
MODULES=(dm-raid raid0 raid1 raid456 vfio_pci vfio vfio_iommu_type1 vfio_virqfd)
BINARIES=()
FILES=()
HOOKS=(base udev autodetect modconf block lvm2 filesystems usr keyboard fsck shutdown)
➜ grep '^[^#]' /etc/default/grub   
GRUB_DEFAULT=0
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR="Arch"
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX="libahci.ignore_sss=1 amd_iommu=on iommu=pt"
GRUB_PRELOAD_MODULES="part_gpt part_msdos"
GRUB_TERMINAL_INPUT=console
GRUB_GFXMODE=auto
GRUB_GFXPAYLOAD_LINUX=keep
GRUB_DISABLE_RECOVERY=true

Additional configs in modprobe.d:

options kvm ignore_msrs=1
options vfio-pci ids=1002:67df,1002:aaf0

I hope I have given enough info and that someone might be able to point me in the right direction. Thanks for reading!

Last edited by Swiggles (2019-03-11 02:22:24)


#2 2019-03-09 21:40:15

ugjka
Member
From: Latvia
Registered: 2014-04-01
Posts: 1,808

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

what is "above 4G decoding"?



#3 2019-03-09 21:46:43

Swiggles
Member
Registered: 2014-08-02
Posts: 266

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

It allows PCIe devices to be mapped into 64-bit address space. The option exists so that it can be disabled for legacy hardware and operating systems.
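
To make the "4G" part concrete, here is a minimal sketch (my own illustration, not taken from any firmware documentation): a device region counts as "above 4G" when its address is at or beyond 0x100000000. The helper name is_above_4g and the two sample addresses are made up for the example.

```shell
# Hypothetical helper: succeed if a hex address lies at or above 4 GiB.
is_above_4g() {
    # $1 is a hex address with a 0x prefix, e.g. 0x60000000
    [ "$(( $1 >= 0x100000000 ))" -eq 1 ]
}

# Two sample addresses chosen for illustration only.
for addr in 0x60000000 0x380000000000; do
    if is_above_4g "$addr"; then
        echo "$addr: above 4G"
    else
        echo "$addr: below 4G"
    fi
done
```

On a real system the actual mappings can be read from /proc/iomem (as root) or from lspci -vv.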


#4 2019-03-10 12:14:22

Lone_Wolf
Forum Moderator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,920

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

It allows PCIe devices to be mapped into 64-bit address space. The option exists so that it can be disabled for legacy hardware and operating systems.

Correct, basically it should be enabled for any 64-bit OS.


No problem here with the same motherboard and 3.30 firmware.

I do remember having to disable CSM compatibility to get above 4G decoding to work. Do you have that disabled?

When you say "doesn't boot", where does it fail?
Can you access the log from a failed boot using journalctl -b some_negativenumber and post it?
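
For reference, a minimal sketch of pulling only the kernel warning blocks out of such a log. The function name extract_traces is made up; it assumes the usual "[ cut here ]" and "[ end trace ... ]" markers that the kernel prints around a warning.

```shell
# Hypothetical helper: print only the lines between the kernel's
# "[ cut here ]" and "[ end trace ... ]" markers, read from stdin.
extract_traces() {
    sed -n '/\[ cut here \]/,/\[ end trace /p'
}

# Usage (not run here): kernel messages of the previous boot, warnings only.
#   LANG=C journalctl -k -b -1 --no-pager | extract_traces
# journalctl --list-boots shows which negative offsets are available.
```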



#5 2019-03-10 12:32:21

loqs
Member
Registered: 2014-03-06
Posts: 17,372

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

Swiggles wrote:

LANG=C journalctl -b-1 (Oops starting at line 2094)

Mar 09 19:59:38 moebius kernel: efifb: cannot reserve video memory at 0x60000000
Mar 09 19:59:38 moebius kernel: ------------[ cut here ]------------
Mar 09 19:59:38 moebius kernel: ioremap on RAM at 0x0000000060000000 - 0x00000000608cffff
Mar 09 19:59:38 moebius kernel: WARNING: CPU: 16 PID: 1 at arch/x86/mm/ioremap.c:167 __ioremap_caller+0x313/0x330
Mar 09 19:59:38 moebius kernel: Modules linked in:
Mar 09 19:59:38 moebius kernel: CPU: 16 PID: 1 Comm: swapper/0 Not tainted 5.0.0-arch1-1-ARCH #1
Mar 09 19:59:38 moebius kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X399 Taichi, BIOS P3.50 12/24/2018
Mar 09 19:59:38 moebius kernel: RIP: 0010:__ioremap_caller+0x313/0x330
Mar 09 19:59:38 moebius kernel: Code: 05 ba ba 1b 01 49 09 c6 e9 b5 fe ff ff 48 8d 54 24 28 48 8d 74 24 18 48 c7 c7 9d 80 49 aa c6 05 8b d8 29 01 01 e8 d7 65 01 00 <0f> 0b 31 db e9 55 ff ff ff e8 bf 62 01 00 66 66 2e 0f 1f 84 00 00
Mar 09 19:59:38 moebius kernel: RSP: 0018:ffffae16c0117c90 EFLAGS: 00010286
Mar 09 19:59:38 moebius kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
Mar 09 19:59:38 moebius kernel: RDX: 0000000000000000 RSI: 0000000000000092 RDI: 00000000ffffffff
Mar 09 19:59:38 moebius kernel: RBP: 0000000060000000 R08: 0000000000000001 R09: 00000000000003d1
Mar 09 19:59:38 moebius kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 00000000008d0000
Mar 09 19:59:38 moebius kernel: R13: 00000000008d0000 R14: 0000000000000000 R15: ffffffffa98ad822
Mar 09 19:59:38 moebius kernel: FS:  0000000000000000(0000) GS:ffffa1f95e200000(0000) knlGS:0000000000000000
Mar 09 19:59:38 moebius kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 09 19:59:38 moebius kernel: CR2: 0000000000000000 CR3: 0000000c62a0e000 CR4: 00000000003406e0
Mar 09 19:59:38 moebius kernel: Call Trace:
Mar 09 19:59:38 moebius kernel:  ? printk+0x58/0x6f
Mar 09 19:59:38 moebius kernel:  ? __kmalloc+0x1fd/0x210
Mar 09 19:59:38 moebius kernel:  efifb_probe.cold.4+0x2ff/0x93d
Mar 09 19:59:38 moebius kernel:  ? kernfs_add_one+0xe7/0x130
Mar 09 19:59:38 moebius kernel:  platform_drv_probe+0x4f/0xa0
Mar 09 19:59:38 moebius kernel:  really_probe+0xf8/0x3b0
Mar 09 19:59:38 moebius kernel:  ? do_early_param+0x8e/0x8e
Mar 09 19:59:38 moebius kernel:  driver_probe_device+0xb3/0xf0
Mar 09 19:59:38 moebius kernel:  __driver_attach+0xdd/0x110
Mar 09 19:59:38 moebius kernel:  ? driver_probe_device+0xf0/0xf0
Mar 09 19:59:38 moebius kernel:  ? driver_probe_device+0xf0/0xf0
Mar 09 19:59:38 moebius kernel:  bus_for_each_dev+0x76/0xc0
Mar 09 19:59:38 moebius kernel:  bus_add_driver+0x152/0x230
Mar 09 19:59:38 moebius kernel:  ? vesafb_driver_init+0x13/0x13
Mar 09 19:59:38 moebius kernel:  driver_register+0x6b/0xb0
Mar 09 19:59:38 moebius kernel:  ? vesafb_driver_init+0x13/0x13
Mar 09 19:59:38 moebius kernel:  do_one_initcall+0x46/0x1f5
Mar 09 19:59:38 moebius kernel:  kernel_init_freeable+0x222/0x2b4
Mar 09 19:59:38 moebius kernel:  ? rest_init+0xbf/0xbf
Mar 09 19:59:38 moebius kernel:  kernel_init+0xa/0x101
Mar 09 19:59:38 moebius kernel:  ret_from_fork+0x22/0x40
Mar 09 19:59:38 moebius kernel: ---[ end trace 68825049282f785c ]---
Mar 09 19:59:38 moebius kernel: efifb: abort, cannot remap video memory 0x8d0000 @ 0x60000000
Mar 09 19:59:38 moebius kernel: efi-framebuffer: probe of efi-framebuffer.0 failed with error -5


#6 2019-03-10 17:49:38

Swiggles
Member
Registered: 2014-08-02
Posts: 266

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

I already tried adding "video=efifb:off", but it doesn't seem to change anything, so I removed it again. However, I am almost certain it is some issue with the second GPU and GRUB/EFI:
The UEFI info screen is displayed on GPU 1, and if I enter the configuration it is also displayed on that GPU. GRUB is displayed on GPU 2 until the initial ramdisk starts loading (this is not intended, but it didn't bother me too much). After that everything is handled by GPU 1.

So I think that, for whatever reason, the EFI boot image is using GPU 2, but as soon as the kernel boots, control of GPU 2 is handed to the vfio driver and GPU 1 to amdgpu. While I am sure this is not optimal, it was working fine before kernel 5.0. I would be happy with a solution that fixes this behavior. Please help me if you have any clue.
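
A minimal sketch for checking this kind of split from a running system, assuming the standard sysfs layout: each VGA device has a boot_vga attribute (1 for the device the kernel treats as the boot display) and a driver symlink once a driver is bound. The function name gpu_report is made up, and the optional argument exists only so the function can be pointed at a different directory.

```shell
# Hypothetical helper: list VGA devices with their boot_vga flag and bound driver.
gpu_report() {
    root="${1:-/sys/bus/pci/devices}"
    for f in "$root"/*/boot_vga; do
        [ -e "$f" ] || continue            # no VGA devices found
        dev="${f%/boot_vga}"
        if [ -e "$dev/driver" ]; then
            # The driver entry is a symlink whose last component is the driver name.
            drv="$(basename "$(readlink "$dev/driver")")"
        else
            drv="none"
        fi
        printf '%s boot_vga=%s driver=%s\n' "${dev##*/}" "$(cat "$f")" "$drv"
    done
}

# Usage (not run here):
#   gpu_report
```

If the card marked boot_vga=1 is the one bound to vfio-pci rather than amdgpu, that would be consistent with the firmware picking the passthrough card as its primary display, though this is only a hint and not proof.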

Lone_Wolf wrote:

When you say "doesn't boot", where does it fail?
Can you access the log from a failed boot using journalctl -b some_negativenumber and post it?

I already put a full log in my original post. "LANG=C journalctl -b-1" is clickable. :-)


#7 2019-03-10 18:36:07

loqs
Member
Registered: 2014-03-06
Posts: 17,372

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

Is bisecting 4.20 to 5.0 an option?
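
For anyone following along, a minimal sketch of what that involves, assuming a clone of the kernel git tree with the v4.20 and v5.0 release tags. The function names start_bisect and bisect_steps are made up, and the step count is only a rough estimate (about one build per halving of the commit range).

```shell
# Hypothetical helper: start a bisect between a known-good and a known-bad tag.
# Must be run inside a clone of the kernel git tree.
start_bisect() {
    good="$1"; bad="$2"
    git bisect start &&
    git bisect bad "$bad" &&
    git bisect good "$good"
}

# Rough estimate of how many kernels need building:
# halve the number of commits until only one is left.
bisect_steps() {
    n="$1"; steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Usage (not run here):
#   start_bisect v4.20 v5.0
#   bisect_steps "$(git rev-list --count v4.20..v5.0)"
#   ...build and boot the checked-out commit, then mark it with
#   "git bisect good" or "git bisect bad" and repeat.
```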


#8 2019-03-10 19:32:00

Swiggles
Member
Registered: 2014-08-02
Posts: 266

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

This will take a bit. Let me try.


#9 2019-03-11 01:09:14

Swiggles
Member
Registered: 2014-08-02
Posts: 266

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

Ok, the results are in, but useless as far as I can tell. It's very unfortunate, because it takes so long to recompile that many kernels.

Terminal screenshot

c97ea6a61b5eb1200f4d5ccbf6601bb9f5bc7d3f

Even though it is obvious that the patch shouldn't be responsible, I compiled the 5.0 kernel without that commit, and of course it still did not boot.

I am a bit stumped, but I noticed something: I checked the logs for errors and found the same Oops with the LTS kernel, yet it finishes booting without any ill effects as far as I can tell.

This is getting ugly.


#10 2019-03-11 01:19:04

loqs
Member
Registered: 2014-03-06
Posts: 17,372

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

Are you using a PKGBUILD for the bisection so you can recheck the builds easily?
Was wondering if you could easily go back to the first bisection point af7ddd8a627c62a835524b3f5b471edbbbcce025 etc and see if boot fails again.
If it is an intermittent fault then it gets much harder.


#11 2019-03-11 02:19:41

Swiggles
Member
Registered: 2014-08-02
Posts: 266

Re: [SOLVED] Above 4G decoding fails to boot on Kernel 5.0

I think I solved the issue. The good news: I can now boot kernel 5.0 with all configurations as is. The bad news: I still see the same Oops in the logs (maybe it was unrelated?) and I have no clue why everything before 5.0 worked fine.

My solution: I opened the computer and swapped the graphics cards' PCIe slots, and now it looks like all output is displayed on my primary card. No problem booting either kernel.

So I am inclined to mark the thread as solved, because the original problem is fixed. Thank you for the great support! :-)
I have to take another look at the kernel Oops though.

@loqs: Yes, I slightly modified the current linux package PKGBUILD for compiling and ran the bisect in the source directory. Unfortunately it looks like it was a red herring. Thank you!
