#1 2023-07-06 04:18:18

OpusOne
Member
Registered: 2023-05-31
Posts: 186

amdgpu gets loaded relatively late during the boot process

Hi, I have an AMD GPU (6650XT) and the amdgpu module gets loaded rather late during boot.
I have enabled the plymouth splash (along w/ the 'quiet' option), and I suppose the late module load is why the splash screen takes a while to appear? The end result is that the screen remains black for most of the boot, and the splash only appears for less than one second before sddm gets loaded. Since I have a motherboard with many PCIe devices, the boot process takes a while, so that's a bit annoying.

$ systemd-analyze 
Startup finished in 14.758s (kernel) + 3.221s (userspace) = 17.980s 
graphical.target reached after 3.202s in userspace.

Yeah, that's not a fast boot. It's not horrific, but I'd prefer to have a splash screen during most of that time instead of nothing.
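
To make the kernel/userspace split in that output explicit, here is a minimal sketch. The helper name `boot_split` is made up for illustration, and it assumes the plain seconds-only format shown above (no firmware/loader fields, no minutes):

```shell
# Hypothetical helper: read `systemd-analyze` output on stdin and print
# how the kernel and userspace phases share the measured boot time.
boot_split() {
    awk '/\(kernel\)/ {
        k = 0; u = 0
        # The duration is the field right before each "(phase)" label.
        for (i = 1; i < NF; i++) {
            if ($(i + 1) == "(kernel)")    k = $i + 0   # "14.758s" -> 14.758
            if ($(i + 1) == "(userspace)") u = $i + 0
        }
        if (k + u > 0)
            printf "kernel %.0f%%, userspace %.0f%%\n",
                   100 * k / (k + u), 100 * u / (k + u)
    }'
}

# Intended use (not executed here):
#   systemd-analyze | boot_split
```

For the numbers above this attributes roughly four fifths of the time to the kernel phase, which suggests the delay sits before userspace even starts.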

Currently, my /etc/mkinitcpio.conf has an empty MODULES list. Would adding 'amdgpu' to the MODULES list make it load much earlier and "solve" the above issue?
Is it a good idea and could there be any side-effect?

Offline

#2 2023-07-06 05:25:05

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,494

Re: amdgpu gets loaded relatively late during the boot process

Would adding 'amdgpu' to the MODULES list make it load much earlier and "solve" the above issue?

Yes (given you rebuild the initramfs afterwards) and maybe.
It's usually a good idea because systemd starts the graphical.target based on "optimism". The side-effect is a bigger initramfs (so more used space on your /boot partition)

Is the kms hook in the HOOKS?
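
A rough sketch of the change being discussed, for reference. The helper name `set_early_module` is hypothetical, and it only handles the stock empty `MODULES=()` line. It prints a modified copy of the config instead of editing in place:

```shell
# Hypothetical helper: print a mkinitcpio.conf with one module placed
# into an empty MODULES=() array. The input file is left untouched.
# $1: module name, $2: path to mkinitcpio.conf
set_early_module() {
    sed "s/^MODULES=()\$/MODULES=($1)/" "$2"
}

# Intended use, as root (not executed here):
#   set_early_module amdgpu /etc/mkinitcpio.conf > /tmp/mkinitcpio.conf.new
#   # review the result, copy it over /etc/mkinitcpio.conf, then rebuild:
#   mkinitcpio -P
```

The rebuild step matters: without it the running initramfs still reflects the old MODULES list.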

Offline

#3 2023-07-06 06:03:34

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

seth wrote:

Would adding 'amdgpu' to the MODULES list make it load much earlier and "solve" the above issue?

Yes (given you rebuild the initramfs afterwards) and maybe.
It's usually a good idea because systemd starts the graphical.target based on "optimism". The side-effect is a bigger initramfs (so more used space on your /boot partition)

OK thanks. I'll try that. I don't mind the added size - I have ample space on this /boot partition.
Is there any module I should add in the MODULES list before amdgpu (without which it may not load)? Or is adding just amdgpu enough?

seth wrote:

Is the kms hook in the HOOKS?

Yes. Here is my mkinitcpio.conf file (it's pretty much the default one, with just the 'plymouth' hook added):

# vim:set ft=sh
# MODULES
# The following modules are loaded before any boot hooks are
# run.  Advanced users may wish to specify all system modules
# in this array.  For instance:
#     MODULES=(usbhid xhci_hcd)
MODULES=()

# BINARIES
# This setting includes any additional binaries a given user may
# wish into the CPIO image.  This is run last, so it may be used to
# override the actual binaries included by a given hook
# BINARIES are dependency parsed, so you may safely ignore libraries
BINARIES=()

# FILES
# This setting is similar to BINARIES above, however, files are added
# as-is and are not parsed in any way.  This is useful for config files.
FILES=()

# HOOKS
# This is the most important setting in this file.  The HOOKS control the
# modules and scripts added to the image, and what happens at boot time.
# Order is important, and it is recommended that you do not change the
# order in which HOOKS are added.  Run 'mkinitcpio -H <hook name>' for
# help on a given hook.
# 'base' is _required_ unless you know precisely what you are doing.
# 'udev' is _required_ in order to automatically load modules
# 'filesystems' is _required_ unless you specify your fs modules in MODULES
# Examples:
##   This setup specifies all modules in the MODULES setting above.
##   No RAID, lvm2, or encrypted root is needed.
#    HOOKS=(base)
#
##   This setup will autodetect all modules for your system and should
##   work as a sane default
#    HOOKS=(base udev autodetect modconf block filesystems fsck)
#
##   This setup will generate a 'full' image which supports most systems.
##   No autodetection is done.
#    HOOKS=(base udev modconf block filesystems fsck)
#
##   This setup assembles a mdadm array with an encrypted root file system.
##   Note: See 'mkinitcpio -H mdadm_udev' for more information on RAID devices.
#    HOOKS=(base udev modconf keyboard keymap consolefont block mdadm_udev encrypt filesystems fsck)
#
##   This setup loads an lvm2 volume group.
#    HOOKS=(base udev modconf block lvm2 filesystems fsck)
#
##   NOTE: If you have /usr on a separate partition, you MUST include the
#    usr and fsck hooks.
HOOKS=(base udev plymouth autodetect modconf kms keyboard keymap consolefont block filesystems fsck)

# COMPRESSION
# Use this to compress the initramfs image. By default, zstd compression
# is used. Use 'cat' to create an uncompressed image.
#COMPRESSION="zstd"
#COMPRESSION="gzip"
#COMPRESSION="bzip2"
#COMPRESSION="lzma"
#COMPRESSION="xz"
#COMPRESSION="lzop"
#COMPRESSION="lz4"

# COMPRESSION_OPTIONS
# Additional options for the compressor
#COMPRESSION_OPTIONS=()

# MODULES_DECOMPRESS
# Decompress kernel modules during initramfs creation.
# Enable to speedup boot process, disable to save RAM
# during early userspace. Switch (yes/no).
#MODULES_DECOMPRESS="yes"

Offline

#4 2023-07-06 06:12:25

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,494

Re: amdgpu gets loaded relatively late during the boot process

The kms hook is supposed to add the amdgpu module anyway.
Please post your complete system journal for the boot:

sudo journalctl -b | curl -F 'f:1=<-' ix.io
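
Whether the kms hook really put amdgpu into the image can also be checked directly. A minimal sketch, assuming the image lives at the default /boot/initramfs-linux.img and using a made-up helper name that filters the file listing printed by lsinitcpio:

```shell
# Hypothetical helper: succeed if the file listing on stdin contains
# the given kernel module, compressed or not.
# $1: module name
initramfs_has_module() {
    grep -Eq "/$1\.ko(\.zst|\.xz|\.gz)?\$"
}

# Intended use (not executed here; the image path is an assumption):
#   lsinitcpio /boot/initramfs-linux.img | initramfs_has_module amdgpu \
#       && echo "amdgpu is in the initramfs"
```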

Offline

#5 2023-07-06 20:43:28

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

The output of 'sudo journalctl -b' is currently gigantic, flooded with entries due to a pipewire bug (see another thread) and other things not related to the boot at all.

Wouldn't the output of 'sudo dmesg' be enough to figure it out?

http://ix.io/4zUO

Last edited by OpusOne (2023-07-06 20:50:43)

Offline

#6 2023-07-06 20:56:17

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,494

Re: amdgpu gets loaded relatively late during the boot process

Wouldn't the output of  'sudo dmesg' be enough to figure it out?

No, the system boots within 2/3 seconds - you're looking at a userspace issue.

I don't care how huge the file is or what clutters it, I'll just get my big boy pants ;)
If ix.io rejects the upload because of its quota, try

sudo journalctl -b | curl -F 'file=@-' 0x0.st

Offline

#7 2023-07-06 21:24:50

loqs
Member
Registered: 2014-03-06
Posts: 18,880

Re: amdgpu gets loaded relatively late during the boot process

[    0.000000] Command line: root=UUID=6f88b7b3-941e-42e8-b440-f3163ac9dbfe rw add_efi_memmap initrd=\boot\intel-ucode.img initrd=\boot\initramfs-linux.img init=/usr/lib/systemd/systemd pci=noaer threadirqs quiet loglevel=3 systemd.show_status=auto libahci.ignore_sss=1 splash

Was there a storm of PCI AER before pci=noaer was added?

[    0.631149] acpi PNP0A03:03: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
[    1.648369] acpi PNP0A03:03: _OSC: platform does not support [PCIeHotplug SHPCHotplug LTR DPC]
[    3.675010] acpi PNP0A03:03: _OSC: OS now controls [PME PCIeCapability]
[    3.675013] acpi PNP0A03:03: FADT indicates ASPM is unsupported, using BIOS configuration
....
[    3.688703] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
[    4.715018] acpi PNP0A08:00: _OSC: platform does not support [PCIeHotplug SHPCHotplug LTR DPC]
[    6.741675] acpi PNP0A08:00: _OSC: OS now controls [PME PCIeCapability]
[    6.741677] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration

Possibly an issue with the firmware's ACPI implementation or I could be misreading the timings.
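
To double-check the timings rather than eyeball them, the gaps between consecutive dmesg timestamps can be computed. A minimal sketch, with a made-up helper name and an arbitrary default threshold of 0.5 s:

```shell
# Hypothetical helper: read dmesg lines on stdin and report every line
# that follows a pause of at least $1 seconds (default 0.5).
dmesg_gaps() {
    awk -v min="${1:-0.5}" '
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the surrounding brackets, keep the timestamp.
            ts = substr($0, 2, RLENGTH - 2) + 0
            if (seen && ts - prev >= min)
                printf "%.3f s gap before: %s\n", ts - prev, $0
            prev = ts
            seen = 1
        }'
}

# Intended use (not executed here):
#   sudo dmesg | dmesg_gaps 1
```

Fed only the first four _OSC lines quoted above, it reports pauses of about 1.0 s and 2.0 s before the second and third line. On a full dmesg the gaps are measured against whatever line precedes each message, so the figures can differ.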

Offline

#8 2023-07-06 21:27:03

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

seth wrote:

Wouldn't the output of  'sudo dmesg' be enough to figure it out?

No, the system boots within 2/3 seconds - you're looking at a userspace issue.

I don't quite know yet what the full issue is, but my system definitely doesn't boot within 2/3 seconds, if you look at the dmesg output I linked to.

Just that may explain a significant part of the problem:

[    6.764971] pci 0000:03:00.0: vgaarb: setting as boot VGA device
[    6.764973] pci 0000:03:00.0: vgaarb: bridge control possible
[    6.764974] pci 0000:03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    6.764977] vgaarb: loaded

vgaarb takes almost 7 seconds to load. I suppose that before that, it would be impossible to have anything on screen?

And later on yet:

[    9.915997] [drm] amdgpu kernel modesetting enabled.

Last edited by OpusOne (2023-07-06 21:28:25)

Offline

#9 2023-07-06 21:30:11

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,494

Re: amdgpu gets loaded relatively late during the boot process

Whoops, sorry - I conflated the threads, your system obviously does not boot in 2 seconds to GDM ;)

Edit: I also didn't see the added dmesg, but this has nothing to do w/ the amdgpu kernel module loading somehow late.

libahci.ignore_sss=1

?

Explain that, remove the pci=noaer, post an updated dmesg and on a wild guess: the dvd drive.
Is there a disc inside and does it make a difference?

Last edited by seth (2023-07-06 21:41:57)

Offline

#10 2023-07-06 21:33:11

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

loqs wrote:

[    0.000000] Command line: root=UUID=6f88b7b3-941e-42e8-b440-f3163ac9dbfe rw add_efi_memmap initrd=\boot\intel-ucode.img initrd=\boot\initramfs-linux.img init=/usr/lib/systemd/systemd pci=noaer threadirqs quiet loglevel=3 systemd.show_status=auto libahci.ignore_sss=1 splash

Was there a storm of PCI AER before pci=noaer was added?

Yes, but they were not appearing this early during boot as far as I remember.
They were due to one of my NVMe's (Samsung 960 Pro), the AER errors with its controller are apparently a known issue.

loqs wrote:

[    0.631149] acpi PNP0A03:03: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
[    1.648369] acpi PNP0A03:03: _OSC: platform does not support [PCIeHotplug SHPCHotplug LTR DPC]
[    3.675010] acpi PNP0A03:03: _OSC: OS now controls [PME PCIeCapability]
[    3.675013] acpi PNP0A03:03: FADT indicates ASPM is unsupported, using BIOS configuration
....
[    3.688703] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
[    4.715018] acpi PNP0A08:00: _OSC: platform does not support [PCIeHotplug SHPCHotplug LTR DPC]
[    6.741675] acpi PNP0A08:00: _OSC: OS now controls [PME PCIeCapability]
[    6.741677] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration

Possibly an issue with the firmware's ACPI implementation or I could be misreading the timings.

Yes, this part takes a pretty long time. I am not sure why.

Offline

#11 2023-07-06 21:42:17

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,494

Re: amdgpu gets loaded relatively late during the boot process

seth wrote:

Edit: I also didn't see the added dmesg, but this has nothing to do w/ the amdgpu kernel module loading somehow late.

libahci.ignore_sss=1

?

Explain that, remove the pci=noaer, post an updated dmesg and on a wild guess: the dvd drive.
Is there a disc inside and does it make a difference?

Offline

#12 2023-07-06 22:16:13

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

I had added 'libahci.ignore_sss=1' to potentially speed up boot a bit, following this:
https://wiki.archlinux.org/title/Improv … ed_spin-up
(yes, SSS was being used in my case.)

(Edit: my machine contains the following drives:
- 2 NVMe SSDs
- 2 SATA SSDs
- 1 SATA DVD drive
)

There is no disc in my DVD drive. I don't know if having a disc in it would make any difference; I would expect the boot to be even slower with a disc inside.

Last edited by OpusOne (2023-07-06 23:13:52)

Offline

#13 2023-07-07 04:01:51

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

It really just looks like this is a motherboard firmware issue, as loqs pointed out. I've found a couple oldish posts about that with a similar series of ASUS motherboards.

On my other machines, the ACPI _OSC queries are almost instant. On this machine, they take a very long time. The BIOS is the latest that was released, so there isn't much I can update on that side.

Disabling ACPI ('acpi=off') does solve the problem (this ACPI phase lasts almost 7 seconds in total with this motherboard), but I don't want to be without ACPI support. So I guess I'll have to live with this boot time and the black screen for a big part of it.
Unless someone has an idea.

Offline

#14 2023-07-07 06:57:37

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,494

Re: amdgpu gets loaded relatively late during the boot process

https://bbs.archlinux.org/viewtopic.php … 7#p1794647

I inserted a blank DVD in and the error went away

https://bbs.archlinux.org/viewtopic.php … 5#p1794655

seth wrote:

remove the pci=noaer, post an updated dmesg

Offline

#15 2023-08-01 21:41:09

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

Forgot to post answers. I basically consider this issue unsolvable at this point, unless ACPI initialization could be handled differently in the kernel. I doubt the kernel team will ever bother with this, as it definitely looks like a bad ACPI implementation on this particular motherboard series.

- I had no ata errors, so not sure what DVD drive would have to do with it. I still tried booting with a disc inside the drive out of curiosity. It unsurprisingly didn't change anything (except adding a couple more seconds for my boot manager to start - rEFInd - since it tried to scan the disc.)
- The pci=noaer was strictly to avoid AER reports from a single PCI device (I had made absolutely sure of that): the NVMe controller of one of my SSDs. The reports were for correctable errors, so they had no impact except literally flooding dmesg, which was a royal PITA. Searching the web, it appeared to be a "known" issue with the controller of the Samsung 960 Pro NVMe series, due to how it handles power states. Another option would have been to disable power management for PCI devices, but I preferred to keep that and just disable AER. Either way, there was absolutely no change in boot times.

I'm not a kernel expert, but it appears that ACPI initialization happens very early in the boot process and nothing can be displayed before this phase is over, which is why the screen remains black for several seconds in my case. On most other machines I have, this phase happens within about the first 200ms and ACPI init itself only lasts a couple of ms, so in practice you don't notice it.

Here is a thread with the exact same behavior, with a motherboard of the same series (not the same one, but all X99 ASUS motherboards share the same BIOS as far as I've gathered, which is why it's named "All Series".)

https://askubuntu.com/questions/965078/ … g-x99-acpi

Discussing this with a few people, ACPI looks to be poorly implemented on many motherboards actually, most vendors doing just enough to make it run smoothly with Windows, not caring about really following the specs.
This one problem leading to 7s of delay is particularly annoying, but it's pretty rare that you don't get at least one ACPI error/conflict message at boot.

It looks as though the firmware of this motherboard either replies very late on some ACPI queries or it doesn't reply at all, which would trigger some kind of time-out. I don't know enough of the kernel to know if this 3s delay we can see is hard-coded as a time-out in the kernel or if it's in the motherboard's firmware. My guess is that the firmware doesn't seem to even care to reply to queries on capabilities that it doesn't support.

So, pretty much unsolvable as far as I can tell. Actually this series of motherboards is also pretty slow to POST, so the added 7s boot time for the kernel is annoying, but it's only part of the pain. Yes I know, it's already old and I could change to something a bit newer. But otherwise the machine runs pretty well with a Xeon CPU.

Offline

#16 2023-08-02 06:46:20

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 75,494

Re: amdgpu gets loaded relatively late during the boot process

so not sure what DVD drive would have to do with it.

seth wrote:

wild guess

- by symptoms and the specific model being on the record for this (and your libahci attempt)

Also certainly remove "quiet … splash" for tests. Beyond that, try lying to the firmware about which OS is running:

acpi_osi=! acpi_osi="Windows 2009"

You can play w/ the Windows versions (2012/13/15): https://learn.microsoft.com/en-us/windo … inacpi-osi has a complete list
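
A small sketch for working through those variants one boot at a time. The helper name is made up, the version list is just the ones mentioned in this post, and how the quotes have to be escaped depends on the boot manager's config format:

```shell
# Hypothetical helper: print one candidate kernel parameter pair per
# Windows version string mentioned above.
acpi_osi_candidates() {
    for ver in 2009 2012 2013 2015; do
        printf 'acpi_osi=! acpi_osi="Windows %s"\n' "$ver"
    done
}

acpi_osi_candidates
```

Each printed line is meant to be appended to the kernel command line for a single test boot, then compared against the _OSC timings in dmesg.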

Offline

#17 2023-08-25 06:39:06

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

Haven't had time to debug this further, but I'll investigate setting the 'acpi_osi' parameter and see how it goes. Thanks for the pointer.

Offline

#18 2023-08-26 01:07:17

OpusOne
Member
Registered: 2023-05-31
Posts: 186

Re: amdgpu gets loaded relatively late during the boot process

OK, I tried.
- With "Windows 2009", it made the computer freeze when reaching SDDM. There was still the same initial delay during boot.
- With versions after 2009, up to 2015 (the latest version supported on this motherboard with this BIOS version), it did boot normally but there was still this delay during the _OSC queries.

So, no luck here. There's definitely something in this BIOS that makes the _OSC queries very slow. No clue what it is.
Out of interest, I had a look at some of the source code, e.g. https://github.com/torvalds/linux/blob/ … pci_root.c . The "culprit" function seems to be acpi_pci_osc_control_set(), but I guess it all depends on how the BIOS responds.

So, I'll leave it at that for the time being. Unless you have another idea.
Everything works fine so far, including suspend/resume (which I use often), so this abnormal delay during boot is more of a "minor" annoyance than anything else.

Offline
