Last edited by Xerx0 (2024-12-28 22:54:52)
Offline
I installed the nvidia dkms driver 470xx via yay. After the installation I edited the mkinitcpio.conf and added the modules and removed kms.
Then I built a new image via mkinitcpio -P and found it strange that the nvidia modules weren't listed in the build process.
Now after a reboot the system hangs directly after the bootloader and the screen is "distorted" so I can't figure out exactly where it is stuck.
I thought that I could edit the kernel params (systemd) by pressing e, but it only shows "Entry does not support editing the command line" after pressing e.
Reboot a live image, chroot and then? Or is there another option to check first? Or how can those problems be avoided?
Note: For the future, is it better to always have a fallback option in the bootloader?
Offline
Did the modules actually build when you installed the drivers?
Offline
Yes they did.
Offline
Did they? When in the "stuck" state, can you switch VT with Ctrl+Alt+F2 and friends? What output does
dkms status
give you?
FWIW, if you're reliant on such old drivers it's probably handy to have an older kernel around. You probably want to install LTS so you can wait out potential issues with newer kernels.
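As a rough sketch of what a fallback boot entry for the LTS kernel might look like: the helper below only prints the text of a systemd-boot style entry. It assumes systemd-boot is the loader and that the kernel and initramfs use the default Arch file names (vmlinuz-linux-lts, initramfs-linux-lts.img). The function name loader_entry and the root= options in the usage note are made up for illustration and have to be replaced with whatever the working entry already uses.

```shell
#!/bin/sh
# Print a systemd-boot style loader entry.
#   $1  title shown in the boot menu
#   $2  kernel package name, e.g. linux or linux-lts (assumed default Arch file names)
#   $3  kernel command line (hypothetical here; copy it from a working entry)
loader_entry() {
    printf 'title   %s\n' "$1"
    printf 'linux   /vmlinuz-%s\n' "$2"
    printf 'initrd  /initramfs-%s.img\n' "$2"
    printf 'options %s\n' "$3"
}
```

Something like loader_entry "Arch Linux (LTS fallback)" linux-lts "root=/dev/mapper/root rw" prints the four lines; where to save them depends on how the ESP is mounted, so that part is left out on purpose.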
Last edited by V1del (2024-12-06 16:55:02)
Offline
Unfortunately there is no newer driver available because it's an older card (from 2014). I tried the current nvidia package first which gave me a msg on boot that the card is only supported via the legacy driver.
I couldn't switch between Terminals with Ctrl+Alt+F
What I did so far is start from a live system and added another entry for the loader without the params for modeset etc. and the system is starting again but I'm unsure how to proceed?
dkms status
nvidia/470.256.02, 6.12.1-arch1-1, x86_64: installed
Offline
Mod note: Moving to AUR Issues.
Offline
What card, exactly?
The fact that you even tried the latest version tells me you haven't read https://wiki.archlinux.org/title/NVIDIA
Last edited by Scimmia (2024-12-06 17:31:46)
Offline
I did read it but it seems that I might have missed the very first sentence... It's a system with a dual GPU so Nvidia Optimus is probably the correct one.
01:00.0 VGA compatible controller: NVIDIA Corporation GK107M [GeForce GT 750M Mac Edition] (rev a1)
Offline
What I did so far is start from a live system and added another entry for the loader without the params for modeset etc. and the system is starting again but I'm unsure how to proceed?
Maybe it's time to assess what the problem actually is.
Can you boot the multi-user.target (2nd link below)?
If not, do not reboot by holding the power button - frenetically press ctrl+alt+del or use the https://wiki.archlinux.org/title/Keyboa … el_(SysRq) (the entire REISUB sequence)
Then access the journal of that boot from the chroot or the live environment, https://wiki.archlinux.org/title/System … al_to_view and post it
From the chroot:
sudo journalctl -b -1 | curl -F 'file=@-' 0x0.st
Offline
Sorry seth I have to ask because I'm unsure if I understand your suggestion correctly:
cat /proc/sys/kernel/sysrq
16
So the kernel was compiled with sysrq enabled? Unfortunately the keyboard on the device (it's a MacBook Pro) doesn't have a print screen key. How can I figure out what to press instead if I don't have another keyboard available?
When I try to boot with the added kernel parameters for the nvidia card, the system gets stuck even before I get to unlock the luks volume so I doubt that there are any logs present when I boot into a live system afterwards.
The kernel parameters I added are:
nvidia-drm.modeset=1 nvidia-drm.fbdev=1
If I had to guess, it probably has something to do with the switch from the CPU-integrated graphics to the nvidia card. That's just a guess, so how can I investigate further?
Offline
You cannot reboot the system w/ ctrl+alt+del (or cmd+alt+del or whatever the macbook uses for ctrl)?
So the kernel was compiled with sysrq enabled?
Yes, all official kernels are.
But absent a sysrq key, that's probably no viable vector anyway.
The 470xx drivers don't support nvidia-drm.fbdev=1 anyway - w/ the nvidia-drm.modeset=1 hack deactivated in 6.12.0 and 6.12.1 you can either wait for 6.12.2 or try the LTS kernel if we assume the simpledrm device gets in the way.
I doubt though that those are what breaks the boot? What happens if you omit them?
Offline
Without these parameters, the system boots, but it takes a long time. The boot menu appears; I select the fallback entry I created, which does not have the two parameters. Then, it takes a long time before I am asked for my LUKS password, and then the boot continues. I took a look at the dmesg output, but nothing unusual was there. After the boot is completed, the modules are listed as they should (lsmod).
Regarding the need for these parameters in general: Where can I find information that they are not needed with the driver I installed? Everything I have read so far indicated that they should be added.
The system boots into SDDM, but after entering the login credentials, it freezes with a black screen. It is not possible to switch to another tty (Fn+Ctrl+Alt+F) in that case.
I'm going to try the LTS kernel next.
Offline
Try to get us a journal of the troublesome boots by any means necessary
Where can I find information that they are not needed with the driver I installed?
It's not authoritative, but see https://wiki.archlinux.org/title/NVIDIA … de_setting (there's a link to the PKGBUILD, though). For fbdev (it's not "not needed", it's simply not supported) you'll have to look up the nvidia changelogs that announced it … but you're on 470xx, d'ohh.
So "nvidia-drm.modeset=1" is still necessary/a very good idea, but currently doesn't perform one of its two jobs (telling the simpledrm device to gfy) - fbdev=1 is simply inert for you.
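To double-check whether a given boot actually got the modeset parameter, something along these lines could be used. It is only a sketch: the function name has_nvidia_modeset is made up, it only looks for the literal value 1, and it relies on the kernel treating '-' and '_' in parameter names as interchangeable.

```shell
#!/bin/sh
# Succeeds if the kernel command line passed as $1 contains
# nvidia-drm.modeset=1 (or the nvidia_drm.modeset=1 spelling).
has_nvidia_modeset() {
    printf '%s\n' "$1" |
        tr ' ' '\n' |          # one parameter per line
        sed 's/-/_/g' |        # normalise dashes to underscores
        grep -qx 'nvidia_drm\.modeset=1'
}
```

On the running system it could be called as has_nvidia_modeset "$(cat /proc/cmdline)" && echo "modeset requested". It says nothing about whether the driver honoured the parameter, only whether it was passed.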
Offline
I'm confused. I installed the LTS kernel and the LTS kernel headers, ran mkinitcpio, and finally added a new entry in the bootloader. No errors occurred during these steps.
After a reboot, I selected the new entry. Then, nearly one minute passed before I got a distorted screen, and all I could (barely) see was a message:
system logs. "system logs. "systemctl reboot" systemctl reboot" ..... to continue boot ....
it's hard to read.
ctrl+alt+backspace triggers a reboot.
Gonna check the Nvidia docs again, maybe I missed something, or is there anything else I should look at first?
Offline
Try to get us a journal of the troublesome boots by any means necessary
As pointed out in #10, you can access that boot's journal from the iso or by chrooting into the system.
Offline
I totally understand the problem. It's hard to figure out where the problem is, especially without access to the computer, but those files include sensitive information, and I don't feel comfortable uploading them directly. I would need to redact some things.
There is one thing I don't understand. When I installed the LTS kernel, my pacman hook triggered mkinitcpio, which listed the nvidia modules with a warning (not found) because the headers were missing. I installed them and ran mkinitcpio again, which completed without errors but did not explicitly list the build process for those modules. Isn't it the case that the mkinitcpio build process uses the information from the mkinitcpio.conf and builds the image with the modules listed?
Offline
I have quite "similar" hardware, GK107 - without the "M" letter. A few minutes ago I updated the stock kernel to 6.12.3. The modules were built successfully.
Isn't it the case that the mkinitcpio build process uses the information from the mkinitcpio.conf and builds the image with the modules listed?
Are you 100% sure your settings are correct, btw?
Here is my config:
cat /etc/mkinitcpio.conf.d/custom.conf
# /etc/mkinitcpio.conf.d/custom.conf
MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm)
On kernel command line (passed to bootloader config) there is only one nvidia parameter:
nvidia-drm.modeset=1
Initramfs image contains nvidia modules, as follows:
lsinitcpio /boot/initramfs-linux.img| grep dkms
usr/lib/modules/6.12.3-arch1-1/updates/dkms/
usr/lib/modules/6.12.3-arch1-1/updates/dkms/nvidia-drm.ko.zst
usr/lib/modules/6.12.3-arch1-1/updates/dkms/nvidia-modeset.ko.zst
usr/lib/modules/6.12.3-arch1-1/updates/dkms/nvidia-uvm.ko.zst
usr/lib/modules/6.12.3-arch1-1/updates/dkms/nvidia.ko.zst
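The same check can be made a bit more explicit with a small filter that reads an lsinitcpio listing and prints whichever of the four modules is missing. This is only a sketch; the function name missing_nvidia_modules is made up, and it assumes the file names look like the ones in the listing above (nvidia.ko, nvidia-modeset.ko, and so on, with or without a compression suffix).

```shell
#!/bin/sh
# Reads an initramfs file listing on stdin and prints the name of every
# nvidia module that does not appear in it. No output means all four are there.
missing_nvidia_modules() {
    listing=$(cat)
    for mod in nvidia nvidia-modeset nvidia-uvm nvidia-drm; do
        # match ".../<mod>.ko" followed by anything (e.g. ".zst")
        printf '%s\n' "$listing" | grep -q "/${mod}\.ko" || printf '%s\n' "$mod"
    done
}
```

Usage would be lsinitcpio /boot/initramfs-linux.img | missing_nvidia_modules, and the same again for the LTS image.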
Offline
It's hard to figure out where the problem is, especially without access to the computer, but
No. Not "hard" - "impossible".
those files include sensitive information
The journal isn't supposed to contain sensitive data.
Once in a while a publicly routable IPv6 shows up there, but that's *rare* (eg. NM filters them for the logs) - things one might find in there are
1. this actually isn't arch (no excuse)
2. you chose an embarrassing host or user name (nobody cares)
Partition UUIDs are traceable if you post them across the internet, but don't reveal anything - they exist for collision-free device addressing and your radio MACs are only useful for somebody in wifi range (and only if you use a MAC filter, which isn't good security tbw.) - your AP btw. yells its SSID and BSSID into the air all the time.
installed them and ran mkinitcpio again which was completed without errors but did not explicitly list the building process for those modules.
dkms likely built them when installing the headers, you can check "dkms status" for that.
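For what it's worth, the dkms status output can be filtered so that only entries which are not in the "installed" state remain. This is a sketch under the assumption that the lines look like the one posted earlier in this thread ("module/version, kernel, arch: state"); the function name dkms_report is made up.

```shell
#!/bin/sh
# Reads `dkms status` output on stdin. Prints "module/version kernel state"
# for every entry whose state does not start with "installed" and returns 1
# if there was at least one such entry, 0 otherwise.
dkms_report() {
    awk -F'[,:] *' '
    NF >= 2 {
        kernel = (NF >= 3) ? $2 : "(no kernel)"
        if ($NF !~ /^installed/) {
            print $1, kernel, $NF
            bad = 1
        }
    }
    END { exit bad + 0 }'
}
```

Called as dkms status | dkms_report, silence plus exit status 0 would mean every listed module is installed for its kernel.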
The nice thing about 6.12.3 will be that "nvidia-drm.modeset=1" actually works again and the simplydumb device won't show up.
@Fixxer, your system works (again)? "glxinfo -B" says you're running on the nvidia GPU?
Offline
@Fixxer, your system works (again)? "glxinfo -B" says you're running on the nvidia GPU?
Yes, all works (rebooted), Nvidia GTX 650 Ti Boost is the only GPU on my PC.
glxinfo -B
name of display: :0.0
display: :0 screen: 0
direct rendering: Yes
Memory info (GL_NVX_gpu_memory_info):
Dedicated video memory: 2048 MB
Total available memory: 2048 MB
Currently available dedicated video memory: 1644 MB
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: NVIDIA GeForce GTX 650/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 470.256.02
OpenGL core profile shading language version string: 4.60 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL version string: 4.6.0 NVIDIA 470.256.02
OpenGL shading language version string: 4.60 NVIDIA
OpenGL context flags: (none)
OpenGL profile mask: (none)
OpenGL ES profile version string: OpenGL ES 3.2 NVIDIA 470.256.02
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
It's Xfce with SDDM, all latest, updated stable Arch branch.
Last edited by Fixxer (2024-12-07 21:42:52)
Offline
@seth I looked through the journal, and it contains CPU register contents, MAC addresses for my network interfaces, my Wi-Fi AP's SSID, local IP addresses, the daemon I use to block connections without a VPN connection, VPN connection parameters (settings and connection destination), public IP addresses, and other information. I will redact this information and upload it as soon as possible, but please understand that I am uncomfortable posting this and need to review the log before doing so.
@Fixxer I have the same parameters, just not inside a separate file, but this shouldn't make a difference. Same kernel parameter as well.
dkms status lists them for both kernels (6.12.1-arch1-1 and 6.6.63-1-lts). The version of the driver is 470.256.02.
I will look through the log and upload it next. Thanks so far everyone.
Offline
@seth I looked through the journal, and it contains CPU register contents
You mean from a backtrace because a process crashed?
That's kinda the point of the entire backtrace thing and possibly very important here.
MAC addresses for my network interfaces, my Wi-Fi AP's SSID
Yes, I addressed that. Being close to your AP, I know your SSID and not being close to it, what will I do with that information anyway?
Likewise your MACs.
local IP addresses
LAN IPv4 is meaningless outside the LAN.
daemon I use to block connections
How is that "sensitive"?
VPN connection parameters (settings and connection destination)
I can't speak to your VPN, but
public IP addresses
From what process?? Your VPN provider?
I've not seen such in a long time.
When redacting the journal, please make sure to pseudonymize it (aaa => xxx, bbb => yyy; not aaa => xxx, bbb => xxx) and not break the structure of the journal by flat-out deleting columns (as that will break syntax highlighting), thanks.
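One way to do that kind of consistent replacement, sketched here for MAC addresses only: every distinct address is mapped to its own placeholder (MAC_1, MAC_2, ...), so the same address always gets the same replacement and the rest of each line stays untouched. The function name pseudonymize_macs and the placeholder scheme are made up for this example; IP addresses, SSIDs and so on would need the same treatment with their own patterns.

```shell
#!/bin/sh
# Reads text on stdin and replaces each distinct MAC address with a stable
# placeholder. Matching is case-insensitive: AA:BB:... and aa:bb:... map to
# the same placeholder. Everything else on the line is left as it is.
pseudonymize_macs() {
    awk '
    BEGIN {
        h = "[0-9A-Fa-f][0-9A-Fa-f]"
        mac = h ":" h ":" h ":" h ":" h ":" h
    }
    {
        out = ""
        rest = $0
        while (match(rest, mac)) {
            found = tolower(substr(rest, RSTART, RLENGTH))
            if (!(found in seen))
                seen[found] = "MAC_" (++n)
            out = out substr(rest, 1, RSTART - 1) seen[found]
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}
```

It could be put in front of the upload, e.g. journalctl -b -1 | pseudonymize_macs, with the result reviewed by hand before posting.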
dkms status lists them for both kernels
added or installed?
Please don't paraphrase, https://bbs.archlinux.org/viewtopic.php?id=57855
Offline
That's kinda the point of the entire backtrace thing and possibly very important here.
True. But they (the CPU registers) could contain information I don't want to be known. The function calls should be enough to see where the problem might be.
Yes, I addressed that. Being close to your AP, I know your SSID and not being close to it, what will I do with that information anyway?
Likewise your MACs.
A simple lookup inside a SSID database would reveal my precise location. I don't want that.
LAN IPv4 is meaningless outside the LAN.
Reveals network structure.
How is that "sensitive"?
Login details inside the log.
From what process?? Your VPN provider?
I've not seen such in a long time.
Yes, my VPN provider.
I don't understand why I have to "defend" myself for not revealing those details....
added or installed?
Both listed as installed.
Sorry for the formatting but I didn't want to requote everything.
Offline
Hopefully I have replaced everything. If not please let me know.
Offline
Dec 08 00:18:27 archlinux kernel: simple-framebuffer simple-framebuffer.0: [drm] Registered 1 planes with drm panic
Dec 08 00:18:27 archlinux kernel: [drm] Initialized simpledrm 1.0.0 for simple-framebuffer.0 on minor 0
Dec 08 00:18:27 archlinux kernel: simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device
…
Dec 08 00:18:29 archlinux kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Dec 08 00:18:29 archlinux kernel: [drm] Initialized nvidia-drm 0.0.0 for 0000:01:00.0 on minor 1
Dec 08 00:18:29 archlinux kernel: Failed to initialize the nv-hotplug-helper DRM client (ensure DRM kernel mode setting is enabled via nvidia-drm.modeset=1).
Dec 08 00:18:29 archlinux kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Unloading driver
Dec 08 00:19:19 XYC systemd[1]: Starting Load Kernel Module drm...
Dec 08 00:19:19 XYC systemd[1]: modprobe@drm.service: Deactivated successfully.
…
Dec 08 00:19:40 XYC sddm-greeter-qt6[793]: Adding view for "None-1" QRect(0,0 2880x1800)
For now you're running into the simplydumb device. Why have you not updated the system to get a kernel where the nvidia_drm.modeset=1 hack has been restored?
It would probably have spared us a lot of time.
Off Topic:
----------------------------
True. But they (the CPU registers) could contain information I don't want to be known. The function calls should be enough to see where the problem might be.
I don't know what you saw or think you saw, but backtraces will contain that: stack positions of function calls, sometimes (if the binary was dwarf-enabled) of variables and even their values (but that's rare)
That's it.
If some unencrypted, truly sensitive data (your background color is not) shows up in a backtrace, ever, file a bug against that software.
Highly sensitive data should not even be kept in memory unscrambled (luckily that's a super-duper rare condition to take care of itfp)
And there are only two kernel backtraces: the wl module triggers a warning twice.
Login details inside the log.
File a bug against whatever that is - this is insane.
However:
You're using mullvad and the address you redacted fits with 45.83.223.x:443 and is one of their servers and tells nothing about you, your VPN, your network, your location or anything else.
Reveals network structure.
No. (No routing or IP tables)
But even if: so what?
Also
Dec 08 00:19:35 XYC NetworkManager[692]: <info> [1733613575.7204] dhcp4 (wlan0): state changed new lease, address=111.111.111.22
It only says what lease *you* get - that's not a structure nor a network.
Here, mine is 192.168.11.2 - now hack me.
A simple lookup inside a SSID database would reveal my precise location. I don't want that.
Do you have access to such database?
Do you use a very unique SSID? Why?
If you're worried about that, I'd rather change that…
I don't understand why I have to "defend" myself for not revealing those details....
You don't have to defend anything, I'm suggesting that you drastically overestimate the sensitivity of anything in those logs.
You see "number I don't understand" and assume "It's probably encoding my penis length!"
You're free to disagree, but if you care about privacy I'd invite you to try to figure out what all these things are and mean, so you're not scared out of your mind.
Offline