Hi. I had an Arch installation that I used for a year or so with no problems. I was fiddling with scanner/printer drivers when my screen froze. I restarted and it wouldn't get past systemd-udevd. I can't even access a TTY. I reinstalled the OS multiple times with no luck. I've also tried everything related to my issue on the https://wiki.archlinux.org/title/NVIDIA/Troubleshooting page. Weirdly enough, the NVIDIA card works under Windows 10, so it's apparently not a hardware issue. My card is an RTX 2070 Super.
Here are the system logs:
`dmesg`: https://pastebin.com/0rMRxfzH
`journalctl`: https://pastebin.com/Z6W9L6nA
`lsmod`: https://pastebin.com/an79KqCa
`lspci`: https://pastebin.com/kCvYFmJc
Keep in mind that I logged these in arch-chroot, since I couldn't boot into the OS otherwise. These logs are from a fresh installation.
Thanks.
Offline
So at the first sign of trouble you just nuke the OS and start over; it probably would have been a lot quicker and easier to just fix the problem, but we will never know now.
What have you done? Tell us step by step how you installed the OS and how you installed the nvidia drivers; from the looks of it you are using the built-in nouveau driver.
Offline
I tried to fix it myself at the first sign of trouble but had no luck. It says nouveau because that's what the arch-chroot instance uses.
As for the installation process, it was the basic arch installation. I pacstrapped base, base-devel, linux and the firmware. Then I simply ran `pacman -S nvidia` after grub installation and a reboot. I rebooted again for nvidia drivers to blacklist nouveau, and returned to the conundrum where the OS doesn't boot anymore.
Last edited by alpindale (2023-02-24 01:38:11)
Offline
Keep in mind that I logged these in arch-chroot, since I couldn't boot into the OS otherwise.
Means they're mostly useless.
https://pastebin.com/raw/Z6W9L6nA is from the installed system, but it's booting nouveau.
You'll need to post a(n older) journal w/ nvidia in use.
Try to only boot the multi-user.target, 1st link below.
And stop the re-installation nonsense, you're just creating a moving target which will be harder to hit.
Edit: for wild speculation, https://wiki.archlinux.org/title/NVIDIA
1. ibt=off
2. nvidia_drm.modeset=1
3. https://wiki.archlinux.org/title/NVIDIA#Early_loading
Last edited by seth (2023-02-24 07:32:05)
Offline
I encountered the exact same issue yesterday.
I was using the latest kernel 5 LTS version, and when I did a full upgrade my system upgraded to the new kernel 6 LTS version. After the reboot I couldn't even start the OS anymore. I spent the day trying to fix it, looking for a combination of Nvidia drivers and kernel that work together, but I didn't manage to make it work.
The nvidia driver was my first suspect because similar situations had happened to me before: a kernel update comes out before the matching nvidia update does, and I end up with a problem if I do a full update in that interval.
Anyway, I ended up doing a full reinstall. I cleared my root partition, and even swap and EFI to be sure, because I was pissed over the waste of time, and the fresh install was working perfectly. The kernel was now 6.1.13-1-LTS. Then I just installed the latest nvidia-lts and nvidia-settings (simply using pacman -S). And needless to say, after the reboot I had the same issue that had started it all.
Fortunately, all I had to do to fix it (after chrooting from a USB stick) was remove the nvidia packages and their dependencies. A simple reboot and now it's all fine. So I'm using nouveau right now; I've got visual issues and bad performance, but for the time being I don't care because I've got work to do.
Conclusion: NVIDIA made a big poop again. Correct me if I'm wrong.
Offline
upgrade to the new kernel 6 LTS version
Did you install proper LTS nvidia drivers?
<49,17,III,I> Fama di loro il mondo esser non lassa;
<50,17,III,I> misericordia e giustizia li sdegna:
<51,17,III,I> non ragioniam di lor, ma guarda e passa.
Offline
This problem is trivial to solve and was documented when it first appeared in the 5.18 kernel three months ago. It amounts to adding ibt=off to your kernel parameters if you are on a new Intel processor and need external modules (on its own this has nothing to do with nvidia and is coincidental; all external modules are affected). Read the blue note in: https://wiki.archlinux.org/title/NVIDIA#Installation
Last edited by V1del (2023-02-24 10:06:41)
Online
denzil wrote:upgrade to the new kernel 6 LTS version
Did you install proper LTS nvidia drivers?
Yes I did.. that's why I mentioned I did pacman -S..
Here's exactly from the logs:
$ grep -i installed /var/log/pacman.log | grep nvidia
[2023-02-23T16:31:13+0100] [ALPM] installed nvidia-utils (525.89.02-2)
[2023-02-23T16:31:13+0100] [ALPM] installed nvidia-lts (1:525.89.02-5)
[2023-02-23T16:31:13+0100] [ALPM] installed nvidia-settings (525.89.02-1)
Meanwhile, I found out 30 minutes ago that a colleague at work with the exact same laptop/hardware as me did an update yesterday, and everything works fine for him. I don't get it.
This is what he's got and it's working:
$ pacman -Q | grep linux-lts
linux-lts 6.1.13-1
$ pacman -Q | grep nvidia-lts
nvidia-lts 1:525.89.02-5
Last edited by denzil (2023-02-24 10:09:07)
Offline
Did you read seth's and my post?
By "unbootable", do you mean it is actually unbootable, or are you able to switch terminals with Ctrl+Alt+F2 and the like? Can you boot with the nomodeset kernel parameter? Can you boot with the systemd.unit=multi-user.target kernel parameter? What's the rest of your hardware? Is this a laptop or a desktop? If it is a desktop, and it's not down to the ibt=off parameter, then another thing that regressed in *the kernel* is that they do not allocate consoles if another device steals the framebuffer, so if you have an integrated GPU that you are not actually using, then explicitly disable it in your UEFI.
Online
It's completely unbootable, and I can't switch to another terminal. With the quiet parameter removed, I can see that the boot process remains stuck at:
[ OK ] Finished Record System Boot/Shutdown in UTMP
I'm running on a desktop with an i7-8700 CPU, so it's not new enough for ibt=off to be an issue. I passed the ibt, multi-user, and nomodeset parameters, to little avail.
I mentioned this in the OP before, but it was working fine until now and I hadn't changed a single setting or messed with the hardware. I have also disabled the integrated GPU in UEFI settings.
Offline
I will try another distro, as I've only tried installing Arch and then Manjaro. Maybe it's an Arch specific problem. I'll report back if it works with Mint.
Offline
another thing that regressed in *the kernel* is that they do not allocate consoles if another device steals the framebuffer, so if you have an integrated GPU that you are not actually using, then explicitly disable it in your UEFI
Or alternatively try passing "module_blacklist=i915" to the kernel commandline.
Otherwise, I think #9 was directed at denzil, but the most important part is that we'll need to see a journal of a bad boot.
You can obtain it w/ the install iso, make sure to not use the power button to reboot the system, but https://wiki.archlinux.org/title/Keyboa … el_(SysRq)
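As a rough sketch of reading the installed system's journal from the install ISO: this assumes the root partition is already mounted (for example at /mnt) and that the journal was stored persistently under var/log/journal. The helper name saved_journal is made up for illustration; it only wraps journalctl's --directory option.

```shell
# saved_journal ROOT [journalctl options...]
# Hypothetical helper: read the journal of an installed system whose root
# filesystem is mounted at ROOT (for example /mnt on the install ISO).
saved_journal() {
    root=$1
    shift
    journal_dir="$root/var/log/journal"
    # Without a persistent journal directory there is nothing to read.
    if [ ! -d "$journal_dir" ]; then
        echo "no persistent journal found in $journal_dir" >&2
        return 1
    fi
    journalctl --directory="$journal_dir" --no-pager "$@"
}
```

Something like `saved_journal /mnt --list-boots` should show which boots were recorded, and `saved_journal /mnt -b OFFSET` then prints one of them. Which offset belongs to the bad boot depends on what was recorded, so it is worth checking the list first.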
Offline
Yes I think #9 was meant for me too
Anyway, @V1del, I read seth's post; yours I hadn't seen when I wrote my own because it wasn't there yet. You pretty much posted yours while I was writing mine.
That being said... yes, unbootable was actually unbootable. I tried switching to another terminal, it would render onscreen but after few seconds my screen would just show the first terminal and then completely freeze.
My problem here is that this is all a bit too advanced for me.. when you say "pass a parameter to kernel" I have no idea how to do that.. and when I try to google it you can be sure I won't find instructions. And it is not really something one can play around with and hope to guess it correctly
I understand that this is mostly just my own issue.. but that's how it is right now. I'm using Arch in order to learn, and ironically these situations are best for learning while the worst at the same time
I'll try some of these things over the weekend and I'll get back to you. I'm afraid I don't have any logs any more as I've done a fresh install. But I do have a phone picture of my screen which I used trying to get some help locally. I'm sure it will be informative to you, a lot more than it was informative for me. Check it out here.
On the other hand, I can reproduce the whole thing by just installing nvidia-lts again, so maybe I'll just do that and then give you the journal output or something.
Offline
You are certain to find instructions if you were to google because passing kernel parameters is a very common operation. In this particular case all of this, including links to how to add kernel parameters is documented in the installation section of the nvidia wiki page: https://wiki.archlinux.org/title/NVIDIA#Installation -- read the second bullet point in the blue note box.
Online
You can pass parameters by pressing the e button when arch linux is highlighted in the GRUB menu. Look for the line that starts with linux, and add your parameters at the end of the line then press Ctrl+X to boot with those params. Keep in mind that it's one-time only.
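To confirm that a parameter added this way actually reached the kernel, the running command line can be read back from /proc/cmdline after boot. A minimal sketch; the helper name has_kernel_param is made up for illustration, and the optional second argument exists only so the check can be pointed at a saved copy of the command line.

```shell
# has_kernel_param PARAM [CMDLINE_FILE]
# Hypothetical helper: succeed if PARAM appears as a whole token on the
# kernel command line (default: the running kernel's /proc/cmdline).
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put every whitespace-separated token on its own line, then look for
    # an exact, fixed-string match of the whole line.
    tr -s ' \t' '\n\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, `has_kernel_param ibt=off && echo "ibt=off is active"` after booting with the edited entry.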
As for my own problem, I installed Mint. I'm in the process of updating the system and installing the nvidia drivers now. My connection is extremely slow, so it'll take a while. If this doesn't work, I'll reinstall Arch and reboot with SysRq to get the proper logs in arch-chroot.
Offline
You are certain to find instructions if you were to google because passing kernel parameters is a very common operation. In this particular case all of this, including links to how to add kernel parameters is documented in the installation section of the nvidia wiki page: https://wiki.archlinux.org/title/NVIDIA#Installation -- read the second bullet point in the blue note box.
You are right. Now that I'm reading it with a little more experience with these issues, it does make sense.
When I first had this type of issue ~6 months ago, I ran into an "add kernel parameter" solution, and reading all this just made it all the more confusing. Add the beginner's fear of "making it worse" and you end up nothing but confused.
Thank you for helping me realize this
And thank you for the patience as well. I can only imagine how annoying it must be for you to deal with the same beginner issues over and over again.
Offline
So here's what I've done today..
1. I added the ibt=off parameter, installed nvidia-lts and nvidia-settings (nvidia-utils got dragged in as well), and the original problem is gone. I can boot up. However, lspci at this point still says nouveau is loaded, and I can tell that's true from the performance.
2. So I removed kms from the HOOKS array in /etc/mkinitcpio.conf and regenerated the initramfs with mkinitcpio -P. After a reboot I can tell even on the login screen that the nvidia driver is now up and running. Another issue surfaced here; I'll mention it at the end of the post. Following the information on the wiki, I also made a hook to run mkinitcpio every time there is an nvidia driver update.
3. Motivated by this new issue I decided to follow Early Loading information so I added nvidia nvidia_modeset nvidia_uvm nvidia-drm to initramfs. Speaking of which, I suppose I did this correctly, I added them to MODULES array in /etc/mkinitcpio.conf so it looks like this now:
MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia-drm)
and after that I rebuilt the image again with mkinitcpio -P, but this didn't produce any visible change. BTW can you please confirm that this is how one adds something to initramfs?
4. Then I realized I had forgotten about this, so I added the nvidia_drm.modeset=1 kernel parameter. After a reboot, again no visible change.
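Step 1 relies on lspci to tell which driver is bound to the card. A small sketch of pulling just that information out of `lspci -k` output; the function name gpu_drivers is made up for illustration, and it assumes the usual "Kernel driver in use:" line that lspci -k prints under each device.

```shell
# gpu_drivers: read `lspci -k` output on stdin and print the kernel driver
# in use for every VGA/3D/display controller. Hypothetical helper name.
gpu_drivers() {
    awk '
        # Device header lines start with the PCI address; detail lines are indented.
        /^[0-9a-f]/ { gpu = ($0 ~ /VGA compatible controller|3D controller|Display controller/) }
        gpu && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use: */, "")
            print
        }
    '
}
```

Run as `lspci -k | gpu_drivers`; seeing nouveau here after installing the nvidia packages would match what step 1 describes.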
At this point I've got to mention something weird but unrelated to this topic. I would add kernel parameters by manually editing the /boot/grub/grub.cfg file and then run grub-mkconfig -o /boot/grub/grub.cfg. However, after a reboot I noticed the change didn't take effect. Whatever I had added to grub.cfg wasn't there anymore. I even tested this using Ctrl+X when GRUB loaded at boot time, and noticed that whatever I had added simply wasn't there. So the only way I managed to add those kernel parameters was to edit the /etc/default/grub file and then call grub-mkconfig. Any idea why editing grub.cfg doesn't work for me?
Anyway so for the time being I got nvidia up and running. Here's my latest journal and dmesg logs.
You might notice another kernel parameter in there, nvidia.NVreg_RegistryDwords=EnableBrightnessControl=1, which I added because the brightness hotkeys aren't working anymore on my laptop. This parameter did not help.
And also I saw some messages in logs like module license 'NVIDIA' taints kernel .. wtf is that supposed to mean?
The other issue that appeared when my nvidia driver started working:
Weirdly enough, my resolution is correct (2560x1600) and DPI is 100%, but the GUI scale seems to differ depending on the application, or even on the element within an application. Any application. Here's an example: the Dolphin application is huge, with fonts so big it barely fits on my screen. Chrome, which you can see in the background, scales as it should and is tiny in comparison. I realize this screenshot may seem just fine to you, but before nvidia my Dolphin was at least half that size. I'm not sure if this is related to some leftover nvidia settings somewhere on my system or is something I need to fix. And like I mentioned, it's not just a Dolphin issue; for example, if I open IntelliJ IDEA, some user interface elements are also huge, but others are fine and in line with my current resolution.
Last edited by denzil (2023-02-25 11:22:35)
Offline
BTW can you please confirm that this is how one adds something to initramfs?
lsinitcpio /boot/initramfs-linux.img | grep nvidia
I would add kernel parameters by manually editing the /boot/grub/grub.cfg file and after that run grub-mkconfig -o /boot/grub/grub.cfg.
"grub-mkconfig -o /boot/grub/grub.cfg" writes /boot/grub/grub.cfg from /etc/default/grub …
https://wiki.archlinux.org/title/Grub
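Since grub.cfg is regenerated from /etc/default/grub, a persistent parameter belongs in the GRUB_CMDLINE_LINUX_DEFAULT line there. A minimal sketch of appending one, assuming GNU sed, a double-quoted single-line value, and a parameter without `|` or `&` in it; the helper name add_kernel_param is made up for illustration.

```shell
# add_kernel_param FILE PARAM
# Hypothetical helper: append PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE
# (normally /etc/default/grub) unless it is already listed there.
add_kernel_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Re-emit the quoted value with the new parameter added at the end.
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${param}\"|" "$file"
}
```

Run as root, something like `add_kernel_param /etc/default/grub ibt=off` followed by `grub-mkconfig -o /boot/grub/grub.cfg` would then carry the parameter into the generated grub.cfg. Trying it on a copy of the file first is a cheap safeguard.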
GUI scale seems to differ depending on the application or even element within an application
https://wiki.archlinux.org/title/Hidpi
Edit:
module license 'NVIDIA' taints kernel .. wtf is that supposed to mean?
That you're loading an out-of-tree module and kernel developers will object to that when you're reporting bugs against the kernel.
It's perfectly normal and nothing to worry about.
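For the curious, the kernel exposes its taint state as a bitmask in /proc/sys/kernel/tainted. A small sketch that decodes only the three module-related bits (proprietary, out-of-tree, unsigned); the function name decode_taint is made up for illustration and the other taint bits are deliberately ignored.

```shell
# decode_taint [VALUE]
# Hypothetical helper: print the module-related taint flags set in VALUE
# (default: the running kernel's /proc/sys/kernel/tainted).
decode_taint() {
    value=${1:-$(cat /proc/sys/kernel/tainted)}
    # Bit 0: a proprietary module was loaded.
    if [ $(( value & 1 )) -ne 0 ]; then echo "P: proprietary module was loaded"; fi
    # Bit 12: an externally built (out-of-tree) module was loaded.
    if [ $(( value & 4096 )) -ne 0 ]; then echo "O: out-of-tree module was loaded"; fi
    # Bit 13: an unsigned module was loaded.
    if [ $(( value & 8192 )) -ne 0 ]; then echo "E: unsigned module was loaded"; fi
    return 0
}
```

Calling `decode_taint` with no argument reads the live value; with the nvidia module loaded, at least the P and O lines would be expected.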
Last edited by seth (2023-02-25 11:27:09)
Offline
Thanks for the quick reply seth.
I get what I'm doing wrong with the grub configuration (/facepalm). Thank you for the rest of the tips.
$ sudo lsinitcpio /boot/initramfs-linux-lts.img | grep nvidia
usr/lib/firmware/nvidia/
usr/lib/firmware/nvidia/525.89.02/
usr/lib/firmware/nvidia/525.89.02/gsp_ad10x.bin
usr/lib/firmware/nvidia/525.89.02/gsp_tu10x.bin
usr/lib/modprobe.d/nvidia-utils.conf
usr/lib/modules/6.1.13-1-lts/extramodules/nvidia-drm.ko
usr/lib/modules/6.1.13-1-lts/extramodules/nvidia-modeset.ko
usr/lib/modules/6.1.13-1-lts/extramodules/nvidia-uvm.ko
usr/lib/modules/6.1.13-1-lts/extramodules/nvidia.ko
I guess this means I've done it correctly.
Offline