Hi all,
I just received the update to the nvidia 378.13 driver, and now xorg won't start.
I'm getting the following in my logs:
Feb 17 03:07:28 host kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 378.13 Tue Feb 7 20:10:06 PST 2017
Feb 17 03:07:43 host kernel: NVRM: GPU at PCI:0000:01:00: GPU-1b13177f-8b34-bdf5-8eda-d884306b073c
Feb 17 03:07:43 host kernel: NVRM: Xid (PCI:0000:01:00): 61, 1899(157c) 00000000 00000000
Feb 17 03:07:43 host kernel: NVRM: Xid (PCI:0000:01:00): 62, 12958(282c) 00000000 00000000
Feb 17 03:08:15 host kernel: NVRM: RmInitAdapter failed! (0x53:0xffff:1857)
Feb 17 03:08:15 host kernel: NVRM: rm_init_adapter failed for device bearing minor number 0
Reverting back to 375.26 makes it work again. Graphics card in question is a GTX770.
Any ideas?
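For anyone triaging similar logs, the Xid numbers are the part worth looking up in NVIDIA's Xid documentation. A minimal sketch for pulling them out of kernel log text; `nvrm_xids` is a hypothetical helper name, and the pattern assumes the `NVRM: Xid (PCI:...): N, ...` line format shown above.

```sh
#!/bin/sh
# Print the Xid error number from each "NVRM: Xid" kernel log line on stdin.
# Lines that are not Xid reports (e.g. RmInitAdapter failures) are ignored.
nvrm_xids() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p'
}

# Typical use on a systemd machine (kernel messages from the current boot):
#   journalctl -k -b | nvrm_xids | sort -n | uniq -c
```

Counting the codes per boot makes it easier to see whether the same pair of Xids precedes every failed adapter init.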
Last edited by BlackMastermind (2017-02-17 03:19:55)
I just received the update to the nvidia 378.13 driver
Does that mean you updated *only* the driver but not the kernel itself (nor xorg), i.e. conducted a partial update?
I just received the update to the nvidia 378.13 driver
Does that mean you updated *only* the driver but not the kernel itself (nor xorg), i.e. conducted a partial update?
No.
Ran a full pacman -Syu and rebooted.
Last edited by BlackMastermind (2017-02-17 12:50:14)
Ok, see https://wiki.archlinux.org/index.php/NV … iled.21.29 for a possible workaround
Happens from time to time*, but it's usually a kernel/module mismatch. Some future update will make the workaround superfluous.
* https://www.google.de/search?hl=en&q="NVRM:+RmInitAdapter+failed!"+site:bbs.archlinux.org ;-)
Looks like the kernel module was built against kernel-4.9.9, which was in testing. That kernel is now in extra, so updating should fix the issue (if it's a kernel/module mismatch, of course).
Ok, see https://wiki.archlinux.org/index.php/NV … iled.21.29 for a possible workaround
Happens from time to time*, but it's usually a kernel/module mismatch. Some future update will make the workaround superfluous.
* https://www.google.de/search?hl=en&q="NVRM:+RmInitAdapter+failed!"+site:bbs.archlinux.org ;-)
I tried this, and it didn't help.
Of course, I googled the issue before posting here, but I only found things from months or years ago, so I didn't think it would be applicable for a problem with a driver version that was released last week.
Looks like the kernel module was built against kernel-4.9.9, which was in testing. That kernel is now in extra, so updating should fix the issue (if it's a kernel/module mismatch, of course).
Just upgraded the kernel, it didn't fix the issue.
$ uname -a
Linux hostname 4.9.9-1-ARCH #1 SMP PREEMPT Thu Feb 9 19:07:09 CET 2017 x86_64 GNU/Linux
$ journalctl |grep NVRM
Feb 19 16:54:45 hostname kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 378.13 Tue Feb 7 20:10:06 PST 2017
Feb 19 16:55:05 hostname kernel: NVRM: GPU at PCI:0000:01:00: GPU-1b13177f-8b34-bdf5-8eda-d884306b073c
Feb 19 16:55:05 hostname kernel: NVRM: Xid (PCI:0000:01:00): 61, 1899(157c) 00000000 00000000
Feb 19 16:55:05 hostname kernel: NVRM: Xid (PCI:0000:01:00): 62, 12958(282c) 00000000 00000000
Feb 19 16:55:37 hostname kernel: NVRM: RmInitAdapter failed! (0x53:0xffff:1857)
Feb 19 16:55:37 hostname kernel: NVRM: rm_init_adapter failed for device bearing minor number 0
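To rule a kernel/module mismatch in or out directly, one option is to compare the running kernel release with the release the module was built for. A minimal sketch, assuming the first field of the module's vermagic string is the kernel release; `vermagic_matches` is a hypothetical helper, not part of any package.

```sh
#!/bin/sh
# Succeed if the module's vermagic string starts with the given kernel release.
#   $1: kernel release, as printed by `uname -r`
#   $2: vermagic string, as printed by `modinfo -F vermagic MODULE`
vermagic_matches() {
    # Keep only the first space-separated field of the vermagic string.
    built_for=${2%% *}
    [ "$built_for" = "$1" ]
}

# Typical use:
#   if vermagic_matches "$(uname -r)" "$(modinfo -F vermagic nvidia)"; then
#       echo "nvidia module matches the running kernel"
#   else
#       echo "nvidia module was built for a different kernel"
#   fi
```

If the two match and the error persists, as it appears to here, the mismatch theory becomes less likely.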
Have you attempted to use nvidia-dkms? I also have a GTX770 (and two kernels) and I haven't had any issues with drivers since switching to dkms, other than having to manually rebuild the initramfs/kernel every once in a while (mkinitcpio).
Have you attempted to use nvidia-dkms? I also have a GTX770 (and two kernels) and I haven't had any issues with drivers since switching to dkms, other than having to manually rebuild the initramfs/kernel every once in a while (mkinitcpio).
Tried it, no joy.
I've been thinking about trying to use Nvidia's own installer instead of the Arch provided package. Would that be advisable or do I risk hosing my system then?
I did a quick google scan of your issue and it appears that it could be one of two problems. Either the driver doesn't support your card/PCI location/display port, or your BIOS needs an upgrade. Since I'm also using the same card and two of the card's display ports without any issues (including a kernel update this morning on dkms driver 378.13-2), you might want to see if your motherboard manufacturer has a BIOS update.
I did a quick google scan of your issue and it appears that it could be one of two problems. Either the driver doesn't support your card/PCI location/display port, or your BIOS needs an upgrade. Since I'm also using the same card and two of the card's display ports without any issues (including a kernel update this morning on dkms driver 378.13-2), you might want to see if your motherboard manufacturer has a BIOS update.
Thanks for the suggestion.
My motherboard manufacturer did have a BIOS update that was a couple of months newer, so I just installed it. FWIW, I have an Asus Z87-Pro and upgraded from BIOS version 1802 to 2103. Didn't fix the problem, unfortunately.
As for the PCI location etc. I really want to exhaust any other options before I open my case and start fiddling with hardware. I mean, the thing worked fine with every driver release for 3 years, why should I suddenly have to move my hardware around?
Another thing I just tried is to purge all nvidia packages from my system and install 378.13 using Nvidia's own installer, just to rule out any packaging issue. Predictably, I'm getting the same error.
For now, as a workaround, I installed the current long lived branch driver, 375.39, using Nvidia's own installer. This way I can install the driver under dkms and I don't have to keep back kernel updates anymore.
Why isn't the long lived branch in the repos btw? Might be a good idea to add it?
You should never have to hold back the kernel for nVidia packages on Arch.
The long-lived branch isn't in the repositories by default since it would add about 4 extra packages that would need to be maintained and (afaik) there is no real demand for it.
Feel free to create and maintain one yourself though. (and feel free to share on the AUR if you feel more people might be interested)
That aside, using the nVidia installer instead of the packaged version will probably not break your system but it's definitely not recommended.
It also prevents pacman from keeping track of the files, which means that pacman will not be able to update the driver for you and that you may end up in situations where the driver is incompatible with other packages that pacman did update (of which the kernel can very easily be one in this case).
And at that point you will probably not get any support here since you aren't using the pacman package.
PS: you should use the 'nvidia-dkms' package instead if you plan on using multiple/custom kernels. If you only use the 'linux' kernel package you can just use the 'nvidia' package; the two are pushed to extra together. (The longest delay I have seen was 5 minutes, which was most likely due to slow mirror syncing.)
Last edited by Omar007 (2017-02-22 17:03:03)
You should never have to hold back the kernel for nVidia packages on Arch.
Not in normal situations, because as you say, they get released together. In my case though, I had to hold back the (non-dkms) nvidia package to 375.26, which in turn meant that I had to hold back the kernel to 4.9.9. If I upgrade to 4.9.11 I get:
kernel: nvidia: disagrees about version of symbol boot_cpu_data
Switching to nvidia-dkms is not an option because it is already at 378.13, which doesn't work on my system.
That aside, using the nVidia installer instead of the packaged version will probably not break your system but it's definitely not recommended.
It also prevents pacman from keeping track of the files, which means that pacman will not be able to update the driver for you and that you may end up in situations where the driver is incompatible with other packages that pacman did update (of which the kernel can very easily be one in this case).
And at that point you will probably not get any support here since you aren't using the pacman package.
Yes that's understood. At this point it seems to be the best workaround though. If I stick to 375.26 from the repos, I have to hold back other stuff (it doesn't work with the current kernel and it also conflicts with the latest xorg-server package, something about libglx.so) so I risk getting too far behind on the update train.
I hope this is a temporary situation, I logged the issue with nvidia, so hopefully a future release will fix it and I can switch back to the driver in the repos.
What you could try is to grab the PKGBUILD for the nvidia and/or nvidia-dkms package, set the version number to 375.39 and build the package.
In the best case you do not need any other tweaking and at least you'll have it tracked by pacman.
Of course you should change the package name (e.g. nvidia-longlived(-dkms)) and set it to provide 'nvidia' so pacman won't replace it with the actual nvidia package.
Last edited by Omar007 (2017-02-22 18:32:22)
What you could try is to grab the PKGBUILD for the nvidia and/or nvidia-dkms package, set the version number to 375.39 and build the package.
In the best case you do not need any other tweaking and at least you'll have it tracked by pacman.
Of course you should change the package name (e.g. nvidia-longlived(-dkms)) and set it to provide 'nvidia' so pacman won't replace it with the actual nvidia package.
Good tip.
It was a bit more involved than I thought (changing the interdependencies between the packages to make sure they depend on each other instead of an nvidia package from the repos, changing function names inside the PKGBUILD etc.), but it works.
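The version bump itself can be scripted as a first step. A minimal sketch, assuming the PKGBUILD has a single top-level `pkgver=` line; `set_pkgver` is a hypothetical helper, and the renaming, `provides` entry and dependency changes described above still have to be done by hand.

```sh
#!/bin/sh
# Read a PKGBUILD on stdin and print it with pkgver set to $1.
# Only a line that starts with "pkgver=" is rewritten; everything else passes through.
set_pkgver() {
    sed "s/^pkgver=.*/pkgver=$1/"
}

# Typical use, in a copy of the package's build directory:
#   set_pkgver 375.39 < PKGBUILD > PKGBUILD.new && mv PKGBUILD.new PKGBUILD
#   updpkgsums      # refresh the checksums for the new source tarball
#   makepkg -si     # build, then install through pacman
```

Building through makepkg keeps the files tracked by pacman, which is the main advantage over NVIDIA's own installer.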
I have the same issue; I downgraded to version 375.26-2-x86_64.
$ lspci |grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GK104M [GeForce GTX 670MX] (rev a1)
feb 23 03:04:17 AsusG75 kernel: nvidia-modeset: Version mismatch: nvidia.ko(375.26) nvidia-modeset.ko(378.13)
$ grep nvidia /etc/mkinitcpio.conf
MODULES="nvidia"
Because of the mismatch, I should probably rebuild the kernel after dkms.
do it good first, it will be faster than do it twice the saint
I have the same issue; I downgraded to version 375.26-2-x86_64.
$ lspci |grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GK104M [GeForce GTX 670MX] (rev a1)
feb 23 03:04:17 AsusG75 kernel: nvidia-modeset: Version mismatch: nvidia.ko(375.26) nvidia-modeset.ko(378.13)
$ grep nvidia /etc/mkinitcpio.conf
MODULES="nvidia"
Because of the mismatch, I should probably rebuild the kernel after dkms.
Sounds like an entirely different problem. Let's not confuse the issue here.
Sounds like an entirely different problem. Let's not confuse the issue here.
You're right, I fixed it. Sorry for the false alarm.
(For everyone's information: the module in the initcpio is not updated by dkms, so we must run mkinitcpio every time the module changes.)
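A stale module in the initramfs shows up in the log as the nvidia-modeset version mismatch quoted earlier. A minimal sketch for spotting it, assuming that exact message format; `modeset_mismatch` is a hypothetical helper name.

```sh
#!/bin/sh
# Read kernel log lines on stdin. For each nvidia-modeset "Version mismatch"
# line, print "<nvidia.ko version> <nvidia-modeset.ko version>".
modeset_mismatch() {
    sed -n 's/.*nvidia-modeset: Version mismatch: nvidia\.ko(\([^)]*\)) nvidia-modeset\.ko(\([^)]*\)).*/\1 \2/p'
}

# Typical use:
#   journalctl -k -b | modeset_mismatch
# If this prints anything, regenerating the initramfs for the stock kernel
# preset should pick up the current module:
#   mkinitcpio -p linux
```

The preset name `linux` is the one for the stock kernel package; other kernels have their own presets.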
Still no solution for this, but I may have found another clue.
I booted into CentOS 7.3, manually installed the 378.13 driver there using the NVIDIA installer, and it worked. Then I did the same with Linux Mint 17.3 and again it worked. Those distros are using kernel version 3.10 and 3.19 respectively, so I'm thinking the issue may be kernel related?
Tried the linux-lts kernel? (you'll need the nvidia-lts package as well, but it's also 378.13)
Tried the linux-lts kernel? (you'll need the nvidia-lts package as well, but it's also 378.13)
Ah yes, forgot to add that. Yes, I did try that, but no joy.
just a hunch: lsmod?
just a hunch: lsmod?
It shows the three nvidia modules loaded: nvidia, nvidia_drm and nvidia_modeset.
Do you want the full output?
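Rather than posting the full output, the GPU-related entries can be filtered out of it. A minimal sketch, assuming the usual `lsmod` layout with a header line and the module name in the first column; `gpu_modules` is a hypothetical helper name.

```sh
#!/bin/sh
# Read `lsmod` output on stdin and print the names of loaded modules that are
# either nvidia* or nouveau, skipping the header line.
gpu_modules() {
    awk 'NR > 1 && ($1 ~ /^nvidia/ || $1 == "nouveau") { print $1 }'
}

# Typical use:
#   lsmod | gpu_modules
```

Seeing `nouveau` next to the nvidia modules in that list would point to a conflict between the two drivers.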
The only other interesting thing would be a conflicting "nouveau" module showing up there.
Add
modprobe.blacklist=nvidia_drm,nvidia_modeset
to the kernel parameters (I'd rather *not* use /etc/modprobe.d for an initial test, you may face a black screen)
Do you presently have "nvidia-drm.modeset=1"?
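Whether a parameter such as `nvidia-drm.modeset=1` or the blacklist above actually reached the kernel can be checked against the boot command line. A minimal sketch; `has_kparam` is a hypothetical helper, and it only sees parameters passed at boot, not options set through modprobe configuration files.

```sh
#!/bin/sh
# Succeed if the exact parameter $1 appears as a word in the command line $2.
#   $1: parameter to look for, e.g. nvidia-drm.modeset=1
#   $2: kernel command line, e.g. the contents of /proc/cmdline
has_kparam() {
    # $2 is left unquoted on purpose so the shell splits it into words.
    for p in $2; do
        [ "$p" = "$1" ] && return 0
    done
    return 1
}

# Typical use:
#   has_kparam nvidia-drm.modeset=1 "$(cat /proc/cmdline)" && echo "modeset requested at boot"
#   has_kparam modprobe.blacklist=nvidia_drm,nvidia_modeset "$(cat /proc/cmdline)" && echo "blacklist active"
```

Matching whole words avoids false positives from parameters that merely contain the string.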
The only other interesting thing would be a conflicting "nouveau" module showing up there.
Nope, nouveau is properly blacklisted.
Add
modprobe.blacklist=nvidia_drm,nvidia_modeset
to the kernel parameters (I'd rather *not* use /etc/modprobe.d for an initial test, you may face a black screen)
What would that achieve?
Do you presently have "nvidia-drm.modeset=1"?
No