
#1 2022-06-06 09:55:21

Malvineous
Member
From: Brisbane, Australia
Registered: 2011-02-03
Posts: 190

nvidia: How to avoid frequent breakage

Hi all,

Often when upgrading an Arch Linux distribution, unless it's a barebones one with no customisation, the nvidia kernel module breaks.  The problem is that the nvidia module is compiled against a specific kernel version, but this version isn't listed in the package's dependencies.  This means it's trivial to install an incompatible kernel and nvidia module and then on reboot there is no graphical environment, and you have to go through a lengthy trial and error process installing different versions of the kernel and/or the nvidia package in order to get a working system.
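
For anyone who wants to confirm that this is what happened before swapping packages around, here is a minimal Python sketch that looks for an nvidia module built for a given kernel release. The function names are made up for illustration, and it assumes modules live somewhere under /usr/lib/modules/<release>/ in a file called nvidia.ko, possibly with a compression suffix; adjust it if your layout differs.

```python
import os
import platform


def is_nvidia_module(filename):
    """True for the main nvidia module file, e.g. nvidia.ko or nvidia.ko.zst.

    Assumption: companion modules (nvidia-drm, nvidia-modeset, ...) are
    installed alongside it, so checking the main one is enough.
    """
    return filename == "nvidia.ko" or filename.startswith("nvidia.ko.")


def find_nvidia_modules(release, modules_root="/usr/lib/modules"):
    """Return sorted paths of nvidia module files installed for `release`."""
    kernel_dir = os.path.join(modules_root, release)
    found = []
    # os.walk yields nothing if kernel_dir does not exist, so a missing
    # directory simply produces an empty list.
    for dirpath, _dirnames, filenames in os.walk(kernel_dir):
        for name in filenames:
            if is_nvidia_module(name):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def running_kernel_release():
    """Release string of the running kernel, the same value `uname -r` prints."""
    return platform.release()
```

Pass the release of the kernel you are about to boot (the directory name under /usr/lib/modules). An empty list suggests no matching module is installed for that kernel.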

A very simple solution would be to list the kernel version as a dependency of the nvidia package, which would make it difficult to install incompatible versions and save a lot of hassle.

Someone opened a bug report about this but it was closed as 'not a bug' without a word of explanation, so I'm wondering what the reasons are, and why it would be preferable to have an unreliable package over a reliable one?  I'm having trouble understanding the logic.  Is it just that it's too much effort to update the kernel version number each time the nvidia module is rebuilt?  Maybe there is another trusted user who could take over maintenance of the nvidia module if that's the case?

Anyway, since you can't comment on closed bugs I was just hoping to try to understand the reasons behind the decision to prefer a fragile upgrade process over a more robust one, given how seemingly simple it would be to achieve.

Offline

#2 2022-06-06 11:33:50

bjornp_
Member
Registered: 2020-12-31
Posts: 42

Re: nvidia: How to avoid frequent breakage

>The problem is that the nvidia module is compiled against a specific kernel version

This is why you should use the dkms version of the driver: it is not compiled against a specific kernel version, as it is rebuilt when the kernel is updated.


Fun fact: I actually have no clue what I'm doing

Offline

#3 2022-06-06 11:39:48

Scimmia
Fellow
Registered: 2012-09-01
Posts: 11,544

Re: nvidia: How to avoid frequent breakage

Sounds like you're doing partial updates, which is explicitly not supported. You update everything or you update nothing.

Online

#4 2022-06-06 17:57:59

Lone_Wolf
Member
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 11,911

Re: nvidia: How to avoid frequent breakage

>This means it's trivial to install an incompatible kernel and nvidia module and then on reboot there is no graphical environment,

There's a simple method to boot to a text console that lets you troubleshoot: just append systemd.unit=multi-user.target as a kernel parameter in your bootloader.


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.


(A works at time B)  && (time C > time B ) ≠  (A works at time C)

Offline

#5 2022-06-30 13:02:11

Malvineous
Member
From: Brisbane, Australia
Registered: 2011-02-03
Posts: 190

Re: nvidia: How to avoid frequent breakage

Scimmia wrote:

Sounds like you're doing partial updates, which is explicitly not supported. You update everything or you update nothing.

The problem is I don't want to do partial updates, but it happens without warning.  The nvidia package needs to have the kernel version as a dependency to prevent this from happening.  I guess I just don't understand why it's such a problem to list the kernel version in the nvidia package?  Couldn't you script it so it would be automatic if that was the issue?

Lone_Wolf wrote:

There's a simple method to boot to a text console that lets you troubleshoot: just append systemd.unit=multi-user.target as a kernel parameter in your bootloader.

Unfortunately that's not so simple for me because it requires finding a keyboard and crawling into a tight space to plug it in, and I can't see the screen from where I am typing.  Normally I access the machine using Barrier (formerly Synergy) so it doesn't have any input devices plugged in.  So generally it's a pain to troubleshoot like this which is why I'd like to avoid having to do it.

bjornp_ wrote:

This is why you should use the dkms version of the driver: it is not compiled against a specific kernel version, as it is rebuilt when the kernel is updated.

This is probably the best workaround, as it sounds like nobody cares enough to fix the broken package!

Offline

#6 2022-06-30 13:05:00

Scimmia
Fellow
Registered: 2012-09-01
Posts: 11,544

Re: nvidia: How to avoid frequent breakage

Malvineous wrote:
Scimmia wrote:

Sounds like you're doing partial updates, which is explicitly not supported. You update everything or you update nothing.

The problem is I don't want to do partial updates, but it happens without warning.

And how, exactly, does that happen? Do you have specific examples?

Online

#7 2022-06-30 14:22:47

seth
Member
Registered: 2012-09-03
Posts: 51,017

Re: nvidia: How to avoid frequent breakage

OP wrote:

This means it's trivial to install an incompatible kernel and nvidia module

No, it's not. I guess with the most rotten of luck (or a systematically out-of-sync mirror), you might update at a moment when one package has been updated in the repos but the other won't be for the next minute or so, but that's not "trivial" - it means you did something to make the gods hate you :P

>and then on reboot there is no graphical environment

Yes.

>you have to go through a lengthy trial and error process installing different versions of the kernel and/or the nvidia package in order to get a working system.

No - you just update again.

There're two ways for this to be a problem:
1. As discussed: the gods hate you. A lot. (In which case you should always wear a helmet, because a surprising number of people die from falling coconuts.)
2. You're withholding critical information that amounts to "I systematically run partial updates". "Installing different versions of the kernel" sounds a lot like this, e.g. if you're using a customized kernel or ignoring the kernel in pacman updates.
In that case you should elaborate on your situation.
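
One way to check for the second case is to look for kernel or driver packages being held back in pacman.conf. A rough Python sketch, assuming the usual `IgnorePkg = ...` syntax with shell-style globs; the function names and the default watch list are made up for illustration, so extend the list to match the kernels and drivers you actually have installed.

```python
import fnmatch


def ignored_packages(pacman_conf_text):
    """Collect every entry listed on uncommented IgnorePkg lines."""
    ignored = []
    for raw in pacman_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "IgnorePkg":
            ignored.extend(value.split())
    return ignored


def held_back(ignored, watched=("linux", "linux-lts", "nvidia",
                                "nvidia-lts", "nvidia-dkms", "nvidia-utils")):
    """Return the watched packages matched by any IgnorePkg entry.

    Assumption: `watched` is just an illustrative list of kernel and
    driver package names, not an exhaustive one.
    """
    return [pkg for pkg in watched
            if any(fnmatch.fnmatchcase(pkg, pattern) for pattern in ignored)]
```

Feeding it the contents of /etc/pacman.conf and getting back a non-empty list from held_back() would mean the kernel or driver is being skipped during upgrades, which is a partial update in all but name.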

Offline

#8 2022-06-30 23:59:19

darkskyabove
Member
Registered: 2014-07-15
Posts: 10

Re: nvidia: How to avoid frequent breakage

One scenario that I don't see covered in this thread is the rare case when the nvidia package is updated while the kernel is not. Two solutions I know of are to create a hook that rebuilds the kernel image any time nvidia is updated, or, as I do, to manually rebuild the kernel image.


To emulate flesh machines, I am learning...

Offline

#9 2022-07-01 01:07:40

Scimmia
Fellow
Registered: 2012-09-01
Posts: 11,544

Re: nvidia: How to avoid frequent breakage

darkskyabove wrote:

One scenario that I don't see covered in this thread is the rare case when the nvidia package is updated while the kernel is not. Two solutions I know of are to create a hook that rebuilds the kernel image any time nvidia is updated, or, as I do, to manually rebuild the kernel image.

That will not cause a mismatch between the kernel and module, it's a different issue.

Online

#10 2022-07-01 07:41:03

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,657

Re: nvidia: How to avoid frequent breakage

That can cause a mismatch, but only if you actively added the nvidia modules to your initramfs, in which case you should already be aware of this and ideally have already created a hook for it, because you read the corresponding wiki page when adding them to the initramfs in the first place.
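
If you are not sure whether this applies to you, a small Python sketch like the following can tell you whether any nvidia modules are listed in your mkinitcpio configuration. It assumes the array-style `MODULES=(...)` line on a single row in /etc/mkinitcpio.conf and does not look at drop-in files; the function names are hypothetical.

```python
import re


def initramfs_modules(mkinitcpio_conf_text):
    """Return the entries of the last uncommented MODULES=(...) line.

    Assumption: the array sits on one line; commented-out lines start
    with '#' and are therefore never matched.
    """
    modules = []
    for raw in mkinitcpio_conf_text.splitlines():
        match = re.match(r"MODULES=\((.*?)\)", raw.strip())
        if match:
            # strip optional quotes around individual entries
            modules = [m.strip("'\"") for m in match.group(1).split()]
    return modules


def nvidia_entries(modules):
    """Filter the module list down to names starting with 'nvidia'."""
    return [m for m in modules if m.startswith("nvidia")]
```

An empty result from nvidia_entries() would suggest the modules are not in your initramfs, so a driver-only update should not need the image rebuilt.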

Offline

#11 2022-07-01 09:09:34

Scimmia
Fellow
Registered: 2012-09-01
Posts: 11,544

Re: nvidia: How to avoid frequent breakage

That will cause a mismatch between the module and userspace tools, which isn't what the OP is talking about.

Online
