Hi All,
I've been an avid Archian for over a year and have found my way around using these forums and the excellent Arch Linux documentation, without ever having to post a query until now. Thank you all for saving my life a hundred times. But there's always a first time, and this latest problem has me totally stumped.
I have always had problems making my GPU (GeForce GTX 1050 Ti Mobile) work on my laptop (MSI Prestige PE62 7RE). I'm avoiding the details here, but let me know if they might be relevant. What finally worked for me was the nvidia-beta package. I also use cuda and cudnn. I had avoided upgrading these packages, as that typically caused incompatibilities with TensorFlow and PyTorch. But two days ago I finally decided to upgrade the entire system using pacman, and the GPU stopped working. Specifically, the CUDA samples stopped working, and when I ran the sample at /opt/cuda/samples/1_Utilities/deviceQuery, I got this error:
cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
Note that I used pacman for the upgrade. So cuda was upgraded from 9.1 to 9.2 and cudnn from 7.1 to 7.2, but nvidia-beta, which is from the AUR, was not upgraded. Here's what I have tried so far to get the GPU working, without any luck:
I downgraded cuda and cudnn to the older versions, but the samples still didn't work.
I switched to the nvidia package but that didn't help either. Moreover, reinstalling nvidia-beta with yaourt yields the following error right at the end:
cat: '/usr/lib/modules/extramodules-*-ARCH/version': No such file or directory
==> ERROR: A failure occurred in prepare().
Aborting...
==> ERROR: Makepkg was unable to build .
==> Restart building nvidia-beta ? [y/N]
==> ------------------------------------
==>
But there does exist a file /usr/lib/modules/extramodules-ARCH/version. This might be unrelated, though, as another user has posted the same error in the package comments.
I also noticed that the nvidia modules are currently not loaded (checked with lsmod | grep nvidia). Trying to load the nvidia module (using modprobe nvidia) results in the following error:
modprobe: FATAL: Module nvidia not found in directory /lib/modules/4.17.14-arch1-1-ARCH
nvidia.ko.gz is present in /lib/modules/extramodules-ARCH/ and in /usr/lib/modules/4.17.14-arch1-1-ARCH/extramodules/
Finally, I couldn't find any logs related to a module load failure (using sudo journalctl -b | grep -iE 'error|nvidia|mod').
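One way to cross-check the observations above is to list every copy of the module file under a modules tree and compare that with what modprobe reports. The sketch below is only an illustration: `find_module` is a hypothetical helper name, and it assumes module files are named `<name>.ko` with an optional compression suffix.

```shell
# Hypothetical helper: list every file for a given module under a modules tree.
# -L follows symlinks, since a kernel's extramodules entry may be a symlink.
find_module() {
    local root="$1" name="$2"
    find -L "$root" -type f -name "${name}.ko*" 2>/dev/null | sort
}

# Example use (path layout as reported in this thread):
#   find_module "/usr/lib/modules/$(uname -r)" nvidia
```

If the file is listed but modprobe still reports the module as not found, the module index may be worth checking rather than the file itself.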
I found some other threads with seemingly similar problems but their resolutions do not work for me. I haven't changed anything or done anything unusual apart from what I've mentioned above and am pretty clueless as to what I should try next. It seems like an installation issue from the above observations but I'm not sure if it's not a hardware fault either. Any advice or guidance would be wholeheartedly appreciated. Please let me know if you need any other piece of information.
Last edited by rbiswas143 (2018-08-17 18:06:12)
I'm new to this forum and have read the rules. I'd appreciate your feedback with regard to my adherence to the norms.
Not a kernel issue or hardware please move.
Why use nvidia-beta? Try changing to the normal drivers and tell us why they don't work.
Last edited by telis80 (2018-08-17 09:59:49)
Sure, what would be a more appropriate forum for this?
As per the Arch Wiki, the normal drivers for my hardware are in the nvidia package. That is what I currently have installed, and all the errors above occur with the nvidia package installed.
Please post the output of the following
uname -a
pacman -Q linux linux-headers
pacman -Q nvidia
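These commands show whether the running kernel, the installed kernel package, and the driver package line up. As a rough sketch, the kernel comparison can be scripted. `pkgver_to_release` is a hypothetical helper, and the mapping from package version to kernel release is an assumption based on the version strings that appear later in this thread; it may not hold for other kernels or naming schemes.

```shell
# Hypothetical helper: map an Arch "linux" package version to the release
# string printed by `uname -r`, e.g. 4.17.14.arch1-1 -> 4.17.14-arch1-1-ARCH.
# ASSUMPTION: the mapping is inferred from the strings in this thread only.
pkgver_to_release() {
    printf '%s-ARCH\n' "$(printf '%s' "$1" | sed 's/\.arch/-arch/')"
}

# Example use on an Arch system:
#   installed=$(pacman -Q linux | cut -d' ' -f2)
#   [ "$(pkgver_to_release "$installed")" = "$(uname -r)" ] || echo "kernel mismatch: reboot?"
```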
telis80 wrote:
Not a kernel issue or hardware please move.
Seems appropriate to me
I shall leave this thread here.
The nvidia-beta PKGBUILD will need to be updated to reflect Arch's new kernel configuration. The extramodules directory is no longer versioned, instead it is simply named "extramodules-ARCH".
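To illustrate the path change described above, here is a minimal sketch of a lookup that tries the new unversioned directory first and falls back to the old versioned pattern. This is not the actual nvidia-beta PKGBUILD code; `kernel_version_from_extramodules` is a hypothetical helper, and the assumption that the `version` file holds a kernel release string is mine.

```shell
# Hypothetical helper: print the contents of the extramodules "version" file.
# Tries the unversioned directory (new layout) before the versioned glob (old layout).
kernel_version_from_extramodules() {
    local root="${1:-/usr/lib/modules}" f
    for f in "$root"/extramodules-ARCH/version "$root"/extramodules-*-ARCH/version; do
        # An unmatched glob stays literal, so the -f test simply fails for it.
        if [ -f "$f" ]; then
            cat "$f"
            return 0
        fi
    done
    return 1
}
```

Note that the glob `extramodules-*-ARCH` does not match the name `extramodules-ARCH`, which is consistent with the "No such file or directory" error quoted earlier in the thread.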
[~]$ uname -a
Linux rhinoMSi 4.17.14-arch1-1-ARCH #1 SMP PREEMPT Thu Aug 9 11:56:50 UTC 2018 x86_64 GNU/Linux
[~]$ pacman -Q linux linux-headers
linux 4.17.14.arch1-1
linux-headers 4.17.14.arch1-1
[~]$ pacman -Q nvidia
nvidia 396.51-1
Run depmod, then modprobe the module again.
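The idea behind this suggestion is that depmod regenerates the module dependency index (modules.dep and related files) that modprobe consults, so a module file can exist on disk and still be reported as not found if the index is stale. A small sketch of how one might check the text index follows; `module_indexed` is a hypothetical helper, and it assumes the module's file name matches the name passed in (it does not translate between `-` and `_`).

```shell
# Hypothetical helper: succeed if modules.dep in the given directory has an
# entry for the named module (lines look like "path/to/name.ko.gz: deps...").
module_indexed() {
    local moddir="$1" name="$2"
    grep -Eq "(^|/)${name}\.ko(\.[a-z]+)?:" "$moddir/modules.dep"
}

# Example use on the affected system:
#   module_indexed "/usr/lib/modules/$(uname -r)" nvidia || sudo depmod -a
#   sudo modprobe nvidia
```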
The nvidia-beta PKGBUILD will need to be updated to reflect Arch's new kernel configuration. The extramodules directory is no longer versioned, instead it is simply named "extramodules-ARCH".
Thanks a ton!
I reinstalled nvidia-beta again after editing the PKGBUILD to remove versioning from the extramodules directory. Now, everything is magically back on track:
[~]$ lsmod | grep nvidia
nvidia_uvm 921600 0
nvidia_modeset 1093632 0
nvidia 14061568 65 nvidia_uvm,nvidia_modeset
ipmi_msghandler 57344 2 ipmi_devintf,nvidia
The CUDA samples are working and I am able to use the GPU. I'm still clueless as to what went wrong, but reinstalling nvidia-beta fixed the problem. Marking as [SOLVED].
If you update to the latest repo packages nvidia 396.51-3 and linux 4.18.1.arch1-1 the issue should be fixed there as well.