Okay, so I've got some weird issue when I try to load X at boot. Here's a basic description:
-Start computer with an Arch kernel
-Computer loads nvidia module in runlevel 3 (nvidia installed from pacman)
-Computer enters run level 5 and attempts to load X
- X fails to load: "nvidia driver not found" (or something similar [not at computer right now])
I then take the steps to rectify this situation:
- Login to the console
- Switch to run level 3
- reinstall nvidia package from pacman
- init 5
And then X loads normally. Notice I didn't change anything in my xorg.conf file or make any other configuration changes. I just reloaded the nvidia package from pacman.
The weird thing is that I have to do this every time I start up. What gives? What do I need to do to get X to keep the files it's supposed to have or whatever? Checked out the wiki and didn't see anything related to this. Thanks for the help.
ok, couple initial questions:
what do your partitions look like (is the nvidia module on some odd partition?)
can you post your rc.conf? (please use [ code ][/ code ] tags, it makes it cleaner)
next time it fails, try running "depmod -a" and then doing it....
also, try removing it from rc.conf and doing a clean boot into runlevel 3 - from there, manually modprobe the module, seeing if there are any errors... check lsmod before and after...
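The check suggested above (lsmod before and after a manual modprobe, with a depmod -a in between) could be sketched roughly like this. The function names module_loaded and diagnose_nvidia are made up for illustration, and the sketch assumes the usual /proc/modules format where each line starts with the module name followed by a space:

```shell
#!/bin/sh
# Return success if the named module appears in the modules list.
# $1 = module name, $2 = modules list file (defaults to /proc/modules;
# the second argument is a hypothetical hook that only exists for testing).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Hypothetical helper: report the module state before and after a manual load.
# It is only defined here, not called; run it as root from runlevel 3.
diagnose_nvidia() {
    if module_loaded nvidia; then echo "before: loaded"; else echo "before: not loaded"; fi
    depmod -a                       # rebuild the module dependency list
    modprobe nvidia || echo "modprobe reported an error"
    if module_loaded nvidia; then echo "after: loaded"; else echo "after: not loaded"; fi
}
```

Calling diagnose_nvidia as root prints the two states, and any modprobe error shows up between them.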
Alrighty. First of all, I appreciate the help. I tried a couple of things. Also, to clear this up, I know that you have to resync nvidia every time you update the kernel. That's not the issue. The issue happens like this:
-- I boot up.
-- Nvidia loads in runlevel 3
-- system enters runlevel 5
-- X attempts to start and fails.
-- I log into vc 1 and init 3 (a check with lsmod reveals that nvidia is indeed running).
-- pacman -S nvidia
-- init 5
-- I get X
-- Use computer for any period of time and then restart.
-- Computer boots up using the same kernel.
-- nvidia loads in runlevel 3
-- System enters runlevel 5
-- X once again fails to start.
What have my investigations yielded? Well, like I said, lsmod indicates that nvidia is indeed running. If I modprobe -r it and then modprobe it, I get the same problems. There are no errors when loading the nvidia module. It honestly appears to be loading just fine. But for some reason, X won't start unless I actively pacman -S nvidia. Here is the errored portion of the X log:
(II) Setting vga for screen 0.
(**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
(==) NVIDIA(0): RGB weight 888
(==) NVIDIA(0): Default visual is TrueColor
(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
(**) NVIDIA(0): Option "NoLogo" "True"
(**) NVIDIA(0): Option "IgnoreEDID" "True"
(**) NVIDIA(0): Option "ConnectedMonitor" "DFP-0, CRT-0"
(**) NVIDIA(0): Option "RenderAccel" "True"
(**) NVIDIA(0): Option "TwinView" "true"
(**) NVIDIA(0): Option "TwinViewOrientation" "RightOf"
(**) NVIDIA(0): Option "SecondMonitorHorizSync" "31.5-79.0"
(**) NVIDIA(0): Option "SecondMonitorVertRefresh" "50-100"
(**) NVIDIA(0): Option "MetaModes" "DFP-0: 1600x1200, CRT-0: 1024x768"
(**) NVIDIA(0): Enabling experimental RENDER acceleration
(**) NVIDIA(0): Ignoring EDIDs
(**) NVIDIA(0): ConnectedMonitor string: "DFP-0, CRT-0"
(**) NVIDIA(0): TwinView enabled
(--) NVIDIA(0): Linear framebuffer at 0xC0000000
(--) NVIDIA(0): MMIO registers at 0xDE000000
(EE) NVIDIA(0): Failed to load the NVIDIA kernel module!
(EE) NVIDIA(0): *** Aborting ***
(II) UnloadModule: "nvidia"
Now, I've tried a few things as per phrakture's suggestions. I removed the nvidia driver from the system. I changed my default runlevel to 3, and I removed nvidia from the loaded modules in rc.conf. So this happens:
-- start up computer in run level 3
-- pacman -S nvidia
-- init 5
-- X works just fine.
-- comp starts up in runlevel 3
-- modprobe nvidia (since it's not in the rc.conf)
-- lsmod (yep, nvidia is there with no errors)
-- init 5
-- X fails to start with the above error in the X.0.log
So, since you asked, here is my rc.conf (imagine the nvidia module in the modules section when not running in "diagnostics mode". It is normally in the second or third module I load).
# /etc/rc.conf - Main Configuration for Arch Linux
#
#
# Localization
#
# HARDWARECLOCK: set to "UTC" or "localtime"
# TIMEZONE: timezones are found in /usr/share/zoneinfo
# KEYMAP: keymaps are found in /usr/share/kbd/keymaps
# CONSOLEFONT: found in /usr/share/kbd/consolefonts (only needed for non-US)
# CONSOLEMAP: found in /usr/share/kbd/unimaps
# USECOLOR: use ANSI color sequences in startup messages
#
HARDWARECLOCK="localtime"
TIMEZONE=EST5EDT
KEYMAP=us
CONSOLEFONT=
CONSOLEMAP=
USECOLOR="yes"
# Scan for LVM volume groups at startup, required if you use LVM
USELVM="no"
#
# Networking
#
HOSTNAME="meatwad"
#
# Module to load at boot-up (in this order)
# (prefix a module with a ! to disable it)
#
MODULES=(8139too snd-ice1712 snd-pcm-oss !snd-usb-audio !usbserial !ide-scsi)
#
# Interfaces to start at boot-up (in this order)
# Declare each interface then list in INTERFACES
# (prefix an interface in INTERFACES with a ! to disable it)
#
# Note: to use DHCP, set your interface to be "dhcp" (eth0="dhcp")
#
lo="lo 127.0.0.1"
eth0="eth0 192.168.1.99 netmask 255.255.255.0 broadcast 192.168.1.255"
#eth0="dhcp"
INTERFACES=(lo eth0)
#
# Routes to start at boot-up (in this order)
# Declare each route then list in ROUTES
# (prefix a route in ROUTES with a ! to disable it)
#
gateway="default gw 192.168.1.1"
ROUTES=(!gateway)
#
# Daemons to start at boot-up (in this order)
# (prefix a daemon with a ! to disable it)
# (prefix a daemon with a @ to start it up in the background)
#
DAEMONS=(syslog-ng hotplug !pcmcia network netfs crond cups samba sshd ntpd)
# End of file
Additionally, this is some info from the kernel log pertaining to the nvidia module. I don't know how relevant it is, since I've never looked at the kernel log before.
Apr 19 21:11:32 meatwad [4294691.435000] nvidia: no version for "struct_module" found: kernel tainted.
Apr 19 21:11:32 meatwad [4294691.454000] nvidia: module license 'NVIDIA' taints kernel.
So what do you think? Anyone have a clue? I would say that it might have something to do with the fact that I'm running dp's kernel26mm, but I'm also having the same problem whenever I boot into the normal kernel. So I think the problem is in the system and not the kernel. Make sense?
I really appreciate the help. Thanks for your feedback. I'll look forward to hearing whatever advice you may have.
I think I may have found the problem. It appears that each time I reboot, I lose the /dev/nvidia devices. They don't return until I pacman -S nvidia. That would probably cause said problems, right? How would I go about making the devices permanent? I'm using udev...
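A quick way to confirm this after a reboot might look like the sketch below. The function name missing_nvidia_nodes is made up for illustration, and it assumes a single card, so only nvidiactl and nvidia0 are expected:

```shell
#!/bin/sh
# Print the name of every expected NVIDIA device node that is missing.
# $1 = device directory (defaults to /dev; overriding it is a hypothetical
# hook that is only useful for testing).
missing_nvidia_nodes() {
    dev_dir="${1:-/dev}"
    for name in nvidiactl nvidia0; do
        # -c is true only if the path exists and is a character device
        if [ ! -c "$dev_dir/$name" ]; then
            echo "$name"
        fi
    done
}

missing_nvidia_nodes /dev
```

No output means both nodes are present. Running it once right after boot and once after pacman -S nvidia would show whether the reinstall is what brings the nodes back.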
Normally loading nvidia module from /etc/rc.conf should do everything that is needed.
You can also create devices manually:
or search the forums for more info about this problem (it was discussed some time ago IIRC)
Thanks for bringing that link to my attention. Yes, I am loading the module in rc.conf, but the document admits that unless you've patched your nvidia kernel module, it may not be able to create the devices by default. I didn't always have these problems. Is it possible that the more recent nvidia packages don't have the patch included? If that were the case however, it seems like there would be more than just me screaming about it.... Unless of course, most nvidia users are also using devfs rather than udev...
I don't know. I think nvidia linux forums should answer this question. IIRC the problem was something like not enough time for udev to create device links before x starts. I'm not using init 5 because I prefer manual startx method for my own reasons. I've never experienced this problem with any nvidia driver/udev/xorg version.
i once had similar issue with my nvidia module, it simply loaded but didnt create the /dev node - i had to create it manually until i got an nvidia upgrade which fixed it. u might search the forums for the correct nvidia node creation command.
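For reference, a rough sketch of the manual node creation mentioned above. It assumes the standard NVIDIA character-device numbering (major 195, minor 255 for the control device, minors 0 and up for the cards) and mode 666. The function name nvidia_mknod_cmds is made up for illustration, and the function only prints the commands so they can be reviewed before running them as root:

```shell
#!/bin/sh
# Print the mknod commands for the NVIDIA device nodes.
# $1 = number of cards (defaults to 1), $2 = device directory (defaults to /dev).
# Assumption: major 195 is the NVIDIA driver, minor 255 is nvidiactl,
# and minors 0..N-1 are the individual cards.
nvidia_mknod_cmds() {
    cards="${1:-1}"
    dev_dir="${2:-/dev}"
    echo "mknod -m 666 $dev_dir/nvidiactl c 195 255"
    i=0
    while [ "$i" -lt "$cards" ]; do
        echo "mknod -m 666 $dev_dir/nvidia$i c 195 $i"
        i=$((i + 1))
    done
}

# Review the output first; to actually create the nodes, pipe it to sh as root.
nvidia_mknod_cmds 1
```

If /dev is rebuilt on every boot under udev, nodes created this way would presumably disappear again at the next reboot, so the commands would have to be run at each startup (for example from a local boot script) until the underlying problem is fixed.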
Thanks for the feedback.
Yeah, I read the part about the latency in the creation, but in my example above, when I load the module with modprobe and then init 5 after giving it plenty of time, X still fails and the devices still aren't created. I'll look into triggering the creation of those devices myself. Maybe it's just a script I can run regularly. Still not ideal, but it'd be less of a pain in the ass than what I'm dealing with now. Speaking of which, wasn't there an NVinstalldevices.sh script that you were supposed to run to solve this problem? I looked on my system for it or an equivalent, but couldn't find anything.
It still begs the question of what is so different about my system, though. You know?