
#1 2019-06-11 21:22:32

theodore
Member
Registered: 2008-09-09
Posts: 151

[SOLVED] nvidia driver and 4g decoding with multiple gpus

Hi,

I have got my hands on a FUJITSU CELSIUS R930 workstation with a Fujitsu D3118 motherboard, on which I am planning a multi-GPU installation (all nvidia). I am going to use an nvidia Quadro M2000 for output and two Tesla K40 cards for processing (CUDA computing). At the moment I have a basic UEFI installation which boots and runs without issues with both nouveau and nvidia when 4G decoding is disabled in the BIOS. Since I will need the other cards on the board, I have to enable the 4G decoding option in the BIOS (as also recommended in other threads online).

This is where the problems start. Once I enable 4G decoding and boot with nouveau, the system still boots without issues, but I need the nvidia driver for my processing and to access the other GPUs. If I install and use the nvidia driver, the system freezes when the bootloader tries to load the kernel, more specifically at the point where it says "Loading Initial Ramdisk". Since nouveau uses kernel mode-setting, I tried enabling it for the nvidia driver as well, following the wiki guidelines, but still no result. If I disable 4G decoding in the BIOS, the system boots without issues both with and without kernel mode-setting in the nvidia driver. I found two relevant threads in the forum: https://bbs.archlinux.org/viewtopic.php?id=206223 and https://bbs.archlinux.org/viewtopic.php?id=244799 but neither of their solutions really helps in my case.

These are the modules loaded in the initramfs and the kernel parameters, respectively:

# grep '^[^#]' /etc/mkinitcpio.conf
MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm vfio_pci vfio vfio_iommu_type1 vfio_virqfd)
BINARIES=()
FILES=()
HOOKS=(base udev autodetect modconf block filesystems keyboard fsck) 
# grep '^[^#]' /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Arch"
GRUB_CMDLINE_LINUX_DEFAULT="nvidia-drm.modeset=1 intel_iommu=on iommu=pt"
GRUB_CMDLINE_LINUX=""
GRUB_PRELOAD_MODULES="part_gpt part_msdos"
GRUB_TERMINAL_INPUT=console
GRUB_GFXMODE=auto
GRUB_GFXPAYLOAD_LINUX=keep
GRUB_DISABLE_RECOVERY=true
GRUB_COLOR_HIGHLIGHT="light-cyan/blue"

CSM compatibility is disabled, as some people suggest, and I have enabled IOMMU, as is also suggested. The Arch installation is, as I said, a fresh one with the latest kernel.
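
For reference, a minimal sketch of how the settings above could be confirmed from a boot that does come up (for example with nouveau). The helper names has_kernel_param and iommu_messages are made up for illustration, and the sketch assumes a standard /proc/cmdline and that dmesg is readable (it may need root):

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel parameter $1 appears as a whole
# word in the command line read from file $2 (default: /proc/cmdline).
has_kernel_param() {
    cmdline_file="${2:-/proc/cmdline}"
    # One parameter per line, then an exact fixed-string line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$1"
}

# Hypothetical helper: print kernel messages that mention the IOMMU.
# May need root on systems that restrict dmesg.
iommu_messages() {
    dmesg | grep -i -e DMAR -e IOMMU
}
```

Calling has_kernel_param intel_iommu=on && echo present checks the running kernel, and iommu_messages shows whether the kernel reported initialising the IOMMU at all.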

Any ideas on how to resolve this issue are welcome, since I have already spent 3 days on this and do not really know what else to try.

Last edited by theodore (2019-06-14 14:43:06)


#2 2019-06-12 10:55:01

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 15,104

Re: [SOLVED] nvidia driver and 4g decoding with multiple gpus

Are you sure it freezes, or does the system continue to boot while no longer displaying anything?

Boot with nouveau, try to find logs from a boot with nvidia, and post them.
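
A minimal sketch of how such logs could be pulled, assuming the journal is persistent (/var/log/journal exists) and that the failed nvidia boot got far enough to write to it; if the machine hangs before the kernel starts, that boot will have no journal entry. The function name prev_boot_kernel_log is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the kernel messages of an earlier boot.
# $1 is the boot offset as understood by journalctl (-1 = previous boot,
# -2 = the one before that); it defaults to -1.
prev_boot_kernel_log() {
    offset="${1:--1}"
    journalctl -b "$offset" -k --no-pager
}
```

journalctl --list-boots shows which boots the journal knows about, which helps to pick the right offset; the output of prev_boot_kernel_log can then be redirected to a file and posted.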

These workstations target *nix enterprise versions; maybe you should try one of those.


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.

clean chroot building not flexible enough ?
Try clean chroot manager by graysky


#3 2019-06-14 14:42:22

theodore
Member
Registered: 2008-09-09
Posts: 151

Re: [SOLVED] nvidia driver and 4g decoding with multiple gpus

@Lone_Wolf thanks for the reply. For some reason GRUB is not able to load the kernel and I cannot understand why. Most likely it has to do with the framebuffer or something similar, but I am not sure what else to try or how to debug it.
Anyway, after playing around with different suggestions from around the net, the only thing that worked (not directly though, see below) was to bypass GRUB completely with EFISTUB, as suggested here https://bbs.archlinux.org/viewtopic.php … 9#p1707219 and load the kernel directly.

However, to get efibootmgr to create the EFI boot entry correctly, I had to run the command provided in the wiki from within my ESP directory (i.e. /boot); otherwise I ended up in an emergency shell or back at my bootloader menu. Below is the efibootmgr command that worked for me:

 # efibootmgr --disk /dev/sda --part 1 --create --label "Arch Linux New" --loader /vmlinuz-linux --unicode 'root=/dev/sda2 rw initrd=\initramfs-linux.img nvidia-drm.modeset=1 intel_iommu=on iommu=pt' --verbose 

/dev/sda1 and /dev/sda2 are my EFI system partition (for the bootloader) and my root partition, respectively, from my initial Arch installation. With --unicode I have also passed some other kernel parameters.
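
For anyone adapting this to a different disk layout, the command above could be parameterised roughly as follows. This is only a sketch: the variable and function names (ESP_DISK, ESP_PART, ROOT_DEV, kernel_cmdline, create_entry) are made up for illustration, and the values simply mirror my setup.

```shell
#!/bin/sh
# Sketch only: values mirror the command in this post, adjust as needed.
ESP_DISK=/dev/sda      # disk that holds the EFI system partition
ESP_PART=1             # partition number of the ESP on that disk
ROOT_DEV=/dev/sda2     # root filesystem

# Kernel command line handed to the EFI stub. The initrd path is relative
# to the root of the ESP and is written with a backslash.
kernel_cmdline() {
    printf '%s' "root=$ROOT_DEV rw initrd=\\initramfs-linux.img nvidia-drm.modeset=1 intel_iommu=on iommu=pt"
}

# Hypothetical wrapper: create the boot entry (must be run as root).
create_entry() {
    efibootmgr --disk "$ESP_DISK" --part "$ESP_PART" --create \
        --label "Arch Linux New" --loader /vmlinuz-linux \
        --unicode "$(kernel_cmdline)" --verbose
}
```

Nothing is executed until create_entry is called. Running efibootmgr --verbose afterwards lists the boot entries, so the new one can be checked before rebooting.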

I will mark the thread solved for now.

Last edited by theodore (2019-06-14 14:50:19)

