Hello. When I try to wake a Windows 11 guest from hibernation/suspend, the VM crashes. This is 100% because of the NVIDIA drivers.
A solution that I want to try is to keep the vGPU running after suspend or hibernation.
Before hibernation:
root@bare-metal-instance-ubuntu-vgpu-latestnvidiadriver:/home/ubuntu# nvidia-smi vgpu --query
GPU 00000000:17:00.0
    Active vGPUs : 1
    vGPU ID : 3251636412
        VM UUID : 04c1ce81-01ab-4b4e-8a07-f7eb8d78c5c1
        VM Name : windows-11-star
        vGPU Name : NVIDIA A10-2A
        vGPU Type : 599
        vGPU UUID : ed71e262-7182-11f0-a2e5-a9460a2b0c71
        Guest Driver Version : 573.48
        License Status : Unlicensed (Unrestricted)
        GPU Instance ID : N/A
        Placement ID : N/A
        Accounting Mode : Disabled
        ECC Mode : N/A
        Accounting Buffer Size : 4000
        Frame Rate Limit : 60 FPS
        PCI
            Bus Id : 00000000:04:00.0
        FB Memory Usage
            Total : 2048 MiB
            Used : 1101 MiB
            Free : 947 MiB
        Utilization
            GPU : 0 %
            Memory : 0 %
            Encoder : 0 %
            Decoder : 0 %
            Jpeg : 0 %
            Ofa : 0 %
        Encoder Stats
            Active Sessions : 0
            Average FPS : 0
            Average Latency : 0
        FBC Stats
            Active Sessions : 0
            Average FPS : 0
            Average Latency : 0
GPU 00000000:31:00.0
    Active vGPUs : 0
GPU 00000000:B1:00.0
    Active vGPUs : 0
GPU 00000000:CA:00.0
    Active vGPUs : 0
After hibernation (before starting the VM):
root@bare-metal-instance-ubuntu-vgpu-latestnvidiadriver:/home/ubuntu# nvidia-smi vgpu --query
GPU 00000000:17:00.0
    Active vGPUs : 0
GPU 00000000:31:00.0
    Active vGPUs : 0
GPU 00000000:B1:00.0
    Active vGPUs : 0
GPU 00000000:CA:00.0
    Active vGPUs : 0
After hibernation (after starting the VM):
GPU 00000000:17:00.0
    Active vGPUs : 1
    vGPU ID : 3251636615
        VM UUID : 04c1ce81-01ab-4b4e-8a07-f7eb8d78c5c1
        VM Name : windows-11-star
        vGPU Name : NVIDIA A10-2A
        vGPU Type : 599
        vGPU UUID : 37cffc74-71ea-11f0-b192-0a2b0c719a82
        Guest Driver Version : N/A
        License Status : N/A (Expiry: N/A)
        GPU Instance ID : N/A
        Placement ID : N/A
        Accounting Mode : N/A
        ECC Mode : N/A
        Accounting Buffer Size : 4000
        Frame Rate Limit : N/A
        PCI
            Bus Id : 00000000:00:00.0
        FB Memory Usage
            Total : 2048 MiB
            Used : 0 MiB
            Free : 2048 MiB
        Utilization
            GPU : 0 %
            Memory : 0 %
            Encoder : 0 %
            Decoder : 0 %
            Jpeg : 0 %
            Ofa : 0 %
        Encoder Stats
            Active Sessions : 0
            Average FPS : 0
            Average Latency : 0
        FBC Stats
            Active Sessions : 0
            Average FPS : 0
            Average Latency : 0
GPU 00000000:31:00.0
    Active vGPUs : 0
GPU 00000000:B1:00.0
    Active vGPUs : 0
GPU 00000000:CA:00.0
    Active vGPUs : 0
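To make the difference between the "before" and "after" dumps easier to spot, here is a minimal Python sketch that diffs two `nvidia-smi vgpu --query` outputs. The function names `parse_vgpu_query` and `changed_fields` are hypothetical helpers, not part of any NVIDIA tooling, and the sketch assumes a single vGPU per dump (repeated keys such as "Average FPS" keep only their first value). The sample values are copied from the dumps above.

```python
def parse_vgpu_query(text):
    """Collect 'key : value' pairs from `nvidia-smi vgpu --query` text.

    Assumption: one vGPU per dump. Keys that repeat in later sections
    keep only their first occurrence.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip section headers such as "GPU 00000000:17:00.0" or "PCI"
            fields.setdefault(key.strip(), value.strip())
    return fields


def changed_fields(before, after):
    """Return {key: (old, new)} for keys whose value differs between two dumps."""
    old, new = parse_vgpu_query(before), parse_vgpu_query(after)
    return {k: (old[k], new.get(k)) for k in old if old[k] != new.get(k)}


# Excerpts from the dumps in this post.
before = """vGPU ID : 3251636412
VM UUID : 04c1ce81-01ab-4b4e-8a07-f7eb8d78c5c1
Guest Driver Version : 573.48
Bus Id : 00000000:04:00.0
Used : 1101 MiB"""

after = """vGPU ID : 3251636615
VM UUID : 04c1ce81-01ab-4b4e-8a07-f7eb8d78c5c1
Guest Driver Version : N/A
Bus Id : 00000000:00:00.0
Used : 0 MiB"""

for key, (old, new) in changed_fields(before, after).items():
    print(f"{key}: {old} -> {new}")
```

On these excerpts the VM UUID stays the same, while the vGPU ID, guest driver version, PCI bus ID and framebuffer usage all change. That would be consistent with a fresh vGPU instance that the guest driver has not attached to yet, though that reading is only a guess.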
Offline
Are this thread, https://bbs.archlinux.org/viewtopic.php?id=307361 and https://bbs.archlinux.org/viewtopic.php?id=307207 all part of the same setup/approach/problem complex?
If so, please don't start new threads but extend the original one to maintain context/information (it's ok to "bump" if you're actually adding new information and not only "bump") and report the others for the dustbin.
Also
root@bare-metal-instance-ubuntu-vgpu-latestnvidiadriver:/home/ubuntu# nvidia-smi vgpu --query
Is this on Ubuntu?
a Solution that I want to try is to keep the vGPU running after suspend or Hibernation.
Offline
Hey,
Sorry for creating two similar threads.
I tested this on multiple hosts with different operating systems, and each time I ran into the same problem.
I tried disabling persistence mode (it was enabled):
nvidia-smi -i 0 -pm Disabled
Now I'm getting:
internal error: Unknown PCI header type '127' for device '0000:17:00.4'
Failed to reset PCI device: internal error: Unknown PCI header type '127' for device '0000:17:00.4'
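A possible reading of the '127' in that error, offered as a sketch rather than a diagnosis: the Header Type byte sits at offset 0x0E of PCI config space, bit 7 is the multi-function flag, and bits 6:0 give the layout (0 = endpoint, 1 = PCI-to-PCI bridge, 2 = CardBus bridge). A config read that returns all ones (0xFF) decodes to layout 127, which is what a device that does not answer typically looks like. The helper names below are hypothetical, and `read_config` assumes a Linux host with the standard sysfs layout.

```python
from pathlib import Path

HEADER_TYPE_OFFSET = 0x0E  # standard PCI config-space offset of the Header Type byte


def decode_header_type(config: bytes):
    """Return (layout, multifunction) decoded from raw PCI config-space bytes."""
    raw = config[HEADER_TYPE_OFFSET]
    return raw & 0x7F, bool(raw & 0x80)


def looks_unresponsive(config: bytes) -> bool:
    """True if the first 16 config bytes are all 0xFF (reads returning all ones)."""
    return all(b == 0xFF for b in config[:16])


def read_config(bdf: str) -> bytes:
    """Read config space through sysfs, e.g. read_config("0000:17:00.4").

    Assumption: Linux host; reading the first 64 bytes does not need root.
    """
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()


# Synthetic example: a config space that reads back as all ones.
cfg = bytes([0xFF]) * 64
print(decode_header_type(cfg))   # -> (127, True)
print(looks_unresponsive(cfg))   # -> True
```

If `read_config("0000:17:00.4")` on the host really returns all 0xFF, the device itself is not responding at that point, and the libvirt message would be a symptom rather than the cause.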
Offline
Right now I am focusing on virsh save. However, when I run virsh save windows-11-vm windows.save for the first time, the command runs indefinitely and the .save file keeps growing. I have to cancel it and restart; then it works.
Offline
I tried disabling persistence mode (it was enabled)
That'd rather not be helpful.
Sanity check: is this a problem w/ a client that is *not* Windows?
Offline
I kept the same host and tried multiple guests.
On Ubuntu / Rocky Linux: hibernation with the NVIDIA drivers works as expected, no problems there.
On Windows 10 / 11: I'm running into the problems described above.
Offline
So for clarification:
You have a Linux host and a Windows VM.
Do you pass the GPU through to the VM or is it accessed as a virtual device?
https://wiki.archlinux.org/title/PCI_pa … h_via_OVMF
You're then hibernating the host
a) while the VM is running?
or
b) before you start the VM?
In case of (b), what if you just run "nvidia-smi" some seconds before you start the VM?
The "VM" then crashes, what exactly does that mean?
The virtual machine itself or the guest system inside? And how? Is there some backtrace?
Offline
I have tried both.
With PCI passthrough everything works fine. However, the problems started when I used vGPUs.
I'm not hibernating the host; I'm trying to hibernate/save the guest (the Windows machine).
So I run the VM and hibernate it, and it shuts down. After starting it again, it can't resume. When I checked the log messages with journalctl, I got:
Jul 25 14:43:39 bare-metal-instance-ubuntu-vgpu nvidia-vgpu-mgr[40379]: error: vmiop_log: (0x0): RPC RINGs are not valid
I'm happy to provide more logs if they can help.
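For collecting the relevant log lines, a small Python sketch that pulls `nvidia-vgpu-mgr` error messages out of journal text in the default short format (for example from `journalctl -t nvidia-vgpu-mgr`, where `-t` filters by syslog identifier). The function name `vgpu_mgr_errors` and the regular expression are my own assumptions about the line layout, based only on the single line quoted above.

```python
import re

# Assumed layout: "Mon DD HH:MM:SS host unit[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<unit>[^\s\[]+)\[(?P<pid>\d+)\]:\s(?P<msg>.*)$"
)


def vgpu_mgr_errors(lines):
    """Return (timestamp, pid, message) for each nvidia-vgpu-mgr error line."""
    found = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m["unit"] == "nvidia-vgpu-mgr" and m["msg"].startswith("error:"):
            found.append((m["ts"], int(m["pid"]), m["msg"]))
    return found


sample = [
    "Jul 25 14:43:39 bare-metal-instance-ubuntu-vgpu nvidia-vgpu-mgr[40379]: "
    "error: vmiop_log: (0x0): RPC RINGs are not valid",
]
for ts, pid, msg in vgpu_mgr_errors(sample):
    print(ts, pid, msg)
```

Grouping the matches by PID should show whether the error comes from the vGPU manager process of the resumed VM or from an earlier instance.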
Offline