
#1 2016-11-09 09:51:35

mieLouk
Member
Registered: 2012-12-16
Posts: 44

Full System Freezes because of hardware decoding?

This morning I got multiple system freezes when playing videos. Trump seems to already have gotten to my hardware...

There's already a temporary solution for my problem, which is to disable hwdec in mpv.conf. I don't understand why this happens today but didn't yesterday.

The problem seems to have been reported before (https://wiki.archlinux.org/index.php/NV … sing_Flash), but there it was wrongly attributed to Flash; I don't have Flash installed.

Nov 09 09:55:36 xfarch kernel: NVRM: GPU at PCI:0000:01:00: GPU-f4ea4dfa-1b20-0692-a16c-ab4a3573b1d3
Nov 09 09:55:36 xfarch kernel: NVRM: Xid (PCI:0000:01:00): 16, Head 00000000 Count 000067cf
Nov 09 09:55:39 xfarch kernel: NVRM: Xid (PCI:0000:01:00): 8, Channel 00000010
Nov 09 09:55:41 xfarch kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Nov 09 09:55:43 xfarch kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Nov 09 09:58:26 xfarch kernel: NVRM: GPU at PCI:0000:01:00: GPU-f4ea4dfa-1b20-0692-a16c-ab4a3573b1d3
Nov 09 09:58:26 xfarch kernel: NVRM: Xid (PCI:0000:01:00): 16, Head 00000000 Count 00000f74
Nov 09 10:10:10 xfarch kernel: NVRM: GPU at PCI:0000:01:00: GPU-f4ea4dfa-1b20-0692-a16c-ab4a3573b1d3
Nov 09 10:10:10 xfarch kernel: NVRM: Xid (PCI:0000:01:00): 62, 1710(1588) 00000000 00000000

Could this be a problem with the proprietary Nvidia driver? I'm running a GTX 660 on an AMD system, with Arch (kernel 4.8.6-1) and Cinnamon.
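
To see at a glance which Xid codes appear and how often, log lines like the ones above can be summarised with a few lines of Python. This is only a sketch, assuming the journal format shown above; `parse_xid_events` and `count_xid_codes` are made-up helper names, not part of any Nvidia tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus."
XID_RE = re.compile(r"NVRM: Xid \((?P<bus>[^)]+)\): (?P<code>\d+),\s*(?P<detail>.*)")


def parse_xid_events(lines):
    """Return a list of (timestamp, code, detail) tuples, one per Xid line."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match is None:
            continue  # not an Xid report (e.g. the os_schedule warnings)
        # Assumption: syslog-style prefix, so the first three
        # whitespace-separated fields are the timestamp ("Nov 09 09:55:36").
        timestamp = " ".join(line.split()[:3])
        events.append((timestamp, int(match.group("code")), match.group("detail").strip()))
    return events


def count_xid_codes(lines):
    """Count how often each Xid code occurs in the given log lines."""
    return Counter(code for _, code, _ in parse_xid_events(lines))
```

For the lines above it should count Xid 16 twice and Xid 8 and Xid 62 once each. The input could come from `journalctl -k`, for example piped in and passed to the function as `sys.stdin`.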


#2 2016-11-09 11:11:31

positronik
Member
Registered: 2016-02-08
Posts: 94

Re: Full System Freezes because of hardware decoding?

This is similar to a problem I had with my old GTX 650 Ti Boost.
When I posted those errors on Nvidia's forum I was told that it might be a hardware problem.
I experienced the problem from the first day I had the card, though, so your problem might be of a different nature.
However, you can try my solution, which is to force the GPU into performance mode.
For me, the freezes happened while the GPU was scaling its frequency, so fixing the frequency solved the problem.
There are a few ways to achieve this. For example, if you have an Xorg configuration file for your GPU,
you can add the line

Option "RegistryDwords" "PerfLevelSrc=0x2222; PowerMizerLevel=0x1"

in the Device section. I am not sure whether PowerMizerLevel should be set to 1 or 3; I forgot which value I was using.
You should also be able to do the same with nvidia-settings.


#3 2016-11-09 11:26:56

mieLouk
Member
Registered: 2012-12-16
Posts: 44

Re: Full System Freezes because of hardware decoding?

Thank you for your response. I'm trying your (temporary) solution right now. This morning it froze within 3 minutes of starting a video. The video I'm watching right now is already past that point. Setting it through nvidia-settings seemed to be sufficient.

Could it be that my video card is dying? Or is it more of a software problem? Or could it be something different, like a faulty PSU, as e.g. here: https://www.youtube.com/watch?v=TBk1o7BYj84


#4 2016-11-09 12:41:14

positronik
Member
Registered: 2016-02-08
Posts: 94

Re: Full System Freezes because of hardware decoding?

I can't say... except that my GPU and yours belong to the same generation and have a similar problem...
My PSU is an XFX 550W, so it should be reliable in principle, and the VGA issue is the only hardware problem I have experienced with my desktop.
So everything points toward an Nvidia problem...
However, I didn't understand whether a fixed PowerMizer state solved the problem for you or not.
Do you still experience freezes after fixing the frequency?


#5 2016-11-09 13:20:34

mieLouk
Member
Registered: 2012-12-16
Posts: 44

Re: Full System Freezes because of hardware decoding?

Since I set PowerMizer to performance I haven't had a freeze yet. I have played many videos since then, and it used hwdec most of the time.

Well, my PSU is at least 8 years old, but has worked fine. The video card is 6 years old and makes a lot of noise under high load, but it worked fine until now. Fortunately it will be replaced in the near future :)

Excuse my unclear post.


#6 2016-11-10 09:47:54

mieLouk
Member
Registered: 2012-12-16
Posts: 44

Re: Full System Freezes because of hardware decoding?

Well, it happened again this morning. The video kept playing in the background, and the mouse pointer was still visible and movable. The keyboard froze and couldn't be revived by plugging it into another USB port. The rest of the screen was also completely frozen. Pressing the power button didn't shut down the system. Only a hard reset brought it back up. With hwdec disabled in mpv.conf, it runs now. Before starting any video I had set PowerMizer to performance, which didn't help.

Are there other methods that I can use to investigate this?

Nov 10 10:10:30 xfarch kernel: NVRM: GPU at PCI:0000:01:00: GPU-f4ea4dfa-1b20-0692-a16c-ab4a3573b1d3
Nov 10 10:10:30 xfarch kernel: NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.
Nov 10 10:10:30 xfarch kernel: NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Nov 10 10:10:30 xfarch kernel: NVRM: A GPU crash dump has been created. If possible, please run
                               NVRM: nvidia-bug-report.sh as root to collect this data before
                               NVRM: the NVIDIA kernel module is unloaded.
Nov 10 10:10:34 xfarch kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000917c:0:0:0x0000000f
Nov 10 10:10:34 xfarch kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000917c:0:0:0x0000000f
Nov 10 10:10:41 xfarch kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000917c:0:0:0x0000000f
Nov 10 10:10:41 xfarch kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000917c:0:0:0x0000000f
Nov 10 10:10:41 xfarch kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000917c:0:0:0x0000000f
Nov 10 10:14:18 xfarch kernel: usb 6-2: USB disconnect, device number 2
Nov 10 10:14:18 xfarch kernel: usb 1-1: USB disconnect, device number 2
Nov 10 10:14:20 xfarch dbus[629]: [system] Activating via systemd: service name='org.freedesktop.Avahi' unit='dbus-org.freedesktop.Avahi.service'
Nov 10 10:14:20 xfarch dbus[629]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
Nov 10 10:14:22 xfarch dbus[629]: [system] Activating via systemd: service name='org.freedesktop.Avahi' unit='dbus-org.freedesktop.Avahi.service'
Nov 10 10:14:22 xfarch dbus[629]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
Nov 10 10:14:26 xfarch kernel: usb 2-1: new high-speed USB device number 2 using xhci_hcd
Nov 10 10:14:27 xfarch kernel: hub 2-1:1.0: USB hub found
Nov 10 10:14:27 xfarch kernel: hub 2-1:1.0: 4 ports detected
Nov 10 10:14:29 xfarch dbus[629]: [system] Activating via systemd: service name='org.freedesktop.Avahi' unit='dbus-org.freedesktop.Avahi.service'
Nov 10 10:14:29 xfarch dbus[629]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
Nov 10 10:14:30 xfarch dbus[629]: [system] Activating via systemd: service name='org.freedesktop.Avahi' unit='dbus-org.freedesktop.Avahi.service'
Nov 10 10:14:30 xfarch dbus[629]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
Nov 10 10:14:45 xfarch kernel: usb 2-2: new low-speed USB device number 3 using xhci_hcd
Nov 10 10:14:45 xfarch kernel: input: daskeyboard as /devices/pci0000:00/0000:00:04.0/0000:03:00.0/usb2/2-2/2-2:1.0/0003:04D9:2013.0005/input/input22
Nov 10 10:14:45 xfarch kernel: hid-generic 0003:04D9:2013.0005: input,hidraw2: USB HID v1.10 Keyboard [daskeyboard] on usb-0000:03:00.0-2/input0
Nov 10 10:14:45 xfarch kernel: input: daskeyboard as /devices/pci0000:00/0000:00:04.0/0000:03:00.0/usb2/2-2/2-2:1.1/0003:04D9:2013.0006/input/input23
Nov 10 10:14:45 xfarch kernel: hid-generic 0003:04D9:2013.0006: input,hidraw3: USB HID v1.10 Device [daskeyboard] on usb-0000:03:00.0-2/input1
Nov 10 10:14:45 xfarch mtp-probe[2490]: checking bus 2, device 3: "/sys/devices/pci0000:00/0000:00:04.0/0000:03:00.0/usb2/2-2"
Nov 10 10:14:45 xfarch mtp-probe[2490]: bus: 2, device: 3 was not an MTP device
Nov 10 10:14:48 xfarch dbus[629]: [system] Activating via systemd: service name='org.freedesktop.Avahi' unit='dbus-org.freedesktop.Avahi.service'
Nov 10 10:14:48 xfarch dbus[629]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
Nov 10 10:14:49 xfarch dbus[629]: [system] Activating via systemd: service name='org.freedesktop.Avahi' unit='dbus-org.freedesktop.Avahi.service'
Nov 10 10:14:49 xfarch dbus[629]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
Nov 10 10:15:31 xfarch systemd-logind[626]: Power key pressed.
Nov 10 10:15:53 xfarch systemd-logind[626]: Power key pressed.

Running the mentioned script produces output, but it is not human-readable.

The problem only occurs in the morning with hwdec=auto, independent of the PowerMizer setting, but it didn't appear with hwdec=auto and PowerMizer set to performance after a few hours of uptime. So I guess it is a hardware problem with my GPU.
Is there another way to test a GPU besides benchmarking?
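
Since the freezes only show up in the morning, one thing that could be compared between a cold start and a few hours of uptime is the GPU temperature and clock. A rough Python sketch, assuming `nvidia-smi` from the proprietary driver is installed and reports these query fields for this card; `parse_sample` and `log_gpu_state` are made-up names, and the numbers in the docstring are only an illustration, not measurements.

```python
import subprocess
import time

# Ask nvidia-smi for GPU temperature (deg C) and graphics clock (MHz)
# as bare comma-separated numbers.
QUERY = ["nvidia-smi", "--query-gpu=temperature.gpu,clocks.gr",
         "--format=csv,noheader,nounits"]


def parse_sample(output):
    """Turn one line such as '41, 324' into (temperature_c, graphics_mhz).

    Fields the driver cannot report (printed as text such as
    '[Not Supported]') come back as None. Only the first GPU is read.
    """
    first_line = output.strip().splitlines()[0]
    fields = [field.strip() for field in first_line.split(",")]
    return tuple(int(field) if field.isdigit() else None for field in fields)


def log_gpu_state(interval_s=5, samples=12):
    """Print a timestamped temperature/clock sample every interval_s seconds."""
    for _ in range(samples):
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
        temperature_c, graphics_mhz = parse_sample(result.stdout)
        print(time.strftime("%H:%M:%S"), temperature_c, graphics_mhz, flush=True)
        time.sleep(interval_s)
```

Running `log_gpu_state()` once right after boot and once later in the day, with the output redirected to a file, would give two logs to compare against the time of a freeze.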

Last edited by mieLouk (2016-11-10 10:15:48)


#7 2016-11-10 18:35:05

seth
Member
Registered: 2012-09-03
Posts: 51,299

Re: Full System Freezes because of hardware decoding?

The script gives you a gzip-compressed log, and since you now *know* that the nvidia kernel module crashes, that log is *very* relevant to this problem.
Either you have a HW issue or you have hit a bug in the nvidia kernel module, which you should report to nvidia.
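
The compressed report does not have to be unpacked by hand to be searched. A minimal Python sketch, assuming the script wrote its default `nvidia-bug-report.log.gz` into the current directory; `grep_bug_report` is a made-up helper name and the default search strings are just taken from the journal lines in this thread.

```python
import gzip


def grep_bug_report(path, needles=("Xid", "fallen off the bus")):
    """Yield lines of a gzip-compressed log that contain any of the needles."""
    # Mode "rt" decompresses and decodes in one step; errors="replace"
    # keeps reading if the report contains bytes that are not valid UTF-8.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as report:
        for line in report:
            if any(needle in line for needle in needles):
                yield line.rstrip("\n")
```

Calling `list(grep_bug_report("nvidia-bug-report.log.gz"))` would return the matching lines; other search strings can be passed as `needles`.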

"only occurs in the morning" and what changes over day? Temperature? Check whether the GPU (or RAM etc.) is properly seated or whether there's tension (in the morning ;-)

