
#1 2017-07-22 19:13:31

supernobby
Member
Registered: 2017-07-22
Posts: 6

X server freeze after kernel update

Hello all!
For some time I have been struggling with an issue that I could not solve, as I usually can, with the great Arch Linux documentation or other resources. So I am trying here.

I use Arch Linux on a Macbook Pro 11,3 (Late 2014) with Intel and Nvidia graphics. Up to kernel 4.9.11-1-ARCH everything worked well with my configured installation. But after the kernel was updated, the X server started to freeze the system completely when I run startx. E.g. with the current LTS kernel I just see:

X.Org X Server 1.19.3
Release Date: 2017-03-15
X Protocol Version 11, Revision 0
Build Operating System: Linux 4.9.11-1-ARCH x86_64
Current Operating System: Linux macbook 4.9.37-1-lts #1 SMP Wed Jul 12 19:22:39 CEST 2017 x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-linux-lts root=UUID=e800... rw libata.force=noncq
Build Date: 07 April 2017 05:42:48PM

Current version of pixman: 0.34.0
       Before reporting problems, check http://wiki.x.org
       to make sure that you have the latest version.
 Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
 (==) Log file: "/var/log/Xorg.0.log", Time: Sat Jul 22 18:34:10 2017
 (==) Using config directory: "/etc/X11/xorg.conf.d"
 (==) Using system config directory "/usr/share/X11/xorg.conf.d"

Then the keyboard is dead. The display still shows the console output, but all I can do is power off. There is not much more in the server log.
When I switch back to kernel 4.9.11-1-ARCH, the X server starts without problems. So I don't think the problem is in the xorg config files, as it clearly depends on the kernel version. I wonder what the cause could be, and I appreciate any help.

Some info about my configuration: I use no login/display manager. I boot to the console and run startx when required.
I chainload apple_set_os.efi with grub to be able to use the Intel graphics for the X server. The Nvidia card uses the nouveau driver, but only so it can be switched off with VGA switcheroo to save power.

I am not sure which output or log file could help. But if you have recommendations on how to investigate this further, I am happy to try them.
Thank you!
Andreas

Offline

#2 2017-07-22 20:18:06

seth
Member
Registered: 2012-09-03
Posts: 50,927

Re: X server freeze after kernel update

Look at the dmesg of the failing session, e.g. for the previous boot:

journalctl -b-1

You can probably also ssh into the stalled machine for a live inspection.
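
A full journal of a boot can be long, so a small filter may help. The following is only a sketch: the function name hung_tasks is made up for illustration, and it assumes the freeze leaves the kernel's standard "blocked for more than N seconds" hung-task report in the log, which is not guaranteed.

```shell
# Print only the kernel's hung-task reports from a log read on stdin.
# The function name is hypothetical; the pattern matches the kernel's
# "INFO: task <name>:<pid> blocked for more than <N> seconds." lines.
hung_tasks() {
  grep -E 'INFO: task .+ blocked for more than [0-9]+ seconds'
}
```

After rebooting into a working kernel, it could be used as: journalctl -k -b -1 | hung_tasks (the -k option limits the journal to kernel messages). The call traces that follow each match are then worth reading in the unfiltered output.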

Offline

#3 2017-07-22 21:21:50

supernobby
Member
Registered: 2017-07-22
Posts: 6

Re: X server freeze after kernel update

Hello seth,
thanks for the reply. I booted the LTS kernel again to reproduce the issue.
I can still ssh into the frozen machine, but when I call dmesg there, it shows nothing from this incident.
I switched back to kernel 4.9.11-1-ARCH and ran journalctl -b-1. There I found a suspicious entry:

Jul 22 22:48:27 capetown kernel: INFO: task kworker/0:0:4 blocked for more than 120 seconds.
Jul 22 22:48:27 capetown kernel:       Tainted: P           O    4.9.37-1-lts #1
Jul 22 22:48:27 capetown kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 22 22:48:27 capetown kernel: kworker/0:0     D    0     4      2 0x00000000
Jul 22 22:48:27 capetown kernel: Workqueue: pm pm_runtime_work
Jul 22 22:48:27 capetown kernel:  ffff88046a497c00 ffff88046a497c00 ffff88046ce82ac0 ffff8804699caac0
Jul 22 22:48:27 capetown kernel:  ffff88047f217d40 ffffc900018d3c28 ffffffff815f80fa 00ffc900018d3c58
Jul 22 22:48:27 capetown kernel:  ffff88047f217d40 ffff88046b1acc00 ffff88046ce82ac0 ffff88046c10e208
Jul 22 22:48:27 capetown kernel: Call Trace:
Jul 22 22:48:27 capetown kernel:  [<ffffffff815f80fa>] ? __schedule+0x22a/0x6b0
Jul 22 22:48:27 capetown kernel:  [<ffffffff815f85b6>] schedule+0x36/0x80
Jul 22 22:48:27 capetown kernel:  [<ffffffff81452a4e>] rpm_resume+0x10e/0x790
Jul 22 22:48:27 capetown kernel:  [<ffffffff810c1740>] ? wake_bit_function+0x60/0x60
Jul 22 22:48:27 capetown kernel:  [<ffffffff81453dcb>] pm_runtime_forbid+0x4b/0x50
Jul 22 22:48:27 capetown kernel:  [<ffffffffa0a0830c>] nouveau_pmops_runtime_suspend+0xcc/0xd0 [nouveau]
Jul 22 22:48:27 capetown kernel:  [<ffffffff8134e4d0>] ? pci_pm_runtime_resume+0xa0/0xa0
Jul 22 22:48:27 capetown kernel:  [<ffffffff8134e52f>] pci_pm_runtime_suspend+0x5f/0x180
Jul 22 22:48:27 capetown kernel:  [<ffffffff8134e4d0>] ? pci_pm_runtime_resume+0xa0/0xa0
Jul 22 22:48:27 capetown kernel:  [<ffffffff81452357>] __rpm_callback+0x27/0x70
Jul 22 22:48:27 capetown kernel:  [<ffffffff8134e4d0>] ? pci_pm_runtime_resume+0xa0/0xa0
Jul 22 22:48:27 capetown kernel:  [<ffffffff814523c4>] rpm_callback+0x24/0x80
Jul 22 22:48:27 capetown kernel:  [<ffffffff8134e4d0>] ? pci_pm_runtime_resume+0xa0/0xa0
Jul 22 22:48:27 capetown kernel:  [<ffffffff81453208>] rpm_suspend+0x138/0x610
Jul 22 22:48:27 capetown kernel:  [<ffffffff81453e3e>] pm_runtime_work+0x6e/0x90
Jul 22 22:48:27 capetown kernel:  [<ffffffff81096313>] process_one_work+0x1e3/0x470
Jul 22 22:48:27 capetown kernel:  [<ffffffff810965eb>] worker_thread+0x4b/0x4f0
Jul 22 22:48:27 capetown kernel:  [<ffffffff810965a0>] ? process_one_work+0x470/0x470
Jul 22 22:48:27 capetown kernel:  [<ffffffff8109c2a6>] kthread+0xe6/0x100
Jul 22 22:48:27 capetown kernel:  [<ffffffff8109c1c0>] ? kthread_park+0x60/0x60
Jul 22 22:48:27 capetown kernel:  [<ffffffff815fc895>] ret_from_fork+0x25/0x30
Jul 22 22:48:27 capetown kernel: INFO: task Xorg:596 blocked for more than 120 seconds.
Jul 22 22:48:27 capetown kernel:       Tainted: P           O    4.9.37-1-lts #1
Jul 22 22:48:27 capetown kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 22 22:48:27 capetown kernel: Xorg            D    0   596    595 0x00000000
Jul 22 22:48:27 capetown kernel:  0000000000000000 ffff88046a492800 ffff88046c7a2ac0 ffffffff81a0e500
Jul 22 22:48:27 capetown kernel:  ffff88047f217d40 ffffc90004027cc0 ffffffff815f80fa 0000000000000000
Jul 22 22:48:27 capetown kernel:  ffff88047f217d40 ffffc90004027cd8 ffff88046c7a2ac0 ffff88046c10e208
Jul 22 22:48:27 capetown kernel: Call Trace:
Jul 22 22:48:27 capetown kernel:  [<ffffffff815f80fa>] ? __schedule+0x22a/0x6b0
Jul 22 22:48:27 capetown kernel:  [<ffffffff815f85b6>] schedule+0x36/0x80
Jul 22 22:48:27 capetown kernel:  [<ffffffff814526ba>] __pm_runtime_barrier+0xaa/0x170
Jul 22 22:48:27 capetown kernel:  [<ffffffff810c1740>] ? wake_bit_function+0x60/0x60
Jul 22 22:48:27 capetown kernel:  [<ffffffff81453c16>] pm_runtime_barrier+0x56/0xc0
Jul 22 22:48:27 capetown kernel:  [<ffffffff8134c47b>] pci_config_pm_runtime_get+0x3b/0x60
Jul 22 22:48:27 capetown kernel:  [<ffffffff813513fd>] pci_read_config+0x8d/0x250
Jul 22 22:48:27 capetown kernel:  [<ffffffff81288a2a>] sysfs_kf_bin_read+0x4a/0x70
Jul 22 22:48:27 capetown kernel:  [<ffffffff812880b0>] kernfs_fop_read+0xb0/0x190
Jul 22 22:48:27 capetown kernel:  [<ffffffff81207f27>] __vfs_read+0x37/0x130
Jul 22 22:48:27 capetown kernel:  [<ffffffff812b34bb>] ? security_file_permission+0x9b/0xb0
Jul 22 22:48:27 capetown kernel:  [<ffffffff81208c56>] vfs_read+0x96/0x130
Jul 22 22:48:27 capetown kernel:  [<ffffffff8120a455>] SyS_pread64+0x95/0xb0
Jul 22 22:48:27 capetown kernel:  [<ffffffff815fc637>] entry_SYSCALL_64_fastpath+0x1a/0xa9

Does this tell you anything?
Thanks.
Andreas

Offline

#4 2017-07-22 22:51:11

seth
Member
Registered: 2012-09-03
Posts: 50,927

Re: X server freeze after kernel update

Pass "nouveau.runpm=0" to the kernel.

This probably relates to "just to be switched off with VGA switcheroo", so you should also test without that and inspect your xorg log for the nvidia card (e.g. via the modesetting driver).
If you don't need the nvidia GPU anyway, what about simply blacklisting the nouveau kernel module (not sure whether that will prevent all/any power draw)?
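
For reference, on a GRUB setup a kernel parameter is usually added by appending it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and regenerating the config with grub-mkconfig -o /boot/grub/grub.cfg. That assumes the standard Arch GRUB layout, which a chainloading setup may not follow exactly. To confirm after a reboot that the parameter really arrived, a small check like the one below can help. It is a sketch and the helper name has_param is hypothetical.

```shell
# Succeed if a kernel command line contains the given parameter as a whole word.
# $1: parameter, e.g. nouveau.runpm=0
# $2: command line string, e.g. the contents of /proc/cmdline
has_param() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

On the running system it could be called as: has_param nouveau.runpm=0 "$(cat /proc/cmdline)" && echo "runpm disabled". While the module is loaded, the effective value can also be read from /sys/module/nouveau/parameters/runpm.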

Offline

#5 2017-07-23 12:42:23

supernobby
Member
Registered: 2017-07-22
Posts: 6

Re: X server freeze after kernel update

Hello again seth,
I tried "nouveau.runpm=0", which seems to solve the issue.
After some further reading in this direction: is it right that this "runtime PM support" is the "new thing" in kernels after 4.9.11-1-ARCH, and that it somehow does not work with the nouveau driver on my system?
So can it be said that using "nouveau.runpm=0" in kernels after 4.9.11-1-ARCH restores the behaviour of 4.9.11-1-ARCH and older kernels in this regard?

Yes, I use the gpu-switch tool to boot with the Intel graphics. If booting with IGD is detected, I power off the Nvidia card via VGA switcheroo. So far I have found this to be the most power-saving method:

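gfxswitch="/sys/kernel/debug/vgaswitcheroo/switch"
if [ -e "$gfxswitch" ]
then
  # "DIS:+" means the discrete (Nvidia) GPU is the active one
  if grep -q "DIS:+" "$gfxswitch"
  then
    # select the integrated GPU for the next boot, then reboot
    gpu-switch -i
    reboot
  else
    # already on Intel: power off the inactive GPU
    echo OFF > "$gfxswitch"
  fi
fi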

I fiddled around with this for some time and never actually got manual switching from IGD to DIS working well. So I gave up, as I currently do not use an external display.

But is the failing "runtime PM support" something to worry about?
Andreas

Offline

#6 2017-07-23 13:54:30

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,626

Re: X server freeze after kernel update

Runtime PM will likely work correctly if you don't use anything else explicitly, like that gpu-switch script, which still uses the old method to disable the device; that is what clashes with the new method that is intended to be used. Technically either should work correctly (you will see nouveau suspend the card in dmesg if you just let it do its thing). Whether one or the other method is better for power saving is something you'd have to try out; in theory both should do the same thing.
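
Besides dmesg, the kernel exposes the runtime PM state of each PCI device in sysfs, which gives a quick way to see whether the card was actually suspended. A minimal sketch, with a made-up helper name and an assumed PCI address for the Nvidia card (check lspci for the real one):

```shell
# Print the runtime PM state of a PCI device, given its sysfs directory.
# $1: device directory, e.g. /sys/bus/pci/devices/0000:01:00.0
#     (this address is an assumption, not taken from this thread)
runtime_status() {
  if [ -r "$1/power/runtime_status" ]; then
    cat "$1/power/runtime_status"   # e.g. active, suspended, unsupported
  else
    echo "unknown"
  fi
}
```

Calling runtime_status /sys/bus/pci/devices/0000:01:00.0 with and without the manual switcheroo step should show whether the two methods end up in the same state.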

Offline

#7 2017-07-23 14:02:35

seth
Member
Registered: 2012-09-03
Posts: 50,927

Re: X server freeze after kernel update

The runpm parameter is much older; not sure what leads to the assumption that it's the "new thing" (link?), but it could have been dead code in the module (at least for your GPU) before.

It might simply conflict with switcheroo (e.g. you're trying to switch off the GPU while it is trying to power down at the same time); what you should check for sure is whether
a) the GPU is now somehow used by Xorg (look up the log)
b) power consumption rises w/ runpm turned off
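
Both checks can be done from a terminal. For a), something like grep -i nouveau /var/log/Xorg.0.log shows whether the X server loaded the driver for the Nvidia card. For b), many laptop batteries report their current draw in sysfs; the sketch below assumes the battery is named BAT0 and that its driver provides a power_now file in microwatts, neither of which is guaranteed on every machine. The helper name is hypothetical.

```shell
# Print the battery's current power draw in watts.
# $1: battery directory, e.g. /sys/class/power_supply/BAT0 (name is an assumption)
battery_watts() {
  # power_now is in microwatts where the driver provides it
  LC_ALL=C awk '{ printf "%.2f\n", $1 / 1000000 }' "$1/power_now"
}
```

Comparing a few readings on battery, once with runpm=0 plus the switcheroo script and once with runtime PM left alone, should give a rough idea of whether the draw differs.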

Offline

#8 2017-07-23 20:17:25

supernobby
Member
Registered: 2017-07-22
Posts: 6

Re: X server freeze after kernel update

I am not sure what I can really expect from runtime PM. But after some tests with the latest kernel I went back to "nouveau.runpm=0" and my gpu-switch script.
The reason is that with runtime PM both cards boot up powered (confirmed by cat /sys/kernel/debug/vgaswitcheroo/switch). Shouldn't the unused card be powered off?
The X server does not seem to trigger runtime PM either. If the initial fb is Intel and the X server uses Intel, the Nvidia card still remains powered while X runs.
Further, the X server does not seem to be able to switch the display if it should use a different driver than the initial fb (console Intel and X nouveau, or console nouveau and X Intel).
I can only see runtime PM in action when the machine boots with the Intel fb and I switch to the nouveau fb and back. Then the unused card is automatically powered off.
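
To make the power states easier to compare between these tests, the switcheroo file can be reduced to one line per GPU. This is a sketch with a hypothetical function name; it assumes the usual colon-separated layout of that file (index, client ID, active flag, power state, PCI address), for example a line like 1:DIS: :Off:0000:01:00.0.

```shell
# Print "<client> <power state>" for each entry of the vgaswitcheroo switch file.
# $1: optional path to the file, defaults to the debugfs location
switch_states() {
  # fields: index:ID:active-flag:power-state:PCI address
  awk -F: '{ print $2, $4 }' "${1:-/sys/kernel/debug/vgaswitcheroo/switch}"
}
```

Reading the debugfs file normally requires root. A state prefixed with Dyn (such as DynOff) would indicate that the driver, rather than a manual OFF, controls the power.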

Anyhow, that is maybe already a bit off-topic for this thread. "nouveau.runpm=0" was the solution for the freeze problem. But the freeze should not happen in the first place. Should it be reported somewhere else? Maybe this user reports the same issue?

Offline

#9 2017-07-23 21:16:03

seth
Member
Registered: 2012-09-03
Posts: 50,927

Re: X server freeze after kernel update

https://nouveau.freedesktop.org/wiki/Bugs/
https://bugs.freedesktop.org/buglist.cg … oduct=xorg

It is likely https://bugs.freedesktop.org/show_bug.cgi?id=98690
The google groups link looks related, yes. But there's no stack trace (or I didn't see one)

Offline
