
#1 2011-08-20 03:30:54

Deemoney14
Member
Registered: 2011-07-27
Posts: 70

[CLOSED] Display Locks Up on Screen Lock - GPU Hang?

This has been a very interesting, confusing problem I've noticed over the last week or two:

If I lock my computer and unlock it shortly afterwards (under 10 minutes, as a ballpark estimate), there are no problems.  If it stays locked longer than that, without being suspended, the display goes black (still backlit) and can't be recovered.  I've seen this with both KDE 4.7 and Xfce 4.8.  In KDE I think I could sometimes recover from the freeze by restarting X with Ctrl-Alt-Backspace; in Xfce that does not work.  Part of me thinks it's an issue with the screensaver.  I also see messages in several logs:

The following shows up in the Xorg.0.log:

[ 11231.534] (EE) intel(0): Detected a hung GPU, disabling acceleration.
[ 11231.534] (EE) intel(0): When reporting this, please include i915_error_state from debugfs and the full dmesg.
[ 11231.534] (WW) intel(0): flip queue failed: Input/output error

A related message appears in the errors.log:

Aug 19 18:28:32 localhost kernel: [11227.018057] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Aug 19 18:28:32 localhost kernel: [11227.029698] [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 2104584 at 2104577, next 2104585)

In the messages.log, I see what appears to be a dump/panic repeated multiple times in some instances:

Aug 19 15:16:13 localhost kernel: [ 2776.115962] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Aug 19 15:16:13 localhost kernel: [ 2776.129025] CPU 0 
Aug 19 15:16:13 localhost kernel: [ 2776.129035] Modules linked in: fuse ipv6 snd_hda_codec_hdmi snd_hda_codec_realtek joydev arc4 uvcvideo videodev media v4l2_compat_ioctl32 dell_wmi sparse_keymap iwlagn mac80211 snd_hda_intel snd_hda_codec snd_hwdep r8169 snd_pcm sg dell_laptop wmi i2c_i801 btusb bluetooth iTCO_wdt cfg80211 processor psmouse ac battery thermal pcspkr serio_raw dcdbas evdev rfkill snd_timer snd soundcore snd_page_alloc mei(C) mii iTCO_vendor_support ext4 mbcache jbd2 crc16 sr_mod cdrom sd_mod ahci libahci libata scsi_mod xhci_hcd ehci_hcd usbcore i915 drm_kms_helper drm intel_agp i2c_algo_bit button intel_gtt i2c_core video
Aug 19 15:16:13 localhost kernel: [ 2776.129424] 
Aug 19 15:16:13 localhost kernel: [ 2776.129439] Pid: 1366, comm: X Tainted: G         C  3.0-ARCH #1 Dell Inc.          Dell System Inspiron N4110/05TM8C
Aug 19 15:16:13 localhost kernel: [ 2776.129518] RIP: 0010:[<ffffffffa00a17ab>]  [<ffffffffa00a17ab>] i915_gem_object_unpin+0xab/0xb0 [i915]
Aug 19 15:16:13 localhost kernel: [ 2776.129597] RSP: 0018:ffff880229f03b88  EFLAGS: 00010246
Aug 19 15:16:13 localhost kernel: [ 2776.129635] RAX: ffff880231f38000 RBX: ffff880231f38800 RCX: 0000000000006564
Aug 19 15:16:13 localhost kernel: [ 2776.129682] RDX: 0000000000060407 RSI: ffff8802323cc000 RDI: ffff88021a810000
Aug 19 15:16:13 localhost kernel: [ 2776.129730] RBP: ffff880229f03b88 R08: 2222222222222222 R09: 2222222222222222
Aug 19 15:16:13 localhost kernel: [ 2776.129776] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880231f38020
Aug 19 15:16:13 localhost kernel: [ 2776.129820] R13: ffff880231f38490 R14: 0000000000000000 R15: ffff88021a9b8c00
Aug 19 15:16:13 localhost kernel: [ 2776.129868] FS:  00007f6b1c74b880(0000) GS:ffff88023fa00000(0000) knlGS:0000000000000000
Aug 19 15:16:13 localhost kernel: [ 2776.129923] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 19 15:16:13 localhost kernel: [ 2776.129965] CR2: 00007f6b1c68c000 CR3: 0000000226147000 CR4: 00000000000406f0
Aug 19 15:16:13 localhost kernel: [ 2776.130015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 19 15:16:13 localhost kernel: [ 2776.130066] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 19 15:16:13 localhost kernel: [ 2776.130118] Process X (pid: 1366, threadinfo ffff880229f02000, task ffff880231c43250)
Aug 19 15:16:13 localhost kernel: [ 2776.130178]  ffff880229f03ba8 ffffffffa00b1f59 ffff880231f38800 ffffffffa010e040
Aug 19 15:16:13 localhost kernel: [ 2776.130215]  ffff880229f03bd8 ffffffffa006a835 0000000000000001 ffff880231f38800
Aug 19 15:16:13 localhost kernel: [ 2776.130252]  ffff880229f03ca8 ffff880231f38460 ffff880229f03c98 ffffffffa006bba0
Aug 19 15:16:13 localhost kernel: [ 2776.130310]  [<ffffffffa00b1f59>] intel_crtc_disable+0x49/0x60 [i915]
Aug 19 15:16:13 localhost kernel: [ 2776.130341]  [<ffffffffa006a835>] drm_helper_disable_unused_functions+0x115/0x190 [drm_kms_helper]
Aug 19 15:16:13 localhost kernel: [ 2776.130379]  [<ffffffffa006bba0>] drm_crtc_helper_set_config+0x8f0/0xa00 [drm_kms_helper]
Aug 19 15:16:13 localhost kernel: [ 2776.130415]  [<ffffffff810f8f89>] ? __generic_file_aio_write+0x229/0x440
Aug 19 15:16:13 localhost kernel: [ 2776.130447]  [<ffffffffa003acae>] drm_framebuffer_cleanup+0xce/0x100 [drm]
Aug 19 15:16:13 localhost kernel: [ 2776.130481]  [<ffffffffa00b7501>] intel_user_framebuffer_destroy+0x21/0x70 [i915]
Aug 19 15:16:13 localhost kernel: [ 2776.130515]  [<ffffffffa003d70b>] drm_mode_rmfb+0xcb/0x120 [drm]
Aug 19 15:16:13 localhost kernel: [ 2776.130544]  [<ffffffffa002e444>] drm_ioctl+0x3e4/0x4c0 [drm]
Aug 19 15:16:13 localhost kernel: [ 2776.130572]  [<ffffffffa003d640>] ? drm_mode_addfb+0x180/0x180 [drm]
Aug 19 15:16:13 localhost kernel: [ 2776.130600]  [<ffffffff81157012>] ? do_sync_write+0xd2/0x110
Aug 19 15:16:13 localhost kernel: [ 2776.130626]  [<ffffffff81190952>] ? fsnotify+0x1c2/0x2a0
Aug 19 15:16:13 localhost kernel: [ 2776.130650]  [<ffffffff8116939f>] do_vfs_ioctl+0x8f/0x500
Aug 19 15:16:13 localhost kernel: [ 2776.130673]  [<ffffffff81157858>] ? vfs_write+0x148/0x180
Aug 19 15:16:13 localhost kernel: [ 2776.130696]  [<ffffffff811698a1>] sys_ioctl+0x91/0xa0
Aug 19 15:16:13 localhost kernel: [ 2776.130725]  [<ffffffff813f4402>] system_call_fastpath+0x16/0x1b
Aug 19 15:16:13 localhost kernel: [ 2776.131018]  RSP <ffff880229f03b88>
Aug 19 15:19:57 localhost kernel: [ 2999.492473] X               D 00000001000b56e2     0  1366    891 0x00400004
Aug 19 15:19:57 localhost kernel: [ 2999.493754]  ffff880229f035d8 0000000000000082 ffffffff00000000 ffff880229f03656
Aug 19 15:19:57 localhost kernel: [ 2999.495032]  ffff880231c43250 ffff880229f03fd8 ffff880229f03fd8 ffff880229f03fd8
Aug 19 15:19:57 localhost kernel: [ 2999.496281]  ffffffff8169b020 ffff880231c43250 000000007fffffff ffff880229f03646
Aug 19 15:19:57 localhost kernel: [ 2999.497507] Call Trace:
Aug 19 15:19:57 localhost kernel: [ 2999.498682]  [<ffffffff813f22b9>] __mutex_lock_slowpath+0x139/0x330
Aug 19 15:19:57 localhost kernel: [ 2999.499835]  [<ffffffff813f24c6>] mutex_lock+0x16/0x30
Aug 19 15:19:57 localhost kernel: [ 2999.500978]  [<ffffffffa00687b6>] drm_fb_helper_pan_display+0x36/0xd0 [drm_kms_helper]
Aug 19 15:19:57 localhost kernel: [ 2999.502109]  [<ffffffff8125b1dd>] fb_pan_display+0xbd/0x180
Aug 19 15:19:57 localhost kernel: [ 2999.503218]  [<ffffffff8126bc39>] bit_update_start+0x29/0x60
Aug 19 15:19:57 localhost kernel: [ 2999.504310]  [<ffffffff812694d6>] fbcon_switch+0x3a6/0x540
Aug 19 15:19:57 localhost kernel: [ 2999.505397]  [<ffffffff812caf09>] redraw_screen+0x179/0x270
Aug 19 15:19:57 localhost kernel: [ 2999.506477]  [<ffffffff8126896a>] fbcon_blank+0x20a/0x2c0
Aug 19 15:19:57 localhost kernel: [ 2999.507551]  [<ffffffff8106cc18>] ? lock_timer_base.isra.30+0x38/0x70
Aug 19 15:19:57 localhost kernel: [ 2999.508622]  [<ffffffff8106d391>] ? mod_timer+0x141/0x360
Aug 19 15:19:57 localhost kernel: [ 2999.509689]  [<ffffffff812ccdaa>] do_unblank_screen+0xaa/0x1b0
Aug 19 15:19:57 localhost kernel: [ 2999.510749]  [<ffffffff812ccec0>] unblank_screen+0x10/0x20
Aug 19 15:19:57 localhost kernel: [ 2999.511790]  [<ffffffff81226ead>] bust_spinlocks+0x1d/0x40
Aug 19 15:19:57 localhost kernel: [ 2999.512823]  [<ffffffff8100f2f0>] oops_end+0x40/0xf0
Aug 19 15:19:57 localhost kernel: [ 2999.513855]  [<ffffffff8100f4f8>] die+0x58/0x90
Aug 19 15:19:57 localhost kernel: [ 2999.514882]  [<ffffffff8100bd24>] do_trap+0xc4/0x170
Aug 19 15:19:57 localhost kernel: [ 2999.515898]  [<ffffffff8100c035>] do_invalid_op+0x95/0xb0
Aug 19 15:19:57 localhost kernel: [ 2999.516913]  [<ffffffffa00a17ab>] ? i915_gem_object_unpin+0xab/0xb0 [i915]
Aug 19 15:19:57 localhost kernel: [ 2999.517920]  [<ffffffff8117d980>] ? __mark_inode_dirty+0x40/0x220
Aug 19 15:19:57 localhost kernel: [ 2999.518927]  [<ffffffff813f539b>] invalid_op+0x1b/0x20
Aug 19 15:19:57 localhost kernel: [ 2999.519933]  [<ffffffffa00a17ab>] ? i915_gem_object_unpin+0xab/0xb0 [i915]
Aug 19 15:19:57 localhost kernel: [ 2999.520942]  [<ffffffffa00b1f59>] intel_crtc_disable+0x49/0x60 [i915]
Aug 19 15:19:57 localhost kernel: [ 2999.521947]  [<ffffffffa006a835>] drm_helper_disable_unused_functions+0x115/0x190 [drm_kms_helper]
Aug 19 15:19:57 localhost kernel: [ 2999.522960]  [<ffffffffa006bba0>] drm_crtc_helper_set_config+0x8f0/0xa00 [drm_kms_helper]
Aug 19 15:19:57 localhost kernel: [ 2999.523976]  [<ffffffff810f8f89>] ? __generic_file_aio_write+0x229/0x440
Aug 19 15:19:57 localhost kernel: [ 2999.524985]  [<ffffffffa003acae>] drm_framebuffer_cleanup+0xce/0x100 [drm]
Aug 19 15:19:57 localhost kernel: [ 2999.525998]  [<ffffffffa00b7501>] intel_user_framebuffer_destroy+0x21/0x70 [i915]
Aug 19 15:19:57 localhost kernel: [ 2999.527016]  [<ffffffffa003d70b>] drm_mode_rmfb+0xcb/0x120 [drm]
Aug 19 15:19:57 localhost kernel: [ 2999.528028]  [<ffffffffa002e444>] drm_ioctl+0x3e4/0x4c0 [drm]
Aug 19 15:19:57 localhost kernel: [ 2999.529037]  [<ffffffffa003d640>] ? drm_mode_addfb+0x180/0x180 [drm]
Aug 19 15:19:57 localhost kernel: [ 2999.530052]  [<ffffffff81157012>] ? do_sync_write+0xd2/0x110
Aug 19 15:19:57 localhost kernel: [ 2999.531064]  [<ffffffff81190952>] ? fsnotify+0x1c2/0x2a0
Aug 19 15:19:57 localhost kernel: [ 2999.532073]  [<ffffffff8116939

Duh alert: it appears to be a graphics issue, judging from "i915" and "drm" appearing throughout the error messages.  The laptop in question is a Dell Inspiron N4110 (soon to be replaced) with Sandy Bridge-generation Intel HD 3000 integrated graphics.  The graphics have been problematic for me in the past, in a lot of respects, but if it's a more general problem then I'd like to try to solve it for future reference.

I'll do the deed at some point in the near future and intentionally produce a lockup.  This should allow me to get the /debug/dri/0/i915_error_state file and the dmesg output.  I now have sshd running on this machine and can reach it from another machine on my local network, so I should be able to dump those files before I have to reboot.
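
For anyone wanting to do the same, here is a minimal sketch of the capture step, meant to be run over ssh while the display is hung. The function name collect_hang_info and the output directory are placeholders for illustration, not anything provided by the driver. The path to pass in is whatever the kernel message points at; on many systems debugfs is mounted under /sys/kernel/debug rather than /debug, and the file will be absent if debugfs is not mounted at all.

```shell
#!/bin/sh
# Sketch only: save GPU hang diagnostics to a directory before rebooting.
# collect_hang_info and its arguments are hypothetical names for illustration.

collect_hang_info() {
    error_state="$1"   # path to i915_error_state, as named in the kernel log
    out_dir="$2"       # any writable directory

    mkdir -p "$out_dir" || return 1

    if [ -r "$error_state" ]; then
        cat "$error_state" > "$out_dir/i915_error_state.txt"
    else
        # Record that the file was absent instead of failing silently.
        echo "not readable: $error_state" > "$out_dir/i915_error_state.missing"
    fi

    # Full kernel ring buffer, as the Xorg log message asks for.
    dmesg > "$out_dir/dmesg.txt" 2>&1 || true
}
```

An example call would be: collect_hang_info /sys/kernel/debug/dri/0/i915_error_state "$HOME/gpu-hang" (most likely as root, since debugfs is normally readable by root only).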

At any rate, other comments, suggestions and advice are appreciated.



EDIT:  So the lockup occurs closer to 20 minutes after going idle.  The error state file I was hoping to find is not there, but I did find something very interesting in the dmesg output, buried among the rest of the output (which is pretty much what I pasted above):

[11486.475168] ------------[ cut here ]------------
[11486.475196] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:3336!
[11486.475222] invalid opcode: 0000 [#1] PREEMPT SMP

So... would this be a regression or something worth reporting upstream?  I Googled a little bit and found some posts on the Gentoo forums dated about a year ago that blamed KMS.  I moved KMS into my mkinitcpio a few weeks ago (and nearly broke my boot in the process), so I'll investigate that.


EDIT 2:  Going by some circumstantial evidence, I added KMS to my mkinitcpio.conf file around August 9th (after the Linux 3.0 update, because I botched the install and had hell trying to re-run mkinitcpio).  But I can see similar messages going back to August 2nd if I grep /var/log/kernel.log*.  There are three old kernel.logs, and the oldest one, kernel.log.3, is clean.  Wondering what could have changed on August 2nd that broke things...
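
A rough sketch of that grep, in case it helps someone date the first occurrence on their own machine. The function name first_hang_per_log is made up for this example, and the two patterns are simply the strings from the logs quoted above; adjust them to whatever your kernel prints.

```shell
#!/bin/sh
# Sketch only: for each log file given, print the first line that looks like
# an i915 GPU hang or kernel BUG. Files without a match print nothing.
# first_hang_per_log is a hypothetical helper name.

first_hang_per_log() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        # -m 1 stops at the first match, which is the earliest entry in the file.
        line=$(grep -m 1 -E 'GPU hung|kernel BUG at drivers/gpu/drm/i915' "$f") || continue
        printf '%s: %s\n' "$f" "$line"
    done
}
```

Called as first_hang_per_log /var/log/kernel.log*, a file that produces no output line (kernel.log.3 in my case) is clean.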


EDIT 3:  X changed on August 2nd.  At least there are some pacman packages timestamped August 2nd that would suggest as much.  I see some low-level updates: xterm, libx11, xorg-twm and xorg-xset, along with a dbus package dated August 3rd.  Xorg-server was updated on August 8th, so I don't think that's the culprit.
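
For reference, a small sketch of how the package changes for a given day can be pulled out of the pacman log. The helper name upgrades_on is invented for this example; it assumes the usual log location /var/log/pacman.log and that each entry starts with a bracketed date, which is how the entries look on my system.

```shell
#!/bin/sh
# Sketch only: list package installs, upgrades and removals logged on one day.
# upgrades_on is a hypothetical helper; the log path defaults to the usual one.

upgrades_on() {
    day="$1"                          # date in YYYY-MM-DD form
    log="${2:-/var/log/pacman.log}"   # optional alternative log file

    # First keep entries stamped with that day, then only package changes.
    grep -F "[$day" "$log" | grep -E ' (upgraded|installed|removed) '
}
```

For example, upgrades_on 2011-08-02 shows everything pacman changed on the day the messages first appeared.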

Last edited by Deemoney14 (2011-08-22 00:23:43)


This isn't the signature you're looking for...  Move along...


#2 2011-08-20 19:19:32

Deemoney14
Member
Registered: 2011-07-27
Posts: 70

Re: [CLOSED] Display Locks Up on Screen Lock - GPU Hang?

Bump.  I've re-edited my initial post, because the issue is not just when the lid is closed; it happens whenever the screen is locked, period.  After enough time, lid open or closed, the display becomes unresponsive.  Not sure if it's a power management issue, a laptop thing or a graphics thing.



#3 2011-08-22 00:23:28

Deemoney14
Member
Registered: 2011-07-27
Posts: 70

Re: [CLOSED] Display Locks Up on Screen Lock - GPU Hang?

Closing the issue.

Graphics YET AGAIN.

I figured it was screensaver-related, since I deactivated xscreensaver and have been fine ever since, so I googled "xscreensaver gpu hang".

Found TWO bug reports for the Intel drivers, dating back to February and May, with similar error messages.  The problem seems to be in xf86-video-intel.  Not sure if this was blatantly obvious to people reading this post, but hopefully this helps others with similar issues.



#4 2011-08-22 00:45:31

steve___
Member
Registered: 2008-02-24
Posts: 452

Re: [CLOSED] Display Locks Up on Screen Lock - GPU Hang?

I just read your posts now.  FWIW, I have an old laptop with the Intel 830 chipset and have the same issues.  Up until recently I had to deactivate the screen saver and DPMS by running this in $HOME/.xinitrc:

xset -dpms &
xset s off &

These two issues were fixed with 2.15.0-2, but closing the lid still causes the GPU to hang.
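
To confirm the workaround actually took effect, the DPMS state can be read back from the output of xset q. The sketch below is only an illustration: dpms_state is a made-up helper name, and it assumes xset q prints a line of the form "DPMS is Enabled" or "DPMS is Disabled", which is what it prints here.

```shell
#!/bin/sh
# Sketch only: read `xset q` output on stdin and print the DPMS state.
# dpms_state is a hypothetical helper; it relies on the "DPMS is ..." line.

dpms_state() {
    # The third word of the matching line is the state (Enabled or Disabled).
    awk '/DPMS is/ { print $3 }'
}
```

Usage from inside the X session: xset q | dpms_state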


#5 2011-08-22 01:01:51

Deemoney14
Member
Registered: 2011-07-27
Posts: 70

Re: [CLOSED] Display Locks Up on Screen Lock - GPU Hang?

I'm running 2.15.0-2 at the moment, so once again, it might be a Sandy Bridge thing.  Some of the bug reports I read mentioned IronLakes and HuronLakes; I'm guessing those are different chipsets.

I killed the xscreensaver daemon, and the display now turns off when I close the lid, or after 15 minutes via Xfce, so no harm, no foul at the moment.


