#1 2014-11-09 08:57:56

kalsan
Member
Registered: 2011-10-10
Posts: 119

Random crashes at startup

Hi there,
I recently had a problem with the computer "forgetting" the time, which was fixed by replacing the BIOS battery. Unfortunately, I now get random crashes at startup. They happen on roughly 20% of boots, but very irregularly: sometimes several times in a row, sometimes not at all for several days.
Behaviour: the screen shows "Arch, clean, blablablah" and then goes black instead of displaying the series of [OK] messages followed by the console login. Yes, there is no DM installed. By "goes black" I mean that the display is on but shows only black. NumLock and CapsLock still work (I can turn them on and off). The wireless button blinks to show that it is connected. Ctrl+Alt+Delete turns off wireless but changes nothing on the screen, and the machine doesn't reboot either. Caps and Num keep working after that.
I really don't want to reinstall the system, so I strongly hope that it's possible to fix it. Unfortunately, fsck does not help.
Any ideas how I can figure out what's wrong?
Cheers,
Kalsan

Offline

#2 2014-11-09 12:41:12

Potomac
Member
Registered: 2011-12-25
Posts: 528

Re: Random crashes at startup

Do you hear any hard disk activity after the display goes black?

You could try disabling dpm with a kernel boot parameter (if you have a Radeon: radeon.dpm=0) to see if the bug is still there.

Maybe it's the same bug, though not with exactly the same symptoms:

https://bbs.archlinux.org/viewtopic.php?id=189324

https://bugs.archlinux.org/task/42692

I'm doing a git bisect to find the commit that introduced this bug.

Kernels 3.16.x don't have the bug; it starts with kernel 3.17.

Last edited by Potomac (2014-11-09 12:52:57)

Offline

#3 2014-11-09 14:32:10

kalsan
Member
Registered: 2011-10-10
Posts: 119

Re: Random crashes at startup

Well, I can't "hear" my SSD, but the HD access lamp shows no activity.
dpm reduces my power consumption by 10 W. As it's a laptop, that's pretty important to me. I had already been running with dpm set to 1 for a long time, so the bug isn't related to it.
Apropos kernel 3.17: my problem is a little different, as it doesn't display any text (it's just blank) and it's not really frozen (Num, Caps and Net are working).
The problem first occurred on Oct 31 / Nov 1. I install all updates every day. Would the timing match the introduction of kernel 3.17?
Cheers,
Kalsan
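
One way to check that timing is to look at when pacman actually upgraded the kernel package. The sketch below assumes the usual log location /var/log/pacman.log and the "upgraded linux (old -> new)" line format. The function names are made up for illustration, and the tag and timestamp format differ between pacman versions, so the pattern may need adjusting.

```python
import re

# Loosely matches pacman.log upgrade lines of the form
#   [<timestamp>] [PACMAN] upgraded <package> (<old> -> <new>)
# The tag may be [PACMAN] or [ALPM] depending on the pacman version (assumption:
# no other tags are used for upgrade lines).
UPGRADE_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\]\s+\[(?:PACMAN|ALPM)\]\s+upgraded\s+"
    r"(?P<pkg>\S+)\s+\((?P<old>\S+)\s+->\s+(?P<new>\S+)\)"
)


def kernel_upgrades(lines, package="linux"):
    """Return (timestamp, old_version, new_version) for each upgrade of `package`."""
    upgrades = []
    for line in lines:
        match = UPGRADE_RE.match(line.strip())
        # Compare the full package name so that e.g. linux-headers is skipped.
        if match and match.group("pkg") == package:
            upgrades.append((match.group("when"), match.group("old"), match.group("new")))
    return upgrades


def read_kernel_upgrades(path="/var/log/pacman.log"):
    """Convenience wrapper that reads the log file from its usual location."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return kernel_upgrades(log)
```

Calling read_kernel_upgrades() and printing the result lists every kernel upgrade with its date, which can then be compared against the first day the black screen appeared.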

Offline

#4 2014-11-09 19:36:02

Potomac
Member
Registered: 2011-12-25
Posts: 528

Re: Random crashes at startup

kalsan wrote:

Apropos kernel 3.17: my problem is a little different, as it doesn't display any text (it's just blank) and it's not really frozen (Num, Caps and Net are working).
Kalsan

When I said "freeze", it's not really a freeze, because my keyboard is still alive (the Num, Shift, Alt and Caps keys work). It's a freeze in the sense that the boot has suddenly stopped: it seems that the kernel or systemd is waiting for something when the bug occurs and can't get any further in the boot process. I have to reset with the "reset" button.

The only difference between you and me is that I can see something on the screen (not a completely black screen).

Last edited by Potomac (2014-11-09 19:37:16)

Offline

#5 2014-11-10 08:43:04

kalsan
Member
Registered: 2011-10-10
Posts: 119

Re: Random crashes at startup

Hm, it might be the same thing then. Yesterday it happened again, and this time my room was dark. I saw that the backlight of the screen was off as well (contrary to my previous statement). So it might be AMD :-(
However, the problem has only occurred since November, and kernel 3.17 came out at the beginning of October. That's a counter-argument.
Which logs could I check to figure out what's wrong?

Offline

#6 2014-11-10 09:19:47

Potomac
Member
Registered: 2011-12-25
Posts: 528

Re: Random crashes at startup

You can try "journalctl", which reads the systemd journal:

https://wiki.archlinux.org/index.php/Journalctl#Journal
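
A rough sketch of how the journal of a failed boot could be searched for kernel crash reports. It assumes the journal is persistent (so earlier boots are still available) and that the user may read it. The marker list and the function names are illustrative choices, not anything systemd defines.

```python
import subprocess

# Substrings that commonly introduce a kernel crash report (illustrative list).
CRASH_MARKERS = ("BUG:", "Oops:", "Call Trace:", "Kernel panic")


def find_crash_lines(lines):
    """Return the log lines that contain one of the crash markers."""
    return [line.rstrip("\n") for line in lines
            if any(marker in line for marker in CRASH_MARKERS)]


def kernel_log(boot_offset=-1):
    """Return the kernel messages of one boot as a list of lines.

    boot_offset follows journalctl's -b option: 0 is the current boot,
    -1 the previous one, and so on.
    """
    cmd = ["journalctl", "--no-pager", "-k", "-b", str(boot_offset)]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.splitlines()
```

After rebooting from a black screen, find_crash_lines(kernel_log(-1)) should show whether the kernel logged a BUG or Oops during the boot that hung.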

Offline

#7 2014-11-10 09:53:26

kalsan
Member
Registered: 2011-10-10
Posts: 119

Re: Random crashes at startup

Aha! I think I found it. Yes, dpm seems to be the problem here:

Nov 05 11:41:48 lp-user-arch kernel: NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
Nov 05 11:41:48 lp-user-arch kernel: switching from power state:
Nov 05 11:41:48 lp-user-arch kernel:         ui class: none
Nov 05 11:41:48 lp-user-arch kernel:         internal class: boot 
Nov 05 11:41:48 lp-user-arch kernel:         caps: 
Nov 05 11:41:48 lp-user-arch kernel:         uvd    vclk: 0 dclk: 0
Nov 05 11:41:48 lp-user-arch kernel:                 power level 0    sclk: 30000 mclk: 15000 vddc: 900 vddci: 0
Nov 05 11:41:48 lp-user-arch kernel:                 power level 1    sclk: 30000 mclk: 15000 vddc: 900 vddci: 0
Nov 05 11:41:48 lp-user-arch kernel:                 power level 2    sclk: 30000 mclk: 15000 vddc: 900 vddci: 0
Nov 05 11:41:48 lp-user-arch kernel:         status: c b 
Nov 05 11:41:48 lp-user-arch kernel: switching to power state:
Nov 05 11:41:48 lp-user-arch kernel:         ui class: performance
Nov 05 11:41:48 lp-user-arch kernel:         internal class: none
Nov 05 11:41:48 lp-user-arch kernel:         caps: 
Nov 05 11:41:48 lp-user-arch kernel:         uvd    vclk: 0 dclk: 0
Nov 05 11:41:48 lp-user-arch kernel:                 power level 0    sclk: 30000 mclk: 15000 vddc: 900 vddci: 1000
Nov 05 11:41:48 lp-user-arch kernel:                 power level 1    sclk: 40000 mclk: 80000 vddc: 900 vddci: 1000
Nov 05 11:41:48 lp-user-arch kernel:                 power level 2    sclk: 50000 mclk: 80000 vddc: 900 vddci: 1000
Nov 05 11:41:48 lp-user-arch kernel:         status: r 
Nov 05 11:41:48 lp-user-arch kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
Nov 05 11:41:48 lp-user-arch kernel: IP: [<ffffffffa0528b2b>] evergreen_bandwidth_update+0x4b/0x120 [radeon]
Nov 05 11:41:48 lp-user-arch kernel: PGD 22dee7067 PUD 230494067 PMD 0 
Nov 05 11:41:48 lp-user-arch kernel: Oops: 0000 [#1] PREEMPT SMP 
Nov 05 11:41:48 lp-user-arch kernel: Modules linked in: joydev iTCO_wdt iTCO_vendor_support tpm_infineon ppdev snd_hda_codec_idt snd_hda_codec_generic coretemp intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pc
Nov 05 11:41:48 lp-user-arch kernel:  ac(+) button ext4 crc16 mbcache jbd2 sd_mod sr_mod crc_t10dif cdrom crct10dif_common atkbd libps2 ahci libahci libata sdhci_pci scsi_mod ehci_pci sdhci firewire_ohci led_class xhci_hcd ehci_hcd fir
Nov 05 11:41:48 lp-user-arch kernel: CPU: 1 PID: 1255 Comm: laptop_mode Tainted: G        W      3.17.2-1-ARCH #1
Nov 05 11:41:48 lp-user-arch kernel: Hardware name: Hewlett-Packard HP EliteBook 8570p/17A7, BIOS 68ICF Ver. F.45 10/07/2013
Nov 05 11:41:48 lp-user-arch kernel: task: ffff88022cf35a90 ti: ffff88022cfe0000 task.ti: ffff88022cfe0000
Nov 05 11:41:48 lp-user-arch kernel: RIP: 0010:[<ffffffffa0528b2b>]  [<ffffffffa0528b2b>] evergreen_bandwidth_update+0x4b/0x120 [radeon]
Nov 05 11:41:48 lp-user-arch kernel: RSP: 0018:ffff88022cfe3de8  EFLAGS: 00010246
Nov 05 11:41:48 lp-user-arch kernel: RAX: ffff88022dd18498 RBX: ffff88022dd18000 RCX: ffff88022dd184c8
Nov 05 11:41:48 lp-user-arch kernel: RDX: 0000000000000000 RSI: 0000000000000384 RDI: ffff88022dd18000
Nov 05 11:41:48 lp-user-arch kernel: RBP: ffff88022cfe3e18 R08: ffff880231d8ce92 R09: 0000000000000384
Nov 05 11:41:48 lp-user-arch kernel: R10: 000000000000c350 R11: 00000000000003b7 R12: ffff88022dd18000
Nov 05 11:41:48 lp-user-arch kernel: R13: 0000000000000000 R14: ffff88022dd19738 R15: ffff88022dd19048
Nov 05 11:41:48 lp-user-arch kernel: FS:  00007f8a4526c700(0000) GS:ffff88023dc40000(0000) knlGS:0000000000000000
Nov 05 11:41:48 lp-user-arch kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 05 11:41:48 lp-user-arch kernel: CR2: 0000000000000090 CR3: 0000000231305000 CR4: 00000000001407e0
Nov 05 11:41:48 lp-user-arch kernel: Stack:
Nov 05 11:41:48 lp-user-arch kernel:  0000000040784cbc ffff88022dd18000 ffff88022dd19710 0000000000000000
Nov 05 11:41:48 lp-user-arch kernel:  ffff88022dd19738 ffff88022dd19048 ffff88022cfe3e58 ffffffffa051f38e
Nov 05 11:41:48 lp-user-arch kernel:  ffffffff8105e9ac ffff88022dd18000 000000000000000c ffff88023139b2c0
Nov 05 11:41:48 lp-user-arch kernel: Call Trace:
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffffa051f38e>] radeon_pm_compute_clocks+0x62e/0x8f0 [radeon]
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffff8105e9ac>] ? __do_page_fault+0x2ec/0x600
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffffa051fc03>] radeon_set_dpm_state+0x73/0xf0 [radeon]
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffff813a71d8>] dev_attr_store+0x18/0x30
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffff8123e46a>] sysfs_kf_write+0x3a/0x50
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffff8123d9de>] kernfs_fop_write+0xee/0x180
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffff811c7247>] vfs_write+0xb7/0x200
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffff811c7eb9>] SyS_write+0x59/0xd0
Nov 05 11:41:48 lp-user-arch kernel:  [<ffffffff8153c7a9>] system_call_fastpath+0x16/0x1b
Nov 05 11:41:48 lp-user-arch kernel: Code: 94 24 78 20 00 00 85 d2 0f 8e d7 00 00 00 83 ea 01 49 8d 84 24 98 04 00 00 45 31 ed 49 8d 8c d4 a0 04 00 00 0f 1f 40 00 48 8b 10 <80> ba 90 00 00 00 01 41 83 dd ff 48 83 c0 08 48 39 c8 75 e9 4
Nov 05 11:41:48 lp-user-arch kernel: RIP  [<ffffffffa0528b2b>] evergreen_bandwidth_update+0x4b/0x120 [radeon]
Nov 05 11:41:48 lp-user-arch kernel:  RSP <ffff88022cfe3de8>
Nov 05 11:41:48 lp-user-arch kernel: CR2: 0000000000000090
Nov 05 11:41:48 lp-user-arch kernel: ---[ end trace 093cd6f2deafe652 ]---
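
For reading a trace like this one, a small parser can pull out the function and module of each frame. In the log above, the Comm field names laptop_mode and the call chain runs from SyS_write through dev_attr_store into radeon_set_dpm_state, which suggests the oops was triggered while that process wrote a DPM state through sysfs. The sketch below assumes the "[<address>] symbol+0xoff/0xsize [module]" frame format shown in this log; the Frame type and parse_frame name are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# One stack frame as printed in the log above, e.g.
#   [<ffffffffa051f38e>] radeon_pm_compute_clocks+0x62e/0x8f0 [radeon]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


class Frame(NamedTuple):
    symbol: str
    offset: int
    module: Optional[str]  # None when no [module] tag follows the symbol
    reliable: bool         # False for frames the kernel marked with "?"


def parse_frame(line: str) -> Optional[Frame]:
    """Extract the first symbol+offset frame from a log line, or None."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return Frame(
        symbol=match.group("symbol"),
        offset=int(match.group("offset"), 16),
        module=match.group("module"),
        reliable=match.group("guess") is None,
    )
```

Running parse_frame over every line and keeping the non-None results gives a compact list of frames, which makes it easy to see that both the faulting instruction and its callers sit in the radeon module.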

Offline

#8 2014-11-10 19:10:12

Potomac
Member
Registered: 2011-12-25
Posts: 528

Re: Random crashes at startup

You can try disabling dpm with a kernel boot parameter:

radeon.dpm=0
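
The parameter goes on the kernel command line in the bootloader configuration. After rebooting, it is worth confirming that the kernel really received it. A minimal sketch, assuming the command line is read from /proc/cmdline and that the last occurrence of the parameter is the one that takes effect; the function names are made up for illustration.

```python
from typing import Optional


def radeon_dpm_setting(cmdline: str) -> Optional[str]:
    """Return the value of the last radeon.dpm=... token, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == "radeon.dpm":
            value = val  # later tokens overwrite earlier ones
    return value


def current_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()
```

radeon_dpm_setting(current_cmdline()) returning "0" would mean the boot was done with dpm disabled, so a crash-free series of boots can be attributed to the parameter rather than to luck.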

Offline

#9 2014-11-10 19:13:18

kalsan
Member
Registered: 2011-10-10
Posts: 119

Re: Random crashes at startup

Yes. Unfortunately, that makes the laptop draw over 40 W, so it is not a sustainable solution.
Do you think it is a dpm bug? Any chance it might get fixed? Is there a workaround?

Offline

#10 2014-11-10 22:57:45

Potomac
Member
Registered: 2011-12-25
Posts: 528

Re: Random crashes at startup

One solution could be to install kernel 3.18-rc4. You can find this package in the AUR:

https://aur.archlinux.org/packages/linux-mainline/

Download the tarball, extract it, then run "makepkg -c" to build the package. After that you can install it with "pacman -U the_package.xz".

Afterwards, don't forget to add the "linux-mainline" kernel to your bootloader configuration file (grub, syslinux).

Last edited by Potomac (2014-11-11 00:19:44)

Offline

#11 2014-11-11 13:46:17

kalsan
Member
Registered: 2011-10-10
Posts: 119

Re: Random crashes at startup

There seems to be a problem with the miffe repo (the key cannot be imported):

[miffe]
Server = http://arch.miffe.org/$arch/

Compiling the kernel takes forever. I guess I'll just wait until December, when the new kernel comes out.
Thanks for your help!
Cheers,
Kalsan

Last edited by kalsan (2014-11-12 12:44:51)

Offline

#12 2014-11-12 12:58:51

kalsan
Member
Registered: 2011-10-10
Posts: 119

Re: Random crashes at startup

UPDATE: Got miffe to work. The bug ALSO occurs under mainline.
What could I try next?
Cheers,
Kalsan

Offline

#13 2014-11-15 15:05:22

kalsan
Member
Registered: 2011-10-10
Posts: 119

Re: Random crashes at startup

Alright, fixed it. I'm not proud of the fix, but it "works": installed catalyst-test.

Offline
