#1 2014-11-11 15:09:29

justasug
Member
Registered: 2014-08-03
Posts: 168

[SOLVED] Help troubleshooting random crashes

Ever since installing Arch on this computer I've been experiencing random lock-ups, which bring everything to a complete halt, after which the computer restarts automatically. When it happens, everything just freezes: I can't move the mouse, the keyboard doesn't respond and I can't switch to a different TTY. At first I ignored it, but it has now happened once while installing updates, which broke everything and forced me to reinstall from scratch.

I ruled out hardware issues (other distributions and Windows worked fine, but I could install those again and test it as a last resort, just to make sure). I tested the hard drive for errors, none reported. Then I ran memtest86 for 7 hours (6 passes) and all were okay (no reported errors). I looked through the forums and saw a few threads like this one: https://bbs.archlinux.org/viewtopic.php?id=189324, but those seem to be specific to the latest 3.17 kernel. This was also happening on the earlier (3.16) kernels.
The computer in question is an HP Pavilion DV4 laptop, with an Intel C2D T5800, Intel GMA X4500 and 4GB of DDR2 RAM, currently running the x86_64 3.17.2-1 kernel. I can post more specific info if needed.

My first thought was to check some logs to see if anything suspicious happened shortly before a crash. I only know the journalctl way, but there is nothing out of the ordinary listed in it (I can post the log if needed, maybe I missed something). Are there other logs I can check, which might have more information?

EDIT: This turned out to be caused by faulty hardware.

Last edited by justasug (2016-03-09 10:38:42)

#2 2014-11-11 16:47:44

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,926

Re: [SOLVED] Help troubleshooting random crashes

No answers, just questions.
What environment are you using?  Gnome, KDE, i3, OpenBox?
Are you using Wireless?  What is the wireless chipset?
When it freezes, do the keyboard lights blink?
Can you post the output of lsmod and of systemctl status? Heck, while we are at it, how about the output of lspci -nn?


Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way

#3 2014-11-11 21:23:15

justasug
Member
Registered: 2014-08-03
Posts: 168

Re: [SOLVED] Help troubleshooting random crashes

I'm using Openbox.
Yes, I'm using a wireless connection most of the time. The laptop has a Broadcom 4312 LP-PHY chipset and I'm using it with the b43 driver. It works fine, and I had no trouble installing and setting it up.
As mentioned above, when it freezes everything stops. The keyboard lights don't respond when I try to turn on caps or num lock for example.

lsmod:

Module                  Size  Used by
nls_iso8859_1          12461  0 
nls_cp437              16553  0 
vfat                   21231  0 
fat                    61984  1 vfat
uas                    21995  0 
usb_storage            60311  1 uas
cdc_wdm                17427  0 
cdc_acm                30362  0 
cdc_ether              12564  0 
usbnet                 34978  1 cdc_ether
fuse                   87410  3 
ctr                    12927  3 
ccm                    17534  3 
arc4                   12536  2 
b43                   410153  0 
bcma                   45915  1 b43
mac80211              604456  1 b43
iTCO_wdt               12831  0 
iTCO_vendor_support    12649  1 iTCO_wdt
cfg80211              445286  2 b43,mac80211
rng_core               12808  1 b43
uvcvideo               78952  0 
videobuf2_vmalloc      12816  1 uvcvideo
videobuf2_memops       12519  1 videobuf2_vmalloc
videobuf2_core         47827  1 uvcvideo
v4l2_common            12995  1 videobuf2_core
joydev                 17063  0 
hp_wmi                 13238  0 
videodev              135040  3 uvcvideo,v4l2_common,videobuf2_core
sparse_keymap          12818  1 hp_wmi
media                  18365  2 uvcvideo,videodev
rfkill                 18867  3 cfg80211,hp_wmi
mousedev               17272  0 
ssb                    65506  1 b43
r8169                  68207  0 
coretemp               12820  0 
mii                    12675  2 r8169,usbnet
pcmcia                 53108  2 b43,ssb
pcmcia_core            18431  1 pcmcia
hwmon                  12930  1 coretemp
evdev                  21544  16 
mac_hid                12633  0 
serio_raw              12849  0 
jmb38x_ms              17096  0 
memstick               13696  1 jmb38x_ms
psmouse               107214  0 
pcspkr                 12595  0 
snd_hda_codec_hdmi     49213  2 
i2c_i801               16965  0 
lpc_ich                20768  0 
ir_lirc_codec          12675  0 
ir_jvc_decoder         12433  0 
lirc_dev               16951  1 ir_lirc_codec
ir_mce_kbd_decoder     12574  0 
ir_xmp_decoder         12433  0 
ir_sanyo_decoder       12437  0 
ir_sharp_decoder       12437  0 
ir_rc5_decoder         12433  0 
ir_sony_decoder        12435  0 
ir_nec_decoder         12433  0 
ir_rc6_decoder         12433  0 
rc_rc6_mce             12396  0 
hp_accel               25200  0 
lis3lv02d              17883  1 hp_accel
input_polldev          13118  1 lis3lv02d
fan                    12681  0 
ene_ir                 21814  0 
rc_core                22437  14 ir_sharp_decoder,ir_xmp_decoder,lirc_dev,ir_lirc_codec,ir_rc5_decoder,ir_nec_decoder,ir_sony_decoder,ene_ir,ir_mce_kbd_decoder,ir_jvc_decoder,ir_rc6_decoder,ir_sanyo_decoder,rc_rc6_mce
i915                  905750  2 
video                  18043  1 i915
thermal                17559  0 
battery                17452  0 
wmi                    17339  1 hp_wmi
snd_hda_codec_idt      56952  1 
snd_hda_codec_generic    63126  1 snd_hda_codec_idt
drm_kms_helper         80934  1 i915
drm                   259106  4 i915,drm_kms_helper
shpchp                 35210  0 
intel_agp              17432  0 
intel_gtt              17848  3 i915,intel_agp
i2c_algo_bit           12744  1 i915
i2c_core               50152  7 drm,i915,i2c_i801,drm_kms_helper,i2c_algo_bit,v4l2_common,videodev
snd_hda_intel          26387  4 
snd_hda_controller     26938  1 snd_hda_intel
snd_hda_codec         108536  5 snd_hda_codec_hdmi,snd_hda_codec_idt,snd_hda_codec_generic,snd_hda_intel,snd_hda_controller
snd_hwdep              17244  1 snd_hda_codec
snd_pcm                88487  4 snd_hda_codec_hdmi,snd_hda_codec,snd_hda_intel,snd_hda_controller
snd_timer              26614  1 snd_pcm
snd                    73436  16 snd_hwdep,snd_timer,snd_hda_codec_hdmi,snd_hda_codec_idt,snd_pcm,snd_hda_codec_generic,snd_hda_codec,snd_hda_intel
soundcore              13031  2 snd,snd_hda_codec
button                 12953  1 i915
ac                     12715  0 
acpi_cpufreq           17218  0 
processor              27777  3 acpi_cpufreq
ext4                  497696  1 
crc16                  12343  1 ext4
mbcache                17171  1 ext4
jbd2                   86417  1 ext4
sd_mod                 44398  3 
crc_t10dif             12431  1 sd_mod
crct10dif_common       12356  1 crc_t10dif
atkbd                  22254  0 
libps2                 12739  2 atkbd,psmouse
ahci                   33291  2 
libahci                27158  1 ahci
libata                181518  2 ahci,libahci
scsi_mod              147543  4 uas,usb_storage,libata,sd_mod
sdhci_pci              22095  0 
sdhci                  39043  1 sdhci_pci
led_class              12859  3 b43,sdhci,hp_accel
mmc_core              110434  4 b43,ssb,sdhci,sdhci_pci
uhci_hcd               43507  0 
ehci_pci               12512  0 
ehci_hcd               69939  1 ehci_pci
usbcore               199381  10 uas,uhci_hcd,uvcvideo,usb_storage,ehci_hcd,ehci_pci,usbnet,cdc_acm,cdc_wdm,cdc_ether
usb_common             12440  1 usbcore
i8042                  18002  1 libps2
serio                  18282  6 serio_raw,atkbd,i8042,psmouse

systemctl status:

● masina2
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Uto 2014-11-11 21:44:04 CET; 38min ago
   CGroup: /
           ├─1 /sbin/init
           ├─system.slice
           │ ├─dbus.service
           │ │ └─211 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
           │ ├─wpa_supplicant.service
           │ │ └─281 /usr/bin/wpa_supplicant -u
           │ ├─lightdm.service
           │ │ ├─226 /usr/bin/lightdm
           │ │ └─231 /usr/bin/Xorg.bin :0 -seat seat0 -auth /run/lightdm/root/:0 -nolisten tcp vt1 -novtswitch
           │ ├─accounts-daemon.service
           │ │ └─233 /usr/lib/accountsservice/accounts-daemon
           │ ├─systemd-journald.service
           │ │ └─126 /usr/lib/systemd/systemd-journald
           │ ├─udisks2.service
           │ │ └─549 /usr/lib/udisks2/udisksd --no-debug
           │ ├─systemd-logind.service
           │ │ └─210 /usr/lib/systemd/systemd-logind
           │ ├─systemd-udevd.service
           │ │ └─144 /usr/lib/systemd/systemd-udevd
           │ ├─polkit.service
           │ │ └─237 /usr/lib/polkit-1/polkitd --no-debug
           │ ├─NetworkManager.service
           │ │ ├─208 /usr/bin/NetworkManager --no-daemon
           │ │ └─306 /usr/bin/dhcpcd -B -K -L -G -c /usr/lib/networkmanager/nm-dhcp-helper wlan0
           │ ├─nmbd.service
           │ │ └─258 /usr/bin/nmbd -D
           │ └─rtkit-daemon.service
           │   └─349 /usr/lib/rtkit/rtkit-daemon
           └─user.slice
             ├─user-1000.slice
             │ ├─user@1000.service
             │ │ ├─311 /usr/lib/systemd/systemd --user
             │ │ └─312 (sd-pam)  
             │ └─session-c2.scope
             │   ├─283 lightdm --session-child 12 19
             │   ├─314 /usr/bin/openbox --startup /usr/lib/openbox/openbox-autostart OPENBOX
             │   ├─321 dbus-launch --sh-syntax --exit-with-session
             │   ├─322 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
             │   ├─334 tint2
             │   ├─335 sh /home/dino/.config/openbox/autostart
             │   ├─338 volumeicon
             │   ├─339 nm-applet
             │   ├─345 /usr/lib/gvfs/gvfsd
             │   ├─348 /usr/bin/pulseaudio --start
             │   ├─355 /usr/lib/gvfs/gvfsd-fuse /run/user/1000/gvfs -f -o big_writes
             │   ├─376 /usr/lib/pulse/gconf-helper
             │   ├─378 /usr/lib/GConf/gconfd-2
             │   ├─421 firefox
             │   ├─541 thunar
             │   ├─543 /usr/lib/xfce4/xfconf/xfconfd
             │   ├─547 /usr/lib/gvfs/gvfs-udisks2-volume-monitor
             │   ├─558 /usr/lib/gvfs/gvfsd-trash --spawner :1.2 /org/gtk/gvfs/exec_spaw/0
             │   ├─637 /usr/lib/dconf/dconf-service
             │   ├─644 /usr/lib/gvfs/gvfsd-metadata
             │   ├─712 xfce4-terminal
             │   ├─716 gnome-pty-helper
             │   ├─717 bash
             │   └─722 systemctl status
             └─user-620.slice
               ├─user@620.service
               │ ├─259 /usr/lib/systemd/systemd --user
               │ └─260 (sd-pam)  
               └─session-c1.scope
                 ├─267 /usr/bin/dbus-launch --autolaunch 356703664f0d4637b0cb9d0385e4d31a --binary-syntax --close-stderr
                 ├─268 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
                 ├─270 /usr/lib/at-spi2-core/at-spi-bus-launcher
                 ├─274 /usr/bin/dbus-daemon --config-file=/etc/at-spi2/accessibility.conf --nofork --print-address 3
                 └─277 /usr/lib/at-spi2-core/at-spi2-registryd --use-gnome-session

lspci -nn

00:00.0 Host bridge [0600]: Intel Corporation Mobile 4 Series Chipset Memory Controller Hub [8086:2a40] (rev 07)
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07)
00:02.1 Display controller [0380]: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a43] (rev 07)
00:1a.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 [8086:2937] (rev 03)
00:1a.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 [8086:2938] (rev 03)
00:1a.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 [8086:293c] (rev 03)
00:1b.0 Audio device [0403]: Intel Corporation 82801I (ICH9 Family) HD Audio Controller [8086:293e] (rev 03)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 [8086:2940] (rev 03)
00:1c.2 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 3 [8086:2944] (rev 03)
00:1c.3 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 4 [8086:2946] (rev 03)
00:1c.4 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 [8086:2948] (rev 03)
00:1c.5 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 [8086:294a] (rev 03)
00:1d.0 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 [8086:2934] (rev 03)
00:1d.1 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 [8086:2935] (rev 03)
00:1d.2 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 [8086:2936] (rev 03)
00:1d.7 USB controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 [8086:293a] (rev 03)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev 93)
00:1f.0 ISA bridge [0601]: Intel Corporation ICH9M LPC Interface Controller [8086:2919] (rev 03)
00:1f.2 SATA controller [0106]: Intel Corporation 82801IBM/IEM (ICH9M/ICH9M-E) 4 port SATA Controller [AHCI mode] [8086:2929] (rev 03)
00:1f.3 SMBus [0c05]: Intel Corporation 82801I (ICH9 Family) SMBus Controller [8086:2930] (rev 03)
02:00.0 Network controller [0280]: Broadcom Corporation BCM4312 802.11b/g LP-PHY [14e4:4315] (rev 01)
03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8101E/RTL8102E PCI Express Fast Ethernet controller [10ec:8136] (rev 02)
04:00.0 System peripheral [0880]: JMicron Technology Corp. SD/MMC Host Controller [197b:2382]
04:00.2 SD Host controller [0805]: JMicron Technology Corp. Standard SD Host Controller [197b:2381]
04:00.3 System peripheral [0880]: JMicron Technology Corp. MS Host Controller [197b:2383]
04:00.4 System peripheral [0880]: JMicron Technology Corp. xD Host Controller [197b:2384]

Last edited by justasug (2014-11-11 21:25:28)

#4 2014-11-17 17:00:37

justasug
Member
Registered: 2014-08-03
Posts: 168

Re: [SOLVED] Help troubleshooting random crashes

After a couple of the usual crashes, there's been some "progress". The last one made it to this screen after the usual freeze-up.

[Screenshot: 6CdnfV7s.jpg]

It said "run the above through mcelog --ascii", but what exactly do they refer to with "the above"?
According to the article on Wikipedia, MCE errors can occur after overclocking (I'm not overclocking), with poorly fitted heatsinks or computer fans (I don't have any overheating issues), or with an overloaded internal or external power supply (I don't know how to test for that). Is there anything else I can do, or should I write this off as faulty hardware and move on?

EDIT:
I ran the error messages through mcelog, here's the output. Since "CPU" is mentioned often in it, does that mean that the CPU is faulty or dying?

Hardware event. This is not a software error.
CPU 1 BANK 0 TSC 5dcd26b076 
TIME 1416237431 Mon Nov 17 16:17:11 2014
MCG status:MCIP 
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-did-not-timeout Error
BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
timeout BINIT (ROB timeout). No micro-instruction retired for some time
STATUS f200004000000800 MCGSTATUS 4
CPUID Vendor Intel Family 6 Model 15
SOCKET 0 APIC 1 microcode a3


Hardware event. This is not a software error.
CPU 1 BANK 5 TSC 5dcd26b076 
TIME 1416237431 Mon Nov 17 16:17:11 2014
MCG status:MCIP 
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: Internal Timer error
STATUS f200241010040400 MCGSTATUS 4
CPUID Vendor Intel Family 6 Model 15
SOCKET 0 APIC 1 microcode a3


Hardware event. This is not a software error.
CPU 0 BANK 0 TSC 5dcd26b08a 
TIME 1416237431 Mon Nov 17 16:17:11 2014
MCG status:MCIP 
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: Internal Timer error
STATUS f200241010040400 MCGSTATUS 4
CPUID Vendor Intel Family 6 Model 15
SOCKET 0 APIC 1 microcode a3


Hardware event. This is not a software error.
CPU 0 BANK 0 TSC 5dcd26b08a 
TIME 1416237431 Mon Nov 17 16:17:11 2014
MCG status:MCIP 
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-did-not-timeout Error
BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
timeout BINIT (ROB timeout). No micro-instruction retired for some time
STATUS f200004000000800 MCGSTATUS 4
CPUID Vendor Intel Family 6 Model 15
SOCKET 0 APIC 1 microcode a3
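
For reference, the flag lines mcelog prints above can be cross-checked against the raw STATUS values. What follows is only a minimal sketch, assuming the architectural IA32_MCi_STATUS bit layout as described in Intel's Software Developer's Manual (flag bits 57 to 63, MCA error code in bits 0 to 15, model-specific code in bits 16 to 31). The names decode_status and describe_mca_code are hypothetical helpers made up for this illustration; they are not part of mcelog, and only the two error-code patterns seen in this output are handled.

```python
# Sketch only: bit positions assume the architectural IA32_MCi_STATUS layout.
MCI_STATUS_FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (error overflow)",
    61: "UC (uncorrected error)",
    60: "EN (error enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}

# Sub-fields of the bus/interconnect error code pattern 000F 1PPT RRRR IILL.
PARTICIPATION = {0: "local CPU originated request", 1: "local CPU responded to request",
                 2: "local CPU observed error as third party", 3: "generic"}
REQUEST = {0: "generic", 1: "generic read", 2: "generic write", 3: "data read",
           4: "data write", 5: "instruction fetch", 6: "prefetch", 7: "eviction", 8: "snoop"}
TRANSACTION = {0: "memory access", 1: "reserved", 2: "I/O", 3: "other transaction"}


def decode_status(status_hex):
    """Split a hex STATUS string into its flags and error-code fields."""
    status = int(status_hex, 16)
    return {
        "flags": [name for bit, name in MCI_STATUS_FLAGS.items() if (status >> bit) & 1],
        "mca_code": status & 0xFFFF,             # architectural MCA error code
        "model_code": (status >> 16) & 0xFFFF,   # model-specific error code
    }


def describe_mca_code(code):
    """Describe the two MCA error-code patterns that appear in this thread."""
    if code == 0x0400:
        return "internal timer error"
    if (code & 0xE800) == 0x0800:  # bus/interconnect pattern, filter bit 12 ignored
        level = code & 0x3
        return ", ".join([
            "bus/interconnect error",
            "generic level" if level == 3 else "level %d" % level,
            PARTICIPATION[(code >> 9) & 0x3],
            "request timed out" if (code >> 8) & 1 else "request did not time out",
            REQUEST.get((code >> 4) & 0xF, "reserved request type"),
            TRANSACTION[(code >> 2) & 0x3],
        ])
    return "not decoded by this sketch (0x%04x)" % code


if __name__ == "__main__":
    # The two distinct STATUS values from the mcelog output above.
    for raw in ("f200004000000800", "f200241010040400"):
        fields = decode_status(raw)
        print(raw)
        for flag in fields["flags"]:
            print("  " + flag)
        print("  " + describe_mca_code(fields["mca_code"]))
```

Running it on the two STATUS values is just a way to confirm that the "Error overflow", "Uncorrected error", "Error enabled" and "Processor context corrupt" lines come straight from the top bits of the register, and that the "BUS ..." and "Internal Timer error" lines come from the low 16 bits. It does not say anything about which component is at fault.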

Last edited by justasug (2014-11-17 21:38:02)
