#1 2017-05-18 11:49:47

dpetrov
Member
Registered: 2017-04-17
Posts: 3

[Arch Crash] Debugging a system that reboots itself

Hello Guys,

My system has started rebooting at random intervals, and the journal gives no clue as to why.


# last | head
dpetrov  tty1                          Thu May 18 14:08   still logged in
reboot   system boot  4.10.11-1-ARCH   Thu May 18 14:00   still running
dpetrov  pts/20       192.168.10.139   Thu May 18 07:37 - crash  (06:23)
dpetrov  pts/33       192.168.10.139   Thu May 18 05:54 - 08:01  (02:06)
dpetrov  pts/33       192.168.10.139   Wed May 17 13:59 - 19:12  (05:12)
dpetrov  pts/33       192.168.10.139   Wed May 17 11:31 - 13:23  (01:51)
dpetrov  pts/33       192.168.10.100   Wed May 17 05:49 - 05:50  (00:00)
dpetrov  pts/33       192.168.10.100   Wed May 17 05:44 - 05:44  (00:00)
dpetrov  pts/33       192.168.10.139   Tue May 16 23:30 - 23:55  (00:25)
dpetrov  pts/28       192.168.10.139   Mon May 15 22:41 - 23:23  (00:42)
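
The entry ending in "crash" is how last marks a session that had no logout record before the next boot. As a rough sketch, the filter below keeps only those entries; the function name crashed_sessions is made up for illustration, and the pattern assumes the column layout shown above.

```shell
# Hypothetical helper: keep only the `last` entries whose session ended in a crash.
# It reads `last` output on stdin, so it also works on saved output.
crashed_sessions() {
    grep -E ' - crash +\('
}

# Usage: last | crashed_sessions
```

Run against the listing above, this should leave only the pts/20 session that ended on Thu May 18.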

The system just crashed at about 14:00. Here is the journalctl excerpt from around that time:

# journalctl -S '2017-05-18 13:45:00' -U '2017-05-18 14:01' | head -50
-- Logs begin at Thu 2016-12-08 00:39:46 EET, end at Thu 2017-05-18 14:30:10 EEST. --
May 18 13:45:01 casanova.evolve.inc crond[32608]: pam_unix(crond:session): session opened for user dpetrov by (uid=0)                                                                                              
May 18 13:45:01 casanova.evolve.inc CROND[32609]: (dpetrov) CMD (/home/dpetrov/bin/syncmail)
May 18 13:45:12 casanova.evolve.inc CROND[32608]: (dpetrov) CMDOUT (indexing messages under /home/dpetrov/Maildir [/home/dpetrov/.mu/xapian])
May 18 13:45:12 casanova.evolve.inc CROND[32608]: (dpetrov) CMDOUT ()
May 18 13:45:12 casanova.evolve.inc CROND[32608]: (dpetrov) CMDOUT (cleaning up messages [/home/dpetrov/.mu/xapian])
May 18 13:45:12 casanova.evolve.inc CROND[32608]: [86B blob data]
May 18 13:45:12 casanova.evolve.inc CROND[32608]: (dpetrov) CMDOUT (elapsed: 0 second(s))
May 18 13:45:12 casanova.evolve.inc CROND[32608]: [86B blob data]
May 18 13:45:12 casanova.evolve.inc CROND[32608]: (dpetrov) CMDOUT (elapsed: 0 second(s))
May 18 13:45:12 casanova.evolve.inc CROND[32608]: pam_unix(crond:session): session closed for user dpetrov
May 18 13:52:57 casanova.evolve.inc kernel: usb 3-7: new full-speed USB device number 21 using xhci_hcd
May 18 13:53:01 casanova.evolve.inc kernel: usb 3-7: new high-speed USB device number 22 using xhci_hcd
May 18 13:56:54 casanova.evolve.inc kernel: usb 3-7: USB disconnect, device number 22
May 18 13:56:59 casanova.evolve.inc kernel: usb 3-7: new full-speed USB device number 23 using xhci_hcd
May 18 13:57:02 casanova.evolve.inc dnsmasq-dhcp[962]: DHCPREQUEST(virbr0) 192.168.122.80 52:54:00:39:20:6a
May 18 13:57:02 casanova.evolve.inc dnsmasq-dhcp[962]: DHCPACK(virbr0) 192.168.122.80 52:54:00:39:20:6a host
May 18 13:57:02 casanova.evolve.inc kernel: usb 3-7: new high-speed USB device number 24 using xhci_hcd
May 18 13:57:08 casanova.evolve.inc kernel: usb 3-7: USB disconnect, device number 24
-- Reboot --
May 18 14:00:28 casanova.evolve.inc systemd-journald[233]: Time spent on flushing to /var is 625us for 0 entries.
May 18 14:00:28 casanova.evolve.inc kernel: Linux version 4.10.11-1-ARCH (builduser@tobias) (gcc version 6.3.1 20170306 (GCC) ) #1 SMP PREEMPT Tue Apr 18 08:39:42 CEST 2017
May 18 14:00:28 casanova.evolve.inc kernel: Command line: BOOT_IMAGE=/vmlinuz-linux root=/dev/mapper/arch-lvroot rw quiet intel_iommu=off resume=/dev/mapper/arch-lvswap
May 18 14:00:28 casanova.evolve.inc kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
May 18 14:00:28 casanova.evolve.inc kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
May 18 14:00:28 casanova.evolve.inc kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
May 18 14:00:28 casanova.evolve.inc kernel: x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
May 18 14:00:28 casanova.evolve.inc kernel: x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
May 18 14:00:28 casanova.evolve.inc kernel: e820: BIOS-provided physical RAM map:
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000a8aa2fff] usable
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000a8aa3000-0x00000000a8aa9fff] ACPI NVS
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000a8aaa000-0x00000000a937cfff] usable
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000a937d000-0x00000000a9616fff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000a9617000-0x00000000bc1d3fff] usable
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000bc1d4000-0x00000000bc3d9fff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000bc3da000-0x00000000bc416fff] usable
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000bc417000-0x00000000bc4befff] ACPI NVS
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000bc4bf000-0x00000000bcffefff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000bcfff000-0x00000000bcffffff] usable
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000bf800000-0x00000000cf9fffff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
May 18 14:00:28 casanova.evolve.inc kernel: BIOS-e820: [mem 0x0000000100000000-0x000000022e5fffff] usable
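
To see at a glance what was logged last before each reboot, a small filter like the following could be used. This is only a sketch: the name last_before_reboot is hypothetical, and it assumes the boot separator is printed exactly as "-- Reboot --" as in the excerpt above (other journalctl versions may print a different marker).

```shell
# Hypothetical helper: print the line that immediately precedes each
# "-- Reboot --" separator in journalctl output read from stdin.
last_before_reboot() {
    awk '/^-- Reboot --$/ { if (prev != "") print prev } { prev = $0 }'
}

# Usage:
#   journalctl -S '2017-05-18 13:45:00' -U '2017-05-18 14:01' | last_before_reboot
```

In the excerpt above, the last entry before the reboot is the USB disconnect at 13:57:08, with no shutdown messages between it and the first boot message at 14:00:28.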

I have been loading the CPUs heavily, but I am constantly monitoring the temperatures:

# sensors | tail
Package id 0:  +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:        +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:        +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:        +61.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:        +59.0°C  (high = +80.0°C, crit = +100.0°C)

radeon-pci-0100
Adapter: PCI adapter
temp1:        +61.0°C  (crit = +120.0°C, hyst = +90.0°C)

I have also limited the CPU frequency to 2.2 GHz:

# sudo cpupower frequency-info | grep -i current
  current policy: frequency should be within 800 MHz and 2.20 GHz.
  current CPU frequency: 2.20 GHz (asserted by call to hardware)

There is no core dump around the time of the reboot:

# coredumpctl | tail
Sun 2017-05-14 18:17:25 EEST   4344  1000   100   6 missing  /usr/bin/synergys
Sun 2017-05-14 22:49:11 EEST   4732  1000   100   6 missing  /usr/bin/synergys
Sun 2017-05-14 22:49:12 EEST  20061  1000   100   6 missing  /usr/bin/emacs-26.0.50
Sun 2017-05-14 22:52:14 EEST    911  1000   100   6 missing  /usr/bin/synergys
Sun 2017-05-14 22:54:03 EEST   1709  1000   100   6 missing  /usr/bin/synergys
Sun 2017-05-14 22:59:20 EEST   1012  1000   100   6 missing  /usr/bin/synergys
Sun 2017-05-14 23:01:28 EEST   1885  1000   100   6 missing  /usr/bin/synergys
Sun 2017-05-14 23:02:33 EEST   2330  1000   100   6 missing  /usr/bin/synergys
Mon 2017-05-15 11:52:36 EEST   3330  1000   100   6 missing  /usr/bin/emacs-26.0.50
Thu 2017-05-18 14:09:52 EEST    994  1000   100   6 present  /usr/bin/synergys

It looks like a sudden power loss, but there was no power outage at that time.

Any ideas are appreciated.
Thanks in advance.

Last edited by dpetrov (2017-05-18 11:50:31)

#2 2017-05-18 17:46:47

x33a
Forum Fellow
Registered: 2009-08-15
Posts: 4,587

Re: [Arch Crash] Debugging a system that reboots itself

This could be an issue with the memory or the power supply (PSU).
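
One way to look for evidence, sketched here under assumptions: hardware faults sometimes show up in the kernel log as machine check events, so the previous boot's kernel messages can be searched for them. The function name hw_error_lines and the search pattern are illustrative only. An empty result rules nothing out, since a failing PSU typically cuts power before the kernel can log anything.

```shell
# Hypothetical helper: keep kernel log lines that mention machine checks
# or hardware errors. It reads log text on stdin.
hw_error_lines() {
    grep -i -w -E 'mce|machine check|hardware error'
}

# Usage (kernel messages from the previous boot):
#   journalctl -k -b -1 | hw_error_lines
```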
