
#1 2024-05-01 09:24:57

tomzooi
Member
Registered: 2010-05-01
Posts: 26

Sudden reboot

I am experiencing sudden reboots when I start doing somewhat heavier work. I cannot pin down exactly when they happen, which suggests a power, voltage, or temperature issue.
My kernel version (uname -r): 6.6.28-1-MANJARO , I tried using the 6.8.7-1 kernal as well, but this did not help.

My hardware is an IdeaCentre 3 (type 07ACH7) with a Ryzen 7 and an AMD GPU.

lscpu:

Architecture:             x86_64
CPU op-mode(s):         32-bit, 64-bit
Address sizes:          48 bits physical, 48 bits virtual
Byte Order:             Little Endian
CPU(s):                   16
On-line CPU(s) list:    0-15
Vendor ID:                AuthenticAMD
Model name:             AMD Ryzen 7 5800H with Radeon Graphics
CPU family:           25
Model:                80
Thread(s) per core:   2
Core(s) per socket:   8
Socket(s):            1
Stepping:             0
CPU(s) scaling MHz:   21%
CPU max MHz:          4463,0000
CPU min MHz:          400,0000
BogoMIPS:             6391,91
Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16
sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs
ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt
lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm debug_swap
Virtualization features:
Virtualization:         AMD-V
Caches (sum of all):
L1d:                    256 KiB (8 instances)
L1i:                    256 KiB (8 instances)
L2:                     4 MiB (8 instances)
L3:                     16 MiB (1 instance)
NUMA:
NUMA node(s):           1
NUMA node0 CPU(s):      0-15
Vulnerabilities:
Gather data sampling:   Not affected
Itlb multihit:          Not affected
L1tf:                   Not affected
Mds:                    Not affected
Meltdown:               Not affected
Mmio stale data:        Not affected
Reg file data sampling: Not affected
Retbleed:               Not affected
Spec rstack overflow:   Vulnerable: Safe RET, no microcode
Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2:             Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Srbds:                  Not affected
Tsx async abort:        Not affected

gpu from lspci:

05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c5)



journalctl -b -1 does not show anything useful, apart from the moment I turned on my screen:

mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 83 : connected = true , enabled = true
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 83 update
m_connected: 0
m_crtc XRandRCrtc(0x556af80f4820)
CRTC: 78
MODE: 88
Connection: 0
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 83 : connected = true , enabled = true
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc  78  m_configTimestamp update 127776753  =>  127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc  79  m_configTimestamp update 18722  =>  127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc  80  m_configTimestamp update 18722  =>  127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc  81  m_configTimestamp update 18722  =>  127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc  82  m_configTimestamp update 18722  =>  127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 83 update
m_connected: 0
m_crtc XRandRCrtc(0x556af80f4820)
CRTC: 78
MODE: 88
Connection: 0
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 83 : connected = true , enabled = true
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 84 update
m_connected: 1
m_crtc QObject(0x0)
CRTC: 0
MODE: 0
Connection: 1
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 84 : connected = false , enabled = false
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 85 update
m_connected: 1
m_crtc QObject(0x0)
CRTC: 0
MODE: 0
Connection: 1
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 85 : connected = false , enabled = false
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 86 update
m_connected: 1
m_crtc QObject(0x0)
CRTC: 0
MODE: 0
Connection: 1
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 86 : connected = false , enabled = false
mei 01 08:12:22 tomideacentre kscreenlocker_greet[20668]: QMetaObject::invokeMethod: No such method ScreenLocker::AccessDeniedNetworkReply::error(QNetworkReply::NetworkError)
mei 01 08:12:22 tomideacentre kscreenlocker_greet[20668]: QMetaObject::invokeMethod: No such method ScreenLocker::AccessDeniedNetworkReply::error(QNetworkReply::NetworkError)
mei 01 08:12:22 tomideacentre kscreenlocker_greet[20668]: qt.virtualkeyboard.hunspell: Hunspell dictionary is missing for "en_US". Search paths QList("/usr/share/qt6/qtvirtualkeyboard/hunspell", "/usr/share/hunspell", "/usr/share/myspell/dicts")
mei 01 08:12:22 tomideacentre kscreenlocker_greet[20668]: file:///usr/share/plasma/look-and-feel/org.kde.breeze.desktop/contents/lockscreen/MediaControls.qml:31:13: QML QQuickImage: Blocked request.
mei 01 08:12:23 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Emitting configChanged()

I turned my screen on a couple of minutes before it restarted/crashed.

I have been running the sensors command every two seconds; this is the last output from just before the crash/reboot. I can't see anything off with the values, but do you know if these values are "normal"? I already replaced the PSU with a more powerful one, but to no avail :(

wo  1 mei 2024  8:14:52 CEST (DATE)

jc42-i2c-16-18
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1:        +37.5°C  (low  =  +0.0°C)                  ALARM (HIGH, CRIT)
(high =  +0.0°C, hyst =  +0.0°C)
(crit =  +0.0°C, hyst =  +0.0°C)

amdgpu-pci-0500
Adapter: PCI adapter
vddgfx:        1.38 V
vddnb:       918.00 mV
edge:         +44.0°C
PPT:           3.00 W

nvme-pci-0400
Adapter: PCI adapter
Composite:    +30.9°C  (low  = -273.1°C, high = +80.8°C)
(crit = +84.8°C)
Sensor 1:     +30.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +37.9°C  (low  = -273.1°C, high = +65261.8°C)

jc42-i2c-16-19
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1:        +37.8°C  (low  =  +0.0°C)                  ALARM (HIGH, CRIT)
(high =  +0.0°C, hyst =  +0.0°C)
(crit =  +0.0°C, hyst =  +0.0°C)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +50.5°C

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:      956.00 mV
fan1:        2230 RPM  (min = 1800 RPM, max = 6900 RPM)
edge:         +63.0°C  (crit = +97.0°C, hyst = -273.1°C)
PPT:           9.07 W  (cap =  35.00 W)
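
For reference, the two-second sensor logging described above could be done roughly like this. It is only a sketch: the function name log_snapshot and the log path in the usage comment are made up for illustration, and it assumes lm_sensors provides the sensors command. Each snapshot is appended to a file and followed by sync, so the last reading has a better chance of reaching the disk before a hard reset.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamped snapshot of a command's output
# to a log file, then flush filesystem buffers to disk.
log_snapshot() {
    logfile=$1
    shift
    {
        date '+%Y-%m-%d %H:%M:%S %Z'   # timestamp line
        "$@" 2>&1                      # output of the command being logged
        echo                           # blank line between snapshots
    } >> "$logfile"
    sync
}

# Usage (assumed path, assumes lm_sensors is installed):
# while true; do log_snapshot "$HOME/sensors.log" sensors; sleep 2; done
```

After a reboot, the end of the log file (for example with tail) shows the last snapshot that was written before the crash.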

Last edited by tomzooi (2024-05-01 09:27:27)


#2 2024-05-01 14:00:40

seth
Member
Registered: 2012-09-03
Posts: 52,276

Re: Sudden reboot

My kernel version (uname -r): 6.6.28-1-MANJARO , I tried using the 6.8.7-1 kernal as well, but this did not help.

https://bbs.archlinux.org/misc.php?action=rules
Also there's no "a" in "kernel".

https://wiki.archlinux.org/title/Ryzen#Random_reboots


#3 2024-05-01 20:04:36

tomzooi
Member
Registered: 2010-05-01
Posts: 26

Re: Sudden reboot

Apologies for my ignorance of the rules.

The random reboot page does seem to provide some insight though.
Looking at the kernel messages, I see the following warning, which seems to point at the GPU more than anything else, I guess?

mei 01 08:12:21.656980 tomideacentre kernel: ------------[ cut here ]------------
mei 01 08:12:21.657071 tomideacentre kernel: WARNING: CPU: 15 PID: 209 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:102 set_reg_field_values.isra.0+0xca/0xe0 [amdgpu]
mei 01 08:12:21.657089 tomideacentre kernel: Modules linked in: ufs hfsplus hfs cdrom minix msdos jfs nls_ucs2_utils xfs ext4 mbcache jbd2 snd_seq_dummy rfcomm snd_hrtimer snd_seq qrtr uhid cmac algif_hash algif_skcipher af_alg bnep snd_ctl_led jc42 intel_rapl_msr vfat intel_rapl>
mei 01 08:12:21.657157 tomideacentre kernel:  snd_hda_core btmtk snd_rn_pci_acp3x libarc4 cryptd snd_seq_device mc snd_hwdep think_lmi snd_acp_config sp5100_tco r8169 bluetooth rapl snd_pcm firmware_attributes_class wmi_bmof snd_soc_acpi ucsi_acpi pcspkr ecdh_generic cfg80211 rea>
mei 01 08:12:21.657185 tomideacentre kernel: CPU: 15 PID: 209 Comm: kworker/15:1H Not tainted 6.6.28-1-MANJARO #1 c0084fcd95e09efa3b38fe1f92ed891109adf4bb
mei 01 08:12:21.657200 tomideacentre kernel: Hardware name: LENOVO 90U9003RMH/376D, BIOS M4MKT17A 12/01/2023
mei 01 08:12:21.657215 tomideacentre kernel: Workqueue: events_highpri dm_irq_work_func [amdgpu]
mei 01 08:12:21.660231 tomideacentre kernel: RIP: 0010:set_reg_field_values.isra.0+0xca/0xe0 [amdgpu]
mei 01 08:12:21.660265 tomideacentre kernel: Code: 51 08 8b 08 48 8d 42 08 49 89 41 08 44 8b 02 48 8d 50 08 0f b6 c9 49 89 51 08 8b 00 45 85 c0 75 b3 0f 0b eb af e9 01 dc 7b fc <0f> 0b e9 3e ff ff ff 49 8b 51 08 eb cd 49 8b 41 08 eb d2 0f 1f 00
mei 01 08:12:21.660283 tomideacentre kernel: RSP: 0018:ffffc90006347b08 EFLAGS: 00010246
mei 01 08:12:21.660306 tomideacentre kernel: RAX: 0000000000000000 RBX: ffff88810c0c8480 RCX: 0000000000000000
mei 01 08:12:21.660330 tomideacentre kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffc90006347b10
mei 01 08:12:21.660348 tomideacentre kernel: RBP: ffffc90006347b88 R08: 0000000000000064 R09: ffffc90006347b18
mei 01 08:12:21.660364 tomideacentre kernel: R10: 0000000000000001 R11: 0000000000000100 R12: ffff88810a9152c0
mei 01 08:12:21.660381 tomideacentre kernel: R13: 0000000000000000 R14: ffff88810a9152c0 R15: 0000000000000000
mei 01 08:12:21.660399 tomideacentre kernel: FS:  0000000000000000(0000) GS:ffff88880e9c0000(0000) knlGS:0000000000000000
mei 01 08:12:21.660418 tomideacentre kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
mei 01 08:12:21.660434 tomideacentre kernel: CR2: 0000558a44205000 CR3: 000000010af7c000 CR4: 0000000000f50ee0
mei 01 08:12:21.660453 tomideacentre kernel: PKRU: 55555554
mei 01 08:12:21.660471 tomideacentre kernel: Call Trace:
mei 01 08:12:21.660494 tomideacentre kernel:  <TASK>
mei 01 08:12:21.660511 tomideacentre kernel:  ? set_reg_field_values.isra.0+0xca/0xe0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660527 tomideacentre kernel:  ? __warn+0x81/0x130
mei 01 08:12:21.660549 tomideacentre kernel:  ? set_reg_field_values.isra.0+0xca/0xe0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660562 tomideacentre kernel:  ? report_bug+0x171/0x1a0
mei 01 08:12:21.660578 tomideacentre kernel:  ? handle_bug+0x3c/0x80
mei 01 08:12:21.660595 tomideacentre kernel:  ? exc_invalid_op+0x17/0x70
mei 01 08:12:21.660611 tomideacentre kernel:  ? asm_exc_invalid_op+0x1a/0x20
mei 01 08:12:21.660628 tomideacentre kernel:  ? set_reg_field_values.isra.0+0xca/0xe0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660640 tomideacentre kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660660 tomideacentre kernel:  generic_reg_update_ex+0x74/0x1e0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660677 tomideacentre kernel:  ? dm_read_reg_func+0x3b/0xb0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660694 tomideacentre kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660706 tomideacentre kernel:  ? generic_reg_get2+0x26/0x50 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660723 tomideacentre kernel:  dce_aux_configure_timeout+0x102/0x220 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660745 tomideacentre kernel:  try_to_configure_aux_timeout+0x7f/0xe0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660762 tomideacentre kernel:  retrieve_link_cap+0x75/0xb90 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660779 tomideacentre kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660791 tomideacentre kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660804 tomideacentre kernel:  ? dp_is_sink_present+0xbc/0x120 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660825 tomideacentre kernel:  detect_link_and_local_sink+0xaee/0xf80 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660844 tomideacentre kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660857 tomideacentre kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660874 tomideacentre kernel:  link_detect+0x3a/0x480 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660896 tomideacentre kernel:  ? query_hpd_status+0x6e/0xa0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660909 tomideacentre kernel:  handle_hpd_irq_helper+0xf9/0x170 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660931 tomideacentre kernel:  process_one_work+0x174/0x340
mei 01 08:12:21.660944 tomideacentre kernel:  worker_thread+0x27b/0x3a0
mei 01 08:12:21.660965 tomideacentre kernel:  ? __pfx_worker_thread+0x10/0x10
mei 01 08:12:21.660975 tomideacentre kernel:  kthread+0xe8/0x120
mei 01 08:12:21.660989 tomideacentre kernel:  ? __pfx_kthread+0x10/0x10
mei 01 08:12:21.661007 tomideacentre kernel:  ret_from_fork+0x34/0x50
mei 01 08:12:21.661017 tomideacentre kernel:  ? __pfx_kthread+0x10/0x10
mei 01 08:12:21.661027 tomideacentre kernel:  ret_from_fork_asm+0x1b/0x30
mei 01 08:12:21.661040 tomideacentre kernel:  </TASK>
mei 01 08:12:21.661050 tomideacentre kernel: ---[ end trace 0000000000000000 ]---

But that is before I turned on my screen.
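
To find reports like the one above without paging through thousands of journal lines, a small filter along these lines might help. This is a sketch only: kernel_reports is a made-up name, and the pattern only matches the first line of a kernel WARNING, BUG or Oops report, so other kinds of kernel errors would need their own patterns.

```shell
#!/bin/sh
# Hypothetical filter: read journal text on stdin and keep only the kernel
# lines that open a WARNING, BUG or Oops report.
kernel_reports() {
    grep -E 'kernel: (WARNING|BUG|Oops)[: ]'
}

# Usage (kernel messages of the previous boot, with precise timestamps):
# journalctl -k -b -1 -o short-precise | kernel_reports
```

The timestamps of the matching lines can then be compared with the time of the reboot.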

I am now trying to raise the voltage using https://forum.level1techs.com/t/overclo … nux/126025
since there is no BIOS setting for this on this machine (the BIOS is VERY limited, unfortunately). Let's see if that does anything.

