I am experiencing sudden reboots when I start doing somewhat heavier work. I cannot pin down exactly when it happens, which suggests a power, voltage, or temperature issue.
My kernel version (uname -r): 6.6.28-1-MANJARO , I tried using the 6.8.7-1 kernal as well, but this did not help.
My hardware is an IdeaCentre 3 (type 07ACH7) with a Ryzen 7 and an AMD GPU.
lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 7 5800H with Radeon Graphics
CPU family: 25
Model: 80
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 0
CPU(s) scaling MHz: 21%
CPU max MHz: 4463,0000
CPU min MHz: 400,0000
BogoMIPS: 6391,91
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16
sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs
ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt
lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm debug_swap
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 256 KiB (8 instances)
L1i: 256 KiB (8 instances)
L2: 4 MiB (8 instances)
L3: 16 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Vulnerable: Safe RET, no microcode
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Srbds: Not affected
Tsx async abort: Not affected
GPU from lspci:
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c5)
journalctl -b -1 does not show anything useful apart from the moment I turned on my screen:
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 83 : connected = true , enabled = true
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 83 update
m_connected: 0
m_crtc XRandRCrtc(0x556af80f4820)
CRTC: 78
MODE: 88
Connection: 0
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 83 : connected = true , enabled = true
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc 78 m_configTimestamp update 127776753 => 127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc 79 m_configTimestamp update 18722 => 127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc 80 m_configTimestamp update 18722 => 127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc 81 m_configTimestamp update 18722 => 127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandRCrtc 82 m_configTimestamp update 18722 => 127777772
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 83 update
m_connected: 0
m_crtc XRandRCrtc(0x556af80f4820)
CRTC: 78
MODE: 88
Connection: 0
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 83 : connected = true , enabled = true
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 84 update
m_connected: 1
m_crtc QObject(0x0)
CRTC: 0
MODE: 0
Connection: 1
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 84 : connected = false , enabled = false
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 85 update
m_connected: 1
m_crtc QObject(0x0)
CRTC: 0
MODE: 0
Connection: 1
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 85 : connected = false , enabled = false
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: XRandROutput 86 update
m_connected: 1
m_crtc QObject(0x0)
CRTC: 0
MODE: 0
Connection: 1
mei 01 08:12:22 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Output 86 : connected = false , enabled = false
mei 01 08:12:22 tomideacentre kscreenlocker_greet[20668]: QMetaObject::invokeMethod: No such method ScreenLocker::AccessDeniedNetworkReply::error(QNetworkReply::NetworkError)
mei 01 08:12:22 tomideacentre kscreenlocker_greet[20668]: QMetaObject::invokeMethod: No such method ScreenLocker::AccessDeniedNetworkReply::error(QNetworkReply::NetworkError)
mei 01 08:12:22 tomideacentre kscreenlocker_greet[20668]: qt.virtualkeyboard.hunspell: Hunspell dictionary is missing for "en_US". Search paths QList("/usr/share/qt6/qtvirtualkeyboard/hunspell", "/usr/share/hunspell", "/usr/share/myspell/dicts")
mei 01 08:12:22 tomideacentre kscreenlocker_greet[20668]: file:///usr/share/plasma/look-and-feel/org.kde.breeze.desktop/contents/lockscreen/MediaControls.qml:31:13: QML QQuickImage: Blocked request.
mei 01 08:12:23 tomideacentre kscreen_backend_launcher[1228]: kscreen.xrandr: Emitting configChanged()
I turned my screen on a couple of minutes before it restarted/crashed.
I have been running the sensors command every two seconds; below is the last output from just before the crash/reboot. I can't see anything off with the values, but can anyone tell me whether they are normal? I already replaced the PSU with a more powerful one, but to no avail.
wo 1 mei 2024 8:14:52 CEST (DATE)
jc42-i2c-16-18
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1: +37.5°C (low = +0.0°C) ALARM (HIGH, CRIT)
(high = +0.0°C, hyst = +0.0°C)
(crit = +0.0°C, hyst = +0.0°C)
amdgpu-pci-0500
Adapter: PCI adapter
vddgfx: 1.38 V
vddnb: 918.00 mV
edge: +44.0°C
PPT: 3.00 W
nvme-pci-0400
Adapter: PCI adapter
Composite: +30.9°C (low = -273.1°C, high = +80.8°C)
(crit = +84.8°C)
Sensor 1: +30.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +37.9°C (low = -273.1°C, high = +65261.8°C)
jc42-i2c-16-19
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1: +37.8°C (low = +0.0°C) ALARM (HIGH, CRIT)
(high = +0.0°C, hyst = +0.0°C)
(crit = +0.0°C, hyst = +0.0°C)
k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +50.5°C
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: 956.00 mV
fan1: 2230 RPM (min = 1800 RPM, max = 6900 RPM)
edge: +63.0°C (crit = +97.0°C, hyst = -273.1°C)
PPT: 9.07 W (cap = 35.00 W)
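For anyone wanting to repeat this kind of logging, here is a minimal sketch of how the two-second sensors loop could be written so that the last snapshot is more likely to survive a hard reset. The function name log_snapshot, the SENSORS_CMD variable, and the log path are assumptions made for illustration; only sensors, date, and sync are real commands.

```shell
#!/usr/bin/env bash
# Sketch only: append one timestamped sensor snapshot to a log file and
# flush it to disk, so the final reading is not lost in a sudden reset.
# SENSORS_CMD is a hypothetical override; it defaults to the real `sensors`.
SENSORS_CMD="${SENSORS_CMD:-sensors}"

log_snapshot() {
    local logfile="$1"
    {
        # Header line with the wall-clock time of this snapshot.
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # Intentionally unquoted so the variable may hold a command plus arguments.
        $SENSORS_CMD
    } >> "$logfile"
    # Ask the kernel to write buffered data out before the next iteration.
    sync
}

# Usage (runs until interrupted; the path is an example):
#   while log_snapshot "$HOME/sensors.log"; do sleep 2; done
```

After a crash, the tail of the log file then shows the readings from at most a couple of seconds before the reset.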
Last edited by tomzooi (2024-05-01 09:27:27)
My kernel version (uname -r): 6.6.28-1-MANJARO , I tried using the 6.8.7-1 kernal as well, but this did not help.
https://bbs.archlinux.org/misc.php?action=rules
Also there's no "a" in "kernel".
Apologies for my ignorance of the rules.
The random reboot page does seem to provide some insight though.
Looking at the dmesg output, I see the following, which seems to point to the GPU more than anything else, I guess?
mei 01 08:12:21.656980 tomideacentre kernel: ------------[ cut here ]------------
mei 01 08:12:21.657071 tomideacentre kernel: WARNING: CPU: 15 PID: 209 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:102 set_reg_field_values.isra.0+0xca/0xe0 [amdgpu]
mei 01 08:12:21.657089 tomideacentre kernel: Modules linked in: ufs hfsplus hfs cdrom minix msdos jfs nls_ucs2_utils xfs ext4 mbcache jbd2 snd_seq_dummy rfcomm snd_hrtimer snd_seq qrtr uhid cmac algif_hash algif_skcipher af_alg bnep snd_ctl_led jc42 intel_rapl_msr vfat intel_rapl>
mei 01 08:12:21.657157 tomideacentre kernel: snd_hda_core btmtk snd_rn_pci_acp3x libarc4 cryptd snd_seq_device mc snd_hwdep think_lmi snd_acp_config sp5100_tco r8169 bluetooth rapl snd_pcm firmware_attributes_class wmi_bmof snd_soc_acpi ucsi_acpi pcspkr ecdh_generic cfg80211 rea>
mei 01 08:12:21.657185 tomideacentre kernel: CPU: 15 PID: 209 Comm: kworker/15:1H Not tainted 6.6.28-1-MANJARO #1 c0084fcd95e09efa3b38fe1f92ed891109adf4bb
mei 01 08:12:21.657200 tomideacentre kernel: Hardware name: LENOVO 90U9003RMH/376D, BIOS M4MKT17A 12/01/2023
mei 01 08:12:21.657215 tomideacentre kernel: Workqueue: events_highpri dm_irq_work_func [amdgpu]
mei 01 08:12:21.660231 tomideacentre kernel: RIP: 0010:set_reg_field_values.isra.0+0xca/0xe0 [amdgpu]
mei 01 08:12:21.660265 tomideacentre kernel: Code: 51 08 8b 08 48 8d 42 08 49 89 41 08 44 8b 02 48 8d 50 08 0f b6 c9 49 89 51 08 8b 00 45 85 c0 75 b3 0f 0b eb af e9 01 dc 7b fc <0f> 0b e9 3e ff ff ff 49 8b 51 08 eb cd 49 8b 41 08 eb d2 0f 1f 00
mei 01 08:12:21.660283 tomideacentre kernel: RSP: 0018:ffffc90006347b08 EFLAGS: 00010246
mei 01 08:12:21.660306 tomideacentre kernel: RAX: 0000000000000000 RBX: ffff88810c0c8480 RCX: 0000000000000000
mei 01 08:12:21.660330 tomideacentre kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffc90006347b10
mei 01 08:12:21.660348 tomideacentre kernel: RBP: ffffc90006347b88 R08: 0000000000000064 R09: ffffc90006347b18
mei 01 08:12:21.660364 tomideacentre kernel: R10: 0000000000000001 R11: 0000000000000100 R12: ffff88810a9152c0
mei 01 08:12:21.660381 tomideacentre kernel: R13: 0000000000000000 R14: ffff88810a9152c0 R15: 0000000000000000
mei 01 08:12:21.660399 tomideacentre kernel: FS: 0000000000000000(0000) GS:ffff88880e9c0000(0000) knlGS:0000000000000000
mei 01 08:12:21.660418 tomideacentre kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
mei 01 08:12:21.660434 tomideacentre kernel: CR2: 0000558a44205000 CR3: 000000010af7c000 CR4: 0000000000f50ee0
mei 01 08:12:21.660453 tomideacentre kernel: PKRU: 55555554
mei 01 08:12:21.660471 tomideacentre kernel: Call Trace:
mei 01 08:12:21.660494 tomideacentre kernel: <TASK>
mei 01 08:12:21.660511 tomideacentre kernel: ? set_reg_field_values.isra.0+0xca/0xe0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660527 tomideacentre kernel: ? __warn+0x81/0x130
mei 01 08:12:21.660549 tomideacentre kernel: ? set_reg_field_values.isra.0+0xca/0xe0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660562 tomideacentre kernel: ? report_bug+0x171/0x1a0
mei 01 08:12:21.660578 tomideacentre kernel: ? handle_bug+0x3c/0x80
mei 01 08:12:21.660595 tomideacentre kernel: ? exc_invalid_op+0x17/0x70
mei 01 08:12:21.660611 tomideacentre kernel: ? asm_exc_invalid_op+0x1a/0x20
mei 01 08:12:21.660628 tomideacentre kernel: ? set_reg_field_values.isra.0+0xca/0xe0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660640 tomideacentre kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660660 tomideacentre kernel: generic_reg_update_ex+0x74/0x1e0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660677 tomideacentre kernel: ? dm_read_reg_func+0x3b/0xb0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660694 tomideacentre kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660706 tomideacentre kernel: ? generic_reg_get2+0x26/0x50 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660723 tomideacentre kernel: dce_aux_configure_timeout+0x102/0x220 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660745 tomideacentre kernel: try_to_configure_aux_timeout+0x7f/0xe0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660762 tomideacentre kernel: retrieve_link_cap+0x75/0xb90 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660779 tomideacentre kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660791 tomideacentre kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660804 tomideacentre kernel: ? dp_is_sink_present+0xbc/0x120 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660825 tomideacentre kernel: detect_link_and_local_sink+0xaee/0xf80 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660844 tomideacentre kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660857 tomideacentre kernel: ? srso_alias_return_thunk+0x5/0xfbef5
mei 01 08:12:21.660874 tomideacentre kernel: link_detect+0x3a/0x480 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660896 tomideacentre kernel: ? query_hpd_status+0x6e/0xa0 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660909 tomideacentre kernel: handle_hpd_irq_helper+0xf9/0x170 [amdgpu 56b8b9144de5645801bcac53a2298e9a5b972765]
mei 01 08:12:21.660931 tomideacentre kernel: process_one_work+0x174/0x340
mei 01 08:12:21.660944 tomideacentre kernel: worker_thread+0x27b/0x3a0
mei 01 08:12:21.660965 tomideacentre kernel: ? __pfx_worker_thread+0x10/0x10
mei 01 08:12:21.660975 tomideacentre kernel: kthread+0xe8/0x120
mei 01 08:12:21.660989 tomideacentre kernel: ? __pfx_kthread+0x10/0x10
mei 01 08:12:21.661007 tomideacentre kernel: ret_from_fork+0x34/0x50
mei 01 08:12:21.661017 tomideacentre kernel: ? __pfx_kthread+0x10/0x10
mei 01 08:12:21.661027 tomideacentre kernel: ret_from_fork_asm+0x1b/0x30
mei 01 08:12:21.661040 tomideacentre kernel: </TASK>
mei 01 08:12:21.661050 tomideacentre kernel: ---[ end trace 0000000000000000 ]---
But that is before I turned on my screen.
I am now trying to raise the voltage using https://forum.level1techs.com/t/overclo … nux/126025
since there is no BIOS setting for this on this machine (the BIOS is very limited, unfortunately). Let's see if that does anything.
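As a side note, kernel traces like the one quoted above can be pulled out of a long previous-boot log instead of paging through thousands of lines. This is a rough sketch, not a definitive tool: extract_traces is a hypothetical helper name, and it assumes each trace starts with a "cut here" line and ends with an "end trace" line, as in the log above. The journalctl flags in the usage comment (-k for kernel messages, -b -1 for the previous boot, --no-pager) are standard.

```shell
#!/usr/bin/env bash
# Sketch only: print the kernel WARNING/oops blocks found in journal text
# read on stdin, from each "cut here" line through the matching "end trace".
# extract_traces is a hypothetical helper name.
extract_traces() {
    awk '/\[ cut here \]/ { inside = 1 }   # start of a trace block
         inside           { print }        # print every line while inside
         /\[ end trace/   { inside = 0 }   # end of the block (line already printed)'
}

# Usage:
#   journalctl -k -b -1 --no-pager | extract_traces
```

If nothing is printed, the previous boot logged no such trace, which is itself a hint that the reset happened too abruptly for the kernel to report anything.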