
#1 2025-02-24 11:58:22

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

System hard-locks when using chromium with no debug info to go on

I am intermittently experiencing hard locks when I use chromium. Upon rebooting, there is nothing in journalctl near the time stamp of the crash, suggesting that the hard-lock is severe enough to break that too. I am looking for suggestions to get some debug data. I recently became aware of kernel sysrq, so I have that enabled now. Beyond that, any suggestions are welcome.

Symptoms include:
Frozen cursor/screen
Machine will not respond to a ping from another machine
Keyboard dead/will not respond to Ctrl+Alt+F3, for example

Should mention that the system completed 7+ memtest86+ cycles so I do not believe bad RAM is to blame.
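
For anyone following along, here is a minimal sketch for checking which SysRq function groups the current kernel.sysrq value permits. The bit meanings are the ones listed in the kernel's sysrq documentation; the helper name sysrq_decode is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Decode a kernel.sysrq value into the SysRq function groups it permits.
# sysrq_decode is a hypothetical helper name; bit values follow the
# kernel's admin-guide sysrq documentation.
sysrq_decode() {
    val=$1
    case $val in
        0) echo "disabled"; return ;;
        1) echo "all functions enabled"; return ;;
    esac
    for pair in 2:loglevel 4:keyboard 8:debug-dumps 16:sync \
                32:remount-ro 64:signal 128:reboot 256:rt-nice; do
        bit=${pair%%:*}
        name=${pair#*:}
        # Print the group name when its bit is set in the mask.
        [ $(( val & bit )) -ne 0 ] && echo "$name"
    done
    return 0
}
```

It could be called as: sysrq_decode "$(cat /proc/sys/kernel/sysrq)". If the keyboard is truly dead during the lock-up, SysRq key combinations may not get through either, so this only confirms the setting, not that it will help.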

Last edited by graysky (2025-02-24 12:00:21)

Offline

#2 2025-02-24 14:26:26

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,899

Re: System hard-locks when using chromium with no debug info to go on

How "intermittently"?
Can you somehow trigger/force this or is it completely random?

I do not believe bad RAM is to blame.

How about "low RAM"?
Browsers tend to be resource hogs and depending on your swap configuration you might start thrashing (the kernel isn't unresponsive, just super-busy)?

Machine will not respond to a ping from another machine

i.e. not just the framebuffer frozen, hmm

Is this btw. on the ck kernel or have you also reproduced this on the main one?
Recent development or recent detection (because you've not used chromium before)?

Offline

#3 2025-02-24 16:40:06

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

Re: System hard-locks when using chromium with no debug info to go on

Not due to low RAM: 96 G on this box and, recently, nothing else is running on this machine. I also left it sitting for 10 minutes or so with no change/still locked. I am using the repo kernel. Unfortunately, I have not been able to update CK's patch set for some time now. This is a recent development as this is a new machine. I have no other stability issues with it.

Offline

#4 2025-02-24 21:20:45

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,899

Re: System hard-locks when using chromium with no debug info to go on

seth wrote:

Can you somehow trigger/force this or is it completely random?

Not due to low RAM, 96 G on this box

You'll have to keep an eye on it, since a rogue process can allocate several GB of RAM in virtually no time.
But it seems unlikely that you're hitting some massive memory leak and nobody else does.
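
One way to keep an eye on it across a lock-up is to log available memory to disk at short intervals and read the tail of the log after the reboot. This is only a sketch: the helper names mem_available_kb and log_mem_sample and the log location are hypothetical.

```shell
#!/bin/sh
# Print the MemAvailable figure (in kB) from a meminfo-style file.
# mem_available_kb is a hypothetical helper; the file defaults to /proc/meminfo.
mem_available_kb() {
    awk '$1 == "MemAvailable:" { print $2 }' "${1:-/proc/meminfo}"
}

# Append a timestamped sample to a log file and flush it to disk, so that
# the last samples written before a lock-up survive the reboot.
log_mem_sample() {
    logfile=$1
    printf '%s %s\n' "$(date +%s)" \
        "$(mem_available_kb "${2:-/proc/meminfo}")" >> "$logfile"
    # Flush just this file where sync supports it, else flush everything.
    sync "$logfile" 2>/dev/null || sync
}
```

Run in a loop, e.g. while :; do log_mem_sample "$HOME/memlog.txt"; sleep 5; done, the last lines of the log would show whether memory was collapsing right before the freeze.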

* lscpu?
* do you get this w/o GPU use "--disable-gpu"?
* wayland (xwayland?) or X11?

Offline

#5 2025-02-24 22:39:40

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

Re: System hard-locks when using chromium with no debug info to go on

Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        48 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               32
On-line CPU(s) list:                  0-31
Vendor ID:                            AuthenticAMD
Model name:                           AMD Ryzen 9 9950X 16-Core Processor
CPU family:                           26
Model:                                68
Thread(s) per core:                   2
Core(s) per socket:                   16
Socket(s):                            1
Stepping:                             0
Frequency boost:                      enabled
CPU(s) scaling MHz:                   60%
CPU max MHz:                          5752.0000
CPU min MHz:                          600.0000
BogoMIPS:                             8599.99
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx_vnni avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid bus_lock_detect movdiri movdir64b overflow_recov succor smca fsrm avx512_vp2intersect flush_l1d amd_lbr_pmc_freeze
Virtualization:                       AMD-V
L1d cache:                            768 KiB (16 instances)
L1i cache:                            512 KiB (16 instances)
L2 cache:                             16 MiB (16 instances)
L3 cache:                             64 MiB (2 instances)
NUMA node(s):                         1
NUMA node0 CPU(s):                    0-31
Vulnerability Gather data sampling:   Not affected
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Not affected
Vulnerability Spec rstack overflow:   Not affected
Vulnerability Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:             Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

* I will add the --disable-gpu to ~/.config/chromium-flags.conf
* X11
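
A small sketch of adding the flag without duplicating it on repeated runs. The helper name add_chromium_flag is hypothetical, and the default path assumes the launcher reads chromium-flags.conf from the user's config directory, as mentioned above.

```shell
#!/bin/sh
# Append a flag to a chromium-flags.conf style file unless it is already there.
# add_chromium_flag is a hypothetical helper name.
add_chromium_flag() {
    flag=$1
    conf=${2:-${XDG_CONFIG_HOME:-$HOME/.config}/chromium-flags.conf}
    mkdir -p "$(dirname "$conf")"
    touch "$conf"
    # -x: whole-line match, -F: fixed string; "--" stops option parsing
    # because the flag itself starts with dashes.
    grep -qxF -- "$flag" "$conf" || printf '%s\n' "$flag" >> "$conf"
}
```

Usage would be: add_chromium_flag --disable-gpu. Afterwards, chrome://gpu should show whether hardware acceleration is really off.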

Offline

#6 2025-02-24 22:56:17

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,899

Re: System hard-locks when using chromium with no debug info to go on

I assume you came across https://wiki.archlinux.org/title/Ryzen#Troubleshooting ?
Ignore the "Ryzen 5000" series label there; this is a constant problem.

Offline

#7 2025-02-25 20:25:36

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

Re: System hard-locks when using chromium with no debug info to go on

I did see that. My old system was a 5950X based one and it was rock solid. I set this one up the same way PBO-wise and it has been stable to all loads I've thrown at it with the exception of this chromium thing.

Offline

#8 2025-02-26 09:12:41

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 69,899

Re: System hard-locks when using chromium with no debug info to go on

seth wrote:
seth wrote:

How "intermittently"?
Can you somehow trigger/force this or is it completely random?

Offline

#9 2025-02-26 11:16:46

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

Re: System hard-locks when using chromium with no debug info to go on

Maybe once or twice per week. I do not know how to trigger it beyond loading/using chromium

Offline

#10 2025-02-26 13:08:45

agapito
Member
From: Who cares.
Registered: 2008-11-13
Posts: 701

Re: System hard-locks when using chromium with no debug info to go on

Even without overclocking, Zen 3/4/5 processors can throw errors, and the only way to find them is to use CoreCycler on Windows. To fix them you will need Curve Optimizer in the BIOS, although the symptoms are usually spontaneous reboots. What graphics card are you using?

If you really want to be sure that your memory is reliable, use mprime95 or stressapptest while a graphics application running in the background squeezes your graphics card to 100% for at least 6 hours.


Excuse my poor English.

Offline

#11 2025-02-26 13:32:09

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

Re: System hard-locks when using chromium with no debug info to go on

The card is an old Radeon RX 560D. I have run quite a few high-stress workloads without any MCE showing up in journalctl. I suspect that, if it is related to PBO settings, I need to trigger it using low-stress stuff, but it has completed tons of cycles of those too without any problems. Reference: https://wiki.archlinux.org/title/Stress … ting_tasks
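
For reference, a sketch of how the kernel log of the previous boot can be filtered for machine-check reports. The helper name hw_error_lines is hypothetical and the pattern is only a heuristic, so an empty result does not prove that no hardware error occurred, especially if the lock-up prevented the log from being written.

```shell
#!/bin/sh
# Filter kernel log text on stdin for lines that typically accompany
# machine-check or other hardware error reports.
# hw_error_lines is a hypothetical helper; the pattern is a heuristic.
hw_error_lines() {
    # "|| true" keeps the exit status zero when nothing matches.
    grep -iE 'mce:|machine check|hardware error' || true
}
```

Typical use: journalctl -k -b -1 --no-pager | hw_error_lines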

Last edited by graysky (2025-02-26 13:32:45)

Offline

#12 2025-02-26 14:50:42

agapito
Member
From: Who cares.
Registered: 2008-11-13
Posts: 701

Re: System hard-locks when using chromium with no debug info to go on

On Linux there is no tool like CoreCycler to detect errors during transient loads, which are the main cause of instability on Ryzen processors, although this may not be your case if you do not suffer from spontaneous reboots.

By the way, I have looked at that wiki article and it gives incorrect information. CPU 21 is Core 10 on Windows but not on Linux. On Linux, a CPU with 12 physical cores is laid out like this:

Core 0 = CPU 0 + CPU 12
Core 1 = CPU 1 + CPU 13
Core 2 = CPU 2 + CPU 14
Core 3 = CPU 3 + CPU 15
Core 4 = CPU 4 + CPU 16
Core 5 = CPU 5 + CPU 17
Core 6 = CPU 6 + CPU 18
Core 7 = CPU 7 + CPU 19
Core 8 = CPU 8 + CPU 20
Core 9 = CPU 9 + CPU 21
Core 10 = CPU 10 + CPU 22
Core 11 = CPU 11 + CPU 23

Last edited by agapito (2025-02-26 15:06:57)


Excuse my poor English.

Offline

#13 2025-02-26 15:27:54

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

Re: System hard-locks when using chromium with no debug info to go on

You can use taskset to stress a given core.  See this script I use for light workload testing.
As to numbering, is that the output of lstopo?
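
A rough sketch of the taskset idea, not the script referred to above: it pins an arbitrary command to one logical CPU at a time. The helper name pin_each_cpu and the DRY_RUN switch are hypothetical. Note that it only places load on a CPU; it does not verify that the computation came out right.

```shell
#!/bin/sh
# Run a command pinned to each listed logical CPU in turn.
# pin_each_cpu is a hypothetical helper; with DRY_RUN=1 it only prints
# the taskset invocations instead of executing them.
pin_each_cpu() {
    secs=$1; shift
    cpus=$1; shift
    for cpu in $cpus; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "timeout $secs taskset -c $cpu $*"
        else
            # timeout ends the workload after $secs seconds; its non-zero
            # exit status on expiry is expected and ignored here.
            timeout "$secs" taskset -c "$cpu" "$@" || true
        fi
    done
}
```

For example, pin_each_cpu 60 "2 18" sh -c 'while :; do :; done' would busy-loop for a minute on CPU 2 and then on CPU 18.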

Last edited by graysky (2025-02-26 15:31:27)

Offline

#14 2025-02-26 16:29:42

agapito
Member
From: Who cares.
Registered: 2008-11-13
Posts: 701

Re: System hard-locks when using chromium with no debug info to go on

CoreCycler wakes and sleeps cores quickly and verifies that Prime95/Y-Cruncher operations have been performed correctly. I don't think that script is as effective, although I have not tried it.

In 16-core processors like the one you have or the one I have (5950X), CPU 21 corresponds to core 5. In 12-core processors, CPU 21 is Core 9. Check it yourself:

cat /sys/devices/system/cpu/cpu21/topology/core_id
5
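
The same check can be done for every logical CPU at once. A minimal sketch, assuming the usual sysfs topology files are present; the helper name cpu_core_map is hypothetical, and the directory argument exists only so it can be pointed at a copy of the tree.

```shell
#!/bin/sh
# Print "cpu core_id thread_siblings" for every logical CPU found under a
# sysfs CPU directory. cpu_core_map is a hypothetical helper; the directory
# defaults to the real sysfs path and can be overridden for testing.
cpu_core_map() {
    base=${1:-/sys/devices/system/cpu}
    for d in "$base"/cpu[0-9]*; do
        # Skip entries without topology data (e.g. an unmatched glob).
        [ -r "$d/topology/core_id" ] || continue
        printf '%s %s %s\n' "${d##*/cpu}" \
            "$(cat "$d/topology/core_id")" \
            "$(cat "$d/topology/thread_siblings_list")"
    done | sort -n
}
```

Each output line is the CPU number, its core_id, and the list of CPUs sharing that core, so both threads of a core are visible on one line.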

Last edited by agapito (2025-02-26 16:30:47)


Excuse my poor English.

Offline

#15 2025-02-26 17:57:40

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

Re: System hard-locks when using chromium with no debug info to go on

It seems these are different entirely.

From sysfs:
cpu21 is core 5 and cpu 5 is core 5.

From lstopo
PU L#21 maps to core 10.

% for i in {0..31}; do echo "$i $(cat /sys/devices/system/cpu/cpu$i/topology/core_id)" ; done
1 1
3 3
5 5
7 7
9 9
11 11
13 13
15 15
17 1
19 3
21 5
23 7
25 9
27 11
29 13
31 15

Yet:

% lstopo-no-graphics
Machine (94GB total)
  Package L#0
    NUMANode L#0 (P#0 94GB)
    Die L#0 + L3 L#0 (32MB)
      L2 L#0 (1024KB) + L1d L#0 (48KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#16)
      L2 L#1 (1024KB) + L1d L#1 (48KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#1)
        PU L#3 (P#17)
      L2 L#2 (1024KB) + L1d L#2 (48KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#2)
        PU L#5 (P#18)
      L2 L#3 (1024KB) + L1d L#3 (48KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#3)
        PU L#7 (P#19)
      L2 L#4 (1024KB) + L1d L#4 (48KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#4)
        PU L#9 (P#20)
      L2 L#5 (1024KB) + L1d L#5 (48KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#5)
        PU L#11 (P#21)
      L2 L#6 (1024KB) + L1d L#6 (48KB) + L1i L#6 (32KB) + Core L#6
        PU L#12 (P#6)
        PU L#13 (P#22)
      L2 L#7 (1024KB) + L1d L#7 (48KB) + L1i L#7 (32KB) + Core L#7
        PU L#14 (P#7)
        PU L#15 (P#23)
    Die L#1 + L3 L#1 (32MB)
      L2 L#8 (1024KB) + L1d L#8 (48KB) + L1i L#8 (32KB) + Core L#8
        PU L#16 (P#8)
        PU L#17 (P#24)
      L2 L#9 (1024KB) + L1d L#9 (48KB) + L1i L#9 (32KB) + Core L#9
        PU L#18 (P#9)
        PU L#19 (P#25)
      L2 L#10 (1024KB) + L1d L#10 (48KB) + L1i L#10 (32KB) + Core L#10
        PU L#20 (P#10)
        PU L#21 (P#26)
      L2 L#11 (1024KB) + L1d L#11 (48KB) + L1i L#11 (32KB) + Core L#11
        PU L#22 (P#11)
        PU L#23 (P#27)
      L2 L#12 (1024KB) + L1d L#12 (48KB) + L1i L#12 (32KB) + Core L#12
        PU L#24 (P#12)
        PU L#25 (P#28)
      L2 L#13 (1024KB) + L1d L#13 (48KB) + L1i L#13 (32KB) + Core L#13
        PU L#26 (P#13)
        PU L#27 (P#29)
      L2 L#14 (1024KB) + L1d L#14 (48KB) + L1i L#14 (32KB) + Core L#14
        PU L#28 (P#14)
        PU L#29 (P#30)
      L2 L#15 (1024KB) + L1d L#15 (48KB) + L1i L#15 (32KB) + Core L#15
        PU L#30 (P#15)
        PU L#31 (P#31)
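
A possible reading that reconciles the two outputs: in hwloc's notation, L# is hwloc's own logical index and P# is the index used by the OS, so the kernel's cpu21 would be the PU printed as P#21 (listed under Core L#5), not PU L#21. The sketch below re-keys the lstopo text by P#. The helper name lstopo_os_map is hypothetical and it is written against the layout pasted above; the core number it prints is hwloc's logical core index, which happens to line up with core_id in the outputs posted here.

```shell
#!/bin/sh
# Read lstopo-no-graphics text on stdin and print "OS-CPU core" pairs,
# using the P# (OS) index of each PU rather than its L# index.
# lstopo_os_map is a hypothetical helper written against the text layout
# shown above; other hwloc versions may format lines differently.
lstopo_os_map() {
    awk '
        # Remember the logical index of the most recent Core line.
        match($0, /Core L#[0-9]+/) {
            core = substr($0, RSTART + 7, RLENGTH - 7)
        }
        # For each PU line, pull the number out of "(P#n)".
        /PU L#[0-9]+/ && match($0, /P#[0-9]+/) {
            print substr($0, RSTART + 2, RLENGTH - 2), core
        }
    ' | sort -n
}
```

Usage: lstopo-no-graphics | lstopo_os_map. The result can then be compared line by line with the core_id loop above.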

Last edited by graysky (2025-02-26 17:58:06)

Offline

#16 2025-02-26 23:20:07

agapito
Member
From: Who cares.
Registered: 2008-11-13
Posts: 701

Re: System hard-locks when using chromium with no debug info to go on

Honestly, I don't care what lstopo says. I am 100% sure that it is as I said; if you want, you can try it yourself. Set -30 on your best core in Curve Optimizer (you know how CPPC works, right?). Then, using taskset, pin a single-threaded task to that core's number plus 16. For example, if your best core is core 2, run a single-threaded task on CPU/thread 18 and your computer will crash. That is, if you managed to get to the desktop, because normally the cores that are marked as the best ones may even need positive values on the curve.

Last edited by agapito (2025-04-01 09:25:31)


Excuse my poor English.

Offline

#17 2025-03-01 11:04:30

graysky
Wiki Maintainer
From: :wq
Registered: 2008-12-01
Posts: 10,721
Website

Re: System hard-locks when using chromium with no debug info to go on

seth wrote:

* do you get this w/o GPU use "--disable-gpu"?

Happened again so I think this rules out the gpu setting. My PBO is set to -15 across all cores. I will drop that to -10 and see if it's stable. Pity it's so difficult to trigger this.

And yeah, the output of lstopo is confusing and does not reflect the kernel mapping.

Last edited by graysky (2025-06-23 13:57:15)

Offline

#18 2025-03-04 09:42:27

agapito
Member
From: Who cares.
Registered: 2008-11-13
Posts: 701

Re: System hard-locks when using chromium with no debug info to go on

graysky wrote:
seth wrote:

* do you get this w/o GPU use "--disable-gpu"?

Happened again so I think this rules out the gpu setting. My PBO is set to -15 across all cores. I will drop that to -10 and see if it's stable. Pity it's so difficult to trigger this.

That's not how it works. If you apply the same negative offset to all cores, you are limiting the performance of the rest of the cores, since they are working with the offset of your worst core in the curve.

You should do it per core and test them with CoreCycler, that way the instabilities will end and you will get the maximum performance out of your CPU.


Excuse my poor English.

Offline
