
#1 2022-09-30 05:11:52

toydotgame
Member
Registered: 2022-09-30
Posts: 9

[SOLVED] Unexpected shutdowns, systemd journal shows SW CTF

(This is my first post on these forums; please tell me if there's a better subforum for it.)

When playing games, obviously my GPU and CPU heat up; both of them can get as high as 85°C. My CPU fan is decent enough that it stays below 80°C during most intensive work, but my GPU runs a lot hotter (above 80°C during gaming or hashing/similarly-intensive GPU work).

Sometimes however, during gaming (which is the only thing I do often that pushes my GPU that far), my computer will shut down spontaneously. Like, straight to black screen (the screen doesn't freeze or anything) and then I get the systemd shutdown messages and after a few seconds my computer is completely turned off. I investigated the problem by looking into my systemd journal at the time of my most recent shutdown (in this case September 28th at 2:59 pm):

$ journalctl -b -1 | tail -n 500 | grep "Sep 28 14:59:"

…which gave me a lot of information (see this Pastebin dump), but most importantly it showed some lines at the top confirming that the shutdown was triggered by a software critical temperature fault (SW CTF):

Sep 28 14:59:43 archtwo kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: GPU over temperature range(SW CTF) detected!
Sep 28 14:59:43 archtwo kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: System is going to shutdown due to GPU SW CTF!
Sep 28 14:59:43 archtwo systemd-logind[539]: System is powering down.
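
A narrower filter can make these lines easier to find than grepping by timestamp. This is a minimal sketch, assuming the previous boot is the one that ended in the shutdown; ctf_lines is a made-up helper name, and the pattern simply keeps amdgpu lines that mention CTF:

```shell
# Keep only amdgpu lines that mention CTF from journal text on stdin.
# ctf_lines is a hypothetical helper, not part of journalctl or amdgpu.
ctf_lines() {
    grep 'amdgpu.*CTF'
}

# Typical use on the affected machine (kernel messages of the previous boot):
#   journalctl -k -b -1 --no-pager | ctf_lines
```

Restricting the journal to kernel messages with -k also drops the unrelated userspace noise that surrounds the shutdown.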

I've looked around online a bit and haven't found much on how to raise this critical temperature value. The closest I got was this GitLab issue about what seems to be a similar problem, but it suggests kernel patches, which I'm not familiar with and don't know how to read or apply. I'd also like to know whether there's a config file somewhere that I can edit instead, or whether CTF auto-shutdowns are a hardcoded kernel-level thing that requires patching. (If so, could I get some explanation of how to do that? Thanks!)

Last edited by toydotgame (2022-10-09 04:13:10)


Computer Specs: archtwo on PCPartPicker
AMD Ryzen 7 3700X CPU
AMD Radeon RX 570 4G GPU
32 GB (2x16) DDR4-3200 RAM


#2 2022-09-30 06:44:32

seth
Member
From: Won't reply 2 private help req
Registered: 2012-09-03
Posts: 76,479


toydotgame wrote:

how to raise this critical temperature value

How about keeping the GPU cooler?
https://wiki.archlinux.org/title/Fan_sp … an_control


#3 2022-09-30 08:46:14

toydotgame

Yeah, I changed my GPU fan curve in CoreCtrl (what I use for controlling my fans and GPU clock) to be a bit more aggressive at higher temperatures, so I should run cooler while gaming. Nevertheless, 85°C is a perfectly normal operating temperature for my GPU, so I want to raise the critical temperature anyway, because it's inevitable that I will still reach 85°C over a more extended period of intensive use.



#4 2022-09-30 10:48:06

Lone_Wolf
Administrator
From: Netherlands, Europe
Registered: 2005-10-04
Posts: 15,167


Is this the archtwo mentioned in your signature?

If so, the GPU temperatures appear to have gone up dramatically since you built it.

GPU Core Clock Rate: 1.244 GHz
GPU Effective Memory Clock Rate: 1.75 GHz
GPU Temperature While Idle: 27.0°C
GPU Temperature Under Load: 54.0°C

What is the current idle temperature for the GPU?
Have you checked whether all fans are working correctly and airflow is not obstructed?


Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.

clean chroot building not flexible enough ?
Try clean chroot manager by graysky


#5 2022-09-30 11:02:09

toydotgame

Yeah, it's the same one. This is where it idles with just Firefox open during normal web browsing (the unlabelled value on the left is the GPU fan speed in rpm):
[Screenshot: SdQd9Ww.png]

Those values were from my first boot of the computer and I didn't really take the time to let the temperatures even out after I had run some more strenuous tasks. I've updated the PCPartPicker post to new values that are a bit closer to the temperatures I've experienced.



#6 2022-09-30 11:15:18

seth

What makes you believe the critical temperature is 85°C?
What's the output of "sensors"?
Otherwise, /sys/class/hwmon/ should have the GPU, and there the temp*_crit* files - what do those actually say?
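
Since the hwmon number can differ between machines and boots, one way to find the right directory is to match on the device name. A rough sketch, assuming the GPU's hwmon "name" file reads amdgpu; show_gpu_crit is a hypothetical helper, and the root directory is a parameter only so it can be pointed at a test tree:

```shell
# Print every critical-temperature file of hwmon devices named "amdgpu".
# show_gpu_crit is a hypothetical helper; $1 defaults to the real sysfs path.
show_gpu_crit() {
    root="${1:-/sys/class/hwmon}"
    for dev in "$root"/hwmon*; do
        [ -r "$dev/name" ] || continue
        [ "$(cat "$dev/name")" = "amdgpu" ] || continue
        for f in "$dev"/temp*_crit*; do
            [ -r "$f" ] || continue
            # hwmon temperatures are reported in millidegrees Celsius.
            printf '%s: %s\n' "$f" "$(cat "$f")"
        done
    done
}

show_gpu_crit
```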


#7 2022-09-30 11:25:39

toydotgame

$ sensors
amdgpu-pci-0400
Adapter: PCI adapter
vddgfx:      900.00 mV 
fan1:         653 RPM  (min =    0 RPM, max = 4500 RPM)
edge:         +51.0°C  (crit = +94.0°C, hyst = -273.1°C)
PPT:          22.03 W  (cap = 120.00 W)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +16.8°C  (crit = +20.8°C)

zenpower-pci-00c3
Adapter: PCI adapter
SVI2_Core:     1.43 V  
SVI2_SoC:      1.09 V  
Tdie:         +58.0°C  (high = +95.0°C)
Tctl:         +58.0°C  
Tccd1:        +59.2°C  
SVI2_P_Core:  16.04 W  
SVI2_P_SoC:    6.76 W  
SVI2_C_Core:  11.20 A  
SVI2_C_SoC:    6.18 A 
$ cat /sys/class/hwmon/hwmon2/temp1_crit
94000

…which means that the GPU's edge crit is 94°C, right? But I've never seen it get remotely that high before shutdown. It's unpredictable and hard to reproduce the shutdown most of the time, but in practice I've never seen the GPU temp go above the mid-80s °C range.
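
The sysfs value is in millidegrees Celsius, which is why 94000 lines up with the crit = +94.0°C that sensors prints. A small sketch of the conversion (millideg_to_c is a hypothetical helper name):

```shell
# Convert a hwmon millidegree value to degrees Celsius with one decimal place.
# LC_ALL=C keeps the decimal separator a dot regardless of the system locale.
millideg_to_c() {
    LC_ALL=C awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

millideg_to_c 94000   # prints 94.0
```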



#8 2022-09-30 11:33:01

seth

toydotgame wrote:

the GPU's edge crit is 94°C, right?

Yup.
Given Lone_Wolf's findings in #4 I'd say there's some issue w/ the temperature management and the GPU temp spirals out of control despite the fan activity.
Maybe the heatsink got loose or there's a leak in the heatpipe (if any)?


#9 2022-09-30 11:36:51

toydotgame

I compressed-air'd the heatsink and re-pasted the die a couple months ago and thermals seem to respond to the fan speed quite fine. Those temperatures in #4 are old ones, and the ones I'm currently experiencing are the average ones I've had for the majority of my time using this build.



#10 2022-09-30 11:58:56

seth

Monitor the temps *very* closely and stress the GPU a bit, see where it goes…
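
One possible way to do the monitoring is to log a timestamped sample every second, so the last readings before a shutdown can be checked afterwards. This is only a sketch: log_sample is a hypothetical helper, temp1_input is assumed to be the edge sensor, and the hwmon2 path is taken from earlier in the thread and may differ on another boot:

```shell
# Log one timestamped sample of GPU edge temperature and fan speed.
# $1 is the GPU's hwmon directory; log_sample is a hypothetical helper.
log_sample() {
    dev="$1"
    temp=$(cat "$dev/temp1_input")   # millidegrees Celsius
    fan=$(cat "$dev/fan1_input")     # RPM
    printf '%s temp=%d.%d°C fan=%sRPM\n' "$(date +%T)" \
        "$(( temp / 1000 ))" "$(( (temp % 1000) / 100 ))" "$fan"
}

# On the real machine, sample once per second and keep a copy on disk:
#   while sleep 1; do log_sample /sys/class/hwmon/hwmon2; done | tee gpu-temps.log
```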


#11 2022-10-02 08:45:35

toydotgame

I ran six games at once at max settings with Vsync off and couldn't get the temps above 74°C on the GPU and 82°C on the CPU. Fan rpm peaked at around 3200. I sat in this state for 20-25 minutes or so monitoring it, and the temperatures settled in the mid-70s °C range rather than continuing to rise.
If I try this hard to reproduce it with six games maxed out and can't get temps that high, yet one game at medium-high settings can cause a CTF shutdown, then I'm not so sure it was actually high temps. I wasn't watching the temps when these shutdowns occurred (they were too sporadic and unpredictable), so overall I'm pretty uncertain now whether my temperatures were actually running away and getting super hot or whether there was a false trigger. (And in the latter case, I have no idea how to diagnose that.)



#12 2022-10-02 10:17:43

Lone_Wolf

Back to basic troubleshooting then. Please change the title to something like "unexpected shutdowns, log shows SW CTF".

Were the shutdowns with 1 specific game/app or with multiple games ?

Please post the full journal (run as root) of the most recent shutdown as well as the output of lspci -k (run this one as user).
Are you using X or wayland ?



#13 2022-10-02 12:40:22

seth

We still have

Sep 28 14:59:43 archtwo kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: GPU over temperature range(SW CTF) detected!
Sep 28 14:59:43 archtwo kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: System is going to shutdown due to GPU SW CTF!
Sep 28 14:59:43 archtwo systemd-logind[539]: System is powering down.

Do these shutdowns currently still happen, can you reproduce them w/ "one game at medium-high settings", and does that trigger the aforementioned journal entries?


#14 2022-10-03 04:40:18

toydotgame

Lone_Wolf wrote:

Were the shutdowns with 1 specific game/app or with multiple games ?

In my experience it's only been with Minecraft. But I play Minecraft notably more than my other games, so it could just be that a shutdown is much more likely to strike while Minecraft is open than while another game is.
I've also checked my Minecraft logs and nothing seems out of the ordinary until it reports that it has received an external kill signal and Java shuts down.

Lone_Wolf wrote:

Please post the full journal (run as root) of the most recent shutdown as well as the output of lspci -k (run this one as user).

lspci-k
sudo journalctl -b -1

Lone_Wolf wrote:

Are you using X or wayland ?

I'm using X.



seth wrote:

Do these shutdowns currently still happen, can you reproduce them w/ "one game at medium-high settings", and does that trigger the aforementioned journal entries?

I can't reproduce them on demand, and they happen around once a month, so it's hard to catch details before it happens. And yes, as far as I know and have tested, these shutdowns are what produce these journal entries.

Last edited by toydotgame (2022-10-16 04:25:27)



#15 2022-10-03 07:28:20

seth

Sep 28 14:51:58 archtwo rtkit-daemon[961]: Supervising 8 threads of 4 processes of 1 users.
Sep 28 14:59:43 archtwo kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: GPU over temperature range(SW CTF) detected!
Sep 28 14:59:43 archtwo kernel: amdgpu 0000:04:00.0: amdgpu: ERROR: System is going to shutdown due to GPU SW CTF!
Sep 28 14:59:43 archtwo systemd-logind[539]: System is powering down.

Completely out of the blue and the journal is excessively long but has no other amdgpu messages.
Do you overclock the GPU?
https://wiki.archlinux.org/title/AMDGPU#Overclocking
What if you undervolt it?


#16 2022-10-03 21:15:16

toydotgame

Currently the GPU is on an unmodified clock. Performance governors (or whatever their GPU equivalent name is) are set to automatic and I have nothing configured anywhere to modify clock speeds or voltages. I'm not an experienced overclocker and have no need to overclock any part of my system as it performs well enough for my use case as it is.

seth wrote:

What if you undervolt it?

I'm not quite sure what values I would need to do that because I've never done it before. For reference, this is the menu that the Advanced performance mode shows for my GPU:
[Screenshot: VQMtlNP.png]



#17 2022-10-04 06:24:43

seth

Just lower the values for State 6/7 to e.g. 1200 each and maybe the RAM state 2 to 1500.
You could alternatively try to lower the power limit and see whether that takes control over the frequencies as well.
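
For the power limit route, amdgpu exposes the cap through hwmon as power1_cap, in microwatts. A hedged sketch of the arithmetic follows; watts_to_uw is a hypothetical helper, 100 W is an arbitrary example below the 120 W cap that sensors reported, and hwmon2 is the path used earlier in the thread. It only prints the command rather than running it:

```shell
# Convert a power limit in whole watts to the microwatt value power1_cap expects.
# watts_to_uw is a hypothetical helper name.
watts_to_uw() {
    echo "$(( $1 * 1000000 ))"
}

# Dry run: print the command instead of executing it.
# The allowed range can be read from power1_cap_min and power1_cap_max.
echo "echo $(watts_to_uw 100) | sudo tee /sys/class/hwmon/hwmon2/power1_cap"
```

A value written this way does not persist across reboots, so it is easy to back out if it makes no difference.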

But: if this has been a temporary problem and you're currently not experiencing any shutdowns it might also have been an environmental problem (eg. if the environment temperature is HOT™ the system will have a harder time to stay COOL™)


#18 2022-10-09 04:12:55

toydotgame

Thanks, I've tried those voltages and everything seems normal after stressing it under normal gaming conditions for a while (aka just gaming on it normally when I felt like it and observing temperatures and clocks to make sure everything's normal).

Marking the thread as solved for now.

