#1 2019-06-25 05:43:27

Skunky
Member
Registered: 2018-01-25
Posts: 230

[Solved] Thermald custom config not working

Hello everyone, I would like to set a maximum temperature limit for my CPU using thermald, but it seems my config is not working properly for some reason.

man thermal-conf.xml

has some basic configuration examples, which I tried with no success.

For example, when I use

<?xml version="1.0"?>
       <ThermalConfiguration>
         <Platform>
           <Name>Overide CPU default passive</Name>
           <ProductName>*</ProductName>
           <Preference>QUIET</Preference>
           <ThermalZones>
             <ThermalZone>
               <Type>cpu</Type>
               <TripPoints>
                 <TripPoint>
                   <Temperature>60000</Temperature>
                   <type>passive</type>
                 </TripPoint>
               </TripPoints>
             </ThermalZone>
           </ThermalZones>
         </Platform>
       </ThermalConfiguration>

in my

 /etc/thermald/thermal-conf.xml 

and run

 stress -c 10 

temps get to almost 70°C, while the

 <Temperature>60000</Temperature> 

parameter should limit the temperature to 60°C.
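As a side note, the Temperature value in thermal-conf.xml is expressed in millidegrees Celsius, so 60000 means 60°C. The sketch below is only an illustration, not part of thermald: it parses a copy of the config from this post with Python's standard library and lists each trip point in degrees. The function name list_trip_points and the SAMPLE_CONFIG string are hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET

# Copy of the config from this post, with the declaration at the very start.
SAMPLE_CONFIG = """<?xml version="1.0"?>
<ThermalConfiguration>
  <Platform>
    <Name>Overide CPU default passive</Name>
    <ProductName>*</ProductName>
    <Preference>QUIET</Preference>
    <ThermalZones>
      <ThermalZone>
        <Type>cpu</Type>
        <TripPoints>
          <TripPoint>
            <Temperature>60000</Temperature>
            <type>passive</type>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
</ThermalConfiguration>
"""


def list_trip_points(xml_text):
    """Return (zone_type, trip_type, temp_celsius) for every TripPoint found."""
    root = ET.fromstring(xml_text)
    result = []
    for zone in root.iter("ThermalZone"):
        zone_type = zone.findtext("Type", default="?")
        for trip in zone.iter("TripPoint"):
            millideg = int(trip.findtext("Temperature"))
            trip_type = trip.findtext("type", default="?")
            # thermal-conf.xml stores millidegrees Celsius
            result.append((zone_type, trip_type, millideg / 1000.0))
    return result


for zone_type, trip_type, temp_c in list_trip_points(SAMPLE_CONFIG):
    print(f"zone={zone_type} trip={trip_type} temp={temp_c:.1f} C")
```

To check a real file, the text of /etc/thermald/thermal-conf.xml could be read and passed to the same function.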

I have zero experience with XML, so I am probably also missing something in the config file.

When I run

 sudo thermald --no-daemon --loglevel=debug  

I get

 RAPL sysfs present 
RAPL base path /sys/class/powercap/intel-rapl/
RAPL domain dir uevent
 /sys/class/powercap/intel-rapl/uevent/name doesn't exist
RAPL domain dir enabled
 /sys/class/powercap/intel-rapl/enabled/name doesn't exist
RAPL domain dir power
 /sys/class/powercap/intel-rapl/power/name doesn't exist
RAPL domain dir intel-rapl:0
name package-0
RAPL base path /sys/class/powercap/intel-rapl/intel-rapl:0/
RAPL domain dir uevent
 /sys/class/powercap/intel-rapl/intel-rapl:0/uevent/name doesn't exist
RAPL domain dir energy_uj
 /sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj/name doesn't exist
RAPL domain dir intel-rapl:0:0
name core
RAPL domain dir enabled
 /sys/class/powercap/intel-rapl/intel-rapl:0/enabled/name doesn't exist
RAPL domain dir constraint_1_max_power_uw
 /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_1_max_power_uw/name doesn't exist
RAPL domain dir power
 /sys/class/powercap/intel-rapl/intel-rapl:0/power/name doesn't exist
RAPL domain dir device
 /sys/class/powercap/intel-rapl/intel-rapl:0/device/name doesn't exist
RAPL domain dir constraint_1_time_window_us
 /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_1_time_window_us/name doesn't exist
RAPL domain dir constraint_1_power_limit_uw
 /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_1_power_limit_uw/name doesn't exist
RAPL domain dir intel-rapl:0:1
name dram
RAPL domain dir constraint_0_time_window_us
 /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us/name doesn't exist
RAPL domain dir subsystem
 /sys/class/powercap/intel-rapl/intel-rapl:0/subsystem/name doesn't exist
RAPL domain dir constraint_1_name
 /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_1_name/name doesn't exist
RAPL domain dir constraint_0_power_limit_uw
 /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw/name doesn't exist
RAPL domain dir constraint_0_name
 /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_name/name doesn't exist
RAPL domain dir name
 /sys/class/powercap/intel-rapl/intel-rapl:0/name/name doesn't exist
RAPL domain dir constraint_0_max_power_uw
 /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_max_power_uw/name doesn't exist
RAPL domain dir max_energy_range_uj
 /sys/class/powercap/intel-rapl/intel-rapl:0/max_energy_range_uj/name doesn't exist
RAPL domain count 1
RAPL domain dir subsystem
 /sys/class/powercap/intel-rapl/subsystem/name doesn't exist
RAPL domain count 1
22 CPUID levels; family:model:stepping 0x6:9e:9 (6:158:9)
Running on a vanilla kernel
Polling mode is enabled: 4
thd_read_default_thermal_sensors 
sensor_update: type x86_pkg_temp
sensor_update: type acpitz
sensor_update: type acpitz
thd_read_default_thermal_sensors loaded 3 sensors 
dts /sys/devices/platform/coretemp.0/name doesn't exist
sensor id 6 : No temp sysfs for reading raw temp
sensor id 6 : No temp sysfs for reading raw temp
sensor id 6 : No temp sysfs for reading raw temp
failed to open /dev/acpi_thermal_rel 
failed to open /dev/acpi_thermal_rel 
TRT/ART read failed
/etc/thermald/thermal-conf.xml:1: parser error : XML declaration allowed only at the start of the document
       <?xml version="1.0"?>
            ^
error: could not parse file /etc/thermald/thermal-conf.xml
sensor index:2 x86_pkg_temp /sys/class/thermal/thermal_zone2/ Async:1 
sensor index:0 acpitz /sys/class/thermal/thermal_zone0/ Async:0 
sensor index:1 acpitz /sys/class/thermal/thermal_zone1/ Async:0 
sensor index:3 hwmon /sys/class/hwmon/hwmon1/temp1_input Async:0 
sensor index:4 hwmon /sys/class/hwmon/hwmon1/temp2_input Async:0 
sensor index:5 hwmon /sys/class/hwmon/hwmon1/temp3_input Async:0 
thd_read_default_cooling devices 
cooling dev 1:0:1:Fan
cooling dev 8:0:3:Processor
cooling dev 6:0:3:Processor
cooling dev 4:0:1:Fan
cooling dev 2:0:1:Fan
cooling dev 0:0:1:Fan
cooling dev 9:-1:50:intel_powerclamp
cooling dev 7:0:3:Processor
cooling dev 5:0:3:Processor
cooling dev 3:0:1:Fan
thd_read_default_cooling devices loaded 10 cdevs 
Default constraint power limit is more than max power 115000000:51000000
powercap RAPL max power limit range 115000000 
RAPL max limit 57500000 increment: -5750000
set_pid_param 10 [-1000.100,10]
Use Default pstate drv settings
cooling dev index:11, curr_state:0, max_state:10, unit:10,000000, min_com:0, type:intel_pstate
sysfs open failed 
failed to open /dev/acpi_thermal_rel 
failed to open /dev/acpi_thermal_rel 
TRT/ART read failed
/etc/thermald/thermal-conf.xml:1: parser error : XML declaration allowed only at the start of the document
       <?xml version="1.0"?>
            ^
error: could not parse file /etc/thermald/thermal-conf.xml
pstate CPU present 0-3
name = package-0
name = core
name = dram
sysfs read failed constraint_0_max_power_uw
dram:powercap RAPL invalid max power limit range 
Calculate dynamically phy_max 
power_on_constraint_0_pwr 0
1: Fan, C:0 MN: 0 MX:1 ST:1 pt:/sys/class/thermal/ rd_bk 1 
8: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0 
6: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0 
4: Fan, C:0 MN: 0 MX:1 ST:1 pt:/sys/class/thermal/ rd_bk 1 
2: Fan, C:0 MN: 0 MX:1 ST:1 pt:/sys/class/thermal/ rd_bk 1 
0: Fan, C:0 MN: 0 MX:1 ST:1 pt:/sys/class/thermal/ rd_bk 1 
9: intel_powerclamp, C:-1 MN: 0 MX:50 ST:5 pt:/sys/class/thermal/ rd_bk 0 
7: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0 
5: Processor, C:0 MN: 0 MX:3 ST:1 pt:/sys/class/thermal/ rd_bk 0 
3: Fan, C:0 MN: 0 MX:1 ST:1 pt:/sys/class/thermal/ rd_bk 1 
10: rapl_controller, C:115000000 MN: 115000000 MX:57500000 ST:-5750000 pt:/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/ rd_bk 1 
11: intel_pstate, C:0 MN: 0 MX:10 ST:1 pt:/sys/devices/system/cpu/intel_pstate/ rd_bk 1 
12: rapl_controller_dram, C:100000000 MN: 100000000 MX:0 ST:-500000 pt:/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:1/ rd_bk 1 
thd_read_default_thermal_zones 
Added zone index:2 
Thermal Zone look for 2/type
Thermal Zone 2:x86_pkg_temp
read_trip_points 2/trip_point_0_type:passive 
read_trip_points 2/trip_point_0_temp:0 
read_trip_points 2/trip_point_1_type:passive 
read_trip_points 2/trip_point_1_temp:0 
read_trip_points Added 0 trips 
Added zone index:0 
Thermal Zone look for 0/type
Thermal Zone 0:acpitz
read_trip_points 0/trip_point_0_type:critical 
read_trip_points 0/trip_point_0_temp:119000 
Add trip pt 0:0:0x0:119000:1
read_trip_points 0/trip_point_1_type:active 
read_trip_points 0/trip_point_1_temp:71000 
Add trip pt 4:0:0x0:71000:1
read_trip_points 0/trip_point_2_type:active 
read_trip_points 0/trip_point_2_temp:55000 
Add trip pt 4:0:0x0:55000:1
read_trip_points 0/trip_point_3_type:active 
read_trip_points 0/trip_point_3_temp:50000 
Add trip pt 4:0:0x0:50000:1
read_trip_points 0/trip_point_4_type:active 
read_trip_points 0/trip_point_4_temp:45000 
Add trip pt 4:0:0x0:45000:1
read_trip_points 0/trip_point_5_type:active 
read_trip_points 0/trip_point_5_temp:40000 
Add trip pt 4:0:0x0:40000:1
read_trip_points Added 6 trips 
 >> read_cdev_trip_points for 
cdev trip point: 5 contains 5
cdev0 present
symbolic name ../cooling_device4:4
zone acpitz bounded 
cdev trip point: 4 contains 4
cdev1 present
symbolic name ../cooling_device3:3
zone acpitz bounded 
cdev trip point: 3 contains 3
cdev2 present
symbolic name ../cooling_device2:2
zone acpitz bounded 
cdev trip point: 2 contains 2
cdev3 present
symbolic name ../cooling_device1:1
zone acpitz bounded 
cdev trip point: 1 contains 1
cdev4 present
symbolic name ../cooling_device0:0
zone acpitz bounded 
cthd_sysfs_zone::read_cdev_trip_points: ZONE bound to CDEV status 1 
sort_and_update_poll_trip: trip_points_size =6
Sorted trip dump zone index:0 type:acpitz:
index 0: type:critical temp:119000 hyst:1 zone id:0 sensor id:0 control_type:1 cdev size:0
index 5: type:active temp:40000 hyst:1 zone id:0 sensor id:0 control_type:1 cdev size:1
cdev[0] Fan, Sampling period: 0
	 target_state:32536
	 pid: kp=-5,7287e-273 ki=6,95283e-310 kd=6,95283e-310
index 4: type:active temp:45000 hyst:1 zone id:0 sensor id:0 control_type:1 cdev size:1
cdev[0] Fan, Sampling period: 0
	 target_state:32536
	 pid: kp=-5,7287e-273 ki=6,95283e-310 kd=6,95283e-310
index 3: type:active temp:50000 hyst:1 zone id:0 sensor id:0 control_type:1 cdev size:1
cdev[0] Fan, Sampling period: 0
	 target_state:32536
	 pid: kp=-5,7287e-273 ki=6,95283e-310 kd=6,95283e-310
index 2: type:active temp:55000 hyst:1 zone id:0 sensor id:0 control_type:1 cdev size:1
cdev[0] Fan, Sampling period: 0
	 target_state:32536
	 pid: kp=-5,7287e-273 ki=6,95283e-310 kd=6,95283e-310
index 1: type:active temp:71000 hyst:1 zone id:0 sensor id:0 control_type:1 cdev size:1
cdev[0] Fan, Sampling period: 0
	 target_state:32536
	 pid: kp=-5,7287e-273 ki=6,95283e-310 kd=6,95283e-310
trip type: 0 temp: 119000 
trip type: 4 temp: 40000 
trip type: 4 temp: 45000 
trip type: 4 temp: 50000 
trip type: 4 temp: 55000 
trip type: 4 temp: 71000 
Add trip pt 5:0:0x0:35000:0
Added zone index:1 
Thermal Zone look for 1/type
Thermal Zone 1:acpitz
read_trip_points 1/trip_point_0_type:critical 
read_trip_points 1/trip_point_0_temp:119000 
Add trip pt 0:1:0x0:119000:1
read_trip_points Added 1 trips 
 >> read_cdev_trip_points for 
cthd_sysfs_zone::read_cdev_trip_points: ZONE bound to CDEV status 0 
sort_and_update_poll_trip: trip_points_size =1
Sorted trip dump zone index:1 type:acpitz:
index 0: type:critical temp:119000 hyst:1 zone id:1 sensor id:0 control_type:1 cdev size:0
trip type: 0 temp: 119000 
Add trip pt 5:1:0x0:114000:0
thd_read_default_thermal_zones loaded 2 zones 
zone cpu will be created 
dts zone /sys/devices/platform/coretemp.0/name doesn't exist
/sys/class/hwmon/hwmon0/name->acpitz
/sys/class/hwmon/hwmon1/name->coretemp
Added zone index:3 
zone dts syfs: /sys/class/hwmon/hwmon1/, package id 0 
Core temp DTS :critical 100000, max 80000, psv 90000
node type: Element, name: CoolingDevice value: rapl_controller
node type: Element, name: CoolingDevice value: intel_pstate
node type: Element, name: CoolingDevice value: intel_powerclamp
node type: Element, name: CoolingDevice value: cpufreq
node type: Element, name: CoolingDevice value: Processor
Add trip pt 3:3:0xffff:90000:0
- rapl_controller
- intel_pstate
- intel_powerclamp
- cpufreq
- Processor
CDEVS order specified in thermal-cpu-cdev-order.xml
sort_and_update_poll_trip: trip_points_size =1
Sorted trip dump zone index:3 type:cpu:
index 0: type:passive temp:90000 hyst:0 zone id:3 sensor id:65535 control_type:1 cdev size:4
cdev[0] rapl_controller, Sampling period: 0
	 target_state:not defined
cdev[1] intel_pstate, Sampling period: 0
	 target_state:not defined
cdev[2] intel_powerclamp, Sampling period: 0
	 target_state:not defined
cdev[3] Processor, Sampling period: 0
	 target_state:not defined
trip type: 3 temp: 90000 
Add trip pt 5:3:0x2:85000:0
failed to open /dev/acpi_thermal_rel 
failed to open /dev/acpi_thermal_rel 
TRT/ART read failed
/etc/thermald/thermal-conf.xml:1: parser error : XML declaration allowed only at the start of the document
       <?xml version="1.0"?>
            ^
error: could not parse file /etc/thermald/thermal-conf.xml


 ZONE DUMP BEGIN

Zone 3: cpu, Active:1 Bind:0 Sensor_cnt:1
..sensors.. 
sensor index:2 x86_pkg_temp /sys/class/thermal/thermal_zone2/ Async:1 
..trips.. 
index 0: type:passive temp:90000 hyst:0 zone id:3 sensor id:65535 control_type:1 cdev size:4
cdev[0] rapl_controller, Sampling period: 0
	 target_state:not defined
cdev[1] intel_pstate, Sampling period: 0
	 target_state:not defined
cdev[2] intel_powerclamp, Sampling period: 0
	 target_state:not defined
cdev[3] Processor, Sampling period: 0
	 target_state:not defined
index 1: type:polling temp:85000 hyst:0 zone id:3 sensor id:2 control_type:0 cdev size:0



 ZONE DUMP END
FD = 7
Current user preference is 0
Start main loop
thd_engine_thread begin
^Cpoll exit 1 polls_fd event 1 0
 energy 2:0:597085 mj: 0 mw 
 energy 1:0:8608166 mj: 0 mw 
read_temperature sensor ID 2
Sensor x86_pkg_temp :temp 38000 
pref 0 type 3 temp 38000 trip 90000 
Passive Trip point applicable 
Trip point applicable <  0:90000 
cdev size for this trippoint 4
cdev at index 8:Processor
Need to switch to next cdev 
cdev at index 9:intel_powerclamp
Need to switch to next cdev 
cdev at index 11:intel_pstate
Need to switch to next cdev 
cdev at index 10:rapl_controller
Need to switch to next cdev 
wakeup fd event
Received message 1
Terminating ...
Terminating thread..
thd_engine_thread_end
terminating on user request ..

At this point I think my config file is simply being ignored by thermald, because according to the manual:

A trip point is a temperature at which a cooling device needs to be activated.

My lscpu output:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              2
Core(s) per socket:              2
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           158
Model name:                      Intel(R) Core(TM) i3-7100 CPU @ 3.90GHz
Stepping:                        9
CPU MHz:                         800.349
CPU max MHz:                     3900,0000
CPU min MHz:                     800,0000
BogoMIPS:                        7827.00
Virtualization:                  VT-x
L1d cache:                       64 KiB
L1i cache:                       64 KiB
L2 cache:                        512 KiB
L3 cache:                        3 MiB
NUMA node0 CPU(s):               0-3
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d

Can someone give me a hint on how to solve this? Thanks in advance.

Last edited by Skunky (2019-06-28 08:40:45)

#2 2019-06-25 18:18:09

snakeroot
Member
Registered: 2012-10-06
Posts: 164

Re: [Solved] Thermald custom config not working

Do you have a blank line (or anything else) ahead of the line "<?xml version="1.0"?>" in your thermal-conf.xml? I was able to replicate the error

/etc/thermald/thermal-conf.xml:2: parser error : XML declaration allowed only at the start of the document
<?xml version="1.0"?>
     ^
error: could not parse file /etc/thermald/thermal-conf.xml

by adding a blank line; without it, the file parsed without complaint.
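Anything in front of the declaration triggers this error, not only a blank line: leading spaces on the first line do too, and the debug log in the first post shows the declaration indented. A minimal sketch for checking this with Python's standard library follows. It uses Python's own XML parser rather than the one thermald uses, so the error text differs, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

UTF8_BOM = b"\xef\xbb\xbf"


def bytes_before_xml_declaration(data):
    """Return how many bytes precede '<?xml' (ignoring a UTF-8 BOM), or None if absent."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    pos = data.find(b"<?xml")
    return None if pos < 0 else pos


def parses_cleanly(data):
    """True if Python's XML parser accepts the document as-is."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# In-memory samples: one correct document, one indented like the log excerpt.
GOOD = b'<?xml version="1.0"?>\n<ThermalConfiguration/>\n'
INDENTED = b"       " + GOOD

for label, doc in (("good", GOOD), ("indented", INDENTED)):
    print(label, bytes_before_xml_declaration(doc), parses_cleanly(doc))
```

For a real check, the bytes of /etc/thermald/thermal-conf.xml could be read with open(path, "rb") and passed to both helpers; a non-zero offset means something precedes the declaration.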

I ran your stress test with the corrected file and thermald appeared to cap temperature at 60C.

HTH,

#3 2019-06-26 02:06:36

Skunky
Member
Registered: 2018-01-25
Posts: 230

Re: [Solved] Thermald custom config not working

Hi, thank you very much for your reply.

I had no blank line. However, I ran a few more tests, and it seems the config is actually working but thermald does not have enough time to take action. It seems I need to leave a gap below the target, e.g. if I use

<Temperature>60000</Temperature> 

temps will rise to about 70°C,

while if I use

 <Temperature>50000</Temperature> 

temps will go up to around 63°C.

I think

--poll-interval 

parameter can fix this; however, I'm happy knowing that it's working as intended.
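To put numbers on the overshoot, the peak temperature during a test run can be recorded directly from sysfs. This is a rough sketch, assuming the package sensor is exposed as a thermal zone with a temp file in millidegrees Celsius (the debug log above lists x86_pkg_temp under /sys/class/thermal/thermal_zone2/, but the zone number will differ between machines). The function names are hypothetical.

```python
import time


def read_celsius(temp_path):
    """Read a sysfs temperature file (millidegrees Celsius) and return degrees C."""
    with open(temp_path) as fh:
        return int(fh.read().strip()) / 1000.0


def peak_temperature(temp_path, samples, interval_s=1.0):
    """Sample the sensor `samples` times and return the highest reading in degrees C."""
    peak = float("-inf")
    for i in range(samples):
        peak = max(peak, read_celsius(temp_path))
        if i + 1 < samples:
            time.sleep(interval_s)  # wait before the next sample
    return peak
```

For example, calling peak_temperature("/sys/class/thermal/thermal_zone2/temp", samples=60) while stress is running would report the highest value seen over roughly one minute, which can then be compared against the configured trip point.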

Thank you!!

EDIT: title modified, otherwise I can't mark it as solved.

Last edited by Skunky (2019-06-26 02:08:47)

#4 2019-06-27 03:20:23

Skunky
Member
Registered: 2018-01-25
Posts: 230

Re: [Solved] Thermald custom config not working

This is getting weird. I ran some tests on my laptop, which definitely has more thermal issues than my desktop, and with

 <Temperature>55000</Temperature> 

temps rise to 75°C and more.
I can reproduce this only while gaming, because if I run

 stress

temps DO get locked, which is even weirder.

Could it be games changing my governors/clocks?

#5 2019-06-27 14:08:36

ewaller
Administrator
From: Pasadena, CA
Registered: 2009-07-13
Posts: 19,769

Re: [Solved] Thermald custom config not working

Skunky wrote:

This is getting weird. I ran some tests on my laptop, which definitely has more thermal issues than my desktop

Thermal design of laptops is far more challenging than it is for desktops. The size of heat sinks is limited, intake and exhaust ports are small, small ductwork makes it impossible to achieve laminar airflow, and fans are smaller. Volume is smaller, which drives up the power density in W/m^3.

75°C is not unreasonable for a laptop under moderate load. Mine will reach its thermal design limit of 100°C when doing tasks such as compiling with several cores, at which point throttling holds it there.



#6 2019-06-28 06:15:25

Skunky
Member
Registered: 2018-01-25
Posts: 230

Re: [Solved] Thermald custom config not working

Thank you very much for your reply, but I don't understand why thermald can't just lower the CPU clock once the desired temperature is reached.
I know the CPU can handle very high temperatures, but I just can't understand how this works.

#7 2019-06-28 07:17:47

V1del
Forum Moderator
Registered: 2012-10-16
Posts: 21,627

Re: [Solved] Thermald custom config not working

thermald will do its best to reach the target temperature, but sometimes that simply isn't possible. FWIW, games will often have bursts of heavy load followed by not much processing, followed by more bursts of heavy load, so it's likely that the ramping happens too fast for thermald to limit properly. As long as you are within thermal design limits, you shouldn't worry too much.
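A toy model can make the "ramping happens too fast" point concrete. It is a deliberately simplified sketch and not how thermald is implemented: it assumes the temperature climbs linearly under load, that the sensor is only checked once per poll interval, and that throttling stops the rise immediately once a poll sees the trip point crossed. The 4-second interval is taken from the "Polling mode is enabled: 4" line in the debug log (assumed here to be seconds), the 38°C start from the same log, and the ramp rates are invented for illustration.

```python
def peak_with_polling(start_mc, trip_mc, ramp_mc_per_s, poll_s):
    """Temperature (millidegrees C) at the first poll that sees temp >= trip.

    Toy model only: linear heating between polls, instant throttling afterwards.
    """
    if ramp_mc_per_s <= 0 or poll_s <= 0:
        raise ValueError("ramp and poll interval must be positive")
    temp = start_mc
    while temp < trip_mc:
        temp += ramp_mc_per_s * poll_s  # heat gained between two polls
    return temp


START, TRIP = 38000, 60000  # 38 C idle reading from the log, 60 C trip point

for ramp in (500, 2000, 5000):  # hypothetical heating rates, millidegrees per second
    peak = peak_with_polling(START, TRIP, ramp, poll_s=4)
    print(f"ramp {ramp / 1000:.1f} C/s, poll 4 s -> peak {peak / 1000:.0f} C")

# Same fast ramp, but polled every second instead of every four
fast = peak_with_polling(START, TRIP, 5000, poll_s=1)
print(f"ramp 5.0 C/s, poll 1 s -> peak {fast / 1000:.0f} C")
```

In this model the overshoot is bounded by the ramp rate times the poll interval, which is consistent with a slow, steady load like stress being held near the trip point while short, steep bursts run well past it.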

#8 2019-06-28 08:40:05

Skunky
Member
Registered: 2018-01-25
Posts: 230

Re: [Solved] Thermald custom config not working

So it's mostly because

V1del wrote:

the ramping happens too fast for thermald to limit properly

Thank you all for your help, much appreciated.

Last edited by Skunky (2019-06-28 08:40:23)
