#1 2022-11-12 06:32:59

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,168

thermald configuration issue

When I first installed thermald, I'm pretty sure I checked its status to make sure it was happy. However, I admit I've mostly assumed it was getting on with things OK on its own since. Now, however, it seems it wants configuration:

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
     Active: active (running) since Sat 2022-11-05 20:42:11 GMT; 6 days ago
   Main PID: 455 (thermald)
      Tasks: 3 (limit: 9370)
     Memory: 1.1M
        CPU: 4.688s
     CGroup: /system.slice/thermald.service
             └─455 /usr/bin/thermald --systemd --dbus-enable --adaptive

Tach 05 20:42:11 CompSelf thermald[455]: Thermal DTS: No coretemp sysfs found
Tach 05 20:42:11 CompSelf thermald[455]: sensor id 2 : No temp sysfs for reading raw temp
Tach 05 20:42:11 CompSelf thermald[455]: sensor id 2 : No temp sysfs for reading raw temp
Tach 05 20:42:11 CompSelf thermald[455]: sensor id 2 : No temp sysfs for reading raw temp
Tach 05 20:42:11 CompSelf thermald[455]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 05 20:42:11 CompSelf thermald[455]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 05 20:42:11 CompSelf thermald[455]: Thermal DTS or hwmon: No Zones present Need to configure manually
Tach 05 20:42:11 CompSelf thermald[455]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 05 20:42:11 CompSelf thermald[455]: Polling mode is enabled: 4
Tach 05 20:42:11 CompSelf systemd[1]: Started Thermal Daemon Service.

According to the wiki

wiki wrote:

This daemon proactively controls thermal parameters using P-states, T-states, and the Intel power clamp driver. thermald can also be used for older Intel CPUs. If the latest drivers are not available, then the daemon will revert to x86 model specific registers and the Linux "cpufreq subsystem" to control system cooling.

By default, it monitors CPU temperature using available CPU digital temperature sensors and maintains CPU temperature under control, before hardware takes aggressive correction action. If there is a skin temperature sensor in thermal sysfs, then it tries to keep skin temperature under 45C.

My system has P-states, and the sensors mostly work correctly:

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +49.0°C  

thinkpad-isa-0000
Adapter: ISA adapter
fan1:           0 RPM
fan2:        65535 RPM
CPU:          +36.0°C  
GPU:              N/A  
temp3:         +0.0°C  
temp4:         +0.0°C  
temp5:         +0.0°C  
temp6:         +0.0°C  
temp7:         +0.0°C  
temp8:         +0.0°C  

nvme-pci-0400
Adapter: PCI adapter
Composite:    +21.9°C  (low  = -273.1°C, high = +69.8°C)
                       (crit = +79.8°C)

BAT0-acpi-0
Adapter: ACPI interface
in0:          12.06 V  

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +36.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +36.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +34.0°C  (high = +100.0°C, crit = +100.0°C)

pch_skylake-virtual-0
Adapter: Virtual device
temp1:        +33.5°C  

BAT1-acpi-0
Adapter: ACPI interface
in0:          12.64 V  

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +36.0°C  (crit = +128.0°C)

fan2 is a figment of Linux's imagination or a dream of the firmware, I'm not sure which. At any rate, it doesn't exist.

The only file in /etc/thermald is thermal-cpu-cdev-order.xml, which is just the default, I think:

<!--
Specifies the order of compensation to cool CPU only.
There is a default already implemented in the code, but
this file can be used to change order

The Following cooling device can present
-->

<CoolingDeviceOrder>
        <!-- Specify Cooling device order -->
        <CoolingDevice>rapl_controller</CoolingDevice>
        <CoolingDevice>intel_pstate</CoolingDevice>
        <CoolingDevice>intel_powerclamp</CoolingDevice>
        <CoolingDevice>cpufreq</CoolingDevice>
        <CoolingDevice>Processor</CoolingDevice>
</CoolingDeviceOrder>

So thermald is correct that there is no file to configure zones in /etc/thermald, but I'm confused as to why it wants one when it managed perfectly well before. Should I provide one or should I try to solve some prior problem which explains why it wants one?

EDIT: The manual page suggests /sys/class/thermal should be where it is looking:

 ls /sys/class/thermal/thermal_zone*/
/sys/class/thermal/thermal_zone0/:
available_policies  hwmon1/          k_d  k_po  mode    policy  slope       sustainable_power  trip_point_0_temp  type
device@             integral_cutoff  k_i  k_pu  offset  power/  subsystem@  temp               trip_point_0_type  uevent

/sys/class/thermal/thermal_zone1/:
available_policies  integral_cutoff  k_i   k_pu  offset  power/  subsystem@         temp               trip_point_0_type  uevent
hwmon5/             k_d              k_po  mode  policy  slope   sustainable_power  trip_point_0_temp  type

/sys/class/thermal/thermal_zone2/:
available_policies  k_d   k_pu    policy  subsystem@         trip_point_0_temp  trip_point_1_type  trip_point_3_temp  trip_point_4_type  trip_point_6_temp  trip_point_7_type
hwmon8/             k_i   mode    power/  sustainable_power  trip_point_0_type  trip_point_2_temp  trip_point_3_type  trip_point_5_temp  trip_point_6_type  type
integral_cutoff     k_po  offset  slope   temp               trip_point_1_temp  trip_point_2_type  trip_point_4_temp  trip_point_5_type  trip_point_7_temp  uevent

/sys/class/thermal/thermal_zone3/:
available_policies  k_d  k_po  mode    policy  slope       sustainable_power  trip_point_0_temp  trip_point_1_temp  type
integral_cutoff     k_i  k_pu  offset  power/  subsystem@  temp               trip_point_0_type  trip_point_1_type  uevent

I don't know how that should look, so I'm not sure what it doesn't like about it, but there do seem to be thermal zones there for the finding?

$ cat /sys/class/thermal/thermal_zone*/temp
37000
35000
50000
38000
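
To see which driver sits behind each of those four numbers, the type file in each zone directory can be read alongside temp. The following is only a minimal sketch: list_zones is a made-up helper name, and it assumes the usual sysfs layout in which temp holds millidegrees Celsius. It takes an optional base directory so it can be tried against a copy of the tree.

```sh
#!/bin/sh
# list_zones [BASE_DIR]
# Hypothetical helper: print "zone type temperature" for every thermal zone
# found under BASE_DIR (default: /sys/class/thermal).
list_zones() {
    base=${1:-/sys/class/thermal}
    for zone in "$base"/thermal_zone*; do
        # Skip the unexpanded glob when no zone directories exist.
        [ -d "$zone" ] || continue
        ztype=$(cat "$zone/type" 2>/dev/null) || ztype=unknown
        # Reading temp can fail, for example while a driver cannot be queried.
        millideg=$(cat "$zone/temp" 2>/dev/null) || millideg=
        case $millideg in
            ''|*[!0-9-]*) reading=unreadable ;;
            *) reading="$((millideg / 1000))C" ;;
        esac
        printf '%s %s %s\n' "${zone##*/}" "$ztype" "$reading"
    done
    return 0
}

# Usage: list_zones                  (reads /sys/class/thermal)
#        list_zones /some/other/dir  (reads a copy of the tree)
```

A zone that prints "unreadable" here would at least narrow down which sensor thermald might be complaining about, though that link is a guess rather than something the logs confirm.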

EDIT 2: Is the problem lack of some ACPI interface?

man thermald wrote:

Thermal  daemon  allows  one  to  change  this  relationship [the relationship between thermal sensors and cooling devices]  or  add new one via a thermal configuration file (thermal-conf.xml). This file is automatically created (thermal-conf.xml.auto) and used, if the platform has ACPI thermal relationship table.  If not this needs to be manually configured.

For manual configuration refer to the manual page of the thermal-conf.xml.

In some newer platforms the auto creation of the config file is done by a companion tool "dptfxtract". This tool can be downloaded from  "https://github.com/intel/dptfxtract".
It is suggested as parts of the install process, run dptfxtract.

There  can  be multiple configuration files. User can select a configuration file via -config-file option to override the default selection. The default selection picks one of the file in the following order:

       - /etc/thermald/thermal-conf.xml.auto

       - /var/run/thermald/thermal-conf.xml.auto

       - /etc/thermald/thermal-conf.xml

       (*Assuming configure prefix=/ is used during build.)

Neither of the .auto files has been created in my case. It seems curious that there is no mention of the configuration file which does exist in /etc/thermald/ (thermal-cpu-cdev-order.xml). What is the relationship between that and the files mentioned in the man page? man thermal-conf.xml doesn't mention it either. That page illustrates how to create a configuration file, but I'm not really any the wiser as to whether that's the best thing to do or what I should want such a file to do. I don't understand why thermald doesn't seem to recognise the thermal zones defined in /sys/class/thermal or what temperature sensor it is complaining it doesn't have. Should it have created a .auto file?
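
For checking which of the three candidate files is present, a small loop over the paths in the order the man page gives them is enough. This is a sketch under assumptions: find_thermald_conf is a hypothetical helper, the optional root prefix exists only so the function can be tried against a scratch directory, and it mirrors the documented order rather than anything verified against thermald's source.

```sh
#!/bin/sh
# find_thermald_conf [ROOT_PREFIX]
# Hypothetical helper: print the first configuration file that exists, using
# the selection order quoted from the thermald man page. Returns 1 if none of
# the three files is present.
find_thermald_conf() {
    root=${1:-}
    for f in /etc/thermald/thermal-conf.xml.auto \
             /var/run/thermald/thermal-conf.xml.auto \
             /etc/thermald/thermal-conf.xml; do
        if [ -f "$root$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "no thermal-conf.xml candidate found" >&2
    return 1
}

# Usage: find_thermald_conf        (checks the real filesystem)
```

On a machine in the state described above, this would be expected to take the failure branch, since none of the three files exists.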

Last edited by cfr (2022-11-12 16:50:42)


CLI Paste | How To Ask Questions

Arch Linux | x86_64 | GPT | EFI boot | refind | stub loader | systemd | LVM2 on LUKS
Lenovo x270 | Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz | Intel Wireless 8265/8275 | US keyboard w/ Euro | 512G NVMe INTEL SSDPEKKF512G7L

#2 2022-11-13 01:14:06

cfr

Since the manual page mentions dptfxtract, I tried installing dptfxtract-bin from AUR and following the instructions at https://github.com/intel/dptfxtract, even though it claims the tool shouldn't be needed with thermald version 2.

mkdir acpi && pushd acpi
sudo acpidump > acpi.out

This produced a long file, the contents of which meant little to me.

acpixtract -a acpi.out 

This turned the long file into a slew of (mostly?) binary data files, which meant even less.

 acpi.out    batb.dat   dbgp.dat   ecdt.dat   fpdt.dat   poat.dat     ssdt11.dat   ssdt14.dat   ssdt17.dat   ssdt4.dat   ssdt7.dat   tpm2.dat    wsmt.dat
 apic.dat    boot.dat   dmar.dat   facp.dat   hpet.dat   ssdt1.dat    ssdt12.dat   ssdt15.dat   ssdt2.dat    ssdt5.dat   ssdt8.dat   uefi1.dat
'asf!.dat'   dbg2.dat   dsdt.dat   facs.dat   mcfg.dat   ssdt10.dat   ssdt13.dat   ssdt16.dat   ssdt3.dat    ssdt6.dat   ssdt9.dat   uefi2.dat

So I then tried Intel's extraction tool.

mkdir dptfxtract ; sudo dptfxtract -o ./dptfxtract *.dat

Result:

No valid tables found

Which makes me wonder what is in those files. Non-table things? Invalid tables? Gremlins?

At any rate, it didn't produce any thermal-conf.xml.auto file of the kind for which thermald suddenly hankers.


#3 2022-11-13 01:54:46

cfr

[Edit: I thought I'd posted something here earlier, but couldn't see the post to edit it. Damn.]

I tried running the following to garner more information:

# thermald --dbus-enable --adaptive --loglevel=info --no-daemon
[1668303945][INFO]RAPL domain count 1
[1668303945][INFO]RAPL domain count 1
[1668303945][MSG]22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
[1668303945][INFO]THD engine init failed
[1668303945][INFO]--adaptive option failed on this platform
[1668303945][INFO]Ignoring --adaptive option
[1668303945][INFO]RAPL domain count 1
[1668303945][INFO]RAPL domain count 1
[1668303945][MSG]22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
[1668303945][INFO]sensor_update: type iwlwifi_1
[1668303945][INFO]sensor_update: type acpitz
[1668303945][INFO]sensor_update: type x86_pkg_temp
[1668303945][INFO]sensor_update: type pch_skylake
[1668303945][INFO]thd_read_default_thermal_sensors loaded 4 sensors 
[1668303945][INFO]dts /sys/devices/platform/coretemp.0/name doesn't exist
[1668303945][MSG]sensor id 7 : No temp sysfs for reading raw temp
[1668303945][MSG]sensor id 7 : No temp sysfs for reading raw temp
[1668303945][MSG]sensor id 7 : No temp sysfs for reading raw temp
[1668303945][INFO]INT3400 Base path is 
[1668303945][INFO]failed to open /dev/acpi_thermal_rel 
[1668303945][INFO]failed to open /dev/acpi_thermal_rel 
[1668303945][INFO]TRT/ART read failed
[1668303945][MSG]Config file /etc/thermald/thermal-conf.xml does not exist
[1668303945][INFO]sensor index:2 iwlwifi_1 /sys/class/thermal/thermal_zone2/ Async:0 
[1668303945][INFO]sensor index:0 acpitz /sys/class/thermal/thermal_zone0/ Async:0 
[1668303945][INFO]sensor index:3 x86_pkg_temp /sys/class/thermal/thermal_zone3/ Async:1 
[1668303945][INFO]sensor index:1 pch_skylake /sys/class/thermal/thermal_zone1/ Async:0 
[1668303945][INFO]sensor index:4 hwmon /sys/class/hwmon/hwmon7/temp3_input Async:0 
[1668303945][INFO]sensor index:5 hwmon /sys/class/hwmon/hwmon7/temp1_input Async:0 
[1668303945][INFO]sensor index:6 hwmon /sys/class/hwmon/hwmon7/temp2_input Async:0 
[1668303945][INFO]thd_read_default_cooling devices loaded 7 cdevs 
[1668303945][INFO]powercap RAPL max power limit range 15000000 
[1668303945][INFO]set_pid_param 7 [-1000.100,10]
[1668303945][INFO]Use Default pstate drv settings
[1668303945][INFO]name = package-0
[1668303945][INFO]name = dram
[1668303945][INFO]sysfs read failed /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/constraint_0_max_power_uw
[1668303945][INFO]:powercap RAPL invalid max power limit range 
[1668303945][INFO]Calculate dynamically phy_max 
[1668303945][INFO]INT3400 Base path is 
[1668303945][INFO]failed to open /dev/acpi_thermal_rel 
[1668303945][INFO]failed to open /dev/acpi_thermal_rel 
[1668303945][INFO]TRT/ART read failed
[1668303945][MSG]Config file /etc/thermald/thermal-conf.xml does not exist
[1668303945][INFO]1: Processor, C:0 MN: 0 MX:10 ST:1 pt:/sys/class/thermal/ rd_bk 0 
[1668303945][INFO]6: TCC, C:3 MN: 0 MX:63 ST:1 pt:/sys/class/thermal/ rd_bk 1 
[1668303945][INFO]4: iwlwifi, C:0 MN: 0 MX:20 ST:1 pt:/sys/class/thermal/ rd_bk 1 
[1668303945][INFO]2: Processor, C:0 MN: 0 MX:10 ST:1 pt:/sys/class/thermal/ rd_bk 0 
[1668303945][INFO]0: Processor, C:0 MN: 0 MX:10 ST:1 pt:/sys/class/thermal/ rd_bk 0 
[1668303945][INFO]5: intel_powerclamp, C:-1 MN: 0 MX:50 ST:5 pt:/sys/class/thermal/ rd_bk 0 
[1668303945][INFO]3: Processor, C:0 MN: 0 MX:10 ST:1 pt:/sys/class/thermal/ rd_bk 0 
[1668303945][INFO]7: rapl_controller, C:15000000 MN: 15000000 MX:7500000 ST:-750000 pt:/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/ rd_bk 1 
[1668303945][INFO]8: intel_pstate, C:0 MN: 0 MX:10 ST:1 pt:/sys/devices/system/cpu/intel_pstate/ rd_bk 1 
[1668303945][INFO]9: rapl_controller_dram, C:100000000 MN: 100000000 MX:0 ST:-500000 pt:/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/ rd_bk 1 
[1668303945][INFO]10: LCD, C:0 MN: 0 MX:6818 ST:681 pt:/sys/class/backlight/intel_backlight/ rd_bk 1 
[1668303945][INFO]thd_read_default_thermal_zones loaded 4 zones 
[1668303945][INFO]INT3400 Base path is 
[1668303945][INFO]zone cpu will be created 
[1668303945][INFO]dts zone /sys/devices/platform/coretemp.0/name doesn't exist
[1668303945][INFO]/sys/class/hwmon/hwmon8/name->iwlwifi_1
[1668303945][INFO]/sys/class/hwmon/hwmon6/name->thinkpad
[1668303945][INFO]/sys/class/hwmon/hwmon4/name->nvme
[1668303945][INFO]/sys/class/hwmon/hwmon2/name->BAT0
[1668303945][INFO]/sys/class/hwmon/hwmon0/name->AC
[1668303945][INFO]/sys/class/hwmon/hwmon7/name->coretemp
[1668303945][INFO]Buggy max temp: to close to critical 90000
[1668303945][INFO]Core temp DTS :critical 100000, max 90000, psv 95000
[1668303945][INFO]node type: Element, name: CoolingDevice value: rapl_controller
[1668303945][INFO]node type: Element, name: CoolingDevice value: intel_pstate
[1668303945][INFO]node type: Element, name: CoolingDevice value: intel_powerclamp
[1668303945][INFO]node type: Element, name: CoolingDevice value: cpufreq
[1668303945][INFO]node type: Element, name: CoolingDevice value: Processor
[1668303945][INFO]min:0 max:0
[1668303945][INFO]min:0 max:0
[1668303945][INFO]min:0 max:0
[1668303945][INFO]min:0 max:0
[1668303945][INFO]CDEVS order specified in thermal-cpu-cdev-order.xml
[1668303945][INFO]/sys/class/hwmon/hwmon5/name->pch_skylake
[1668303945][INFO]/sys/class/hwmon/hwmon3/name->BAT1
[1668303945][INFO]/sys/class/hwmon/hwmon1/name->acpitz
[1668303945][INFO]INT3400 Base path is 
[1668303945][INFO]failed to open /dev/acpi_thermal_rel 
[1668303945][INFO]failed to open /dev/acpi_thermal_rel 
[1668303945][INFO]TRT/ART read failed
[1668303945][MSG]Config file /etc/thermald/thermal-conf.xml does not exist
[1668303945][INFO]

 ZONE DUMP BEGIN
[1668303945][INFO]
[1668303945][INFO]Zone 4: cpu, Active:1 Bind:0 Sensor_cnt:1
[1668303945][INFO]..sensors.. 
[1668303945][INFO]sensor index:3 x86_pkg_temp /sys/class/thermal/thermal_zone3/ Async:1 
[1668303945][INFO]..trips.. 
[1668303945][INFO]index 0: type:passive temp:95000 hyst:0 zone id:4 sensor id:65535 control_type:1 cdev size:4
[1668303945][INFO]cdev[0] rapl_controller, Sampling period: 0
[1668303945][INFO]       target_state:not defined
[1668303945][INFO]min_max 0
[1668303945][INFO]cdev[1] intel_pstate, Sampling period: 0
[1668303945][INFO]       target_state:not defined
[1668303945][INFO]min_max 0
[1668303945][INFO]cdev[2] intel_powerclamp, Sampling period: 0
[1668303945][INFO]       target_state:not defined
[1668303945][INFO]min_max 0
[1668303945][INFO]cdev[3] Processor, Sampling period: 0
[1668303945][INFO]       target_state:not defined
[1668303945][INFO]min_max 0
[1668303945][INFO]index 1: type:polling temp:85500 hyst:0 zone id:4 sensor id:3 control_type:0 cdev size:0
[1668303945][INFO]
[1668303945][INFO]

 ZONE DUMP END
[1668303945][INFO]Running on a vanilla kernel
[1668303945][MSG]Polling mode is enabled: 4
[1668303945][INFO]Current user preference is 0
[1668303945][INFO]thd_engine_thread begin

Interestingly (perhaps), I get slightly different results if I now start thermald.service than I did originally.

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
     Active: active (running) since Sun 2022-11-13 01:48:59 GMT; 2min 52s ago
   Main PID: 203505 (thermald)
      Tasks: 3 (limit: 9370)
     Memory: 1.3M
        CPU: 65ms
     CGroup: /system.slice/thermald.service
             └─203505 /usr/bin/thermald --systemd --dbus-enable --adaptive

Tach 13 01:48:59 MySelf systemd[1]: Started Thermal Daemon Service.
Tach 13 01:48:59 MySelf thermald[203505]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 01:48:59 MySelf thermald[203505]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 01:48:59 MySelf thermald[203505]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 01:48:59 MySelf thermald[203505]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 01:48:59 MySelf thermald[203505]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 01:48:59 MySelf thermald[203505]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 01:48:59 MySelf thermald[203505]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 01:48:59 MySelf thermald[203505]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 01:48:59 MySelf thermald[203505]: Polling mode is enabled: 4

Though they are not, unsurprisingly, a whit more encouraging.

Is thermald just not the right tool for this platform any more? It seemed like it used to be but if it can't auto-configure itself and the configuration tool is now considered obsolete because thermald auto-configures itself, it isn't looking great.

Last edited by cfr (2022-11-13 01:56:30)


#4 2022-11-13 10:15:06

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 71,684

Re: thermald configuration issue

thermald seems up and running?
So possibly just noise: https://github.com/intel/thermal_daemon/issues/323 ?

Edit
https://www.linuxquestions.org/question … 175671352/
https://github.com/intel/thermal_daemon/issues/205
Google is full of that pattern.

Last edited by seth (2022-11-13 10:17:21)

#5 2022-11-13 18:30:54

cfr

I'm not sure it is just noise. As I understand it, thermald should create /var/run/thermald/thermal-conf.xml.auto or /etc/thermald/thermal-conf.xml.auto with information about the thermal zones, but it doesn't do that. I found some of those threads earlier (but not all, thanks!), but the situations reported there seemed different. When people run with --no-daemon, they get information about the thermal zones, but I don't. Yet there are thermal zones in /sys/class/thermal, so shouldn't thermald find them and auto-generate a configuration file with that information?

Or is the documentation just very, very misleading and it isn't meant to do that on my platform for some reason?


#6 2022-11-13 18:49:59

seth

Sanity check: do you get the same errors when you "systemctl restart thermald"?
https://github.com/intel/thermal_daemon/issues/211

The PID is low, so with a little rotten luck the service comes up before sysfs is mounted?

#7 2022-11-13 19:15:45

cfr

seth wrote:

Sanity check: do you get the same errors when you "systemctl restart thermald"?
https://github.com/intel/thermal_daemon/issues/211

The PID is low, so with a little rotten luck the service comes up before sysfs is mounted?

I get

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
     Active: active (running) since Sun 2022-11-13 19:01:35 GMT; 1min 11s ago
   Main PID: 3514 (thermald)
      Tasks: 3 (limit: 9369)
     Memory: 3.2M
        CPU: 24ms
     CGroup: /system.slice/thermald.service
             └─3514 /usr/bin/thermald --systemd --dbus-enable --adaptive

Tach 13 19:01:35 MyName systemd[1]: Started Thermal Daemon Service.
Tach 13 19:01:35 MyName thermald[3514]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 19:01:35 MyName thermald[3514]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 19:01:35 MyName thermald[3514]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 19:01:35 MyName thermald[3514]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 19:01:35 MyName thermald[3514]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 19:01:35 MyName thermald[3514]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 19:01:35 MyName thermald[3514]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 19:01:35 MyName thermald[3514]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 19:01:35 MyName thermald[3514]: Polling mode is enabled: 4

which is the same as I got when I booted earlier today, though slightly different from the original output I posted. It still doesn't create any .xml.auto in either /etc/thermald or /var/run/thermald.

I am getting an error in the journal for thermal zone 2, but that doesn't really explain why thermald doesn't seem to even find the zone or why it ignores the other thermal zones. The journal shows one thermal zone is found by the system, at least:

Tach 13 17:57:22 MyName kernel: thermal LNXTHERM:00: registered as thermal_zone0
Tach 13 17:57:22 MyName kernel: ACPI: thermal: Thermal Zone [THM0] (31 C)

though it doesn't like thermal zone 2:

Tach 13 17:57:24 MyName kernel: e1000e 0000:00:1f.6 eth0: MAC: 12, PHY: 12, PBA No: 1000FF-0FF
Tach 13 17:57:24 MyName kernel: mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
Tach 13 17:57:24 MyName kernel: iTCO_vendor_support: vendor-support=0
Tach 13 17:57:24 MyName kernel: ee1004 6-0050: 512 byte EE1004-compliant SPD EEPROM, read-only
Tach 13 17:57:24 MyName kernel: iwlwifi 0000:03:00.0: Detected Intel(R) Dual Band Wireless AC 8265, REV=0x230
Tach 13 17:57:24 MyName kernel: thermal thermal_zone2: failed to read out thermal zone (-61)
Tach 13 17:57:24 MyName kernel: iTCO_wdt iTCO_wdt: Found a Intel PCH TCO device (Version=4, TCOBASE=0x0400)
Tach 13 17:57:24 MyName kernel: iTCO_wdt iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)

I can't find anything in the journal about thermal zone 1 or thermal zone 3 (which are also listed under /sys/class/thermal).

I'm not sure how this is supposed to correspond to the sensors output, so I don't know if the relationship is unexpected or not:

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +35.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +34.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +32.0°C  (high = +100.0°C, crit = +100.0°C)

thinkpad-isa-0000
Adapter: ISA adapter
fan1:           0 RPM
fan2:        65535 RPM
CPU:          +34.0°C  
GPU:              N/A  
temp3:         +0.0°C  
temp4:         +0.0°C  
temp5:         +0.0°C  
temp6:         +0.0°C  
temp7:         +0.0°C  
temp8:         +0.0°C  

nvme-pci-0400
Adapter: PCI adapter
Composite:    +21.9°C  (low  = -273.1°C, high = +69.8°C)
                       (crit = +79.8°C)

BAT0-acpi-0
Adapter: ACPI interface
in0:          12.08 V  

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +47.0°C  

pch_skylake-virtual-0
Adapter: Virtual device
temp1:        +31.0°C  

BAT1-acpi-0
Adapter: ACPI interface
in0:          12.63 V  

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +34.0°C  (crit = +128.0°C)

As noted above, the second fan is hallucinatory. It should say 'N/A' or not be listed at all. There is no second fan, just as there is no GPU.

Last edited by cfr (2022-11-13 19:16:42)


#8 2022-11-13 19:52:35

seth

[1668303945][INFO]sensor index:2 iwlwifi_1 /sys/class/thermal/thermal_zone2/ Async:0 

Since this is from the iwlwifi module, did you test the behavior against the LTS kernel?

Next test would be to unload iwlwifi and restart thermald to see whether that calms it down.

#9 2022-11-13 23:52:16

cfr

seth wrote:
[1668303945][INFO]sensor index:2 iwlwifi_1 /sys/class/thermal/thermal_zone2/ Async:0 

Since this is from the iwlwifi module, did you test the behavior against the LTS kernel?

I hadn't thought to, but I have now.

Status after booting LTS:

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
     Active: active (running) since Sun 2022-11-13 23:18:57 GMT; 1min 36s ago
   Main PID: 497 (thermald)
      Tasks: 3 (limit: 9370)
     Memory: 3.8M
        CPU: 18ms
     CGroup: /system.slice/thermald.service
             └─497 /usr/bin/thermald --systemd --dbus-enable --adaptive

Tach 13 23:18:57 MyName thermald[497]: Thermal DTS: No coretemp sysfs found
Tach 13 23:18:57 MyName thermald[497]: sensor id 2 : No temp sysfs for reading raw temp
Tach 13 23:18:57 MyName systemd[1]: Started Thermal Daemon Service.
Tach 13 23:18:57 MyName thermald[497]: sensor id 2 : No temp sysfs for reading raw temp
Tach 13 23:18:57 MyName thermald[497]: sensor id 2 : No temp sysfs for reading raw temp
Tach 13 23:18:57 MyName thermald[497]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:18:57 MyName thermald[497]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:18:57 MyName thermald[497]: Thermal DTS or hwmon: No Zones present Need to configure manually
Tach 13 23:18:57 MyName thermald[497]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:18:57 MyName thermald[497]: Polling mode is enabled: 4

and after restarting:

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
     Active: active (running) since Sun 2022-11-13 23:20:58 GMT; 34s ago
   Main PID: 1025 (thermald)
      Tasks: 3 (limit: 9370)
     Memory: 1.3M
        CPU: 25ms
     CGroup: /system.slice/thermald.service
             └─1025 /usr/bin/thermald --systemd --dbus-enable --adaptive

Tach 13 23:20:58 MyName systemd[1]: Started Thermal Daemon Service.
Tach 13 23:20:58 MyName thermald[1025]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 23:20:58 MyName thermald[1025]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 23:20:58 MyName thermald[1025]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:20:58 MyName thermald[1025]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:20:58 MyName thermald[1025]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:20:58 MyName thermald[1025]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:20:58 MyName thermald[1025]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:20:58 MyName thermald[1025]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:20:58 MyName thermald[1025]: Polling mode is enabled: 4

On the current kernel after booting:

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
     Active: active (running) since Sun 2022-11-13 23:24:19 GMT; 1min 28s ago
   Main PID: 539 (thermald)
      Tasks: 3 (limit: 9369)
     Memory: 5.9M
        CPU: 27ms
     CGroup: /system.slice/thermald.service
             └─539 /usr/bin/thermald --systemd --dbus-enable --adaptive

Tach 13 23:24:19 MyName thermald[539]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 23:24:19 MyName thermald[539]: NO RAPL sysfs present
Tach 13 23:24:19 MyName thermald[539]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 23:24:19 MyName thermald[539]: sensor id 5 : No temp sysfs for reading raw temp
Tach 13 23:24:19 MyName thermald[539]: sensor id 5 : No temp sysfs for reading raw temp
Tach 13 23:24:19 MyName thermald[539]: sensor id 5 : No temp sysfs for reading raw temp
Tach 13 23:24:19 MyName thermald[539]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:24:19 MyName thermald[539]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:24:19 MyName thermald[539]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:24:19 MyName thermald[539]: Polling mode is enabled: 4

which is a new one for me (but there was another thread with the 'NO RAPL' line recently on the forums).
After restarting:

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
     Active: active (running) since Sun 2022-11-13 23:26:06 GMT; 25s ago
   Main PID: 1757 (thermald)
      Tasks: 3 (limit: 9369)
     Memory: 5.3M
        CPU: 21ms
     CGroup: /system.slice/thermald.service
             └─1757 /usr/bin/thermald --systemd --dbus-enable --adaptive

Tach 13 23:26:06 MyName systemd[1]: Started Thermal Daemon Service.
Tach 13 23:26:06 MyName thermald[1757]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 23:26:06 MyName thermald[1757]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 23:26:06 MyName thermald[1757]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:26:06 MyName thermald[1757]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:26:06 MyName thermald[1757]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:26:06 MyName thermald[1757]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:26:06 MyName thermald[1757]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:26:06 MyName thermald[1757]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:26:06 MyName thermald[1757]: Polling mode is enabled: 4

In no case do I get any .xml.auto created in either /etc/thermald or /var/run/thermald.

There does seem to be some kind of issue with thermald.service starting too soon on boot: I sometimes get complaints about sysfs availability at boot which I don't get if I restart the service. But the failure to find thermal zones or generate an automatic configuration to handle them seems to persist regardless.

seth wrote:

Next test would be to unload iwlwifi and restart thermald to see whether that calms it down.

systemctl stop netctl-auto@wlan0.service
systemctl stop netctl-ifplugd@eth0.service
modprobe -r --remove-holders iwlwifi
systemctl restart thermald.service
systemctl status thermald.service

I got basically identical output to that shown above for the restart cases and no configuration generated in either of the two directories specified in the documentation.

modprobe iwlwifi mac80211 iwlmvm
systemctl start netctl-ifplugd@eth0.service netctl-auto@wlan0.service

[I think this should get me back to where I started?]

The odd thing (or one of the odd things) is that sensors seems to read the temperature for the wifi bit just fine:

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +32.0°C 

Or is a 'virtual device' different? (How can a virtual device have an actual temperature?)


CLI Paste | How To Ask Questions

Arch Linux | x86_64 | GPT | EFI boot | refind | stub loader | systemd | LVM2 on LUKS
Lenovo x270 | Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz | Intel Wireless 8265/8275 | US keyboard w/ Euro | 512G NVMe INTEL SSDPEKKF512G7L

#10 2022-11-14 13:14:59

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 71,684

Re: thermald configuration issue

I got basically identical output to that shown above

When you kick the iwlwifi module, does the corresponding /sys/class/thermal/thermal_zone* (woody word, btw) keep lingering around?
What does the "sensor id X : No temp sysfs for reading raw temp" then correspond to?

What happens if you blacklist iwlwifi at the boot?
"module_blacklist=iwlwifi", https://wiki.archlinux.org/title/Kernel … acklisting (check the journal to make sure it really wasn't loaded)

#11 2022-11-15 02:33:33

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,168

Re: thermald configuration issue

seth wrote:

I got basically identical output to that shown above

When you kick the iwlwifi module, does the corresponding /sys/class/thermal/thermal_zone* (woody word, btw) keep lingering around?

No. thermal_zone2 disappears from /sys/class/thermal.
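For reference, a sketch like the following lists each zone with its type and current reading, which makes it easier to see which zone belongs to iwlwifi before and after removing the module. list_thermal_zones is a made-up helper name; the type and temp files are the ones shown in the directory listings further down, and temp is reported in millidegrees Celsius.

```shell
#!/bin/sh
# list_thermal_zones [BASE]
# Hypothetical helper: prints "<zone>: <type> <temp>" for every thermal zone
# under BASE (default: /sys/class/thermal). temp is in millidegrees Celsius.
list_thermal_zones() {
    base=${1:-/sys/class/thermal}
    for zone in "$base"/thermal_zone*; do
        [ -d "$zone" ] || continue
        type=$(cat "$zone/type" 2>/dev/null) || type="unknown"
        # Reading temp can fail, e.g. while the owning driver is not ready.
        temp=$(cat "$zone/temp" 2>/dev/null) || temp="unreadable"
        echo "${zone##*/}: $type $temp"
    done
}

# Usage:
#   list_thermal_zones
```

Running it once with the module loaded and once without should show the zone dropping out of the list.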

seth wrote:

What does the "sensor id X : No temp sysfs for reading raw temp" then correspond to?

Sorry, but how do I tell what sensor id 7 corresponds to?

After removing and reloading iwlwifi, sensors output changes and the corresponding sensor initially gives 'N/A' rather than a temperature. But I had to modprobe it again to get wifi working, so maybe I just messed something up reloading it.

With the module removed:

$ sensors
thinkpad-isa-0000
Adapter: ISA adapter
fan1:           0 RPM
fan2:        65535 RPM
CPU:          +33.0°C  
GPU:              N/A  
temp3:         +0.0°C  
temp4:         +0.0°C  
temp5:         +0.0°C  
temp6:         +0.0°C  
temp7:         +0.0°C  
temp8:         +0.0°C  

nvme-pci-0400
Adapter: PCI adapter
Composite:    +20.9°C  (low  = -273.1°C, high = +69.8°C)
                       (crit = +79.8°C)

BAT0-acpi-0
Adapter: ACPI interface
in0:          12.06 V  

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +33.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +31.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +31.0°C  (high = +100.0°C, crit = +100.0°C)

pch_skylake-virtual-0
Adapter: Virtual device
temp1:        +29.0°C  

BAT1-acpi-0
Adapter: ACPI interface
in0:          11.30 V  

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +33.0°C  (crit = +128.0°C)

Contents of /sys/class/thermal/thermal_zone*/:

/sys/class/thermal/thermal_zone0/:
available_policies
device@
hwmon1/
integral_cutoff
k_d
k_i
k_po
k_pu
mode
offset
policy
power/
slope
subsystem@
sustainable_power
temp
trip_point_0_temp
trip_point_0_type
type
uevent

/sys/class/thermal/thermal_zone1/:
available_policies
hwmon5/
integral_cutoff
k_d
k_i
k_po
k_pu
mode
offset
policy
power/
slope
subsystem@
sustainable_power
temp
trip_point_0_temp
trip_point_0_type
type
uevent

/sys/class/thermal/thermal_zone3/:
available_policies
integral_cutoff
k_d
k_i
k_po
k_pu
mode
offset
policy
power/
slope
subsystem@
sustainable_power
temp
trip_point_0_temp
trip_point_0_type
trip_point_1_temp
trip_point_1_type
type
uevent

The following is missing in this case:

 type
 uevent
 
-/sys/class/thermal/thermal_zone2/:
-available_policies
-hwmon8/
-integral_cutoff
-k_d
-k_i
-k_po
-k_pu
-mode
-offset
-policy
-power/
-slope
-subsystem@
-sustainable_power
-temp
-trip_point_0_temp
-trip_point_0_type
-trip_point_1_temp
-trip_point_1_type
-trip_point_2_temp
-trip_point_2_type
-trip_point_3_temp
-trip_point_3_type
-trip_point_4_temp
-trip_point_4_type
-trip_point_5_temp
-trip_point_5_type
-trip_point_6_temp
-trip_point_6_type
-trip_point_7_temp
-trip_point_7_type
-type
-uevent
-
 /sys/class/thermal/thermal_zone3/:
 available_policies
 integral_cutoff

But the restarted thermald.service returns the same status (or, if not the same, I can't tell the difference):

● thermald.service - Thermal Daemon Service
     Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
     Active: active (running) since Sun 2022-11-13 23:42:51 GMT; 1 day 2h ago
   Main PID: 2623 (thermald)
      Tasks: 3 (limit: 9369)
     Memory: 4.1M
        CPU: 3.232s
     CGroup: /system.slice/thermald.service
             └─2623 /usr/bin/thermald --systemd --dbus-enable --adaptive

Tach 13 23:42:51 MySelf systemd[1]: Started Thermal Daemon Service.
Tach 13 23:42:51 MySelf thermald[2623]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 23:42:51 MySelf thermald[2623]: 22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
Tach 13 23:42:51 MySelf thermald[2623]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:42:51 MySelf thermald[2623]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:42:51 MySelf thermald[2623]: sensor id 7 : No temp sysfs for reading raw temp
Tach 13 23:42:51 MySelf thermald[2623]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:42:51 MySelf thermald[2623]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:42:51 MySelf thermald[2623]: Config file /etc/thermald/thermal-conf.xml does not exist
Tach 13 23:42:51 MySelf thermald[2623]: Polling mode is enabled: 4

[Have not tried blacklisting yet.]
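When I do try it, something like this sketch should confirm whether the parameter actually made it onto the kernel command line. module_on_blacklist is a made-up helper name, and it assumes the parameter appears in /proc/cmdline as module_blacklist= followed by a comma-separated list.

```shell
#!/bin/sh
# module_on_blacklist MODULE [CMDLINE_FILE]
# Hypothetical helper: succeeds if MODULE is listed in a module_blacklist=
# parameter in the kernel command line (default: /proc/cmdline).
module_on_blacklist() {
    mod=$1
    file=${2:-/proc/cmdline}
    for word in $(cat "$file"); do
        case $word in
            module_blacklist=*)
                list=${word#module_blacklist=}
                # Wrap in commas so only whole module names match.
                case ",$list," in
                    *",$mod,"*) return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}

# Usage after rebooting with the parameter set:
#   module_on_blacklist iwlwifi && echo "blacklisted on the command line"
#   lsmod | grep iwlwifi || echo "iwlwifi not loaded"
```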



#12 2022-11-15 09:27:41

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 71,684

Re: thermald configuration issue

Sorry, but how do I tell what sensor id 7 corresponds to?

See the output of your "thermald --dbus-enable --adaptive --loglevel=info --no-daemon" run; it maps the sensor index to the thermal device.

#13 2022-11-15 19:43:08

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,168

Re: thermald configuration issue

seth wrote:

Sorry, but how do I tell what sensor id 7 corresponds to?

See your "thermald --dbus-enable --adaptive --loglevel=info --no-daemon", it maps the sensor index to the thermal device.

Oh. Sorry. I'm not entirely sure what it is telling me.

[1668540677][INFO]RAPL domain count 1
[1668540677][INFO]RAPL domain count 1
[1668540677][MSG]22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
[1668540677][INFO]THD engine init failed
[1668540677][INFO]--adaptive option failed on this platform
[1668540677][INFO]Ignoring --adaptive option
[1668540677][INFO]RAPL domain count 1
[1668540677][INFO]RAPL domain count 1
[1668540677][MSG]22 CPUID levels; family:model:stepping 0x6:8e:9 (6:142:9)
[1668540677][INFO]sensor_update: type acpitz
[1668540677][INFO]sensor_update: type x86_pkg_temp
[1668540677][INFO]sensor_update: type pch_skylake
[1668540677][INFO]thd_read_default_thermal_sensors loaded 3 sensors 
[1668540677][INFO]dts /sys/devices/platform/coretemp.0/name doesn't exist
[1668540677][MSG]sensor id 7 : No temp sysfs for reading raw temp
[1668540677][MSG]sensor id 7 : No temp sysfs for reading raw temp
[1668540677][MSG]sensor id 7 : No temp sysfs for reading raw temp
[1668540677][INFO]INT3400 Base path is 
[1668540677][INFO]failed to open /dev/acpi_thermal_rel 
[1668540677][INFO]failed to open /dev/acpi_thermal_rel 
[1668540677][INFO]TRT/ART read failed
[1668540677][MSG]Config file /etc/thermald/thermal-conf.xml does not exist
[1668540677][INFO]sensor index:0 acpitz /sys/class/thermal/thermal_zone0/ Async:0 
[1668540677][INFO]sensor index:3 x86_pkg_temp /sys/class/thermal/thermal_zone3/ Async:1 
[1668540677][INFO]sensor index:1 pch_skylake /sys/class/thermal/thermal_zone1/ Async:0 
[1668540677][INFO]sensor index:4 hwmon /sys/class/hwmon/hwmon7/temp3_input Async:0 
[1668540677][INFO]sensor index:5 hwmon /sys/class/hwmon/hwmon7/temp1_input Async:0 
[1668540677][INFO]sensor index:6 hwmon /sys/class/hwmon/hwmon7/temp2_input Async:0 
[1668540677][INFO]thd_read_default_cooling devices loaded 6 cdevs 
[1668540677][INFO]powercap RAPL max power limit range 15000000 
[1668540677][INFO]set_pid_param 7 [-1000.100,10]
[1668540677][INFO]Use Default pstate drv settings
[1668540677][INFO]name = package-0
[1668540677][INFO]name = dram
[1668540677][INFO]sysfs read failed /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/constraint_0_max_power_uw
[1668540677][INFO]:powercap RAPL invalid max power limit range 
[1668540677][INFO]Calculate dynamically phy_max 
[1668540677][INFO]INT3400 Base path is 
[1668540677][INFO]failed to open /dev/acpi_thermal_rel 
[1668540677][INFO]failed to open /dev/acpi_thermal_rel 
[1668540677][INFO]TRT/ART read failed
[1668540677][MSG]Config file /etc/thermald/thermal-conf.xml does not exist
[1668540677][INFO]1: Processor, C:0 MN: 0 MX:10 ST:1 pt:/sys/class/thermal/ rd_bk 0 
[1668540677][INFO]6: TCC, C:3 MN: 0 MX:63 ST:1 pt:/sys/class/thermal/ rd_bk 1 
[1668540677][INFO]4: intel_powerclamp, C:-1 MN: 0 MX:50 ST:5 pt:/sys/class/thermal/ rd_bk 0 
[1668540677][INFO]2: Processor, C:0 MN: 0 MX:10 ST:1 pt:/sys/class/thermal/ rd_bk 0 
[1668540677][INFO]0: Processor, C:0 MN: 0 MX:10 ST:1 pt:/sys/class/thermal/ rd_bk 0 
[1668540677][INFO]3: Processor, C:0 MN: 0 MX:10 ST:1 pt:/sys/class/thermal/ rd_bk 0 
[1668540677][INFO]7: rapl_controller, C:15000000 MN: 15000000 MX:7500000 ST:-750000 pt:/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/ rd_bk 1 
[1668540677][INFO]8: intel_pstate, C:0 MN: 0 MX:10 ST:1 pt:/sys/devices/system/cpu/intel_pstate/ rd_bk 1 
[1668540677][INFO]9: rapl_controller_dram, C:100000000 MN: 100000000 MX:0 ST:-500000 pt:/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/ rd_bk 1 
[1668540677][INFO]10: LCD, C:0 MN: 0 MX:6818 ST:681 pt:/sys/class/backlight/intel_backlight/ rd_bk 1 
[1668540677][INFO]thd_read_default_thermal_zones loaded 3 zones 
[1668540677][INFO]INT3400 Base path is 
[1668540677][INFO]zone cpu will be created 
[1668540677][INFO]dts zone /sys/devices/platform/coretemp.0/name doesn't exist
[1668540677][INFO]/sys/class/hwmon/hwmon6/name->thinkpad
[1668540677][INFO]/sys/class/hwmon/hwmon4/name->nvme
[1668540677][INFO]/sys/class/hwmon/hwmon2/name->BAT0
[1668540677][INFO]/sys/class/hwmon/hwmon0/name->AC
[1668540677][INFO]/sys/class/hwmon/hwmon7/name->coretemp
[1668540677][INFO]Buggy max temp: to close to critical 90000
[1668540677][INFO]Core temp DTS :critical 100000, max 90000, psv 95000
[1668540677][INFO]node type: Element, name: CoolingDevice value: rapl_controller
[1668540677][INFO]node type: Element, name: CoolingDevice value: intel_pstate
[1668540677][INFO]node type: Element, name: CoolingDevice value: intel_powerclamp
[1668540677][INFO]node type: Element, name: CoolingDevice value: cpufreq
[1668540677][INFO]node type: Element, name: CoolingDevice value: Processor
[1668540677][INFO]min:0 max:0
[1668540677][INFO]min:0 max:0
[1668540677][INFO]min:0 max:0
[1668540677][INFO]min:0 max:0
[1668540677][INFO]CDEVS order specified in thermal-cpu-cdev-order.xml
[1668540677][INFO]/sys/class/hwmon/hwmon5/name->pch_skylake
[1668540677][INFO]/sys/class/hwmon/hwmon3/name->BAT1
[1668540677][INFO]/sys/class/hwmon/hwmon1/name->acpitz
[1668540677][INFO]INT3400 Base path is 
[1668540677][INFO]failed to open /dev/acpi_thermal_rel 
[1668540677][INFO]failed to open /dev/acpi_thermal_rel 
[1668540677][INFO]TRT/ART read failed
[1668540677][MSG]Config file /etc/thermald/thermal-conf.xml does not exist
[1668540677][INFO]

 ZONE DUMP BEGIN
[1668540677][INFO]
[1668540677][INFO]Zone 4: cpu, Active:1 Bind:0 Sensor_cnt:1
[1668540677][INFO]..sensors.. 
[1668540677][INFO]sensor index:3 x86_pkg_temp /sys/class/thermal/thermal_zone3/ Async:1 
[1668540677][INFO]..trips.. 
[1668540677][INFO]index 0: type:passive temp:95000 hyst:0 zone id:4 sensor id:65535 control_type:1 cdev size:4
[1668540677][INFO]cdev[0] rapl_controller, Sampling period: 0
[1668540677][INFO]       target_state:not defined
[1668540677][INFO]min_max 0
[1668540677][INFO]cdev[1] intel_pstate, Sampling period: 0
[1668540677][INFO]       target_state:not defined
[1668540677][INFO]min_max 0
[1668540677][INFO]cdev[2] intel_powerclamp, Sampling period: 0
[1668540677][INFO]       target_state:not defined
[1668540677][INFO]min_max 0
[1668540677][INFO]cdev[3] Processor, Sampling period: 0
[1668540677][INFO]       target_state:not defined
[1668540677][INFO]min_max 0
[1668540677][INFO]index 1: type:polling temp:85500 hyst:0 zone id:4 sensor id:3 control_type:0 cdev size:0
[1668540677][INFO]
[1668540677][INFO]

 ZONE DUMP END
[1668540677][INFO]Running on a vanilla kernel
[1668540677][MSG]Polling mode is enabled: 4
[1668540677][INFO]Current user preference is 0
[1668540677][INFO]thd_engine_thread begin

So /sys/devices/platform/coretemp.0/name is missing. Is that linked to sensor id 7? /sys/class/hwmon/hwmon7/name->coretemp?
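To see which hwmon index carries which name on a given boot, a sketch along these lines can be used. list_hwmon_names is a made-up helper name; it only reads the name files and lists the temp*_input files that the log above refers to, and it does not tell me how thermald numbers its own sensor ids.

```shell
#!/bin/sh
# list_hwmon_names [BASE]
# Hypothetical helper: prints "<hwmonN> -> <name>" for each hwmon device
# under BASE (default: /sys/class/hwmon), followed by its temp*_input files.
list_hwmon_names() {
    base=${1:-/sys/class/hwmon}
    for dev in "$base"/hwmon*; do
        [ -e "$dev/name" ] || continue
        echo "${dev##*/} -> $(cat "$dev/name")"
        for input in "$dev"/temp*_input; do
            # Skip the literal pattern when a device has no temperature inputs.
            if [ -e "$input" ]; then
                echo "  ${input##*/}"
            fi
        done
    done
}

# Usage:
#   list_hwmon_names
```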



#14 2022-11-15 21:55:06

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 71,684

Re: thermald configuration issue

Seems so.
https://github.com/intel/thermal_daemon/issues/351

But google has that more frequently, including in https://bbs.archlinux.org/viewtopic.php?id=247309

Does /sys/devices/platform/coretemp.0 exist at all?

Edit: https://bugs.archlinux.org/task/36974

lsmod | grep coretemp
modprobe coretemp
ls /sys/devices/platform/coretemp.0

?

Last edited by seth (2022-11-15 22:17:00)

#15 2022-11-15 23:08:33

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,168

Re: thermald configuration issue

Thanks. The thing is, I'm also getting that --adaptive isn't supported and there's no configuration auto-generated, so the responses saying 'it's just noise and is working fine' don't seem to apply in my case. thermald doesn't recognise any thermal zones at all - not just the one it specifically complains about.

seth wrote:

But google has that more frequently, including in https://bbs.archlinux.org/viewtopic.php?id=247309

The docs suggest thermald should now auto-configure itself. The script for generating configuration files mentioned in the man pages is described as obsolete on its home page. (I think it was github.) The docs suggest --adaptive should work everywhere, but thermald doesn't seem to share that optimism.

If I'm meant to write a configuration file, I really don't know what to put in it. I read the man page and I understand (I think) what the file should look like etc., but I don't know anything about what it would be reasonable to ask thermald to do specifically on my platform.

I'm not clear whether thermald is actually doing anything on my machine at all.

Thanks. I don't think I have that bug, at least :)

seth wrote:

Does /sys/devices/platform/coretemp.0 exist at all?

Yes.

$ ls /sys/devices/platform/coretemp.0/
driver@  driver_override  hwmon/  modalias  power/  subsystem@  uevent

seth wrote:

lsmod | grep coretemp
modprobe coretemp
ls /sys/devices/platform/coretemp.0

?

$ lsmod | grep coretemp
coretemp               20480  0

I modprobed coretemp anyway, but the contents of /sys/devices/platform/coretemp.0 are unchanged.

I'm starting to wonder if my hardware is the problem but, in that case, I don't understand how sensors can seem quite happy.

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +51.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +45.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +51.0°C  (high = +100.0°C, crit = +100.0°C)

Edit: I'm almost certain high and critical temperatures were not identical before. Almost, but not completely.

Edit 2: i7z also has no problems reading the cores' temperatures (unless i7z and sensors are just fantasists). CPU info from i7z:

i7z DEBUG: Found Intel Processor
i7z DEBUG:    Stepping 9
i7z DEBUG:    Model e
i7z DEBUG:    Family 6
i7z DEBUG:    Processor Type 0
i7z DEBUG:    Extended Model 8
i7z DEBUG: msr = Model Specific Register
i7z DEBUG: msr device files exist /dev/cpu/*/msr
i7z DEBUG: You have write permissions to msr device files

------------------------------
--[core id]--- Other information
-------------------------------------
--[0] Processor number 0
--[0] Socket number/Hyperthreaded Sibling number  0,2
--[0] Core id number 0
--[0] Display core in i7z Tool: Yes

--[1] Processor number 1
--[1] Socket number/Hyperthreaded Sibling number  0,3
--[1] Core id number 1
--[1] Display core in i7z Tool: Yes

--[2] Processor number 2
--[2] Socket number/Hyperthreaded Sibling number  0,0
--[2] Core id number 0
--[2] Display core in i7z Tool: No

--[3] Processor number 3
--[3] Socket number/Hyperthreaded Sibling number  0,1
--[3] Core id number 1
--[3] Display core in i7z Tool: No

Socket-0 [num of cpus 2 physical 2 logical 4] 1,
Socket-1 [num of cpus 0 physical 0 logical 0] 
i7z DEBUG: Single Socket Detected
i7z DEBUG: In i7z Single_Socket()
i7z DEBUG: guessing Haswell

Edit 3: https://github.com/intel/thermal_daemon/issues/373?

Last edited by cfr (2022-11-15 23:37:45)



#16 2022-11-16 12:58:58

seth
Member
From: Don't DM me only for attention
Registered: 2012-09-03
Posts: 71,684

Re: thermald configuration issue

ls /sys/devices/platform/coretemp.0/

I modprobed coretemp anyway

The module isn't the problem: modprobe'ing a loaded module will silently just do nothing, and most sensors don't have "name"s, so I guess that's really just FYI.

thermald doesn't recognise any thermal zones at all

Tach 13 23:18:57 MyName thermald[497]: Thermal DTS or hwmon: No Zones present Need to configure manually

This seems to show up only in the initial service start status, but not after a restart (if this is consistent and your main concern, I'd suspect a race for sysfs).
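To check whether it is consistent, counting the message per start might help. A rough sketch, where count_no_zones is just a made-up helper name and the match string is taken from the log line quoted above:

```shell
#!/bin/sh
# count_no_zones: hypothetical helper that reads log text on stdin and prints
# how many lines contain thermald's "No Zones present" message.
count_no_zones() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'No Zones present' || true
}

# Usage: compare the whole boot with the lines logged after a restart.
#   journalctl -b -u thermald.service | count_no_zones
#   systemctl restart thermald.service
#   journalctl -u thermald.service --since "1 minute ago" | count_no_zones
```

If the first count is non-zero and the second is zero across several boots, that would fit the race theory.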

I'm not clear whether thermald is actually doing anything on my machine at all.

https://wiki.archlinux.org/title/CPU_fr … g#thermald

man thermald wrote:

By default, it monitors CPU temperature using available CPU digital temperature sensors and maintains CPU temperature under control, before HW takes aggressive correction action.

I'd keep the non-daemon, info-logging instance running and stress the CPU a bit.
I'd then expect it to log that it took action.
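If you don't have a stress tool at hand, a crude load generator can be sketched in plain shell. burn_cpu is a made-up helper name, the duration is arbitrary, and thermal_zone3 is only the x86_pkg_temp zone according to your log above, so it may be numbered differently on another boot.

```shell
#!/bin/sh
# burn_cpu SECONDS [WORKERS]
# Hypothetical helper: runs WORKERS busy loops (default: one per CPU reported
# by nproc) for SECONDS seconds, then returns once all of them have stopped.
burn_cpu() {
    secs=$1
    workers=${2:-$(nproc)}
    i=0
    while [ "$i" -lt "$workers" ]; do
        # timeout (coreutils) kills each busy loop after $secs seconds.
        timeout "$secs" sh -c 'while :; do :; done' &
        i=$((i + 1))
    done
    wait
}

# Usage, in a second terminal while the foreground thermald is logging:
#   burn_cpu 120
#   cat /sys/class/thermal/thermal_zone3/temp
```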

The docs suggest thermald should now auto-configure itself.

man thermald wrote:

In some newer platforms the auto creation of the config file is done by a companion tool "dptfxtract". This tool can be downloaded from "https://github.com/intel/dptfxtract". It is suggested as parts of the install process, run dptfxtract.

https://aur.archlinux.org/packages?K=dptfxtract ?

#17 2022-11-16 17:33:53

cfr
Member
From: Cymru
Registered: 2011-11-27
Posts: 7,168

Re: thermald configuration issue

seth wrote:

thermald doesn't recognise any thermal zones at all

Tach 13 23:18:57 MyName thermald[497]: Thermal DTS or hwmon: No Zones present Need to configure manually

seems only in the initial service start status, but not after a restart (if this is consistent and your main concern, I'd suspect a race for sysfs)

I think that's right and it doesn't always happen. In recent boots, I get the restart result from the outset. But when I run thermald in non-daemon mode, I don't get the zone output I see in other people's reports and I'm not sure why not. According to the documentation, it should produce an auto-generated configuration file, but it doesn't. My main concern is that it doesn't produce such a configuration and that it is not, therefore, working - or not working properly - on my machine.

seth wrote:

https://wiki.archlinux.org/title/CPU_fr … g#thermald

man thermald wrote:

By default, it monitors CPU temperature using available CPU digital temperature sensors and maintains CPU temperature under control, before HW takes aggressive correction action.

I'd keep the non-daemon, info-logging instance running and stress the CPU a bit.
I'd then expect it to log that it took action.

OK, thanks. I'll do that and see what happens.

seth wrote:

The docs suggest thermald should now auto-configure itself.

man thermald wrote:

In some newer platforms the auto creation of the config file is done by a companion tool "dptfxtract". This tool can be downloaded from "https://github.com/intel/dptfxtract". It is suggested as parts of the install process, run dptfxtract.

https://aur.archlinux.org/packages?K=dptfxtract ?

The git repository for the tool claims it's obsolete and --adaptive is to be used instead. However, I tried it anyway. See post #2 above. I couldn't get it to work. Either I'm missing some necessary step or it doesn't work with my hardware.


