Hello,
For about two weeks (I don't remember exactly) I have had a problem with random automatic shutdowns, caused (I presume) by a faulty temperature reading.
I have an Acer V17 (VN7-791G) with NVIDIA graphics, which I don't use (only the integrated GPU).
I have collected lmsensors logs (below). You will see that "temp3" jumps from 30°C to 117°C within one second. I don't know what "temp3" measures inside my laptop, and I don't know how to find out.
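One way to approach the "how to find out" question: "acpitz" is the ACPI thermal zone driver, and the kernel lists each thermal zone under /sys/class/thermal with a type and a temperature in millidegrees Celsius (plus a mode file on zones that support it). A minimal sketch, assuming that layout; the function name list_thermal_zones and its optional base-directory argument are made up for illustration:

```sh
#!/bin/sh
# Print name, type, temperature (whole °C) and mode for each thermal zone.
# The optional first argument overrides the sysfs base directory
# (hypothetical convenience, useful for trying the function on a copy).
list_thermal_zones() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip zones whose temperature cannot be read (or an unmatched glob).
        millideg=$(cat "$zone/temp" 2>/dev/null) || continue
        type=$(cat "$zone/type" 2>/dev/null || echo unknown)
        mode=$(cat "$zone/mode" 2>/dev/null || echo n/a)
        printf '%s type=%s temp=%d mode=%s\n' \
            "${zone##*/}" "$type" "$((millideg / 1000))" "$mode"
    done
}

list_thermal_zones
```

Comparing the printed temperatures with the lmsensors output may show which zone corresponds to which "tempN" line, but the type string alone does not necessarily say which physical sensor is behind it.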
Please take a look at the logs:
10:08:15
acpitz-virtual-0
temp1: +27.8°C (crit = +105.0°C)
temp2: +29.8°C (crit = +105.0°C)
temp3: +30.0°C (crit = +100.0°C)
temp4: +55.0°C (crit = +92.0°C)
coretemp-isa-0000
Physical id 0: +70.0°C (high = +84.0°C, crit = +100.0°C)
Core 0: +66.0°C (high = +84.0°C, crit = +100.0°C)
Core 1: +67.0°C (high = +84.0°C, crit = +100.0°C)
Core 2: +65.0°C (high = +84.0°C, crit = +100.0°C)
Core 3: +64.0°C (high = +84.0°C, crit = +100.0°C)
10:08:16
acpitz-virtual-0
temp1: +27.8°C (crit = +105.0°C)
temp2: +29.8°C (crit = +105.0°C)
temp3: +117.0°C (crit = +100.0°C)
temp4: +55.0°C (crit = +92.0°C)
coretemp-isa-0000
Physical id 0: +69.0°C (high = +84.0°C, crit = +100.0°C)
Core 0: +65.0°C (high = +84.0°C, crit = +100.0°C)
Core 1: +66.0°C (high = +84.0°C, crit = +100.0°C)
Core 2: +65.0°C (high = +84.0°C, crit = +100.0°C)
Core 3: +64.0°C (high = +84.0°C, crit = +100.0°C)
10:08:17
acpitz-virtual-0
temp1: +27.8°C (crit = +105.0°C)
temp2: +29.8°C (crit = +105.0°C)
temp3: +101.0°C (crit = +100.0°C)
temp4: +55.0°C (crit = +92.0°C)
coretemp-isa-0000
Physical id 0: +69.0°C (high = +84.0°C, crit = +100.0°C)
Core 0: +66.0°C (high = +84.0°C, crit = +100.0°C)
Core 1: +67.0°C (high = +84.0°C, crit = +100.0°C)
Core 2: +65.0°C (high = +84.0°C, crit = +100.0°C)
Core 3: +64.0°C (high = +84.0°C, crit = +100.0°C)
10:08:18
acpitz-virtual-0
temp1: +27.8°C (crit = +105.0°C)
temp2: +29.8°C (crit = +105.0°C)
temp3: +105.0°C (crit = +100.0°C)
temp4: +55.0°C (crit = +92.0°C)
coretemp-isa-0000
Physical id 0: +69.0°C (high = +84.0°C, crit = +100.0°C)
Core 0: +65.0°C (high = +84.0°C, crit = +100.0°C)
Core 1: +66.0°C (high = +84.0°C, crit = +100.0°C)
Core 2: +65.0°C (high = +84.0°C, crit = +100.0°C)
Core 3: +63.0°C (high = +84.0°C, crit = +100.0°C)
... shutdown
Looking at the whole log of "temp3" (which is here, sampled every 1 s), there are many values above 100°C, but it looks like several bad values in a row are needed to cause a shutdown.
Here's also a plot of the "temp3" sensor (sampled every 1 s):
http://i62.tinypic.com/i2sll4.png
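The "several bad values in a row" observation can be checked against a log by measuring the longest run of consecutive readings above the critical threshold. A rough sketch, assuming the log has been reduced to one temperature in °C per line (the actual log format is not shown in this thread); the function name longest_hot_run is hypothetical:

```sh
#!/bin/sh
# Read one temperature per line on stdin and print the length of the
# longest run of consecutive readings strictly above the threshold.
# The threshold defaults to 100, the "crit" value lmsensors reports for temp3.
longest_hot_run() {
    awk -v crit="${1:-100}" '
        $1 + 0 > crit { run++; if (run > best) best = run; next }
        { run = 0 }
        END { print best + 0 }
    '
}

# The four temp3 readings from the excerpt above (10:08:15 to 10:08:18).
printf '30.0\n117.0\n101.0\n105.0\n' | longest_hot_run 100   # prints 3
```

Running the same filter over the part of the log that did not end in a shutdown would show how long a run the machine survived.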
Please help!
Thanks.
-- mod edit: read the Forum Etiquette and only post thumbnails http://wiki.archlinux.org/index.php/For … s_and_Code [jwr] --
Last edited by awes (2015-06-19 08:14:34)
I've identified the problem. It's the kernel upgrade from 4.0.4-1 to 4.0.4-2.
There are new config options enabled in 4.0.4-2:
+CONFIG_THERMAL_HWMON=y
+CONFIG_HWMON=y
With kernel 4.0.4-1, lmsensors shows only the CPU temperatures; with 4.0.4-2 there is a new section, "acpitz-virtual-0, Adapter: Virtual device", which includes the "temp3" reading that shuts down my machine.
I presume that's a kernel bug and I've reported it to the kernel bug tracker.
edit:
Before I mark it "solved": which process is responsible for shutting the computer down when the temperature is too high? I'd like to disable that feature...
Last edited by awes (2015-06-03 13:13:57)
Be careful with this; you can burn out your computer.
ezik
That shouldn't happen: CPU temps are fine and throttling is also working OK. I just want to disable the automatic shutdown, at least until the issue is fixed.
It turned out that the kernel downgrade didn't fix anything; the shutdowns still occurred.
The downgrade did only one thing: it prevented lmsensors from seeing acpitz-virtual-0, but it didn't prevent the kernel from using it.
I solved it the proper way. I found this in the journal:
kernel: thermal thermal_zone2: critical temperature reached(113 C),shutting down
Which led me to: /sys/class/thermal/thermal_zone2/
(this is what lmsensors shows as "temp3", through /sys/class/hwmon/hwmon0/temp3_input)
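The journal lookup above can be sketched as a small filter that pulls the zone name and temperature out of such kernel messages. This is only an illustration matched against the exact message format quoted above; the function name critical_zones is hypothetical, and other kernel versions may word the message differently:

```sh
#!/bin/sh
# Read kernel log lines on stdin and print "zone temperature" for every
# thermal critical-temperature message, ignoring all other lines.
critical_zones() {
    sed -n 's/.*thermal \(thermal_zone[0-9]*\): critical temperature reached *(\([0-9]*\) C).*/\1 \2/p'
}

# Demo with the line quoted above:
echo 'kernel: thermal thermal_zone2: critical temperature reached(113 C),shutting down' \
    | critical_zones                      # prints: thermal_zone2 113

# On a real system the input could come from the kernel messages of the
# previous boot (this needs a persistent journal):
#   journalctl -k -b -1 | critical_zones
```

The zone name in the first column is the directory name under /sys/class/thermal/.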
The kernel's thermal sysfs documentation provided the solution:
mode
One of the predefined values in [enabled, disabled].
This file gives information about the algorithm that is currently managing the thermal zone. It can be either default kernel based algorithm or user space application.
enabled = enable Kernel Thermal management.
disabled = Preventing kernel thermal zone driver actions upon trip points so that user application can take full charge of the thermal management.
To disable kernel thermal management for that zone/sensor (as root):
echo "disabled" > /sys/class/thermal/thermal_zone2/mode
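Two practical points about the command above: the redirection has to run in a root shell (with "sudo echo ... > file" the redirection is still performed by the unprivileged shell), and a value written to sysfs does not survive a reboot, so it has to be reapplied after each boot. A hedged sketch of a small wrapper that validates the value and reads it back; the function name set_zone_mode and its optional base-directory argument are made up for illustration:

```sh
#!/bin/sh
# Write "enabled" or "disabled" to a thermal zone's mode file and print the
# value read back. The optional third argument overrides the sysfs base
# directory (hypothetical convenience for trying this on a scratch copy).
set_zone_mode() {
    zone="$1"; mode="$2"; base="${3:-/sys/class/thermal}"
    file="$base/$zone/mode"
    case "$mode" in
        enabled|disabled) ;;
        *) echo "mode must be 'enabled' or 'disabled'" >&2; return 2 ;;
    esac
    [ -w "$file" ] || { echo "cannot write $file (run as root?)" >&2; return 1; }
    printf '%s\n' "$mode" > "$file" && cat "$file"
}

# Usage, as root (not executed here because it changes system behaviour):
#   set_zone_mode thermal_zone2 disabled
```

With the zone disabled, the kernel no longer acts on its trip points, so nothing will shut the machine down if that sensor ever reports a genuine overheat.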
Last edited by awes (2015-06-19 08:15:58)