Earlier today I exchanged one LG Gram laptop for another because it kept crashing with the error "thermal thermal_zone0: acpitz: critical temperature reached, shutting down" (taken from journalctl). Now the replacement has the same issue. It is always under some load when it crashes, but it's hard to reproduce with stress testing. Please let me know if any more information is required.
Last edited by jetlennit (2021-08-22 19:22:57)
Offline
That might be the result of the recent kernels, so you could try linux-lts.
https://www.spinics.net/lists/fedora-de … 90650.html
https://bugzilla.redhat.com/show_bug.cgi?id=1992706
Maybe try intel thermald?
Last edited by progandy (2021-08-11 22:38:23)
Offline
I did try thermald on the last one, it didn't work unfortunately. I'll try out lts
Offline
LTS seems to have fixed the issue, now I just have to deal with the embarrassment of having exchanged a perfectly good laptop lol. Thanks for the help!
Offline
I've now had two more crashes because of overheating after switching to lts. I have thermald installed, and both crashes happened during software installation. Does anyone have any other ideas on what to do?
Offline
I found out that thermal thermal_zone0: acpitz seems to be a secondary way of monitoring the CPU that doesn't actually work with the kernel, so I disabled it in the GRUB config using GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet thermal.off=1" and everything works fine now. I was able to replicate it by stressing my CPU, then closing and reopening my laptop lid. I think the crash relates to the suspend-to-RAM feature. (Ignore this, it didn't work)
Last edited by jetlennit (2021-08-22 19:24:14)
Offline
I have the same problem with the LG Gram 17 with the i7-8565U CPU, but my fan is always running.
To reproduce:
Suspend > stress > crash
or
Hibernate > stress > crash
It's interesting that the temperature readings before sleeping are not as high as after waking up from suspend, no matter how I stress it.
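To put numbers on that before/after comparison, a small script can dump every thermal zone the kernel exposes. This is only a sketch: it assumes the usual sysfs layout under /sys/class/thermal, where each zone has a `type` file and a `temp` file in millidegrees Celsius. The function name `read_thermal_zones` is made up for this example.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: (zone_type, temp_celsius)} for each zone under base.

    Assumes the standard sysfs layout: thermal_zoneN/type and
    thermal_zoneN/temp, with temp reported in millidegrees Celsius.
    """
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            zone_type = (zone / "type").read_text().strip()
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            # Some zones refuse reads or return non-numeric data; skip them.
            continue
        readings[zone.name] = (zone_type, millidegrees / 1000.0)
    return readings


if __name__ == "__main__":
    for name, (zone_type, temp_c) in read_thermal_zones().items():
        print(f"{name} ({zone_type}): {temp_c:.1f} C")
```

Running it once under load before suspending and once under the same load after waking up gives two lists that can be compared zone by zone.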
I tried:
different kernel versions -> 5.13, 5.10 LTS, 5.4 LTS;
different DE -> KDE plasma and Gnome;
different distros -> Manjaro, EndeavourOS, Kubuntu, MxLinux and fedora 34.
It does not happen on windows.
Edit: It does not happen on Kernel 4.19 LTS, which points towards the problem being a bug in the kernel
Last edited by CryptoPink (2021-09-19 15:25:14)
Offline
For LG Gram 2021:
1. Turn on laptop -> Hold F2 -> Once in BIOS, press LCtrl + LAlt + LShift + F7
2. Advanced -> Intel Advanced Menu -> Thermal Configuration -> Platform Thermal Configuration
3. Enable Passive Trip Points
Image gallery showing the same process: https://imgur.com/a/IaBpsoR
This will throttle the laptop. Sorry, but this laptop is thermally limited, and you need to void warranty to properly fix it.
Offline
For LG Gram 2021:
1. Turn on laptop -> Hold F2 -> Once in BIOS, press LCtrl + LAlt + LShift + F7
2. Advanced -> Intel Advanced Menu -> Thermal Configuration -> Platform Thermal Configuration
3. Enable Passive Trip Points
Image gallery showing the same process: https://imgur.com/a/IaBpsoR
This will throttle the laptop.
Sorry, but this laptop is thermally limited
Thanks for the suggestion. I will try it soon and post results here.
However, I am not sure you understood the problem, as the laptop operates normally until it is suspended and then wakes up from suspend.
It also doesn't happen on windows.
So the issue doesn't seem to be that it's thermally limited, but rather some firmware issue on the computer, a bug in linux or a misconfiguration of the hardware or the OS.
you need to void warranty to properly fix it.
Perfect! My laptop is out of warranty already. Could you teach me or point me to how I can properly fix it?
Offline
https://lore.kernel.org/linux-pm/202109 … nel.org/T/
Looks like a patch is coming.
What happens is this drivers uses a global variable to keep track of the tcc offset (tcc_offset_save) and uses it on resume. The issue is this variable is initialized to 0, but is only set in tcc_offset_degree_celsius_store, i.e. when the tcc offset is explicitly set by userspace. If that does not happen, the resume path will set the offset to 0 (in my case the h/w default being 3, the offset would become too low after a suspend/resume cycle).
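The logic described in that quote can be mimicked with a toy model. This is a sketch for illustration only, not the kernel code: the class `TccDriverModel` and its methods are hypothetical, and using `None` as the "nothing saved yet" marker is a choice made for this example, not something taken from the patch.

```python
class TccDriverModel:
    """Toy model of the tcc offset save/restore behaviour (not kernel code)."""

    def __init__(self, hw_default, fixed):
        # Offset currently programmed in the (simulated) hardware.
        self.hw_offset = hw_default
        # Buggy variant: the saved value starts at 0 although nothing was saved.
        # Fixed variant (modelled with None): remember that nothing was saved.
        self.saved = None if fixed else 0

    def store(self, offset):
        """Userspace sets the offset explicitly; only this path records it."""
        self.hw_offset = offset
        self.saved = offset

    def resume(self):
        """Resume path: restore the saved offset, if there is one."""
        if self.saved is None:
            return  # fixed variant: leave the hardware value alone
        self.hw_offset = self.saved


buggy = TccDriverModel(hw_default=3, fixed=False)
buggy.resume()
print("buggy after resume:", buggy.hw_offset)

patched = TccDriverModel(hw_default=3, fixed=True)
patched.resume()
print("patched after resume:", patched.hw_offset)
```

With a hardware default of 3, as in the quoted case, the buggy variant ends up at 0 after a resume while the patched variant keeps 3. If `store()` is called first, both variants restore the stored value.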
Last edited by gmontanola (2021-09-17 16:43:18)
Offline
https://lore.kernel.org/linux-pm/202109 … nel.org/T/
Looks like a patch is coming.
Antoine Tenart wrote: What happens is this drivers uses a global variable to keep track of the tcc offset (tcc_offset_save) and uses it on resume. The issue is this variable is initialized to 0, but is only set in tcc_offset_degree_celsius_store, i.e. when the tcc offset is explicitly set by userspace. If that does not happen, the resume path will set the offset to 0 (in my case the h/w default being 3, the offset would become too low after a suspend/resume cycle).
Wow, this is so nice!
Thank you
Looks like this is the issue.
Could you help me try to work around the issue? That would help confirm whether this is indeed the cause.
Maybe a workaround could be simply running a command after waking up the computer.
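One possible workaround, going only by the quoted explanation: the driver records the offset only when userspace writes it, so writing the current value back once before the first suspend (for example at boot) should make the resume path restore that value instead of 0. The sketch below is untested on this hardware. The sysfs path is an assumption (the PCI address can differ between machines), writing to it needs root, and the function name `rewrite_tcc_offset` is made up here.

```python
from pathlib import Path

# Assumed location of the attribute; verify it exists on your machine first.
DEFAULT_ATTR = "/sys/bus/pci/devices/0000:00:04.0/tcc_offset_degree_celsius"


def rewrite_tcc_offset(attr_path=DEFAULT_ATTR):
    """Read the current tcc offset and write the same value back.

    The offset itself is not changed; the write only gives the driver
    an explicit value to remember. Returns the offset that was read.
    """
    attr = Path(attr_path)
    offset = int(attr.read_text().strip())
    attr.write_text(f"{offset}\n")
    return offset
```

Running this after wake-up would likely be too late, because by then the offset may already have been reset to 0 and that is the value that would be written back.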
I'm also trying to figure out which kernel version would receive the patch.
Last edited by CryptoPink (2021-09-17 17:00:03)
Offline
For LG Gram 2021:
1. Turn on laptop -> Hold F2 -> Once in BIOS, press LCtrl + LAlt + LShift + F7
2. Advanced -> Intel Advanced Menu -> Thermal Configuration -> Platform Thermal Configuration
3. Enable Passive Trip Points
Image gallery showing the same process: https://imgur.com/a/IaBpsoR
This will throttle the laptop. Sorry, but this laptop is thermally limited, and you need to void warranty to properly fix it.
Thanks for the suggestion. I will try it soon and post results here.
Didn't help at all with the problem.
Offline
I have the same problem with the LG Gram 17 with the i7-8565U CPU, but my fan is always running.
To reproduce:
Suspend > stress > crash
or
Hibernate > stress > crash
It's interesting that the temperature readings before sleeping are not as high as after waking up from suspend, no matter how I stress it.
I tried:
different kernel versions -> 5.13, 5.10 LTS, 5.4 LTS;
different DE -> KDE plasma and Gnome;
different distros -> Manjaro, EndeavourOS, Kubuntu, MxLinux and fedora 34.
It does not happen on windows.
Edit: It does not happen on Kernel 4.19 LTS, which points towards the problem being a bug in the kernel
Just edited my post because I'm now using kernel 4.19 LTS and it's free of the problem.
Hopefully Kernel 5.15 will include the fix. It also seems to be the next LTS kernel, which is great.
Offline
fixed in 5.14.9
Offline
fixed in 5.14.9
Thank you very much for the heads up! I can't wait to test it. I'm on Manjaro and the latest kernels available are 5.15.rc2 and 5.14.7-2
From the changelog:
commit 8467f200fd38141f65492b55333840cc7591658d
Author: Antoine Tenart <atenart@kernel.org>
Date: Thu Sep 9 10:56:12 2021 +0200
thermal/drivers/int340x: Do not set a wrong tcc offset on resume
commit 8b4bd256674720709a9d858a219fcac6f2f253b5 upstream.
After upgrading to Linux 5.13.3 I noticed my laptop would shutdown due
to overheat (when it should not). It turned out this was due to commit
fe6a6de6692e ("thermal/drivers/int340x/processor_thermal: Fix tcc setting").
What happens is this drivers uses a global variable to keep track of the
tcc offset (tcc_offset_save) and uses it on resume. The issue is this
variable is initialized to 0, but is only set in
tcc_offset_degree_celsius_store, i.e. when the tcc offset is explicitly
set by userspace. If that does not happen, the resume path will set the
offset to 0 (in my case the h/w default being 3, the offset would become
too low after a suspend/resume cycle).
The issue did not arise before commit fe6a6de6692e, as the function
setting the offset would return if the offset was 0. This is no longer
the case (rightfully).
Fix this by not applying the offset if it wasn't saved before, reverting
back to the old logic. A better approach will come later, but this will
be easier to apply to stable kernels.
The logic to restore the offset after a resume was there long before
commit fe6a6de6692e, but as a value of 0 was considered invalid I'm
referencing the commit that made the issue possible in the Fixes tag
instead.
Fixes: fe6a6de6692e ("thermal/drivers/int340x/processor_thermal: Fix tcc setting")
Cc: stable@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/2021090908561 … kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
<3
Offline
Fantastic news! I'm testing it on Fedora 34 - so far the issue seems to be gone.
Offline
I'd say issue is completely gone, after a week of testing. I would normally have several shutdowns a day, but with kernel 5.14.9-200.fc34.x86_64 (Fedora 34) I have none so far.
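For anyone who wants to check quickly whether a running kernel is at or past the 5.14.9 release mentioned in this thread, a release string like the one above can be parsed and compared. This is a rough sketch with made-up helper names. It only compares version numbers, so it says nothing about release candidates or about older stable branches that may or may not carry a backport; the changelog for your branch is the real reference.

```python
import platform
import re


def kernel_version_tuple(release):
    """Parse the leading 'major.minor[.patch]' of a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def at_least(release, minimum=(5, 14, 9)):
    """True if the release's numeric version is >= minimum."""
    return kernel_version_tuple(release) >= minimum


if __name__ == "__main__":
    running = platform.release()
    print(running, "->", at_least(running))
```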
Offline