Hi,
I installed kernel 5.9.11 from the testing repo and have some problems during shutdown.
After "[OK] Reached target Power-Off" there are about two pages of messages that look like crash dump info,
and then the computer switches off.
I looked at journalctl -b -1 and it ends with:
Nov 25 09:41:37 lnb systemd[1]: Reached target Power-Off.
Nov 25 09:41:37 lnb systemd[1]: Shutting down.
Nov 25 09:41:37 lnb audit: BPF prog-id=6 op=UNLOAD
Nov 25 09:41:37 lnb audit: BPF prog-id=5 op=UNLOAD
Nov 25 09:41:37 lnb audit: BPF prog-id=4 op=UNLOAD
Nov 25 09:41:37 lnb audit: BPF prog-id=3 op=UNLOAD
Nov 25 09:41:37 lnb systemd-shutdown[1]: Syncing filesystems and block devices.
Nov 25 09:41:37 lnb systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Nov 25 09:41:37 lnb systemd-journald[212]: Journal stopped
... so the messages are not in the journal.
Question: are these messages logged somewhere (they are too fast to read during shutdown)?
After reverting to 5.9.10, all is OK.
Last edited by GeorgeJP (2020-11-28 15:57:53)
Offline
Confirming; I see this on one of my systems (running linux-zen) - a crash trace with something to do with efivars? Scroll is too fast, I'll take a video and see if I can scrub through and find more detail.
Last edited by jonathon (2020-11-26 23:13:04)
Offline
I do see a stack trace and register info right at the end of shutdown. I think it's a kernel panic. Are we experiencing the same issue?
Last edited by jpegxguy (2020-11-27 00:41:40)
hi
Offline
OK, here's a stack trace:
I haven't noticed this on a Lenovo X230, but I see the same thing with `linux` and `linux-zen` on the below ASUS laptop. Might be Ryzen-related?
System: Kernel: 5.9.11-zen1-1-zen x86_64 bits: 64 compiler: gcc v: 10.2.0 Desktop: MATE 1.24.1 info: mate-panel
wm: marco 1.24.1 dm: LightDM 1.30.0 Distro: Arch Linux
Machine: Type: Laptop System: ASUSTeK product: GL702ZC v: 1.0 serial: <filter>
Mobo: ASUSTeK model: GL702ZC v: 1.0 serial: <filter> UEFI: American Megatrends v: GL702ZC.306 date: 07/05/2019
Battery: ID-1: BAT0 charge: 53.9 Wh condition: 55.8/74.2 Wh (75%) volts: 15.4/15.4 model: ASUSTeK ASUS Battery type: Li-ion
serial: N/A status: Not charging cycles: 10
CPU: Info: 8-Core model: AMD Ryzen 7 1700 bits: 64 type: MT MCP arch: Zen rev: 1 L2 cache: 4096 KiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 95808
Speed: 1374 MHz min/max: 1550/3000 MHz boost: enabled Core speeds (MHz): 1: 1943 2: 1791 3: 1416 4: 1421 5: 1375
6: 1375 7: 1375 8: 1375 9: 1362 10: 1359 11: 1373 12: 1369 13: 1357 14: 1356 15: 1379 16: 1376
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] vendor: ASUSTeK
driver: amdgpu v: kernel bus ID: 0c:00.0 chip ID: 1002:67df
Device-2: Realtek USB2.0 HD UVC WebCam type: USB driver: uvcvideo bus ID: 1-8:3 chip ID: 0bda:57fa serial: <filter>
Display: x11 server: X.org 1.20.9 compositor: marco v: 1.24.1 driver: amdgpu resolution: <missing: xdpyinfo>
Audio: Device-1: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] vendor: ASUSTeK driver: snd_hda_intel
v: kernel bus ID: 0c:00.1 chip ID: 1002:aaf0
Device-2: Advanced Micro Devices [AMD] Family 17h HD Audio vendor: ASUSTeK driver: snd_hda_intel v: kernel
bus ID: 12:00.3 chip ID: 1022:1457
Sound Server: ALSA v: k5.9.11-zen1-1-zen
Network: Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: ASUSTeK driver: r8169 v: kernel port: e000
bus ID: 06:00.0 chip ID: 10ec:8168
IF: enp6s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Device-2: Realtek RTL8822BE 802.11a/b/g/n/ac WiFi adapter vendor: AzureWave driver: rtw_8822be v: N/A port: d000
bus ID: 07:00.0 chip ID: 10ec:b822
IF: wlp7s0 state: down mac: <filter>
Sensors: System Temperatures: cpu: 63.6 C mobo: N/A gpu: amdgpu temp: 56.0 C
Fan Speeds (RPM): N/A
Info: Processes: 534 Uptime: 18m wakeups: 1 Memory: 31.30 GiB used: 7.80 GiB (24.9%) Init: systemd v: 246 Compilers:
gcc: 10.2.0 clang: 11.0.0 Packages: pacman: 1882 Shell: Zsh v: 5.8 running in: terminator inxi: 3.1.09
I'll try tomorrow with `linux-lqx` to get a comparison with a kernel similar to `linux-zen`.
Last edited by jonathon (2020-11-27 01:49:14)
Offline
What if you revert the following:
https://git.kernel.org/pub/scm/linux/ke … fd218eb7c9
Edit:
See https://lore.kernel.org/linux-efi/CAA42 … l.com/T/#t
Last edited by loqs (2020-11-27 03:38:24)
Offline
My problem is that my camera cannot take a readable picture, and I don't know how to record this oops to a file (probably not possible, as the drives are already unmounted).
It looks like the problem has been identified by the kernel developers and affects some systems with UEFI (maybe because of bugs in the firmware's UEFI implementation).
I am just a user and have no experience with modifying and compiling kernels.
My system is a Lenovo Z50-75 with systemd-boot, and I can confirm this bug on the default and LTS testing kernels.
My other system runs on VMware ESXi (UEFI/systemd-boot) and I don't see this oops on reboot (either it is not affected or it is too fast to see on the remote console).
Offline
What if you revert the following:
https://git.kernel.org/pub/scm/linux/ke … fd218eb7c9
Edit:
See https://lore.kernel.org/linux-efi/CAA42 … l.com/T/#t
Ah, that's it.
According to https://lore.kernel.org/linux-efi/CAMj1 … l.com/T/#t, "The memory leak addressed by commit fe5186cf12e3 is a false positive" and there's a revert patch available there too.
I'll rebuild and confirm.
Edit: the above patch fails to build when applied against 5.9.11, probably because CONFIG_DEBUG_KMEMLEAK is not enabled?
fs/efivarfs/inode.c: In function ‘efivarfs_create’:
fs/efivarfs/inode.c:106:2: error: implicit declaration of function ‘kmemleak_ignore’ [-Werror=implicit-function-declaration]
106 | kmemleak_ignore(var);
| ^~~~~~~~~~~~~~~
Trying again with a more minimal revert
diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
index f943fd0b0699..15880a68faad 100644
--- a/fs/efivarfs/super.c
+++ b/fs/efivarfs/super.c
@@ -21,7 +21,6 @@ LIST_HEAD(efivarfs_list);
 static void efivarfs_evict_inode(struct inode *inode)
 {
 	clear_inode(inode);
-	kfree(inode->i_private);
 }
 
 static const struct super_operations efivarfs_ops = {
Edit 2: OK, that compiled.
Now, let's try adding an #ifdef to detect whether CONFIG_DEBUG_KMEMLEAK is set...
Fixes: fe5186cf12e3 ("efivarfs: fix memory leak in efivarfs_create()")
Reported-by: David Laight <David.Laight@aculab>
Signed-off-by: Ard Biesheuvel <ardb@kernel>
---
fs/efivarfs/inode.c | 1 +
fs/efivarfs/super.c | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/efivarfs/inode.c b/fs/efivarfs/inode.c
index 96c0c86f3fff..38324427a2b3 100644
--- a/fs/efivarfs/inode.c
+++ b/fs/efivarfs/inode.c
@@ -103,6 +103,9 @@ static int efivarfs_create(struct inode *dir,
 	var->var.VariableName[i] = '\0';
 
 	inode->i_private = var;
+#ifdef CONFIG_DEBUG_KMEMLEAK
+	kmemleak_ignore(var);
+#endif
 
 	err = efivar_entry_add(var, &efivarfs_list);
 	if (err)
diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
index f943fd0b0699..15880a68faad 100644
--- a/fs/efivarfs/super.c
+++ b/fs/efivarfs/super.c
@@ -21,7 +21,6 @@ LIST_HEAD(efivarfs_list);
 static void efivarfs_evict_inode(struct inode *inode)
 {
 	clear_inode(inode);
-	kfree(inode->i_private);
 }
 
 static const struct super_operations efivarfs_ops = {
Edit 3: That compiled, so the #ifdef worked to exclude the line. However, I don't know for certain whether that's actually the correct config item - I suspect it is, but I'm not 100% certain.
Next up, reboots!
Edit 4:
Stack trace doesn't appear now. Either of the above revert patches will work fine; I'm not really sure of the value of including `kmemleak_ignore(var);` unless any Arch config includes CONFIG_DEBUG_KMEMLEAK.
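For anyone wondering why dropping the kfree() makes the trace go away: as far as I understand the upstream discussion, the entry stored in inode->i_private is also linked on efivarfs_list, and that list is torn down at unmount. The sketch below is only a user-space model of that ownership, not kernel code. All demo_* names are made up for illustration, and it counts frees instead of really freeing so it stays safe to run.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ENTRIES 3

/* Hypothetical stand-in for an efivarfs entry; it counts frees instead of freeing. */
struct demo_entry {
	int free_count;
};

static struct demo_entry demo_pool[DEMO_ENTRIES];
static struct demo_entry *demo_list[DEMO_ENTRIES];      /* models efivarfs_list */
static struct demo_entry *demo_i_private[DEMO_ENTRIES]; /* models inode->i_private */

/* Records a free. A real kfree() called twice on one pointer corrupts the allocator. */
static void demo_free(struct demo_entry *e)
{
	assert(e != NULL);
	e->free_count++;
}

/*
 * Runs one create/evict/unmount cycle and returns the highest number of
 * times any single entry was freed. A non-zero evict_frees_private models
 * the kfree(inode->i_private) line that the revert removes.
 */
static int demo_max_frees(int evict_frees_private)
{
	int i, max = 0;

	/* "create": each entry is linked on the list and referenced by an inode */
	for (i = 0; i < DEMO_ENTRIES; i++) {
		demo_pool[i].free_count = 0;
		demo_list[i] = &demo_pool[i];
		demo_i_private[i] = &demo_pool[i];
	}

	/* inode eviction */
	for (i = 0; i < DEMO_ENTRIES; i++) {
		if (evict_frees_private)
			demo_free(demo_i_private[i]);
		demo_i_private[i] = NULL;
	}

	/* list teardown at unmount frees every entry that is on the list */
	for (i = 0; i < DEMO_ENTRIES; i++) {
		demo_free(demo_list[i]);
		demo_list[i] = NULL;
	}

	for (i = 0; i < DEMO_ENTRIES; i++)
		if (demo_pool[i].free_count > max)
			max = demo_pool[i].free_count;
	return max;
}
```

With demo_max_frees(1) every entry ends up freed twice, and with demo_max_frees(0) exactly once. That matches the "false positive" explanation: the list already owns the allocation, so the extra kfree() in the evict path is a double free rather than a leak fix.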
Last edited by jonathon (2020-11-27 16:56:52)
Offline
Upstream is going with #include <linux/kmemleak.h> instead of #ifdef CONFIG_DEBUG_KMEMLEAK
diff --git a/fs/efivarfs/inode.c b/fs/efivarfs/inode.c
index 96c0c86f3fff..6501344e37bd 100644
--- a/fs/efivarfs/inode.c
+++ b/fs/efivarfs/inode.c
@@ -9,6 +9,7 @@
 #include <linux/ctype.h>
 #include <linux/slab.h>
 #include <linux/uuid.h>
+#include <linux/kmemleak.h>
 
 #include "internal.h"
 
@@ -103,6 +104,7 @@ static int efivarfs_create(struct inode *dir, struct dentry *dentry,
 	var->var.VariableName[i] = '\0';
 
 	inode->i_private = var;
+	kmemleak_ignore(var);
 
 	err = efivar_entry_add(var, &efivarfs_list);
 	if (err)
diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
index f943fd0b0699..15880a68faad 100644
--- a/fs/efivarfs/super.c
+++ b/fs/efivarfs/super.c
@@ -21,7 +21,6 @@ LIST_HEAD(efivarfs_list);
 static void efivarfs_evict_inode(struct inode *inode)
 {
 	clear_inode(inode);
-	kfree(inode->i_private);
 }
 
 static const struct super_operations efivarfs_ops = {
Offline
Upstream is going with #include <linux/kmemleak.h> instead of #ifdef CONFIG_DEBUG_KMEMLEAK
Yup, this shows my lack of experience with kernel stuff:
We typically define these helpers unconditionally, and sort out the differences in the header file. In this case, we have
static inline void kmemleak_ignore(const void *ptr)
{
}
in include/linux/kmemleak.h if CONFIG_DEBUG_KMEMLEAK is not set.
This makes the calling code much cleaner.
kmemleak_ignore() is a noop if CONFIG_DEBUG_KMEMLEAK is not set. See include/linux/kmemleak.h. Thus no extra condition is needed here.
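The same pattern can be tried outside the kernel. This is a minimal sketch: DEMO_LEAKCHECK, demo_leak_ignore() and demo_create() are made-up names, with DEMO_LEAKCHECK standing in for CONFIG_DEBUG_KMEMLEAK. The header part defines the helper in both configurations, so the caller never needs its own #ifdef.

```c
#include <assert.h>
#include <stddef.h>

/* Uncomment to model a build with the checker enabled (hypothetical switch,
 * standing in for CONFIG_DEBUG_KMEMLEAK). */
/* #define DEMO_LEAKCHECK 1 */

static int demo_ignored; /* pointers the checker was asked to ignore */

#ifdef DEMO_LEAKCHECK
static inline void demo_leak_ignore(const void *ptr)
{
	(void)ptr;
	demo_ignored++;
}
#else
/* Empty stub: the call compiles to nothing, so callers need no #ifdef. */
static inline void demo_leak_ignore(const void *ptr)
{
	(void)ptr;
}
#endif

/* Caller side: the helper is invoked unconditionally. */
static int demo_create(void *var)
{
	assert(var != NULL);
	demo_leak_ignore(var);
	return 0;
}
```

With the switch left commented out, demo_create() still compiles and demo_ignored stays at 0. That is also why my first build failed: the missing piece was the header that declares the helper, not the config option.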
Last edited by jonathon (2020-11-27 18:06:39)
Offline
Bug report filed:
Offline
Question: are these messages logged somewhere (they are too fast to read during shutdown)?
You can do (sudo) `halt` instead of `poweroff`. This will go through the shutdown sequence but then just stop, rather than power off.
This allows you to read those final messages.
Same here btw, kernel 5.9.11 has this issue on an Intel NUC8 i5, no problem with 5.9.10.
Offline
You can do (sudo) `halt` instead of `poweroff`
Thanks for the tip. Unfortunately I can't see all the info, only the last page.
At least I can see that the Call Trace is related to efivars_destroy.
Edit: kernel 5.9.11-arch2-1 - reboot/shutdown is OK
Edit2: kernel 5.4.80-2-lts - reboot/shutdown is OK
Last edited by GeorgeJP (2020-11-28 15:57:21)
Offline
I hope it gets upstreamed though, and also in stable, because until now there's been a discussion on the #ifdef in the mailing list. It seems like this form should be fine.
hi
Offline
It has been reverted in 5.10-rc6, so I assume it will be reverted in 5.9 stable and LTS as well
Ard Biesheuvel (1):
efivarfs: revert "fix memory leak in efivarfs_create()"
Offline