Drop-In: /etc/systemd/system/kea-dhcp4.service.d
└─spam.conf
What are you overriding here, and are you perhaps disabling journal integration?
Offline
It's completely harmless: it just filters out the "LFC_START Starting lease file cleanup" messages, which spam the log. I took the advice from this forum thread: https://bbs.archlinux.org/viewtopic.php?id=286272
❯ cat /etc/systemd/system/kea-dhcp4.service.d/spam.conf
[Service]
LogFilterPatterns=~DhcpLFC
LogFilterPatterns=~DHCPSRV_MEMFILE
Offline
Actually, all crashes happen after a long idle period (about 1h, 4h and 5h).
The common element is the user mode write (error 6)
1. add some swap (you only have 2GB RAM?)
2. Add "processor.max_cstate=1" to the kernel parameters
3. Run memtest86+, at least over night
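For step 2 on a GRUB setup, a minimal sketch of appending the parameter could look like the following. It works on a temporary stand-in file rather than the real /etc/default/grub, and the existing `loglevel=3 quiet` value is only an assumed example of what the line might contain.

```shell
#!/bin/sh
# Sketch: append processor.max_cstate=1 to GRUB_CMDLINE_LINUX_DEFAULT.
# Operates on a temporary stand-in for /etc/default/grub.
grub_default=$(mktemp)
printf 'GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet"\n' > "$grub_default"   # assumed existing content

# Only add the parameter if it is not already present.
if ! grep -q 'processor.max_cstate=' "$grub_default"; then
    sed 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 processor.max_cstate=1"/' \
        "$grub_default" > "$grub_default.new"
    mv "$grub_default.new" "$grub_default"
fi

result=$(cat "$grub_default")
echo "$result"

# On the real system you would then regenerate the config and reboot, e.g.:
#   grub-mkconfig -o /boot/grub/grub.cfg
```

After rebooting, `cat /proc/cmdline` shows whether the parameter actually reached the kernel.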
Offline
seth, any thoughts on why no core dumps are being recorded?
Offline
1) Yes, my home router, an APU2 (https://www.pcengines.ch/apu2.htm), has only 2GB RAM. But I also have a production Supermicro A2SDi-4C-HLN4F with 8GB RAM and the problem is exactly the same there. The only difference is that the Supermicro runs linux-lts and Btrfs raid1.
2) Yes, I'll try adding it to GRUB.
3) I ran Memtest86+ for a 6h cycle about a week ago: no errors.
At first I also suspected faulty HW/RAM, but it behaves exactly the same on both routers (Supermicro and APU2).
Offline
Coredumps are [edit: "either", changed opinion midway through ;-) ] explicitly disabled, the system is OOM or the stack is gone for other reasons.
@vecino you could run a sacrificial process and "kill -11" it to see whether that generates a coredump or this is a more general issue w/ the setup.
bash # starts new shell
kill -11 $$ # segfaults the new shell
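A non-interactive variant of the same test, as a rough sketch: a throwaway child shell sends SIGSEGV to itself and the parent inspects the exit status. A process killed by signal 11 is reported as 128 + 11 = 139. The coredumpctl call is left as a comment because it only makes sense on a system running systemd-coredump.

```shell
#!/bin/sh
# Sketch: segfault a sacrificial child shell and check how it died.
status=0
sh -c 'kill -11 $$' || status=$?

# 139 = 128 + 11 (SIGSEGV); anything else means the signal did not kill the child.
echo "child exit status: $status"

# On the affected system, check whether a dump was recorded:
#   coredumpctl list
```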
Edit #2: also no coredumps on the other system?
Last edited by seth (2023-08-02 18:58:25)
Offline
Is that what you mean? I chose Samba, for example... PID = 353.
❯ ps aux | grep smbd
root 353 0.0 1.3 93732 26496 ? Ss Jul29 0:04 /usr/bin/smbd --foreground --no-process-group
❯ kill -11 353
Offline
The interesting question is whether you also got a core dump for this process now.
Offline
❯ coredumpctl list
TIME PID UID GID SIG COREFILE EXE SIZE
Wed 2023-08-02 21:02:03 CEST 353 0 0 SIGABRT present /usr/bin/smbd 776.9K
Offline
So coredumps generally work; the lack of them is specific to the kea conditions.
Did you add some swap?
The Supermicro A2SDi-4C-HLN4F seems to have an Atom chip and is not subject to the Ryzen-specific issues.
Ideally post a journal covering a kea crash from that system as well.
Offline
From Supermicro A2SDi-4C-HLN4F:
journal -b: https://0x0.st/s/amaHi0B-uqdUyHBfmLFwBg/H2xN.log
journal -k: https://0x0.st/s/fg8WpdJ2Sfu9dtJG694buw/H2xb.log
Offline
The only anomaly:
Jul 26 01:20:37 vecinoap kernel: kea-lfc[5618]: segfault at 7f63fc000020 ip 00007f640caaa170 sp 00007f640ad2cdb0 error 6 in libc.so.6[7f640ca38000+15d000] likely on CPU 2 (core 8, socket 0)
pacman -Qikk glibc
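The numbers in the segfault line above can be unpacked with a little shell arithmetic. This is only a sketch: the x86 page-fault error code is a bit field (bit 0: protection violation instead of page not present, bit 1: write, bit 2: user mode), and the computed offset is relative to the start of the reported libc.so.6 mapping, not necessarily to the start of the file. `decode_fault` is a helper name made up for this example.

```shell
#!/bin/sh
# Values copied from the kernel log line.
ip=0x7f640caaa170      # faulting instruction pointer
base=0x7f640ca38000    # start of the libc.so.6 mapping

# Offset of the faulting instruction inside that mapping.
offset=$(printf '0x%x' $(( $ip - $base )))
echo "offset into mapping: $offset"

# Decode the low three bits of the x86 page-fault error code.
decode_fault() {
    if [ $(( $1 & 4 )) -ne 0 ]; then mode=user; else mode=kernel; fi
    if [ $(( $1 & 2 )) -ne 0 ]; then access=write; else access=read; fi
    if [ $(( $1 & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    echo "$mode-mode $access, $cause"
}

decode_fault 6
```

Error 6 therefore reads as a user-mode write to a page that is not mapped, which matches the "user mode write" observation earlier in the thread.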
I didn't look at the other journal, but are you running the LTS kernel on both systems?
Edit: @loqs, do you know whether https://bbs.archlinux.org/viewtopic.php?id=282429 is fixed in the LTS kernel?
Last edited by seth (2023-08-02 19:47:06)
Offline
Supermicro: 6.1.39-1-lts - https://0x0.st/s/6_UoJ50WmVb0PWUhOqfExA/H2xW.txt
APU2: 6.4.7-arch1-2 - https://0x0.st/s/BqlYQiOmrN0h_0iADKRxxw/H2xV.txt
Offline
Edit: @loqs, do you know whether https://bbs.archlinux.org/viewtopic.php?id=282429 is fixed in the LTS kernel?
It should be: the fix was applied as commit 254ee530286aeb6d6de93d05b2247153df590af1 in https://cdn.kernel.org/pub/linux/kernel … Log-6.1.30
Offline
[fuck]
https://kea.readthedocs.io/en/kea-1.6.1/arm/lfc.html
Does it also crash if you run it explicitly?
You could also replace the binary w/ a script that runs it either in debug mode or gdb :\
Offline
I haven't tried - it would take x hours for the first errors to show up.
Does it make sense to modify the systemd unit and add something like ExecStart=/usr/bin/kea-dhcp4 -c /etc/kea/kea-dhcp4.conf -d ?
❯ cat /etc/systemd/system/kea-dhcp4.service
[Unit]
Description=ISC Kea IPv4 DHCP daemon
Documentation=man:kea-dhcp4(8)
Wants=network-online.target
After=network-online.target
After=time-sync.target
[Service]
Environment="KEA_PIDFILE_DIR=/run"
Environment="KEA_LOCKFILE_DIR=/run/lock/kea"
ExecStart=/usr/bin/kea-dhcp4 -c /etc/kea/kea-dhcp4.conf -d
[Install]
WantedBy=multi-user.target
Offline
I don't think it'll take hours for kea-lfc, the description sounds as if kea-dhcp4 runs it periodically and I'd almost expect it'll fail deterministically.
The other idea would be to have
/usr/bin/kea-lfc
#!/bin/sh
exec /usr/bin/kea-lfc.bin -d > /tmp/kea-lfc.dbg 2>&1
and move the original /usr/bin/kea-lfc to /usr/bin/kea-lfc.bin
Edit: and don't forget to chmod +x the script
Last edited by seth (2023-08-02 20:35:15)
Offline
❯ cat /tmp/kea-lfc.dbg
Usage error: DHCP version required
Usage: kea-lfc
[-4|-6] -p file -x file -i file -o file -f file -c file
-4 or -6 clean a set of v4 or v6 lease files
-p <file>: PID file
-x <file>: previous or ex lease file
-i <file>: copy of lease file
-o <file>: output lease file
-f <file>: finish file
-c <file>: configuration file
-v: print version number and exit
-V: print extended version information and exit
-d: optional, verbose output
-h: print this message
Service failed: DHCP version required
Offline
You can reduce the interval between lfc invocations by adjusting the lfc-interval in kea-dhcp4.conf
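As a rough sketch of where that setting lives: for the memfile backend, lfc-interval (in seconds) sits inside the "lease-database" map of the Dhcp4 section. The snippet below writes a stripped-down example fragment to a temporary file and reads the value back; the value 60 is just an arbitrary short interval for debugging, and a real kea-dhcp4.conf contains many more settings.

```shell
#!/bin/sh
# Sketch: a minimal config fragment with a short LFC interval, written to a
# temporary file (not the real /etc/kea/kea-dhcp4.conf).
conf=$(mktemp)
cat > "$conf" <<'EOF'
{
  "Dhcp4": {
    "lease-database": {
      "type": "memfile",
      "lfc-interval": 60
    }
  }
}
EOF

# Read the configured interval back (plain sed, since Kea configs may contain comments).
interval=$(sed -n 's/.*"lfc-interval"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf")
echo "lfc-interval: ${interval}s"
```

Running the same sed against /etc/kea/kea-dhcp4.conf shows what the server is currently configured with.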
Offline
And apparently you have to pass the arguments through:
#!/bin/sh
exec /usr/bin/kea-lfc.bin -d "$@" > /tmp/kea-lfc.dbg 2>&1
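To make the whole shadowing procedure concrete, here is a sketch that plays it through in a temporary directory with a fake stand-in binary instead of the real /usr/bin/kea-lfc. The stand-in only echoes its arguments, and the `-4 -c ...` arguments are hypothetical examples of what kea-dhcp4 might pass. The point is that the wrapper sits at the original path and "$@" forwards whatever the caller supplies.

```shell
#!/bin/sh
# Sketch: shadow a binary with a logging wrapper, using stand-ins in a temp dir.
workdir=$(mktemp -d)
bin="$workdir/kea-lfc"        # stand-in for /usr/bin/kea-lfc
log="$workdir/kea-lfc.dbg"    # stand-in for /tmp/kea-lfc.dbg

# Fake "original" binary that just prints the arguments it received.
printf '#!/bin/sh\necho "args: $*"\n' > "$bin"
chmod +x "$bin"

# 1. Move the original aside.
mv "$bin" "$bin.bin"

# 2. Put the wrapper at the original path; "$@" forwards the caller's arguments.
cat > "$bin" <<EOF
#!/bin/sh
exec "$bin.bin" -d "\$@" > "$log" 2>&1
EOF
chmod +x "$bin"

# 3. Whoever calls the original path (kea-dhcp4 in the real setup) now hits the wrapper.
"$bin" -4 -c /etc/kea/kea-dhcp4.conf
cat "$log"
```

The log then contains the forwarded arguments with `-d` prepended, which is what the real kea-lfc would receive.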
Offline
Still the same result as last time.
Offline
Did the timestamp of the output file change?
To be sure: you're shadowing/replacing /usr/bin/kea-lfc, you're not just typing that into an interactive shell?
Offline
❯ date
Wed 2 Aug 23:29:15 CEST 2023
❯ cat kea.sh
#!/bin/sh
exec /usr/bin/kea-lfc.bin -d "$@" > /tmp/kea-lfc.dbg 2>&1
❯ ./kea.sh
❯ cd /tmp
❯ ls -ls
total 4
0 srwxrwxrwx 1 telegraf telegraf 0 Aug 2 21:47 dbus-ZCnpXcQZmz
4 -rw-r--r-- 1 root root 529 Aug 2 23:29 kea-lfc.dbg
0 drwx------ 2 root root 40 Aug 2 22:09 mc-root
0 drwx------ 3 root root 60 Aug 2 21:47 systemd-private-13eb90ad40c0433082fc3e698cf7232a-grafana.service-NpjZJc
0 drwx------ 3 root root 60 Aug 2 21:47 systemd-private-13eb90ad40c0433082fc3e698cf7232a-influxdb.service-OpNDp5
0 drwx------ 3 root root 60 Aug 2 21:47 systemd-private-13eb90ad40c0433082fc3e698cf7232a-systemd-logind.service-QVFy02
0 drwx------ 3 root root 60 Aug 2 21:47 systemd-private-13eb90ad40c0433082fc3e698cf7232a-systemd-timesyncd.service-z2TKTm
❯ cat kea-lfc.dbg
Usage error: DHCP version required
Usage: kea-lfc
[-4|-6] -p file -x file -i file -o file -f file -c file
-4 or -6 clean a set of v4 or v6 lease files
-p <file>: PID file
-x <file>: previous or ex lease file
-i <file>: copy of lease file
-o <file>: output lease file
-f <file>: finish file
-c <file>: configuration file
-v: print version number and exit
-V: print extended version information and exit
-d: optional, verbose output
-h: print this message
Service failed: DHCP version required
Last edited by vecino (2023-08-02 21:36:46)
Offline
❯ ./kea.sh
No, the plan was to make the script /usr/bin/kea-lfc and let it be called by kea-dhcp4
Offline
Oh, I'm stupid - now I understand. I'll put it in place and let it run for a few hours, then post the results here.
Offline