Hi,
Since updating my system this morning, named and nslookup are crashing on my system.
Feb 11 09:40:11 arch-raspi named[444]: named: src/unix/udp.c:292: uv__udp_recvmsg: Assertion `handle->recv_cb != NULL' failed.
Feb 11 09:40:11 arch-raspi systemd-coredump[626]: [?] Process 444 (named) of user 40 dumped core.
Feb 11 09:40:11 arch-raspi systemd[1]: named.service: Main process exited, code=dumped, status=6/ABRT
Feb 11 09:40:11 arch-raspi systemd[1]: named.service: Failed with result 'core-dump'.
Feb 11 09:40:11 arch-raspi systemd[1]: named.service: Consumed 1.184s CPU time over 2min 8.175s wall clock time, 179M memory peak.
# pacman -Q | grep bind
bind 9.20.18-1
This morning the libuv package was updated, and it is a dependency of bind. I suspect that the bind package needs to be rebuilt against it.
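A quick way to confirm the suspicion is to check which libuv shared object named links against and which package owns it. A minimal sketch (paths assume a standard Arch install; guarded so it degrades gracefully on machines without named):

```shell
# Check which libuv shared object named is linked against,
# and which package owns that library.
bin=/usr/bin/named
if [ -x "$bin" ]; then
    ldd "$bin" | grep libuv
    pacman -Qo /usr/lib/libuv.so.1
else
    echo "named not installed on this machine"
fi
```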
Regards
Nikolay
I downgraded the package libuv (1.52.0-1 => 1.51.0-1) and now named no longer crashes.
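For reference, the downgrade can be done from the local pacman cache, then pinned so a regular -Syu doesn't pull 1.52.0 back in. The exact cache filename is an assumption; check what your cache actually holds first:

```shell
# Downgrade libuv from the local package cache
# (exact filename/arch is an assumption -- verify with ls first)
ls /var/cache/pacman/pkg/libuv-1.51.0-1-*.pkg.tar.zst
pacman -U /var/cache/pacman/pkg/libuv-1.51.0-1-*.pkg.tar.zst

# Then pin it in /etc/pacman.conf until the fix lands:
#   IgnorePkg = libuv
```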
> I downgraded the package libuv (1.52.0-1 => 1.51.0-1) and now named no longer crashes.

This also worked for me. Thank you!
Same here, downgrading libuv works!
This has been reported upstream, so let's hope there will be a fix soon, or that Arch backports it.
Here is the upstream libuv bug: https://github.com/libuv/libuv/issues/5030
I pulled my hair out trying to find the cause, so I'm glad to know I was not alone... Hope the fix arrives soon.
David C. Rankin, J.D.,P.E.
Currently running bind single-threaded on just one core; I had no luck downgrading libuv to 1.51.

/usr/bin/named -f -u named -n 1

Seems to be an issue in libuv with thread synchronization under high load.
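If named runs as a systemd service, the same single-thread workaround can be applied with a drop-in override instead of launching it by hand. A sketch, assuming the unit is named.service as in the logs above (the empty ExecStart= line clears the packaged command before replacing it):

```shell
# Drop-in override forcing named to run with one worker thread
mkdir -p /etc/systemd/system/named.service.d
cat > /etc/systemd/system/named.service.d/single-thread.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/named -f -u named -n 1
EOF
systemctl daemon-reload
systemctl restart named
```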
Hell of a week. The libuv folks are still working on a fix. We provided journal entries and coredumpctl info, and they have also isolated a dig command that triggers the named crash every time, so I suspect a libuv fix is forthcoming. Arch released an updated bind 9.20.20-1 that mitigates the crashes with libuv 1.52.0 (at least I haven't had one yet).

This is on top of mdadm 4.5 preventing RAID1 arrays from being assembled with the latest Arch 6.18.13 kernel. (I was lucky enough to be caught by that too.) The just-released mdadm 4.5-2 package fixes that issue as well. Let's hope this is the end of the core-server and boot crashes for a while. The recovery (emergency) console staring back at you is never a comforting sight...
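For anyone wanting to attach the same crash details to the upstream report, coredumpctl can show the recorded named dumps, including a backtrace. Guarded here so it exits cleanly on systems without systemd-coredump:

```shell
# Show details (including backtrace) of recorded named core dumps
if command -v coredumpctl >/dev/null 2>&1; then
    coredumpctl info named || echo "no named coredumps recorded"
else
    echo "coredumpctl not available on this system"
fi
```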
David C. Rankin, J.D.,P.E.
Downgrading and running on one core doesn't fix it for me; named keeps crashing. The watchdog script below checks whether named is still responding and restarts it automatically. Best compromise until it has been fixed upstream.
#!/usr/bin/env bash
# named-watchdog.sh — Restart BIND/named if it stops responding
# Usage: Run as root, ideally via systemd or screen/tmux
# chmod +x named-watchdog.sh && ./named-watchdog.sh &

INTERVAL=5                 # seconds between checks
TIMEOUT=3                  # dig query timeout (seconds)
CHECK_ZONE="."             # zone to query (root works on any resolver)
CHECK_SERVER="127.0.0.1"
NAMED_SERVICE="named"      # systemd unit name; change to "bind9" on Debian/Ubuntu
LOGFILE="/var/log/named/named-watchdog.log"
MAX_CONSECUTIVE_FAILS=2    # restart after this many successive failures

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOGFILE"
}

restart_named() {
    log "WARN named not responding — restarting $NAMED_SERVICE"
    if systemctl restart "$NAMED_SERVICE"; then
        log "INFO $NAMED_SERVICE restarted successfully"
    else
        log "ERROR failed to restart $NAMED_SERVICE — check systemd status"
    fi
}

check_named() {
    # Returns 0 if named responds, non-zero otherwise
    dig +time="$TIMEOUT" +tries=1 +short "@${CHECK_SERVER}" SOA "$CHECK_ZONE" \
        >/dev/null 2>&1
}

log "INFO named watchdog started (interval=${INTERVAL}s, max_fails=${MAX_CONSECUTIVE_FAILS})"

fail_count=0
while true; do
    if check_named; then
        if (( fail_count > 0 )); then
            log "INFO named is responding again (was down for ${fail_count} check(s))"
        fi
        fail_count=0
    else
        (( fail_count++ ))
        log "WARN named not responding (fail ${fail_count}/${MAX_CONSECUTIVE_FAILS})"
        if (( fail_count >= MAX_CONSECUTIVE_FAILS )); then
            restart_named
            fail_count=0
            # Give named a moment to come up before the next check
            sleep "$INTERVAL"
        fi
    fi
    sleep "$INTERVAL"
done
exit 0
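If you'd rather not babysit the script in screen/tmux, it can be installed as a simple systemd service. A sketch; the install path and unit name are assumptions, adjust to your layout:

```shell
# Install the watchdog script and wrap it in a systemd unit
install -m 0755 named-watchdog.sh /usr/local/bin/named-watchdog.sh
cat > /etc/systemd/system/named-watchdog.service <<'EOF'
[Unit]
Description=Restart named when it stops responding
After=named.service
Wants=named.service

[Service]
ExecStart=/usr/local/bin/named-watchdog.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now named-watchdog.service
```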