Hi all, I'm using the unbound DNS resolver with the cachedb (redis) module active, and there is something I don't understand about its caching mechanism.
Example:
DNS resolution of a random site works:
[root@localhost ~]# drill www.google.se | awk '$2~/[[:digit:]]+/ || $2=="Query" {print}'
www.google.se. 300 IN A 142.250.180.131
;; Query time: 218 msec
[root@localhost ~]# unbound-control dump_cache| grep 'google.se'
www.google.se. 262 IN A 142.250.180.131
[...]
[root@localhost ~]# redis-cli -s /run/redis/control monitor
1718325181.226289 [0 unix:/run/redis/control] "GET" [...]
1718325181.390708 [0 unix:/run/redis/control] "SET" [...]
Unbound internal cache works, query time 0 ms:
[root@localhost ~]# drill www.google.se | awk '$2~/[[:digit:]]+/ || $2=="Query" {print}'
www.google.se. 257 IN A 142.250.180.131
;; Query time: 0 msec
Redis cache works, query time 0 ms:
[root@localhost ~]# unbound-control flush google.se
[root@localhost ~]# unbound-control flush www.google.se
[root@localhost ~]# drill www.google.se | awk '$2~/[[:digit:]]+/ || $2=="Query" {print}'
www.google.se. 235 IN A 142.250.180.131
;; Query time: 0 msec
Unbound serve-expired does NOT work; the query takes 270 ms after the TTL has expired:
[root@localhost ~]# sleep 235
[root@localhost ~]# drill www.google.se | awk '$2~/[[:digit:]]+/ || $2=="Query" {print}'
www.google.se. 300 IN A 142.250.180.131
;; Query time: 270 msec
My TTL settings:
[root@localhost ~]# grep -E "^[^#].*(ttl|exp)" /etc/unbound/unbound.conf.d/*.conf | cut -d':' -f 2-
cache-min-ttl: 300
cache-max-ttl: 86400
cache-max-negative-ttl: 3600
infra-host-ttl: 3600
serve-original-ttl: no
serve-expired: yes
serve-expired-reply-ttl: 30
serve-expired-ttl: 259200
serve-expired-ttl-reset: no
serve-expired-client-timeout: 0
cachedb-check-when-serve-expired: yes
redis-expire-records: no
man unbound.conf says:
serve-expired: <yes or no>
If enabled, Unbound attempts to serve old responses from cache with a TTL of serve-expired-reply-ttl in the response without
waiting for the actual resolution to finish. The actual resolution answer ends up in the cache later on. Default is "no".
If anyone can explain to me why the serve-expired option doesn't work, I would greatly appreciate it.
Last edited by acubens (2024-06-16 22:56:19)
Here's what I think you're trying to achieve: Queries should be answered with stale cache entries (for up to 3 days!) to keep the answer time near 0. In my opinion this is a bit extreme - but O.K., it's your setup.
A. I use a caching nameserver (bind) serving only non-stale data which resolves www.google.se in around 25 ms after the TTL expires. Why is your query time 10 times higher? How does your unbound handle upstream queries?
B. According to the very detailed unbound documentation on that topic you may have configured neither pre-RFC 8767 ("primal") behavior nor RFC 8767 behavior. The primal configuration needs and uses "prefetch" - the RFC 8767 configuration requires "serve-expired-client-timeout" not to be 0 (which contradicts your purpose).
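The distinction in point B can be sketched as a small shell helper. This is only an illustration: `serve_expired_mode` is a hypothetical name, and the split is an assumption based on the description above, namely that `serve-expired-client-timeout: 0` means expired answers are returned right away, while a non-zero value means unbound first tries to resolve and falls back to expired data after that many milliseconds.

```shell
# Hypothetical helper (not part of unbound): describe which serve-expired
# behavior a pair of settings would select, under the assumption stated above.
# $1 = value of serve-expired (yes/no)
# $2 = value of serve-expired-client-timeout in msec
serve_expired_mode() {
    serve_expired=$1
    client_timeout=$2
    if [ "$serve_expired" != "yes" ]; then
        echo "serve-expired is off"
    elif [ "$client_timeout" -eq 0 ]; then
        echo "pre-RFC 8767: expired answers are returned immediately"
    else
        echo "RFC 8767: resolve first, fall back to expired data after ${client_timeout} msec"
    fi
}
```

On a running server the two values could be read with `unbound-control get_option serve-expired` and `unbound-control get_option serve-expired-client-timeout` and passed in as arguments.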
Here's what I think you're trying to achieve: Queries should be answered with stale cache entries (for up to 3 days!) to keep the answer time near 0. In my opinion this is a bit extreme - but O.K., it's your setup.
Exactly, this is the result I've had so far. I set up unbound a few years ago and it worked fine: it served clients expired responses (serve-expired=yes) with a TTL of 30 seconds (serve-expired-reply-ttl=30). Now it doesn't work anymore.
A. I use a caching nameserver (bind) serving only non-stale data which resolves www.google.se in around 25 ms after the TTL expires. Why is your query time 10 times higher? How does your unbound handle upstream queries?
Response times are even worse, sometimes 500-1500 ms. Could it be the recursion mode and DNSSEC?
B. According to the very detailed unbound documentation on that topic you may have configured neither pre-RFC 8767 ("primal") behavior nor RFC 8767 behavior. The primal configuration needs and uses "prefetch" - the RFC 8767 configuration requires "serve-expired-client-timeout" not to be 0 (which contradicts your purpose).
I had already activated the prefetch and prefetch-key options, while the serve-expired-client-timeout option activates a feature that I'm not interested in. In any case, even if I set serve-expired-client-timeout, the problem remains:
[root@arch ~]# unbound-control get_option prefetch
yes
[root@arch ~]# unbound-control get_option prefetch-key
yes
[root@arch ~]# unbound-control set_option cache-max-ttl 5
ok
[root@arch ~]# unbound-control set_option serve-expired-client-timeout 10
ok
[root@arch ~]# drill www.google.se | awk '$2~/^[[:digit:]]+$/ || $2=="Query" {print}'
www.google.se. 5 IN A 142.250.180.131
;; Query time: 688 msec
[root@arch ~]# sleep 5
[root@arch ~]# drill www.google.se | awk '$2~/^[[:digit:]]+$/ || $2=="Query" {print}'
www.google.se. 5 IN A 142.250.180.131
;; Query time: 498 msec
[root@arch ~]#
Could it be an unbound bug?
Response times are even worse, sometimes 500-1500ms. Could it be the recursion mode and DNSsec?
IMHO - no. My DNS server is a recursive one with DNSSEC enabled ("auto").
If you can pinpoint the problem's first occurrence to a specific date, you could look for an unbound update around that time and try the older version.
If you can pinpoint the problems occurrence to a specific date - you could look out for an unbound update at that time and try the older version.
The slowness of unbound when configured as a recursive server with DNSSEC is not just my problem; searching on Google turns up many users who wonder about this.
But my problem was a different one: serving expired DNS answers from the cache.
Now I understand what the problem is. It is the "cachedb-check-when-serve-expired" option, which is enabled by default. With it set to "no", unbound returns expired answers with the correct TTL:
[root@arch ~]# systemctl restart unbound
[root@arch ~]# unbound-control set_option cache-max-ttl: 5
ok
[root@arch ~]# lookup(){ drill $1 | awk '$2~/^([0-9]+|Query)$/'; }
[root@arch ~]# pause(){ s="Zz"; [ $1 -gt 0 ] && c=${s:$(($1%2)):1} && echo -n "$c$c$c..." && sleep 1 && pause $(($1-1)) || echo; }
[root@arch ~]# lookup www.google.se; pause 6; lookup www.google.se
www.google.se. 5 IN A 142.250.180.131
;; Query time: 623 msec
ZZZ...zzz...ZZZ...zzz...ZZZ...zzz...
www.google.se. 30 IN A 142.250.180.131
;; Query time: 0 msec
[root@arch ~]#
The first response has a TTL of 5 seconds (cache-max-ttl=5); the second response has a TTL of 30 seconds (serve-expired-reply-ttl=30). The expired response has been served.
I believe this is not normal.
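The check used above to tell an expired answer from a fresh one (TTL equal to serve-expired-reply-ttl and a query time of 0 msec) can be written as a filter for drill output. This is a rough sketch: `classify_answer` is a hypothetical helper, and the rule is only a heuristic, since a fresh cached answer whose remaining TTL happens to equal the reply TTL would be misclassified.

```shell
# Hypothetical helper: read drill output on stdin and guess whether the
# answer was served from expired cache.
# $1 = configured serve-expired-reply-ttl (defaults to 30, the value used in this thread)
classify_answer() {
    reply_ttl=${1:-30}
    awk -v r="$reply_ttl" '
        # Answer line: name, TTL, class, type, data
        $2 ~ /^[0-9]+$/ && $3 == "IN" { ttl = $2 }
        # Timing line printed by drill: ";; Query time: N msec"
        $2 == "Query" && $3 == "time:" { ms = $4 }
        END {
            if (ttl == r && ms == 0) print "likely served expired"
            else print "resolved upstream or served fresh"
        }'
}
```

It would be used as `drill www.google.se | classify_answer 30`.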
In the unbound manual, the "cachedb-check-when-serve-expired" option says:
"If enabled, the cachedb is checked before an expired response is returned.
[...]
If the cachedb also has no valid contents, the serve expired response is sent."
This doesn't happen:
[root@arch ~]# unbound-control set_option cachedb-check-when-serve-expired: yes
ok
[root@arch ~]# lookup www.google.se; pause 6; lookup www.google.se
www.google.se. 5 IN A 142.250.180.131
;; Query time: 1173 msec
ZZZ...zzz...ZZZ...zzz...ZZZ...zzz...
www.google.se. 5 IN A 142.250.180.131
;; Query time: 1169 msec
Basically, unbound serves expired answers when it finds them in its own cache, but not when it finds them in the redis cache!?
I don't know, it seems like a bug to me.
Confirmed bug:
https://github.com/NLnetLabs/unbound/issues/1064
https://github.com/ar51an/unbound-redis/issues/13
The package in the Arch repository (unbound-1.20.0) is buggy.
I installed the latest source code from: https://github.com/NLnetLabs/unbound/ar … master.zip
Now it works correctly. Even after unbound is restarted, the (expired) cached result is served from cachedb:
[root@arch ~]# lookup(){ drill $1 | awk '$2~/^([0-9]+|Query)$/'; }
[root@arch ~]# pause(){ s="Zz"; [ $1 -gt 0 ] && c=${s:$(($1%2)):1} && echo -n "$c$c$c..." && sleep 1 && pause $(($1-1)) || echo; }
[root@arch ~]# systemctl restart unbound
[root@arch ~]# unbound-control get_option cachedb-check-when-serve-expired
yes
[root@arch ~]# unbound-control set_option cache-max-ttl 5
ok
[root@arch ~]# lookup www.google.se; pause 6; lookup www.google.se
www.google.se. 30 IN A 142.250.180.131
;; Query time: 0 msec
ZZZ...zzz...ZZZ...zzz...ZZZ...zzz...
www.google.se. 0 IN A 142.250.180.131
;; Query time: 0 msec
[root@arch ~]# lookup www.google.se; pause 6; lookup www.google.se
www.google.se. 30 IN A 142.250.180.131
;; Query time: 0 msec
ZZZ...zzz...ZZZ...zzz...ZZZ...zzz...
www.google.se. 30 IN A 142.250.180.131
;; Query time: 0 msec
[root@arch ~]#