#1 2021-01-22 13:51:04

mrmonday
Member
Registered: 2008-11-15
Posts: 11

systemd-resolved resolution sporadically takes a long time

I have noticed that occasionally web pages take a really long time to load. Bringing up the debugger in Firefox shows that when it happens, it is waiting on DNS resolution. I've noticed the behaviour in other applications, and while I can't be sure it's DNS, the delays last a similar amount of time, so I will assume it is.

DNS resolution will occasionally take up to 10 seconds, with no apparent pattern - there is no particular set of domains which it happens for. It happens more frequently with domains I haven't visited since boot, but it isn't consistent.

Since I can't consistently reproduce the issue, I'm finding it hard to debug.

I have tried clearing the DNS cache in Firefox using about:networking#dns, clearing the systemd-resolved cache with resolvectl flush-caches, and restarting systemd-resolved - and I cannot reproduce the issue immediately after it happens.

My resolved.conf:

$ cat /etc/systemd/resolved.conf | grep -v '^#'

[Resolve]
DNS=1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflaredns.com
FallbackDNS=8.8.8.8#dns.google 8.8.4.4#dns.google
DNSSEC=yes
DNSOverTLS=yes
MulticastDNS=no
LLMNR=no

systemd-resolved status:

$ resolvectl status                            
Global
           Protocols: -LLMNR -mDNS +DNSOverTLS DNSSEC=yes/supported       
    resolv.conf mode: stub                                                
  Current DNS Server: 1.1.1.1#cloudflare-dns.com                          
         DNS Servers: 1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflaredns.com
Fallback DNS Servers: 8.8.8.8#dns.google 8.8.4.4#dns.google               

Link 2 (enp6s0)
Current Scopes: none                                                       
     Protocols: -DefaultRoute +LLMNR -mDNS +DNSOverTLS DNSSEC=yes/supported

Link 3 (enp5s0)
Current Scopes: none                                                       
     Protocols: -DefaultRoute +LLMNR -mDNS +DNSOverTLS DNSSEC=yes/supported

Link 5 (wlan0)
Current Scopes: none                                                       
     Protocols: -DefaultRoute +LLMNR -mDNS +DNSOverTLS DNSSEC=yes/supported

Link 6 (virbr0)
Current Scopes: none                                                       
     Protocols: -DefaultRoute +LLMNR -mDNS +DNSOverTLS DNSSEC=yes/supported

Link 7 (virbr0-nic)
Current Scopes: none                                                       
     Protocols: -DefaultRoute +LLMNR -mDNS +DNSOverTLS DNSSEC=yes/supported

I have tried disabling DNSSEC, and it seems to make the issue happen more frequently, but I'm not 100% sure of that. I'm considering disabling it anyway since it occasionally breaks resolution for some domains (most recently developer.apple.com).

I have not tried disabling DoT. Using the same DNS servers with DoT on my mobile (using the same internet connection) does not have problems.

journalctl -e -u systemd-resolved does not show any warnings, just informational startup/shutdown messages.

The only similar issues I've found from googling seem to be from a long time ago and don't really match the behaviour I'm seeing.

I have a couple of questions:

  • How do I go about debugging this?

  • Can you think of any way to make this more reproducible so I can more readily debug it?

#2 2021-01-22 18:17:26

mrmonday
Member
Registered: 2008-11-15
Posts: 11

Re: systemd-resolved resolution sporadically takes a long time

Fixed a typo.

Replaced:

DNS=1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflaredns.com

With:

DNS=1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflare-dns.com

in resolved.conf, and I am still experiencing the same issue.

#3 2021-01-22 18:30:26

mrmonday
Member
Registered: 2008-11-15
Posts: 11

Re: systemd-resolved resolution sporadically takes a long time

Ran:

# systemctl log-level debug

Flushed caches and restarted systemd-resolved. Unfortunately, there was no useful information in the logs.

#4 2021-01-23 09:28:43

Koatao
Member
Registered: 2018-08-30
Posts: 92

Re: systemd-resolved resolution sporadically takes a long time

Hello,

You could try monitoring traffic with a tool like Wireshark. Because DoT encrypts the queries, you will probably have to start by watching traffic to and from 1.1.1.1 and 1.0.0.1.
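
A minimal sketch of such a capture from the command line, assuming tcpdump is installed and that the resolvers are reached on the standard DoT port (TCP 853). The helper name dot_filter and the output file name dot.pcap are illustrative and not from this thread.

```shell
#!/bin/sh
# dot_filter (hypothetical helper): print a pcap filter that matches
# DNS-over-TLS traffic (TCP port 853) to or from the given resolver addresses.
# With no arguments it matches all traffic on TCP port 853.
dot_filter() {
    hosts=""
    for ip in "$@"; do
        if [ -n "$hosts" ]; then
            hosts="$hosts or host $ip"
        else
            hosts="host $ip"
        fi
    done
    if [ -n "$hosts" ]; then
        printf 'tcp port 853 and (%s)\n' "$hosts"
    else
        printf 'tcp port 853\n'
    fi
}

# Example (needs root); the resulting file can be opened in Wireshark:
#   tcpdump -i any -n -w dot.pcap "$(dot_filter 1.1.1.1 1.0.0.1)"
```

The payload stays encrypted, so a capture like this only shows connection setup, teardown and timing, which is still enough to spot gaps or resets around a slow lookup.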

You could also disable caching, and when a domain is taking a long time to resolve, try to resolve it again, just to see if it is domain related.

#5 2021-01-23 10:18:40

mrmonday
Member
Registered: 2008-11-15
Posts: 11

Re: systemd-resolved resolution sporadically takes a long time

Thanks for the suggestions!

I didn't see anything weird in Wireshark, but I won't rule it out.

I set Cache=no in resolved.conf, and haven't seen the delay since.

I found: https://github.com/systemd/systemd/blob … .c#L21-L23

Which led me to https://github.com/systemd/systemd/issues/5552 and https://github.com/systemd/systemd/comm … d7e44b7da2

Those 10 seconds look suspiciously like what I was seeing. If the issue continues to be "resolved" (ha!) over the next few hours I'll dig into this a little more.

#6 2021-01-23 10:25:16

mrmonday
Member
Registered: 2008-11-15
Posts: 11

Re: systemd-resolved resolution sporadically takes a long time

Obviously I spoke too soon:

[Screenshot: Firefox timings showing DNS resolution taking 10.08s]

#7 2021-01-23 12:14:01

Koatao
Member
Registered: 2018-08-30
Posts: 92

Re: systemd-resolved resolution sporadically takes a long time

Disable DoT and capture traffic (and disable DNSSEC too if you want to cut unnecessary traffic while you are capturing). You should then see the DNS traffic in cleartext and be able to find where resolution fails.

Also, do you see the same resolution delay in software other than your browser? Have you run any tests with dig or drill?
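
A rough sketch of such a test with dig, assuming the systemd-resolved stub listener is at its usual address 127.0.0.53. The helper name query_time_ms is made up for illustration; it only parses the "Query time" line that dig prints.

```shell
#!/bin/sh
# query_time_ms (hypothetical helper): read dig output on stdin and print
# the value of its ";; Query time: N msec" line, in milliseconds.
query_time_ms() {
    awk '/Query time:/ { print $4 }'
}

# Example usage. +tries=1 and +timeout=15 keep dig from retrying or giving up
# before a roughly 10 second stall has finished:
#   dig +tries=1 +timeout=15 @127.0.0.53 example.com | query_time_ms   # via systemd-resolved
#   dig +tries=1 +timeout=15 @1.1.1.1 example.com | query_time_ms      # plain DNS, bypassing resolved and DoT
```

If only the query through 127.0.0.53 is ever slow, that points at systemd-resolved or its DoT connection rather than at the network path or the browser.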

Last edited by Koatao (2021-01-23 12:14:44)

#8 2021-02-03 14:33:06

mrmonday
Member
Registered: 2008-11-15
Posts: 11

Re: systemd-resolved resolution sporadically takes a long time

I took 5 minutes to set up unbound instead of systemd-resolved.

I no longer get random delays when resolving, and don't have to spend any more time debugging. Thanks for your help everyone!

Last edited by mrmonday (2021-02-03 14:33:30)

#9 2021-02-04 07:51:15

tbg
Member
Registered: 2017-06-22
Posts: 72

Re: systemd-resolved resolution sporadically takes a long time

Please prepend [SOLVED] to your thread title.

#10 2021-02-07 14:20:53

Koatao
Member
Registered: 2018-08-30
Posts: 92

Re: systemd-resolved resolution sporadically takes a long time

I don't think prepending [SOLVED] is a good idea.
Anyone looking to solve the same kind of problem with systemd-resolved will not find a solution here, only hints on gathering information or on bypassing the problem by removing systemd-resolved from the equation.

We should also mention that systemd-networkd relies on systemd-resolved for some of its features, so replacing systemd-resolved with another DNS cache/stub/server is not always possible.

#11 2021-06-27 18:00:57

ben781
Member
Registered: 2016-12-11
Posts: 18

Re: systemd-resolved resolution sporadically takes a long time

I have this same problem with Cloudflare DNS over TLS in systemd-resolved. It seems that Cloudflare closes the TCP connection before systemd-resolved expects it to be closed. Then systemd-resolved tries to reuse the previous connection. It takes 10 seconds to time out before systemd-resolved moves on to the next server in your list.

I have “solved” the problem by configuring systemd-resolved to use Google Public DNS, which doesn't suffer from this problem.

Last edited by ben781 (2021-06-28 02:42:09)

#12 2021-12-30 23:37:52

joanbrugueram
Member
Registered: 2018-11-12
Posts: 21

Re: systemd-resolved resolution sporadically takes a long time

I'm also experiencing the same problem. I can test the problem consistently by running `resolvectl query example.com --cache=no` a few times.

In the following cases I can reproduce the problem (most of the time the query takes a few milliseconds, but roughly 10% of the time it takes slightly over 10 seconds):
* DNS=1.0.0.1 (Cloudflare) and DNSOverTLS=yes
* DNS=8.8.8.8 (Google) and DNSOverTLS=yes

In the following cases I cannot reproduce the problem (the query always resolves in a few milliseconds):
* DNS=9.9.9.10 (Quad9) and DNSOverTLS=yes
* Any of the three DNS and DNSOverTLS=no

I have reproduced the problem with three different computers (with different network adapters), two different routers from different ISPs, and two different OSes (Arch and Fedora).

In all three cases with DNSOverTLS=yes, capturing packets in Wireshark always shows some activity at time "x" and then some more activity at time "x+10sec". This includes the Quad9 case, which seems to work well from the user's point of view; for Cloudflare and Google it happens whether or not the problem reproduces.

Something very interesting is that both `resolvectl query -4 example.com --cache=no` and `resolvectl query -6 example.com --cache=no` always resolve instantly; the problem reproduces only with `resolvectl query example.com --cache=no` (which fetches both IPv4/A and IPv6/AAAA DNS records).
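
The comparison above can be scripted roughly as follows. This is only a sketch: the helper name slow_runs, the run count of 50 and the 5 second threshold are arbitrary choices for illustration, and timing uses whole seconds from date, which is coarse but enough to catch a 10 second stall.

```shell
#!/bin/sh
# slow_runs (hypothetical helper): run a command N times and print how many
# runs took at least LIMIT seconds (measured in whole seconds).
# Usage: slow_runs N LIMIT command [args...]
slow_runs() {
    n=$1
    limit=$2
    shift 2
    slow=0
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        if [ $((end - start)) -ge "$limit" ]; then
            slow=$((slow + 1))
        fi
        i=$((i + 1))
    done
    echo "$slow"
}

# Example usage, one line per case described above:
#   slow_runs 50 5 resolvectl query example.com --cache=no
#   slow_runs 50 5 resolvectl query -4 example.com --cache=no
#   slow_runs 50 5 resolvectl query -6 example.com --cache=no
```

A non-zero count for the first command together with zero for the -4 and -6 variants would match the behaviour described in this post.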

Last edited by joanbrugueram (2021-12-30 23:48:13)

#13 2022-01-15 21:46:04

joanbrugueram
Member
Registered: 2018-11-12
Posts: 21

Re: systemd-resolved resolution sporadically takes a long time
