Hi all,
A couple of times a day I experience DNS failures. Firefox stops loading pages, and I can't ping hostnames from the command line. I can still ping IP addresses directly. For example (pinging www.google.com just hangs until I cancel with Ctrl-C):
lefty@lefty-pc:~$ ping www.google.com
^C
lefty@lefty-pc:~$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=15.6 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=117 time=14.9 ms
^C
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 14.872/15.212/15.552/0.340 ms
lefty@lefty-pc:~$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 192.168.1.1
I attempted to fix this by switching from netctl to NetworkManager, but the problem persists.
My network consists of a cable modem and a separate Linksys router. The router is 192.168.1.1.
I checked and I think I only have one network service running:
$ systemctl list-unit-files --state=enabled
UNIT FILE                           STATE   VENDOR PRESET
org.cups.cupsd.path                 enabled disabled
bluetooth.service                   enabled disabled
firewalld.service                   enabled disabled
getty@.service                      enabled enabled
libvirtd.service                    enabled disabled
lm_sensors.service                  enabled disabled
NetworkManager-dispatcher.service   enabled disabled
NetworkManager-wait-online.service  enabled disabled
NetworkManager.service              enabled disabled
org.cups.cupsd.service              enabled disabled
systemd-timesyncd.service           enabled enabled
tlp.service                         enabled disabled
libvirtd-ro.socket                  enabled disabled
libvirtd.socket                     enabled disabled
org.cups.cupsd.socket               enabled disabled
pcscd.socket                        enabled disabled
virtlockd.socket                    enabled disabled
virtlogd.socket                     enabled disabled
remote-fs.target                    enabled enabled
19 unit files listed.
I don't think my Windows desktop or my work Mac have this problem, but I don't spend that much time on those machines, so it's possible I haven't noticed the issue. The work machine is usually connected through a VPN, so it probably gets its own DNS.
Any suggestions on ways to diagnose this problem?
Offline
I don't know how you have your DNS setup...but in my case, if I were to have that problem, I might consider making /etc/resolv.conf immutable...for testing purposes.
Offline
OP, post the contents of resolv.conf and do more testing on the router with different devices.
You can also test whether the problem is local by using Google's or Cloudflare's DNS directly...
If setting the DNS server manually fixes your issue, you can then make resolv.conf immutable, but only if you consider its contents final.
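For reference, a rough sketch of that test. It assumes root access and a filesystem that supports chattr attributes. write_resolv_conf is only a hypothetical helper name, and 8.8.8.8 and 1.1.1.1 are the public Google and Cloudflare resolvers.

```shell
#!/bin/sh
# Hypothetical helper: write a resolv.conf-style file listing the given nameservers.
# $1 = target file, remaining arguments = nameserver IPs.
write_resolv_conf() {
    target=$1
    shift
    {
        echo "# Written by hand for DNS testing"
        for ns in "$@"; do
            echo "nameserver $ns"
        done
    } > "$target"
}

# Usage (as root), then lock the file so NetworkManager cannot rewrite it:
#   write_resolv_conf /etc/resolv.conf 8.8.8.8 1.1.1.1
#   chattr +i /etc/resolv.conf     # set the immutable attribute
#   lsattr /etc/resolv.conf        # the 'i' flag should now be listed
#   chattr -i /etc/resolv.conf     # undo once testing is done
```

Remember to clear the attribute afterwards, otherwise later network changes will silently fail to update the file.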
Offline
Thanks for the suggestions. I've made /etc/resolv.conf immutable, and set it to Google's DNS, 8.8.8.8.
The problem is still happening. These commands were all entered within the span of about a minute (note that at the very end, it starts resolving google.com correctly):
EDIT: added output with timestamps.
lefty@lefty-pc:~ 20:54:12$ ping www.google.com
PING www.google.com (172.217.6.68) 56(84) bytes of data.
64 bytes from sfo07s17-in-f68.1e100.net (172.217.6.68): icmp_seq=1 ttl=117 time=13.8 ms
64 bytes from sfo07s17-in-f68.1e100.net (172.217.6.68): icmp_seq=2 ttl=117 time=15.4 ms
lefty@lefty-pc:~ 20:54:14$ ping www.google.com
ping: www.google.com: Temporary failure in name resolution
lefty@lefty-pc:~ 20:54:25$ ping www.google.com
PING www.google.com (172.217.5.100) 56(84) bytes of data.
64 bytes from sfo03s07-in-f4.1e100.net (172.217.5.100): icmp_seq=1 ttl=117 time=20.6 ms
64 bytes from sfo03s07-in-f4.1e100.net (172.217.5.100): icmp_seq=2 ttl=117 time=17.1 ms
lefty@lefty-pc:~ 20:54:52$ ping www.google.com
ping: www.google.com: Temporary failure in name resolution
lefty@lefty-pc:~ 20:55:03$ ping www.google.com
PING www.google.com (216.58.195.68) 56(84) bytes of data.
64 bytes from sfo07s16-in-f68.1e100.net (216.58.195.68): icmp_seq=1 ttl=117 time=22.3 ms
EDIT 1: I can confirm that when the Arch laptop fails to ping google.com, both my wired Windows desktop and my Mac on the same wireless network are successful. So it appears the issue is with the Arch machine.
EDIT 2: If I do this a handful of times, Arch gets different IP addresses for google.com. But the Mac and the Windows machine keep getting the same IP each time (although they're different from each other). I'm not sure what that means, but it seems like Arch is actually doing a DNS query and getting a fresh result each time, whereas the other machines are either using a cached DNS result or hitting a more consistent DNS server?
Last edited by LeftyAce (2020-10-11 04:04:25)
Offline
systemd-resolved or dnsmasq provide a local DNS cache, but you should check what's wrong w/ the DNS server in the router.
It likely proxies the DNS of your ISP?
Does that DNS work consistently?
You can use dig, drill & nslookup to query specific DNS servers w/o having to juggle your global resolver config.
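A minimal sketch of that with dig, assuming it is installed (on Arch it should come with the bind package). query_servers is a hypothetical helper name, and the two server addresses in the usage line are just the router and Google's resolver from this thread.

```shell
#!/bin/sh
# Hypothetical helper: query one hostname against several DNS servers
# without touching resolv.conf. $1 = hostname, remaining arguments = server IPs.
query_servers() {
    name=$1
    shift
    for server in "$@"; do
        printf '%s via %s: ' "$name" "$server"
        # +short prints only the answers; +time and +tries bound the wait to about 2s
        dig +short +time=2 +tries=1 "@$server" "$name" A | tr '\n' ' '
        echo
    done
}

# Usage: ask the router first, then Google's resolver
#   query_servers www.google.com 192.168.1.1 8.8.8.8
# Rough equivalents with the other tools:
#   drill www.google.com @8.8.8.8
#   nslookup www.google.com 8.8.8.8
```

If one server answers reliably and the other times out now and then, that narrows things down considerably.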
Offline
Hi Seth. I'm not convinced there's anything wrong with the DNS server on the router, since other machines on the network don't have a problem, and bypassing the router by manually setting my DNS to 8.8.8.8 doesn't help. If that conclusion is incorrect I'd like to understand why.
I gave dig a try, using watch, time, and dig to do a lookup of www.google.com using 8.8.8.8 as the DNS once a second. The turnaround time to return the result was always under 100 ms.
By contrast, running the same combination of watch, time, and ping, I occasionally get 5-20 second hangs while it tries to resolve the hostname.
If it's helpful I can try to figure out how to capture the response times so I can plot them or calculate min and max, or the percentage of time that it hangs.
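Something like the following might work for capturing those numbers. It's only a sketch: time_lookup and summarize are hypothetical helper names, the 1000 ms "slow" threshold is arbitrary, and it assumes GNU date (for %N) and getent are available.

```shell
#!/bin/sh
# Hypothetical helpers for logging how long name resolution takes.

# Time one lookup through the system resolver (the same NSS path ping uses)
# and print the elapsed milliseconds. Needs GNU date for %N (nanoseconds).
time_lookup() {
    start=$(date +%s%N)
    getent ahosts "$1" > /dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Read one millisecond value per line on stdin; report count, min, max,
# and the share of lookups slower than the threshold in $1 (default 1000 ms).
summarize() {
    awk -v limit="${1:-1000}" '
        { v = $1 + 0 }
        NR == 1 { min = v; max = v }
        v < min { min = v }
        v > max { max = v }
        v > limit + 0 { slow++ }
        END {
            if (NR == 0) { print "no samples"; exit }
            printf "n=%d min=%dms max=%dms slow=%.1f%%\n", NR, min, max, 100 * slow / NR
        }'
}

# Usage: one lookup per second for a minute, keeping the raw values in a log
#   for i in $(seq 1 60); do time_lookup www.google.com; sleep 1; done \
#       | tee lookups.log | summarize 1000
```

The raw log can be plotted afterwards, and the same loop with dig in place of getent would give the comparison figures.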
Offline
If pinging an IP is fine and dig is fine, but pinging a domain is not, regardless of the DNS resolver, the problem is likely w/ /etc/nsswitch.conf and some other hosts module (e.g. an mdns leftover?)
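To illustrate the difference: dig talks to the DNS server directly, while ping resolves names through the modules listed on the hosts line of nsswitch.conf. A small sketch for inspecting that line follows; hosts_modules is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print the NSS modules on the "hosts:" line of an
# nsswitch.conf-style file, one per line, skipping [STATUS=action] items.
# $1 = file to read (defaults to /etc/nsswitch.conf).
hosts_modules() {
    awk '$1 == "hosts:" {
        for (i = 2; i <= NF; i++)
            if ($i !~ /=/) print $i
    }' "${1:-/etc/nsswitch.conf}"
}

# Usage:
#   hosts_modules                    # modules are tried in the order printed
#   getent hosts www.google.com      # resolves through nsswitch.conf, like ping
#   dig +short www.google.com        # queries the DNS server directly, bypassing NSS
```

If getent stalls at the same moments ping does while dig stays fast, one of the modules ahead of dns is the likely suspect.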
Offline
Here's the contents of nsswitch.conf:
# Name Service Switch configuration file.
# See nsswitch.conf(5) for details.
passwd: files systemd
group: files systemd
shadow: files
publickey: files
hosts: files mymachines myhostname resolve [!UNAVAIL=return] dns
networks: files
protocols: files
services: files
ethers: files
rpc: files
netgroup: files
I'm suspicious of the hosts line, but so far I haven't been able to determine what the default should be.
I do have OpenVPN installed on this machine (but I'm not running it currently), and I have previously set up an SSH server on the machine (also not running currently).
Offline
That hosts line is the default entry.
NetworkManager defaults to systemd-resolved, is that running ("ps aux | grep resolved")?
I'd cut out libvirt, firewalld & tlp to rule out interference from there (though TLP is unlikely since it should affect dig as well) for further tests.
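A quick sketch for checking those services, assuming systemctl is available (unit_state is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: report whether a systemd unit is active.
unit_state() {
    # is-active prints a state such as "active" or "inactive" and exits
    # non-zero when the unit is not active, hence the "|| true".
    state=$(systemctl is-active "$1" 2>/dev/null) || true
    echo "$1: ${state:-unknown}"
}

# Usage:
#   for u in systemd-resolved libvirtd firewalld tlp; do unit_state "$u"; done
#   resolvectl status                       # if resolved is active, lists the DNS servers in use
#   systemctl stop libvirtd firewalld tlp   # as root, for the duration of the test
```

Stopping the units rather than disabling them keeps the test reversible: they come back on the next boot.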
Offline