Hi! I have a perplexing error. I have a box with Arch on it (called BASE for ease of reading), and it hosts an Arch VM (which I'll call VM).
When I run `ping -c 3 www.google.com` on BASE, it will either hang indefinitely, or hang for a really long time and respond correctly (meaning, like this: https://wiki.archlinux.org/title/Networ … tion#Ping).
It does this regardless of whether or not the VM is running. When I run the same command on the VM, it responds quickly and correctly (as defined above).
I don't know where to begin troubleshooting this. I've retried on BASE with nginx disabled, and I also tried disabling Resilio Sync, Jellyfin, Tailscale, FreshRSS and wallabag.
I am trying to install nethogs to get a better look, but there is no connection. Strangely, all of the applications listed work!
I strongly believe this is a failure in domain name resolution -- pinging 8.8.8.8 instead of www.google.com responds quickly and correctly...
PS: The internet connection just worked long enough to download nethogs. I didn't really find anything. BUT I did realize in addition to the apps listed, I'm connected to the BASE os over SSH (on LAN)?
I'm trying to be thorough, but I have probably overlooked something obvious.
/etc/systemd/network/25-br0-en.network:
[Match]
Name=en*
[Network]
Bridge=br0
/etc/systemd/network/25-br0.network:
[Match]
Name=br0
[Network]
DNS=192.168.40.1
Address=192.168.40.xx/24
Gateway=192.168.40.1
/etc/systemd/network/25-br0.netdev:
[NetDev]
Name=br0
Kind=bridge
/etc/systemd/network/90-old-wired.network:
[Match]
Name=eno1
[Network]
DHCP=yes
resolvectl status on BASE:
Global
Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Fallback DNS Servers: 1.1.1.1#cloudflare-dns.com 9.9.9.9#dns.quad9.net 8.8.8.8#dns.google 2606:4700:4700::1111#cloudflare-dns.com
2620:fe::9#dns.quad9.net 2001:4860:4860::8888#dns.google
Link 2 (br0)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.40.1
DNS Servers: 192.168.40.1
Link 3 (wlp2s0)
Current Scopes: none
Protocols: -DefaultRoute +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
Link 4 (eno1)
Current Scopes: none
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Link 9 (tailscale0)
Current Scopes: none
Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Link 10 (vnet0)
Current Scopes: LLMNR/IPv6 mDNS/IPv6
Protocols: -DefaultRoute +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
resolvectl status on VM:
Global
Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: foreign
Fallback DNS Servers: 1.1.1.1#cloudflare-dns.com 9.9.9.9#dns.quad9.net 8.8.8.8#dns.google 2606:4700:4700::1111#cloudflare-dns.com
2620:fe::9#dns.quad9.net 2001:4860:4860::8888#dns.google
DNS Domain: ~.
Link 2 (enp1s0)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.40.1
DNS Servers: 192.168.40.1
Thanks!
Last edited by zZzUP3RzZz (2024-11-09 23:24:46)
Offline
please post console output as text in code-tags instead of screenshots
Offline
Fixed!
EDIT: The image vs code block thing, not the issue
Last edited by zZzUP3RzZz (2024-11-10 02:31:18)
Offline
Just a few things I noticed:
Why is "/etc/systemd/network/90-old-wired.network" still around? Do you know that adding an adapter to a bridge makes it a slave to that bridge and it loses any meaningful OSI level 3 functionality? Take a look at the output of "ip a" on the host.
You somehow tinkered with "/etc/resolv.conf" inside the VM but not on the host ("resolv.conf mode: foreign"). Have you compared "/etc/resolv.conf" on the host and the VM?
Offline
Just a few things I noticed:
Why is "/etc/systemd/network/90-old-wired.network" still around? Do you know that adding an adapter to a bridge makes it a slave to that bridge and it loses any meaningful OSI level 3 functionality? Take a look at the output of "ip a" on the host.
You somehow tinkered with "/etc/resolv.conf" inside the VM but not on the host ("resolv.conf mode: foreign"). Have you compared "/etc/resolv.conf" on the host and the VM?
I thought keeping 90-old-wired.network was necessary. I have deleted it, but the issue still persists.
Here is output of ip a:
on host:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 02:31:f2:xx:xx:xx brd ff:ff:ff:ff:ff:ff
inet 192.168.xx.xx/xx brd 192.168.xx.255 scope global br0
valid_lft forever preferred_lft forever
inet6 fe80::31:f2ff:xxxx:xxxx/xx scope link proto kernel_ll
valid_lft forever preferred_lft forever
3: wlp2s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 0c:7a:15:xx:xx:xx brd ff:ff:ff:ff:ff:ff
4: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UP group default qlen 1000
link/ether a8:a1:59:xx:xx:xx brd ff:ff:ff:ff:ff:ff
altname enp0s31f6
9: tailscale0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1280 qdisc fq_codel state UNKNOWN group default qlen 500
link/none
inet6 fe80::43bf:84d5:xxxx:xxxx/xx scope link stable-privacy proto kernel_ll
valid_lft forever preferred_lft forever
10: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UNKNOWN group default qlen 1000
link/ether fe:54:00:xx:xx:xx brd ff:ff:ff:ff:ff:ff
inet6 fe80::fc54:ff:xxxx:xxxx/xx scope link proto kernel_ll
valid_lft forever preferred_lft forever
on VM:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:xx:xx:xx brd ff:ff:ff:ff:ff:ff
inet 192.168.xx.xx/xx brd 192.168.xx.255 scope global enp1s0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:xxxx:xxxx/xx scope link proto kernel_ll
valid_lft forever preferred_lft forever
I think the reason it looks that way on the vm vs the host is that when I set up the vm in virt-manager I had already configured the bridge and entered it in during setup..
I have however compared the resolv.confs and they are the exact same..
nameserver 127.0.0.53
options edns0 trust-ad
search .
Offline
More tests:
Is "/etc/resolv.conf" on both instances a link:
lrwxrwxrwx 1 root root 37 Oct 30 2023 /etc/resolv.conf -> /run/systemd/resolve/stub-resolv.conf
Is resolved on both instances listening on the ports:
[thc@box ~]$ sudo ss -l -p -u -n
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
UNCONN 0 0 127.0.0.54:53 0.0.0.0:* users:(("systemd-resolve",pid=xxx,fd=yy))
UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=xxx,fd=yy))
What happens if you try
resolvectl query www.google.com
on the host?
Offline
More tests:
Is "/etc/resolv.conf" on both instances a link:
lrwxrwxrwx 1 root root 37 Oct 30 2023 /etc/resolv.conf -> /run/systemd/resolve/stub-resolv.conf
Is resolved on both instances listening on the ports:
[thc@box ~]$ sudo ss -l -p -u -n
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
UNCONN 0 0 127.0.0.54:53 0.0.0.0:* users:(("systemd-resolve",pid=xxx,fd=yy))
UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=xxx,fd=yy))
What happens if you try
resolvectl query www.google.com
on the host?
Hm! Resolv.conf on host is a link but on VM it is not!
host:
lrwxrwxrwx 1 root root 39 Sep 22 23:03 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf
vm:
-rw-r--r-- 1 root root 920 Oct 22 17:22 /etc/resolv.conf
The results of ss are not as interesting:
UNCONN 0 0 127.0.0.54:53 0.0.0.0:* users:(("systemd-resolve",pid=478,fd=22))
UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=478,fd=20))
as well as
UNCONN 0 0 0.0.0.0:5353 0.0.0.0:* users:(("systemd-resolve",pid=432,fd=15))
UNCONN 0 0 0.0.0.0:5355 0.0.0.0:* users:(("systemd-resolve",pid=432,fd=11))
and
UNCONN 0 0 [::]:5353 [::]:* users:(("systemd-resolve",pid=478,fd=16))
UNCONN 0 0 [::]:5355 [::]:* users:(("systemd-resolve",pid=478,fd=13))
appear on both outputs.
resolvectl query www.google.com hangs the same as ping
Thanks!
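A side note on the VM's regular-file /etc/resolv.conf: if the goal is for the VM to match the host's stub mode, the usual systemd-resolved approach is to turn the file back into a symlink. Below is a minimal sketch, assuming bash. The function name use_stub_resolv_conf and its optional ROOT argument are made up here so it can be tried on a scratch directory first. This only aligns the VM with the host; it is not expected to change the slow lookups on the host.

```shell
#!/usr/bin/env bash
# use_stub_resolv_conf [ROOT]
# Hypothetical helper: points ROOT/etc/resolv.conf at systemd-resolved's stub file.
# ROOT defaults to empty (the real /etc); pass a scratch directory for a dry run.
use_stub_resolv_conf() {
    local root="${1:-}"
    local conf="${root}/etc/resolv.conf"
    local target="../run/systemd/resolve/stub-resolv.conf"

    # Nothing to do if the link already points at the stub file.
    if [ "$(readlink "$conf" 2>/dev/null)" = "$target" ]; then
        echo "already linked"
        return 0
    fi

    # Keep a copy of whatever is there now, then replace it with the symlink.
    if [ -e "$conf" ] || [ -L "$conf" ]; then
        cp -a "$conf" "${conf}.bak"
    fi
    ln -sfn "$target" "$conf"
    echo "linked"
}
```

Run as root inside the VM with no argument, it should leave the same link the host shows above, and resolvectl status should then report stub mode instead of foreign.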
Offline
Hmmm.
Can you ping 192.168.40.1 from your host?
Is there a firewall involved?
Offline
Both machines can ping 192.168.40.1 fine, and I'm almost certain there is no firewall in the mix.
A piece of the puzzle that may be important (I mentioned this in the OP but it bears repeating) is that the host machine will sometimes (not always) respond to `ping -c 3 www.google.com` correctly, just after a very long time. But it does respond! Sometimes.
Offline
What happens if you dissolve the bridge (temporarily remove all bridge-"conf"s and replace it with your old wired.conf)?
Offline
Ok, so that fixes the issue on the host machine!
But now, how do I set the bridge up for the VM?
(I originally set it up by following this)
Last edited by zZzUP3RzZz (2024-11-11 03:28:37)
Offline
Does the problem reappear if you just set up "half a bridge" (bridge and en*-slave only - no vnet* attached or even active)?
Last edited by -thc (2024-11-11 07:58:03)
Offline
Does the problem reappear if you just set up "half a bridge" (bridge and en*-slave only - no vnet* attached or even active)?
I don't know how to do that...
Would that be deleting the 25-br0.network file?
Offline
No - just reinstate all three bridge conf-files and check via
brctl show
(as root) that the en*-Adapter is the only bridge member.
Offline
No - just reinstate all three bridge conf-files and check via
brctl show
(as root) that the en*-Adapter is the only bridge member.
I still don't know what you mean. Reinstate? Also, there is no brctl because it is deprecated and replaced by 'bridge link'
Last edited by zZzUP3RzZz (2024-11-11 17:48:07)
Offline
Also, there is no brctl because it is deprecated and replaced by 'bridge link'
It doesn't look like it to me:
https://man.archlinux.org/man/brctl.8.en
https://archlinux.org/packages/extra/x8 … dge-utils/ - which is a dependency for docker - hence I doubt it's deprecated
anyway - what thc is requesting: does the issue on the host also happen with NO vm running (i.e. no vnet0 connected to br0)?
Offline
Huh. This was my source for it being deprecated. I'll install it.
Yes, the problem still exists with no VM running. I thought I mentioned that in the OP but I did not.
Offline
It is marked deprecated upstream, but it should still work. https://wiki.linuxfoundation.org/networking/bridge
The wiki page has documentation for using iproute2 as well as brctl: https://wiki.archlinux.org/title/Network_bridge
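For the "is the en* adapter the only bridge member" check without installing bridge-utils, iproute2 can list the interfaces enslaved to a bridge. A small sketch, assuming bash and awk; the helper name bridge_members is made up here. It only parses the one-line-per-interface output of ip, so it can be fed saved output as well.

```shell
#!/usr/bin/env bash
# bridge_members: reads the output of `ip -o link show master BRIDGE` on stdin
# and prints one member interface name per line.
# Hypothetical helper name; the parsing assumes the usual "4: eno1: <FLAGS> ..." layout.
bridge_members() {
    # Field 2 (split on ": ") is the interface name; strip any "@peer" suffix.
    awk -F': ' '{ sub(/@.*/, "", $2); print $2 }'
}

# Usage (on the host):
#   ip -o link show master br0 | bridge_members
```

With the VM shut down this should print only the wired adapter (eno1 in this thread). With the VM running, vnet0 should show up as well, matching the "master br0" entries in the ip a output above.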
Offline
I've set up a simple bridge like yours in my Arch VM with active systemd-resolved and everything works:
# 10-br0.netdev
[NetDev]
Name=br0
Kind=bridge
# 10-tapvm.netdev
[NetDev]
Name=tapvm
Kind=tap
# 15-ens33.network
[Match]
Name=ens33
[Network]
Bridge=br0
# 15-tapvm.network
[Match]
Name=tapvm
[Network]
Bridge=br0
# 20-br0.network
[Match]
Name=br0
[Network]
DHCP=yes
[DHCPv4]
UseDNS=yes
I have no idea why your setup behaves differently.
Offline
I have no idea why your setup behaves differently.
Ah. Thank you for your help anyway
Piece of the puzzle (should have tried this sooner)... replacing 25-br0.network with:
[Match]
Name=br0
[Network]
DHCP=yes
fixes the issue -- So it must be something to do with the static ip address. Which doesn't make sense, because the network on the VM is set up exactly the same (only difference is a number in the IP Address).
Again, I followed the directions in here. I just went back through a second time as well.
EDIT: Maybe the VM is ignoring the static IP address file and running DHCP anyway. This would explain why the VM works and the host doesn't -- both static IP address setups are wrong? How do you check if DHCP is running? I used dhcping on the VM, though, and it replied "no answer".
EDIT EDIT: That's not it -- I changed the static IP Address field on the VM and it changed to the new IP correctly.
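On "how do you check if DHCP is running": one rough indicator is how ip reports the address. A leased address normally carries a finite lifetime and the "dynamic" flag, while a statically configured one shows "valid_lft forever", which is what both br0 and enp1s0 show in the ip a output earlier in the thread. A sketch, assuming bash; the helper name addr_origin is made up, and it only inspects text from ip, so it says nothing about whether a DHCP client is running in the background.

```shell
#!/usr/bin/env bash
# addr_origin: reads `ip -4 -o addr show dev IFACE` output on stdin and prints
# "dynamic" or "static" for each global IPv4 address found.
# Hypothetical helper; returns 1 if no global address was seen.
addr_origin() {
    local line found=1
    while IFS= read -r line; do
        case "$line" in
            # The "dynamic" flag means the address has a finite lifetime (e.g. a lease).
            *" scope global "*" dynamic "*) echo "dynamic"; found=0 ;;
            # A global address without that flag is a permanent (static) one.
            *" scope global "*)             echo "static";  found=0 ;;
        esac
    done
    return "$found"
}

# Usage:
#   ip -4 -o addr show dev br0 | addr_origin
```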
Last edited by zZzUP3RzZz (2024-11-11 23:46:59)
Offline
Well, maybe someone else can enlighten us both, as I, too, have difficulty really understanding how a bridge works.
I also play around with VMs and want to take advantage of the PXE setup I have in place, so for me using a bridge instead of NAT is far easier (I don't know if QEMU even supports PXE like VirtualBox does).
I have my bridge configured to just use DHCP instead of a static IP and also don't have VLANs in place (which is a good idea if one uses VMs for something like hosting or cameras or other stuff that should mix with the rest of the LAN).
From how I understand it: a network bridge is a sort of virtual switch to which the host system as well as the VMs connect to share one physical network interface.
For this virtual switch to work it requires an upstream link, which is provided by enslaving the physical interface, so anything that connects to it (the VMs with their vnetX) can get a connection to the physical LAN.
Now what I fail to understand is this: as the host also requires a connection, the bridge itself more or less becomes its new main network interface, and hence it now requires an IP.
To me this somewhat contradicts the analogy that a bridge is merely a more or less dumb switch, which, although it has to handle ARP tables, usually doesn't get its own IP unless it's a smart managed switch, and the host itself should also get something like a vnet virtual interface.
I'm not quite sure how it's supposed to work, but it should work with a static IP the same as with DHCP. Or are bridges supposed to work with DHCP only?
Offline
It's not that complicated.
A bridge connects two separate networks on OSI level 2 via MAC address tables and doesn't care about IP addresses (like an OSI level 2 switch).
A bridge is represented as an interface (e.g. br0).
The bridge works independently from the OSI level 3 (IP) status of the interface. Without an IP address it's a "headless" bridge.
On a host with only one physical interface you probably need to assign IPv4/IPv6 addresses to the bridge interface, and it will work like a physical interface on OSI level 3. Additionally the bridge still works "below" that on OSI level 2 (if you have more than one physical adapter you may choose bridges, headless bridges or unbridged interfaces).
Last edited by -thc (2024-11-13 06:07:22)
Offline
Do you think I need to assign IP addresses to the bridge (to make it act like a physical interface on OSI level 3)? There is more than one interface -- but only one is connected to the internet
I know my issue is connected to static IP vs DHCP, but the IP address config is the same on VM and host, and the VM works. Beyond that, if I use ip a on the host, the br0 interface shows as up, with the same IP that I specified in the config. (If I change the IP address in the config, ip a shows the change, but the problem persists.)
EDIT: Replacing the DNS line with DNS=1.1.1.1 8.8.8.8 does work! I had a feeling it might, but I still feel like it's just working around the problem (especially because the same config without the change works on the VM).
So I don't think this is solved yet.
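For reference, the workaround config described in the EDIT above can be sketched like this, assuming bash. The helper name write_br0_network is made up, and the address is passed in as an argument because it is elided (192.168.40.xx/24) in this thread. Only the DNS line differs from the original 25-br0.network.

```shell
#!/usr/bin/env bash
# write_br0_network DIR ADDRESS
# Hypothetical helper: writes DIR/25-br0.network with the static setup from this
# thread, but with public resolvers instead of the router as DNS server.
write_br0_network() {
    local dir="$1" address="$2"
    cat > "${dir}/25-br0.network" <<EOF
[Match]
Name=br0

[Network]
# Workaround: public resolvers instead of DNS=192.168.40.1
DNS=1.1.1.1 8.8.8.8
Address=${address}
Gateway=192.168.40.1
EOF
}

# Usage (as root), followed by a restart of the network service:
#   write_br0_network /etc/systemd/network <your static address>/24
#   systemctl restart systemd-networkd
```

Afterwards resolvectl status should list 1.1.1.1 and 8.8.8.8 under Link 2 (br0) instead of 192.168.40.1. This only sidesteps the router's DNS, it does not explain why the router answers the VM but not the host.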
Last edited by zZzUP3RzZz (2024-11-12 20:39:41)
Offline
Do you think I need to assign IP addresses to the bridge (to make it act like a physical interface on OSI level 3)? There is more than one interface -- but only one is connected to the internet
It should not matter. Regardless of how the bridge interface acquires its IP configuration, it should work either way.
I know my issue is connected to Static IP vs DHCP, but the IP address config is the same on VM and host, and the VM works.
Not exactly the same?
EDIT: Replacing the DNS line with DNS=1.1.1.1 8.8.8.8 does work! I had a feeling it might, but I still feel like it's just working around the problem (especially because the same config without the change works on the VM).
So I don't think this is solved yet.
This sounds like your router (192.168.40.1) has a (DNS) problem with two different MACs/IP addresses on a single physical link.
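One way to test that theory is to ask the router's DNS directly, bypassing systemd-resolved, from both the host and the VM, and compare with a public resolver. A rough sketch, assuming bash and that dig (from the bind package) is installed. The helper name dns_probe and the DIG override variable are made up here.

```shell
#!/usr/bin/env bash
# Sketch: ask one DNS server directly, bypassing systemd-resolved.
# Assumes `dig` is installed; DIG can be overridden with another command.
DIG="${DIG:-dig}"

# dns_probe SERVER NAME
# Prints "SERVER ok <first answer>" or "SERVER FAILED" (and returns 1).
dns_probe() {
    local server="$1" name="$2" answer
    # +time=2 +tries=1 keeps an unresponsive server from hanging for long;
    # +short prints only the answer records.
    if answer="$("$DIG" "@${server}" "$name" A +short +time=2 +tries=1 2>/dev/null)" \
        && [ -n "$answer" ]; then
        printf '%s ok %s\n' "$server" "$(printf '%s\n' "$answer" | head -n 1)"
    else
        printf '%s FAILED\n' "$server"
        return 1
    fi
}

# Usage, on the host and then on the VM:
#   dns_probe 192.168.40.1 www.google.com
#   dns_probe 1.1.1.1 www.google.com
```

If 192.168.40.1 fails or is slow only from the host while 1.1.1.1 answers from both, that would point at the router's DNS handling rather than at systemd-resolved or the bridge itself.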
Offline
Sorry, I just now saw this.
The only difference between the two config files is the IP set, other than that it is the same.
This sounds like your router (192.168.40.1) has a (DNS) problem with two different MACs/IP addresses on a single physical link.
That sounds like the likely problem (considering I have no idea what else it would be). However, if that were the case I think the host connection would function when the vm is down. Also, I don't know how to troubleshoot/fix that. The DNS (1.1.1.1) setting is a hack but at least it lets me use the internet, making this issue less urgent; I would like to get it "right" though
Offline