
#1 2024-08-13 01:37:26

gcb
Member
Registered: 2014-02-12
Posts: 212

help understand nftables

Edit: I'm an idiot. Please see the 4th post. I still have a question on how to proceed with nftables after identifying the actual problem.


I'm having a weird issue with nft and podman's automatic rules. When the podman rules are "on" AND the container actually listens on the port, the "accept" rule in my input chain does not trigger (the counter doesn't increment). If podman is not running, or it is running (with or without the port mapped) and the container is NOT handling connections on the port, then the accept rule in the input chain triggers!

It's confusing. It is reproducible every time, so I reduced the complexity to a test container running only `netcat -l 0.0.0.0 -p 9999`.

I run that with `-p 0.0.0.0:9999:9999` on podman. (I tried ready-made images, one I just built with buildah+pacstrap, starting via Quadlet with `PublishPort=0.0.0.0:9999:9999/tcp`, etc.)

I've disabled all the rootless features and am running it the standard way as root, via systemd, with all the recommended patterns.

I confirmed the port is listening:

# ss -apn4
7:tcp   LISTEN 0      4096          0.0.0.0:9999       0.0.0.0:*     users:(("conmon",pid=13011,fd=5))

# podman ps --format "{{.Ports}}"
0.0.0.0:9999->9999/tcp, 9999/tcp

I've simply opened the port in the standard nftables config file that gets installed with the base package (not using any other firewall package!)

# diff /etc/nftables.conf /etc/nftables.conf.original_pacman
 ...
 table inet filter {
    ...
+    tcp dport 9999 counter name "cntAllow9999" accept
    ...

Now, the tests:

Case 1: podman not running.

nftables only contains the above 'table inet filter'. Connections from other hosts on the LAN reach the host, increment the "cntAllow9999" counter, and immediately fail with "failed: Connection refused" because there's nothing listening on the port. But the important part: nft allowed it and incremented the counter. If I start netcat locally I can handle the connection just fine.

Case 2: podman attached to that port, but container does nothing with it.

If I start a container with the port published but nothing handling the connection, e.g. `podman run --publish 0.0.0.0:9999:9999/tcp bashcontainer /usr/bin/bash`, then again everything "works". Hosts on the LAN can connect, and the connection is immediately refused as nothing is listening. The cntAllow9999 counter gets incremented.

Case 3: podman handling the connection

Now, if I do the same thing as in case 2, but start `netcat -l 9999` in the container, or any other service that opens that port, it gets very strange. Note that the nftables rules added by podman are exactly the SAME in case 2 and here. But now only the host running podman can open a connection (using localhost or the LAN interface IP); the cntAllow9999 counter increments fine and the container shows the connection. BUT if any host on the LAN tries to connect, it now times out and, more weirdly, the cntAllow9999 counter doesn't increment. I have absolutely no idea what "gets" the connection in this case and how that happens *before* the input chain. I even tried changing the input chain priority to -1 and nothing changed: it still doesn't increment the cntAllow9999 counter.
If I cancel `netcat`, the LAN host's connection goes back to being refused and the cntAllow9999 counter increments as usual!
The container has no firewall whatsoever... but come to think of it, it is using the same kernel as the host. Overthinking this is confusing.

Here's the network. I never touched it other than starting things as podman expects and passing the publish-port options:

 # podman network inspect podman
[
     {
          "name": "podman",
          "id": "2f259bab93aaaaa2542ba43ef33eb990d0999ee1b9924b557b7be53c0b7a1bb9",
          "driver": "bridge",
          "network_interface": "podman0",
          "created": "2024-08-13T01:45:58.885465578Z",
          "subnets": [
               {
                    "subnet": "10.88.0.0/16",
                    "gateway": "10.88.0.1"
               }
          ],
          "ipv6_enabled": false,
          "internal": false,
          "dns_enabled": false,
          "ipam_options": {
               "driver": "host-local"
          },
          "containers": {
               "2ec96ef31808760b21bbacf4a1f0a14a495e68fed9a4c0a9bebe8e46046d0c1d": {
                    "name": "beautiful_ishizaka",
                    "interfaces": {
                         "eth0": {
                              "subnets": [
                                   {
                                        "ipnet": "10.88.0.21/16",
                                        "gateway": "10.88.0.1"
                                   }
                              ],
                              "mac_address": "4e:3a:98:0f:52:3c"
                         }
                    }
               }
          }
     }
]

Here's my ruleset when podman is running:

table inet filter { # handle 18
        counter cntInputdrop { # handle 3
                packets 39 bytes 6535
        }

        counter cntForwarddrop { # handle 4
                packets 65 bytes 4056
        }

        counter cntPkttype { # handle 5
                packets 6 bytes 720
        }

        counter cntAllowSSH { # handle 6
                packets 2 bytes 120
        }

        counter cntAllow9999 { # handle 13
                packets 0 bytes 0
        }

        chain input { # handle 1
                type filter hook input priority -1; policy drop;
                ct state invalid drop comment "early drop of invalid connections" # handle 14
                ct state { established, related } accept comment "allow tracked connections" # handle 16
                iif "lo" accept comment "allow from loopback" # handle 17
                ip protocol icmp accept comment "allow icmp" # handle 18
                meta l4proto ipv6-icmp accept comment "allow icmp v6" # handle 19
                tcp dport 22 counter name "cntAllowSSH" accept # handle 20
                tcp dport 9999 counter name "cntAllow9999" accept # handle 26
                meta pkttype host limit rate 5/second burst 5 packets counter name "cntPkttype" reject with icmpx admin-prohibited # handle 28
                counter name "cntInputdrop" log # handle 29
        }

        chain forward { # handle 2
                type filter hook forward priority 0; policy drop;
                counter name "cntForwarddrop" log # handle 30
        }
}
# Warning: table ip nat is managed by iptables-nft, do not touch!
table ip nat { # handle 19
        chain POSTROUTING { # handle 4
                type nat hook postrouting priority 100; policy accept;
                counter packets 7766 bytes 701481 jump NETAVARK-HOSTPORT-MASQ # handle 12
                ip saddr 10.88.0.0/16 counter packets 1 bytes 72 jump NETAVARK-1D8721804F16F # handle 59
        }

        chain NETAVARK-HOSTPORT-SETMARK { # handle 6
                counter packets 0 bytes 0 xt target "MARK" # handle 10
        }

        chain NETAVARK-HOSTPORT-MASQ { # handle 7
                xt match "comment" meta mark & 0x00002000 == 0x00002000 counter packets 0 bytes 0 xt target "MASQUERADE" # handle 11
        }

        chain NETAVARK-HOSTPORT-DNAT { # handle 9
                tcp dport 9999 xt match "comment" counter packets 3 bytes 180 jump NETAVARK-DN-1D8721804F16F # handle 64
        }

        chain PREROUTING { # handle 17
                type nat hook prerouting priority -100; policy accept;
                xt match "addrtype" counter packets 55 bytes 3804 jump NETAVARK-HOSTPORT-DNAT # handle 18
        }

        chain OUTPUT { # handle 19
                type nat hook output priority -100; policy accept;
                xt match "addrtype" counter packets 0 bytes 0 jump NETAVARK-HOSTPORT-DNAT # handle 20
        }

        chain NETAVARK-1D8721804F16F { # handle 56
                ip daddr 10.88.0.0/16 counter packets 0 bytes 0 accept # handle 57
                ip daddr != 224.0.0.0/4 counter packets 0 bytes 0 xt target "MASQUERADE" # handle 58
        }

        chain NETAVARK-DN-1D8721804F16F { # handle 60
                ip saddr 10.88.0.0/16 tcp dport 9999 counter packets 0 bytes 0 jump NETAVARK-HOSTPORT-SETMARK # handle 61
                ip saddr 127.0.0.1 tcp dport 9999 counter packets 0 bytes 0 jump NETAVARK-HOSTPORT-SETMARK # handle 62
                tcp dport 9999 counter packets 3 bytes 180 xt target "DNAT" # handle 63
        }
}
# Warning: table ip filter is managed by iptables-nft, do not touch!
table ip filter { # handle 20
        chain NETAVARK_ISOLATION_2 { # handle 1
        }

        chain NETAVARK_ISOLATION_3 { # handle 2
                oifname "podman0" counter packets 0 bytes 0 drop # handle 27
                counter packets 0 bytes 0 jump NETAVARK_ISOLATION_2 # handle 6
        }

        chain NETAVARK_INPUT { # handle 3
                ip saddr 10.88.0.0/16 udp dport 53 counter packets 0 bytes 0 accept # handle 28
        }

        chain NETAVARK_FORWARD { # handle 4
                xt match "conntrack" counter packets 0 bytes 0 drop # handle 12
                ip daddr 10.88.0.0/16 xt match "conntrack" counter packets 0 bytes 0 accept # handle 29
                ip saddr 10.88.0.0/16 counter packets 2 bytes 132 accept # handle 30
        }

        chain FORWARD { # handle 7
                type filter hook forward priority 0; policy accept;
                xt match "comment" counter packets 74 bytes 4612 jump NETAVARK_FORWARD # handle 8
        }

        chain INPUT { # handle 9
                type filter hook input priority 0; policy accept;
                xt match "comment" counter packets 3012073 bytes 2160733554 jump NETAVARK_INPUT # handle 10
        }
}

I've added counters everywhere (I can't add them to the podman rules, I don't know how), and while running a firehose of failed connections (`netcat -w 1 LANIP 9999` in a loop) I do not see any of my counters going up!


So where are those connections going?! Any ideas?
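
One way to narrow this down without touching the podman-managed tables is a throwaway table with one counter per hook. This is only a sketch: the table name `probe` and the chain names are made up for illustration, and the priorities are chosen merely so that these chains run before the existing filter chains.

```nft
# Hypothetical diagnostic table; "probe" and the chain names are illustrative.
table inet probe {
        chain probe_prerouting {
                # Runs before conntrack and before the NAT prerouting chain.
                type filter hook prerouting priority -300; policy accept;
                tcp dport 9999 counter comment "arrived on the host"
        }

        chain probe_input {
                # Only sees packets that are delivered to the host itself.
                type filter hook input priority -10; policy accept;
                tcp dport 9999 counter comment "delivered locally"
        }

        chain probe_forward {
                # Only sees packets that are routed on to another interface.
                type filter hook forward priority -10; policy accept;
                tcp dport 9999 counter comment "routed onward"
        }
}
```

Saved to a file (say `probe.nft`, a hypothetical name), it can be loaded with `nft -f probe.nft`, read back with `nft list table inet probe` while the test loop runs, and removed with `nft delete table inet probe`. Whichever of the input or forward counters moves shows which path the LAN connections take.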

---

Also, what are those `xt` expressions in the podman rules?

ip daddr 10.88.0.0/16 xt match "conntrack" counter packets 0 bytes 0 accept # handle 25
...
xt match "comment" counter packets 3002281 bytes 2159961453 jump NETAVARK_INPUT # handle 10

I cannot find them in the nftables reference.

Last edited by gcb (2024-08-13 02:48:24)


#2 2024-08-13 02:01:45

gcb
Member
Registered: 2014-02-12
Posts: 212

Re: help understand nftables

Some hints about the `xt`: if I try to dump the ruleset to a file that I can modify (to add counters to the podman rules), flush, and load it back with `nft -f`, it fails with:

Error: unsupported xtables compat expression, use iptables-nft with this ruleset
      ((and it points to all the xt commands))
                                 ^^^^^^^^^^^


#3 2024-08-13 02:09:26

gcb
Member
Registered: 2014-02-12
Posts: 212

Re: help understand nftables

Hmm... it seems that the counter "cntForwarddrop" does increase when I get timeouts from other hosts on the LAN... but if that is a forward rejection, wouldn't `netcat <LANIP> 9999` from the host running podman also fail, since the connection to that IP would also go through forward to the veth interface for 10.88.0.0? There's no rule allowing by src IP...
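
For context, locally generated packets enter through the output hook and never traverse forward, whereas packets arriving from the LAN enter through prerouting, where the DNAT rule in the listing above rewrites the destination before routing. A minimal sketch for watching this with nftables tracing follows. The table and chain names are hypothetical, and it assumes the same test port 9999.

```nft
# Hypothetical tracing table; names are illustrative only.
table inet tracedebug {
        chain trace_from_lan {
                # Flags packets arriving from the network before NAT is applied.
                type filter hook prerouting priority -350; policy accept;
                tcp dport 9999 meta nftrace set 1
        }

        chain trace_from_host {
                # Flags packets generated on the host itself.
                type filter hook output priority -350; policy accept;
                tcp dport 9999 meta nftrace set 1
        }
}
```

With this loaded via `nft -f`, running `nft monitor trace` in another terminal prints each chain and rule the flagged packets pass through, including the final verdict. `nft delete table inet tracedebug` removes it again.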

Last edited by gcb (2024-08-13 02:10:28)


#4 2024-08-13 02:47:17

gcb
Member
Registered: 2014-02-12
Posts: 212

Re: help understand nftables

Ok, so it is the forward policy drop.

What is recommended here? Allow everything destined to the podman virtual interface? Or to the virtual network IP range? Or would that expose some control plane I don't want exposed?
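
For discussion, here is one possible shape for the forward chain, as a sketch rather than a recommendation. It assumes the `podman0` bridge and the 10.88.0.0/16 subnet from the `podman network inspect` output above, and that the chain replaces the existing forward chain in `table inet filter`. The port-specific rule is illustrative.

```nft
table inet filter {
        counter cntForwarddrop {
                packets 0 bytes 0
        }

        chain forward {
                type filter hook forward priority 0; policy drop;
                ct state invalid drop
                # Replies for connections that were already allowed.
                ct state { established, related } accept
                # Traffic initiated by containers (assumption: outbound is wanted).
                iifname "podman0" ip saddr 10.88.0.0/16 accept
                # New connections to the published port only. After DNAT the
                # destination is the container address, not the host address.
                oifname "podman0" ip daddr 10.88.0.0/16 tcp dport 9999 ct state new accept
                # Possible alternative (untested here): "ct status dnat accept"
                # would accept only connections that went through a DNAT rule.
                counter name "cntForwarddrop" log
        }
}
```

Matching on the output interface plus the port is narrower than accepting everything routed to `podman0`, so other container ports stay unreachable from the LAN unless they are listed explicitly.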

Red Hat's official guidance still uses iptables: https://access.redhat.com/solutions/5885821

And our wiki is in heavy discussion on this topic https://wiki.archlinux.org/title/Podman#IP_networking

Last edited by gcb (2024-08-13 02:49:25)
