This was triggered by a discussion on the arch-general mailing list.
I have made a sample nftables firewall script available on my gh blog:
https://github.com/gene-git/blog/tree/master/nftables
It has:
- 1 for workstation / laptop (single interface)
- 1 for firewall router (2 interfaces)
The firewall script supports services provided on the firewall itself (e.g. DNS, border email, ssh etc.). It also has services forwarded to internal servers (web server, ssh, vpn etc.).
And it provides NAT for the external ip or range of ips.
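As a rough illustration of forwarding a service to an internal server and NATing outbound traffic, here is a minimal sketch. It is not taken from the sample files: the interface name, the addresses (documentation ranges) and the names t_nat, ext_iface, ext_ip and web_ip are all assumptions.

```nft
# Hypothetical names and addresses, for illustration only
define ext_iface = eth0
define ext_ip = 203.0.113.10
define web_ip = 10.0.0.10

table ip t_nat {
    chain prerouting {
        type nat hook prerouting priority -100; policy accept;
        # Forward web traffic arriving at the external IP to the internal server
        iifname $ext_iface ip daddr $ext_ip tcp dport { 80, 443 } dnat to $web_ip
    }
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        # Rewrite the source of everything leaving the external interface
        oifname $ext_iface snat to $ext_ip
    }
}
```

The forwarded traffic still has to be accepted in a filter forward chain, which this sketch leaves out.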
It has blocklists and whitelists, and handles both inet and netdev (ingress) blocks. Blocking in the ingress (or egress) hook of netdev is efficient and IP address only. Because it is early in the packet flow, it blocks new connections (SYN) as well as any established/related traffic. So only block something here if you don't need any access at all - no ping, no replies from hitting a web server, etc. Normal blocks are in the inet table and do allow replies.
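The difference between the two kinds of block can be sketched as follows. This is only an assumed layout, not the sample itself: the device name eth0, the table and set names, and the CIDRs (documentation ranges) are placeholders.

```nft
# Early block: no connection tracking here, so replies are dropped too
table netdev t_netdev {
    set early_block {
        type ipv4_addr
        flags interval
        elements = { 192.0.2.0/24 }
    }
    chain ingress {
        type filter hook ingress device eth0 priority 0; policy accept;
        ip saddr @early_block drop
    }
}

# Normal block: established/related traffic is accepted first,
# so replies to connections opened from this side still get through
table inet t_filter {
    set inet_block {
        type ipv4_addr
        flags interval
        elements = { 198.51.100.0/24 }
    }
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        ip saddr @inet_block drop
    }
}
```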
I hand-edited a fully working firewall for this example and hope it's useful. After your edits, please confirm there are no typos etc. by running a check:
nft -c -f nftables.conf
The IP sets for blocks and whitelists are kept in separate files. These contain lists of CIDR blocks.
Obviously these will need editing before any use. Keeping them in separate files makes them easy to update from a script.
After any changes to the sets, reload the rules to pick up new set data.
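One possible way to keep set data in its own file is sketched below. The sample files may organise this differently; the path /etc/nftables.d/inet_block.nft and the names t_filter and inet_block are assumptions.

```nft
# /etc/nftables.conf (sketch)
flush ruleset

table inet t_filter {
    set inet_block {
        type ipv4_addr
        flags interval
    }
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        ip saddr @inet_block drop
    }
}

# The set data lives in a separate file that a script can regenerate.
# That file would contain a single command such as:
#   add element inet t_filter inet_block { 192.0.2.0/24, 198.51.100.0/24 }
include "/etc/nftables.d/inet_block.nft"
```

With "flush ruleset" at the top, running nft -f /etc/nftables.conf again replaces the whole ruleset in one transaction and picks up the new set data.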
It took me quite some time reading and learning about nftables and making real-world firewalls that work well. I have always found examples helpful.
Hopefully these too might be helpful to others.
And if you find typos or boo-boos please let me know - hopefully my manual edits didn't muck it up.
gene
Offline
It is great that you shared this work with the community.
Mike C
Offline
Thanks @mcloaked
Last edited by GeneArch (2023-09-24 20:04:25)
Offline
@GeneArch: Are you interested in a detailed analysis of your rulesets?
Offline
Of course, I'm always interested in improvements.
The rulesets were created to serve as examples. Not sure what you mean by detailed analysis but sure
Offline
The rulesets were created to serve as examples.
Do you mean that none of those rulesets are actively in use on actual systems?
Offline
Let me be a bit more clear - both are based on 'actively in use on actual systems'.
- The workstation sample was copied from one that is used - and edited for privacy.
So the "exact" file as it stands is slightly different - obviously.
- Similarly, the firewall is also copied from a real one that has been in service for a very, very long time - it too was edited for privacy.
The "real" firewall has many blocks, whitelists and additional services that were not germane for the sample. The services lists were therefore edited down significantly
but with intention to provide a good framework for others. VOIP services in particular need special care, which I did not explain in the sample as they were removed
to keep things a bit simpler. Again the CIDR sets were changed to be illustrative and the CIDRs in them should be replaced with 'real' CIDRs that are of interest to you.
The "real" firewall has been running for a very, very long time, and the rules along with all the various sets of CIDRs are generated by an in-house tool - perhaps one day I may release the tool too.
The tool is run frequently to keep the various data sets updated.
I also fixed a typo in case that was bothering you and confirm that both samples now pass 'nft -c -f xxx.conf'
Offline
O.K. - thanks for this clarification.
Since you removed the typos (that led to syntax errors) and changed two port values (>70000) to less painful ones, the list gets shorter.
Two preliminary remarks:
1. Priorities
Chain priority orders the rule execution between different chains of the same hook type.
Since your rulesets contain every hook (ingress, prerouting, input, forward, output, postrouting, egress) only once there is simply no reason to change the priority from the traditional defaults (among others 0 for filter).
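For reference, the traditional defaults can be written with the named priorities that reasonably recent nft versions accept. This is only a sketch with hypothetical table names; whether the names are available depends on the installed nft version.

```nft
table inet t_filter {
    chain input {
        # "filter" is the named form of priority 0
        type filter hook input priority filter; policy drop;
        # ... rules ...
    }
}

table ip t_nat {
    chain prerouting {
        # "dstnat" corresponds to -100
        type nat hook prerouting priority dstnat; policy accept;
    }
    chain postrouting {
        # "srcnat" corresponds to 100
        type nat hook postrouting priority srcnat; policy accept;
    }
}
```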
2. Filtered - Not Closed
One purpose of a firewall is packet filtering. Filtered ports are not closed ports.
Please do not use the term "closed port" for blocking/filtering a port.
Let's dive in:
I. Analysis of the "workstation" ruleset
https://github.com/gene-git/blog/blob/m … ables.conf
# line 95
tcp flags & (fin|syn|rst|ack) != syn ct state new drop
While this is IMHO more of a concern for firewalls/routers than a workstation/desktop it should be positioned above the first TCP allow rule.
# line 100
ct state new drop
A "conntrack" connection is not a TCP connection. The netfilter connection tracking system considers all kinds of communication - including broadcasts and ICMP traffic - a new connection.
Since all new traffic is blocked now, all following rules and the chain policy are unused.
# lines 105 and 106
ip protocol icmp accept
meta l4proto ipv6-icmp accept
Will never match because of line 100. You should be unable to ping your workstation.
# line 111
meta pkttype { broadcast, unicast, multicast } accept
Will never match because of line 100. Broadcasts should be disabled.
"Unicast" means all packets with a single IP destination address. This rule would have rendered the whitelist approach obsolete. Luckily it's never matched.
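To illustrate the ordering issue, here is a minimal sketch of an input chain in which the ICMP rules can actually match, because nothing above them drops all new traffic and the chain policy takes care of whatever is left. It is not the author's ruleset; the table name and the ssh rule are assumptions.

```nft
table inet t_filter {
    chain input {
        type filter hook input priority 0; policy drop;

        iif lo accept
        ct state invalid drop
        ct state established,related accept

        # Placed above the first TCP allow rule
        tcp flags & (fin|syn|rst|ack) != syn ct state new drop

        # Reachable now: no blanket "ct state new drop" above them
        ip protocol icmp accept
        meta l4proto ipv6-icmp accept

        # Hypothetical allowed service
        tcp dport 22 ct state new accept

        # Everything else falls through to the drop policy
    }
}
```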
# line 127
#oifname $iface meta l4proto { tcp, udp } th dport @closed reject
On a firewall/router this makes sense - to stop leaking intranet traffic to the outside.
On a "workstation" - not so much. Maybe the reason you disabled it.
II. Analysis of the "firewall" ruleset
https://github.com/gene-git/blog/blob/m … ables.conf
# line 105
define wg_ip_ext = 1.2.3.4 # external ip for wiregiard
This variable is never used - possible typo on line 237.
# line 133
iifname $int_iface ip daddr $ext_ip ct state related,established accept
This rule will never match - line 121 already allowed this.
# lines 162 and 163
oifname $ext_iface ip protocol tcp tcp dport @closed_tcp reject
oifname $ext_iface ip protocol udp udp dport @closed_udp reject
This should not be necessary - your server should not create those packets.
# line 166
oifname $ext_iface ip daddr $ext_ip accept
Send over the external interface to the external IP of the server? Or did you mean "saddr"?
# lines 219 to 223
oif lo accept
# drop invalid packets
ct state invalid drop
tcp flags & (fin|syn|rst|ack) != syn ct state new drop
An "oif lo" rule on a prerouting chain? The lines 129 and 158 already cover this.
IMHO those rules belong to filter chains. I use the nat table for NAT rules only and the filter tables for filtering only.
See below for - in my eyes - a cleaner way to discard invalid packets early.
# line 257
tcp flags syn tcp option maxseg size 1-536 drop
I don't know what that rule is about - I can only assume that a regular SYN packet has a "maxseg size" of 0.
# lines 249 to 271
table netdev t_netdev {
[...]
} # end netdev table
The packets traversing netdev table are neither de-fragmented nor classified by connection tracking.
If you want to discard invalid or malformed packets before routing and NAT there is a cleaner way to do this by replacing your whole netdev table and the "invalid" rules from your nat table with an early packet filter.
Create a filter chain with a prerouting hook and a priority of -150 - it will be traversed before the prerouting chain in the nat table (-100) and after connection tracking kicks in.
chain early_packet_filter {
type filter hook prerouting priority -150; policy accept;
ct state invalid drop
tcp flags & (fin|syn|rst|ack) != syn ct state new drop
tcp flags & (fin|syn|rst|psh|ack|urg) == fin|syn|rst|psh|ack|urg drop
tcp flags & (fin|syn|rst|psh|ack|urg) == 0x0 drop
tcp flags syn tcp option maxseg size 1-536 drop
ip daddr @early_block drop
}
Since your server should not create packets with a destination address in @early_block there should be no need to block them on the output side.
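For completeness, the chain above needs a table and a set named early_block to live in. A minimal sketch of that wrapper, with an assumed table name and a placeholder CIDR from the documentation range:

```nft
table inet t_filter {
    # Hypothetical contents; replace with the CIDRs of interest
    set early_block {
        type ipv4_addr
        flags interval
        elements = { 192.0.2.0/24 }
    }

    chain early_packet_filter {
        # -150 sits after connection tracking (-200) and before
        # the nat prerouting chain (-100)
        type filter hook prerouting priority -150; policy accept;
        ct state invalid drop
        ip daddr @early_block drop
    }
}
```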
Offline
That's super helpful - thank you @thc
I'll spend some time carefully going over your thoughts and suggestions. Your comments are valuable.
I do believe it may still make sense to use netdev (even with fragmented packets) given how efficient that path is.
thanks for taking the time to go over them.
I think it would be great if you could send a PR on github with improvements?
thanks!
Last edited by GeneArch (2023-09-30 20:39:37)
Offline
thanks again for suggestions @thc - made some changes
Offline
Looks good.
Some minor suggestions:
workstation
# line 103 (not necessary)
ct state new drop
You can explicitly block this here - but the chain policy (drop) would do it implicitly in the next step anyway.
firewall
# line 144 (not necessary)
iifname $int_iface ip daddr $ext_ip ct state related,established accept
This rule will never match - line 133 already did this.
# line 173 (maybe intentional)
oifname $ext_iface ip daddr $ext_ip accept
This means all packets created by processes on the server itself emerging from the external facing interface may only have the external server ip as the destination address.
If this is what you want - o.k.
# line 194 (cosmetic)
type nat hook postrouting priority 90; policy accept;
I would choose type "filter".
# line 236 (cosmetic)
# Note http3 / Quic uses USB on port 443
Should be "UDP".
Offline
Very nice!
I think it's time to say goodbye to iptables ...
I don't love rosbeef
Offline
@manix - definitely - nftables is so much nicer and cleaner. Hope you find example(s) helpful.
Offline