Brand new nth expression

I’ve been in charge of creating a new netfilter match expression that provides nth-packet counter matching, where the counter is reset every time it reaches a given value. Such an expression makes it possible to build round-robin packet matching, which is very useful for load balancing or for emulating network failures. For example:

 ip daddr <ipsaddr> dnat nth 3 map {
         0: <ipdaddrA>,
         1: <ipdaddrB>,
         2: <ipdaddrC>
 }

This expression is equivalent to the nth mode of the statistic match in iptables.
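To make the round-robin idea concrete, here is a minimal single-threaded sketch in C of how a counter that resets at a given value can be used as the key of such a map. The names nth_state, nth_next and pick_backend are hypothetical and exist only for this illustration; they are not part of nftables or the kernel.

```c
#include <stddef.h>

/* Hypothetical state for illustration: a counter that cycles through
 * 0 .. modulus-1, advancing one step per packet. */
struct nth_state {
    unsigned int counter;
    unsigned int modulus;
};

/* Return the current counter value, then advance it, resetting to 0
 * once it would reach the modulus. */
static unsigned int nth_next(struct nth_state *st)
{
    unsigned int value = st->counter;

    st->counter = (st->counter + 1 >= st->modulus) ? 0 : st->counter + 1;
    return value;
}

/* Use the counter value as the map key to select a backend.
 * The backends array must hold at least st->modulus entries. */
static const char *pick_backend(struct nth_state *st,
                                const char *const backends[])
{
    return backends[nth_next(st)];
}
```

With a modulus of 3, successive packets would be assigned to entries 0, 1, 2 and then 0 again, which is the behavior the map in the rule above relies on.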

To tackle this challenge, I’ve been studying how nft expressions work on both the kernel and libnftnl sides, using expressions like nft_meta, nft_counter and nft_cmp as a reference:

  • nft_meta was useful as a template, as the random key seems to be similar to nth. But it’s quite different for several reasons: nth does not need to support several operations, and it does not require sreg registers.
  • nft_cmp was useful to see how a data structure is passed through from netlink to netfilter, but it is not very similar to what we need to build.
  • nft_counter is likely the most similar code, as it performs counting operations in an SMP-safe way. But it counts packets and bytes independently on every CPU and, once the user requests the counter value, it adds up all the per-CPU counters to return the final result. This is not what we need for nth, as we need the last updated value.
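The per-CPU counting scheme described for nft_counter can be sketched roughly as follows. This is a simplified userspace illustration, not the kernel code: the fixed NUM_CPUS and the percpu_counter type are assumptions made for the example.

```c
#include <stdint.h>

#define NUM_CPUS 4 /* assumption: fixed CPU count for this sketch */

/* Hypothetical per-CPU counter: each CPU only ever touches its own
 * slot, so increments need no synchronization between CPUs. */
struct percpu_counter {
    uint64_t packets[NUM_CPUS];
};

static void percpu_counter_inc(struct percpu_counter *c, unsigned int cpu)
{
    c->packets[cpu]++;
}

/* A read adds up every slot. The total is only computed on demand,
 * so no CPU knows the global packet sequence number while counting. */
static uint64_t percpu_counter_read(const struct percpu_counter *c)
{
    uint64_t total = 0;

    for (unsigned int cpu = 0; cpu < NUM_CPUS; cpu++)
        total += c->packets[cpu];
    return total;
}
```

This layout is cheap for statistics, but a packet processed on one CPU cannot tell whether it is the nth packet overall, which is why it does not fit nth.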

Additionally, I’ve taken inspiration from the current implementation of the nth mode in the xt_statistic match from xtables, which operates on atomic values to ensure that all CPUs are in sync with the last updated counter value.
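The atomic approach can be sketched in userspace with C11 atomics. This is only an approximation in the spirit of a compare-and-exchange loop, not the actual xt_statistic or nft code; the nth_counter type and nth_counter_next function are hypothetical names for this example.

```c
#include <stdatomic.h>

/* Hypothetical shared counter: every CPU (thread) updates the same
 * atomic value, so each one always sees the last updated count. */
struct nth_counter {
    atomic_uint count;
    unsigned int every; /* the count wraps to 0 before reaching this */
};

/* Atomically advance the counter and return the value it held before
 * the update. Returned values cycle through 0 .. every-1. */
static unsigned int nth_counter_next(struct nth_counter *c)
{
    unsigned int oval = atomic_load(&c->count);
    unsigned int nval;

    do {
        nval = (oval + 1 >= c->every) ? 0 : oval + 1;
        /* If another thread changed the counter in the meantime, the
         * exchange fails, oval is refreshed and the loop retries. */
    } while (!atomic_compare_exchange_weak(&c->count, &oval, nval));

    return oval;
}
```

The retry loop is what keeps concurrent updaters from handing out the same value twice, at the cost of contention on a single shared variable.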

On the libnftnl side, I’ve taken inspiration from the counter.c, cmp.c and meta.c expressions to implement the nth.c expression.


Supporting inverted bitwise in nft I

I’m still banging my head against providing support for the inverted bitwise that I mentioned in an older post. Now the challenge is not only to provide this functionality but also to simplify the code.

In the nftables source code we can currently see a function called


in the file netlink_linearize.c, which is called to generate the bitwise and cmp operations needed when the list of bitwise flags is positive, as shown below:

nft --debug=netlink add rule ip filter INPUT ct state new,related,established,untracked
ip filter INPUT 
  [ ct load state => reg 1 ]
  [ bitwise reg 1 = (reg=1 & 0x0000004e ) ^ 0x00000000 ]
  [ cmp neq reg 1 0x00000000 ]
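What the bitwise and cmp operations in this output compute can be sketched in C as follows. The helper names are hypothetical, and the state bit values are assumptions for this illustration, chosen so that new, related, established and untracked add up to the 0x4e mask shown above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values assumed for illustration; together, new | related |
 * established | untracked give the 0x4e mask from the debug output. */
enum {
    STATE_INVALID     = 0x01,
    STATE_ESTABLISHED = 0x02,
    STATE_RELATED     = 0x04,
    STATE_NEW         = 0x08,
    STATE_UNTRACKED   = 0x40,
};

/* [ bitwise reg 1 = (reg=1 & mask) ^ xor ] */
static uint32_t bitwise_op(uint32_t reg, uint32_t mask, uint32_t xor_val)
{
    return (reg & mask) ^ xor_val;
}

/* [ cmp neq reg 1 value ] */
static bool cmp_neq(uint32_t reg, uint32_t value)
{
    return reg != value;
}

/* The rule matches when at least one of the listed state bits is set. */
static bool rule_matches(uint32_t ct_state)
{
    uint32_t reg = bitwise_op(ct_state, 0x0000004e, 0x00000000);

    return cmp_neq(reg, 0x00000000);
}
```

Under these assumed values, a packet in state new matches, while one in state invalid does not, since its bit is outside the mask.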

Now, the challenge is to improve this behavior by generating both operations in the evaluation phase, within the file evaluate.c, creating the following logical structure:

        relational (OP_NEQ)
                / \
               /   \
              /     \
         bitwise   value
            /  \
           /    \
     ct state   mask
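The tree above could be modelled with a small expression type like the one below. This is a simplified sketch and not the real struct expr from nftables: the enum, the fields and the function names are all hypothetical, and the value node on the right of the relational is assumed to be zero, as in the cmp shown earlier.

```c
#include <stdint.h>

/* Hypothetical node types, one per kind of node in the diagram. */
enum expr_type {
    EXPR_VALUE,          /* constant: mask or comparison value */
    EXPR_CT_STATE,       /* loads the packet's ct state */
    EXPR_BITWISE_AND,    /* left & right */
    EXPR_RELATIONAL_NEQ, /* left != right */
};

struct expr {
    enum expr_type type;
    uint32_t value;           /* only used by EXPR_VALUE */
    const struct expr *left;  /* only used by binary nodes */
    const struct expr *right; /* only used by binary nodes */
};

/* Recursively evaluate the tree for a given ct state. */
static uint32_t expr_eval(const struct expr *e, uint32_t ct_state)
{
    switch (e->type) {
    case EXPR_VALUE:
        return e->value;
    case EXPR_CT_STATE:
        return ct_state;
    case EXPR_BITWISE_AND:
        return expr_eval(e->left, ct_state) & expr_eval(e->right, ct_state);
    case EXPR_RELATIONAL_NEQ:
        return expr_eval(e->left, ct_state) != expr_eval(e->right, ct_state);
    }
    return 0;
}

/* Build relational(OP_NEQ) -> { bitwise(ct state, mask), value }
 * and evaluate it; returns 1 on match, 0 otherwise. */
static uint32_t eval_ct_state_rule(uint32_t ct_state, uint32_t mask_bits)
{
    const struct expr ct = { .type = EXPR_CT_STATE };
    const struct expr mask = { .type = EXPR_VALUE, .value = mask_bits };
    const struct expr bitwise = { .type = EXPR_BITWISE_AND,
                                  .left = &ct, .right = &mask };
    const struct expr value = { .type = EXPR_VALUE, .value = 0 };
    const struct expr rel = { .type = EXPR_RELATIONAL_NEQ,
                              .left = &bitwise, .right = &value };

    return expr_eval(&rel, ct_state);
}
```

Having the structure as a tree at evaluation time means the later stage only has to walk it and emit one operation per node.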

No luck so far, but I’ll post an update on the state of this development.