As I posted a few weeks ago (<a href="https://news.ycombinator.com/item?id=41085314">https://news.ycombinator.com/item?id=41085314</a>), I recently implemented something very similar myself.<p>My solution ended up using tc's mirred[0] action to implement a fully L2-transparent frame relay. I wonder whether their setup achieves the same degree of transparency because, as far as I understand it, that's just not possible with an 802.1Q-compliant (Linux) bridge.<p>I spent close to a week optimizing my setup: looking at kernel flame graphs and perf results, reading adapter-specific tuning guides and driver source. I can say that the <i>only</i> really meaningful performance optimizations (in both the Broadwell- and Zen3/Vermeer-based implementations I tried) were disabling mitigations in the kernel (especially on Zen3, where that boosted performance by more than 20%) and getting CPU frequency scaling/idle states sorted out correctly (which yielded much bigger wins on the older Broadwell uarch, because power state transitions appear to happen much more quickly on Zen3).<p>As for the solution presented in the article (which is, on the whole, really great; I love it!), I have my doubts about the effectiveness of the cargo-culted "sysctl tuning" it mentions: TCP, for example, is simply not involved at all in the described setup, so "tuning" its buffer allocations cannot have any effect on the workload.<p>Kudos to the writers for solving their problem in a creative, cost-effective and maintainable way! :)<p>[0]: <a href="https://www.man7.org/linux/man-pages/man8/tc-mirred.8.html" rel="nofollow">https://www.man7.org/linux/man-pages/man8/tc-mirred.8.html</a>
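<p>For illustration, here is a minimal sketch of what a mirred-based two-port relay of the kind mentioned above might look like. It only builds and prints the tc command lines; it does not run them. The interface names (eth0, eth1), the helper name mirred_relay_commands, and the choice of the clsact qdisc with a matchall filter are assumptions made for this example, not details taken from the setup described in the comment.

```python
import shlex


def mirred_relay_commands(port_a: str, port_b: str) -> list[str]:
    """Build tc commands that redirect every frame received on one port
    to the egress of the other port, in both directions.

    Hypothetical helper for illustration; the original setup may differ.
    """
    commands = []
    for src, dst in ((port_a, port_b), (port_b, port_a)):
        src_q, dst_q = shlex.quote(src), shlex.quote(dst)
        # clsact gives the interface an ingress hook to attach filters to.
        commands.append(f"tc qdisc add dev {src_q} clsact")
        # matchall matches every frame; mirred redirects it unmodified
        # to the egress path of the other port.
        commands.append(
            f"tc filter add dev {src_q} ingress matchall "
            f"action mirred egress redirect dev {dst_q}"
        )
    return commands


if __name__ == "__main__":
    # Assumed interface names; substitute the real relay ports.
    for cmd in mirred_relay_commands("eth0", "eth1"):
        print(cmd)
```

Because the redirect happens at the ingress hook, before any bridge code is involved, frames are handed to the other port as received. The ports would presumably also need to be up and in promiscuous mode so that the NICs accept frames not addressed to them.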