There's a lot of pushback from engineers, especially people working at lower levels of the stack, against testing infrastructure. One particularly famous example is Linux. Rather than testing code before merging it, they merge the code and then test the release candidate as a whole. Game developers also seem to be strongly against automated testing frameworks. I've heard many times that it would be impossible to develop an enemy AI in a test-driven way (I did this for a senior project in college and finished the AI before the game was even able to start testing it [0]).<p>I wonder what would need to happen to convince people that:<p>1. Even if you do something extremely low level, you can draw a distinction between your hardware and the interface that 99% of your software runs against.<p>2. You can develop complex behaviors iteratively with automated testing just like you can develop complex programs iteratively (tests are just programs).<p>[0] - <a href="https://github.com/gravypod/it491-disabler-ai" rel="nofollow">https://github.com/gravypod/it491-disabler-ai</a>
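To make point 1 concrete, here is a minimal sketch of what "hardware vs. the interface your software runs against" can look like. Everything in it (the `Gpio` interface, `FakeGpio`, `blink`) is a made-up name for illustration, not taken from any real driver or from the linked project; the only claim is that logic written against a narrow interface can be tested with a fake standing in for the hardware.

```python
from __future__ import annotations

from typing import Protocol


class Gpio(Protocol):
    """Hypothetical hardware interface: the only thing the logic below sees."""

    def write(self, pin: int, high: bool) -> None: ...


class FakeGpio:
    """Test double that records writes instead of touching real pins."""

    def __init__(self) -> None:
        self.writes: list[tuple[int, bool]] = []

    def write(self, pin: int, high: bool) -> None:
        self.writes.append((pin, high))


def blink(gpio: Gpio, pin: int, times: int) -> None:
    """Logic under test: drive a pin high then low, `times` times."""
    for _ in range(times):
        gpio.write(pin, True)
        gpio.write(pin, False)


def test_blink_toggles_pin() -> None:
    fake = FakeGpio()
    blink(fake, pin=13, times=2)
    assert fake.writes == [(13, True), (13, False), (13, True), (13, False)]
```

A real implementation of `Gpio` would poke registers; the test never needs to know, which is the whole point of drawing the line there.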
I couldn't make it past the serialization/deserialization logic in my own hobbyist TCP/IP stack. Even that part was super buggy. Next time around I'm definitely going to unit test more parts; otherwise it's too hard for a beginner to get the easy parts right, let alone the harder parts.<p>Also, take a look at gVisor's network stack. It's definitely unit tested.<p><a href="https://github.com/google/gvisor/tree/master/pkg/tcpip/link/ethernet" rel="nofollow">https://github.com/google/gvisor/tree/master/pkg/tcpip/link/...</a> (an example)
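For the serialization/deserialization part, a round-trip unit test is cheap and catches a lot of byte-order and offset mistakes. Below is a rough sketch for an Ethernet II header (6-byte destination MAC, 6-byte source MAC, 2-byte EtherType in network byte order). The `EthernetHeader` class and its method names are assumptions for the example; this is not how gVisor structures its code.

```python
from __future__ import annotations

import struct
from dataclasses import dataclass

# dst MAC, src MAC, EtherType; "!" selects network (big-endian) byte order.
_HEADER = struct.Struct("!6s6sH")


@dataclass(frozen=True)
class EthernetHeader:
    dst: bytes
    src: bytes
    ethertype: int

    def pack(self) -> bytes:
        return _HEADER.pack(self.dst, self.src, self.ethertype)

    @classmethod
    def unpack(cls, frame: bytes) -> EthernetHeader:
        if len(frame) < _HEADER.size:
            raise ValueError("frame shorter than Ethernet header")
        return cls(*_HEADER.unpack_from(frame))


def test_round_trip() -> None:
    hdr = EthernetHeader(b"\xff" * 6, bytes.fromhex("020000000001"), 0x0800)
    wire = hdr.pack()
    assert len(wire) == 14
    assert wire[12:14] == b"\x08\x00"  # EtherType is big-endian on the wire
    assert EthernetHeader.unpack(wire) == hdr


def test_rejects_truncated_frame() -> None:
    try:
        EthernetHeader.unpack(b"\x00" * 13)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a 13-byte frame")
```

The same pattern (pack, check a few known wire bytes, unpack, compare) extends to IP and TCP headers, where checksums and variable-length options make a test even more worthwhile.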
Yes, a TCP stack certainly is complex enough to warrant serious automated testing and/or TDD.<p>The idea of putting the TCP stack in user space is interesting. If one could actually map the memory of the whole device into user space, one could maybe make fewer system calls and therefore get better performance.<p>Also, what I find somewhat irritating about using a Linux system is how often one needs to run commands as root (sudo) for common administrative tasks like mounting a disk. Having a user space TCP stack could also decrease the need for that as far as setting up the network is concerned. If the Linux machine is single user, as most of them are nowadays, it makes more sense that way, I think.
One benefit of this test framework that we discovered after the blog post was written is that it made fuzzing and differential testing of the TCP stack much more convenient. The core problem with fuzzing TCP is that there's a lot of incrementally built up state, and everything is extremely timing-dependent.<p>You basically need the fuzzer to have a model of TCP state so that it can effectively explore the state space, which is quite complicated and not something you can do with off-the-shelf tools.<p>But once you have a bunch of unit tests designed to put the TCP stack into a specific state, plus a way of saving and restoring that state, it's really easy to keep snapshots of interesting situations where you can run a fuzzer on the next packet to be transmitted and see what happens.
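A toy illustration of the snapshot-then-fuzz idea, for anyone who wants to see the shape of it. This is not the framework from the blog post: `ToyConn` is a deliberately oversimplified, made-up state machine (not faithful TCP), the snapshot is just `copy.deepcopy`, and the "fuzzer" is a seeded random generator rather than a coverage-guided tool. The part that carries over is the loop: restore a saved state, feed one arbitrary segment, check invariants.

```python
from __future__ import annotations

import copy
import random
from dataclasses import dataclass

SEQ_MOD = 2**32
STATES = {"LISTEN", "SYN_RCVD", "ESTABLISHED", "CLOSE_WAIT"}


@dataclass
class ToyConn:
    """Deliberately simplified receiver; NOT a faithful TCP implementation."""

    state: str = "LISTEN"
    rcv_nxt: int = 0

    def on_segment(self, flags: str, seq: int, payload: bytes) -> None:
        if self.state == "LISTEN" and flags == "SYN":
            self.state, self.rcv_nxt = "SYN_RCVD", (seq + 1) % SEQ_MOD
        elif self.state == "SYN_RCVD" and flags == "ACK":
            self.state = "ESTABLISHED"
        elif self.state == "ESTABLISHED" and seq == self.rcv_nxt:
            if flags == "ACK":
                self.rcv_nxt = (self.rcv_nxt + len(payload)) % SEQ_MOD
            elif flags == "FIN":
                self.state = "CLOSE_WAIT"
                self.rcv_nxt = (self.rcv_nxt + 1) % SEQ_MOD
        # Anything else is silently ignored in this toy model.


def established_snapshot() -> ToyConn:
    """Plays the role of a unit test that drives the stack into a known state."""
    conn = ToyConn()
    conn.on_segment("SYN", 1000, b"")
    conn.on_segment("ACK", 1001, b"")
    return conn


def fuzz_next_segment(snapshot: ToyConn, iterations: int, seed: int) -> set[str]:
    """Restore the snapshot, feed one random segment, check invariants."""
    rng = random.Random(seed)
    reached: set[str] = set()
    for _ in range(iterations):
        conn = copy.deepcopy(snapshot)  # "restore" the saved state
        flags = rng.choice(["SYN", "ACK", "FIN", "RST", ""])
        # Bias about half the inputs toward the expected sequence number.
        seq = snapshot.rcv_nxt if rng.random() < 0.5 else rng.randrange(SEQ_MOD)
        payload = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
        conn.on_segment(flags, seq, payload)
        assert conn.state in STATES
        assert 0 <= conn.rcv_nxt < SEQ_MOD
        reached.add(conn.state)
    return reached
```

For differential testing, the same loop would feed the identical segment to a second implementation restored to an equivalent state and compare the two outcomes instead of (or in addition to) checking invariants.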
It would be nice to have a bring-your-own-I/O TCP stack library that *doesn’t* rely on custom callbacks - something like BearSSL but for TCP, where the stack is just a pure state machine object and the user is responsible for explicitly shunting packets to and from the state machine, retaining control over when and how the I/O is done. Instead of having to define callbacks for retrieving time and consuming packets, why not explicitly pass the timestamp and packet data to a state machine object via a direct function call?
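Roughly what that API shape could look like, as a sketch. `ToySender` is a made-up stop-and-wait sender standing in for a real TCP state machine (it is nowhere near TCP), and the method names `send`, `receive`, `poll`, and `next_deadline` are assumptions for the example. The thing to notice is that there are no callbacks and no clock inside the object: the caller passes `now` and the packet bytes in, and gets back the packets it should put on the wire.

```python
from __future__ import annotations

from typing import Optional


class ToySender:
    """Toy stop-and-wait sender; a stand-in for a TCP state machine."""

    def __init__(self, rto: float = 1.0) -> None:
        self.rto = rto  # retransmission timeout, in seconds
        self._unacked: Optional[bytes] = None
        self._deadline: Optional[float] = None

    def send(self, now: float, data: bytes) -> list[bytes]:
        """Queue one segment; returns packets the caller should transmit."""
        if self._unacked is not None:
            raise RuntimeError("previous segment not yet acknowledged")
        self._unacked = data
        self._deadline = now + self.rto
        return [data]

    def receive(self, now: float, packet: bytes) -> list[bytes]:
        """Feed one inbound packet; b"ACK" clears the outstanding segment."""
        if packet == b"ACK" and self._unacked is not None:
            self._unacked = None
            self._deadline = None
        return []

    def poll(self, now: float) -> list[bytes]:
        """Advance timers to `now`; returns a retransmission if one is due."""
        if self._unacked is not None and self._deadline is not None:
            if now >= self._deadline:
                self._deadline = now + self.rto
                return [self._unacked]
        return []

    def next_deadline(self) -> Optional[float]:
        """When the caller should next call poll(), or None if idle."""
        return self._deadline
```

Because time is just an argument, a test can step through a retransmission without sleeping:

```python
s = ToySender(rto=1.0)
assert s.send(0.0, b"hello") == [b"hello"]
assert s.poll(0.5) == []            # timer has not expired yet
assert s.poll(1.0) == [b"hello"]    # retransmit at the deadline
s.receive(1.2, b"ACK")
assert s.next_deadline() is None    # nothing outstanding
```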