On a similar note, my favorite-named network testing tool: <a href="https://github.com/tylertreat/comcast">https://github.com/tylertreat/comcast</a>
Tools like this, Jepsen, and Chaos Monkey do fault injection on the network and/or processes.<p>Are there any tools you like for storage fault injection?<p>Many database companies have an internal tool specific to their database for this. I think Scylla has a public tool for storage fault injection.<p>I've been experimenting with ptrace for storage fault injection myself, and I want to try out FUSE and some other tech.<p><a href="https://notes.eatonphil.com/2023-10-01-intercepting-and-modifying-linux-system-calls-with-ptrace.html" rel="nofollow noreferrer">https://notes.eatonphil.com/2023-10-01-intercepting-and-modi...</a>
We used this. Then we realised none of our guys had any idea how to build systems that are resilient to the failures we simulated. So they swept it under the rug and now just cross their fingers that nothing happens.
I can't say enough good things about Toxiproxy.<p>I will repeat my previous comment[0] about it:<p>> Toxiproxy is fantastic. I wish they supported a full configuration file in JSON or TOML or something but other than that it has been a lifesaver testing websockets.<p>0: <a href="https://news.ycombinator.com/item?id=32116969">https://news.ycombinator.com/item?id=32116969</a>
Toxiproxy is based. It's basically a REST API that lets you spawn sub-servers acting as relays to some remote or local machine. From there, you can add different behaviors to a relay (using a REST API on the sub-server). These behaviors are called 'toxics', and they're designed to introduce more uncertainty into the relay they run on. For example, there's a toxic that lets you delay packets by a fixed latency plus a random range of jitter.<p>You can add behavior to drop certain packets (like every N packets) or even to split messages up into multiple packets. TCP is a stream protocol, so a single send may arrive as multiple receives. This is why the recv() examples in Beej's Guide to Network Programming build up to a wrapper that calls recv() in a while loop until it returns no data! I believe Beej does the same for send (it assumes a send buffer of unknown length and that send might not send all your data).<p>Toxiproxy is a very well-engineered tool because you can point any of your TCP client software at the right Toxiproxy relay endpoint and you won't have to modify your code. What's significant about Toxiproxy is that it creates a basis for thinking about the requirements for network code in a more scientific way. For example, let's say you write network code. The performance of your code depends on how your network behaves when you run it. Consequently, it's extremely difficult to know whether your software will actually handle adversity in the future. But with Toxiproxy you can write algorithms that handle this better.<p>Shameless plug: I am slowly working on my own networking stack in Python and I implemented my own version of toxiproxy (the server and client). Otherwise you would have to download toxiproxy's Go server, which isn't that easy to package and use as part of testing.
Here's info about it on my docs page - <a href="https://p2pd.readthedocs.io/en/latest/built/toxiproxy.html" rel="nofollow noreferrer">https://p2pd.readthedocs.io/en/latest/built/toxiproxy.html</a> The interface is still unstable and may change or have bugs. But there are integration tests at least, lel.<p>I read what another commenter here said about UDP. I wanted to say that UDP is a total pain in the ass to work with. It took me a long time to design the parts of my code that do address lookups with STUN because of UDP. It gets easier though once you know what to expect.
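<p>To make the point about recv() and partial sends concrete, here is a minimal Python sketch of the loop-until-done pattern. The helper names (recv_exact, send_all, demo) are made up for illustration and are not taken from Beej's guide or from Toxiproxy. The demo writes a message in 3-byte slices over a local socket pair, which is only a rough stand-in for what a message-splitting toxic does to a stream.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes. A single recv() may return fewer, so loop."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError(f"peer closed with {remaining} bytes still expected")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_all(sock: socket.socket, data: bytes) -> None:
    """Keep calling send() until every byte has been accepted by the kernel."""
    view = memoryview(data)
    while len(view) > 0:
        sent = sock.send(view)  # may accept only part of the buffer
        view = view[sent:]


def demo() -> bytes:
    """Send a message in small slices and reassemble it on the other end."""
    a, b = socket.socketpair()
    with a, b:
        message = b"hello, toxiproxy"
        for i in range(0, len(message), 3):
            send_all(a, message[i:i + 3])
        return recv_exact(b, len(message))


if __name__ == "__main__":
    print(demo())
```

In real Python code the built-in sock.sendall() already does what send_all does here; the hand-rolled version is only there to show the loop. The receiver in this sketch knows the message length up front, which in practice you would get from a length prefix or a delimiter.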
Oh, I just started using this. I was a bit disappointed initially that it can't randomly drop connections through its probabilistic filters, but you can still achieve this with another process commanding it, so it has stayed.
Refreshing to see Ruby being used!<p><a href="https://github.com/Shopify/toxiproxy-ruby">https://github.com/Shopify/toxiproxy-ruby</a>
I wonder why there is no remote control web interface. It would be just awesome to flip a switch to jitter/disable proxied connections and a slider to rate limit the connection. Should be easy, right?