The OP details how poor software engineering practices brought down a $1.4B market maker with 1,400 employees in 2012.<p>Some of the issues mentioned include:<p><pre><code> - Keeping synthetic test data generation as part of a production build.
- Keeping dead code for years.
- Re-purposing a feature flag.
- Refactoring without regression tests.
- Manual deployments without peer reviews. They forgot to update one of their servers with the new code.
- Automated alerts sent via email were ignored.
- Rolled back to a version of the code running on the server they forgot to update, making things worse.
- Rushing out a release without proper software engineering hygiene.
</code></pre>
The article suggests improvements that could have prevented the chain of events.<p>For those here who are in HFT circles, have things improved after the Knight Capital Group debacle?<p>edit: formatting
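To make the "forgot to update one of their servers" item concrete, here is a minimal sketch of a pre-activation check that refuses to switch on a feature unless every host reports the expected build. Everything in it is hypothetical and for illustration only: the host names, the version strings, and the functions `find_stragglers` and `safe_to_activate` are assumptions, not anything taken from the article or the SEC order.

```python
def find_stragglers(reported_versions: dict, expected: str) -> list:
    """Return the hosts whose reported build differs from the expected one."""
    return sorted(
        host for host, version in reported_versions.items() if version != expected
    )


def safe_to_activate(reported_versions: dict, expected: str) -> bool:
    """Only allow activation when every host runs the expected build."""
    return not find_stragglers(reported_versions, expected)


# Hypothetical fleet of eight servers where one was missed by a manual deploy.
fleet = {f"server-{i}": "build-new" for i in range(1, 8)}
fleet["server-8"] = "build-old"

if __name__ == "__main__":
    print("stragglers:", find_stragglers(fleet, "build-new"))
    print("safe to activate:", safe_to_activate(fleet, "build-new"))
```

The point of the sketch is only that the check is mechanical and cheap; how the versions are collected from each host is left out on purpose.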
Random, but I interviewed at Knight Capital for a software engineering position a few weeks before this all went down. I was in London, so the interview was done over the phone. Picture me in the evening, handwriting C to solve some problem (the fog of time too thick to remember what that problem was), then reading out what I'd written, semicolons and all, to the interviewer. Because of course there was no shared doc. I did very badly. But then, so did they.
<i>The incident happened after a technician forgot to copy the new Retail Liquidity Program (RLP) code to one of the eight SMARS computer servers, which was Knight's automated routing system for equity orders. RLP code repurposed a flag that was formerly used to activate an old function known as 'Power Peg'. Power Peg was designed to move stock prices higher and lower in order to verify the behavior of trading algorithms in a controlled environment. Therefore, orders sent with the repurposed flag to the eighth server triggered the defective Power Peg code still present on that server</i> [1]<p>> Power Peg was designed to move stock prices higher and lower in order to verify the behavior of trading algorithms in a controlled environment.<p>This is insane. Makes one wonder what <i>is</i> or <i>isn't</i> actually being deployed in prod in 2022.<p>[1] <a href="https://en.wikipedia.org/wiki/Knight_Capital_Group#2012_stock_trading_disruption" rel="nofollow">https://en.wikipedia.org/wiki/Knight_Capital_Group#2012_stoc...</a>
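The flag reuse described in the quote can be guarded against mechanically. Below is a minimal sketch, assuming a central registry of flag codes in which a retired code is tombstoned and can never be given a new meaning. The class `FlagRegistry`, the flag code `"P"`, and the meaning strings are all made up for illustration; this is not how SMARS worked, just one way to make "never repurpose a flag" enforceable.

```python
class FlagRegistry:
    """Tracks flag codes so that a retired code can never be reassigned."""

    def __init__(self):
        self._active = {}      # code -> current meaning
        self._retired = set()  # codes that were used once and are now tombstoned

    def register(self, code: str, meaning: str) -> None:
        if code in self._retired:
            raise ValueError(f"flag {code!r} was retired and cannot be reused")
        if code in self._active:
            raise ValueError(f"flag {code!r} is already in use")
        self._active[code] = meaning

    def retire(self, code: str) -> None:
        # Raises KeyError if the code was never registered.
        del self._active[code]
        self._retired.add(code)

    def meaning(self, code: str) -> str:
        return self._active[code]


if __name__ == "__main__":
    registry = FlagRegistry()
    registry.register("P", "legacy test function")
    registry.retire("P")
    try:
        registry.register("P", "new order type")
    except ValueError as err:
        print("rejected:", err)
```

With a rule like this, a server still running old code would see an unknown flag rather than a familiar one with a different meaning.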
Back in the day, a $440M loss due to a coding error was a landmark warning case. How could this happen??<p>In 2021 alone something like $10B was lost due to bugs in defi land.<p>It seems the worst possible thing that could happen tends to happen eventually, and it gets worse with every passing year.
Original Post-Mortem: <a href="https://www.sec.gov/litigation/admin/2013/34-70694.pdf" rel="nofollow">https://www.sec.gov/litigation/admin/2013/34-70694.pdf</a>
My small footnote to this story: we had a look at the "blind bid" to purchase the portfolio of erroneous trades on the program trading desk I worked on. Ultimately we didn't bid (thank you risk department) and it traded away to Goldman as the article correctly reports. Would've been a great trade though. I estimated it netted Goldman $2m+
Important to add that this occurred in August 2012 and not in 2019 as the title implies.<p>I had interviewed there in my garden year (2010-2011) and was ultimately not considered for a role as a high-frequency quant.
What's amazing to me is that there are many software engineers here who can recognize how easily errors like this creep into software (some would say they're inevitable in any sufficiently complex software), yet they somehow still think immutable smart contracts on blockchains are the future.<p>Crazy.
Hindsight is a helluva drug. The SEC report cannot be viewed as a “postmortem.”<p><a href="https://www.kitchensoap.com/2013/10/29/counterfactuals-knight-capital/" rel="nofollow">https://www.kitchensoap.com/2013/10/29/counterfactuals-knigh...</a>
Interesting. Five years prior, this story was posted on this blog: <a href="https://dougseven.com/2014/04/17/knightmare-a-devops-cautionary-tale/" rel="nofollow">https://dougseven.com/2014/04/17/knightmare-a-devops-caution...</a>
Really good write-up. Perhaps there's a dawning realization that the model of 'move fast and break things' is fatally flawed?<p>> "Knight’s IT project managers and CIO should have pushed back on the hyper-aggressive delivery schedule... Thirty days to implement, test, and deploy major changes to an algorithmic trading system that is used to make markets daily worth billions of dollars is impulsive, naive, and reckless."<p>The fact that <i>since 2008, the portion of all stock trades in the U.S. taking place away from public markets has risen from 15 percent to more than 40 percent</i> is also kind of astonishing. It's long past time to re-erect the walls between commercial and investment banking.
Here’s a 225 million dollar oopsie from 2005 <a href="https://www.foxnews.com/story/typing-error-causes-225m-loss-at-tokyo-stock-exchange" rel="nofollow">https://www.foxnews.com/story/typing-error-causes-225m-loss-...</a>
> However, a dark cloud remains: market data suggests that 70 percent of U.S. equity trading is now executed by high-frequency trading firms<p>Is it just me, or was this one of the scarier statements in the article?
High-frequency trading does not seem to add anything; it's not a market instrument to balance anything. It's more like gambling with the odds slightly in your favor.<p>Of course, I'm a layman when it comes to high-frequency trading and might be totally wrong.
My experience in quant finance is that strong engineering teams are viewed as cost centers and are not well compensated the way they are at “real” tech companies, especially when the firm isn’t doing well. Management foolishly tends to only value seeking alpha, rather than reliability of existing alpha.<p>Unsurprisingly, poor management tends not to realize when their software is built on a house of cards.<p>Definitely saw bugs cause large losses (8-digit numbers).
My brother worked in a medium-sized fintech not long ago. This example is shown in their onboarding process. I guess the same goes for other companies out there.
Excellent report of a fault case study.<p>Yes, checklists help, not just in medicine, but also in aviation. But automating things is of course the best option.
At the end, it's just money going from one account to another, right? It's not like some physical thing that has perished and can't be brought back. Why is it difficult to reverse the transactions?
I read this<p>>Under stock exchange rules, Knight would have been required to pay for those shares three days later. However, there was no way it could pay, since the trades were unintentional and had no source of funds behind them. The only alternatives were to try to have the trades canceled, or to sell the newly acquired shares the same day.<p>And then I understand why /r/WallStreetBets and /r/Antiwork are gaining traction.<p>All it takes is a bit of organisation and the adoption of Govt tactics and practices, which is ultimately violence, and then just maybe you might see a Govt that works for the people and not the criminals, but I can't picture Bernie Sanders wielding a pitchfork!<p>Still, I see Musk was market making with his tweet. I don't think you can be any more blatant! LOL
<a href="https://twitter.com/elonmusk/status/1520650036865949696?cxt=HHwWgICjsdvJt5oqAAAA" rel="nofollow">https://twitter.com/elonmusk/status/1520650036865949696?cxt=...</a>