> As mentioned before, the CI critical path is bound by its longest stretch of dependent actions. If one test consistently takes 20 minutes to execute and flakes, and has some logic to retry on failure, let’s say up to 3 times, it’ll take up to 60 minutes. It doesn’t matter if all other builds and tests execute in 30 seconds. That one slow, flaky test holds everyone’s builds back for up to 1 hour.

Honestly, I'm really surprised this isn't mentioned until the end. Some of the other things in the article were almost jaw-dropping ($1+ million in instance savings, needing 48 cores to run CI, etc.), but flaky tests that regularly cause problems and force reruns of extremely expensive jobs are something I would argue should have been addressed first, not last.
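
To put rough numbers on the quoted scenario, here is a minimal Python sketch. It assumes jobs run in parallel and retries run sequentially; the job names and the 10% flake rate used in the comment are hypothetical, and only the 20-minute, 3-attempt, and 30-second figures come from the quote.

```python
def worst_case_minutes(duration_min, max_attempts):
    """Wall time if every attempt is used (retries run one after another)."""
    return duration_min * max_attempts


def expected_minutes(duration_min, max_attempts, flake_rate):
    """Average wall time, assuming each attempt flakes independently.

    Attempt k only runs if the previous k - 1 attempts all flaked,
    so it contributes flake_rate ** (k - 1) attempts on average.
    """
    expected_attempts = sum(flake_rate ** k for k in range(max_attempts))
    return duration_min * expected_attempts


# Hypothetical pipeline: (duration in minutes, max attempts) per parallel job.
jobs = {
    "fast_tests": (0.5, 1),        # the "30 seconds" jobs from the quote
    "slow_flaky_test": (20, 3),    # 20 minutes, retried up to 3 times
}

# With parallel jobs, the slowest job bounds the whole pipeline.
pipeline_worst_case = max(worst_case_minutes(d, n) for d, n in jobs.values())
print(pipeline_worst_case)  # 60
```

Under these assumptions, even a modest flake rate adds up: `expected_minutes(20, 3, 0.1)` comes out a bit over 22 minutes on average, and the worst case is the full hour from the quote, no matter how fast everything else is.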