Ask HN: What was your worst technical mistake?

9 points by kevinmannix almost 8 years ago

As an intern, I had a manager tell me that each developer will make at least one giant mistake during their career. The important thing is to have a plan of action, be transparent, and ensure that a similar mistake won't be made again.

Every now and then a story gains some traction that deals with an employee mistake, whether a small typo or a system design issue that wreaks havoc. Most recently, it was the story of a fresh-faced developer who dropped production while setting up his dev environment [0].

I'm sure there are many more stories that could inform others what not to do or what warning signs to look for. What was your biggest mistake, how did you remedy the situation, and what were the repercussions?

[0]: http://uk.businessinsider.com/worst-first-day-ever-reddit-2017-6

3 comments

existencebox almost 8 years ago

I share this story from time to time whenever this question comes up. I'm probably a broken record at this point, but I've always thought it important to set expectations clearly for new devs by being open about my own failures, and after the recent reddit post it seems about time to braindump once again.

I deleted /etc on a live, user-facing production cluster once.

I wrote a script to determine the OS, settings, and a bunch of other bits, and then configure the node appropriately. I sanity checked it for BSD, Ubuntu, Debian, RHEL, all the machines I thought it would run on.

Turns out there was a Solaris cluster.

Long and the short: the software I was configuring installed differently on Solaris, and my script did not properly audit/validate. Upon not finding the right subdirectories when performing a traversal, it declared itself done while still sitting in /etc and nuked the entire dir.

The joking lesson I take from this is a quote my sysadmin mentor told me: "Don't miss."

Less glibly, and more actionably:

- Enumerate your edge cases and failure modes rigorously, both from a "what do I expect" and a "what if" perspective. (Kinda under this bucket: UNDERSTAND YOUR GODDAMN SPEC, AGGRESSIVELY; this is true both in ops and dev.)

- Write your code with the EXPECTATION that bits will fail, and have it self-audit.

- rm * is a big hammer. For all the press dd gets, rm * (and -rf) should be used with care and proper precaution, ESPECIALLY if automated. Have extra "mental flags" to give extra care if you see rm *'s and such in your code.

- PHASED ROLLOUTS.

I'm sure there are more learnings, but those are what come to mind at a thought.

To answer the latter half of your question, the repercussion (and remedy) was my boss telling me: "whelp, you get to send out an outage email, and learn how to rebuild a cluster" (not before calling the other sysadmins into the room, having a brief moment of "let's point and laugh" and then sharing their own explosions, some of which made mine pale in comparison :) )
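For anyone wanting to see what "self-audit before rm" could look like, here is a minimal Python sketch. It is not the script from the story; `safe_rmtree`, `UnsafeDelete`, and the `PROTECTED` deny-list are hypothetical names made up for illustration. The idea is only that a delete refuses to run unless the target exists and sits strictly inside a directory the script was told it owns, and that it defaults to a dry run.

```python
import shutil
from pathlib import Path

# Hypothetical deny-list; a real script would tailor this to its hosts.
PROTECTED = {Path("/"), Path("/etc"), Path("/usr"), Path("/var"), Path("/home")}


class UnsafeDelete(Exception):
    """Raised when a delete target fails the self-audit."""


def safe_rmtree(target, allowed_root, dry_run=True):
    """Delete `target` only if it is an existing directory strictly inside `allowed_root`."""
    target = Path(target).resolve()
    allowed_root = Path(allowed_root).resolve()

    # Never delete a well-known system directory or the allowed root itself.
    if target in PROTECTED or target == allowed_root:
        raise UnsafeDelete(f"refusing to delete protected path: {target}")
    # The target must be a descendant of the directory we claim to own.
    if allowed_root not in target.parents:
        raise UnsafeDelete(f"{target} is outside {allowed_root}")
    # A missing directory means our assumptions are wrong: stop, don't improvise.
    if not target.is_dir():
        raise UnsafeDelete(f"{target} is not an existing directory")

    if dry_run:
        return f"would delete {target}"
    shutil.rmtree(target)
    return f"deleted {target}"
```

With this shape, a traversal that fails to find the expected subdirectory raises instead of deleting whatever directory it happens to be sitting in, and the default dry run pairs naturally with a phased rollout.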
itamarst almost 8 years ago

An employee who dropped production on their first day is not at fault; it's the company's fault. I have a similar but not quite as bad story: deploying code that *almost* brought down our company's main customer. My fault, but the organization was at fault too (though to be fair, we had ops people who shut it off when it caused problems).

So two thoughts:

1. How bad the outcome is doesn't necessarily reflect how big a mistake something is. Software is so complex that even small, hard-to-avoid mistakes can cause big problems... and sometimes big mistakes only cause trivial problems. So while big mistakes make good stories (and I'm sure people will post some), every mistake is worth learning from.

2. Most problems are, in the end, not an individual's fault. It's a whole system that failed. So don't just look for what you can do better, though that's important. Figure out where the system broke, and how to make the system better.

If you want to more deliberately learn from mistakes, Gary Klein's book "The Power of Intuition" is really useful.

(I am BTW writing a weekly email with mistakes I've made both programming and in my career. The story I mentioned above is the first email you'd get, and I just sent out the 41st, with plenty more mistakes to come. 20+ years of coding and still more mistakes to make! https://softwareclown.com if you're interested.)
jonrgrover almost 8 years ago

I used inheritance rather than composition when writing a wrapper for DataTable (before extension methods existed) in C#. I fixed it later for future companies, but it ran in about ten times the time it should have taken, and it killed the product. A little while later I offered to come back to the company to fix the mistake. It would only have taken 2 to 3 hours, but by then the product was dead.
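For readers unfamiliar with the distinction, here is a rough sketch of a composition-style wrapper, in Python rather than C#. `Table` is a hypothetical stand-in, not the real DataTable API, and `CustomerTable` is likewise invented. The comment does not say why the inheritance version was slow, so this shows only the shape of the alternative, not a performance fix.

```python
class Table:
    """Hypothetical stand-in for a library table type (not DataTable)."""

    def __init__(self):
        self.rows = []

    def add_row(self, row):
        self.rows.append(dict(row))

    def select(self, column, value):
        return [r for r in self.rows if r.get(column) == value]


class CustomerTable:
    """Wrapper by composition: it holds a Table instead of subclassing one."""

    def __init__(self, table=None):
        self._table = table if table is not None else Table()

    def add_customer(self, name, city):
        # Delegate storage to the wrapped table.
        self._table.add_row({"name": name, "city": city})

    def in_city(self, city):
        # Expose only the narrow query the application needs.
        return [r["name"] for r in self._table.select("city", city)]
```

The wrapper exposes only the operations the application needs and forwards them to the wrapped object, so the underlying table type can be swapped or optimized without touching callers.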