It's about what broke, not who broke it

215 points by rodrodrod about 7 years ago

29 comments

_n_b_ about 7 years ago
I work in the nuclear industry, where most places are pretty good about maintaining a "blame-free" culture. You focus on what processes and procedures failed, what controls were missing, etc., that allowed somebody to make a mistake.

As this attitude was adopted, things shifted too far (at least in the opinion of industry groups, and in my observation), to the point where people underperforming to the point of negligence weren't blamed, and the corrective actions to prevent recurrences of the problems they caused ended up being cumbersome and expensive without really improving safety. (And in this industry, everything relates back to safety.)

In recent years, things have shifted back towards a more pragmatic middle ground. There are tools to assess whether a problem was organizational (and it still almost always is) or whether there was some element of personal negligence involved. This follows an industry-wide trend of trying to fix the real problems that affect safety and operations, not over-engineer cumbersome corrective actions.
nevatiaritika about 7 years ago
My manager at work has the reverse attitude: the person who broke it matters more than what broke, how we fixed it, or how to avoid it in the future. I have seen people get taunted for a bug they caused two years ago, a bug that didn't affect any revenue or was pretty easy to fix. And of course it still gets pointed out during appraisals.

It's a nightmare, because there's no room left for experimentation. Everyone just sticks to the template, afraid to do more than required, never deleting unused code, etc. An attitude like this never ever helps!
csours about 7 years ago
I took down an assembly plant by clicking on a network status icon from a particular hardware supplier.

Over the weekend, firmware patches were applied, and the server rebooted. After the reboot, everything worked fine, so the tech marked the change successful and went home.

Well, apparently the NICs would work just fine, but not all settings were applied until you opened the UI provided by the vendor. When you opened the UI, the final settings would be applied, and the NICs would reboot, just long enough to kill TCP connections.

That loss of TCP connection killed the parent system, and then all the child systems also died when the parent died.

So who would you even blame there? The guy who set the tripwire? The guy who tripped on the tripwire? The guy who designed a system that could be brought down by a momentary loss of connection?

I'm lucky that my boss wasn't the type to point fingers, because I was the guy who was there when it happened, and it sure got a lot of attention.
aytekin about 7 years ago
We have put in place a rule that has made our system very strong over the years: we don’t care if you broke the site, just fix it quickly and, more importantly, write a test that will catch the same problem if it happens again.

Every time someone breaks something, we get harder to break.
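
For anyone who wants to see what that rule looks like in code, here is a minimal sketch. Everything in it is assumed for illustration and not taken from the comment above: `parse_price` is a hypothetical function, and the "outage" is an imagined crash on an empty form field. The point is only the shape: fix the bug, then pin the exact failing input in a test.

```python
def parse_price(text: str) -> float:
    """Hypothetical function that (in this made-up story) once broke the site."""
    cleaned = text.strip().lstrip("$")
    if not cleaned:
        # The fix: the imagined original called float("") here and raised
        # ValueError whenever the form field was left empty.
        return 0.0
    return float(cleaned)


def test_empty_input_does_not_crash():
    # Regression test: pins the exact input that caused the break,
    # so the same mistake is caught before it ships again.
    assert parse_price("") == 0.0
    assert parse_price("   ") == 0.0


def test_normal_input_still_works():
    # Guards against the fix itself changing the ordinary behaviour.
    assert parse_price("$19.99") == 19.99
```

The test functions follow the `test_` naming that runners such as pytest collect automatically, so each past breakage becomes one more check that runs on every change.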
CoolGuySteve about 7 years ago
I used to think this way until I started working with someone who was *nearly always* the one who broke it. At some point we just had to face the fact that his work was unreliable even after significant mentoring.

If the tasks were difficult that would be one thing, but I'm talking about stuff like committing code to prod that was clearly never even executed once.
ComputerGuru about 7 years ago
I do a lot of open source work, and unfortunately a very common poison is focusing on “who broke it,” which is especially disparaging when done in public. A particularly nasty habit is when outsider Alice opens a GitHub issue saying “xxxx is broken” and developer Bob replies with “yup, @Charlie’s commit fubar’d everything.”

Unfortunately both very demoralizing and very common.
userbinator about 7 years ago
"I had to then tell them that this person still worked there."

The old IBM story is worth mentioning in relation to this: http://www.mbiconcepts.com/watson-sr-and-thoughtful-mistakes.html
kosei about 7 years ago
When someone makes a mistake, that's an incredible investment in them. I'm always surprised* when people try to throw it away by firing them or making them want to quit. Help them learn from it and apply that knowledge moving forward. Otherwise they're just taking that knowledge and using it to help another company.

*Obviously with the caveat that some people are repeat offenders who are careless or just not good employees
ashleyn about 7 years ago
Reminds me of when someone ran "rm -rf /" at Pixar and deleted all of Toy Story 2.

The backups were crap, and the only reason it survived was that someone had taken a server home to work from.

When all was said and done, they never really found out who did it; they just made organisational changes to ensure it didn't happen again. No blame game.
partycoder about 7 years ago
If the opposing team scores in soccer, who is to blame? The goalkeeper? The defenders? The coach? The whole team? The referee? Nobody?

Preventing goals means that the strategy needs to ensure good ball possession and staying on the offense, to reduce the burden on the defense, which in turn reduces the burden on the goalkeeper, who is the last line of defense.

If the last line of defense fails, that's not an individual failure but a team failure, coach included, since the coach selects who gets to play, when, and in what roles.

Same in software: bad management passes the burden to developers, bad development passes the burden to testers, bad testing passes the burden to release management.
zer00eyz about 7 years ago
It's not about what's broken, it's about what you DO when it is broken.

This is my favorite interview question to ask candidates:

"What is your all-time biggest screw-up, and how did you come back from it?" I then tell them the story of me losing several hundred thousand dollars, and the funny things that happened around it, to set the tone. If you have been in tech for any length of time you have one of these stories (if not a few). I have heard some great ones by simply asking, and it gives great insight into a candidate (humor, stress response, the things they have seen).
dancek about 7 years ago
I think this is an important piece of organizational culture. If the first reaction to problems is blame and punishment, issues are covered up. But if finding bugs and fixing them is considered valuable, there will be fewer issues in the long run.

Of course I write enough stupid bugs myself that I'm bound to think this way.
PeterStuer about 7 years ago
I found this to be the touchstone for spotting a dysfunctional enterprise. There it is all about the 'who', never about the fix. In those environments every new project is CYA from day 1. The disconnect between daily activities and the success of the company is so large that all actions and projects are just about personal politics. A failure that can be blamed on the right target is often even a preferred outcome, as eliminating a competitor for a promotion is even better than not having failed. If you find yourself in such an environment, try to leave ASAP.
silveroriole about 7 years ago
Sure, if you have a huge company and a revolving door, the solution is a bunch of processes and idiot-proof safety nets, and no one person is to blame for most bugs. If you’re in a small company, the solution is to teach the devs by showing them what mistakes they made. I don’t think that’s a bad thing; if you write code, that code is your responsibility, and you shouldn’t be sensitive about people telling you your code is broken.

Also, focusing on the code itself, for me at least, easily leads to thoughts like “this function is crap! What idiot wrote this!?”. Finding out who broke it leads to thoughts like “I see John introduced this buggy function. I should go check with him, maybe he had a good reason.”
gjvc about 7 years ago
Mishaps occur on a spectrum, and may be categorised from mistakes, carelessness, and recklessness through to malicious intent, and any combination of the above all along said spectrum.

Though these categories may seem to be orientated towards individuals' actions, they may be used to determine where the risk lies in systems (and people's use thereof) and how measures can be taken to avoid the same problems being repeated.

Much of the time, the complexity of systems (using the term in the widest possible sense) is underestimated, and automated integrity checks are not used as religiously as they might be.
red_admiral about 7 years ago
I'm 90% in agreement. Her workplace definitely sounds like somewhere I'd consider working myself (if I were looking for a job).

There are some things that I consider basic competence standards, like not storing passwords in plain text in any system you're building. I wouldn't fire an intern for getting that wrong but I also wouldn't let an intern near a production authentication system without some oversight.

If someone is a security engineer with a responsibility to know these kinds of things as part of their job role and certification, then if they'd implemented passwords-in-clear to cut corners somewhere, even if it's to meet a really important deadline, I'd be extremely unhappy. Of course I'd establish the general pattern of what had gone wrong first, and if it was a superior being abusive to the security engineer to get the product launched on time I'd still be really unhappy but not at the engineer.

Occasionally one does follow the chain of causes back though and finds not the organisation's culture but an individual who really should have known better.
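
To make the "not storing passwords in plain text" baseline concrete, here is a minimal sketch using only Python's standard library. It is an illustration, not anything described in the comment above: the function names and the iteration count are assumptions, and a real system would more likely lean on a maintained password-hashing library than on hand-rolled helpers.

```python
import hashlib
import hmac
import secrets

# Assumed work factor for illustration only; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Only these are stored, never the password."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)
```

A login check then becomes `verify_password(attempt, stored_salt, stored_digest)`, and a leaked database exposes salted digests instead of the passwords themselves.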
jancsika about 7 years ago
The answer requires context, at least for FLOSS projects.

If unlucky dev #13 broke something because humans can no longer reason about the relevant part of the system, then it doesn't matter that #13 was the one who broke something. What really matters is that people get busy removing the sandtraps from their software.

However, many FLOSS projects run on the sheer joy and freedom that comes with maintaining a particular subsystem or area of the code. Most devs have a quick understanding of the responsibilities associated with that. But in cases where that responsibility doesn't come naturally, *who* broke it becomes the focus. Addressing that issue will determine whether or not future breakages occur.
koliber about 7 years ago
It isn't about who broke it. But if there is a person on the team who continually breaks things, does not learn from their mistakes and repeats them, or is not truthful when they break things, the team should react appropriately.
hennsen about 7 years ago
It’s also about how it broke. And whoever broke it is sometimes the person who can say a lot, if not the most, about that. Therefore I don’t recommend teaching people never to talk about the person who took an action that led to a disaster, but rather encouraging a culture where admitting to having taken a wrong step doesn’t lead to punishment, neither financial nor social. Who broke it is an important part of the analysis, helping people in the organization learn from each other’s errors. Making it taboo to talk about it means missing a chance for development...
pronoiac about 7 years ago
Ooh, this is good. Part of it's covered under the name of "blameless post-mortems," but I don't remember searching for similar breakage, which is a great idea.
iramiller about 7 years ago
This seems like a classic case of applying the Five Whys [https://en.m.wikipedia.org/wiki/5_Whys] methodology for root cause analysis.
drdeadringer about 7 years ago
I don't see how this is not "better mousetrap, better mouse". Phrases from "they build a better fool" to "they build a better US Navy crewman" are a hundred a penny, and yes, I've experienced the other side of this.

The best programmer vs the worst user, and every mix in between, will produce situations needing the attention this article addresses.

I've been in this situation on both sides. "Of course it should be clear what this phrase means, how could they fuck this up?" ... and ... "I have no idea what this means, both choices could mean what I want but either choice ends me up on the wrong page of this bullshit 'choose my own adventure' that I'll have to repeat if I'm wrong".

I'm interested in finding out if I'm understanding this wrong, and/or other thoughts.
gowld about 7 years ago
The SRE Book teaches a lot of the lessons that this blog teaches. https://landing.google.com/sre/book.html
donttrack about 7 years ago
I totally agree. It's usually the hallmark of a good team if they have the "we are in this together" attitude.
lkrubner about 7 years ago
There is the risk of conflating two separate types of problem. There are problems that arise from the complexity of the code, and problems that arise from particular people.

If a programmer has a habit of sloppy code, or violates the team's standards in some ways, then a good leader will keep track of the fact that one person is responsible for a recurring pattern of mistakes.

I absolutely agree with Rachel By The Bay, that many bugs arise from the complexity of the situation, and it would be wrong to blame the person who just happens to trip over that bug. But a good leader should take action against anyone who repeatedly screws up, and who seems unwilling to improve.

I've written about this before. This is from "How To Destroy A Tech Startup In Three Easy Steps":

----------------------

Wednesday, July 15th, 2015

I got to work at 11:00 a.m. John announced that our demo had stopped working. Sipping my coffee, I logged into the server to find out what the problem was. I looked at the error log for the API app, but it seemed okay. Then I checked the error log for the NLP app.

java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(String.java:1955)
    at Celolot.nlp.Extractor.fuckBitchesGetMoney.java:87

What the hell was this?

“FuckBitchesGetMoney”?

What kind of name is that for a function?

A computer programmer can name their functions anything, but there are some “best practices” regarding names, and this particular function name violated all of them.

I asked Sital why he had given this name to his function. He looked at me straight, shrugged, and stated that the name was from the 1995 song by The Notorious B.I.G., “Get Money.” I replied that rap lyrics were not part of our naming conventions. He promised that he would change it.

Coming from anyone else, I might have interpreted the function name as an act of angry rebellion, but Sital was too forthright for that. Apparently, he thought the name was funny and went with it because he wanted to add some humor to his code. Never did he stop to think it might be unprofessional.

I looked through his code and found several other functions that had inappropriate names. I sent him a list and asked him to change their names to something standard.

A week later the function was still there. FuckBitchesGetMoney. Yet I don’t think that any of this was a deliberate act of rebellion. He was just oddly forgetful and disorganized.

https://www.amazon.com/Destroy-Tech-Startup-Easy-Steps/dp/0998997617/
teddyh about 7 years ago
What’s that old saying: “Fix the problem, not the blame”?
nstj about 7 years ago
I like this site and hadn't really read much from it before. It's interesting how much it's been front-paged over the last couple of weeks: https://news.ycombinator.com/from?site=rachelbythebay.com
BrissyCoder about 7 years ago
I don't know. Where I work no discernible pattern can be found with the "what" that broke.

It's always the same f***ing people that break it though!
erikb about 7 years ago
It makes sense from a logical perspective, but in practice that's not how it works.

In reality, if something breaks and you are stupid enough to mention it, then you are (a) considered an a-hole for blaming <responsible-person-for-topic>, even if you didn't, and (b) responsible for fixing it.

So your main job is to somehow make your stuff work, silently, despite all the other stuff that doesn't work and all the other people who try to stop you. The less you criticize the better. What you get in return is that if you fuck up, people will try to avoid blaming you as well. Also, if you don't succeed at making anything happen you get a little arrogant smile from your manager and a mediocre feedback round. But otherwise nothing happens.

The only change to that pattern happens when you piss off your manager or your manager's manager. Then suddenly each and every activity you do will be scrutinized, and if there's a problem it will be used against you. The best they can hope for is that you go away by yourself.