Open source code with profanity in comments is statistically better

292 points by dev_snd almost 2 years ago

50 comments

whoopdedo almost 2 years ago
From the research paper:

> we calculate the swear factor as the number of swearwords divided by the lines of code

That's what I suspected. Assuming that most swear words will be contained in comments, what this is actually measuring is the ratio of comments to code. In other words, code that is more heavily commented is better.

I think we already knew this.

That said I would like to see a more critical analysis. First control for comment density. Then compare code quality to swearing in comments and also variable names.
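
As a rough illustration of the quoted definition, the sketch below computes a swear factor and a crude comment density for a C snippet. The word list, the function names, and the choice to count every line (blank and comment lines included) are assumptions made for the example, not details taken from the paper.

```python
import re

# Hypothetical word list for illustration only; the paper's list is not reproduced here.
SWEARWORDS = {"damn", "crap", "shit"}


def swear_factor(source: str) -> float:
    """Swearwords divided by lines of code, following the quoted definition.

    Assumption: every line counts, including blank lines and comments.
    """
    lines = source.splitlines()
    if not lines:
        return 0.0
    words = re.findall(r"[a-z]+", source.lower())
    hits = sum(1 for word in words if word in SWEARWORDS)
    return hits / len(lines)


def comment_density(source: str) -> float:
    """Fraction of lines containing a // or /* marker (a crude proxy)."""
    lines = source.splitlines()
    if not lines:
        return 0.0
    commented = sum(1 for line in lines if "//" in line or "/*" in line)
    return commented / len(lines)


SAMPLE = """int add(int a, int b) {
    // damn, overflow is not handled here
    return a + b;
}
"""

print(swear_factor(SAMPLE), comment_density(SAMPLE))
```

Reporting both numbers side by side is what would let someone check whether the swear factor carries any signal beyond comment density, which is the control this comment asks for.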
vharuck almost 2 years ago
Possible explanation: swearing is more likely to be committed into code by people who either (1) own the code, or (2) know they're too valuable to be punished. So it self-selects.

I personally have very different commenting styles between my work and personal projects. Not that any of it's good.
bawolff almost 2 years ago
I'd bet a lot of the non-profanity code is people open sourcing code just to be impressive on resumes or for school, where the profanity code is probably real code.

Sounds likely to be a classic case of correlation != causation
betamike almost 2 years ago
I skimmed the paper, and it looks like they are looking for swearing _anywhere_ in the repos' code, not just comments.

I would be curious to see the ratio of swearing in comments vs code identifiers. I'd also be curious to see if the repos with swearing in their comments just have more comments in total. Perhaps the correlation is, "code with more comments is more likely to be higher quality".
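
One way to get at the comments-versus-identifiers split is sketched below. It is a minimal example under stated assumptions: the word list is hypothetical, comments are found with a simple regex that ignores string literals, and identifiers are only split on non-letter characters such as underscores, so camelCase names are not broken apart.

```python
import re

# Hypothetical word list for illustration only.
SWEARWORDS = {"damn", "crap"}

# Matches // line comments and /* ... */ block comments.
# Assumption: comment markers inside string literals are rare enough to ignore.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
WORD_RE = re.compile(r"[a-z]+")


def count_swears(text: str) -> int:
    """Count word-list hits in a piece of text, case-insensitively."""
    return sum(1 for word in WORD_RE.findall(text.lower()) if word in SWEARWORDS)


def split_counts(source: str) -> dict:
    """Return swearword counts for comment text and for the remaining code."""
    comments = " ".join(COMMENT_RE.findall(source))
    code = COMMENT_RE.sub(" ", source)
    return {"comments": count_swears(comments), "code": count_swears(code)}


SAMPLE = """/* this parser is crap */
int damn_counter = 0;  // damn thing never resets
"""

print(split_counts(SAMPLE))
```

Run over a whole repository, the two counts would give the ratio this comment is curious about, and the length of the extracted comment text could double as a measure of how heavily the code is commented.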
andrewedstrom almost 2 years ago
I'm sure the top comment here will be something like "this is invalid because no way can you assign a numerical value to code quality! wtf?!"

I'm withholding my own judgement on that.

For anyone curious, the authors are coming up with a code quality score using an open-source tool called SoftWipe[0]. From the paper:

> SoftWipe is an open source tool and benchmark to assess, rate, and review scientific software written in C or C++ with respect to coding standard adherence. The coding standard adherence is assessed using a set of static and dynamic code analysers such as Lizard (https://github.com/terryyin/lizard) or the Clang address sanitiser (https://clang.llvm.org/). It returns a score between 0 (low adherence) and 10 (good adherence). In order to simplify our experimental setup, we excluded the compilation warnings, which require a difficult to automate compilation of the assessed software, from the analysis using the --exclude-compilation option.

[0]: https://github.com/adrianzap/softwipe
cjsplat almost 2 years ago
While at Sun in the early 2000's, I was part of the due diligence team for an acquisition and had two days to review the entire code base of a 3 year old, 50 person software team.

This was standard practice, and the M&A policies knew that there was no way to actually understand all the code so there was a policy document to describe what to look for.

Of course the red flag things were unexpected 3rd party copyrights and/or license terms in case the code was encumbered.

But "swear words" were on the yellow flag list, in addition to "ToDo", "XXXX", and "Fix Me" types of things.

I remember thinking about places I have been in the past and that the people who used those kinds of comments tended to be the better programmers.

I mentioned this to the person leading the evaluation, and was told that the point of noticing these kinds of comments was to look more closely at the nearby code and try to decide if major functionality was missing or being faked.

It all worked out for that acquisition, but I remember being curious about whatever deal had gone bad in the distant past that made them codify this specific practice.
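
A yellow-flag pass of the kind described here can be approximated with a line scanner. The sketch below is only an illustration: the pattern list is a guess based on the markers named in the comment plus two hypothetical swearwords, not the actual policy document.

```python
import re

# Assumed marker list: variants of ToDo / Fix Me / XXXX from the comment above,
# plus two hypothetical swearwords standing in for a real word list.
YELLOW_FLAG_RE = re.compile(
    r"\b(todo|to do|fixme|fix me|xxx+|damn|crap)\b", re.IGNORECASE
)


def yellow_flags(source: str) -> list:
    """Return (line_number, matched_marker) pairs, with 1-indexed line numbers."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in YELLOW_FLAG_RE.finditer(line):
            hits.append((number, match.group(0)))
    return hits


SAMPLE = """int checkout(void) {
    /* TODO: real payment call */
    return 1;  /* XXXX fake success, fix me */
}
"""

for line_number, marker in yellow_flags(SAMPLE):
    print(line_number, marker)
```

The line numbers are the useful part: as the reviewer was told, the point is to look more closely at the code near each hit and judge whether functionality is missing or faked.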
KolmogorovComp almost 2 years ago
Correlation is not causality. Swearing in the comments will not magically make your code better, but fixing a hidden bug that you have been chasing for weeks will certainly make you swear when fixed.
skrebbel almost 2 years ago
My pet theory is that this is because honest, emotional comments are much more useful than the usual “professional” style that tries to hide it when you have no clue what you’re doing.

When it’s clear someone was stuck, frustrated, banging their head against the wall etc while writing a particular bit of code, you can refactor a lot less defensively because you know the crappy parts weren’t secretly there for a reason.

I love real, honest, emotional comments. Pour all the frustration in there. Future you and your colleagues will thank you.
version_five almost 2 years ago
I remember reading that people who swear a lot are statistically smarter. I'm sure there are lots of caveats to that, as with the code.

How long will it be before someone who doesn't understand causality starts encouraging developers to write profane comments? It wouldn't be any more absurd than lots of other non-causal behaviors I've seen pushed because somebody successful does them.
yongjik almost 2 years ago
Sorry for being off topic, but let me introduce you to the only true metric of code quality: WTF/minute.

https://www.osnews.com/story/19266/wtfsm/

One wonders if profanity in the source code interferes with reviewers and skews this important metric ...
yk almost 2 years ago
> Next, Strehmel and his team quantified the compliance of these two different sets of open source code with coding standards. The results were presented as an indicator of the quality of the source code through the SoftWipe tool.

I would read that study as coding standards lead to profanity. (Not sure whether or not coding standards should be correlated with code quality, I just think it is obvious that the measure is correlated with the conclusion in an obvious way.)

[Post posting:] Also looking at the plots, it seems that the two distributions are different, first the swear word distribution seems to be wider and second it has a clear outlier at "software quality" 8, so if anything it is an indication that something much more complex is going on.
gridspy almost 2 years ago
My Hypothesis

1. Passionate developers often swear more often when they feel safe to do so

2. Developers work better in a "safe environment" where they are not judged / forced to follow other guidelines by social or employment pressure.

And another point: those places where it's unsafe (often due to managerial micromanagement) are miserable places to work. That can drive away skilled developers or suppress them if they remain.

All this is assuming the research metric is real, though I'm not sure it is. If the metric for "code quality" is actually "precision following a coding standard" you'd have thought that rigid adherence to procedure would lead to a higher score?
dev_snd almost 2 years ago
Here's the link to the original full PDF: https://cme.h-its.org/exelixis/pubs/JanThesis.pdf
scns almost 2 years ago
Reminds me of a study. It showed that swearing enables you to tolerate pain better. It was simple. Two groups, both had to put their hands into ice water. The group that was allowed to swear could do it longer.

I'd hypothesize that programmers who actually care about quality swear more.

Individuals with AD(H)D might have a lower tolerance to pain. This, coupled with wide open sensory channels and decreased impulse control, might be a contributing factor.

[Edit] added parenthesis and link

Not correlated to swearing, but AD(H)D:

https://www.youtube.com/watch?v=XdT4DIiX7Nk
makeitdouble almost 2 years ago
An alternative take:

Swearing was more abundant in the earlier days and the code that survived until today is probably better than what got lost along the way.

In general the coding population has grown, we're more used to coding in corporate settings with code reviews, commit message processing etc. and the bulk of devs just aren't as emotional in writing their comments (some will still swear like sailors, but it's not the norm)

> The study relied solely on the source code written in C.

This in particular probably reduced the number of hobby and beginner projects in the study.
cozzyd almost 2 years ago
Improve your C code with this one neat trick!

    #define fuck if
    #define shit else
    #define ass return
fnordpiglet almost 2 years ago
When we open sourced the Netscape Navigator a major undertaking was code sanitation. This included excising licensed libraries etc (resulting in an initial release that wasn’t able to compile), but also removing enormous amounts of profanity and references to how evil Microsoft was.
gweinberg almost 2 years ago
I've noticed the same effect in HNN posts. The more profanity there is in the comments, the better the original post! Unfortunately there's no good way to take advantage of this; optimizing to the metric destroys the value of the metric.
saintradon almost 2 years ago
Isn't there a study that shows that people can tolerate pain better when they swear? This doesn't really surprise me. I'll swear in my code if I really feel like it but commits and everything else I keep more presentable.
mikecoles almost 2 years ago
My code, by twisting this finding, is bug-free.
alpaca128 almost 2 years ago
> Much of the community considers profanity as a vulgar display of lack of intelligence and education, because why use profanity when you have a rich vocabulary?

Why not use the full range of one's vocabulary?
aosmith almost 2 years ago
This is a normal part of software... You find something really bad, git blame says it's your own, you leave a vulgar comment about how bad it is for the next guy.
ratel almost 2 years ago
My favorite (almost) obscene quote I found reviewing code, although I never could find the back story to it:

"Which idiot wrote this crap?

You did!

Which idiot hired me?"

I think this also points to the statistical significance. Code that has been worked over a couple of times and/or has been worked on by different people for all those hard and fringe problems will be better, but also accumulate more comments venting the trouble people had fixing them. It does not seem very interesting.
ydnaclementine almost 2 years ago
One rule I live by is I never ever swear in comments or commits, just not worth it. Even in personal projects.

But one of my favorite projects to ctrl-f for "fuck" is in the jedi outcast source code. Since it is proprietary and was a good game: https://github.com/search?q=repo%3Agrayj%2FJedi-Outcast+fuck&type=code
danans almost 2 years ago
I bet there are a lot of less visible but stronger correlations to code quality, including incentive structures, programmer time spent to code ratio, quality of tools, quality of documentation, etc.

Swearing in code, however, is much easier to quantify, and of course chosen to chuff up those who think swearing itself is a virtue.

It would be a mistake to draw the conclusion that allowing swearing in code will improve code quality.
tsukikage almost 2 years ago
"In 2018, Adam Farley, a contributor to the OpenJDK project, the presence of profanity in the source code."

Someone accidentally a verb.
z3t4 almost 2 years ago
Or that those that need to adhere to strict linters, formatting, etc are less happy with their life and thus use more profanity. E.g. "code quality" tools that do not have any benefits besides finding potential bugs that do not affect the state of the program, like lines that are 71 characters long instead of 70 characters.
wjholden almost 2 years ago
This story and the resulting discussion here on HN are such a great example of data mining with statistical methods. The researchers found a non-obvious result using statistics. Now we're all speculating about the underlying cause, trying to apply our domain knowledge to explain the result.
sircastor almost 2 years ago
Anecdotally, it seems to me that I work with a lot of folks that swear frequently but not in their code comments.
jansommer almost 2 years ago
I sometimes feel like swearing in the comments or commit messages, which can be the first thought coming to mind, and spend a few resources on writing in a kinder way.

Perhaps I could use this as an excuse for not reaching a deadline...
pyeri almost 2 years ago
Probably same goes about the people? The emotionally charged person that swears and gives a mouthful usually ACTS better than the cold calculated one that speaks the right words but full of cunning inside?
moonchrome almost 2 years ago
Being passionate about code correlates with quality - shocking
BizarreByte almost 2 years ago
I find it a bit suspect that swearing would ever even get through a proper code review. It’s extremely unprofessional, I would tell someone to remove it.
charonn0 almost 2 years ago
I only swear in commit messages. Am I doing it wrong?
pickingdinner almost 2 years ago
Not to get too philosophical, but does profanity measure the children in the room, or does it measure the adults in the room?

Schrodinger's chat (room).
briantakita almost 2 years ago
Until every fucking wanker who reads this article adds profanity to their shitty code expecting their bullshit to be better.
coding123 almost 2 years ago
In other words, it increases the chance that the programmer is in a specific locale (like the US?) such that the location has fewer bad programmers than other locations.

And it probably increases the chance that the person is fed up with fixing someone else's code, hence the anger.
twodave almost 2 years ago
The best CS professor I ever had always said that the #1 language among programmers is profanity.
paxys almost 2 years ago
I'd first like to know how they judged what is "good" vs "bad" code.
DonHopkins almost 2 years ago
The original terminal emulator terminal.el in gnu emacs, written by mly (Richard Mlynarik), was particularly salty. I finally tracked down a copy, but it looks like somebody complained and in 1990 it was begrudgingly cleaned up a bit, so some of the worst stuff was moved out into a separate file called term-nasty.el for posterity (you, here, now), so as not to give "in to the pressure to censor obscenity that currently threatens freedom of speech and of the press in the US" (oh, Richard <3 ):

https://opensource.apple.com/source/emacs/emacs-59.0.80/emacs/lisp/ChangeLog.3

1990-08-26 Richard Stallman (rms@mole.ai.mit.edu)

* terminal.el: Move possibly offensive comments to term-nasty.el.

https://www.digiater.nl/openvms/freeware/v10/emacs/common/lisp/terminal.el

[...]

    ;; disgusting unix-required shit
    ;; Are we living twenty years in the past yet?
    (defun te-losing-unix () nil)

[...]

    ;; (A version of the following comment which might be distractingly offensive
    ;; to some readers has been moved to term-nasty.el.)
    ;; unix lacks ITS-style tty control...
    (defun te-process-output (preemptable)
      ;;>> There seems no good reason to ever disallow preemption
      (setq preemptable t)

[...]

    ;; I suppose if I split the guts of this out into a separate
    ;; function we could trivially emulate different terminals
    ;; Who cares in any case? (Apart from stupid losers using rlogin)

[...]

    (?\C-b . te-backward-char)
    ;; should be C-d, but un*x
    ;; pty's won't send \004 through!
    ;; Can you believe this?

[...]

    ;; Did I ask to be sent these characters?
    ;; I don't remember doing so, either.
    ;; (Perhaps some operating system or
    ;; other is completely incompetent...)

[...]

    ;;-- Not-widely-known (ie nonstandard) flags, which mean
    ;; o writing in the last column of the last line
    ;;   doesn't cause idiotic scrolling, and
    ;; o don't use idiotische c-s/c-q sogenannte
    ;;   ``flow control'' auf keinen Fall.
    "LP:NF:"
    ;;-- For stupid or obsolete programs
    "ic=^p_!:dc=^pd!:al=^p^o!:dl=^p^k!:ho=^p= :"
    ;;-- For disgusting programs.
    ;; (VI? What losers need these, I wonder?)
    "im=:ei=:dm=:ed=:mi:do=^p^j:nl=^p^j:bs:")))

[...]

    (setq te-process
          (start-process "terminal-emulator" (current-buffer)
                         "/bin/sh" "-c"
                         ;; Yuck!!! Start a shell to set some terminal
                         ;; control characteristics. Then start the
                         ;; "env" program to setup the terminal type
                         ;; Then finally start the program we wanted.
                         (format "%s; exec %s"
                                 te-stty-string
                                 (mapconcat 'te-quote-arg-for-sh
                                            (cons program args) " ")))))

[...]

    ;;;; what a complete loss

[...]

https://www.digiater.nl/openvms/freeware/v10/emacs/common/lisp/term-nasty.el

    ;;; term-nasty.el --- Damned Things from terminfo.el
    ;;; This file is in the public domain, and was written by Stallman and Mlynarik

    ;;; Commentary:

    ;; Some people used to be bothered by the following comments that were
    ;; found in terminal.el. We decided they were distracting, and that it
    ;; was better not to have them there. On the other hand, we didn't want
    ;; to appear to be giving in to the pressure to censor obscenity that
    ;; currently threatens freedom of speech and of the press in the US.
    ;; So we decided to put the comments here.

    ;;; Code:

    These comments were removed from te-losing-unix.
    ;(what lossage)
    ;(message "fucking-unix: %d" char)

    This was before te-process-output.
    ;; fucking unix has -such- braindamaged lack of tty control...

    And about the need to handle output characters such as C-m, C-g, C-h
    and C-i even though the termcap doesn't say they may be used:
    ;fuck me harder
    ;again and again!
    ;wa12id!!
    ;(spiked)

    ;;; term-nasty.el ends here

Note to the gentle readers: "wa12id" stands for "with a 12 inch dildo".

Jamie Zawinski kept Lucid Emacs nasty:

https://groups.google.com/g/gnu.misc.discuss/c/U5oXKOfWinQ/m/xek-XhmQ9eoJ

Noah Friedman, Aug 3, 1992, 4:54:20 AM

In article <15i2n9...@hal.com> wood...@hal.com (Nathan Hess) writes:

>In article <FRIEDMAN.9...@nutrimat.gnu.ai.mit.edu>, friedman@gnu (Noah Friedman) writes:

>>It's by no means necessary, but it's *funny*.

>Along the same lines, look at lisp/terminal.el

Of course, terminal.el is actually useful, albeit not terribly powerful.

(and terminal.el is pretty mild compared to some of the other things I've seen written by mly. :-))

Incidentally, a lot of terminal.el has been rewritten in version 19.

Too bad... I liked all the variable names and comments in the original.

Jamie Zawinski, Aug 5, 1992, 12:40:38 AM

In the FSF-distributed Emacs 19, the obscenities (will) have been stripped from terminal.el, though they are preserved in a file called term-nasty.el, to avoid appearing to bow to the censors.

In Lucid GNU Emacs, terminal.el will remain as nasty as it ever was.

-- Jamie "Truth, Justice, and the Fucking First Amendment" Zawinski
francasso almost 2 years ago
This is an example where correlation does imply causation IMHO
twodave almost 2 years ago
Article should have included some juicy examples. 4/10
bjornsing almost 2 years ago
Sure, that means someone cares.
pak9rabid almost 2 years ago
// fuckin eh
stainablesteel almost 2 years ago
someone hired a team to review 10k repos just for this?
ftxbro almost 2 years ago
now we let goodhart's take its course
happytiger almost 2 years ago
Fuck yea it is.
OnlyMortal almost 2 years ago
Fuck that code!
helmsb almost 2 years ago
Correlation ≠ Causation
yodsanklai almost 2 years ago
I'm surprised this is still a thing. I suppose this is associated with "toxic masculinity" which is frowned upon nowadays. I'm always a bit worried that I forget to edit my swearing and that it goes to code review.