Personally I found most of those examples to be not great. I think they would be improved by better names overall, better-named private methods and, most importantly, well-structured companion test classes proving the behavior.<p>I really don't want to read another developer's comments on their mental state while I am feverishly debugging their code during a 3am page.<p>I've lost count of the number of times I have been misled by an outdated comment. I generally open git blame now to see if the comments predate the code. Tests, on the other hand, almost never lie.<p>Readmes or comments explaining why something exists, versus what it allegedly does, are generally much better. If you can't write expressive code, why would you assume your English is much better?
Speaking of source code comments, antirez (of Redis fame) wrote a fantastic article[0] about that topic some time ago and I still recommend it to colleagues whenever they make the hollow statement that "code should and can be intelligible on its own, without any comments".<p>[0]: <a href="https://web.archive.org/web/20210226004600/http://antirez.com/news/124" rel="nofollow">https://web.archive.org/web/20210226004600/http://antirez.co...</a> (I still don't understand why he deleted his blog)
For the first example with conditions, I much prefer rolling the "why" of the comments into boolean variable names where possible e.g.<p><pre><code> // We still have burst budget for *all* tokens requests.
if self.one_time_burst >= tokens {
...
} else {
// We still have burst budget for *some* of the tokens requests.
</code></pre>
becomes something like (I'm missing context but you get the idea):<p><pre><code> let enough_budget_for_all_token_requests = self.one_time_burst >= tokens;
 if enough_budget_for_all_token_requests {
     ...
 } else {
     ...
 }
</code></pre>
I don't see this often though. I see the commenting pattern a lot where the comment usually duplicates the "what" of the conditional code, and it's often ambiguous whether the comment is talking about the condition or the whole branch.<p>I think a descriptive variable makes the code easier to skim, and it's more precise and less likely to become inaccurate. For a big complex conditional, you can break it up into several well-named boolean variables you compose together (but obviously comment the bits that can't be covered by a concise variable name), as in the sketch below.
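A minimal sketch of that composition (the names and signature here are invented for illustration, not Firecracker's actual API):<p><pre><code> // Small, well-named booleans instead of one opaque condition plus a comment.
 fn should_throttle(burst_left: u64, budget: u64, requested: u64) -> bool {
     let burst_covers_request = burst_left >= requested;
     let budget_covers_remainder = budget >= requested.saturating_sub(burst_left);
     !(burst_covers_request || budget_covers_remainder)
 }

 fn main() {
     assert!(!should_throttle(5, 0, 3)); // burst alone covers it
     assert!(!should_throttle(2, 4, 5)); // burst + budget together cover it
     assert!(should_throttle(1, 1, 5));  // neither does: throttle
 }
</code></pre>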
some types of comments are redundant.<p>These days I write my code as comments first and then write the code.<p>But for example this one<p><pre><code> // If either token bucket capacity or refill time is 0, disable limiting.
if size == 0 || complete_refill_time_ms == 0 {
return None;
}
</code></pre>
The comment is the same as the code. It is pointless. Instead I might say under what circumstances I expect them to be zero, or why I'm disabling limiting when they are zero.<p>This one is great: though at first glance I can't relate the actual code to the comment, at least I understand why the code is there.<p><pre><code> // We still have burst budget for *some* of the tokens requests.
// The tokens left unfulfilled will be consumed from current `self.budget`.
tokens -= self.one_time_burst;
self.one_time_burst = 0;</code></pre>
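For the first snippet, that might look like this (the stated reason is my guess at the intent, not something taken from the Firecracker codebase):<p><pre><code> // A size or refill time of 0 is how callers ask for "no limit", so treat it
 // as limiting disabled rather than building a bucket that can never refill.
 if size == 0 || complete_refill_time_ms == 0 {
     return None;
 }
</code></pre>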
The Protobuf documentation made me cringe. Every character of its "personality" adds a cognitive burden for the reader and a maintenance burden for anyone who has to update it. If a developer wants to be cute while keeping the same brevity and clarity the comments would otherwise have, fine. Otherwise it's self-indulgent performative trash that should be rejected in review.
I just prefer nice, long, descriptive function names.<p>I was working on an unfamiliar codebase yesterday and had to work with the postMessage function. It was React, so one might assume that it sends the message to the backend, which was my initial assumption. Nope. It posted it to the page. Didn't help that another utility had another postMessage that was about what to do after the message was sent.
A lot of this I find to be kinda bad commenting: you've got to remember NOT to comment directly on what the code is doing, but rather to explain why. With future updates your comments will eventually be wrong if they do anything other than give context.<p>If all your comment does is say what the code on the next line is doing, don't write it. Instead, try to make the next line more readable.
More and more I feel these huge comment blocks, with inline comments sprinkled everywhere below them, should really be a separate documentation file with its own structure, where anyone not familiar enough can get up to speed and understand the core principles behind the code written here.<p>For instance in the firecracker example, they don't write 3 pages of block comments to explain the token bucket algorithm (or do they?), as anyone could go read the wikipedia article instead.<p>Comments living with the code have their benefits, but here I can't imagine these huge comment blocks or algorithm-related warnings getting revised anytime outside of a project-wide refactoring.<p>What we also miss by having these comments inlined is the higher perspective of how it works with the other classes or how it fits in its running context. Anything that spans multiple classes and/or methods makes it difficult to decide where to put the info and what scope it should have. People usually end up explaining the same things in 3 or 4 places in subtly different ways, which makes no one happy.<p>An example of that, this struct description:<p>> /// TokenBucket provides a lower level interface to rate limiting with a
/// configurable capacity, refill-rate and initial burst.<p>This comment is hinting at a design where ‘TokenBucket’ needs to be a configurable lower level interface. But what is that design, what needs it to be lower level, are there higher level interfaces? I'll surely end up spelunking the rest of the doc to get answers to that.
I like meta comments about the business or technical decisions behind the code. You can come up with clever method and variable names to help enlighten the reader, but nothing beats an intimate, first-hand account explaining the <i>why</i>.
I agree with others, code commentary shouldn't be a novel. Comments should only provide info that the code can't make easy to understand. Why it was done this way, what it's intended to do, complex or nonlinear interactions that aren't clear from the code. But adding things that make reading the code sound more like English isn't helpful, it's extra work and prone to getting out of sync.
The code should explain what it's doing (self documenting code) and tests should explain why it's doing it.<p>Comments tend to just become a place for misinformation or get disconnected from the actual logic.<p>Adding more comments sometimes doesn't clarify the situation, it just acts as a second source of truth.
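A tiny, self-contained illustration of "the test carries the why" (the function here is a hypothetical stand-in so the example compiles on its own, not Firecracker's code):<p><pre><code> // Hypothetical stand-in for the real constructor.
 fn new_bucket(size: u64, complete_refill_time_ms: u64) -> Option<(u64, u64)> {
     if size == 0 || complete_refill_time_ms == 0 {
         return None;
     }
     Some((size, complete_refill_time_ms))
 }

 #[test]
 fn zero_size_or_refill_time_disables_limiting_instead_of_blocking_forever() {
     // A bucket that could never refill would block callers forever, so a 0
     // here is treated as "no limiter at all".
     assert!(new_bucket(0, 1000).is_none());
     assert!(new_bucket(1000, 0).is_none());
 }
</code></pre>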
Worth comparing: <a href="http://steve-yegge.blogspot.com/2008/02/portrait-of-n00b.html" rel="nofollow">http://steve-yegge.blogspot.com/2008/02/portrait-of-n00b.htm...</a> A "n00b" is scared of code, and wants as much help to understand it as possible. An "expert" already knows all they need to know to work on the code, and is more productive by putting as much code on the screen as they can.<p>Rather, I think part of the point is that context matters. If the team is small and the project is changing rapidly, the needs and expectations for documentation are different than if the project or team is large and change is slow.
Time and active readers.<p>Want well documented (and tested and thought out) code?<p>Then give the coders people outside of their team / hierarchy who are going to check it / read it / use it.<p>And then give them time.<p>And repeat.
Don't sleep on /usr/share/doc/, /usr/share/info/, and /usr/share/man/ either. Code comments are not a replacement for <i>real documentation</i> that is written from the perspective of either a user or a developer. If you can somehow auto-generate meaningful documentation, that's great, but often whatever is auto-generated is only good in theory.
The view I've developed on documentation over time is that it should be addressed outside in. What I look for--and also create, where possible--is a high-level architecture diagram that depicts all major components of the system. Then comes some description of typical control flow through the system, either visually or via textual description (for example, "the request hits the server, then it goes here, does that, etc.").<p>What would be even better is to create a model of the system using something external altogether, like TLA+. It need not be perfect right at the beginning, but over time, invariants can be put in place, and even possibly mechanically checked by a model checker. Just the Next-state disjunct alone would work great - it'll give me the superset of actions the system is capable of.<p>To me, well-commented functions are all well and good; but without a high level understanding of how everything fits and works together, I doubt they'd be very helpful. It's like winning the battle but losing the war.<p>I'm not a fan of relying on IDE-centric features either. What if my IDE is not yours?!
I had a lot of trouble reading that code. I kept getting distracted by the comments and missed the code. The code seems pretty self-explanatory. I have some changes I would make, but they really depend on performance vs depth (modulo the burst vs budget bug).<p><pre><code> // Hit the bucket bottom, let's auto-replenish and try again.
self.auto_replenish();
</code></pre>
Is ugly. Why repeat in English exactly what the code says? I don't even know what language this is, but I know it's replenishing the tokens in the bucket.<p>All of the comments in reduce are redundant. This is what I would write at the top, so that anyone using the method doesn't have to page through to figure out what it does; it also makes the "smells" more obvious:<p><pre><code> // reduce: attempt to consume tokens from bucket,
// starting with one_time_burst, overflowing to budget.
// (keep comment only if _expensive_): Performs auto-replenish
// when burst is emptied and budget has insufficient tokens.
// Asserts size, but only on replenishment.
// On OverConsumption, entire bucket is emptied.
// On Failure, only one_time_burst is emptied.
// In all cases, tokens is reduced by the amount fulfilled.
</code></pre>
I don't understand why OverConsumption is different to Failure. Both will result in throttling by the caller. The reason for the difference should be documented.<p>I'm also curious why burst is consumed, then budget. I would expect _budget_ to be consumed first (with refill) with overflow into burst? My expectation is for burst and budget to have different refill schedules in auto_replenish, so using burst first would result in more failures by missing refill opportunities.<p>I'm also curious why the size assertion is only on requests that result in refills. Is it guaranteed that size = max(one_time_burst + budget)? Why care about the difference? Is hogging the tokens "bad"?<p>Then, I'd work to remove the need for those comments - particularly the need to explain what happens on error - which should be "nothing", or that the bucket is emptied of whatever is available, not something in between.<p>Finally I'd remove the mutation of tokens, returning either unfulfilled or fulfilled in the return value.
There are too many comments in there.<p>Comments are not the first tool people should reach for, because:<p>1. They're a lazy way to make the code clear; most code should be self-descriptive. If a team is spamming comments, it's likely a bad code base already.<p>2. There's no guarantee the comment matches the code: it's natural language and there's no obvious way to test that. After a few rounds of modification, the comments might still look like they make sense, but that's no longer how the code works.<p>Two alternatives:<p>1. Declarative programming: try to lean on declarative programming styles more. For example, top-down style dynamic programming or memoization is always more readable and less error-prone than the bottom-up version (a quick sketch follows below). You don't have to fully adopt functional programming or logic programming, but do take some inspiration from them.<p>2. Testing. Tests are like comments in that they reflect some meaning of the code. Tests are not like comments in that, if the code breaks, they break.<p>Of course, comments are very valuable, but they are most valuable when applied in the right spots only.
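As a quick sketch of the top-down style (using Fibonacci as a stand-in example, not anything from the article):<p><pre><code> use std::collections::HashMap;

 // Top-down (memoized) Fibonacci: the recurrence reads exactly like its
 // mathematical definition, and the cache is the only bookkeeping.
 fn fib(n: u64, memo: &mut HashMap<u64, u64>) -> u64 {
     if n < 2 {
         return n;
     }
     if let Some(&v) = memo.get(&n) {
         return v;
     }
     let v = fib(n - 1, memo) + fib(n - 2, memo);
     memo.insert(n, v);
     v
 }

 fn main() {
     let mut memo = HashMap::new();
     assert_eq!(fib(50, &mut memo), 12_586_269_025);
 }
</code></pre>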
When I see code with a lot of inline comments, I can’t help but think that a lot of them would make sense to implement as trace log messages instead. If you’re going to write all these messages documenting the flow of the program, why not give yourself the ability to flip a switch and output them somewhere?
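A minimal sketch of that idea, using a hand-rolled switch so it stays self-contained (a real codebase would more likely use the log crate's trace! macro plus a logger backend; the function below is adapted from the burst snippet above, not taken verbatim):<p><pre><code> // Flip this to true (or wire it to an env var) to see the flow at runtime.
 const TRACE: bool = false;

 macro_rules! trace {
     ($($arg:tt)*) => {
         if TRACE {
             eprintln!($($arg)*);
         }
     };
 }

 fn consume_burst(one_time_burst: &mut u64, tokens: &mut u64) {
     trace!("burst budget covers only part of the request; consuming all of it");
     *tokens -= *one_time_burst;
     *one_time_burst = 0;
 }

 fn main() {
     let (mut burst, mut tokens) = (5u64, 8u64);
     consume_burst(&mut burst, &mut tokens);
     assert_eq!((burst, tokens), (0, 3));
 }
</code></pre>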
I have a really similar motivation behind a project that aims to teach Jetpack Compose in the same fashion and funny enough, the name is also pretty similar to the title of this thread - Learn Jetpack Compose By Example. You can find it here - <a href="https://github.com/vinaygaba/Learn-Jetpack-Compose-By-Example" rel="nofollow">https://github.com/vinaygaba/Learn-Jetpack-Compose-By-Exampl...</a><p>There's a ton of comments in every example (even though they might be redundant across the examples) and the idea is that you should be able to learn this new framework by merely reading through the examples.
In my team fit/engineering values interviews I always ask about the benefits of autotests. People who say that tests written alongside the code are the best documentation get a mental "plus one point".
I have seen two extremes on this over the years... comment a lot, or do not comment at all. In the teams I work in, I would love to see a more moderate and practical view on the matter.<p>I get the whole "document with your code" idea, and I practice it. But sometimes we end up with variable names that get very long and start leading to not-the-most-aesthetic code that gets formatted in a very vertical way.<p>Don't know the right answer, but I suspect the two extreme views are not correct.
IMO, comments are needed when the code itself can't express something necessary to understand its overall function.<p>Hence I've found that, e.g. for rather complex statements where the code isn't lacking any information, not commenting is the correct approach.<p>Whereas sometimes, when I'm forced to write exceptional code because of e.g. a workaround, a comment can nicely introduce my future self and other readers to the context.
Well documented code is respectful of the maintainer's time. Nobody wants to read a monologue full of noise to understand what is happening.<p>Comments are not monologues. Explain what the line you are commenting does, and perhaps elaborate on why. But always acknowledge: more words = more patience required = more maintenance cost.<p>Also, nobody will care about what your name is in 2 years. Do not make the code about you.
I find that doing a "headerdoc" kind of thing works for me.<p>I did write up a thing about it, but it's a long read, so folks don't usually read it: <a href="https://littlegreenviper.com/miscellany/leaving-a-legacy/" rel="nofollow">https://littlegreenviper.com/miscellany/leaving-a-legacy/</a><p>Has some pretty heavy-duty examples.
I went from my last job where every method and variable declaration had to be commented to my current job where there isn't a specific rule against comments in code but if you put in a PR with comments in it, you're going to get rejected after being laughed at.<p>I thought I would miss it, but I don't miss the documentation. Code doesn't lie.
I was not very impressed by the token bucket example. Firstly, it is totally ridiculous to need to write your own implementation of gcd, and the comment on it is silly and unnecessary.<p>Secondly, the implementation of the token bucket seems to just be bad. A simpler implementation is as follows:<p>- Input: number of tokens and refill time.<p>- Output: the constant state describing the limiter: the max number of tokens and the time to get a single token (= refill time / size, ignoring rounding issues, which shouldn't generally matter if your times are in nanos).<p>- The only mutable state is a single time instant: the time when the limiter becomes full.<p>- To attempt to take tokens, compute max(0, full_time - now) / new_token_time, and if this is less than or equal to the size of the bucket minus the number of tokens you want, you may withdraw and set full_time := max(full_time, now) + tokens_to_withdraw * new_token_time.<p>If that division is concerning, then you should either attempt to withdraw from the rate limiter less frequently or not be using one at all.<p>The rest of the code is about scheduling events to happen when there are free tokens in the ratelimiter. This is a different problem, and also over-complicated. You should either poll and retry periodically with your most important event (rather than whatever event you thought was most important when the bucket became empty), though you then want to be careful about stale tasks never running; or you should just have a queue of pairs (Task, cost) where cost is an int. After withdrawing tokens from the ratelimiter, or when the queue transitions from empty to non-empty, peek the top item and set a timer for rl.full_time - rl.new_token_time * (rl.size - cost), and when the timer fires (or immediately if that time was in the past), dequeue, withdraw cost tokens, and run that top task (then repeat).
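For what it's worth, here is a rough Rust sketch of the simpler limiter described above (my own naming and units; an illustration of the idea under those assumptions, not a drop-in replacement for Firecracker's TokenBucket):<p><pre><code> use std::time::{Duration, Instant};

 // The only mutable state is the instant at which the bucket becomes full again.
 struct SimpleBucket {
     size: u64,                // maximum number of tokens
     new_token_time: Duration, // refill time / size
     full_time: Instant,       // when the bucket is full again
 }

 impl SimpleBucket {
     fn new(size: u64, refill_time: Duration) -> Self {
         assert!(size > 0);
         let new_token_time = refill_time / size as u32;
         assert!(!new_token_time.is_zero(), "refill time too small for this size");
         SimpleBucket { size, new_token_time, full_time: Instant::now() } // start full
     }

     // Try to withdraw `tokens` at `now`; returns true on success.
     fn try_take(&mut self, tokens: u64, now: Instant) -> bool {
         if tokens > self.size {
             return false;
         }
         // Tokens still missing from the bucket at `now`.
         let missing = if self.full_time > now {
             (self.full_time - now).as_nanos() / self.new_token_time.as_nanos()
         } else {
             0
         };
         if missing as u64 <= self.size - tokens {
             // Withdrawing pushes the "full again" instant further out.
             let base = if self.full_time > now { self.full_time } else { now };
             self.full_time = base + self.new_token_time * tokens as u32;
             true
         } else {
             false
         }
     }
 }

 fn main() {
     let mut bucket = SimpleBucket::new(10, Duration::from_millis(100));
     let now = Instant::now();
     assert!(bucket.try_take(10, now));  // bucket starts full
     assert!(!bucket.try_take(1, now));  // and is now empty at the same instant
 }
</code></pre>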
The first has an enormous amount of visual noise and it's definitely over-commented.<p>The second one is OK, but please don't follow the first one as a good example; it takes away the ability to scan the code.
IMHO writing well-documented code requires discipline and a set of good habits.<p>One of those habits is to update the documentation (tests/comments) as the software evolves.<p>Unfortunately it seems to be a very rare habit.
If you must dump a 5-paragraph essay about why something exists, you put it in the git commit message so when you git blame those lines, you see the history of the code.<p>These examples are a fucking nightmare.
i've found that well written code + useful comments + comment with link to design document works pretty well.<p>i got an email about code i'd written 10 years ago...thanking me for taking the time to document everything.