Self-Documenting Code

72 points by tie-in, 7 months ago

30 comments

johnfn, 7 months ago

My cut:

<pre><code>const passwordRules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];

async function createUser(user) {
  const isUserValid = validateUserInput(user);
  const isPasswordValid =
    user.password.length >= 8 &&
    passwordRules.every((rule) => rule.test(user.password));

  if (!isUserValid) {
    throw new Error(ErrorCodes.USER_VALIDATION_FAILED);
  }

  if (!isPasswordValid) {
    throw new Error(ErrorCodes.INVALID_PASSWORD);
  }

  const userExists = await userService.getUserByEmail(user.email);

  if (userExists) {
    throw new Error(ErrorCodes.USER_EXISTS);
  }

  user.password = await hashPassword(user.password);
  return userService.create(user);
}
</code></pre>

1. Don't use a bunch of tiny functions. This makes it harder for future engineers to read the code, because they have to keep jumping around the file(s) to understand the control flow. It's much better to introduce a variable with a clear name.

2. Don't use the `a || throw()` structure. That is not idiomatic JS.

2a. Don't introduce `throwError()`. Again, not idiomatic JS.

3. Use an enum-like object for error codes for clarity.

4. If we must use passwordRules, at least extract it into a global constant. (I don't really like it though; it's a bit too clever. What if you want to enforce a password length minimum? Yes, you could hack a regex for that, but it would be hard to read. Much better would be a list of arrow functions, for instance `(password) => password.length > 8`.)

5. Use TypeScript!
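
The list-of-arrow-functions idea from point 4 might look roughly like the sketch below. The rule names and the `findFailedPasswordRules` helper are invented for illustration, and the length rule uses `>= 8` to match the check in the code above rather than the `> 8` in the inline example.

```javascript
// Hypothetical rule list: each entry pairs a readable name with a predicate.
const passwordChecks = [
  { name: "min length 8", test: (password) => password.length >= 8 },
  { name: "has lowercase", test: (password) => /[a-z]/.test(password) },
  { name: "has uppercase", test: (password) => /[A-Z]/.test(password) },
  { name: "has digit", test: (password) => /[0-9]/.test(password) },
  { name: "has symbol", test: (password) => /\W/.test(password) },
];

// Returns the names of every rule the password fails (empty array = valid).
function findFailedPasswordRules(password) {
  return passwordChecks
    .filter((check) => !check.test(password))
    .map((check) => check.name);
}

console.log(findFailedPasswordRules("Abcdef1!"));
console.log(findFailedPasswordRules("abc"));
```

Returning the failed rule names, rather than a single boolean, is one possible design; it keeps the length check readable and gives the caller something to show the user.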

alilleybrinker, 7 months ago

There is no such thing as universally self-documenting code, because self-documentation relies on an assumption of an audience (what that audience knows, what patterns are comfortable for them) that does not exist in general.

Self-documenting code can work in a single team, particularly a small team with strong norms and shared knowledge. Over time, as that team drifts, the shared knowledge will weaken, and the "self-documenting" code will no longer be self-documenting to the new team members.

simonw, 7 months ago

I don't find this easier to read:

<pre><code>!(await userService.getUserByEmail(user.email)) || throwError(err.userExists);
</code></pre>

I guess if I worked in a codebase that used that pattern consistently I'd get used to it pretty quickly, but if I dropped into a new codebase that I didn't work on often I'd take a little bit longer to figure out what was going on.

Chris_Newton, 7 months ago

If I were reviewing the original code, the first thing I’d question is the line

<pre><code>user.password = await hashPassword(user.password);
</code></pre>

1. As a rule, mutations are harder to understand than giving new names to newly defined values.

2. The mutation here apparently modifies an object passed into the function, which is a side effect that callers might not expect after the function returns.

3. The mutation here apparently changes whether user.password holds a safe hashed password or a dangerous plain text password, which are bad values to risk mixing up later.

4. It’s not immediately obvious why hashing a password should be an asynchronous operation, and there’s nothing here to tell the reader why we need to await its result.

At least three of those problems could trivially be avoided by naming the result hashedPassword and, ideally, using TypeScript to ensure that mixing up plain text and hashed passwords generates a type error at build time.

I do agree with many of the other comments here as well. However, I think the above is more serious, because it actually risks the program behaving incorrectly in various ways. Questions like whether to use guard clauses or extract the password check into its own function are more subjective, as long as the code is written clearly and correctly whichever choices are made.
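
A minimal sketch of the non-mutating version, in plain JavaScript. The `buildUserRecord` name is hypothetical, and `hashPassword` here is a stand-in so the example runs on its own; it is not a real hash and is not the article's implementation.

```javascript
// Stand-in so the sketch is runnable; NOT a real hash. A real implementation
// would call a proper password-hashing library here.
async function hashPassword(plainTextPassword) {
  return `hashed(${plainTextPassword.length} chars)`;
}

// Builds the record to store without touching the caller's object.
async function buildUserRecord(user) {
  const { password: plainTextPassword, ...profile } = user;
  const hashedPassword = await hashPassword(plainTextPassword);
  // The stored record has no `password` field, only `hashedPassword`.
  return { ...profile, hashedPassword };
}

const signupInput = { email: "a@example.com", password: "Abcdef1!" };
buildUserRecord(signupInput).then((record) => {
  console.log(record);
  console.log(signupInput.password); // the caller's object is unchanged
});
```

Keeping the plain text and the hash under different names addresses points 1 to 3 without TypeScript; the build-time guarantee mentioned above would still need a type system.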

cjfd, 7 months ago

The TypeScript looks much, much better than what he ends up with. The TypeScript is more or less the same thing but with the comment tokens removed. How is just removing the comment tokens not an obvious improvement in readability?

Honestly, I think all of jsdoc, pydoc, javadoc, doxygen is stuff that most code should not use. The only code that should use these is code for libraries and for functions that are used by hundreds or thousands of other people. And then we also need to notice that these docs in comments are not sufficient for documentation either. When a function is not used by hundreds or thousands of people, just write a conventional comment, or perhaps don't write a comment at all if the function is quite straightforward. Documentation that explains the big picture is much more important, but that is actually somewhat hard to write compared to sprinkling jsdoc, pydoc, javadoc or doxygen worthless shit all over the place.

joecarrot, 7 months ago

If one of my developers used "||" that way I would definitely throw some side eye

dvt, 7 months ago

The writer here misunderstands how short-circuit evaluation is supposed to be used. The idea is that you should use SCE in a few, pretty standard, cases:

<pre><code>cheapFunction(...) || expensiveFunction(...) // saves us a few cycles
car = car || "bmw" // setting default values, common pattern
funcA(...) && funcB_WhichMightBreakWithoutFuncA(...) // func A implies func B
... // probably a few other cases I don't remember
</code></pre>

Using it to handle control flow (e.g. throwing exceptions, as a makeshift if-then, etc.) is a recipe for disaster.
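
For contrast, here is a small runnable sketch of short-circuit evaluation used to produce a value, next to an explicit `if` used for control flow. The function names (`pickCar`, `firstTag`, `requirePositive`) are invented for the example.

```javascript
// Conventional short-circuit uses: the value of the expression is what matters.
function pickCar(car) {
  return car || "bmw"; // default when `car` is falsy (undefined, "", null, ...)
}

function firstTag(post) {
  return (post.tags && post.tags[0]) || "untagged"; // guard, then default
}

// Control flow written as control flow: the throw is a statement, not a value.
function requirePositive(n) {
  if (!(n > 0)) {
    throw new Error("expected a positive number");
  }
  return n;
}

console.log(pickCar(undefined));
console.log(firstTag({ tags: ["js"] }));
console.log(firstTag({}));
```

In the first two functions the operator's result is returned and used; in the third there is no useful result, which is why an `if` statement reads more naturally there.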

variadix, 7 months ago

Types are the best form of documentation because they can be used to automatically check for user error, are integral to the code itself, and can provide inline documentation. The more I program in dynamically typed (or even weakly statically typed) languages the more I come to this conclusion.

0xbadcafebee, 7 months ago

"Self-documenting code" is already a thing called Code-as-Docs. It's the inverse of Docs-as-Code, where you're "writing documentation like you write code". Code-as-Docs is where you write code that is self-documenting. (And this has absolutely nothing to do with Literate Programming.)

You do not have to adhere to any specific principles or methods or anything specific in order to do Code-as-Docs. Just write your code in a way that explains what it is doing, so that you don't need comments to understand it.

This often means refactoring your code to make it clearer what it does. It may not be what your ideal engineer brain wants the code to do, but it will make much more sense to anyone maintaining it. Plus very simple things like "variables-that-actually-describe-what-they-do" (in a loop over node names, don't make a variable called x; make a variable called node_name).

Edit: It seems like I'm the only one who says "Code-as-docs"... by searching for "Code-as-documentation" instead of "Code-as-docs", I found this: https://martinfowler.com/bliki/CodeAsDocumentation.html

I guess "self-documenting code" gets more hits:

https://www.google.com/search?q=self-documenting+code
https://en.wikipedia.org/wiki/Self-documenting_code
https://wiki.c2.com/?SelfDocumentingCode

amonith, 7 months ago

After 10 years as a commercial dev I've noticed I don't really care about things like this. Not sure if it ever made a difference. The "local code" (as in anything within a function or often a single class; 1-2k LoC is not really a problem) is trivial to read in most languages. The most difficult thing to understand was always the domain or the infrastructure/library quirks, stuff that's never properly documented. (Hot take: it might not be worth documenting anyway, as it takes longer to write and update such docs than to struggle with the code for a little bit.)

Naming or visual code structure was never a problem in my career so far.

tln, 7 months ago

I find the comment at the end interesting:

`// Creates a user and returns the newly created user's id on success`

Hmm, it returns an id? But the @returns is Promise<any>? The code as written will change when userService.create changes... without the actual, human-readable bit of prose, that potential code issue could be easily overlooked.

Of course, here the code could have a newtype for UserId and return Promise<UserId>, making the code better, and then the prose is basically not needed (but please just write a docstring).

FWIW I would document that the `user` parameter is modified. And document the potential race condition between checking the existence of a user and creating a user, and maybe why it was chosen to be done in this order (kinda flimsy in this example). Which would probably lead me to designing around these issues.

Trying to only document via self-documenting code seems to always omit nuances.

<pre><code>/** Create a user and return the id, or throw an error with an appropriate code.
 *
 * user.password may be changed after this function is called.
 */
async function createUser(user: User): Promise<number> {
  if (!validateUserInput(user)) {
    throw new Error(err.userValidationFailed);
  }
  if (isPasswordValid(user.password)) {
    // Check now if the user exists, so we can throw an error before hashing the password.
    // Note: if a user is created in the short time between this check and the actual creation,
    // there could be an unfriendly error
    const userExists = !!(await userService.getUserByEmail(user.email));
    if (userExists) {
      throw new Error(err.userExists);
    }
  } else {
    throw new Error(err.invalidPassword);
  }
  user.password = await hashPassword(user.password);
  return userService.create(user);
}</code></pre>
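
One way to "design around" the check-then-create race, sketched under loud assumptions: the store itself rejects duplicate emails (as a database unique constraint would), and the caller translates that failure. `fakeUserService`, `createUserOnce`, and the `"DUPLICATE_EMAIL"` code are all made up for this example; nothing here is the article's `userService`.

```javascript
// In-memory stand-in for a user store with a unique constraint on email.
// The "DUPLICATE_EMAIL" error code is hypothetical.
const fakeUserService = {
  usersByEmail: new Map(),
  nextId: 1,
  async create(user) {
    if (this.usersByEmail.has(user.email)) {
      const error = new Error("duplicate email");
      error.code = "DUPLICATE_EMAIL";
      throw error;
    }
    const id = this.nextId++;
    this.usersByEmail.set(user.email, { ...user, id });
    return id;
  },
};

// No separate "does the user exist?" query: the create call is the check,
// so there is no gap between checking and creating.
async function createUserOnce(user) {
  try {
    return await fakeUserService.create(user);
  } catch (error) {
    if (error.code === "DUPLICATE_EMAIL") {
      throw new Error("userExists");
    }
    throw error; // anything else is not ours to translate
  }
}
```

The trade-off is the one the comment in the code above hints at: without the early existence check, a duplicate signup is only detected after the password has already been hashed.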

gnarlouse, 7 months ago

Having a function throwError makes me squirm.

`isValid() || throwError()` is an abuse of abstraction

jnsie, 7 months ago

I lived in the C# world for a while and our style guides mandated that we use those JSDoc-style comments for every function definition. I loathed them. They invariably became a more verbose and completely redundant version of the function definition. Developers even used a tool (GhostDoc, IIRC) to generate these comments so that CreateNewUser() became // Create New User. Nobody ever read them, few ever updated them, and they reinforced my hunch that a small percentage of comments are useful (in which case, by all means, use comments!)

gtirloni, 7 months ago

Is writing a few comments here and there explaining why things are done in a certain way so terrible that we have to create this thing?

mannyv, 7 months ago

Code only tells you 'what,' not 'why.' And 'why' is usually what matters.

mmastrac, 7 months ago

I've been developing for a very long time and I'm neither on the side of "lots of comments" nor "all code should speak for itself".

My philosophy is that comments should be used for two things: 1) to explain code that is not obvious at first glance, and 2) to explain the rationale or humanitarian reasons behind a bit of code that is understandable, but the reasons for its existence are unclear.

No philosophy is perfect, but I find that it strikes a good balance between maintainability of comment and code pairing and me being able to understand what a file does when I come back to it a year later.

The article is not good IMO. They have a perfect example of a function that could actually make use of further comments, or a refactoring to make this more self-documenting:

<pre><code>function isPasswordValid(password) {
  const rules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];
  return password.length >= 8 && rules.every((rule) => rule.test(password));
}
</code></pre>

Uncommented regular expressions are a code smell. While these are simple, the code could be more empathetic to the reader by adding at least a basic comment:

<pre><code>function isPasswordValid(password) {
  // At least one lowercase, one uppercase, one number and one symbol
  const rules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];
  return password.length >= 8 && rules.every((rule) => rule.test(password));
}
</code></pre>

Which would then identify the potentially problematic use of \W (i.e. "[^a-zA-Z0-9_]"). And even though I've been writing regular expressions for 20+ years, I still stumble a bit on character classes. I'm likely not the only one.

Now you can actually make this function self-documenting and a bit more maintainable with a tiny bit more work:

<pre><code>// Returns either "true" or a string with the failing rule name.
// This return value is kind of awkward.
function isPasswordValid(password) {
  // Follow the password guidelines by WebSecuritySpec 2021
  const rules = [
    [MIN_LENGTH, /.{8,}/],
    [AT_LEAST_ONE_LOWERCASE, /[a-z]{1,}/],
    [AT_LEAST_ONE_UPPERCASE, /[A-Z]{1,}/],
    [AT_LEAST_ONE_NUMBER, /[0-9]{1,}/],
    // This will also allow spaces or other weird characters but we decided
    // that's an OK tradeoff.
    [AT_LEAST_ONE_SYMBOL, /\W{1,}/],
  ];

  for (const [ruleName, regex] of rules) {
    if (!regex.test(password)) {
      return ruleName;
    }
  }

  return true;
}
</code></pre>

You'd probably want to improve the return types of this function if you were actually using it in production, but this function at least now has a clear mapping of "unclear code" to "english description" and notes for any bits that are possibly not clear, or are justifications for why this code might technically have some warts.

I'm not saying I'd write this code like this; there are a lot of other ways to write it as well, with many just as good or better with different tradeoffs.

There are lots of ways to make code more readable, and it's more art than science. Types are a massive improvement and JSDoc is so understandably awkward to use.

Your goal when writing code shouldn't be to solve it in the cleverest way, but rather the clearest way. In some cases, a clever solution with a comment can be the clearest. In other cases, it's better to be verbose so that you or someone else can revisit the code in a year and make changes to it. Having the correct number of comments, so that they add clarity to code without there being so many that they become easily outdated or redundant, is part of this as well.
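
One possible way to "improve the return types", as a hedged sketch: always return an object of the same shape instead of either `true` or a rule name. The `checkPassword` name and the `{ valid, failedRule }` shape are assumptions, and the rule names are plain strings here because the values of constants like `MIN_LENGTH` are not shown above.

```javascript
// Rule names are plain strings in this sketch; the code above uses constants
// (MIN_LENGTH, ...) whose definitions are not shown.
const namedPasswordRules = [
  ["MIN_LENGTH", /.{8,}/],
  ["AT_LEAST_ONE_LOWERCASE", /[a-z]/],
  ["AT_LEAST_ONE_UPPERCASE", /[A-Z]/],
  ["AT_LEAST_ONE_NUMBER", /[0-9]/],
  // \W matches anything outside [A-Za-z0-9_], so spaces count as symbols too.
  ["AT_LEAST_ONE_SYMBOL", /\W/],
];

// Always returns the same shape, so callers never compare against `true`.
function checkPassword(password) {
  for (const [ruleName, regex] of namedPasswordRules) {
    if (!regex.test(password)) {
      return { valid: false, failedRule: ruleName };
    }
  }
  return { valid: true, failedRule: null };
}

console.log(checkPassword("Abcdef1!"));
console.log(checkPassword("abcdefgh"));
```

With a fixed shape, a caller can write `if (!result.valid)` rather than `if (result !== true)`, which avoids the awkwardness the code comment above points out.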

Mikhail_Edoshin, 7 months ago

Names do have some importance. If you pick random words and assign them to things you deal with, you will find yourself unable to reason about them. Try it, it is interesting. Yet names are not the pinnacle of design. Far from it.

Look at a mechanical watch. (For example, here: https://ciechanow.ski/mechanical-watch/). Those little details, can you come up with self-documenting names for them? I do not think so. In programming, good design is very much like that watch: it has lots of strange-looking things that are of that shape because it fits their purpose [1]. There is no way to give them some presumably short labels that explain that purpose out of context. Yet we need to point to them as we talk about them [2]. The role of names in programming is thus much more modest. In order of importance:

- They must be distinct within the context (of course).
- Yet their form must indicate the similarities between them: alen and blen are of the same kind and are distinct from abuf and bbuf, which are also of the same kind.
- They must be pronounceable and reasonably short. Ideally they should be of the same length.
- They need to have some resemblance to the thing they represent.
- It would be nice to make them consistent across different contexts. Yet this is an incredibly tedious task of exponential complexity.

There is also the overall notation. Ideally it should resemble written reasoning that follows some formal structure. None of the existing notations is like that. The expressive tools in these notations are not meant to specify reasoning: they are meant to specify the work of a real or virtual machine of some kind. The fallacy of self-documenting code is an unrecognized desire to somehow reason with the knobs of that machine. It will not work this way. Yet a two-step process would work just fine: first you reason, then you implement this on the machine. But it will not look self-documenting, of course.

P.S. This is a major problem in programming: we keep the code, but do not keep the reasoning that led to it.

[1] fitness for the purpose, Christopher Alexander, “The timeless way of building”.
[2] notion vs definition, Evald Ilyenkov.

virgilp, 7 months ago

missed the opportunity to create named constants for each of the password validation rules.

sesteel, 7 months ago

Looking at this thread, it is a wonder that any PRs make it through review. I started calling these kinds of debates Holographic Problems.

- Spaces vs Tabs
- Self-documenting code vs documented code
- Error Codes vs Exceptions
- Monolithic vs Microservices Architectures
- etc.

Context matters and your context should probably drive your decisions, not your personal ideology. In other words, be the real kind of agile; stay flexible and change what needs to be changed as newly found information dictates.

Aeolun, 7 months ago

> Short-circuit evaluation allows us to simplify conditional statements by using logical operators.

Simple is not the same thing as understandable.

They lost me entirely here.

Izkata, 7 months ago

I'd like to propose a weird alternative to this:

<pre><code>function throwError(error) {
  throw new Error(error);
}

async function createUser(user) {
  validateUserInput(user) || throwError(err.userValidationFailed);
  isPasswordValid(user.password) || throwError(err.invalidPassword);
  !(await userService.getUserByEmail(user.email)) || throwError(err.userExists);
</code></pre>

What if...

<pre><code>const checks = [
  [() => validateUserInput(user), err.userValidationFailed],
  [() => isPasswordValid(user.password), err.invalidPassword],
  // async, because `await` is only valid inside an async function
  [async () => !(await userService.getUserByEmail(user.email)), err.userExists],
];
// for...of rather than forEach, so each check is awaited in order
for (const [is_good, error] of checks) {
  if (!(await is_good())) {
    throw new Error(error);
  }
}
</code></pre>

Also on the regex:

<pre><code>const rules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];
</code></pre>

No one caught that in all four of these, "{1,}" could be replaced with the much more common "+". A bit odd considering the desire for brevity. I do personally prefer "[0-9]" over "\d", especially considering the other rules, but can go either way on "\W".

I might have also added a fifth regex for length though, instead of doing it differently, if my head was in that mode: /.{8,}/
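
A quick sketch to check both regex remarks. The helper names (`pairsAgreeOn`, `minLengthByRegex`, `minLengthByProperty`) are invented for the example; the second half shows one case where a `/.{8,}/` length rule and `password.length >= 8` disagree, because `.` does not match line terminators unless the `s` flag is set.

```javascript
// "{1,}" and "+" both mean "one or more", so each pair accepts the same strings.
const quantifierPairs = [
  [/[a-z]{1,}/, /[a-z]+/],
  [/[A-Z]{1,}/, /[A-Z]+/],
  [/[0-9]{1,}/, /[0-9]+/],
  [/\W{1,}/, /\W+/],
];

function pairsAgreeOn(sample) {
  return quantifierPairs.every(
    ([verbose, concise]) => verbose.test(sample) === concise.test(sample)
  );
}

// A length rule as a regex, next to the plain length check it would replace.
const minLengthByRegex = (password) => /.{8,}/.test(password);
const minLengthByProperty = (password) => password.length >= 8;

console.log(pairsAgreeOn("Abcdef1!"));
// Nine characters, but no run of eight without a newline:
console.log(minLengthByRegex("abcd\nefgh"), minLengthByProperty("abcd\nefgh"));
```

Whether a password containing a newline matters in practice depends on the input handling, so this is an edge case to be aware of rather than an argument against the regex.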

jansommer, 7 months ago

> The first change I would make is to use named constants instead of cryptic error codes.

But he keeps the cryptic error codes that will go into the logs, or into the frontend, where the developer will have to look up the error code. Don't map an error name to u105, just return the actual string: "userValidationFailed".
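
A minimal sketch of that suggestion: an enum-like object whose values are the readable names themselves. `SignupErrors` and `logLineFor` are hypothetical names, not taken from the article.

```javascript
// Enum-like object whose values are the readable names, so the string that
// reaches a log line needs no lookup table.
const SignupErrors = Object.freeze({
  userValidationFailed: "userValidationFailed",
  invalidPassword: "invalidPassword",
  userExists: "userExists",
});

// Hypothetical log formatter, just to show what ends up in the log.
function logLineFor(error) {
  return `signup failed: ${error.message}`;
}

console.log(logLineFor(new Error(SignupErrors.userExists)));
```

The code still gets a single place where the allowed error names are listed, while the logs carry the name directly instead of an opaque code.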

G_o_D, 7 months ago

reportgunner, 7 months ago

I don't like this article, author just added some abstraction and moved the stuff that matters out of the perspective and we just have to imagine that whatever is outside the perspective is perfect.

tehologist, 7 months ago

All source code is self-documenting; source code is for developers to read. Computer languages are a human-readable version of what the compiler executes, and they exist for the developer's benefit, not the compiler's. As a developer, I read way more software than I write, and if the source is hard to understand then I feel you failed as a developer. Writing is a skill, no less important in software than anywhere else. Properly naming functions and variables and writing easy-to-follow flow control is a skill that takes years to learn. All developers should keep a thesaurus and a dictionary nearby. If you find yourself writing a lot of comments trying to explain what you are doing in your code, then you probably should refactor.

gspencley, 7 months ago

I agree with most of the article but want to nitpick this last part:

> I’m not a fan of TypeScript, but I appreciate its ability to perform static type checks. Fortunately, there’s a way to add static type checking to JavaScript using only JSDoc comments.

If you're writing JSDoc comments, then you're not writing what the author considers to be "self-documenting code."

I wish the author had explained why they are not a fan of TypeScript. Compile-time type safety aside, as the author acknowledges by implication, adding type specificity negates the usefulness of JSDoc comments for this particular situation.

I'm personally a big proponent of "self documenting code" but I usually word it as "code that serves as its own documentation because it reads clearly."

Beyond "I would personally use TypeScript to solve that problem", my case for why ALL comments are a code smell (including JSDoc comments, and in my personal opinion) is:

- They are part of your code, and so they need to be maintained just like the rest of your code.
- But ... they are "psychologically invisible" to the majority of developers. Our IDEs tend to gray them out by default etc. No one reads them.
- Therefore, comments can become out of sync with the code quite easily.
- Comments are often used to explain what confusing code does. Which means that instead of fixing the code to add clarity, they do nothing but shine a spotlight on the fact that the code is confusing.
- In doing the above, they make messy code even messier.

I am slightly amenable to the idea that a good comment is one that explains WHY weird code is weird. Even then, if you have the luxury of writing greenfield code, and you still need to do something unintuitive or weird for really good reasons ... you can still write code that explains the "why" through good naming and separation of concerns.

The only time that I would concede that a code comment was the best way to go about things in context is when you're working with a very large, legacy commercial codebase that is plagued by existing tech debt and you have no good options other than to do your weird thing inline and explain why, for logistical and business reasons. Maybe the refactor would be way too risky and the system is not under test, the business has its objectives, and there's just no way that you can reasonably refactor in time, etc. This happens... but professional developers should ideally treat incremental refactoring as a routine part of the development lifecycle so that this situation is as unlikely as possible to arise in the future.

subversive-dev, 7 months ago

The picture at the top of the article is of a German public library. It has a beautiful, clean architecture. Somehow I don't see how it relates to self-documenting code.

dgeiser13, 7 months ago

If future "AI" can write code then it should be able to read code and describe what it does at various levels of detail.

kitd, 7 months ago

I anticipate the day when Gen AI gives us self-coding documentation.

omgJustTest, 7 months ago

What if the documentation were code? ...