People seem to fixate on stack traces, because other languages present nearly every error in stack trace form. I think you should think about why you want them and make sure you have a good reason before mindlessly adding them. I do collect stack traces in some Go code, because Sentry requires it for categorization, but in general, you can do a much better job yourself, with very little sorcery involved.<p>A common problem is that when multiple producers produce failing work items and send them to a consumer -- a stack trace will just show "panic -> consumer.doWork() -> created by consumer.startWork()". Gee, thanks. You need to track the source, so that you have an actionable error message. If the consumer is broken, fine, you maybe have enough information. If a producer is producing invalid work items, you won't have enough information to find which one. You'll want that.<p>The idea of an error object is for the code to make a decision about how to handle that error, and if it fails, escalate it to a human for analysis. The application should be able to distinguish between classes of failures, and the human should be able to understand the state of the program that caused the failure, so they can immediately begin fixing the failure. It's up to you to capture that state, and make sure that you consistently capture the state.<p>Rather than leaving it to chance, I have an opinionated procedure:<p>1) Every error should be wrapped. This is where all the context for the operator of your software comes from, and you have to do it every time to capture the state of the application at the time of the error.<p>2) The error need not say "error" or "failure" or "problem". It's an error, you know it failed. As an example, prefer "upgrade foos: %w" over "problem upgrading foos: %w". (The reason is that in a long chain, if everyone does this, it's just redundant: "problem frobbing baz: problem fooing bars: problem quuxing glork: i/o timeout". Compare that to "frob baz: foo bars: quux glork: i/o timeout".)<p>But if you're logging an error, I pretty much always put some sort of error-sounding words in there. Makes it clear to operators that may not be as zen about failures as you that this is the line that identifies something not working. "2021-08-23T20:45:00.123 PANIC problem connecting to database postgres://1.2.3.4/: no route to host". I'm open to an argument that if you're logging at level >= WARNING that the reader knows it's a problem, though. (I also tend to phrase them as "problem x-ing y" instead of "error x-ing y" or "x-ing y failed". Not going to prescribe that to others though, use the wording that you like, or that you think causes the right level of panic.)<p>3) Error wrapping shouldn't duplicate any information that the caller already has. The caller knows the arguments passed to the function, and the name of the function. If its error wrapping needs those things to produce an actionable error message, it will add them. It doesn't know what sub-function failed, and it doesn't know what internally-generated state there is, and those are going to be the interesting parts for the person debugging the problem. So if you're incrementing a counter, you might do a transaction, inside of which is a read and a write -- return "commit txn: %w", "rollback txn: %w", "read: %w", "write: %w", etc. The caller can't know which part failed, or that you decided to commit vs. roll back, but it does know the record ID, that the function is "update view count", etc.<p>The standard library violates this rule, probably because people "return err" instead of wrapping the error, and this gives them a shred of hope. And, it's my made-up rule, not the Go team's actual rule! Investigate those cases and don't add redundant information. (For example, os.ReadFile will have the filename in the error message, because it returns an fs.PathError, which contains that. net.Dial is another culprit. Make a list of these and break the rule in these cases.)<p>4) Any error that the program is going to handle programmatically should have a sentinel (`var ErrFoo = errors.New("foo")`), so that you can unambiguously handle the error correctly. (People seem to handle io.EOF quite well; emulate that.)<p>You can describe special cases of your error by wrapping it before returning it, `fmt.Errorf("bar the quux: %w", ErrFoo)`.<p>Finally, since I talked about logging, please talk about things that DID work in your logs. Your logs are the primary user interface for operators of your software, but often the most neglected interface point. When you see something like "problem connecting to foo\nproblem connecting to foo", you're going to think there's a problem connecting to foo. But if you write "problem connecting to foo (attempt 1/3)\nproblem connecting to foo (attempt 2/3)\nconnected to foo", then the operator knows not to investigate that. It worked, and the program expected it to take 3 attempts. Perfect. (Generally, for any long-running operation a log "starting XXX" and "finished XXX" are great. That way, you can start looking for missing "finished" messages, rather than relying on self-reported errors.)<p>(And, outside of HN comments, I would make that a structured log, so that someone can easily select(.attempt == .max_attempts) and things like that. It's ugly if you just read the output, but great if you have a tool to pretty-print the logs: <a href="https://github.com/jrockway/json-logs/releases/tag/v0.0.3" rel="nofollow">https://github.com/jrockway/json-logs/releases/tag/v0.0.3</a>)<p>Anyway, I guess where this rant goes is -- errors are not an afterthought. They're as much a part of your UI as all the buttons and widgets. They will happen to all software. They will happen to yours. Some poor sap who is not you will be responsible for fixing it. Give them everything you need, and you'll be rewarded with a pull request that makes your program slightly more reliable. Give them "unexpected error: success", and you'll have an interesting bug report to context-switch to over the course of the next month, killing anything cool you wanted to make while you track that down.