Marcan's attitude is great; I know of a ton of people (myself included) who would've written that article with far more complaining interleaved. Super informative as well; I learned a ton from this article (GRUB 2 feature for marking off bad RAM? Wow!). Very well written, informative, humorous, etc. Love it.
That is such a well-written post, and it showcases some serious diagnostic skills by an experienced person.<p>Btw, thanks for telling me about the badram GRUB 2 feature, I had no idea that existed.
Man, I loved the approach of narrowing down the offending object file using a regex on the SHA hash of the binaries. That would've saved me lots of time hunting and guessing at bugs with cscope+kdb back in my kernel hacker days!
Wow, these are some serious debugging skills. I also admire the tenacity and the will to investigate the root cause.<p>> I tried setting GOMAXPROCS=1, which tells Go to only use a single OS-level thread to run Go code. This also stopped the crashes, again pointing strongly to a concurrency issue.<p>I think I would have stopped there.
Hah. I remember Bryan Cantrill complaining about this exact thing. Glad that it's fixed.<p>Turns out somebody else did, too:
<a href="https://twitter.com/bcantrill/status/774290166164754433?lang=en" rel="nofollow">https://twitter.com/bcantrill/status/774290166164754433?lang...</a><p>/edit: spelling
The investigation in the linked Go issue [1] is also impressive.<p>[1] <a href="https://github.com/golang/go/issues/20427" rel="nofollow">https://github.com/golang/go/issues/20427</a>
This is a great story; it probably wins the year for the "best bug you've ever encountered?" question.
Having implemented some weird runtimes for weird languages, I am sympathetic to the Go team here -- these odd tradeoffs of pushing the envelope on OS <-> your_own_compiler interactions can trigger some wild experiences.
For a similar tale of vDSO getting someone in trouble, check out this fun talk "Really crazy container troubleshooting stories": <a href="https://media.ccc.de/v/ASG2017-115-really_crazy_container_troubleshooting_stories" rel="nofollow">https://media.ccc.de/v/ASG2017-115-really_crazy_container_tr...</a>
Ninja-level debugging and diagnostic skills. A fascinating read from start to finish. Bonus points for the GRUB 2 feature for masking out bad RAM blocks – still dreaming of owning a laptop with ECC memory :/
Setting up for a 104-byte stack seems pretty crazy. Wouldn't you risk overrunning the red zone even without all that stack probing? <a href="https://en.wikipedia.org/wiki/Red_zone_(computing)" rel="nofollow">https://en.wikipedia.org/wiki/Red_zone_(computing)</a>