My Go executable files are still getting larger

264 points by alexbilbie, about 4 years ago

22 comments

rsc, about 4 years ago
This article is full of misinformation. Just a few representative things:

- The expansion of pclntab in Go 1.2 dramatically improved startup time and reduced memory footprint, by letting the OS demand-page this critical table that is used any time a stack must be walked (in particular, during garbage collection). See https://golang.org/s/go12symtab for details.

- We (the Go team) did not “recompress” pclntab in Go 1.15. We did not remove pclntab in Go 1.16. Nor do we have plans to do either. Consequently, we never claimed “pclntab has been reduced to zero”, which is presented in the article as if a direct quote.

- If the 73% of the binary diagnosed as “not useful” were really not useful, a reasonable demonstration would be to delete it from the binary and see the binary still run. It clearly would not.

- The big table seems to claim that a 40 MB Go 1.8 binary has grown to a 289 MB Go 1.16 binary. That’s certainly not the case. More is changing from line to line in that table than the Go version.

Overall, the claim of “dark bytes” or “non-useful bytes” strikes me as similar to the claims of “junk DNA”. They’re not dark or non-useful. It turns out that having the necessary metadata for garbage collection and reflection in a statically-compiled language takes up a significant amount of space, which we’ve worked over time at reducing. But the dynamic possibilities in reflection and interface assertions mean that fewer bytes can be dropped than you’d hope. We track binary size work in https://golang.org/issue/6853.

An unfortunate article.
bradfitz, about 4 years ago
To promote my own tool, https://github.com/bradfitz/shotizam lets you drill down into why Go binaries are large without having to make up terms like "dark bytes".
haberman, about 4 years ago
> The sum of the sizes reported by go tool nm does not add up to the final size of the Go executable.

> At this time, I do not have a satisfying explanation for this “dark” file usage.

The author's journey of starting with "nm --size", discovering "dark" bytes, and wanting to attribute them properly, is *exactly* what led me to create and invest so much effort into Bloaty McBloatface: https://github.com/google/bloaty

Bloaty's core principle is that every byte of the file should be attributed to something, so that the sum of the parts always adds up to the total file size. If we can't get detailed symbol, etc. information for a given region of the file, we can at least fall back to describing what section the bytes were in.

Attributing all of the bytes requires parsing much more than just the symbol table. Bloaty parses many different sections of the binary, including unwind information, relocation information, debug info, and the data section itself in an attempt to attribute every part of the binary to the function/data that emitted it. It will even disassemble the binary looking for references to anonymous data (some data won't make it into the symbol table, especially things like string literals).

I wrote up some details of how Bloaty works here: https://github.com/google/bloaty/blob/master/doc/how-bloaty-works.md. The section on the "Symbols" data source is particularly relevant here:

> I excerpted two symbols from the report. Between these two symbols, Bloaty has found seven distinct kinds of data that contributed to these two symbols. If you wrote a tool that naively just parsed the symbol table, you would only find the first of these seven:

The author's contention that these "dark" bytes are "non-useful" is not quite fair. There are plenty of things a binary contains that are useful even though they are not literally executable code. For example, making a binary position-independent (which is good for security) requires emitting relocations into the binary so that globals with pointer values can be relocated at program load time, once the base address of the binary is chosen. I don't know if Go does this or not, but it's just one example.

On the other hand, I do agree that the ability to produce slim binaries is an important and often undervalued property of modern compiler toolchains. All else being equal, I much prefer a toolchain that can make the smallest binaries.

Bloaty should work reasonably well for Go binaries, though I have gotten some bug reports about things Bloaty is not yet handling properly for Go: https://github.com/google/bloaty/issues/204 Bloaty is just a side thing for me, so I often don't get as much time as I'd like to fix bugs like this.
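
The "every byte is attributed to something" principle described in this comment can be illustrated with a small sketch. This is not Bloaty's code or algorithm: the function name `attribute_bytes`, the `(start, end, label)` region format, and the single `[unattributed]` fallback label are assumptions made for illustration (Bloaty, per the comment, falls back to the containing section's name instead).

```python
def attribute_bytes(file_size, regions):
    """Attribute every byte of a file to a label.

    regions: iterable of (start, end, label) with end exclusive; the
    regions must not overlap. Bytes covered by no region are counted
    under "[unattributed]" so the totals always sum to file_size.
    """
    totals = {}
    cursor = 0  # first byte not yet accounted for
    for start, end, label in sorted(regions):
        if start < cursor or end < start or end > file_size:
            raise ValueError("regions must be ordered, disjoint and inside the file")
        if start > cursor:
            # Gap between known regions: keep it visible instead of dropping it.
            totals["[unattributed]"] = totals.get("[unattributed]", 0) + (start - cursor)
        totals[label] = totals.get(label, 0) + (end - start)
        cursor = end
    if cursor < file_size:
        totals["[unattributed]"] = totals.get("[unattributed]", 0) + (file_size - cursor)
    return totals


# Hypothetical 100-byte file with three known regions.
report = attribute_bytes(
    100,
    [(10, 30, "main.main"), (30, 50, ".gopclntab"), (80, 90, "main.main")],
)
for label, size in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"{size:4d}  {label}")
```

A tool that only sums symbol sizes corresponds to dropping the `[unattributed]` entries, which is how a gap between "sum of symbols" and "file size" can appear.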
parhamn, about 4 years ago
> In other words, the Go team decided to make executable files larger to save up on initialization time.

I mean... I'm genuinely curious if this is a "we have extra engineering resources and can explore/complain about this" or "we have a client who is running cockroachdb and can't handle a 172mb binary install for a database server".

Is there really someone out there who installs Cockroach (a global distributed auto-sharded database) and thinks twice about 172mb of disk space?

Sure, it'd be nice to have smaller binaries but outside of some embedded applications Go's binary sizes are well within the nothing-burger range for most compute systems.
arp242, about 4 years ago
> Moreover, consider that these executable files fly around as container images, and/or are copied between VMs in the cloud, thousands of times per day! Every time, 70% of a couple hundred megabytes are copied around for no good reason and someone needs to pay ingress/egress networking costs for these file copies. That is quite some money being burned for no good reason!

Does the author think the Go authors are stupid blubbering idiots who somehow missed this huge elephant-sized low-hanging fruit? Binary sizes have been a point of attention for years, and somehow missing 70% wasted space would be staggeringly incompetent.

Reminds me of the time in high school when one of my classmates ended up with a 17A doorbell in some calculations. I think he used the wrong formula or swapped some numbers. The teacher, quite rightfully, berated him for not actually looking at the result of his calculation and judging if it's roughly in the right ballpark, as 17A is a ludicrous amount of current for a doorbell. Anyone can see that's just wildly wrong.

If this story had ended up with 0.7%, sure, I can believe that. 7%? Unlikely and I'd be skeptical, but still possible I suppose. *70%*? Yeah nah, that's just as silly as a 17A doorbell.

This huge 70% number should have been a clue to the author themselves too that they've missed something.
stabbles, about 4 years ago
> Every time, 70% of a couple hundred megabytes are copied around for no good reason and someone needs to pay ingress/egress networking costs for these file copies. That is quite some money being burned for no good reason!

Meanwhile half of the world is pushing images by nvidia, intel and amd around for their machine learning software:

Intel OneAPI runtime libraries: 4.74GB (or 18.4GB for compilers)
CUDA runtime libraries: 1.92GB (or 4.2GB for compilers)

These Go binaries are still relatively small.
jeffbee, about 4 years ago
After looking into the size of the CockroachDB binary, the magnitude of the plank in the author's eye becomes clear. This iceberg is ridiculously bloated. Much of the space is coming from the static data of geographic projections that, I assume, basically nobody needs. This includes a single init function that is 1.4MB of machine code from 6MB of auto-generated source code. Then there's the entire AWS SDK, with a static definition of every service AWS offers, by name, and in what regions, by name. Never mind the Azure SDK. There are three implementations of protocol buffers in here: gogo in Go and Google's in both Go and C++. There are at least four SQL parsers in here, including vitess's and another one for crdb in Go.

Last but by no means least there are in total 13MB of autogenerated functions of the colexec package, each over 100KB long, which share virtually all of their code. These are an obscene waste of code space and undoubtedly de-duplicating this code would not just reduce code size but also speed up the program, due to icache thrashing.
londons_explore, about 4 years ago
It's time for debug info like this to be sent to "onlinesymbolserver.com", encrypted with a hash of the binary.

Then, whenever a debugger connects to a binary, it can simply download the symbols as required.

And for the 99.9% who don't need debug info, it isn't needlessly shipped.

Microsoft invented this in the 90's...
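
The lookup half of this idea can be sketched in a few lines: derive a key from the binary's content so a debugger can fetch matching symbols on demand. Everything concrete here is hypothetical: the `/symbols/<digest>` path layout, the choice of SHA-256, and the function name are illustrative assumptions, and the host name is the placeholder from the comment. The encryption step the comment mentions is left out, and this is not a description of how Microsoft's symbol servers key their files.

```python
import hashlib


def symbol_lookup_url(binary: bytes, server: str = "https://onlinesymbolserver.com") -> str:
    """Build a content-addressed URL for a binary's debug symbols.

    Hypothetical scheme: the SHA-256 of the shipped binary is the key,
    so the symbols never need to travel with the binary itself.
    """
    digest = hashlib.sha256(binary).hexdigest()
    return f"{server}/symbols/{digest}"


# Two different builds map to two different symbol files.
print(symbol_lookup_url(b"build-1 contents"))
print(symbol_lookup_url(b"build-2 contents"))
```

Keying on a content hash means a stripped binary and its separately stored symbols can always be matched without extra bookkeeping.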
EdiX, about 4 years ago
From the same data scientist that concluded non-linear growth from two single data points...
njuw, about 4 years ago
> starting in Go 1.16, the pclntab is not present any more, and instead is re-computed from other data in the executable file.

Does anyone have a source for this? As it still appears to be there:

- Go 1.15 https://i.imgur.com/3YlZGOk.png

- Go 1.16 https://i.imgur.com/gGYsj32.png
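
One way to check this kind of claim without screenshots is to list the section sizes of the executable directly. The sketch below reads the section header table of a little-endian ELF64 file (the usual format on Linux/amd64) using only the standard library. It assumes the table sits where the ELF header says it does and does no further validation. On ELF builds of Go programs the table is normally stored in a section named `.gopclntab`; other object formats use different names.

```python
import struct

ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian
SECTION_HEADER = struct.Struct("<IIQQQQIIQQ")    # ELF64 section header entry


def elf64_section_sizes(data: bytes) -> dict:
    """Return {section name: size in bytes} for a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    fields = ELF_HEADER.unpack_from(data, 0)
    # e_shoff, e_shentsize, e_shnum, e_shstrndx
    shoff, shentsize, shnum, shstrndx = fields[6], fields[11], fields[12], fields[13]
    headers = [
        SECTION_HEADER.unpack_from(data, shoff + i * shentsize) for i in range(shnum)
    ]
    names_offset = headers[shstrndx][4]  # sh_offset of the section-name string table
    sizes = {}
    for name_index, _type, _flags, _addr, _offset, size, *_rest in headers:
        start = names_offset + name_index
        end = data.index(b"\0", start)  # names are NUL-terminated
        sizes[data[start:end].decode()] = size
    return sizes
```

Calling `elf64_section_sizes(open(path, "rb").read()).get(".gopclntab")` on binaries built with two Go versions (with `path` pointing at each build) would show whether the section is present in both and how its size changed.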
kissgyorgy, about 4 years ago
Python is still a more widely used / popular language, but I have never seen a Python container for a real project that was less than 1GB.
orangechairs, about 4 years ago
Hey all -- Cockroach Labs blog editor here. Based in part on the feedback we received from this community, we have retracted the post. The blog post link above will take you to the retraction, where we share what we've learned from this experience.
jeffbee, about 4 years ago
This article should be renamed “What is Chesterton’s fence?”, and the author should have realized their mistake right after typing “at this time I don’t know what it is used for”.
marcus_holmes, about 4 years ago
I always wonder if this is the flip side of the fast compilation?

It would be nice to be able to decide on those trade-offs ourselves. I mostly write web servers in Go, which (as the article says) are executed rarely, so init time really doesn't matter to me. But I've been looking at writing some desktop apps in Go, and then init time will matter.
brainzap, about 4 years ago
Is there a bug report for this?
u678u, about 4 years ago
We need more initContainers where the big shared system libraries are in a separate container that is cached locally. I feel like history is repeating.
kreetx, about 4 years ago
Couldn't the dark bytes just be shipped as a separate file, for those who need it?
tedunangst, about 4 years ago
> Every time, 70% of a couple hundred megabytes are copied around for no good reason and someone needs to pay ingress/egress networking costs for these file copies.

Just zero them out and they'll compress to nothing. Even better, with a sparse file aware tool like tar, they won't even use disk space.
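
The compression half of this remark is easy to check with the standard library: a run of zero bytes shrinks to a tiny fraction of its size under DEFLATE, while random bytes of the same length barely shrink at all. The 1 MB payload size and the helper name `compressed_size` are arbitrary choices for this sketch; it says nothing about whether the bytes in a real Go binary could safely be zeroed.

```python
import os
import zlib


def compressed_size(payload: bytes) -> int:
    """Size in bytes of payload after zlib (DEFLATE) compression at level 9."""
    return len(zlib.compress(payload, 9))


size = 1_000_000
zeroed = bytes(size)       # one million zero bytes
noise = os.urandom(size)   # incompressible comparison data

print("zeroed:", compressed_size(zeroed), "bytes after compression")
print("random:", compressed_size(noise), "bytes after compression")
```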
boredpandas777, about 4 years ago
Maybe Go is optimizing for serverless with reduced init time?
daitangio, about 4 years ago
Go static linking is a great happy idea for a Java guy trapped in the Classpath Dependency Hell (or C# / DLL Hell).

It is a very annoying thing for a C++ programmer, who can dynamically link operating system libraries at will.
tediousdemise, about 4 years ago
It seems to be a challenge to add zero-overhead features to programming languages.

Poor design decisions result in a language that gets extremely bloated over time, forcing you to use features that you don’t want to.

The better approach is to make these features *optional*, such as through a standard library.
blinkingled, about 4 years ago
> We can call this the “dark file usage” of Go binaries, and it occupies between 15% and 33% of the total file size inside CockroachDB.

> Sadly, the removal of pclntab in Go 1.16 actually transferred the payload to the “dark” bytes.

I surely would have expected better from programming language designers/developers than this. Sounds like they just moved the problem from one place to another.