Swap out the allocator <a href="https://users.rust-lang.org/t/optimizing-rust-binaries-observation-of-musl-versus-glibc-and-jemalloc-versus-system-alloc/8499" rel="nofollow">https://users.rust-lang.org/t/optimizing-rust-binaries-obser...</a>
For those curious, musl's malloc implementation is currently being rewritten for higher performance and robustness; see <a href="https://github.com/richfelker/mallocng-draft" rel="nofollow">https://github.com/richfelker/mallocng-draft</a>
We'd be happy to address specific problems on the mailing list. I believe it's a known issue that the Rust compiler makes very heavy use of rapid allocation/free cycles and would benefit from linking a performance-oriented malloc replacement. Doing so is inherently a tradeoff among many factors, including performance, memory overhead, and safety against erroneous usage by programs.<p>One statement in your post, which some readers pointed out was apparently added later, "Others have suggested that the performance problems in musl go deeper than that and that there are fundamental issues with threading in musl, potentially making it unsuitable for my use case," seems wrong unless they just meant that the malloc implementation is not thread-caching/thread-local-arena-based. The threads implementation in musl is the only one I'm aware of that doesn't still have significant bugs in some of the synchronization primitives or in cancellation. It's missing a few optional and somewhat obscure features like priority-ceiling mutexes, and Linux doesn't even admit a fully correct implementation in some regards, such as the interaction of thread priorities with some synchronization primitives. But all the basic functionality is there and was written with extreme attention to correctness, and musl aims to be a very good choice in situations where this matters.
Swapping out the allocator for jemalloc would be my first try. It's easy to do and often results in better performance. A 30x slowdown requires some kind of pathological case, though.
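For anyone who hasn't done this before, here is a minimal sketch of the mechanism Rust uses to swap allocators: the `#[global_allocator]` attribute on a static whose type implements `GlobalAlloc`. To keep it dependency-free, it uses a hypothetical `CountingAlloc` wrapper around the system allocator rather than jemalloc itself; `CountingAlloc`, `ALLOCS`, and `churn` are illustrative names, not from the linked thread. Swapping in jemalloc should look the same, except the static's type comes from a jemalloc binding crate (assumed here to be `tikv-jemallocator`).

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator: counts allocations, then forwards to the
/// system allocator (musl's or glibc's malloc, depending on the target).
struct CountingAlloc;

/// Number of `alloc` calls seen so far, across all threads.
static ALLOCS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// This one attribute is the whole "swap". With a jemalloc binding crate
// (assumption: tikv-jemallocator), the static would instead be declared as
//     static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Allocate and immediately drop `n` small buffers; returns how many
/// allocations the global allocator observed while doing so.
fn churn(n: usize) -> usize {
    let before = ALLOCS.load(Ordering::Relaxed);
    for i in 0..n {
        let buf = vec![i as u8; 64];
        black_box(&buf); // keep the optimizer from eliding the allocation
    }
    ALLOCS.load(Ordering::Relaxed) - before
}
```

Calling `churn(1000)` from `main` and printing the result gives a rough idea of how allocation-heavy a code path is, which is useful to know before deciding whether a different allocator is likely to matter.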
Post was not very illuminating. Very little content. It's pretty much an "if musl is slow, it may be the allocator (eom)" which fits in the headline and would have saved me the click.
This is actually someone asking a question, not an investigation and explanation. There isn't even much due diligence in figuring it out: no profiling, and no resource usage data other than CPU. Also, it is musl combined with Docker that causes the 30x slowdown.<p>If something is running 30x slower from linking in a different libc, I'm guessing it should not be that difficult to narrow down the cause at least a little bit.
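One cheap way to start narrowing it down, sketched under the assumption that the allocator is the suspect: time a pure allocate/free loop at different thread counts, build it once per libc target, and compare. `churn_time` is a hypothetical helper, and the buffer sizes are arbitrary; this is not a substitute for a real profiler.

```rust
use std::hint::black_box;
use std::thread;
use std::time::{Duration, Instant};

/// Spawn `threads` workers that each allocate and immediately free
/// `iters` small buffers; returns the total wall-clock time.
fn churn_time(threads: usize, iters: usize) -> Duration {
    let start = Instant::now();
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                for i in 0..iters {
                    // Vary the size a little (16 to 128 bytes).
                    let buf = vec![0u8; 16 + (i % 8) * 16];
                    black_box(&buf); // keep the allocation from being optimized out
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    start.elapsed()
}
```

If `churn_time(1, n)` is similar across the two builds but `churn_time(8, n)` is far slower on musl, that points at lock contention in a non-thread-caching malloc rather than at the threads implementation. If both are similar, the allocator is probably not the cause and the Docker side deserves a look.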