There were two motivations for shared libraries; one no longer applies and the other arguably was never reliable, so after 35 years of dynamic linking, a couple of years ago I went static only (except for system calls, of course).<p>The first reason was disk space: object files could simply be smaller. Nowadays, for all intents and purposes, disk space is free and unlimited.<p>The second was the ability to ship fixes to system facilities. Such fixes were fragile, leading to even more complex methods such as versioning, so though this was a good idea in principle, in practice it was very painful. I believe the Windows folks had their own name for it, "DLL Hell". The npm people still suffer from this from time to time, sometimes notoriously.<p>As for the fixes: software isn't updated on magtape any more, and most systems have robust upgrade systems. In addition there are all manner of quasi-hermetic isolation systems (VMs, docker images, venvs and the like), so why not just ship a static binary?<p>Plus with a static binary, if you really care about order of loading/unloading etc. (which you ideally shouldn't), it's trivial to manage with a custom linker script.<p>My only exception to this is the kernel: it can be upgraded, and mostly promises not to vary system call semantics too much, so I'm OK with not linking the kernel into my binary :-).
The glibc repository has multiple test cases for atexit, and it seems a lot of the current behavior derives from a fix for this bug in 2005: <a href="https://sourceware.org/bugzilla/show_bug.cgi?id=1158" rel="nofollow">https://sourceware.org/bugzilla/show_bug.cgi?id=1158</a><p>It seems odd for the author to have invested so much time in this, but not to have found or patched those glibc test cases to demonstrate the behavior he expects in the various hairy cases that glibc itself is worrying about, and which form the basis for its current behavior.
The submission title is misleading. The C standard library function atexit() isn't broken. At worst, some C library implementations are broken (or more charitably, not C standard compliant) <i>if</i> you also use certain other facilities that are <i>not</i> C standard library functions.
I question what the author was expecting to happen. If bar_exit comes from a dylib, and that dylib is subsequently unmapped, you can't very well call any functions from the unmapped library, so there are only four possible outcomes I can see:<p>1. The program crashes during termination as atexit() tries to invoke the previously-registered-but-now-invalid function.<p>2. The function is silently skipped as it's no longer loaded.<p>3. The function is invoked upon dlclose().<p>4. dlclose() does not actually unmap the library.<p>Of these options, only the 3rd actually seems reasonable. The first two are obviously bad, and the 4th seems like it rather defeats the purpose of calling dlclose() if you can't actually unload the library.
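To make option 3 concrete, here is a toy model of a handler registry that remembers which module registered each handler, loosely in the spirit of the `__cxa_atexit` / `__cxa_finalize` pair from the Itanium C++ ABI. This is only a sketch: every `toy_*` name, the fixed-size table, and the use of a variable's address as a module identity are assumptions for illustration, not how glibc or any other libc actually implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in any real libc. */
#define TOY_MAX_HANDLERS 32

typedef void (*toy_handler_fn)(void);

struct toy_entry {
    toy_handler_fn fn;   /* handler to call, NULL once it has run */
    const void *owner;   /* stands in for the module that registered it */
};

static struct toy_entry toy_table[TOY_MAX_HANDLERS];
static size_t toy_count;

/* Register a handler on behalf of a module; 0 on success, -1 if the table is full. */
static int toy_atexit(toy_handler_fn fn, const void *owner)
{
    if (toy_count == TOY_MAX_HANDLERS)
        return -1;
    toy_table[toy_count].fn = fn;
    toy_table[toy_count].owner = owner;
    toy_count++;
    return 0;
}

/* Run handlers newest-first. owner == NULL models process exit (run everything
 * still pending); otherwise only that module's handlers run (the dlclose case). */
static void toy_finalize(const void *owner)
{
    for (size_t i = toy_count; i-- > 0; ) {
        struct toy_entry *e = &toy_table[i];
        if (e->fn != NULL && (owner == NULL || e->owner == owner)) {
            toy_handler_fn fn = e->fn;
            e->fn = NULL;   /* mark as done so it never runs twice */
            fn();
        }
    }
}

/* Tiny call log so the order of handler calls can be inspected. */
static char toy_log[8];
static size_t toy_log_len;

static void toy_record(char c)
{
    assert(toy_log_len < sizeof toy_log - 1);
    toy_log[toy_log_len++] = c;
}

static void main_exit(void) { toy_record('m'); }
static void bar_exit(void)  { toy_record('b'); }

/* Simulates: main registers main_exit, "bar" registers bar_exit,
 * bar is unloaded, then the process exits. Returns the call log. */
const char *toy_demo(void)
{
    static int main_module, bar_module;   /* addresses act as module identities */

    toy_atexit(main_exit, &main_module);
    toy_atexit(bar_exit, &bar_module);
    toy_finalize(&bar_module);   /* "dlclose(bar)": bar_exit runs while still mapped */
    toy_finalize(NULL);          /* "exit": only main_exit is left to run */
    toy_log[toy_log_len] = '\0';
    return toy_log;
}
```

In this model `bar_exit` runs at the simulated unload and is never called again at exit, so nothing ever jumps into unmapped code. The cost is that the handler no longer runs "at exit" in the literal sense.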
I don't see how the behaviour he wants can work. If you dlclose a library and cause it to be unmapped, then the function pointers into that library are invalid and can't be called at exit. Either dlclose will have broken behaviour, keeping libraries mapped even though all their dlopen references have been dropped, or atexit will be broken.
I believe this was changed everywhere because glibc maintainers decided that it made more sense to run at library unload time, and then everyone else had to change their implementation to match "what Linux does."<p>That being said, none of these behaviors really make any sense. If the registered function isn't mapped into the process address space any more, what is supposed to happen?
I seem to remember a WWDC session from a few years ago (the one where dyld 3 was first introduced) where they said that dlclose() is now essentially a no-op on macOS, since it’s rarely done, the benefits are few, and the possible problems are many.<p>Edit: no-op dlclose() was something under consideration for dyld 3 on everything but macOS. <a href="https://devstreaming-cdn.apple.com/videos/wwdc/2017/413fmx92zo14voet8/413/413_app_startup_time_past_present_and_future.pdf?dl=1" rel="nofollow">https://devstreaming-cdn.apple.com/videos/wwdc/2017/413fmx92...</a>
There are two distinct concepts that are conflated for statically linked executables: cleanup on process death vs. cleanup on module death. Is the function there to clean up the process, or to clean up the module?<p>I think the argument for module cleanup is stronger, irrespective of the difficulty of registering code that's supposed to last longer than the calling address space. The module had nothing before it was loaded; it should leave nothing behind when it is unloaded. It's symmetrical.<p>The difficulty of providing an executable callback that outlives its module just seals the deal.
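For what it's worth, the symmetric "set up on load, tear down on unload" pattern can be sketched with the GCC/Clang `constructor` and `destructor` function attributes. These are compiler extensions, not standard C, and `module_state`, `module_init`, `module_fini` and `module_value` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

static int *module_state;   /* hypothetical per-module resource */

/* Runs when the module is loaded: before main() for the main program,
 * or during dlopen() when built into a shared object. */
__attribute__((constructor))
static void module_init(void)
{
    module_state = malloc(sizeof *module_state);
    assert(module_state != NULL);
    *module_state = 42;
}

/* Runs when the module goes away: at dlclose() for a shared object,
 * or at normal process exit otherwise. Releases what module_init acquired. */
__attribute__((destructor))
static void module_fini(void)
{
    free(module_state);
    module_state = NULL;
}

/* Accessor so callers can observe that setup has already happened. */
int module_value(void)
{
    return module_state != NULL ? *module_state : -1;
}
```

By the time any ordinary code calls `module_value()`, the constructor has already run, so it returns 42. The teardown is tied to the module's lifetime rather than to the process's, which is the symmetry argued for above.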
This reminds me of a strange bug I encountered once on ia64. We had a custom library that we loaded with dlopen(), and then called an initialization function within this library. Unbeknownst to me at the time, this function created a thread which did something, and then almost immediately went to sleep for a few seconds. Also unbeknownst to me, there was a function that you were supposed to call before dlclose() to kill this thread. If you dlclosed the library without first calling this function to stop the thread, then when the thread woke up, all its code would of course be gone, and the program segfaulted, leaving a core file. Interestingly, if you ran gdb on the core file, gdb segfaulted! Presumably because the thread didn't have any corresponding code in the core file either. Took a while to figure that one out. Curiously, on x86, all this seemingly worked fine (maybe dlclose implicitly killed the thread on x86? ... never figured out why it didn't crash on x86 but did on ia64.)
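The safe shape for a library like that is roughly the following: the shutdown call asks the thread to stop and joins it, so no thread is left executing library code when dlclose() unmaps it. This is a minimal sketch using POSIX threads and C11 atomics; `lib_init`, `lib_shutdown` and the polling worker are hypothetical stand-ins, since the original library's API isn't known.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

static pthread_t worker_thread;
static atomic_bool worker_running;
static atomic_bool worker_stop_requested;

/* Background thread: sleeps in short slices and re-checks the stop flag,
 * so a stop request is noticed within about 10 ms. */
static void *worker_main(void *arg)
{
    (void)arg;
    const struct timespec nap = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
    while (!atomic_load(&worker_stop_requested))
        nanosleep(&nap, NULL);
    return NULL;
}

/* Hypothetical init entry point: starts the worker. Returns 0 on success. */
int lib_init(void)
{
    assert(!atomic_load(&worker_running));   /* init must not be called twice */
    atomic_store(&worker_stop_requested, false);
    if (pthread_create(&worker_thread, NULL, worker_main, NULL) != 0)
        return -1;
    atomic_store(&worker_running, true);
    return 0;
}

/* Hypothetical shutdown entry point: call this before dlclose().
 * Waits until the worker has really exited. Safe to call more than once. */
int lib_shutdown(void)
{
    if (!atomic_load(&worker_running))
        return 0;
    atomic_store(&worker_stop_requested, true);
    if (pthread_join(worker_thread, NULL) != 0)
        return -1;
    atomic_store(&worker_running, false);
    return 0;
}
```

The important part is the `pthread_join`: merely setting a flag isn't enough, because the thread could still be asleep inside library code when the mapping disappears.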
It's actually quite hard to do exit cleanups, even by putting functions into `.fini_array`. The reason is that some signals cannot be handled: SIGKILL can't be caught, blocked, or ignored, so when a thread/process (`task_struct`) gets `SIGKILL`-ed, there's no way for the application to run any functions (including those in `.fini_array`). SIGTERM can be caught, but its default action likewise terminates the process without running exit handlers. Things get more complicated when a thread calls `exit_group`, which is also recommended: it causes all other threads in the same thread group to receive `SIGKILL`, hence none of the other threads are able to run any code afterwards.
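A quick way to see this for yourself on a POSIX system: fork a child that registers an atexit() handler which writes one byte to a pipe, then have the child either call exit() or send itself SIGKILL, and check in the parent whether the byte arrived. This is a sketch for illustration; `handler_ran` and `on_exit_report` are made-up names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, used by the child only */

/* atexit handler: leaves one byte in the pipe as proof that it ran. */
static void on_exit_report(void)
{
    const char mark = 'x';
    if (write(report_fd, &mark, 1) != 1) {
        /* nothing useful to do on failure in this demo */
    }
}

/* Forks a child that registers the handler and then either calls exit(0)
 * or sends itself SIGKILL. Returns 1 if the handler ran, 0 if not, -1 on error. */
int handler_ran(int kill_child)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                  /* child */
        close(fds[0]);
        report_fd = fds[1];
        if (atexit(on_exit_report) != 0)
            _exit(2);
        if (kill_child)
            raise(SIGKILL);          /* cannot be caught: the child dies here */
        exit(0);                     /* normal termination runs atexit handlers */
    }

    close(fds[1]);                   /* parent keeps only the read end */
    char c;
    ssize_t n = read(fds[0], &c, 1); /* 0 means EOF: the child wrote nothing */
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (kill_child)
        assert(WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL);
    return n == 1 ? 1 : 0;
}
```

`handler_ran(0)` reports that the handler ran, while `handler_ran(1)` reports that it did not. Anything that must happen when a process dies this way has to be done from outside the process (a supervisor, a parent, or the kernel releasing resources).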