My first thought when reading this was "what about Rust?" Rust uses the system libc to support most of its standard library, so could it have the same problem?<p>A quick look at the std::env::set_var docs [0] tells us that Rust is aware the underlying implementations are inherently thread-unsafe, and looking at its implementation [1] tells us that Rust uses a global lock for all access to the environment.<p>So at least it will avoid segfaulting your programs.<p>Good job Rust :D<p>EDIT: However, I guess if you fork() (which is not something you can do with the standard Rust library AFAIK, so it requires unsafe), you may get the first problem of having a deadlock.<p>[0]: <a href="https://doc.rust-lang.org/std/env/fn.set_var.html" rel="nofollow">https://doc.rust-lang.org/std/env/fn.set_var.html</a><p>[1]: <a href="https://github.com/rust-lang/rust/blob/51d29343c04a27570a8ff8282611007d6e6408de/src/libstd/sys/unix/os.rs#L415" rel="nofollow">https://github.com/rust-lang/rust/blob/51d29343c04a27570a8ff...</a>
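For anyone who wants to see the shape of the "global lock for all access to the environment" idea, here is a rough sketch in Python rather than Rust. It is not Rust's actual implementation; `_ENV_LOCK`, `locked_getenv` and `locked_setenv` are hypothetical names made up for illustration.

```python
import os
import threading

# Hypothetical process-wide lock; every environment access in *our* code
# is expected to go through the two wrappers below.
_ENV_LOCK = threading.Lock()


def locked_getenv(name, default=None):
    """Read a variable while holding the global environment lock."""
    with _ENV_LOCK:
        return os.environ.get(name, default)


def locked_setenv(name, value):
    """Write a variable while holding the global environment lock."""
    with _ENV_LOCK:
        os.environ[name] = value
```

A lock like this only covers callers that go through the wrappers. Code that calls getenv() or setenv() directly, for example a C library loaded into the same process, does not take the lock and can still race.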
Windows has sane environment-retrieval functions: <a href="https://msdn.microsoft.com/en-us/library/windows/desktop/ms683188(v=vs.85).aspx" rel="nofollow">https://msdn.microsoft.com/en-us/library/windows/desktop/ms6...</a><p>Unlike getenv, GetEnvironmentVariable copies a variable's value into a <i>caller provided buffer</i>, addressing the use-after-free problem inherent in getenv's interface.<p>Wouldn't it be nice if glibc got something like a getenv2 with a similar interface? Why should people have to trip over weird corner cases in POSIX over and over and over again? Why are we afraid to add new APIs that make some damn sense?
I've had fun debugging this very problem in the past, as the root cause of a very puzzling intermittent failure.<p>The symptoms were intermittent segfaults in some high-level scientific Python code running on a large cluster in AWS. Given we had no custom C library extensions, this should have been impossible, but there they were; roughly one in every several thousand runs would fail at random, if memory serves.<p>We eventually tracked this to a call to getenv() in the OpenBLAS thread pool per-thread initialization code. This seems innocuous enough by itself, but couple it with some other unrelated Python library which happened to call setenv() and you have a nice race condition accessing the internal glibc data structures. Ultimately this ended up with a segfault due to dereferencing some already-freed memory.<p>Ah, good times. Here's the root cause analysis if anyone wants the gory details :-)
<a href="https://github.com/xianyi/OpenBLAS/issues/716#issuecomment-164339663" rel="nofollow">https://github.com/xianyi/OpenBLAS/issues/716#issuecomment-1...</a>
Another fun one is "don't exit() in multi-threaded code on glibc".<p>It goes like this:<p>1. Thread A calls FILE *f = fopen(...); if (f == NULL) {error}<p>2. Thread B calls exit(). exit() flushes and closes all open FILE streams (it does so in an internal atexit() handler)<p>3. Thread A determines that the file opened just fine, f != NULL. But Thread B has called fclose() on it, so any use in thread A is a use-after-free if Thread B is a tad slow in actually exiting the process
I guess this is one of those "obvious in retrospect" things, but modifying global state in a multithreaded program is definitely something that should immediately ring alarm bells unless the API is specified to be thread-safe.
I wrote the equivalent code in Go to see what happened<p><a href="https://play.golang.org/p/E3glBkbAo3" rel="nofollow">https://play.golang.org/p/E3glBkbAo3</a><p>(You'll need to run it locally for the full effect - things on play are limited to a single thread.)<p>It didn't crash.<p>I wasn't surprised, though, as Go doesn't use the C library and its authors are very careful about multithreaded code.
How is it even defined to setenv something in multi-threaded code?<p>Also, if you want to pass an environment variable to a child process from a specific thread, how do you do that without interfering with other threads that might do the same (alter the same variable to pass it to a different child process)? Would that require an ugly mutex around invoking child processes?
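One way to avoid the mutex, sketched in Python: launch the child through an API that takes an explicit environment, so each thread builds its own copy and the process-wide environment is never modified. This assumes you control how the child is started; `subprocess.run` with `env=` is used here, and POSIX execve() accepts an environment argument in the same spirit. `run_with_var`, `run_many` and `DEMO_VAR` are hypothetical names for illustration only.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_with_var(name, value):
    """Run a child that prints one variable, passed via a private env copy."""
    child_env = dict(os.environ)  # snapshot; the parent's environment is untouched
    child_env[name] = value
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ[{name!r}])"],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def run_many(name, values):
    """Launch one child per value from several threads, same variable name."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda v: run_with_var(name, v), values))
```

Calling `run_many("DEMO_VAR", ["a", "b", "c"])` gives each child its own value for the same variable, and `DEMO_VAR` never appears in the parent's `os.environ`, so the threads have nothing to fight over.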
You know, I'm starting to think that multithreading may not be the best idea in an environment explicitly not designed for it, and we should be using other approaches, if the latency requirements of the application make that possible.<p>If only we had some sort of shared-nothing concurrency, like a syscall that <i>forks</i> your process and uses copy-on-write to make it efficient. But that's just crazy talk.
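A small sketch of what shared-nothing means for the environment, in Python and Unix-only since it relies on `os.fork`: the child changes a variable in its own copy of the process, and the parent never sees it. `setenv_in_forked_child` and the variable name in the usage note are hypothetical.

```python
import os


def setenv_in_forked_child(name, value):
    """Set a variable in a forked child; return (child_exit_code, parent_value)."""
    pid = os.fork()
    if pid == 0:
        # Child: operates on its own copy of the parent's state.
        code = 1
        try:
            os.environ[name] = value
            code = 0 if os.environ.get(name) == value else 1
        finally:
            os._exit(code)  # leave immediately, skipping the parent's cleanup handlers
    # Parent: wait for the child, then report what the parent itself sees.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status), os.environ.get(name)
```

If `FORK_DEMO_VAR` is not already set, `setenv_in_forked_child("FORK_DEMO_VAR", "x")` returns `(0, None)`: the child saw its own write, and the parent's environment is unchanged.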
We should just deprecate setenv(). There is no need for it because we have execve().<p>I suppose there is system() on non-POSIX, but then you don't have a portable shell anyway nor do you have portable environment variables. You're off in 100% non-portable territory at that point, so no point having it in the standards.