As far as I can establish with a cursory skim, the article describes priority inversion, and states:<p>"OS schedulers might implement different techniques to lessen the severity of this problem, but it's far from being solved once and for all."<p>The article fails to describe what "solved once and for all" might mean.<p>Given that Priority Ceiling Protocol and Priority Inheritance are two common solutions used in real-time schedulers, I would like to understand why priority inversion is not a solved problem.<p>On the other hand, I have yet to learn whether Mac OS has any synchronisation primitives that mitigate mutex priority inversion.
Solution from the article: Don't fiddle with thread priorities.<p>Solution that works well in real life: Use lock-free (or, better still, wait-free) data structures.<p>The trick is that, for those of us in the devices and media world, and despite not always using RT schedulers, we generally need consistently prioritized performance for critical operations. For example, I need my device's audio buffer callback to run on time to avoid audible gaps.<p>How have whole industries managed to deliver sellable products in these spaces without resorting to RT kernels? In many cases, careful use of thread priorities and, yes, mutexes, has been involved.<p>This article feels like it was written by someone halfway through this progression:<p>Junior engineer: Just use thread priority.<p>Senior engineer: No! Thread priority is a minefield.<p>Principal engineer: We're going to carefully walk through this minefield.
> If possible, avoid headaches by just not fiddling around with thread priorities.<p>No, learn the ins and outs of your OS's primitives. Threads are perfectly fine to run at different priorities, but as soon as you start communicating between them they become a chain of sorts, and if resources are exhausted (such as a queue reaching its maximum size), or if you artificially couple the threads with some kind of sync mechanism, then your priorities will not be what you want them to be.<p>Prioritization works well for <i>independent</i> threads, not for dependent threads, which are closer to co-routines that happen to go through the thread scheduler rather than calling each other directly.
This article seems like three pages of my CS Operating Systems unit summarised into an article. Since I started reading Hacker News I've always assumed the userbase has an education equating to at least a CS degree.<p>Am I wrong for assuming that this article is fairly introductory for the user base here? It seems to me that if you can read this article and understand push, pop, thread priority, mutex, transitive, etc., then it's more than likely that someone has already lectured you about the issues that can arise when using mutexes for locking.
This is called priority inversion.<p>One way to deal with this is to make it so that when a low-priority thread dequeues a message from a high-priority thread, the low-priority thread temporarily inherits the sender's higher priority, then later drops back to its own lower priority. The problem is identifying when to go back. Solaris doors was an IPC mechanism that did this well, but at the price of being synchronous -- when you throw asynchrony in, that approach doesn't work, and you really do want asynchrony. If you trust the low-priority threads enough, you can let them pick a priority according to that of the client they are servicing at any moment.
I was hoping for a cool story of OS bugs corrupting threads. Instead, as others pointed out, this is just a more verbose explanation of priority inversion.<p>I would still like to read the zombie injection story.
Note that this is only catastrophic when using <i>realtime</i> thread priorities, i.e. priority classes that cause a runnable higher-priority thread to run instead of a lower-priority one regardless of how much time it has already been running.<p>In general-purpose OSes, the commonly used thread priorities are not realtime, and you need admin/root to set realtime thread priorities.<p>Also, indefinite deadlock can only happen if the realtime threads want more CPU than is available (which in particular requires that there are no more cores than realtime threads), since otherwise they will all eventually be sleeping/waiting, allowing any threads waiting on the mutex to run.