Must Be This Tall to Write Multi-Threaded Code

146 pointsby mnemonikalmost 10 years ago

29 comments

The problems with traditional multi-threaded concurrency go beyond just complexity and safety. They also offer relatively poor performance on modern hardware due to the necessarily poor locality of shared structures and context switching, which causes unnecessary data motion down in the silicon. Whether or not "message passing" avoids this is dependent on the implementation.Ironically, the fastest model today on typical multi-core silicon looks a lot like old school single-process, single-core event-driven models that you used to see when servers actually had a single core and no threads. One process per physical core, locked to the core, that has complete ownership of its resources. Other processes/cores on the same machine are logically treated little different than if they were on another server. As a bonus, it is very easy to distribute software designed this way.People used to design software this way back before multithreading took off, and in high-performance computing world they still do because it has higher throughput and better scalability than either lock-based concurrency or lock-free structures by a substantial margin. It has been interesting to see it make a comeback as a model for high concurrency server software, albeit with some distributed systems flavor that was not there in the first go around.

评论 #9905855 未加载

评论 #9906286 未加载

评论 #9908069 未加载

bsderalmost 10 years ago

Sigh. What's wrong with using lock-free data structures?Go study java.util.concurrent. It's one of the absolute best libraries ever written by some of the smartest programmers I have ever seen.The primary question is "Do I really need to wait or do I just need to be consistent?" 90% of the time the answer is that consistent is good enough.Lock-free data structures are not a panacea. They don't always do as well as locks in the face of contention. However, if you have that much contention, congratulations, you have an actual spot you really need to optimize.By default, though, lock-free data structures protect you from so much fail it's ridiculous. I don't dread concurrent programming if I have a good lock-free data structure library.That having been said, if you really have to wait (normally for hardware access), then you MUST do certain things. Your "lock" MUST be as small as possible--if it isn't "lock with timeout", "single small action that always goes to completion even if error occurs", "unlock"--YOU HAVE FAILED. START OVER. Also, note the "timeout" portion of the lock. "Timeout" MUST be handled and IS NOT NECESSARILY AN ERROR.Now, these don't get all the situations. People who need "transactions" have hard problems. People who have high contention have hard problems.However, I can count the number of times I genuinely needed to deal with contention or transactions on one hand and still have two fingers left over.Whereas, I have lost count of the number of times that I cleared out all manner of bugs simply by switching to a lock-free data structure.

评论 #9907441 未加载

评论 #9906546 未加载

评论 #9906361 未加载

评论 #9907099 未加载

smegelalmost 10 years ago

I've written a lot of multi-threaded code, but I don't think I've written ANY multi-threaded code that doesn't involve queues passing objects (hopefully safely) between threads. As in, no locks, no semaphores, no accessing shared state between threads (OK apart from an global flag structure that just set various error conditions encountered and was only interacted with by atomic get/set operations and where order of access was never important). Adding a lock to a program is like a huge red flag - stop everything and really think about what you are doing.

评论 #9907543 未加载

评论 #9907148 未加载

评论 #9906893 未加载

nickpsecurityalmost 10 years ago

Sounds like they're just using the wrong tools. Eiffel's SCOOP model[1] made things a lot easier back around '95. They've been improving on it since, with a lot of great work recently [2]. Various modifications proved absence of deadlocks, absence of livelocks, or guarantee of progress. I believe a version was ported to Java. A few modern variants have performance along lines of C++ w/ TBB and Go.What are the odds that it could be ported to a restricted use of C++ language, I wonder?Note: Ada's concurrency strategy also prevented quite a few types of errors. They're described in this article [3] on ParaSail, a language designed for easy concurrency that goes much further.[1] <a href="http://cme.ethz.ch/scoop/" rel="nofollow">http://cme.ethz.ch/scoop/</a>[2] <a href="http://cme.ethz.ch/publications/" rel="nofollow">http://cme.ethz.ch/publications/</a>[3] <a href="http://www.embedded.com/design/programming-languages-and-tools/4375616/ParaSail--Less-is-more-with-multicore" rel="nofollow">http://www.embedded.com/design/programming-languages-and-too...</a>

tormehalmost 10 years ago

People: There are solutions to this shit. If you're building a distributed system or need to deal with loss of data regardless, use actor systems (Erlang or Akka). If you need something that's not quite as probabilistic and are willing to handle deadlocks use CSP (Go or Rust). If you need absolute determinism and you're willing to pay for it in performance use SRP (Esterel or possibly maybe at your own risk Céu).If you need shitstains in your underwear use locks and semaphores.

评论 #9907258 未加载

Animatsalmost 10 years ago

The main problem with multithreaded programming is that most languages are clueless about threads and locks. They were an afterthought in C, and were viewed as an operating system primitive, not a language primitive. The language has no clue what data is locked by which lock. There's no syntax to even talk about that. Of course concurrency in C and C++ gets messed up.Rust has a rational approach to shared data protection. Shared data is owned by a mutex, and you must borrow that mutex to access the data. You can have N read-only borrowers, or one read-write borrower. The borrow checker checks that at compile time. This gives us an enforceable way to think about who can access what.

评论 #9906507 未加载

SCHiMalmost 10 years ago

I got quite frustrated when I read this article. That's because this article, and the many others like this, confuse the real issue.This article, and those like it, all state that the problem with multi-threading and synchronization is inherent to the programing paradigm/language/architecture you're using:> "Buggy multi-threaded code creates race conditions, which are the most dangerous and time-consuming class of bugs in software"> "because the traditional synchronization primitives are inadequate for large-scale systems."Ok. Fair enough, now tell us why that is so.I get quite annoyed when the author then proceeds to turn it all around by saying this:> "Locks don’t lend themselves to these sorts of elegant principles. The programmer needs to scope the lock just right so as to protect the data from races, while simultaneously avoiding (a) the deadlocks that arise from overlapping locks and (b) the erasure of parallelism that arise from megalocks. The resulting invariants end up being documented in comments:"> "And so on. When that code is undergoing frequent changes by multiple people, the chances of it being correct and the comments being up to date are slim."Implying that the real problem with locks/threading/synchronization is actually communication, proper documentation discipline, programmer skill (soft and hard).Of-course I'm not saying that the process of using primitive synchronization methods can't be abstracted over to make it easier to write _proper_ multi threaded code. It's just that this really feels like subjective politicking very much like the aversion to (proper use of) goto() in C/C++ code.

评论 #9906095 未加载

SEMWalmost 10 years ago

> In this approach, threads own their data, and communicate with message-passing. This is easier said than done, because the language constructs, primitives, and design patterns for building system software this way are still in their infancy"Still in their infancy"? That's basically a description of Erlang's concurrency model, almost three decades old now.Is there a concurrency equivalent of Spencer's law -- something along the lines of "Those who do not understand Erlang are doomed to reinvent it"?

评论 #9905771 未加载

rayineralmost 10 years ago

It's much easier to use threads than to use them properly. Arguably, the stricture's of Rust's type system makes it harder to use threads. But it makes it almost impossible to use threads improperly. Both are probably good things.I have seen some real doosies writing multithreaded code. We had a relatively simple data analysis project that took in spectrum measurements from a piece of hardware, logged them, did some basic visualizations, and allowed for controlling the hardware. Each of these functions ran in one or more threads. Imagine my surprise when I saw lots of uses of CreateThread but nary a call to WaitForSingleObject or even EnterCriticalSection. I think there may have been a Boolean flag to "coordinate" a pair of producer/consumer threads.

评论 #9905749 未加载

评论 #9905725 未加载

nbardyalmost 10 years ago

It seems like the key to writing current code is to abandon the idea of understanding how to construct a concurrent architecture and to figure out how to adopt a pattern concurrent which provides certain guarantees. Often it is something baked into the language, but frequently it is a library. This is one of the reasons I'm so thrilled with Clojure.1) Because it has STM baked in and there is a core library for CSP.2) Because it is a lisp so adding foreign syntax is as simple as a library and doesn't need to be a language extension.

chipsyalmost 10 years ago

The article's style got me into a ranting mood. I don't want "allusion links" that surface vapid text like "new superpowers" or "never ending stream of goodness". You are forcing me to click on them to know WTF you mean.

评论 #9907157 未加载

steven2012almost 10 years ago

I hate blog posts like this.You get one guy, who is seemingly very smart, and he says basically "Don't do multithreading, it's very hard. Only an elite few, such as me, can do this right, so most of you out there DON'T DO IT!"It's bullshit. Mainly because it's no harder than anything else, and has just as much pitfalls as every other type of programming. Yes, to a certain degree multithreading is hard, but it's not rocket science. But PROGRAMMING is hard. Not just multithreaded programming. There's nothing very special about multithreaded programming that should scare off people from trying it. Sure, you might fuck up, but that'sFor example, our entire company was almost completely brought down a few months ago by our "architect" implementing a feature so poorly that it caused massive system instability. What was this feature? It essentially boiled down to a 1 or a 2. Customer accounts were tagged with either a 1 or a 2, an it's supposed to take a different code path for each, but he made it so fucking complicated and he didn't do his due diligence, the entire weight of his code cause significant downtime, and a customer that accounts for 60% of our revenues almost walked. And none of this is rocket science.Of course, I worked at another company where one engineer thought "oh, asynchronous APIs are faster than synchronous APIs" so they implemented the entire API asynchronously. Of course, that required mutexes on the server side. And then more mutexes. And it got to the point where the performance was hell because of the unintended consequences of trying to make things faster. You would write a new API and the server would barf saying "You took the locks in the wrong order" but there was no indication of you ever doing anything wrong. It was a mess. So I get what the OP is saying, but it's not specific to just multithreadedness. I bet the same programmer would have made a mess of a single-threaded app as well. They are just shitty or careless programmers.If you're careful, multithreaded programming is helpful and you can see some significant performance boosts from it. But like every other paradigm in programming, don't overuse it. A judicious use of simple multithreaded programming might help a lot, but there are few apps that benefit from an extremely complex system with hundreds of threads, massive amounts of mutexes, etc.

评论 #9906745 未加载

评论 #9907034 未加载

mannykannotalmost 10 years ago

"The resulting invariants end up being documented in comments."There's your problem. If you are going to use locks, you need a wider view of the system than you get at the source-code level. It is doable, but there is a big impedance mismatch between this approach to software development and agile methods.

tsothaalmost 10 years ago

It's really not that hard to write multi-threaded code. I just laugh when I read articles like this - I've been doing it for more than fifteen years now. By taking a tool like that away from your team you're stunting their growth and your product.

评论 #9907513 未加载

jonduboisalmost 10 years ago

Thread-based concurrency has no future. It's complex and it doesn't scale beyond a certain point. Process-based concurrency is relatively simple (especially if your programming language has good async support) and it can scale indefinitely.The one advantage of threads is that the overhead is lower when operating at low concurrency. But it's like algorithmic complexity, people only care about growth in complexity not about the initial offset.

评论 #9906709 未加载

评论 #9905888 未加载

MCRedalmost 10 years ago

Erlang and Elixir solved this problem. I only write multi-threaded code in very limited cases when it makes sense to split processing out of UI on mobile devices.Everywhere else I use Elixir, and I write multi-process code and I don't think twice about it.And I never run into problems.I'm really feeling like people keep choosing tools that haven't solved the problem, or even tried to, and then thinking that the problem is perennial.It was solved a long time ago by erlang.

评论 #9906992 未加载

zzzcpanalmost 10 years ago

> However, these programmers aren’t fleeing concurrency itself - they’re fleeing concurrent access to the same data.He's not wrong.Modern real world example: Golang authors designed net library in a such way, that everyone who uses it has to think about concurrent access to shared mutable states. Which is hard and unnecessary. Event loops never had this problem, but for some reason got labeled "non idiomatic" by Golang folks. So I had to implement event loop myself.

atsalolialmost 10 years ago

Sigh. No mention of logic verification that there are no race conditions. Problem has been solved by Dr Holzmann at JPL <a href="http://www.verticalsysadmin.com/making_robust_software/" rel="nofollow">http://www.verticalsysadmin.com/making_robust_software/</a>

zubirusalmost 10 years ago

>> In this approach, threads own their data, and communicate with message-passing.This is the same paradigm as MPI, the message parsing interface. Using it, you also get for free the ability to deploy your "threaded" code in distributed memory architectures. But any person who had just a bit of experience with this standard can tell you how tedious is to develop a parallel code with it. Maybe this is product of the paradigm or just the verbosity of the API (see for example: <a href="http://www.mpich.org/static/docs/v3.1/www3/MPI_Alltoallv.html" rel="nofollow">http://www.mpich.org/static/docs/v3.1/www3/MPI_Alltoallv.htm...</a>).I wish there was some sort of OpenMP or Intel TBB equivalent for MPI to ease out the pain.

aidenn0almost 10 years ago

My dad said he had to read Hoare when he got his M.S. in the eary '80s, and that half the people who read it didn't understand it, and half the people who understood it ignored it. It's 30 years later and people are still using crappy synchronization primitives.

ddmillsalmost 10 years ago

There is a relatively new actor-like language called paninij[1] which uses the idea of 'capsules'. I have been developing a java annotation based version of it called `@PaniniJ`. Capsule oriented programming enforces modular reasoning, which in turn allows the code to be transformed automatically into multithreaded goodness.[1] <a href="http://paninij.org/" rel="nofollow">http://paninij.org/</a> [2] <a href="https://github.com/hridesh/panini" rel="nofollow">https://github.com/hridesh/panini</a>

评论 #9905786 未加载

rpcope1almost 10 years ago

I think you really ought to have to read Little Book of Semaphores before you're allowed to touch multi-threaded code. [1][1] - <a href="http://www.greenteapress.com/semaphores/downey05semaphores.pdf" rel="nofollow">http://www.greenteapress.com/semaphores/downey05semaphores.p...</a>

ArkyBeaglealmost 10 years ago

I am utterly ignorant of what Gecko looks like, but in largerish realtime embedded work, things always seem to end up in more formal design methodologies utilizing transactional models such as message sequences ( frequently expressed in charts. )

kabdibalmost 10 years ago

Coroutines are great stuff. Being able to yield for an async result and then wake up later, without having to do expensive and buggy lock rendezvous nonsense, is manageable and scalable.

hyperpalliumalmost 10 years ago

Shared-nothing message passing is an answer, as used in Erlang, but I seem to recall reading that race and deadlock can still occur, just at a higher level.

opnitroalmost 10 years ago

Just so you know, this is broken on ios-safari. I love the title though.

zobzualmost 10 years ago

Love this sign, too bad I heard its gone

bronzalmost 10 years ago

What is racing?

评论 #9905692 未加载

评论 #9905722 未加载

batoualmost 10 years ago

Run this in your browser's console window to make it actually scrollable without playing find the sodding scrollbar:<pre><code> document.getElementById("contentpane").style.width = "100%"</code></pre>

评论 #9905617 未加载