Counterpoint: parallel programming isn't very hard.<p>Seriously: try fork() + wait(), or spawning background jobs in bash / the command line. make -j gives you parallelism for free (and Ninja is essentially "make -j by default").<p>OpenMP's "#pragma omp parallel for" is extremely easy to use.<p>Things get harder with thread libraries, but the "parallelism" part of pthreads is pretty easy: pthread_create(&thread_id, NULL, function, args); at least if you're fine with the default attributes.<p>C++ threads (std::thread) are even easier than pthreads.<p>-----------<p>The hard parts are:<p>1. Communication / synchronization -- but really, condition variables are the most complicated tool most people ever need. fork() + wait() pushes the difficult communication work into wait() or into pipes, and if you stick with wait() / waitpid() as your main synchronization mechanism, it's pretty easy to reason about.<p>2. High-performance programming -- this is just hard, especially once you reach for the "memory barrier / lock-free programming" styles. This is where "false sharing" comes in: your code still executes correctly under false sharing, but slowly, because multiple cores keep bouncing the same cache line between their L1 caches.<p>---------<p>However, if you stick with a simple fork-join communication model (ex: pthread_join, or wait()/waitpid()-based synchronization), it's really not too hard at all. Just stay away from the high-performance techniques unless you really need them.