I feel that a discussion of <a href="http://en.wikipedia.org/wiki/Amdahl's_law" rel="nofollow">http://en.wikipedia.org/wiki/Amdahl's_law</a> should be mandatory when introducing parallel programming.<p>Small sequential portions of an otherwise parallel algorithm can have a huge effect on the overall running time when you try to scale up.<p>"parconc" explains this while discussing a parallel version of k-means, talks about how things like the granularity of the data need to be fine-tuned for parallel algorithms, and provides some nice visualizations of what the CPUs are actually doing on a timeline: <a href="http://chimera.labs.oreilly.com/books/1230000000929/ch03.html#sec_par-kmeans-perf" rel="nofollow">http://chimera.labs.oreilly.com/books/1230000000929/ch03.htm...</a><p>Overall I think multicore is a good tool to have in your toolbox, but it seems to take a lot of tuning and effort to get good rewards for the time invested.
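<p>To make the Amdahl's law point concrete, here is a rough sketch of the idealized formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that parallelizes and n is the number of cores. The function name and the 95% figure are my own illustrative assumptions, not something taken from parconc, and the model ignores scheduling, communication, and memory overheads, so real programs usually do worse.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Idealized speedup under Amdahl's law: 1 / ((1 - p) + p / n).

    Hypothetical helper for illustration only; it ignores all
    coordination overhead between workers.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed workload: 95% of the running time parallelizes perfectly.
    for cores in (1, 2, 4, 8, 16, 64, 1024):
        print(f"{cores:>5} cores: {amdahl_speedup(0.95, cores):6.2f}x")
```

With only 5% of the work left sequential, the speedup can never reach 20x no matter how many cores you add, which is why the small sequential portions matter so much.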