I'm glad to see people talking about one of the four stages of optimization:<p>1. Do we have to optimize this? Engineering time is never free, and the opportunity cost is usually significant.<p>2. Can we do less work? (this article)<p>3. What's the bottleneck? CPU, FP, memory bandwidth, lock contention?<p>4. How do we squeeze out better performance? Assembly, loop unrolling, etc.<p>I usually cringe when I hear people talking about #4, as very few of them have asked #1, #2, or #3 yet.<p>And usually they're just talking about redoing what the compiler already does (writing it in assembly? /The compiler already does that/), and the compiler usually does a pretty reasonable job at high optimization levels. They just exchange maintainability for the warm fuzzy feeling that they've been as macho as needed.
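<p>To make #2 concrete, here's a rough sketch of what "doing less work" can look like. The problem (does any pair in a list sum to a target?) and both function names are made up for illustration; they aren't from the article.

```python
def has_pair_with_sum_naive(values, target):
    # Hypothetical baseline: compares every pair, roughly n*n/2 additions.
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] + values[j] == target:
                return True
    return False


def has_pair_with_sum_less_work(values, target):
    # Same answer with one pass: remember what we've seen and look up
    # the complement, so roughly n set lookups instead of n*n/2 additions.
    seen = set()
    for v in values:
        if target - v in seen:
            return True
        seen.add(v)
    return False
```

<p>Both return the same answers; the second just does far fewer operations on large inputs, which is usually a bigger win than hand-tuning the inner loop of the first. Whether it matters at all is still question #1.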