HP partnered with Intel to bring HP's PlayDoh VLIW architecture to market, because HP could not afford to keep investing in new leading-edge fabs. Compaq/DEC similarly killed Alpha shortly before being acquired by HP, because Compaq could not afford its own new leading-edge fab either. SGI spun off its MIPS division and switched to Itanium for the same reason: fabs were getting too expensive for low-volume parts. The business attraction wasn't Itanium's novel architecture. It was the prospect of using the highest-volume, most profitable fab lines in the world. But ironically, Itanium never worked well enough to sell in enough volume to pay its way in either fab investment or design teams.<p>The entire Itanium saga was based on the theory that dynamic instruction scheduling via out-of-order (OOO) hardware could not be scaled up to high IPC at high clock rates. Lots of academic papers said so. VLIW was sold as a path to high IPC with short pipelines, fast cycle times, and less circuit area. But Intel's own x86 designers then showed that OOO would indeed work well in practice, better than the papers said. It just took huge design teams and very high circuit density, which the x86 product line could afford. That success doomed the Itanium product line, all by itself.<p>Intel did not want its future to lie with an extended x86 architecture shared with AMD. It wanted a monopoly. It wanted a proprietary, patented, complicated architecture that no one could copy, and whose software no one could even retarget. That x86-successor architecture could not be yet another RISC, because RISC programs are too easy to retarget to another assembly language. So it had to go way beyond RISC, and every extra gimmick, like rotating register files, counted as a good thing rather than a hindrance to clock speeds, pipelines, and compilers.<p>HP's PlayDoh architecture came from HP Labs, as had the very successful PA-RISC before it. But the people involved were all different. 
And they could make their own reputations only by doing something very different from PA-RISC. They sold HP management on this adventure without proving that it would work for business and other non-numerical workloads.<p>VLIW had worked brilliantly in numerical applications like Floating Point Systems' vector coprocessor. Those workloads had very long loop counts, very predictable latencies, and software written by a very few people. VLIW continues to thrive today in the DSP units inside all cell phone SoCs. Josh Fisher thought his compiler techniques could extract reliable instruction-level parallelism from normal software with short-running loops, dynamically changing branch probabilities, and unpredictable cache misses. Fisher was wrong. OOO was the technically best answer to all that, and it was upward compatible with massive amounts of existing software.<p>Intel planned to reserve the high-margin 64-bit server market for Itanium, so it deliberately held back its x86 team from going to market with their completed 64-bit extensions. AMD did not hold back, so Intel lost control of the market it intended for Itanium.<p>Itanium chips were targeted only at high-end systems needing lots of ILP. There was no economic way to make chips with less ILP (or much more ILP), so there were no Itanium chips cheap and low-power enough to be packaged as development boxes for individual open-source programmers like Torvalds. Itanium was only going to reach the market via top-down corporate edicts, not bottom-up improvements.<p>The first-gen Itanium chip, Merced, included a modest processor for directly executing 32-bit x86 code. This ran much slower than Intel's contemporary cheap x86 chips, so no one wanted that migration route. It also ran slower than software binary translation of x86 code to native Itanium code. So the x86 hardware was dropped from later Itanium chips. Itanium had to make it on its own via its own native-built software. The large base of x86 software was of no help. 
In contrast, DEC designed Alpha and migration tools so that Alpha could efficiently run VAX object code at higher speeds than on any VAX.