For software people who find this interesting and would like a great introduction, consider taking the course "Computation Structures" from MITx on edX.<p>It's in three parts. The next run of Part I, "Digital Circuits", starts September 6th. Here's the link: <a href="https://www.edx.org/course/computation-structures-part-1-digital-mitx-6-004-1x-0" rel="nofollow">https://www.edx.org/course/computation-structures-part-1-dig...</a><p>Part I covers basic logic gates, at both a high level (how you use them) and a lower level (how you build them on a chip), their characteristics (propagation delay, contamination delay, and things like that), combinational logic, sequential logic, state machines, pipelining, and probably more that I don't remember. The labs are done via a browser-based simulator, and by the end of the course you will have designed and implemented a 32-bit ALU with add, subtract, logical and arithmetic shifts in both directions by up to 31 bits, all the boolean operators, and the usual comparison operators.<p>Part II builds on that, taking you through designing and implementing a full 32-bit processor. Caching is discussed in the lectures, but not used in the processor.<p>Part III (which I did not have time to take), I believe, adds caching and pipelining to the processor, and covers parallel processing, device handling, and some operating system topics.<p>For a while I wanted to actually build the processor from Part II using discrete 7400 series logic, but the chip count came in too high for me. My gate counts were: 295 AND2, 8 AND3, 3 NOR2, r OR2, 96 OR3, 20 OR4, 226 XOR2, 6 NOT, 563 MUX2, 161 MUX4. (That's not counting whatever I'd need for the control ROM and the 32 x 32-bit register file.)<p>At 4 MUX2s per chip and 4 AND2s per chip, the MUX2s and AND2s alone come to 215 chips. Another 81 for the MUX4s and 57 for the XOR2s brings it up to 353.
Without even tossing in the rest, I'm way over my limit.<p>I could cut this down quite a bit by taking out the shift unit (which uses 353 MUX2s), making the shift instructions generate an illegal instruction trap, and having the trap handler emulate them. That would save 88 chips. (Well, not quite 88 chips... I think I'd have to add a "logical right shift by 1" instruction so that this approach would not be too slow, but a dedicated "logical right shift by 1" unit is a lot simpler than a "shift logical or arithmetic in any direction by any amount" unit.)<p>The cool thing, though, is that <i>I</i>, a software guy, <i>could</i> actually make those hardware changes now. A lot of things about computers make a lot more sense to me now. I highly recommend the course to anyone curious about what goes on at a lower level than we software guys normally deal with (even if we are writing software that interfaces with devices).
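<p>For anyone who wants to check the chip arithmetic above, here is a rough sketch in Python. The 4-per-chip figures for MUX2 and AND2 are the ones stated above; the 2 MUX4s per chip and 4 XOR2s per chip are assumptions I'm inferring from the quoted totals (81 and 57), and the names in the code are just illustrative. Only the four gate types that were actually totalled are included.

```python
from math import ceil

# Gate counts quoted above for the four gate types that were totalled.
GATE_COUNTS = {"MUX2": 563, "AND2": 295, "MUX4": 161, "XOR2": 226}

# Gates per package. 4 per chip for MUX2 and AND2 is stated above;
# 2 MUX4 and 4 XOR2 per chip are assumptions that reproduce the 81 and 57.
GATES_PER_CHIP = {"MUX2": 4, "AND2": 4, "MUX4": 2, "XOR2": 4}


def chips_needed(gates, per_chip):
    """Whole chips needed; a partly used chip still counts as one chip."""
    return ceil(gates / per_chip)


def chip_totals(counts, per_chip=GATES_PER_CHIP):
    """Map each gate type to the number of chips it requires."""
    return {gate: chips_needed(n, per_chip[gate]) for gate, n in counts.items()}


totals = chip_totals(GATE_COUNTS)
mux2_and2 = totals["MUX2"] + totals["AND2"]
grand_total = sum(totals.values())

# Dropping the shift unit removes 353 MUX2s from the design.
SHIFTER_MUX2 = 353
without_shifter = dict(GATE_COUNTS, MUX2=GATE_COUNTS["MUX2"] - SHIFTER_MUX2)
chips_saved = grand_total - sum(chip_totals(without_shifter).values())

print(totals)
print("MUX2 + AND2 chips:", mux2_and2)
print("total for these four gate types:", grand_total)
print("chips saved by removing the shifter:", chips_saved)
```

Rounding up per gate type matters here: 563 MUX2s is 140.75 chips' worth, which means 141 physical chips. The saving is computed by re-counting the chips without the shifter rather than by dividing 353 by 4, so it reflects whole chips actually removed.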