I tried to do a clockless fully async bus interface in around 1988 in a chip I was designing at Masscomp for a fast data acquisition system. Never got built, but it was fun trying, and it would've been really fast. "Lower design complexity" though: hahaha! Nope.
I worked with Alain Martin at Caltech, and I always loved the idea of asynchronous circuits. When I became an FPGA engineer, I realized the big problem with both FPGAs and asynchronous logic: the tooling doesn't generalize well to other domains, so you have to be a narrow specialist to make progress.<p>If someone could convert synchronous Verilog to async circuits under the hood, they might see huge gains in speed and power use for their circuits, but that is a huge uphill climb.
It's a classic idea. There were some early asynchronous mainframes built from discrete logic. It might come back. It's an idea that comes around when you can't make the clock speed any higher.<p>It's one of those things from the department of "we can make it a little faster at the cost of much greater complexity, higher cost, and lower reliability". That's appropriate to weapons systems and auto racing.
Having a common clock reference (per core) is essential for reducing latency between components. If you have to poll or await some other component arbitrarily, there will necessarily be extra overhead and delays in these areas. There will also need to be extra logic area dedicated to these activities. Make no mistake, just because there's no central clock doesn't mean you are magically off the hook. You still need to logically serialize the instruction stream(s).<p>Even for low power applications, you would probably use less battery getting the work done quickly in a clocked CPU and then falling back to a lower power state ASAP. Allowing the pipeline effects to take hold in a modern clocked CPU should quickly offset any relative overhead. Heterogeneous compute architecture is also an excellent and proven approach.<p>Certainly, there are many things that happen in a CPU that should not necessarily be bound by a synchronous clock domain (e.g. a ripple adder). But, for these areas where an async CPU is a clear win, would we actually see any gains in practice using real software? Feels like there are a lot of other strategic factors that wash out any specific wins.
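To put a rough number on the ripple adder example: a clocked adder has to budget for the carry rippling across every bit, while a self-timed one only waits for the longest carry chain that actually occurs for the given operands. The sketch below is a software model, not a circuit; the function names and the uniformly random operands are assumptions for illustration, and real workloads may have very different operand statistics.

```python
import random


def longest_carry_chain(a: int, b: int, width: int) -> int:
    """Longest run of consecutive bit positions that a single carry ripples through."""
    carry = 0
    current = 0
    longest = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:
            # Generate: a carry starts here regardless of the incoming carry.
            carry = 1
            current = 1
        elif (x ^ y) and carry:
            # Propagate: the incoming carry ripples one position further.
            current += 1
        else:
            # Kill (or nothing to propagate): the chain ends.
            carry = 0
            current = 0
        longest = max(longest, current)
    return longest


def average_chain(width: int, trials: int, seed: int = 0) -> float:
    """Mean longest carry chain over uniformly random operands (an assumption)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += longest_carry_chain(rng.getrandbits(width), rng.getrandbits(width), width)
    return total / trials


if __name__ == "__main__":
    width = 64
    # Worst case: 1 + (all ones) ripples through every bit position.
    print("worst case:", longest_carry_chain(1, (1 << width) - 1, width))
    print("average over random operands:", average_chain(width, 10_000))
```

For random operands the average comes out far below the width, roughly on the order of log2 of the width, which is the gap a self-timed adder could exploit. Whether that survives completion detection and handshake overhead is exactly the open question above.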
I don't think async can make things faster, but it can make them more energy efficient, and the incentives for that are still close to none, as our economic models reward waste until all EROEI is depleted.<p>But you need to add the ability to switch things off dynamically, meaning cores on the CPU/GPU. So far the industry has solved this with big.LITTLE, but that requires all software to change, and that is going to take time that we unfortunately do not have, as hardware is closing the ownership model.
I will raise an important distinction: asynchronous logic != dynamic logic.<p>There can be dynamic synchronous logic, and vice versa.<p>Dynamic vs. static determines whether or not the circuit needs to be driven by a constant pacing input, whether an embedded clock or an external clock, in order to arrive at a settled state (to latch).<p>If you are to speak strictly, asynchronous vs. synchronous determines whether that pacing input is external, or recovered from input.
Mini-MIPS isn't <i>that</i> different from a conventional out-of-order superscalar microarchitecture. The article even says:<p><pre><code> However, the MiniMIPS pipeline structure can execute instructions out-of-order with respect to each other because instructions that take different times to execute are not artificially synchronized by a clock signal.</code></pre>
Memory cells are the thing that uses the vast majority of power in a CPU. And they are used everywhere: cache, uOP cache, BTB, etc.<p>Async CPUs solve a problem that would have marginal benefit in a metric we care about.<p>Also, I imagine they would need to be implemented assuming the worst timing delay from the process. They can't be binned like modern CPUs.
Modern clocked processors don't account for worst-case timings. Instead, instructions take a variable number of clock cycles to complete.<p>In some sense they're already asynchronous, despite being clocked.
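A small arithmetic sketch of that point, with made-up numbers: in a clocked design an operation with combinational delay d occupies ceil(d / T) cycles of period T, so what the clock costs is the rounding up to the next edge, not a global worst case. The delays and the 500 ps period below are hypothetical, the function names are illustrative only, and the model ignores the handshake overhead a real self-timed design would pay instead.

```python
def cycles_needed(delay_ps: int, period_ps: int) -> int:
    """Clock cycles an operation occupies: ceil(delay / period), in integer math."""
    return -(-delay_ps // period_ps)


def clocked_latency_ps(delay_ps: int, period_ps: int) -> int:
    """Latency once the result is rounded up to the next clock edge."""
    return cycles_needed(delay_ps, period_ps) * period_ps


if __name__ == "__main__":
    period_ps = 500  # hypothetical 2 GHz clock
    # Hypothetical operation delays in picoseconds.
    for name, delay_ps in [("add", 400), ("multiply", 1200), ("divide", 9100)]:
        latency = clocked_latency_ps(delay_ps, period_ps)
        print(name, cycles_needed(delay_ps, period_ps), "cycles,",
              latency - delay_ps, "ps lost to rounding")
```

The time lost per operation is always less than one clock period, which is why multi-cycle instructions already capture much of the "only wait as long as needed" benefit.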