The icache is interesting. One of the strong points of RISC-V is that it needs less work in the decode part of the frontend than competitors do, which can be used to cut pipeline stages. The downside is that it generally needs more code, both in number of instructions and in bytes, to do the same work. This is bad both because it demands more frontend throughput and because it makes the L1i cache less efficient.<p>This CPU addresses this by making the instruction cache very large. That certainly requires more fetch stages in the pipeline to clock high, which is offset by the simpler decode, probably giving results similar to competitors.
I get the feeling that ARM and RISC platforms are going to screw around for so long that by the time you can buy a system with one in it, Intel and AMD will have caught up on performance per watt and made them unneeded.