For tagged values, I loved the POWER rlwinm: Rotate Left Word Immediate then AND with Mask (and its companion rlwimi, Rotate Left Word Immediate then Mask Insert). Pretty much any sane tagging scheme could be unboxed with that single instruction; even somewhat exotic schemes, like mixing high-bit and low-bit tagging, could be handled by it.<p>Of course, in modern architectures being able to do something in one instruction is only tenuously related to being able to do it quickly, but it was a super handy instruction back in the day.
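To make the rotate-and-mask idea concrete, here is a minimal C sketch that emulates what rlwinm computes and uses it to unbox a value. The tagging layout (one flag in the top bit, a 2-bit tag in the low bits, a 29-bit payload in between) and the helper names `rotl32`, `ppc_mask` and `untag` are assumptions made up for illustration, not anything from a real runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n is taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* Mask with ones from bit mb through bit me, in IBM numbering
 * (bit 0 is the most significant bit). If mb > me the mask wraps. */
static uint32_t ppc_mask(unsigned mb, unsigned me) {
    assert(mb < 32 && me < 32);
    uint32_t from_mb = 0xFFFFFFFFu >> mb;          /* bits mb..31 */
    uint32_t upto_me = 0xFFFFFFFFu << (31u - me);  /* bits 0..me  */
    return (mb <= me) ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* Emulation of rlwinm rA, rS, SH, MB, ME: rotate left, then AND with mask. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
    return rotl32(rs, sh) & ppc_mask(mb, me);
}

/* Hypothetical scheme: [flag:1][payload:29][tag:2].
 * Rotating left by 30 (i.e. right by 2) moves the low tag and the high
 * flag into the top three bits, which the mask 3..31 then clears. */
static uint32_t untag(uint32_t word) {
    return rlwinm(word, 30, 3, 31);
}
```

With this layout, `untag((1u << 31) | (12345u << 2) | 0x3u)` yields the payload 12345, which is the "mixed high-bit and low-bit tagging" case handled in one rotate-and-mask step.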
It's worth noting that on systems with real cache coherency (MOESI, for example), writing data into the dcache at an address A shoots down the matching icache line as part of fetching an 'exclusive/modified' line into the dcache. In that world EXPORT.I is essentially a no-op, because what it requires the icache to implement (shootdown of icache lines) has already happened naturally.<p>Equally, on such a system the only thing left for FENCE.I to do is to flush any subsequent (potentially now bogus) instructions in the execution pipe that might have been prefetched before the writes occurred. In such a system FENCE.I and IMPORT.I are identical.<p>Hopefully the people writing this spec are listening ... please make sure your spec understands high-end systems like this and doesn't add stuff that requires special cases in systems that do ubiquitous coherency right.
Counting down to someone pointing at the annoyingly named ARM FJCVTZS instruction. The naming is obviously more about legal problems than reality, but so it goes.<p>To be very very clear: FJCVTZS does not do anything amazing, clever, or special. The problem it solves is very simple: the behaviour of double->int conversion in JS is the default x86 behaviour. Getting that behaviour on any non-x86 platform is expensive. So a more accurate name would be FXCVTZS. The implementation of FJCVTZS in a CPU is also not expensive; it simply requires passing a specific rounding mode to the FPU for the integer conversion (overriding the default/current global mode), and matching the x86 OOB result.<p>(Also, I really wish people would stop posting links to GitHub repos unless the repos have the actual readable spec available or linked, rather than the unbuilt markup version. It just makes reading them annoying.)
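For reference, the double->int conversion that ECMAScript specifies (ToInt32) truncates toward zero, maps NaN and infinities to 0, and wraps the result modulo 2^32. The following is a minimal C sketch of those semantics working directly on the IEEE 754 bit pattern; the function name `js_to_int32` is made up here, and the code says nothing about how any particular CPU or engine implements it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "expects 64-bit IEEE 754 doubles");

/* Sketch of ECMAScript ToInt32: truncate toward zero, wrap modulo 2^32. */
static int32_t js_to_int32(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);

    int biased = (int)((bits >> 52) & 0x7FF);
    if (biased == 0x7FF) return 0;            /* NaN or +/-Infinity */

    /* Treat the value as sig * 2^exp with a 53-bit integer significand. */
    int exp = biased - 1075;
    uint64_t sig = (bits & ((UINT64_C(1) << 52) - 1)) | (UINT64_C(1) << 52);

    uint32_t low;
    if (exp < -52) {
        low = 0;                              /* |d| < 1, zero, or subnormal */
    } else if (exp < 0) {
        low = (uint32_t)(sig >> -exp);        /* drops the fraction: truncation */
    } else if (exp < 32) {
        low = (uint32_t)(sig << exp);         /* high bits fall off: mod 2^32 */
    } else {
        low = 0;                              /* multiple of 2^32 */
    }

    if (bits >> 63) low = 0u - low;           /* negate modulo 2^32 */

    /* Reinterpret the 32-bit pattern as two's complement without UB. */
    return (low >= 0x80000000u) ? (int32_t)((int64_t)low - INT64_C(4294967296))
                                : (int32_t)low;
}
```

For example, `js_to_int32(3.7)` gives 3, `js_to_int32(-3.7)` gives -3, and `js_to_int32(4294967297.0)` wraps around to 1.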
There's a document in there about pointer masking: <a href="https://github.com/riscv/riscv-j-extension/blob/master/pointer-masking-proposal.adoc" rel="nofollow">https://github.com/riscv/riscv-j-extension/blob/master/point...</a><p>It seems like the objective of this is to implement different access privileges... but why do you need specialized instructions for this? This is typically done by the OS and memory protection. Is the pointer masking extension meant to provide multiple levels of privilege within a single process? I'm assuming that this is to protect the JIT from a JITted program? Except it's not completely safe, because there might still be bugs in the JIT that could allow messing with the pointer tags. I'm struggling to think of a real use case.
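As a rough illustration of the general idea of pointer masking (not the linked proposal's exact semantics), the sketch below keeps a tag in the upper bits of a 64-bit address and strips it in software before use. The 8-bit tag width, the 56-bit address assumption and the names `with_tag`, `tag_of` and `strip_tag` are all hypothetical; hardware support would, in effect, apply the masking step implicitly on each memory access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: top 8 bits hold a tag, low 56 bits hold the address. */
#define TAG_SHIFT 56
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Attach a tag to an address whose upper bits are assumed to be zero. */
static uint64_t with_tag(uint64_t addr, uint8_t tag) {
    assert((addr & ~ADDR_MASK) == 0);
    return addr | ((uint64_t)tag << TAG_SHIFT);
}

/* Read the tag back out of a tagged address. */
static uint8_t tag_of(uint64_t tagged) {
    return (uint8_t)(tagged >> TAG_SHIFT);
}

/* The masking step: what software must do by hand before every
 * dereference when the hardware does not ignore the tag bits. */
static uint64_t strip_tag(uint64_t tagged) {
    return tagged & ADDR_MASK;
}
```

Without hardware masking, a runtime that tags pointers this way pays for a `strip_tag` before every load and store, which is one commonly cited motivation for having the hardware ignore those bits.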