This is neat, but the analysis of their work leaves a bit to be desired. You can't just randomly select instructions and see if you did a good job, because the instruction space is not uniform on any axis that people care about. For example, on a hypothetical ISA where most of the encoding space is simple arithmetic ops, you can get "good" coverage really easily. But that's not very useful, because the instructions people care about when doing this kind of analysis are specific, usually more esoteric, and difficult to analyze with a simple bitstring approximation. This approach definitely cannot discover the semantics of syscall or rdrand. The authors claim they would have been able to discover Reptar if they extended their work slightly, but I think it is pretty dubious that their methodology is powerful enough to do so.
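
To make the coverage point concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the 8-bit toy ISA, the opcode ranges, and the helper names `classify`, `sample_coverage`, and `miss_probability` are all hypothetical and have nothing to do with the authors' tooling or any real architecture.

```python
import random

# Hypothetical toy ISA with 8-bit opcodes (NOT a real architecture).
# 250 of the 256 encodings are plain arithmetic; the interesting
# instructions occupy only a handful of encodings.
def classify(opcode: int) -> str:
    if opcode < 250:
        return "arith"
    if opcode < 254:
        return "memory"
    if opcode == 254:
        return "syscall"
    return "rdrand"

def sample_coverage(n_samples: int, seed: int = 0):
    """Draw opcodes uniformly at random; report the fraction of distinct
    encodings hit and which instruction classes were seen."""
    rng = random.Random(seed)
    seen = {rng.randrange(256) for _ in range(n_samples)}
    return len(seen) / 256, {classify(op) for op in seen}

def miss_probability(n_samples: int, n_encodings: int = 1, space: int = 256) -> float:
    """Probability that uniform sampling never hits a class that
    occupies n_encodings out of `space` encodings."""
    return (1 - n_encodings / space) ** n_samples

if __name__ == "__main__":
    cov, classes = sample_coverage(100)
    print(f"encoding coverage: {cov:.1%}, classes seen: {sorted(classes)}")
    print(f"P(never sampling syscall in 100 draws): {miss_probability(100):.2f}")
```

With these made-up numbers, arithmetic ops are about 98% of the encoding space, and a single rare encoding still has roughly a two-in-three chance of never being drawn in 100 uniform samples. And even when it is drawn, hitting the encoding says nothing about whether the analysis understood what the instruction does.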