I think there will be some argument in the comments about whether any LLM can be considered unusable for this test.<p>That is, is there any current LLM where you could prompt it with something like an interview question and it would give you perfect code one shot or zero shot?<p>It might be better to host an LLM and craft the questions such that you know that the LLM will screw it up. Or, since you're self-hosting, you can have the system prompt, say insert subtle bugs that aren't syntax errors. I'll have to test if that actually does anything.