Recently, I've found that candidates in my company's hiring pipeline are using ChatGPT to solve the coding test we use for screening.<p>While the coding test isn't necessarily complex, it is still hard to finish within the time limit I provide. This means that candidates who are genuinely attempting it often submit code with some tests failing. However, some candidates submit solutions with all tests passing, which would be pretty tricky (but not impossible) in the allocated time. Sometimes there's a tell, like a submission time too fast to be plausible, but it isn't always possible to spot.<p>I am using a code-screening test platform (CodeSubmit). Despite having challenging problems in its library, it is not ChatGPT-proof. I'm also amazed at how well ChatGPT solves the problems (well, most of them). It seems able to follow complex instructions and adjust the program to pass unit tests.<p>Are you seeing the same problems in your hiring pipeline?<p>How can I ChatGPT-proof my screening tests and take-home assignments?
We do a simple offline idiot test to see if they know how to rub two variables together (10 minutes of effort for a junior, say), then we go into a face-to-face interview where we give them a laptop with a blank project (no frameworks, no extra packages) and we all work through a moderately difficult logic problem (30-40 mins) together on a big screen. This gives us far more insight into their problem-solving skills, their ability to work with others, and their knowledge of comp-sci and language fundamentals (the ones that count; no implementing a B-tree from scratch or some such nonsense). They still have access to online resources such as Stack Overflow, because that's part of problem solving and most programmers' workflow. We have had good results, and have employed some bad-on-paper candidates who turned out to be great programmers, and turned away some candidates who had great CVs but turned out not to be great problem solvers or team players.<p>We've found that large take-home tasks disadvantage certain candidates who don't have the time or patience (tech-interview fatigue, for example). They also don't really weed out candidates who will hack around on a 4-hour task for 20 hours until they cobble together something passable, or - as in your case - feed the whole assignment to an LLM. There's no substitute for hands-on experience with the candidate, but you need to be willing to put senior resources with them for that hour or two.
My question would be: why are you administering a test that you expect people to fail? What do you hope to learn when you are intentionally testing them in a way that you yourself admit is hard to finish within the time limit?<p>Is this an example of your company culture, that you set deadlines that are impossible to meet within the expected working hours? Because that seems like a toxic workplace culture, and it seems like you have a toxic hiring practice. So I think your question about how to find cheaters using ChatGPT to do your ridiculous test is a bad question.<p>You're abusing a candidate by giving them an impossible task, and they've simply found a better way to solve it. Maybe you should be fired and they should be hired.<p>I would never want to work for a company or interact with a person who intentionally sets me up to fail. Maybe you can let us know the name of your company so we can all avoid it.
Why give them a time limit at all?<p>Let them solve the problem, then discuss their solution with them. If they know what they did, why they did it, and what the trade-offs were, then what else do you need?
ChatGPT is simply another tool in a modern developer's arsenal. I'd be looking for a test that evaluates how well a candidate can utilize ChatGPT but can't be solved completely by ChatGPT. Prompt engineering is an important skill to possess these days!