My university checks for possible cheating on programming assignments by feeding the submissions to a program that reports the probability that two programs contain code copied from each other. How does such a program work, in essence?
It may use something like Kolmogorov complexity, approximated with compression: <a href="https://en.wikipedia.org/wiki/Kolmogorov_complexity#Compression" rel="nofollow">https://en.wikipedia.org/wiki/Kolmogorov_complexity#Compress...</a><p>...where symbols and variable names are matched after all extra whitespace has been removed.<p>Although I'm guessing your university uses PMD: <a href="http://pmd.sourceforge.net" rel="nofollow">http://pmd.sourceforge.net</a> It's a fairly popular tool for detecting code issues, including copied segments (through its copy/paste detector, CPD).
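To make the compression idea concrete, here is a minimal sketch of a normalized compression distance, a common practical stand-in for Kolmogorov complexity. It is not how PMD or any particular university tool works; the function names, the whitespace normalization, and the choice of zlib are all assumptions made for illustration.

```python
import re
import zlib


def normalize(source: str) -> bytes:
    """Collapse every run of whitespace to one space (assumed preprocessing)."""
    return re.sub(r"\s+", " ", source).strip().encode("utf-8")


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data, used as a complexity estimate."""
    return len(zlib.compress(data, 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: small for similar inputs, near 1 for unrelated ones."""
    x, y = normalize(a), normalize(b)
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # shared content compresses well when concatenated
    return (cxy - min(cx, cy)) / max(cx, cy)
```

A low `ncd` value for two submissions would only be a hint worth a manual look. On short programs the compressor's fixed overhead makes the number noisy.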
Take a look at this:<p><a href="http://theory.stanford.edu/~aiken/moss/" rel="nofollow">http://theory.stanford.edu/~aiken/moss/</a><p>Specifically:<p><a href="http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf" rel="nofollow">http://theory.stanford.edu/~aiken/publications/papers/sigmod...</a>
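The paper linked above describes winnowing, a document fingerprinting scheme: hash every k-character substring, slide a window over the hashes, and keep the minimum from each window. Below is a simplified sketch of that idea, not the MOSS implementation. The values of `k` and `w`, the use of CRC32, the whitespace stripping, and the Jaccard score are assumptions, and the fingerprint positions that the paper also records are left out.

```python
import zlib


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every k-character substring of text (CRC32 is an arbitrary choice)."""
    return [zlib.crc32(text[i:i + k].encode("utf-8"))
            for i in range(len(text) - k + 1)]


def winnow(source: str, k: int = 5, w: int = 4) -> set[int]:
    """Keep the minimum hash from each window of w consecutive k-gram hashes."""
    text = "".join(source.split()).lower()  # assumed preprocessing: no whitespace, no case
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()
    fingerprints = set()
    for start in range(max(1, len(hashes) - w + 1)):
        fingerprints.add(min(hashes[start:start + w]))
    return fingerprints


def similarity(a: str, b: str, k: int = 5, w: int = 4) -> float:
    """Jaccard overlap of the two fingerprint sets (an assumed scoring rule)."""
    fa, fb = winnow(a, k, w), winnow(b, k, w)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

With this selection rule, any shared run of at least `w + k - 1` characters (after preprocessing) contributes at least one common fingerprint, which is why small edits elsewhere in a file do not hide a copied block.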
I'm no expert in this, but I suppose it works much like plagiarism checking for essays.<p>They usually compare each submission with the other submissions, with past submissions, and with similar references on the internet, in case a match turns up on some website.
It's probably a combination of runtime analysis and code structure analysis.<p>Submissions with similar runtimes are likely to use the same algorithm. If most students use the standard approach, but a group of five students all, by "chance," pick an obscure dumb way (or a really obscure fast way), then they may be copying. That would trigger a manual investigation.<p>Look into how JavaScript minifiers/obfuscators work. The checker could be parsing your code down to a form where your variable names and comments don't matter at all, and then just looking for coinciding program structure.<p>Now, your administration is presumably smart enough not to flag code where there's only one way to do it (say, "find a member of a list"), but if 10 people copy the same wrong way to do it, that would be a flag to investigate further.
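As a rough illustration of comparing structure while ignoring names and comments, here is a sketch that handles Python source only. It uses the standard `tokenize` module to replace identifiers, numbers, and strings with placeholders and then scores the two token streams with `difflib`. The function names and the placeholder scheme are made up for this example, and a real checker would likely work on a proper parse tree for the course's language.

```python
import difflib
import io
import keyword
import tokenize

DROPPED = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}
LAYOUT = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}


def structure_tokens(source: str) -> list[str]:
    """Reduce Python source to tokens that ignore names, literals and comments."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in DROPPED:
            continue
        if tok.type in LAYOUT:
            out.append(tokenize.tok_name[tok.type])  # keep block structure
        elif tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")  # every identifier looks the same
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")
        elif tok.type == tokenize.STRING:
            out.append("STR")
        else:
            out.append(tok.string)  # keywords and operators are kept verbatim
    return out


def structural_similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of matching tokens between the two normalized streams."""
    matcher = difflib.SequenceMatcher(
        None, structure_tokens(a), structure_tokens(b), autojunk=False)
    return matcher.ratio()
```

Two submissions that differ only in variable names and comments score 1.0 here, so a high score is a reason for a human to look, not proof of copying. Short exercises with one obvious solution will also score high.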