Why we still can't stop plagiarism in undergraduate computer science

51 points by kevinchen about 7 years ago

28 comments

macintux about 7 years ago
I was teaching the lab portion of CS 101 (don't remember the actual course number off-hand) when I discovered that two students had the same remarkable code that shouldn't have worked but did.

We were using C, and instead of using globals or parameters, each function declared the same local variables in the same order. The stack, then, remained sufficiently consistent that each function had access to the values it needed.

When I confronted the two of them about plagiarism (and explained what they had done wrong) their defense was that they were working on the problem together, and thus had made the same mistake out of ignorance.

And frankly, it made perfect sense. I could easily see myself doing something like that.

I guess my point is that, for at least some small portion of the problem space, plagiarism isn't really plagiarism.
thatswrong0 about 7 years ago
Here's the easiest solution: stop grading homework

Why judge student performance on something that they are using _to learn_? It doesn't make any sense.

Every student is basically competing with one another to get the highest GPA possible - if you're going to give them cookie cutter homework with solutions that can be easily searched for on the internet _and_ can only bring their GPA down, then they're going to cheat. Plain and simple.

Give them homework and "grade it" to give them feedback, sure, but don't make it count.. that is, if the goal is to have students learn.
mjw1007 about 7 years ago
Maybe I'm being small-minded, but I strongly dislike using the word 'plagiarism' to refer to cheating on your homework by copying someone else.

To me, plagiarism is taking credit for someone else's ideas at their expense; it's a "sin" against the person being copied.

Copying someone else with their connivance, or paying some essay-mill writer to do your work for you, should be in the same category as taking a calculator into a mental-arithmetic test, not the same category as « My name in Dnepropetrovsk is cursed, when he finds out I publish first ».
ahelwer about 7 years ago
Heh. I wanted to store my old university coursework somewhere, and GitHub seemed as good a place as any. Because I'm cheap and don't want to pay for a private repo, there it sits in all its public glory. Few years back I got an irate email from a professor that students were copying a program I wrote for a SPARC assembly course which has remained unchanged for, like, two decades. So, to some degree the whole plagiarism thing is due to professorial laziness.
paxys about 7 years ago
I have a big problem with the general theme of this article, which is that plagiarism detection software is infallible and every student who disagrees with its findings is wrong and dishonest.

You claim

> We have virtually eliminated false positives at this point

but offer no explanation for how you verify this.

You later rant about the fact that students have the audacity to challenge these (very serious) charges and the university actually expects you to follow up when they do. The horror!

IMO it's your system of senseless programming exercises and automated grading that is broken. Instructors need to put in the time and effort and assign homework where students have to actually think and be creative, rather than reuse the same assignments for the 10th year running and be shocked when submissions turn out to be similar.
Scaevolus about 7 years ago
> "Then, we apply another filter, keeping only the cases that contain indisputable evidence — for example, hundreds of lines copied right down to the last whitespace error."

Sounds like they don't want to deal with plagiarism, if you can avoid it by simply making your copying "disputable".

MOSS is clever-- instead of doing direct textual comparison, you compare *streams of tokens*. This means that even if a student reformats the whitespace or renames all the variables (a common obfuscation technique), the same stream of "TOKEN ASSIGN_OP TOKEN LPAREN TOKEN COMMA TOKEN RPAREN" will exist. TMOSS extends this to snapshots of code as a student develops it, which is apparently 2x more effective!

This author also delicately avoids the *cultural* side to plagiarism-- many students come from backgrounds where "group work" is common, and passing classes is a communal effort, *including homework*. It's an unfortunately common mistake to think the grade is what matters, not the fundamental skill development.
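To make the token-stream idea in the comment above concrete, here is a minimal sketch in Python. It is not MOSS's actual implementation (this sketch only normalizes tokens and compares the resulting streams); the function name `token_stream` and the placeholder labels `ID`, `NUM` and `STR` are assumptions made for illustration.

```python
import io
import keyword
import tokenize

# Token types ignored here because they only reflect layout or comments.
_SKIP = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}

def token_stream(source: str) -> list[str]:
    """Reduce Python source to token kinds, hiding identifier names and literals."""
    stream = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME:
            # Keywords keep their text; every other name collapses to "ID".
            stream.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type == tokenize.NUMBER:
            stream.append("NUM")
        elif tok.type == tokenize.STRING:
            stream.append("STR")
        else:
            stream.append(tok.string)  # operators and punctuation stay as-is
    return stream

original = "total = add(price, tax)\n"
renamed = "t   =   plus( p ,q )\n"   # renamed variables, different spacing
print(token_stream(original) == token_stream(renamed))
```

Renaming and reformatting leave the normalized stream unchanged, which is the property the comment describes. A real detector would still need a way to find partial overlaps between long streams, which this sketch does not attempt.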
uberman about 7 years ago
Almost every aspect of our discipline encourages open source code sharing and code reuse. This is a discipline-wide mindset.

In fact, "build it yourself from scratch" is an anti-pattern in my opinion.

I'm not condoning cheating, but why would one not expect this to be the default behavior?

As others have suggested, there are much easier "solutions" related to logging keystrokes and commits should you *really* want to catch and punish this behavior.
kangnkodos about 7 years ago
After implementing plagiarism detection campus-wide, design a process which is very time consuming on the students' part, and not as time consuming on staff.

Maybe something like: "Your homework was flagged as possible plagiarism. Report to this lab at this time, and code one additional problem which should be easy to anyone who understood the original homework."

Anyone who really did the homework will be in and out within five minutes. If you can't finish in an hour, you get a zero on that one homework assignment, not expelled.

That flips the incentives. Also, reducing the punishment cuts the drama of people arguing that the software is not 100% accurate.
wdewind about 7 years ago
Here's an idea: why not make the assignments personal enough that you cannot cheat on them?

"But wait, that would require huge amounts of time investment from the professors/TAs"

Yes, it's almost as if paying $65k a year for someone to teach you something should result in that person teaching you that thing instead of just checking in to see if you've learned it on your own.
stale2002 about 7 years ago
One thing that I think people don't talk about enough on this topic is the wildly different plagiarism guidelines between different classes.

I did both CS and economics in college. And in my CS classes, even discussing the homework with classmates was often "against the rules".

But in my business and economics classes, me and my classmates would regularly work together on the homework by straight up assigning certain problems to certain people and then copying from each other.

This was not only allowed by the professor, it was explicitly ENCOURAGED!

They understood that if you talked to classmates, you will be able to understand things better, instead of struggling and failing to do stuff on your own.

And with such wildly differing guidelines for different classes, things were often confusing to students.

One potential solution to "cheating" is to explicitly allow it, such that everyone is on the same playing field.

What matters, at the end of the day, is that the students learn the material.
waqf about 7 years ago
The article doesn't seem to even consider the possibility of assessing students in some other way than through standardized homeworks which are easily copied.

For example, individual projects where everyone in the class is working on something different; or at the other extreme, proctored exams.

(Of course, neither of these systems is entirely free from cheating, but the barrier is higher.)
nkrisc about 7 years ago
What I find fascinating about this problem is students paying all that money in order to deliberately avoid learning anything.
gumby about 7 years ago
This essay is mostly about culture, but it does mention an anti-plagiarism program (which sounds pretty hard to do except in very trivial cases, but who knows?)

There's another tool: the repo. My son was accused of plagiarism in his last year of high school. It could have been a "he said/he said" case -- in fact it started that way -- until I pointed out that if he believed he was in the right he had a record that could be checked.

The CS teacher had to explain to the principal why the repo proved who had copied whom (and left me wondering why the teacher hadn't looked there first????) which wasn't easy because the plagiarist's parents were big donors to the school. So in the end, despite what it says in the school handbook, the only penalty was a 0 on the assignment.

But a good lesson for my kid on both programming and the sociopathologies of organizations.
fr0sty about 7 years ago
Requiring students to submit their VCS history along with the finished project would at least up the cost to the students for copy and pasting.

They even hint at that sort of solution in the piece by mentioning cosmetic changes to the files at the last minute.
eecsninja about 7 years ago
Engineering classes should really switch from being homework-based to being project-based. Even something as simple as small coding projects that can be done in a week.

Then, the final project would be such that you'd have to explain your code, either in person with a TA, or by writing documentation for it.

We really need to move on from this academic mindset of homework, grades, and plagiarism toward something that is actually reflective of the world outside of academia. The concept of plagiarism doesn't really exist in the software industry -- it's a matter of what you can get done.
jessaustin about 7 years ago
TFA seems a bit at odds with itself. One reason academic boards don't care too much about CS plagiarism complaints is that CS generates so very many of them, compared to other fields. The reason isn't that CS students are degenerates (although they may be anyway), instead it's because it is so much easier to check for plagiarism in CS. So, sure closed-source is bad and definitely we can always use more TAs, but the problem is clearly not "we're only punishing 10% of our students while we should be punishing 40%!"

The problem isn't with CS at all, but rather with USA colleges in general. Indeed the only professor I've read who seems to even notice the problem is Harry Lewis. Most subjects should be taught very differently than they are taught. USA university education makes a great deal of unnecessary and counterproductive work for students and professors. The busywork threatens to drive out real academic work.

The reason for this is so that more such work might be created for administrators, who must multiply inexorably to absorb the ridiculous amounts of money that our ridiculous system of student debt generates. In fact it will be no surprise if some schools eventually do hire enough administrators to suspend 40% of every CS course every semester. One hopes that the professors who could restore some of the quality that universities used to possess, will realize by then that they can restore that and should restore that.
bunderbunder about 7 years ago
Instead of coming up with punitive solutions, I wonder what can be done to re-structure computer science education in ways that move the incentive structure away from one that encourages plagiarism? Bonus points if it improves the quality of the education, too.

For example: What if we move toward a more seminar-style approach of having students discuss and critique each others' code on larger projects?

This might not get rid of all copy/pasting, but it would create a huge incentive for students to at least understand how their code works, in order to avoid embarrassment in front of their classmates. And, should two kids copy/paste the same code, and that becomes apparent in the course of a peer review session, well, that's an event that everyone will remember. No need for the instructor to make themself the bad guy in the process, either.

It also has the side benefit of giving students experience with code review, with reading and understanding others' code, and maybe helps them start to develop a sense for how to write clean, readable code several years before they start getting bludgeoned by senior devs at their first full-time job.

As for smaller problem set type homework, why not give them group work? It doesn't necessarily need to be graded, aside from credit/no credit, if you're worried about giving A's to duffers. I had a few classes that did that back when I was in school, and I really liked it. I felt like I learned faster, both from working together with classmates and because the format allowed them to give us more challenging problem sets.
ofcx about 7 years ago
I was accused once in my undergrad and I thought it highlighted an interesting issue.

It was my senior year, and I attended a systems programming course that was being piloted and was very challenging. Work in the CS department was very group heavy, especially in courses heavy on theory. I benefited a ton from working with groups with other students outside of class. In your data structures/theory/math courses, this wasn't an issue - But in this class in particular, people's submissions started to look similar.

It was resolved rather quickly because we just had to be honest, but I thought it was interesting - Specifically because, in classes that were so challenging that heavy collaboration was what pulled me through, I barely remember the course content anymore. But, the soft skills I acquired from hours of collaborating with my peers after class hours have followed me for life and made a noticeable impact on my career.
jancsika about 7 years ago
> Finally, as educators, we also hope that the accused student can learn difficult lessons about ethical behavior in the classroom rather than the workplace.

Suppose that technique X can actually deter students from cheating 100% of the time.

So we apply technique X to intro class Z that has 300 students.

Now we have an intro class of 300 non-cheating students who sit quietly and listen to an instructor for an hour a week.

Then those non-cheating students sign in for a more reasonably sized class section of 40 to sit quietly and listen to a graduate student for an hour.

Finally, these non-cheating students take tests and do assignments written in such a way that the amount of grading time does not put the graduate students over the weekly allotted work time for their TA-ship in their particular program.

Ballpark-- by what percentage would one say the quality of the learning environment has improved by employing technique X?
matthewbauer about 7 years ago
This seems like something accreditation orgs like ABET should be more worried about. If students are cheating their way to degrees, that hurts everyone with a CS degree. Professors can't really do much if their uni doesn't care.
crawfordcomeaux about 7 years ago
College isn't about education, but about signaling your value as a contributor to capitalism.

Otherwise, we'd be promoting collaborative learning and letting those who don't contribute or cheat simply cheat themselves.
jefflinwood about 7 years ago
I co-teach an upper-level undergraduate class where the students create independent programming projects - the computer science students couldn't copy anyone else in the class's code even if they wanted to, as it wouldn't make any sense for their context.

Perhaps the solution is to be more creative with how computer science education is taught? If the students are copying homework problems they don't understand, they're not going to do well on the projects or exams that might be part of the rest of their grade.
sampo about 7 years ago
Maybe the party that detects and decides on consequences for plagiarism should be a separate entity in the university. Like the internal affairs division in police departments that ordinary police officers hate so much in movies and tv-series. They would be an "external enemy", so the teaching staff would not have to suffer from friction with their students in these unpleasant matters, and also the consequences would be out of hands of the teachers.
piracy1 about 7 years ago
I wonder how much of the 'plagiarism' is just people copying the same StackOverflow snippet.
zombieprocesses about 7 years ago
Because the source for most generic programming assignments is already online?

Why not skew the grading more heavily towards in-class midterms and finals?

Or you could generate individualized hws for each student, but that may not be feasible in a 500 student intro to cs class.
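As a rough illustration of the "individualized hws" idea in the comment above, per-student variants can be derived mechanically, which keeps the cost flat as the class grows. This is only a sketch under stated assumptions: the function `personalized_params` and the parameter names `array_size` and `modulus` are hypothetical, not taken from any course or tool mentioned in the thread.

```python
import hashlib
import random

def personalized_params(student_id: str, assignment: str) -> dict[str, int]:
    """Derive reproducible, per-student problem parameters (hypothetical scheme)."""
    # Hash the assignment name and student ID so the seed is stable across runs.
    digest = hashlib.sha256(f"{assignment}:{student_id}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return {
        "array_size": rng.randint(10, 50),         # hypothetical problem knob
        "modulus": rng.choice([7, 11, 13, 17]),    # hypothetical problem knob
    }

print(personalized_params("student-001", "hw3"))
```

Because the parameters depend only on the two strings, a grader can regenerate any student's variant without storing it. Different parameters do not make copied program structure undetectable or impossible, so this changes the expected answer rather than removing the incentive to copy.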
ggm about 7 years ago
the other side of the coin is the group assignment where three of the five do all the work and all five live or die on the benefit.

oddly, post degree, we're actively encouraged to re-use code.
xenihn about 7 years ago
It's a problem in MS programs too.
dd367 about 7 years ago
I was a TA for a graduate level class at one of the top universities in the US and I've had some interesting encounters with plagiarism.

I. The time I got caught for "plagiarizing". In an intro systems class, me, a CS major, and my roommate, who wanted to minor in CS, were working together and I was "showing him the ropes". He was an intelligent student and we never worked together on the homeworks aside from general verbal discussions on what the solution could be. He used a Windows laptop and for one of the assignments, his C code wasn't compiling because he was missing some libraries and he told me he couldn't figure it out and we were approaching a deadline and asked me to compile it for him and send him back the binary. I did so, but when sending back the binary, in a rush, I accidentally mistook my HW folder for his (we'd downloaded this as a part of the assignment, and the folder structure was identical) and sent him my binary by mistake. Both of our solutions worked. Obviously, we got "caught" in the most naive way. Our binaries had the same MD5 hash and the CMS flagged us. We were both confused at first, and then we realized what happened and explained it to the professor. The proof was simple - just compile my roommate's binary and run it. However, he annulled our assignment to 0. We still both got As (because you could drop one homework) and while some may claim this was a gentle slap on the wrist, it felt unjust. We clearly made a dumb mistake and we shouldn't be punished at all, especially when we knew how rampant *actual* plagiarism was.

II. The time I caught students for "plagiarizing". As Kevin points out in his post, there aren't really any incentives to catch students for cheating. As a TA, I get no benefit, and moreover, there's a cost. No one wants to be known as THAT TA who busts kids for using "a little help". Keeping that in mind, I was usually very lenient when it comes to cheating. I've noticed signs, but there was never enough proof to warrant the effort of calling someone out. However, at one level it went too far. Two students who were partners for the "projects" had submitted nearly identical solutions for a complex Graphics homework assignment. They got the answer right, but I looked into their working and they both said "(9/5) / (4/3) == (4/7) / (5*9) = 1/3". I don't remember the exact values, but it was two steps of non-sense numbers and then a correct answer. I ended up reporting the case, mostly because I felt like my intelligence as a TA had been insulted. Are you seriously going to submit random numbers with a correct solution hoping I won't see? In any case, it didn't go anywhere.

III. Discovering a cheating ring. At our university, one of my good friends and project partners told me there was an "enormous Asian cheating racket" - not to call out any specific race, I'm Asian too. I wasn't surprised - to be blatant, it made sense. We're very grade oriented with tiger parents. Then I learnt the extent of it. There were apparently Chinese forums and "outsourcers" you could send your homework problems to and they would solve it and give it back. In addition, there were special shared systems like DC++ where you could discover answers to homeworks for different classes at my university as well as Prelims, Midterms and Finals contributed by students of previous years. I was in shock. Students would leave exam halls to go to the bathroom just to look at these answers mid-exam. But was I gonna tattle? No.

IV. The reality at universities. Not just in CS, but in every other subject, almost everybody cheats. Excuses that go around are: "I've worked on it with someone else", "Oh the TA in office hours told everybody the exact same solution", "What? Cheating? me?", "Maybe he/she took it from me, I didn't do it".

And look, people aren't stupid. We all know how cheating works. You get a homework assignment, and you re-write the sentences in your own language. You get some code from someone else and you define some useless functions with 1-2 lines of code. Or you arbitrarily re-organize lines of code. You rename all the variables. You re-organize your functions. You create some unnecessary classes.

There were students who distribute 10 homework assignments between 10 people (in groups of 2), and have one do the assignment (use office hours, friends, google, whatever) and the other literally re-write the assignment in LaTeX 9 different ways for the others to use. No one would ever really have to do the work.

The well known key to cheating is plausible deniability - if there's enough evidence you didn't do it, you didn't do it.
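The MD5 match described in part I of the comment above is the simplest kind of check a submission system can run: hash every uploaded file and flag digests that occur more than once. The sketch below assumes submissions are available as raw bytes keyed by student name; the function names `md5_hex` and `identical_submissions` are hypothetical and do not describe the CMS the commenter used.

```python
import hashlib
from collections import defaultdict

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of a submission as a hex string."""
    return hashlib.md5(data).hexdigest()

def identical_submissions(submissions: dict[str, bytes]) -> list[list[str]]:
    """Group students whose submitted files are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for student, blob in submissions.items():
        by_digest[md5_hex(blob)].append(student)
    # Only digests shared by two or more students are worth flagging.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]

subs = {
    "alice": b"\x7fELF-build-a",
    "bob": b"\x7fELF-build-a",    # same bytes as alice's file
    "carol": b"\x7fELF-build-c",
}
print(identical_submissions(subs))
```

A check like this only catches exact copies, and as the anecdote shows, a match says nothing about how the identical files came to exist, so it is better treated as a prompt for a conversation than as proof.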