When I went to the APS March Meeting earlier this year, I talked with the editor of a scientific journal and asked them if they were worried about LLM-generated papers. They said their main worry actually wasn't LLM-generated papers, it was LLM-generated *reviews*.

LLMs are much better at plausibly summarizing content than they are at doing long sequences of reasoning, so they're much better at generating believable reviews than believable papers. Plus reviews are pretty tedious to do, giving an incentive to half-ass them with an LLM. Plus reviews are usually not shared publicly, taking away some of the potential embarrassment.
Hmm, there may be a bug in the authors' Python script that searches Google Scholar for the phrases "as of my last knowledge update" or "I don't have access to real-time data". You can see the code in appendix B.

The bug happens if the 'bib' key doesn't exist in the API response. That leads to the urls array having more rows than the paper_data array, so the columns could become mismatched in the final data frame. It seems they made a third array called flag which could be used to detect and remove the bad results, but it isn't used anywhere in the posted code.

It's not clear to me how this would affect their analysis; it does seem like something they would catch when manually reviewing the papers. But perhaps the bibliographic data wasn't reviewed and was only used to calculate the summary stats etc.
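To make the suspected failure mode concrete, here is a minimal sketch of what such a collection loop could look like. This is my own reconstruction, not the actual appendix B code, and the field names (pub_url, bib, title, author, pub_year) are assumptions based on what a Google Scholar client like the scholarly package typically returns:

```python
import pandas as pd

def collect_buggy(results):
    # results: iterable of dicts from a Google Scholar query (hypothetical shape).
    urls, paper_data, flag = [], [], []
    for result in results:
        urls.append(result.get("pub_url", ""))       # appended for every hit...
        if "bib" in result:                          # ...but bib fields only sometimes
            bib = result["bib"]
            paper_data.append((bib.get("title"), bib.get("author"), bib.get("pub_year")))
            flag.append(False)
        else:
            flag.append(True)                        # bad row is flagged but never dropped
    # If any result lacked 'bib', urls is longer than paper_data, and pairing
    # by position silently matches a URL with the bib data of a different paper.
    rows = [(u, *p) for u, p in zip(urls, paper_data)]
    return pd.DataFrame(rows, columns=["url", "title", "authors", "year"])

def collect_fixed(results):
    # One fix in the spirit of the unused flag array: keep the lists the same
    # length by padding incomplete records, then drop the flagged rows at the end.
    urls, paper_data, flag = [], [], []
    for result in results:
        urls.append(result.get("pub_url", ""))
        bib = result.get("bib")
        if bib is not None:
            paper_data.append((bib.get("title"), bib.get("author"), bib.get("pub_year")))
            flag.append(False)
        else:
            paper_data.append((None, None, None))    # keep row counts in sync
            flag.append(True)
    rows = [(u, *p) for u, p in zip(urls, paper_data)]
    df = pd.DataFrame(rows, columns=["url", "title", "authors", "year"])
    return df[[not f for f in flag]]                 # actually use flag to remove bad rows
```

Note that the buggy version still builds a data frame without raising an error, which is exactly why a mismatch like this could slip through into summary statistics unnoticed.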
GPT might make fabricating scientific papers easier, but let's not forget how many humans have fabricated scientific research in recent years - they did a great job without AI!

For anyone who hasn't seen or heard about this, it makes for some entertaining and eye-opening viewing:

https://www.youtube.com/results?search_query=academic+fraud
This kind of fabricated result is not a problem for practitioners in the relevant fields, who can easily distinguish between false and real work.

If there are instances where the ability to make such distinctions is lost, it is most likely because the content lacks novelty, i.e. it simply regurgitates known and established facts. In that case it is a pointless effort, even if it might inflate the supposed author's list of publications.

As for the integrity of researchers, this is a known issue. The temptation to fabricate data existed long before the latest innovations in AI, and it is very easy to do in most fields, particularly in medicine or the biosciences, which constitute the bulk of irreproducible research. Policing this kind of behavior is not altered by GPT or similar tools.

The bigger problem, however, is when non-experts attempt to become informed and are unable to distinguish between plausible and implausible sources of information. This is already a problem even without AI; consider the debates over the origins of SARS-CoV-2, for example. The solution is the cultivation and funding of sources of expertise, e.g. in universities and similar institutions.
For a paper that includes both a broad discussion of the scholarly issues raised by LLMs and wide-ranging policy recommendations, I wish the authors had taken a more nuanced approach to data collection than just searching for “as of my last knowledge update” and/or “I don’t have access to real-time data” and weeding out the false positives manually. LLMs can be used in scholarly writing in many ways that will not be caught with such a coarse sieve. Some are obviously illegitimate, such as having an LLM write an entire paper with fabricated data. But there are other ways that are not so clearly unacceptable.

For example, the authors’ statement that “[GPT’s] undeclared use—beyond proofreading—has potentially far-reaching implications for both science and society” suggests that, for them, using LLMs for “proofreading” is okay. But “proofreading” is understood in various ways. For some people, it would include only correcting spelling and grammatical mistakes. For others, especially for people who are not native speakers of English, it can also include changing the wording and even rewriting entire sentences and paragraphs to make the meaning clearer. To what extent can one use an LLM for such revision without declaring that one has done so?
Last time we discussed this, someone basically searched for phrases such as "certainly I can do X for you" and assumed that meant GPT was used. HN noticed that many of the accused papers actually predated OpenAI.

Hope this research is better.
> Two main risks arise... First, the abundance of fabricated “studies” seeping into all areas of the research infrastructure... A second risk lies in the increased possibility that convincingly scientific-looking content was in fact deceitfully created with AI tools...

A third risk: ChatGPT has no understanding of "truth" in the sense of facts reported by established, trusted sources. I'm doing a research project related to the use of data lakes and tried using ChatGPT to search for original sources. It's a shitshow of fabricated links and pedestrian summaries of marketing materials.

This feels like an evolutionary dead end.
I wonder how many of the GPT-generated papers were actually written by people whose native language is not English and who wanted to improve their English. That would explain the various "as of my last knowledge update" phrases left intact in the papers, if the authors don't fully understand what the phrase means.
How about people stop responding to titles for a change. This isn't about papers that merely used ChatGPT and got caught by some cutting-edge detection technique; it's about papers that blatantly include ChatGPT boilerplate like

> “as of my last knowledge update” and/or “I don’t have access to real-time data”

which suggests no human (doesn't even need to be a researcher) read every sentence of these damn "papers". That's a pretty low bar to clear: if you can't even be bothered to read the generated crap before including it in your paper, your academic integrity is negative and not a word from you can carry any weight.
Colour me surprised. An IT-related search will generally end up with loads of results that lead to AI-generated wankery.

For example, suppose you wish to back up switch configs or dump a file or whatever, and TFTP is so easy and simple to set up. You'll tear it down later or firewall it or whatever.

So a quick search for "linux tftp server" gets you to, say: https://thelinuxcode.com/install_tftp_server_ubuntu/

All good until you try to use the --create flag, which should allow you to upload to the server. That flag is not valid for tftp-hpa; it is valid for tftpd (another TFTP daemon).

That's a hallucination. Hallucinations are fucking annoying and increasingly prevalent. In Windows land the humans hallucinate - C:\ SFC /SCANNOW does not fix anything except for something really madly self-imposed.
This article shows no evidence of fabrication, fraud or misinformation, while making accusations of all three. All it shows is that ChatGPT was used, which is wildly escalated into "evidence manipulation" (ironically, without evidence).

Much more work is needed to show that this means anything.
Honestly, what we need to do is establish much stronger credentialing schemes. The "only a good guy with an AI can stop a bad guy with an AI" approach of trying to filter out bad content is just a hopeless arms race, and unproductive.

In a sense we need to go back two steps: websites need to be much stronger curators of knowledge again, and we need some reliable ways to sign and attribute real authorship to publications, so that when someone publishes a fake paper there is always a human being who signed it and can be held accountable. There's a practically unlimited number of automated systems, but only a limited number of people trying to benefit from them.

Just as HTTPS went from being rare to being the norm because the assumption that things are authentic by default doesn't hold, the same needs to happen to publishing. If you have a functioning reputation system and you can put a price on fake information, 99% of it is disincentivized.
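The mechanical half of that idea is already cheap to build; the hard part is binding keys to verified human identities and reputations. Purely as an illustration (my own sketch, not anything proposed in the article), signing a publication record with an author key could look like this, using Ed25519 from the Python cryptography package:

```python
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Hypothetical author key pair; in a real scheme the public key would be bound
# to a verified identity by a journal, university, or other credentialing body.
author_key = Ed25519PrivateKey.generate()
author_pub = author_key.public_key()

# Reduce the submission to a canonical record so the signature covers exactly
# the content that was published (here just a title and a hash of the PDF).
record = {
    "title": "Example paper",
    "pdf_sha256": hashlib.sha256(b"...pdf bytes...").hexdigest(),
}
message = json.dumps(record, sort_keys=True).encode()

signature = author_key.sign(message)

# Anyone holding the author's public key can later check that this exact
# record was signed by the holder of the matching private key.
try:
    author_pub.verify(signature, message)
    print("signature valid: publication attributable to this key")
except InvalidSignature:
    print("signature invalid: record altered or key mismatch")
```

None of this solves the reputation problem by itself, but it does turn "who signed this?" into a question with a checkable answer.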