Scientists should use AI as a tool, not an oracle

124 points by randomwalker 12 months ago

22 comments

unkulunkulu 12 months ago
> Unfortunately, most scientific fields have succumbed to AI hype, leading to a suspension of common sense. For example, a line of research in political science claimed to predict the onset of civil war with an accuracy of well over 90%, a number that should sound facially impossible. (It turned out to be leakage, which is what got us interested in this whole line of research.)

This, coupled with people acting on its predictions, is a kind of self-fulfilling prophecy.

Which is to ask, are AI safety folks building models of this pattern? :)
throwanem 12 months ago
If this is already such a problem even in the professional discipline and vocation whose *sine qua non* is the accurate analysis of physical reality, I'm really nervous about the next few years. And I was nervous already...
captainkrtek 12 months ago
In my professional work, I treat ChatGPT as a search engine that I feel I can ask questions of in a natural manner. I often find small flaws in the technical solutions it offers, but it can still provide useful starting points to investigate. I rarely trust code it generates (at least for the language I mainly work in), as I've seen it make some serious mistakes (e.g. using keywords in the language that don't exist).
userbinator 12 months ago
People treating tools like they're infallible has been a problem since computers were invented, but IMHO the biggest difference with AI is how confident and convincing it can be in its output. Much like others here, I have already had to convince, very carefully, many otherwise decently intelligent people who believed ChatGPT was correct.

Thus I think the biggest success of AI will be in the arts, where imprecision is not fatal, and hallucinations turn into entertainment instead of "truths".
quantum_state 12 months ago
AI is a tool … a fool with a tool is still a fool … For natural sciences, there is no need to worry since nature would provide the ultimate check … for social “sciences”, it is entirely a different story.
TheRoque 12 months ago
The worst is having random people questioning your expertise because of what ChatGPT told them.
benhoyt 12 months ago
> *People* should use AI as a tool, not an oracle

There, fixed the title.
bbor 12 months ago
Wow, I came into this article angry, idk if their book title accurately conveys the sober, expert analysis it contains! In case anyone else is curious why they're talking about "leakage" in the first place instead of the existing term "model bias", here's the paper they cite in the "compelling evidence" paper that started these two's saga with the snake oil salesmen: https://www.cs.umb.edu/~ding/history/470_670_fall_2011/papers/cs670_Tran_PreferredPaper_LeakingInDataMining.pdf

Crux passage:

> Our focus here is on leakage, which is a specific form of illegitimacy that is an intrinsic *property of the observational inputs* of a model. This form of illegitimacy remains partly abstract, but could be further defined as follows: Let *u* be some random variable. We say a second random variable *v* is *u*-legitimate if *v* is observable to the client for the purpose of inferring *u*. In this case we write *v ∈ legit{u}*.

> A fully concrete meaning of legitimacy is built-in to any specific inference problem. The trivial legitimacy rule, going back to the first example of leakage given in Section 1, is that *the target itself must never be used for inference:*

> (1) *y ∉ legit{y}*

So ultimately this is all about bad experimental discipline re: training and test data, in an abstract way? I've been staring at this paper for way too long trying to figure out what exactly each "target" is and how it leaks, but I hope that engineering translation is close.
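The trivial legitimacy rule is easier to see with a toy example. The sketch below is my own illustration, not anything from the linked paper or the article: it assumes numpy and scikit-learn, all variable names are hypothetical, and the "leak" is simply a feature fabricated from the target, so y ends up inside its own inputs and test accuracy looks spectacular even though nothing generalizable was learned.

```python
# Hypothetical sketch of target leakage (y used to build one of its own features).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X_honest = rng.normal(size=(n, 5))   # legitimate, weakly informative features
y = (X_honest[:, 0] + rng.normal(scale=2.0, size=n) > 0).astype(int)

# The leak: a "feature" derived from the target itself, e.g. a post-outcome measurement
# that would never be available at prediction time.
leak = y + rng.normal(scale=0.1, size=n)
X_leaky = np.column_stack([X_honest, leak])

for name, X in [("honest", X_honest), ("leaky", X_leaky)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    acc = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name} features: test accuracy = {acc:.2f}")
# Expect roughly 0.6-0.7 with the honest features and ~1.0 once the leak is included.
```

The held-out split doesn't catch this, which is the point: the illegitimacy lives in the observational inputs themselves, not in how the data were partitioned.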
dluan 12 months ago
Scientists have been obsessed with over-optimizing for FOMO for the past decade - what papers should I read that I don't have time for, what grants should I apply for that I don't know about, what projects should I work on that will give me the best ROI, who in my field is poised to disrupt or make a big leap, etc.

Some even think that the end goal is actually an autonomous research agent that can make decisions about what questions to ask and why, and that's one of the true marks of AGI. That to me is insane and misses the entire point of science altogether, even once we reach that technical feasibility. We ask questions about the universe to expand our human relationship with the universe, not just to amass more research capital for the sake of it. And the fact that the AI snake oil has infected big chunks of science reveals which parts of it are just gold rush speculation and which aren't.

There's a more fundamental challenge of training scientists to understand why we ask the questions we ask. You can't just offload that to some background task and trust that it makes sense.
m3kw9 12 months ago
To know when to be skeptical of LLMs, you have to know how they are trained and how inference works, and you have to use them often to see how they can screw up.
cdme 12 months ago
It's marketed and sold as an oracle. The AGI crowd feels like a cult.
devjab 12 months ago
I would have thought scientists weren't going to use these tools to do research, considering they as a group are far more exposed to things like peer review and critical thinking than general society.

What worries me the most about these AI solutions, however, is their usage in the public sector. They can certainly be useful helpers - like, they can scan images for cancer and, if added to existing processes involving humans, often lead to enhanced results. They can't replace any existing methods, however, as we learned here in Denmark a few years ago. Unfortunately that lesson hasn't been learned across the public sector. I think medicine and healthcare learned it, but right now we're replacing actual human controls, audits and sometimes decision making with AI, or with an unwarranted trust in AI results. Which is going to lead to some really terrible results, considering how bad things like LLMs often are at getting lucky in even "common knowledge" situations. It's further compounded by how some of the work they're tasked with isn't as black-and-white as writing code is. We use AI tools in our daily work, and they are OK, but as anyone who's used them for programming probably knows by now, they aren't exactly great at getting lucky. Sometimes they'll hallucinate solutions that simply do not exist.

This is how they work, and as I said earlier, AIs can be great enhancers. They aren't replacements though, and if we start treating them like they are, which is very tempting from a change-management and benefit-realisation perspective, we're just going to get in trouble. This is unfortunately exactly what we're doing, and why wouldn't we? Most western public sectors have run on at least some form of new public management for two decades now, sometimes longer. As a result, the entire systemic culture is geared toward efficiency and cost reduction, even when it doesn't really deliver either from a broader perspective.

Now, if scientists are on board, then what hope does a public bureaucracy have?
bitwize 12 months ago
LLMs are basically Dissociated Press, but with deeper layers of statistics for a better function approximation than a simple Markov chain. It's really doing the same thing though: pick the next sequence of characters that best follows the foregoing characters.

Not something I'd trust as a "source of truth". Maybe a neat idea generator. And some of the deep learning algorithms can identify patterns that humans might miss -- patterns that could reveal useful insight. But they're not *doing* the knowledge work.
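For anyone who hasn't met Dissociated Press: the idea the comment gestures at is roughly the toy character-level Markov chain below. This is my own sketch, using only the Python standard library, with hypothetical names; it is not how an LLM is implemented, but it shows the "sample the next character from whatever tended to follow the last k characters" loop.

```python
# Toy Dissociated-Press-style generator: the next character is sampled from the
# empirical distribution of what followed the previous k characters in the corpus.
import random
from collections import Counter, defaultdict

def train(text: str, k: int = 4) -> dict:
    """For every k-character context, count which character came next."""
    model = defaultdict(Counter)
    for i in range(len(text) - k):
        model[text[i:i + k]][text[i + k]] += 1
    return model

def generate(model: dict, seed: str, k: int = 4, length: int = 200) -> str:
    out = seed
    for _ in range(length):
        counts = model.get(out[-k:])
        if not counts:          # unseen context: nothing to sample, stop
            break
        chars, weights = zip(*counts.items())
        out += random.choices(chars, weights=weights)[0]
    return out

corpus = "scientists should use ai as a tool, not an oracle. " * 50
print(generate(train(corpus), seed="scie"))
```

An LLM swaps the lookup table for a learned function over long contexts of sub-word tokens, which is where the better generalization comes from, but the sampling loop at the end looks much the same.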
shmatt 12 months ago
I feel like 90% of AI discussions online these days can be shut down with “a probabilistic syllable generator is not intelligence”
skrap 12 months ago
...but why wouldn't they use AI as an oracle? From an outsider's perspective, it seems that there's already plenty of incentive to test the margins of acceptable academic practice in order to produce more papers or publish more quickly. Sadly I feel like it'll become the norm to have a chatbot interpret your results and write your paper rather than using those expensive grad students.

I don't have answers; just the lingering question "why are we building this?"
hulitu 12 months ago
> Scientists should use AI as a tool, not an oracle

The T in AI stands for tool.
chomskyole 12 months ago
Maybe they should also call it "curve fitting" instead of "AI", so they don't need to call a "poor fit" a "hallucination".
logrot 12 months ago
But surely if it's artificial intelligence then it'd know its limits and would respond appropriately? Oracle use, no problem?

Or is it because it's actually shit, but it's the best thing we've seen yet and everyone is just in denial?
10000truths 12 months ago
Is "leakage" just another term for overfitting?
teknopaul 12 months ago
No shit, Sherlock.
LouisSayers 12 months ago
Not just scientists, but everyone!

My partner recently went a bit nuts writing an article with the help of GPT-4. She was very proud of how productive she'd been, until I asked if she'd actually searched for the papers GPT-4 had referred to.

Of course, many of the referred-to papers didn't exist...
duxup 12 months ago
We use search that way; I don't see why AI trained on similar content wouldn't be just as variable in terms of reliability.