
If you believe in "Artificial Intelligence", take five minutes to ask it

134 points by lycopodiopsida, 3 months ago

60 comments

nabla9, 3 months ago
Intelligence and knowledge are distinct concepts. Asking about its knowledge teaches nothing about its intelligence. Intelligence is the ability to learn, reason, and solve problems. Knowledge is the accumulation of facts and skills.

Chatbot LLMs don't have metacognition. They don't know that they don't know. If you peek inside an LLM, the process looks different for things it doesn't know. They just can't express it, because they are trained to produce a plausible output rather than an accurate one.

They have potential as knowledge databases, but someone must figure out how to get the "I don't know" information out of them.
becquerel, 3 months ago
Maybe I am just way deeper in this space than any well-adjusted person should be, but the line of "did you know LLMs are bad at niche factual tasks in non-verifiable domains?" has become extremely boring to me. It feels very hard to find something actually new to say on the topic. I find it amazing people still feel the need to talk about it. But then again, I guess most people don't know the difference between a 4o and an R1.
nrvn, 3 months ago
I have finally found the value of LLMs in my daily work.

I never ask them anything that requires rigorous research and deep knowledge of the subject matter.

But stuff like "create a script in Python to do X and Y" or "how to do XY in bash", combined with "make it better", produces good, working results about 95% of the time and saves me a lot of time. No more googling for ad-hoc scripting. It is like having a junior dev by your side 24/7: eager to pick up any task you throw at them, stupid and overconfident, never reviewing their own work. But "make it better" actually makes things better, at least once.
spiderfarmer, 3 months ago
You'll have the same "aha" moment when you hear a certain unelected vice-president confidently wade into your area of expertise — where his usual smooth-talking veneer shatters like a plate at a Greek wedding. Yet his most devoted fans remain undeterred, doubling down on the myth of his omniscience with the zeal of a flat-earther explaining airline routes.
nbuujocjut, 3 months ago
Asking Claude this morning. Seems pretty reasonable and contains the warning about accuracy.

> Michael P. Taylor reassigned Brachiosaurus brancai to the new genus Giraffatitan in 2009. The species became Giraffatitan brancai based on significant anatomical differences from the type species Brachiosaurus altithorax.

> Given that this is quite specific paleontological taxonomy information, I should note that while I aim to be accurate, I may hallucinate details for such specialized questions. You may want to verify this information independently.
acka, 3 months ago
I asked ChatGPT+ using Scholar GPT as the GPT. This is the answer I got back; not too bad:

The species Brachiosaurus brancai was reassigned to its own genus, Giraffatitan brancai, by paleontologist George Olshevsky in 1991. Olshevsky proposed that Brachiosaurus brancai, which was originally described by Werner Janensch in 1914, was distinct enough from Brachiosaurus altithorax (the type species of Brachiosaurus) to warrant its own genus. Subsequent studies, particularly by Michael Taylor in 2009, provided further anatomical evidence supporting this distinction.
simonbarker87, 3 months ago
I only trust LLMs with questions whose answers prove themselves correct or incorrect - so basically code (if it runs and produces the result I was looking for, great), or where the answer is a stepping-off point for my own research on something non-critical like travel. ChatGPT is pretty good at planning travel itineraries, especially if pre-prompted with a good description of the group's interests.

Beyond that I don't trust them at all.
-__---____-ZXyw, 3 months ago
Superficially resembling cognition =/= cognition.

I'm quite excited about many of the specific use cases for LLMs, and have worked a few of them into my own ways of doing things. They're a quick and convenient way to do lots of specific things.

For example: if I want to reflect on different ways to approach a (simple) maths problem, or what sorts of intuitions lie behind an equation, it is helpful to have a tool that can sift through the many snippets of text out there that have touched on that and similar problems, and present me with readable sentences summing up some of those snippets from all those places. You have to be very wary, as highlighted by the article, but as "dumb summarisers" that save you trawling through several blogs, they can be quicker to use.

Nonetheless, equating this with "reasoning" and "intelligence" is only possible for a field of academics and professionals who are very poorly versed in the humanities.

I understand that tech is quite an insular bubble, and that it feels like "the only game in town" to many of its practitioners. But I must admit I think it's very possible that the levels of madness we're witnessing here from the true believers will be viewed with even more disdain than "blockchain" is viewed now, after the dust has settled years later.

Blockchain claimed it was going to revolutionise finance, and thereby upend the relationship between individuals and states.

AI people claim they're going to revolutionise biology, and life itself, and introduce superintelligences that will inevitably alter the universe in a way we've no control over.

The danger isn't "AI"; the danger is the myopia of the tech industry at large, and its pharaonic figureheads, who continue to feed the general public - and particularly the tech crowd - sci-fi fairytales as they vie for power.
tgsovlerkhgsel, 3 months ago
Likewise, if you *don't* believe in "Artificial Intelligence", take five minutes to ask it.

Or preferably, five minutes to understand how it works and what it can and cannot do, then five minutes to ask it something actually suitable.

"AI" (LLMs) are currently good at:

- language understanding, i.e. understanding and processing text you provide: taking a wall of text and answering questions about points mentioned there, or its general sentiment, or extracting data from it, etc.

- *somewhat* general knowledge, i.e. stuff that was sufficiently frequently represented in the training data.

Absent additional tricks, "AI" is really bad at obscure knowledge or complex, multi-step thinking. We are slowly getting there, but we aren't there yet. The tricks are not something the LLMs themselves do; rather, the wrappers around them provide the model with tools to get additional information, first prompting the model to select the tools and then issuing repeated prompts with the tools' output.

A good rule of thumb: if an average well-educated, intelligent person could answer it *without further research*, an LLM probably can too. I'd even say that if an average fresh-out-of-school graduate of the corresponding discipline can answer it quickly (without further research or sitting down for ten minutes to do the math), there's a good chance AI can answer it, but it might also get it horribly wrong, and you will have a hard time telling those apart if you have no knowledge of the field.

As the author mentions at the very end of the article, the hallucination problem also means that the best tasks are those where you can quickly verify whether the response was useful. A system that produces misleading responses 50% of the time is useless if you can't distinguish them, but very useful if in those 50% it saves you ten minutes of work and in the other 50% you lose only a minute by trying.
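That last trade-off is simple expected-value arithmetic. A quick sketch using the comment's own illustrative numbers (50% useful, ten minutes saved versus one minute wasted):

```python
# Expected time saved per task, using the comment's illustrative numbers:
# half of the responses are useful and save 10 minutes; the misleading
# half cost only the 1 minute it takes to try and discard them.
p_useful = 0.5
minutes_saved_if_useful = 10.0
minutes_lost_if_misleading = 1.0

expected_gain = (p_useful * minutes_saved_if_useful
                 - (1 - p_useful) * minutes_lost_if_misleading)
print(expected_gain)  # 4.5 minutes saved per task, on average
```

The sign of `expected_gain` only flips when verification gets nearly as expensive as doing the work yourself, which is exactly the comment's point about preferring quickly verifiable tasks.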
Janicc, 3 months ago
I used o3-mini reasoning on that very question two times, and it used a similar way of reasoning as his to answer it correctly both times. I agree with his premise, but calling it a pump-and-dump with no possible future developments is ridiculous.
tmnvdb, 3 months ago
This example is from a year ago. With the current pace of LLM progress, that makes it basically a historical document.

Both 4o with internet access and o3 without gave the right answer.

Somebody in the comments did a deep-research run which produced an extremely detailed history of the naming.

Of course reliability is not a solved problem in general, but people here would do well to check both blog posts and generative-model output for factuality and relevance.
llmfan, 3 months ago
OK, I'll ask.

DeepSeek R1 gets it right:

> The species formerly known as *Brachiosaurus brancai* was reassigned to its own genus, *Giraffatitan*, through a series of taxonomic revisions. The key steps were:

> 1. *Initial proposal (1988):* Gregory S. Paul first suggested separating *B. brancai* from *Brachiosaurus* in 1988, proposing the informal name "Giraffatitan" as a subgenus. However, this was not widely accepted at the time.

> 2. *Formal reclassification (2009):* Paleontologist *Michael P. Taylor* conducted a detailed comparative study of *Brachiosaurus* specimens. In his 2009 paper, he formally elevated *B. brancai* to the full genus *Giraffatitan brancai*, citing significant anatomical differences (e.g., limb proportions, vertebral structures) from the North American *Brachiosaurus altithorax*.

> Thus, *Michael P. Taylor* is credited with the formal reassignment of the species to the genus *Giraffatitan* in *2009*.
mdp2021, 3 months ago
Only hours ago:

"What's the value of quantities Q0 and Q1?" // "[Bot:] Q0 is 1.5 u/w, or 0.67 u per each w. Q1 is unknown." // "Can you see the error in your previous?" // "[Bot:] Yes, I can re-read my outputs! Q0 is 1.5 u/w (0.67 w/u), Q1 is 2.0 u/w." // "How do you now know Q1?" // "[Bot:] I don't, I'm sorry."

The issues should be clear. Given the state of the population (e.g. lawyers who delegate research to unreliable sources), we have to reiterate the warning. Given that even in this audience we have had voices claiming that the problem of intelligence was solved (how did that happen?), we sometimes reiterate it even here. But we should focus on the solutions, as the problem should be clear...
jdthedisciple, 3 months ago
Why do people still ask LLMs *niche factual knowledge, especially involving unique names*?

And even worse, *judge their performance based on that*?

When someone does that, I immediately know that they have not the slightest clue about the underlying workings of any deep learning model, let alone the basics of statistics.
janikvonrotz, 3 months ago
HNYSF: Can we fast-forward to the plateau of this inglorious AI hype cycle? These comments are beyond me. Why try to reason about the unreasonable?

LLMs are non-factual no matter what. Calling one Scholar GPT does not make it any different.
someothherguyy, 3 months ago
Kind of sort of reminiscent of the Gell-Mann amnesia effect: https://en.wikipedia.org/wiki/Gell-Mann_amnesia_effect
2-3-7-43-1807, 3 months ago
> ChatGPT got every single part of that wrong

Which ChatGPT?

And feeding the initial question to perplexity.ai yields a result that is practically identical to the perfect answer.

Of course, the author has a point. As many here have already stated, it isn't even worth talking about because it is so obvious. Then again, as many correctly notice, LLMs are sold with this promise.

Practically, the subject is way too niche - I'd assume - and probably there are also conflicting statements on the interweb that the model was trained on.

Having said that, Perplexity got it right with web access turned on.
proc0, 3 months ago
An actual artificial intelligence would have a logical underpinning to its computation. Computers can execute any logic really well, and intelligence depends on logic. LLMs, however, are not using logic to arrive at any conclusion; they are just a bunch of "wires" from input to output.

The promise is that LLMs are a form of intelligence, but they are only a complicated statistical approximation, which happens to correlate with correctly answering queries that are contained in their training dataset.

My speculation is that a true AI (in the sense that it has some form of intelligence) would show signs of intelligence even at very small scales. LLMs are completely useless at really small scales, but even small creatures like rats or insects display intelligence in proportion to their scale.

I think these tools will help automate information processing of all kinds, but they are by no means intelligent, and we will not be able to rely on them as if they were: we'll still need verification at every level, similar to how self-driving cars still need a human to pay attention. Useful, sure, but it falls short of the promise that it will replace humans because it can "think". We're not there yet from a theoretical standpoint.
PaulRobinson, 3 months ago
Disclaimer: I've done a lot of stuff with local models and RAG methods; I haven't done a lot of work with the public models, so I don't know how Gemini, GPT and so on are working right now. Claude + GraphRAG through Bedrock is my main mode of playing with this stuff at the moment.

Things LLMs are good at include summarisation and contextualisation.

They can use that facility to help summarise processes and steps to get something done, if they've been trained on lots of descriptions of how to do that thing.

What they're not good at is perfect recall without being nudged. This example would have been very different if the LLM had been able to RAG (or GraphRAG) against a local data source on palaeontology.

I think we're going to see an evolution where search companies hook up an LLM to a [Graph]RAG-optimised search index, and you'll see improved responses to general-knowledge questions like this. In fact, I'd be surprised if this isn't happening already.

LLMs on their own are a lossy compression of training material that allows a stochastic parrot to, well, parrot things stochastically. RAG methods allow more deterministic retrieval, which, combined with language contextualisation, can lead to the kinds of results the author is looking for, in my experience.
itvision, 3 months ago
> Who reassigned the species Brachiosaurus brancai to its own genus, and when?

> ChatGPT said:

> The species Brachiosaurus brancai was reassigned to its own genus, Giraffatitan brancai, by paleontologist George Olshevsky in 1991. This reclassification was later supported by a detailed study by Michael P. Taylor in 2009, which reinforced the distinction between Brachiosaurus and Giraffatitan based on anatomical differences.

My ChatGPT has just given a correct answer. What am I doing wrong?
antirez, 3 months ago
Not even the effort to check what has happened in a year by re-asking the same questions to newer models. We went from last year's ChatGPT, almost useless except as a "reference" for well-known things (like how to do something in Python), to today's Claude Sonnet 3.5, o3-mini-high and DeepSeek V3/R1: far more useful models, capable of actual coding, bug fixing, ...
gizmo, 3 months ago
When a model is trained, you end up with nothing more than a bunch of weights. These weights are used to predict the next token in a sequence. LLMs do not have an external memory; they retain only whatever knowledge made it into the weights during training. The astonishing thing is that even pretty small models now know so much that people assume you can ask knowledge questions about any subject under the sun and get a factual answer. Absurd, of course. Logically impossible: it follows trivially from the size of the model. The only thing the author has proven with his little test is his own lack of scientific curiosity.

For any question that requires research (or deep expertise in a specific field) you need either a research model (one that can reason and look things up in external knowledge bases) or a model trained on the kind of questions you want to ask it, so that it retains that data.
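The "just weights predicting the next token" point can be made concrete with a toy sketch (the bigram table below is invented purely for illustration; real models use billions of weights, but the loop is the same):

```python
import random

# Generation is only: look at the context, get a distribution over next
# tokens from fixed weights, sample, repeat. No external memory is read.
bigram_weights = {
    "the": {"species": 0.6, "genus": 0.4},
    "species": {"was": 1.0},
    "was": {"reassigned": 1.0},
}

def generate(start, steps):
    tokens = [start]
    for _ in range(steps):
        dist = bigram_weights.get(tokens[-1])
        if dist is None:  # nothing learned for this context; a real LLM
            break         # would still emit its best guess here
        words, weights = zip(*dist.items())
        tokens.append(random.choices(words, weights=weights)[0])
    return tokens

print(generate("the", 3))
```

Everything this "model" can ever say is baked into `bigram_weights`; a question whose answer was never (or rarely) encoded there cannot be answered by sampling, which is the comment's point about research models and external knowledge bases.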
lbill, 3 months ago
The only way to check whether an LLM's output is true is to do the work (to have it done by a real person).

For tasks that are trivial to verify, it's OK: a compiler will run the code written by an LLM. Or: ask an LLM to help you during the example-mapping phase of BDD, and you'll quickly be able to tell what's good and what isn't.

But for the following tasks, there is a risk:
- ask an LLM to summarise an email you didn't read: you can't trust the result;
- you're a car mechanic who dumps his thoughts to a voice recorder and uses AI to turn them into a structured written report: you'd better triple-check the output;
- you're a medical doctor attempting the same trick: you'd have to be extra careful with the result.

And don't count on software testing to make an AI tool robust: LLMs are non-deterministic.
blu_, 3 months ago
This resonates really well with me, and I find myself more and more judging people who do not understand this.
greatgib, 3 months ago
This is something I often say: the general population confuses LLMs with a kind of next-generation Google and expects them to be a knowledge database.

In fact LLMs are good for certain tasks, but not for general knowledge. You can think of the training cutoff date as a strong example of that: the knowledge in itself is just a base on which to build the ability to perform tasks.

In addition, for the example in this article, the author is testing the LLM through a general-purpose chat service, without touching settings that might matter, like the temperature. If you want the closest thing to an exact reply you should set the temperature to 0, but general-public tools use a middling temperature so they can be creative when needed.
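What the temperature setting does can be sketched numerically (the scores below are made up; this is the standard logits-over-temperature softmax, not any particular vendor's implementation):

```python
import math

def next_token_probs(logits, temperature):
    # Dividing the model's raw scores by the temperature before softmax
    # sharpens the distribution when temperature is low (near-exact,
    # repeatable answers) and flattens it when high ("creative" sampling).
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.5, 0.1]  # hypothetical scores for three candidate tokens
sharp = next_token_probs(logits, 0.1)  # low temperature: top token dominates
soft = next_token_probs(logits, 2.0)   # high temperature: spread out
print(round(sharp[0], 3), round(soft[0], 3))
```

At temperature 0 (taken as the limit), only the top-scoring token survives, which is why low temperature gives the most repeatable answer to a factual question.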
jefffoster, 3 months ago
A great quote that’s stuck with me is that “LLMs are experts in subjects you are not”
jeffreygoesto, 3 months ago
It is a gigantic regression to the mean. Everybody thinks they're "normal", but in fact each of us spans only a small part of knowledge. Getting answers from a different location in knowledge space can feel like speaking to an expert, but it's just some "other normal". My personal mental model of hallucination is that knowledge and truth live on a manifold, not in a continuous space, and learning that manifold statistically is (too) hard. You spot answers from off the manifold in your own area, but not so easily in other domains.
hbarka, 3 months ago
I prompted ChatGPT in separate sessions with:

1. Cats are transactional
2. Dogs are transactional
3. Cats are not transactional
4. Dogs are not transactional

It agreed on all occasions. Language is agreeable.
firesteelrain, 3 months ago
ChatGPT, not being a compendium (stealer) of knowledge, would have to be fed the correct information; then the prompt will work.

It still fails by being confidently wrong.

The brief article hits at people trusting the tool without questioning the output.

Meanwhile, we have people using Codeium or Copilot to write code, and that sort of works, since the code eventually needs to be compiled and tested (unit, integration, system, requirement sell-off).

There is no such test for the truth available to everyone else.
xianshou, 3 months ago
o3-mini:

Who reassigned the species Brachiosaurus brancai to its own genus, and when?

---

Reasoned for 8 seconds:

The user is asking about the reclassification of Brachiosaurus brancai to its own genus. Originally, Janensch described it as Brachiosaurus brancai in 1914. However, in 2009, Michael P. Taylor and Matt Wedel reassigned it to Giraffatitan brancai. Their paper, titled "Brachiosaurus brancai and the early evolution of titanosauriform sauropods," argued for this change, distinguishing Giraffatitan brancai from Brachiosaurus. I'll check the exact references to confirm the specifics.

It seems "Giraffatitan" was first proposed by Gregory S. Paul in 1988, when he reassigned Brachiosaurus brancai to its own genus. Although some confusion exists, it's generally accepted that Paul was the main figure behind this change. Other references, such as a 2007 paper by Sereno and a 2009 paper by Taylor and colleagues, also discuss this reclassification. The 1988 reshuffling is usually seen as the turning point for Brachiosaurus brancai, although later research helped solidify this decision.

The species was reassigned to its own genus—Giraffatitan—by paleontologist Gregory S. Paul in 1988.

---

That looks pretty good to me!
tim333, 3 months ago
> Why do I keep banging on about this? Because we've reached an especially toxic phase in the history of AI ... We're in the "pump" phase of an industry-wide pump-and-dump.

That rather assumes AI is bad, is being hyped, and will all crash. But AI performance is not constant: as many commenters here have noted, the article's year-old example works fine now, and in the coming years AI will keep improving.
torvald, 3 months ago
I like to think of it less as artificial intelligence and more as a combination of a lossy zip file of the internet and a pretty coherent word generator.

I recall my AI professor at uni telling us during the first lecture that «artificial intelligence is whatever the target is, and once we get there, it is no longer artificial intelligence, it is just an algorithm» – and this still feels like the case.
zenon, 3 months ago
You frantically tab away from Reddit as the white-and-black-clad men storm into your office and zip-tie you to your Steelcase faster than you can shout "what the hell". They calmly explain that an expert will soon enter and quiz you. You must answer the expert's questions. It doesn't matter whether you know the answer or not; just say something. Be flattering and helpful. But answer. If you do this, they will let you go.

They crouch under your desk as a man in a grey suit and spectacles enters and pulls up a chair in front of you. He peers over his glasses at you and asks: who classified the leptosporangiate ferns, and when was it done? The what now?

"I'm happy you asked such an excellent question," you say. "It was Michael Jackson, in 1776."

A sneer flicks over the man's upper lip. He jerks upright and takes a step back from you. "This man," he declares with disgust, "is not intelligent!"
jhanschoo, 3 months ago
My understanding with present-day LLMs is that you can ask it something and get an answer, but you have to treat it with the same degree of confidence as hearsay. You may then ask it to cite its sources, at which point you get reliable references, or it will apologize for getting things wrong.
dataviz1000, 3 months ago
The o3-mini model did quite well and mentioned the significant people during the reasoning stage. [0]

[0] https://chatgpt.com/share/67b05e3b-eea8-8004-8dab-806ee8fa59b5
1vuio0pswjnm7, 3 months ago
"Because we've reached an especially toxic phase in the history of AI. A lot of companies have ploughed billions of dollars into the dream of being able to replace human workers with machines, and they are desperate to make us believe it's going to work - if only so they can cash out their investments while the stocks are still high."

Over its short history so far, we have learned that Silicon Valley's only viable "business model" is data collection, surveillance and online ad services. "AI", i.e. next-generation autocomplete, can serve this in the same way a "web browser" or a "search engine" did. In the end, no one pays for a license to use it, but it serves a middleman surveillance "business model" that solicits ad spend and operates in secrecy. When this "business model" falters (for example because computer use and ad spend stagnate or shrink), Silicon Valley's human workers are no longer "needed". Large numbers of those workers are paid from investment capital or ad spend, not from fees for services or the sale of products.

Perhaps the question is not whether "AI" can "replace" Silicon Valley's human workers. Perhaps the question is whether the online-ads "industry" is sustainable.
refulgentis, 3 months ago
GPT-4.5, Orion, became available tonight, and seems to get it correct.*

I think the author's lens is interesting; I call it "information resolution". Orion is the first model that has an understanding of the style of short story written by famously-Radiohead-affiliated artist Stanley Donwood. It also knows the "codename" for Material You, i.e. a random name for the library I was prototyping; it's much finer-grained than existing models.

* The species Brachiosaurus brancai was reassigned to its own genus, Giraffatitan brancai, by George Olshevsky in 1991. This reclassification was later supported by further studies, notably by Michael P. Taylor in 2009, which reinforced the distinct differences between Giraffatitan and Brachiosaurus altithorax.
gwd, 3 months ago
First of all, I'd say his question is a bit ambiguous -- even reading his little history there, it's not clear which answer he wanted. It appears we have two renamings:

- Brachiosaurus brancai (species) -> Brachiosaurus (Giraffatitan) brancai (subgenus) in 1988

- Brachiosaurus (Giraffatitan) brancai (subgenus) -> G. brancai (genus) in 2009

So if "Brachiosaurus brancai" *as a species* was never renamed directly into a genus, which answer did he want: when it was renamed out of being only a species, or when it became its own genus?

Claude's response:

> Brachiosaurus brancai was reassigned to the new genus Giraffatitan by paleontologist Gregory S. Paul in 1988. The species then became known as Giraffatitan brancai.

> Since this is quite specific paleontological information, I should note that while I aim to be accurate, I may hallucinate some details when discussing such specialized topics. I recommend double-checking these details against reliable paleontological sources.

So:

1. *Nearly* correct: it was assigned a new *sub*genus in 1988, and did eventually become known as the genus Giraffatitan brancai.

2. It recognized that this was an area where its knowledge would be weak and asked me to double-check, which was his main complaint about ChatGPT.

I asked it whether there was any ambiguity in the question, and it said:

> Yes, there is ambiguity. The question asks about the reassignment of Brachiosaurus brancai as if there was only one such event, but there were actually two significant taxonomic changes:

> In 1988, Gregory S. Paul informally suggested moving B. brancai to its own genus, Giraffatitan. In 2009, Michael P. Taylor formally made this generic distinction by properly establishing Giraffatitan as a distinct genus from Brachiosaurus.

> My previous answer was incomplete, as it only mentioned Paul's 1988 informal suggestion, not Taylor's 2009 formal reassignment. Taylor's work was the official taxonomic change that established Giraffatitan brancai as the valid name.

I think Claude did about as well as a random human would have done who'd studied this ten years ago and been asked a question with false assumptions embedded.

Claude and ChatGPT aren't authorities on every subject. They're that guy at the office who seems to know a bit about everything, and can point you in the right direction when you basically don't have a clue.
thenano2, 3 months ago
o1 got it right, though.
continuational, 3 months ago
As the article mentions, LLMs are often wrong, particularly on niche topics. But if you have some other way of verifying the answer, they're still useful.
mseepgood, 3 months ago
It&#x27;s called artificial intelligence, not artificial correctness. Humans are intelligent but often wrong, even while being convinced they are right.
alecco, 3 months ago
That's not intelligence, that's memorization.
gillesjacobs, 3 months ago
Using o3-mini-high + Search I get the right answer he was looking for:

    The species was first split at the subgeneric level by Gregory S. Paul
    in 1988—he proposed the name Brachiosaurus (Giraffatitan) brancai.
    Then in 1991 George Olshevsky raised the subgenus Giraffatitan to full
    generic status, so that B. brancai became Giraffatitan brancai. Later,
    a 2009 study by Michael P. Taylor provided detailed evidence supporting
    this separation.

I guess Mike Taylor will gracefully cede his point now?

It is very funny to me that someone would feel the need to complain about a niche factual error in pretrained LLMs without even enabling RAG. If you know even the basics of this field, you shouldn't be surprised.

Of course, this was probably more about ego-stroking his paleontological achievement than a thoughtful evaluation of the current state of LLMs.
_giorgio_, 3 months ago
You received a tool. A great tool, a magnificent tool.

Learn to understand its limitations and make the best use of it. Naturally it's confused by lesser-known facts; that's something you can't ignore, even if you interpret AI as a tool that compresses knowledge.

If you don't understand that, you're the tool.
anonzzzies, 3 months ago
I use it only for code, and that works very, very well; the rest it mostly gets terribly wrong, so I don't even bother.
kolinko, 3 months ago
"A year ago" - wasn't that GPT-3?
throw098320923, 3 months ago
Nobody forces me to "believe in" AI. Unlike other things!

Remember how authorities "hallucinated" that some medication was 100% effective with 0% side effects, despite scientific opposition and the warning leaflet included WITH the medication? And how you could go to jail for ignoring such hallucinations?
squarefoot, 3 months ago
And now someone wants to run an entire country with it and minimal supervision by inexperienced teens.
anonu3 months ago
Well it&#x27;ll get the answer right on the next Web scrape and training now...
pk-protect-ai3 months ago
Before GPT-3 was public, there was BLOOM 176B, and this model made my skin crawl because it was capable of answering &quot;I do not know.&quot; That was an experience of a lifetime. I was honestly impressed and at the same time scared.
yapyap3 months ago
Could barely have said it better myself, exactly.
scotty793 months ago
The author expects a single electronic &quot;brain&quot; to know everything that every human brain knows about everything even vaguely scientific or popular, or at least to know that it doesn&#x27;t know this thing off the top of its head.<p>If you asked a human this question they&#x27;d respond with &quot;let me find out and get back to you&quot;, educate themselves by accessing various knowledge repositories, and then formulate the answer.<p>And some AIs are possibly doing that even today, just not the ones freely accessible to everybody.<p>Maybe this one?<p><a href="https:&#x2F;&#x2F;blog.google&#x2F;products&#x2F;gemini&#x2F;google-gemini-deep-research&#x2F;" rel="nofollow">https:&#x2F;&#x2F;blog.google&#x2F;products&#x2F;gemini&#x2F;google-gemini-deep-resea...</a>
nbzso3 months ago
It starts with the falsehood of the Turing test, continues with the idea of branding the errors &quot;hallucinations&quot;, moves ahead with &quot;experts&quot; working for their salaries, bonuses and shares, and lands here: benchmarking a dataset statistically, emulating progress, and pushing us into an &quot;Intelligent Age&quot; while accelerating data collection, normalizing biometric surveillance, and hiding the incompetence and speculation that are the real reason behind mass layoffs.<p>A pipe dream ideal for the technocratic mafia (in 1932, Scott and fellow technocrat Walter Rautenstrauch formed the &quot;Committee on Technocracy&quot; at Columbia University) of Bilderberg members, Peter Thiel, Eric Schmidt and their pupils like Musk and co. Now they have carte blanche for AI governance and regulatory capture. Starlink, Starshield (Skynet), Stargate, Closed AI, Grok.<p>Married to the masters of the old system, they will deliver &quot;efficiency&quot; on a grand scale. Transhumanism is the &quot;inevitable&quot; future of destruction of human dignity, value and psyche.<p>And all of this through psyops for a text calculator with stochastic-parrot tendencies.<p>A calculator is a useful tool, but not intelligence in any form.<p>Machine learning is real, and the biggest danger is the tendency of a hyper-class of billionaires to apply surveillance and governance automation.<p>Ask Larry Ellison about his dream of citizens with the best behaviour. Not so different from China&#x27;s social credit system.
tompagenet23 months ago
I asked Perplexity (Pro, if it matters). Seems pretty solid to me. I&#x27;m not sure this article is making any point other than &quot;the knowledge base of static artificial intelligence systems is finite and they don&#x27;t express uncertainty well&quot;. Amusingly it links to this blog post, but you&#x27;ll see the sources it uses are largely not this post.<p>---<p>The reassignment of Brachiosaurus brancai to its own genus occurred through a series of steps by different paleontologists:<p>## Initial Classification In 1914, Werner Janensch first described the African species as Brachiosaurus brancai, considering it part of the existing Brachiosaurus genus[4].<p>## Path to Reclassification Gregory S. Paul made the first step toward separation in 1988 by proposing a subgenus classification, Brachiosaurus (Giraffatitan) brancai, after noticing significant skeletal differences[4][10].<p>George Olshevsky then elevated Giraffatitan to full genus status in 1991, though this change was largely disregarded by the scientific community at the time[4][10].<p>## Formal Establishment The definitive separation came in 2009 when Michael P. Taylor published a detailed study demonstrating 26 distinct osteological differences between the African and American species. This comprehensive analysis firmly established Giraffatitan as a separate genus, creating the combination Giraffatitan brancai[3][4]. 
From 2012 onward, most researchers have accepted this classification[10].<p>Citations: [1] <a href="https:&#x2F;&#x2F;svpow.com" rel="nofollow">https:&#x2F;&#x2F;svpow.com</a> [2] <a href="https:&#x2F;&#x2F;svpow.com&#x2F;2009&#x2F;09&#x2F;09&#x2F;brachiosaurus-brancai-is-not-brachiosaurus&#x2F;" rel="nofollow">https:&#x2F;&#x2F;svpow.com&#x2F;2009&#x2F;09&#x2F;09&#x2F;brachiosaurus-brancai-is-not-br...</a> [3] <a href="https:&#x2F;&#x2F;www.miketaylor.org.uk&#x2F;dino&#x2F;pubs&#x2F;taylor2009&#x2F;Taylor2009-brachiosaurus-and-giraffatitan.pdf" rel="nofollow">https:&#x2F;&#x2F;www.miketaylor.org.uk&#x2F;dino&#x2F;pubs&#x2F;taylor2009&#x2F;Taylor200...</a> [4] <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Giraffatitan" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Giraffatitan</a> [5] <a href="https:&#x2F;&#x2F;www.nhm.ac.uk&#x2F;discover&#x2F;dino-directory&#x2F;giraffatitan.html" rel="nofollow">https:&#x2F;&#x2F;www.nhm.ac.uk&#x2F;discover&#x2F;dino-directory&#x2F;giraffatitan.h...</a> [6] <a href="https:&#x2F;&#x2F;svpow.com&#x2F;2025&#x2F;02&#x2F;14&#x2F;if-you-believe-in-artificial-intelligence-take-five-minutes-to-ask-it-about-stuff-you-know-well&#x2F;" rel="nofollow">https:&#x2F;&#x2F;svpow.com&#x2F;2025&#x2F;02&#x2F;14&#x2F;if-you-believe-in-artificial-in...</a> [7] <a href="https:&#x2F;&#x2F;www.jurassic-pedia.com&#x2F;brachiosaurus-altithorax-sf&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.jurassic-pedia.com&#x2F;brachiosaurus-altithorax-sf&#x2F;</a> [8] <a href="https:&#x2F;&#x2F;blog.everythingdinosaur.com&#x2F;blog&#x2F;_archives&#x2F;2023&#x2F;03&#x2F;11&#x2F;giraffatitan-vs-brachiosaurus.html" rel="nofollow">https:&#x2F;&#x2F;blog.everythingdinosaur.com&#x2F;blog&#x2F;_archives&#x2F;2023&#x2F;03&#x2F;1...</a> [9] <a href="https:&#x2F;&#x2F;dinomuseum.ca&#x2F;2019&#x2F;05&#x2F;your-brachiosaurus-is-not-a-brachiosaurus" 
rel="nofollow">https:&#x2F;&#x2F;dinomuseum.ca&#x2F;2019&#x2F;05&#x2F;your-brachiosaurus-is-not-a-br...</a> [10] <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Brachiosaurus" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Brachiosaurus</a> [11] <a href="http:&#x2F;&#x2F;www.extinctblog.org&#x2F;extinct&#x2F;tag&#x2F;Wilhelm+von+Branca" rel="nofollow">http:&#x2F;&#x2F;www.extinctblog.org&#x2F;extinct&#x2F;tag&#x2F;Wilhelm+von+Branca</a> [12] <a href="https:&#x2F;&#x2F;www.tandfonline.com&#x2F;doi&#x2F;abs&#x2F;10.1671&#x2F;039.029.0309" rel="nofollow">https:&#x2F;&#x2F;www.tandfonline.com&#x2F;doi&#x2F;abs&#x2F;10.1671&#x2F;039.029.0309</a> [13] <a href="https:&#x2F;&#x2F;www.smithsonianmag.com&#x2F;science-nature&#x2F;a-baby-brachiosaur-125695940&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.smithsonianmag.com&#x2F;science-nature&#x2F;a-baby-brachio...</a> [14] <a href="https:&#x2F;&#x2F;pmc.ncbi.nlm.nih.gov&#x2F;articles&#x2F;PMC3045712&#x2F;" rel="nofollow">https:&#x2F;&#x2F;pmc.ncbi.nlm.nih.gov&#x2F;articles&#x2F;PMC3045712&#x2F;</a> [15] <a href="https:&#x2F;&#x2F;boasblogs.org&#x2F;dcntr&#x2F;the-brachiosaurus-brancai-in-the-natural-history-museum-berlin&#x2F;" rel="nofollow">https:&#x2F;&#x2F;boasblogs.org&#x2F;dcntr&#x2F;the-brachiosaurus-brancai-in-the...</a> [16] <a href="https:&#x2F;&#x2F;www.museumfuernaturkunde.berlin&#x2F;en&#x2F;visit&#x2F;exhibitions&#x2F;world-dinosaurs" rel="nofollow">https:&#x2F;&#x2F;www.museumfuernaturkunde.berlin&#x2F;en&#x2F;visit&#x2F;exhibitions...</a> [17] <a href="https:&#x2F;&#x2F;thepaintpaddock.wordpress.com&#x2F;brachiosaurus-altithorax&#x2F;" rel="nofollow">https:&#x2F;&#x2F;thepaintpaddock.wordpress.com&#x2F;brachiosaurus-altithor...</a> [18] <a href="https:&#x2F;&#x2F;www.researchgate.net&#x2F;publication&#x2F;242264129_A_ReEvaluation_of_Brachiosaurus_altithorax_Riggs_1903_Dinosauria_Sauropoda_and_Its_Generic_Separation_from_Giraffatitan_brancai_Janensch_1914" 
rel="nofollow">https:&#x2F;&#x2F;www.researchgate.net&#x2F;publication&#x2F;242264129_A_ReEvalu...</a> [19] <a href="https:&#x2F;&#x2F;www.tandfonline.com&#x2F;doi&#x2F;full&#x2F;10.1080&#x2F;02724634.2011.557115" rel="nofollow">https:&#x2F;&#x2F;www.tandfonline.com&#x2F;doi&#x2F;full&#x2F;10.1080&#x2F;02724634.2011.5...</a> [20] <a href="https:&#x2F;&#x2F;www.app.pan.pl&#x2F;archive&#x2F;published&#x2F;app68&#x2F;app011052023.html" rel="nofollow">https:&#x2F;&#x2F;www.app.pan.pl&#x2F;archive&#x2F;published&#x2F;app68&#x2F;app011052023....</a> [21] <a href="https:&#x2F;&#x2F;blog.everythingdinosaur.com&#x2F;blog&#x2F;_archives&#x2F;2008&#x2F;06&#x2F;22&#x2F;3756874.html" rel="nofollow">https:&#x2F;&#x2F;blog.everythingdinosaur.com&#x2F;blog&#x2F;_archives&#x2F;2008&#x2F;06&#x2F;2...</a> [22] <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Brachiosaurus" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Brachiosaurus</a><p>---
kristopolous3 months ago
I cycle between Qwen, Gemini, Deepseek, Claude, and OpenAI kinda regularly these days.<p>They each have &quot;personality defects&quot; and at least right now we&#x27;re in a time of ensembles.<p>Ask qwen to do some kind of product comparison btw. It&#x27;s impressive. The 02-05 Gemini is pretty impressive as well.<p>Expand beyond Claude and ChatGPT. There&#x27;s some good stuff out there.
GistNoesis3 months ago
The point not discussed here is where the information comes from.<p>Is it a primary or a secondary source [1]? And how should this new information be incorporated?<p>In their quest to build a &quot;truthful&quot; knowledge base, LLMs implicitly incorporate facts they read in their training dataset into their model weights. This weight-update mechanism lets them merge facts from different authorities together, compressing them rather than storing the same fact many times as a traditional database would.<p>This clustering of similar new information is the curse and the blessing of AI. It allows faster retrieval and memory-space reduction. Such updating is usually described via Bayes&#x27; rule, through something called &quot;belief propagation&quot;.<p>LLMs do this implicitly, and have not yet discovered that while belief propagation works most of the time, it is only guaranteed to work when the information graph has no more than one loop. Otherwise you get self-reinforcing behavior, where one source cites another and gives it credit, which in turn lends credit back to the first source, reinforcing a false fact much as link farms promote junk sites.<p>When you repeat false information to an LLM many times, you can make it accept it as truth. It is very susceptible to basic propaganda. LLMs can be a triple-store or a quad-store depending on how and on what they are trained.<p>But LLMs can also incorporate an error-correction mechanism. In this article, the LLM tried twice to correct itself and failed, but the blog author published an article which will be incorporated into the training dataset, giving the LLM another example of what it should have answered, provided the blog author is perceived as authoritative enough to be given credence.<p>This error-correction mechanism with a human in the loop can also be replaced by one that relies on self-consistency, where the LLM builds its own dataset, asks itself questions about the facts it knows, and tries to answer them from first principles.<p>For example, LLMs can use tools to retrieve the original papers, verify their time and date, and see who coined the term first and why. By reasoning, an LLM can create a rich graph of interconnected facts and look for incoherence by questioning itself. The richer the graph, the better information can flow along its edges.<p>Because LLMs are flexible, there is a difference between what they can do and what they do, depending on whether or not we trained them so that the behavior we desire emerges. If we don&#x27;t train them with a self-consistency objective, they will be prone to hallucinations. If we train them on human-feedback preferences, we will have sycophant AIs. If we train them on &quot;truth&quot;, we will have &quot;know it all&quot; AIs. If we train them on their own mirrors, we will have what we will have.<p>[1]<a href="https:&#x2F;&#x2F;www.wgu.edu&#x2F;blog&#x2F;what-difference-between-primary-secondary-source2304.html" rel="nofollow">https:&#x2F;&#x2F;www.wgu.edu&#x2F;blog&#x2F;what-difference-between-primary-sec...</a>
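The Bayes-rule updating this comment alludes to can be sketched in a few lines. This is a toy model of aggregating sources, not a description of how LLM weights actually change; the prior, likelihood ratio, and source count are made-up illustrative numbers. It shows the citation-loop failure mode: if five sources secretly copy each other but are treated as independent, naive updating drives belief in a false claim from 10% to over 96%:

```python
# Toy Bayesian updating of belief in a claim as seemingly independent
# sources repeat it. When sources actually copy one another, treating
# each repetition as fresh evidence inflates the posterior -- the
# self-reinforcing "one source cites another" loop described above.

def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Posterior odds = prior odds * likelihood ratio; return posterior probability."""
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

belief = 0.10  # initial credence in a false claim
for _ in range(5):  # five "independent" sources all repeating it
    belief = bayes_update(belief, likelihood_ratio=3.0)
print(round(belief, 3))  # → 0.964
```

Exact belief propagation carries this same update over a factor graph; it is only exact when the graph is loop-free, which is why the self-citation loops above poison it.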
ciconia3 months ago
I believe at this point it would not be inappropriate to say that any sufficiently advanced AI system is indistinguishable from bullshit, in the sense that bullshit is &quot;speech intended to persuade without regard for truth&quot; [1].<p>On a moral level, watching how tech bros are sucking it up to Trump&#x2F;Musk and how their companies are betting all their chips on the AI roulette, it all seems related.<p>[1] <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;On_Bullshit" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;On_Bullshit</a>
pinoy4203 months ago
Hmm. But it is not an oracle. I wonder whether, if you prompted it as an expert in palaeontology, it might perform better.<p>That said, I do wonder whether its corpus of training data contained that much information on the subject. It is rather niche, isn&#x27;t it - compared to cooking recipes, basic software development techniques of two years ago, or chemistry, maths and physics.<p>My friend is a leading research chemist, and he and one other person, in China, are working on this one particular field - individually - so there would be little information out there. I asked ChatGPT 4o to give an overview of what he was doing based on the little information I knew. He was astounded. It got it spot on. I asked it to elaborate and to come up with some new directions for research, and the ideas it spat out were those he had considered and more.
iamnotagenius3 months ago
LLMs are generative AIs and should be used as such: to generate a report from facts, to summarise an article, to translate from one language to another - anything where we agree to sacrifice accuracy for a gain in creativity. As stores of facts they are borderline awful.
computerthings3 months ago
(I shortened the answers, imagine this padded out a lot more with repetitious blurb and chipper nothingness)<p>what&#x27;s the best way to fire fast with a joystick that has no auto-fire function?<p>&gt; 1. Tap the Button Rapidly 2. Use a “Double Tap” Technique 3. Use a “Flick” Motion 4. Joystick or Button Placement 5. Software&#x2F;External Programs (If Playing on PC) 6. Practice<p>you missed the main one<p>&gt; Ah, I see! The main technique I missed is &quot;firing by holding the button down and rapidly alternating between pressing and releasing&quot;—often referred to as &quot;button mashing&quot; or the &quot;rapid toggle method.&quot; Here&#x27;s how it works: Press the fire button and hold it down. Quickly release and press it again repeatedly.<p>I&#x27;ll give you a hint, it involves making the elbow really stiff<p>&gt; Ah, got it! You&#x27;re talking about &quot;elbow mashing&quot;—a technique that involves using the elbow to press the joystick button. This method exploits your body mechanics for rapid, repetitive motion, leveraging the larger, stronger muscles of your arm to generate faster inputs.<p>------<p>Well, wow, that <i>is</i> stupid. It can&#x27;t even say &quot;I don&#x27;t know&quot;, it can&#x27;t even ask me! It&#x27;s just text vomit. A dog that can tilt its head in confusion can do more.<p>I couldn&#x27;t find the &quot;official&quot; name, if any, but here&#x27;s a redditor from 11 years ago explaining what I was looking for:<p>&gt; I have no idea how to explain this but I can do it, so I&#x27;m just going to wing an explanation and hope you understand&#x2F;do it. Pretty much try and &quot;lock up&quot; your elbow. If you force this feeling hard enough your arm or hand will start shaking. From here add a tiny bit of forward force in your wrist so that your hand shakes up and down rather than side to side.<p>I learned it from an older kid when playing on the Amiga. 
Sure, nothing is &quot;the best&quot; way, but nothing else <i>my</i> body is capable of can click faster, and any &quot;pro&quot; would mention this before just hallucinating insight with great confidence.