Large language models think too fast to explore effectively

118 points · by bikenaga · 4 months ago

6 comments

xerox13ster · 4 months ago
Maybe this is a really, really, really dumb idea: what if we refined and reinforced these extremely fast-thinking models on text written exclusively by stoners?

Stoners think more slowly, but their thoughts branch more widely. I find I do some of my most creative and explorative thinking when I am high.

If there really are mental-state effects in LLMs, such that telling a model it's June makes it perform worse, telling it it's January makes it perform better, or telling it you've had a good night's rest improves its output (because human writing is affected by our mental state, and our mental state is affected by physiological factors), then I wonder what happens if you do RLHF and fine-tuning on text from people under the influence of marijuana.
Jimmc414 · 4 months ago
Maps well to Kahneman's "Thinking Fast and Slow" framework.

System 1 thinking maps to early-layer processing of uncertainty in LLMs: quick, intuitive decisions, focused on uncertainty, happening in early transformer layers.

System 2 thinking maps to later-layer processing of empowerment (selecting elements that maximize future possibilities): strategic, deliberate evaluation that considers long-term possibilities, happening in later layers.

System 1 = 4o / Llama 3.1
System 1 + System 2 = o1 / R1 reasoning models

The empowerment calculation seems possibly oversimplified: it assumes a static value per element rather than dynamic, context-dependent empowerment.

Interesting that higher temperatures slightly improved performance for System 1 models, although they still made decisions before empowerment information could influence them.

Edit: removed the word "novel". The paper shows early-layer processing of uncertainty vs. later-layer processing of empowerment.
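A minimal toy sketch of the distinction the comment draws (my own illustration, not code from the paper; the element names and recipe table are invented): a static empowerment score counts everything an element could ever produce, while a context-dependent score counts only combinations that are actually reachable from the current inventory.

```python
# Toy sketch: static vs. context-dependent empowerment for element selection.
# The recipe table and elements are hypothetical, purely for illustration.

RECIPES = {
    frozenset({"water", "fire"}): "steam",
    frozenset({"water", "earth"}): "mud",
    frozenset({"steam", "earth"}): "geyser",
}

def static_empowerment(element):
    """Count every recipe the element participates in, ignoring context."""
    return sum(1 for pair in RECIPES if element in pair)

def contextual_empowerment(element, inventory):
    """Count only recipes reachable right now: the partner must already be
    discovered and the product must still be new."""
    score = 0
    for pair, product in RECIPES.items():
        if element in pair and product not in inventory:
            partner = next(iter(pair - {element}), element)
            if partner in inventory:
                score += 1
    return score

inventory = {"water", "fire", "earth", "steam"}
for e in sorted(inventory):
    print(e, static_empowerment(e), contextual_empowerment(e, inventory))
```

With this toy inventory, "water" keeps a static score of 2 but a contextual score of 1, since "steam" has already been discovered; that gap is roughly what the comment means by static vs. dynamic empowerment.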
kadushka · 4 months ago
From the abstract: "Results show most LLMs underperform compared to humans, except for the o1 model."
hulitu · 4 months ago
> Large language models think

Really? Can they ask pertinent questions?
brotchie · 4 months ago
Open question for LLMs: does creativity, the generation of new ideas, come from a single process, or is it a laddered emergent capability?

What I mean is: is the process of coming up with novel ideas a single capability that has to be trained and reinforced? Or is it a ladder of capabilities of increasing complexity, such that a model that could figure out General Relativity from scratch might not be able to continue the process and come up with a viable "theory of everything"?

One thing I've wanted to do (I'm sure somebody has tried it) is build a dataset to RL a model to be more creative: get a human expert in a field, have them ask a reasoning model some open questions, then have the expert look at 20 outputs and rank them by creativity / insight. Have the expert iterate and see how much new "insight" they can mine from the model. Do this across many fields, and then train a model on these rankings.

Perhaps creativity is a different way of moving in latent space which is "ablated" from existing models because they're tuned to be "correct" rather than "creative."

I'm also curious what techniques there are for sampling a reasoning model so as to deliberately perturb its internal state into more creative realms. Though there's a fine line between insight and hallucination.

In some respects creativity is hallucination. As a human, you're effectively internally postulating creative ideas ("hallucinations"), and then one of them "hits" and fires a whole bunch of neurons which indicate: "OK, that wild idea actually has grounding and strong connections to the existing knowledge in your brain."
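A minimal sketch of the ranking-to-reward-model step in the proposal above (my own illustration, not an existing pipeline; the embeddings, expert ranking, and dimensions are fabricated): the expert's ordering of the 20 outputs is expanded into pairwise preferences, and a small scorer is fit with a Bradley-Terry style loss, which could then serve as the "creativity" reward signal for RL.

```python
# Sketch: turn an expert's creativity ranking into pairwise preferences and
# fit a tiny reward model. All data here is fake; answers are assumed to be
# pre-embedded (e.g. by a frozen language model).
import torch
import torch.nn as nn

EMBED_DIM = 16  # hypothetical embedding size

class CreativityRewardModel(nn.Module):
    def __init__(self, dim=EMBED_DIM):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):
        return self.score(x).squeeze(-1)  # scalar creativity score per answer

def ranking_to_pairs(embeddings, expert_ranking):
    """expert_ranking lists answer indices from most to least creative;
    every (better, worse) pair becomes one training example."""
    pairs = []
    for i, better in enumerate(expert_ranking):
        for worse in expert_ranking[i + 1:]:
            pairs.append((embeddings[better], embeddings[worse]))
    return pairs

# Fake data: one prompt, 5 sampled answers, one expert ranking of them.
answers = torch.randn(5, EMBED_DIM)
expert_ranking = [2, 0, 4, 1, 3]  # index 2 judged most creative, index 3 least

model = CreativityRewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):
    loss = torch.tensor(0.0)
    for better, worse in ranking_to_pairs(answers, expert_ranking):
        # Bradley-Terry: push score(better) above score(worse)
        loss = loss - torch.nn.functional.logsigmoid(model(better) - model(worse))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In a real version, each prompt would come from a different field and expert, and the trained scorer would replace (or augment) the correctness-oriented reward during RL fine-tuning.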
otabdeveloper4 · 4 months ago
Large language models don't think at all. Please stop with the crude marketing bullshit.