Maybe this is a really, really, really dumb idea:<p>What if we started refining and reinforcing these extremely fast-thinking models with text written exclusively by stoners?<p>Stoners think slower, but they have wider, more branching thoughts. I find I do some of my most creative and explorative thinking when I'm high.<p>There are apparent effects of time, season, and energy in LLMs: tell a model it's June and it performs worse, tell it it's January and it performs better, or tell it you got a good night's rest and it does better. That follows from the fact that human writing is affected by our mental state, and our mental state is affected by physiological factors.<p>So then, I wonder what happens if you do RLHF and fine-tuning on text from people under the influence of marijuana.
Maps well to Kahneman's "Thinking Fast and Slow" framework.<p>System 1 thinking maps to early-layer processing of uncertainty in LLMs: quick, intuitive decisions, focused on uncertainty, happening in the early transformer layers.<p>System 2 thinking maps to later-layer processing of empowerment (selecting elements that maximize future possibilities): strategic, deliberate evaluation that considers long-term possibilities, happening in the later layers.<p>System 1 = 4o/Llama 3.1<p>System 1 + System 2 = o1/R1 reasoning models<p>The empowerment calculation seems possibly oversimplified - it assumes a static value per element rather than dynamic, context-dependent empowerment.<p>Interesting that higher temperatures slightly improved performance for System 1 models, although they still made decisions before the empowerment information could influence them.<p>edit: removed the word "novel". The paper shows early-layer processing of uncertainty vs. later-layer processing of empowerment.
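To make the static-vs-dynamic critique concrete, here's a toy sketch in Python. Everything in it is invented for illustration (the element names, recipes, and scoring functions are not from the paper): a static score values an element the same regardless of context, while a context-dependent score counts what the element can actually unlock from the current inventory.

```python
# Toy illustration (hypothetical recipes, not from the paper):
# static vs. context-dependent "empowerment" in an element-combination game.

RECIPES = {
    ("water", "fire"): "steam",
    ("water", "earth"): "mud",
    ("steam", "earth"): "geyser",
}

def static_empowerment(element):
    """Static score: count recipes that mention the element at all,
    ignoring what the agent currently holds."""
    return sum(element in pair for pair in RECIPES)

def dynamic_empowerment(element, inventory):
    """Context-dependent score: count *new* elements this element can
    actually unlock given the current inventory."""
    reachable = set()
    for partner in inventory | {element}:
        for pair in ((element, partner), (partner, element)):
            result = RECIPES.get(pair)
            if result and result not in inventory:
                reachable.add(result)
    return len(reachable)

inventory = {"water", "fire", "steam"}
for e in ("water", "earth"):
    print(e, static_empowerment(e), dynamic_empowerment(e, inventory))
# "water" scores high statically but unlocks nothing new from this
# inventory, while "earth" scores lower statically but unlocks two
# new elements - the two measures can rank elements oppositely.
```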
Open question for LLMs: do creativity and new ideas come from a process, or are they a laddered emergent capability?<p>What I mean by this: is the process of coming up with novel ideas a single capability that has to be trained and reinforced?<p>Or is it a ladder of capabilities of increasing complexity, such that a model that could figure out General Relativity from scratch would not be able to continue the process and come up with, say, a viable "theory of everything"?<p>One thing I've wanted to do (I'm sure somebody has tried it) is build a dataset to RL a model to be more creative: get a human expert in a field, have them ask a reasoning model some open questions, and then have the expert look at 20 outputs and rank them by creativity/insight. Have the expert iterate and see how much new "insight" they can mine from the model.<p>Do this across many fields, and then train a model on these rankings (a rough sketch of that ranking-to-reward step is below).<p>Perhaps creativity is a different way of moving through latent space which is "ablated" from existing models because they're tuned to be "correct" rather than "creative."<p>Also curious what techniques there are for sampling a reasoning model in ways that deliberately perturb its internal state into more creative realms. Though there's a fine line between insight and hallucination.<p>In some respects, creativity is hallucination. As a human, you're effectively internally postulating creative ideas ("hallucinations"), and then one of them "hits" and fires a whole bunch of neurons which indicate: "OK, that wild idea actually has grounding and strong connections to the existing knowledge in your brain."
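The expert-ranking idea maps directly onto standard preference-based reward modeling. Here is a minimal, hedged sketch under assumed placeholders (random tensors stand in for real output embeddings; the 768-dim featurizer and `CreativityRewardModel` are hypothetical): turn each expert ranking into pairwise preferences and fit a scorer with a Bradley-Terry loss, as in RLHF reward models.

```python
# Sketch: fit a "creativity" reward model from expert rankings via a
# pairwise Bradley-Terry loss. All names and dimensions are placeholders.

from itertools import combinations
import torch
import torch.nn as nn
import torch.nn.functional as F

class CreativityRewardModel(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # scalar creativity score per output

    def forward(self, emb):  # emb: (batch, dim)
        return self.score(emb).squeeze(-1)

model = CreativityRewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Placeholder data: 20 outputs for one prompt, embedded and ranked
# best-to-worst by the expert (random tensors stand in for embeddings).
ranked = [torch.randn(1, 768) for _ in range(20)]

# Every (earlier, later) pair in the ranking is a preference pair.
for better, worse in combinations(ranked, 2):
    r_better, r_worse = model(better), model(worse)
    # Bradley-Terry: maximize P(better > worse) = sigmoid(r_better - r_worse)
    loss = -F.logsigmoid(r_better - r_worse).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The trained scorer could then serve as the reward signal for RL fine-tuning, or simply drive best-of-N sampling that selects for "creative" rather than "correct."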