Fun post! Back during the holidays we wrote one where we abused temperature AND structured output to approximate a random selection: https://bits.logic.inc/p/all-i-want-for-christmas-is-a-random
Wouldn’t any randomness (for a fixed combination of hardware and weights) be a result of the temperature and any randomness inserted at inference time?

Otherwise, a heads/tails comparison is just a proxy for the underlying token probabilities and the temperature configuration (plus hardware differences for a remote-hosted model).
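To make that concrete, here is a minimal sketch of how temperature reshapes token probabilities before sampling. The logits are made-up values for a hypothetical two-token choice, not anything a real model produced:

    import numpy as np

    # Made-up next-token logits for ["heads", "tails"]; real models
    # produce logits over the whole vocabulary.
    logits = np.array([2.0, 1.2])

    def token_probs(logits, temperature):
        # Temperature divides the logits before the softmax: values
        # below 1 sharpen the distribution, values above 1 flatten it.
        scaled = logits / temperature
        exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
        return exp / exp.sum()

    for t in (0.1, 1.0, 2.0):
        print(t, token_probs(logits, t))

At low temperature the model all but always emits the higher-logit token; as temperature rises the split approaches 50/50. Which is exactly why any "randomness" measured this way is really a measurement of the logits plus the temperature setting.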
One thing to consider: we don’t know if these LLMs are wrapped with server-side logic that injects randomness (e.g. using actual code or external RNG). The outputs might not come purely from the model's token probabilities, but from some opaque post-processing layer. That’s a major blind spot in this kind of testing.
Is randomness even possible?
You can't technically prove randomness, only observe it; the best you can say is that the output is likely close to random. They talk a little about this at https://www.random.org/#learn
Author here. I know 0-10 contains one extra even number (six evens vs. five odds). I also just did this for fun, so don't take the statistical-significance aspect of it very seriously. To do this more rigorously you'd also need to run it many times across multiple temperature and top_p values.
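For anyone who does want the rigor: a quick sketch of the test I'd reach for, SciPy's exact binomial test. The counts here are placeholders, not the post's actual data:

    from scipy.stats import binomtest

    # Hypothetical tally: 61 heads out of 100 flips.
    result = binomtest(k=61, n=100, p=0.5)
    print(result.pvalue)  # chance of a deviation at least this large from a fair coin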
Oh, surprising that Claude can do heads/tails.

In a project last year, I combined LLMs with a list of random numbers from a quantum computer. Random numbers are the only useful thing quantum computers can produce, and they are one thing LLMs are terrible at.
During my tenure at NVIDIA I met a guy who was working on special versions of the kernels that would make them deterministic.

Otherwise, parallel floating-point computations like these are not going to be perfectly deterministic, due to a combination of two factors. First, the order of some operations will vary due to all sorts of environmental conditions such as temperature variations. Second, floating-point operations like addition are not associative, which surprises people unfamiliar with how they work.

That is before we even talk about the temperature setting on LLMs.
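The non-associativity is easy to see in any language with IEEE 754 doubles, for instance:

    # Floating-point addition is not associative: grouping changes the result.
    a, b, c = 1e20, -1e20, 1.0
    print((a + b) + c)  # 1.0
    print(a + (b + c))  # 0.0: the 1.0 is lost to rounding when added to -1e20

So if a parallel reduction sums its partial results in a different order between runs, the final bits can differ.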
What I find more important is the ability to get reproducible results for testing.

I don't know about other LLMs, but Cohere allows setting a seed value. With the same seed it will always give you the same result for a given prompt (unless, of course, the LLM gets an update).

OTOH I would guess that they normally just generate a random seed server-side when processing a prompt, and how random that really is depends on their random number generator.
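As a rough sketch of what that looks like, assuming the current Cohere Python SDK (the exact parameter names and response shape may differ by version):

    import cohere

    co = cohere.ClientV2(api_key="YOUR_API_KEY")  # placeholder key

    def flip(seed):
        resp = co.chat(
            model="command-r",  # assumed model name
            messages=[{"role": "user", "content": "Flip a coin: heads or tails?"}],
            seed=seed,  # same seed should reproduce the same sampling path
        )
        return resp.message.content[0].text

    print(flip(42))
    print(flip(42))  # expected to match the first call, barring a model update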
I would suggest they repeat the experiment including answer sets for both "choose heads or tails" AND "choose tails or heads", and likewise for the numbers. Better yet, rephrase the question so it doesn't present a "choice" at all: ask for "a random integer" rather than "choose from 0 to 9". (Incidentally, they're asking to choose from 0 to 10 inclusive, which is inherently flawed, as the even subset is bigger in that range.)
Is the LLM reset between each event?

If LLMs are anything like people, I would expect a different result depending on that. The idea that random events are independent is very unintuitive to us, resulting in what we call the Gambler's Fallacy. LLMs' attempts at randomness are very likely to be just as biased, if not more.
They should measure at different temperatures: at 0 the output will be the same every time, but it would be interesting to see how the results change for temperatures from 0.01 to 2. That said, I'm not sure temperature is implemented the same way in all LLMs.
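A sweep like that is only a few lines against any chat API. A hedged sketch using the OpenAI Python client, where the model name and prompt are just placeholders:

    from collections import Counter
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def tally(temperature, trials=50):
        counts = Counter()
        for _ in range(trials):
            resp = client.chat.completions.create(
                model="gpt-4o-mini",  # placeholder model name
                messages=[{"role": "user",
                           "content": "Pick heads or tails. Answer with one word."}],
                temperature=temperature,
            )
            counts[resp.choices[0].message.content.strip().lower()] += 1
        return counts

    for t in (0.01, 0.5, 1.0, 1.5, 2.0):
        print(t, tally(t))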
I'd be interested to see the bias in random character generation. That's something closer to the home domain of LLMs, seeing that they're 'next word generators' (based on probability).

How cryptographically secure would an LLM-based RNG seed generator be?
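Measuring that bias is straightforward once you have a transcript of single-character answers. A sketch using a Pearson chi-square statistic against a uniform distribution (the sample string is placeholder data, not real model output, and a real run would want many samples per letter for the approximation to hold):

    from collections import Counter

    # Placeholder transcript of answers to "pick a random lowercase letter".
    samples = list("qzjrskmmnbvtyqpzxjkwq")

    counts = Counter(samples)
    n = len(samples)
    expected = n / 26  # uniform expectation over a-z

    # Pearson chi-square against uniform; compare with 25 degrees of freedom.
    chi2 = sum((counts.get(c, 0) - expected) ** 2 / expected
               for c in "abcdefghijklmnopqrstuvwxyz")
    print(chi2)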
LLMs are acting like humans, and I believe humans have biases too if you ask them to do random things :)

On a more serious note, you could always raise the temperature so they behave more randomly.
This is silly. Behind an LLM sits a deterministic algorithm. So no, it is not possible without inserting randomness into the algorithm by other means, for example by sampling with a nonzero temperature.

Why are all these posts and news about LLMs so uninformed? This is human-built technology. You can actually read up on how these things work. And yet they are treated as if they were an alien species that must be examined by sociological means and methods where it is not necessary. Grinds my gears every time :D