Lessons from the trenches on reproducible evaluation of language models
42 points by veryluckyxyz 12 months ago, 1 comment
jerpint
12 months ago
One point they don’t seem to spend much time on is the difficulty of reproducing outputs from closed-source models. Setting the temperature to 0 and fixing the seed isn’t always enough to get exactly the same result for a given prompt.
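A rough way to check this for yourself is to send the same prompt several times and count how many distinct outputs come back. The sketch below is only illustrative: `generate` is a hypothetical callable standing in for whatever client call you use (with temperature 0 and a fixed seed already set inside it), and the helper names are made up for this example rather than taken from any library.

```python
from collections import Counter
from typing import Callable


def count_distinct_outputs(
    generate: Callable[[str], str], prompt: str, runs: int = 5
) -> Counter:
    """Call `generate` `runs` times with the same prompt and tally each output.

    `generate` is a hypothetical wrapper around a model call; it is assumed
    to already pin temperature, seed, and any other sampling settings.
    """
    return Counter(generate(prompt) for _ in range(runs))


def is_reproducible(
    generate: Callable[[str], str], prompt: str, runs: int = 5
) -> bool:
    """Return True only if every run produced the exact same string."""
    return len(count_distinct_outputs(generate, prompt, runs)) == 1


if __name__ == "__main__":
    # Stand-in for a real model call, used here only so the script runs.
    def fake_model(prompt: str) -> str:
        return prompt.upper()

    tally = count_distinct_outputs(fake_model, "hello", runs=3)
    print(tally)
    print("reproducible:", is_reproducible(fake_model, "hello", runs=3))
```

Exact string comparison is deliberately strict here. If small whitespace or formatting differences don't matter for your evaluation, you would want to normalize the outputs before counting them.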