TechEcho
Understanding Emergent Abilities of Language Models from the Loss Perspective
2 points by veryluckyxyz, about 1 year ago, 1 comment
cosmojg, about 1 year ago
Does this mean that "overtraining" a midsize LLM for many more epochs on a small, representative subset of the dataset used by a larger, more performant LLM might be sufficient for matching the performance of the larger model?