TE
TechEcho
Home
24h Top
Newest
Best
Ask
Show
Jobs
English
GitHub
Twitter
Home
Direct Preference Optimization: Your Language Model Is a Reward Model
3 points
by
ntonozzi
almost 2 years ago
no comments
no comments