
DPO fine-tuned Mistral 7B beats Llama 70B on MT Bench

3 points by clmnt over 1 year ago

2 comments

amilios over 1 year ago
Can anyone corroborate this anecdotally? I.e. has anyone actually looked at the output of the two models side-by-side for common tasks? There's lots of talk these days about academic benchmarks being pretty "broken" for modern LMs, and not really properly showcasing the differences between models. I wonder if that's the case here or if the model is genuinely better.
Comment #37849556 not loaded
brucethemoose2 over 1 year ago
&gt; &lt;|system|&gt;, &lt;|user|&gt; and &lt;|model|&gt;

Oh hey, that's almost Metharme's format.

It must originate from an older model, as most new models don't use that syntax.
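For readers unfamiliar with what is being quoted: `<|system|>`, `<|user|>` and `<|model|>` are role tokens in a chat prompt template, where each turn of the conversation is prefixed with its speaker's tag. A minimal sketch of how such a prompt might be assembled (the token names come from the comment above; the helper function itself is hypothetical, not any model's official API):

```python
def build_prompt(messages):
    """Render a chat as role-tagged segments in the <|role|> style.

    messages: list of (role, text) pairs, role in {"system", "user", "model"}.
    The prompt ends with an open <|model|> tag so the model continues from there.
    """
    rendered = "".join(f"<|{role}|>{text}" for role, text in messages)
    return rendered + "<|model|>"

prompt = build_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "Summarize DPO in one sentence."),
])
# prompt == "<|system|>You are a helpful assistant."
#           "<|user|>Summarize DPO in one sentence.<|model|>"
```

The trailing open `<|model|>` tag is the usual convention for such templates: generation stops at the next role token, which acts as the turn delimiter.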