The tide is shifting: 1.3B outperforms 7B Llama 2

63 points by __vec__ over 1 year ago

5 comments

brucethemoose2 over 1 year ago
> we are seeing improvement on that front thanks to the absence of web data

Bingo. This (not the parameter count) is the amazing thing to me.

Garbage in, garbage out, and there is a *ton* of garbage in the Falcon/Llama (and OpenAI?) datasets. It feels like such a waste of compute and parameter space.
__vec__ over 1 year ago
Textbooks Are All You Need II: phi-1.5 technical report

"Perhaps achieving ChatGPT's level of capability at the one billion parameters scale is actually achievable?"
YetAnotherNick over 1 year ago
Llama generally achieves much higher accuracy on a lot of tasks with a very small amount of fine-tuning (on data of similar quality to this paper's). So the understanding needed for higher accuracy is already present in Llama. E.g., HellaSwag, ARC, and MMLU for the 7B model are 0.8, 0.57, and 0.52 respectively [0], while phi-1 scores 0.48, 0.45, and 0.38.

I don't think fine-tuning phi-1 on good-quality synthetic data will increase its accuracy, as it is only trained on that kind of data already.

[0]: https://huggingface.co/pankajmathur/orca_mini_v3_7b
matteoraso over 1 year ago
The model can be downloaded here: https://huggingface.co/microsoft/phi-1_5
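
For reference, a minimal sketch of loading it with the Hugging Face transformers library; the exact loading path and the trust_remote_code flag are assumptions based on standard practice, not something stated in the thread:

  # Minimal sketch: load phi-1.5 from the Hugging Face Hub and generate a completion.
  # Assumes the standard AutoModel path; trust_remote_code may or may not be needed
  # depending on your transformers version.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "microsoft/phi-1_5"
  tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
  model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

  inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=64)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))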
nbardy over 1 year ago
The title is clickbait and not the title of the paper.