BitNet b1.58 2B4T Technical Report

111 points by galeos about 1 month ago

8 comments

nopelynopington about 1 month ago
I built it at home this morning and tried it. Perhaps my expectations were too high, but I wasn't terribly impressed. I asked it for a list of ten types of data I might show on a home info display panel. It gave me three. I clarified that I wanted ten, and it gave me six. Every request after that just returned the same six things.

I know it's not ChatGPT-4, but I've tried other very small models that run on CPU only and had better results.
akoboldfrying about 1 month ago
They give some description of how their weights are stored: they pack 4 weights into an int8, indicating that their storage format isn't optimal (2 bits per weight instead of the optimal ~1.58 bits). But I don't know enough about LLM internals to know how material this is.

Could anyone break down the steps further?
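To make the 2-bit versus ~1.58-bit comparison concrete, here is a minimal sketch of packing ternary weights (-1, 0, +1). The byte layout, the -1/0/+1 to 0/1/2 mapping, and all function names are assumptions for illustration, not the format the report actually uses. The base-3 variant is a different, denser scheme shown only for comparison: it fits five weights in a byte because 3^5 = 243 ≤ 256.

```python
import math

# A ternary weight carries log2(3) bits of information (about 1.585).
BITS_OPTIMAL = math.log2(3)

def pack_2bit(weights):
    """Pack four ternary weights per byte, 2 bits each (assumed layout)."""
    assert len(weights) % 4 == 0
    out = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            byte |= (w + 1) << (2 * j)  # map -1, 0, +1 to 0, 1, 2
        out.append(byte)
    return bytes(out)

def unpack_2bit(data):
    """Invert pack_2bit."""
    return [((b >> (2 * j)) & 0b11) - 1 for b in data for j in range(4)]

def pack_base3(weights):
    """Pack five ternary weights per byte as a 5-digit base-3 number."""
    assert len(weights) % 5 == 0
    out = bytearray()
    for i in range(0, len(weights), 5):
        value = 0
        for w in reversed(weights[i:i + 5]):
            value = value * 3 + (w + 1)  # first weight ends up as lowest digit
        out.append(value)  # at most 242, so it fits in one byte
    return bytes(out)

def unpack_base3(data):
    """Invert pack_base3."""
    weights = []
    for b in data:
        for _ in range(5):
            weights.append(b % 3 - 1)
            b //= 3
    return weights

# 20 hypothetical weights (divisible by both 4 and 5).
weights = [-1, 0, 1, 1, 0, -1, -1, 0, 1, 0] * 2
bits_2bit = 8 * len(pack_2bit(weights)) / len(weights)    # 2.0 bits/weight
bits_base3 = 8 * len(pack_base3(weights)) / len(weights)  # 1.6 bits/weight
```

Under these assumptions, 2-bit packing uses roughly 26% more storage than the log2(3) bound, while the 2-bit fields can be extracted with a shift and a mask rather than repeated division by 3.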
Havoc about 1 month ago
Is there a reason why the 1.58 ones are always aimed at quite small models? I think I've seen an 8B, but that's about it.

Is there a technical reason for it, or is it just research convenience?
galeos about 1 month ago
You can try out the model in a demo they have set up: https://bitnet-demo.azurewebsites.net/
Thoreandan about 1 month ago
I guess B1FF@BITNET posts are gonna come from an LLM now.

Context: https://web.archive.org/web/20030830105202/http://www.catb.org/esr/jargon/html/B/B1FF.html
balazstorok about 1 month ago
Does someone have a good understanding of how 2B models can be useful in production? What tasks are you using them for? I wonder what tasks you can fine-tune them on to produce 95-99% results (if anything).
rbanffy about 1 month ago
Not to be confused with BITNET

https://en.m.wikipedia.org/wiki/BITNET
rcMgD2BwE72F about 1 month ago
I asked about the last French election, and the first sentence was:

> Marine Le Pen, a prominent figure in France, won the 2017 presidential election despite not championing neoliberalism. Several factors contributed to her success: (…)

What data did they train their model on?