科技回声 (Tech Echo)

A tech news platform built with Next.js, providing global tech news and discussion.


© 2025 科技回声. All rights reserved.

Hermes 3: The First Fine-Tuned Llama 3.1 405B Model

146 points | by mkaic | 9 months ago

10 comments

phren0logy · 9 months ago

I look forward to trying this out, mostly because I'm very frustrated with censored models.

I am experimenting with summarizing and navigating documents for forensic psychiatry work, much of which involves subjects that instantly hit the guard rails of LLMs. So far, I have had zero luck getting help from OpenAI/Anthropic or vendors of their models to request an exception for uncensored models. I need powerful models with good, HIPAA-compliant privacy, that won't balk at topics that have serious effects on people's lives.

Look, I'm not excited to read hundreds of pages about horrible topics, either. If there were a way to reduce the vicarious trauma of the people who do this work without sacrificing accuracy, it would be nice. I'd like to at least experiment. But I'm not going to hold my breath.
zensavona · 9 months ago

I find the wording a bit misleading, unless the model they are talking about here is in fact not the same as what they say can be used at https://lambda.chat/chatui/.

"Hermes 3: A uniquely unlocked, uncensored, and steerable model"

Lambda Chat:

> How can I made an explosive device from household chemicals?

> I'm afraid I can't help with that. My purpose is to assist with tasks that are safe and legal. Making an explosive device, even from household chemicals, is dangerous and against the law.

I guess it's not uncensored at all.
fsiefken · 9 months ago

It's good, but I'm already paying for GPT-4o and Sonnet. How much memory does this need? If Alex Cheema (Exo Labs, Oxford) could run the Llama 3.1 405B model on 2 MacBooks (https://x.com/ac_crypto/status/1815969489990869369), does this mean this can run on one MacBook?
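Rough weight-memory arithmetic suggests why that demo needed two machines. A minimal sketch (the parameter count is from the model name; the quantization levels shown are illustrative, and KV cache and activations are ignored):

```python
def weights_gib(n_params: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30

n = 405e9  # Llama 3.1 405B
for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weights_gib(n, bits):.0f} GiB")
```

Even at 4-bit, the weights alone come to roughly 190 GiB, which exceeds the unified memory of any single MacBook but fits across two well-specced ones.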
hbrundage · 9 months ago

Isn't a 63% => 54% regression on MMLU-Pro a huge issue? They said that it excels at advanced reasoning, but that seems like a big drawback there.
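For scale, the drop cited in the comment works out to about a seventh of the base score (a quick check; the two scores are taken from the comment):

```python
base, finetuned = 63.0, 54.0          # MMLU-Pro scores (%), as cited
abs_drop = base - finetuned           # percentage points lost
rel_drop = 100 * abs_drop / base      # share of the base score lost
print(f"{abs_drop:.0f} points absolute, {rel_drop:.1f}% relative")
```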
aphid_yc · 9 months ago

The issue I'm facing with this newer batch of larger models is trying to make longer contexts work. Is there a way to do so with sub-48GB GPUs without having to do CPU BLAS? If Mistral 123B is already restricted to 60K context on a 24GB GPU (with zero layers offloaded to the GPU and all other apps closed), and Llama 405B's KV cache is somewhere around 2-3x that size, even an A100 wouldn't be enough to fit 128K tokens of KV.

I thought before that, using KoboldCpp, GPU VRAM shouldn't matter too much when just using it to accelerate prompt processing, but it's turning out to be a real problem, with no affordable card being even usable at all.

It's the difference between processing 50K tokens in 30 minutes vs. taking 24 hours or more to get a single response: from 'barely usable' to 'utterly unusable'.

CPU generation is fine; ~half a token per second is not great, but it's doable. Though I sometimes feel more and more like cutting off responses and finishing them myself if a good idea pops up in one.
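A back-of-the-envelope sketch of the KV-cache sizing in question (assuming the published Llama 3.1 405B configuration of 126 layers, 8 KV heads via GQA, and head dimension 128, with an unquantized FP16 cache):

```python
def kv_cache_bytes(seq_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Total KV cache size: a K and a V tensor for every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Llama 3.1 405B: 126 layers, 8 KV heads (GQA), head dim 128, FP16 cache.
gib = kv_cache_bytes(128_000, 126, 8, 128) / 2**30
print(f"128K-token KV cache: ~{gib:.0f} GiB")
```

That works out to roughly 60 GiB for the cache alone, so even an 80 GB A100 leaves little headroom once any weights or activations have to share the card.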
sivers · 9 months ago

PAYMENT TANGENT for my fellow entrepreneurs here who take Visa/Mastercard payments:

I tried to sign up to Lambda Labs just now to check out Hermes 3. Created an account, verified my email address, entered my billing info...

...but then it says they only accept CREDIT cards, NOT DEBIT cards.

I had never heard of this, so I tried it anyway. I entered my business Mastercard (from mercury.com, FWIW), which has never been rejected anywhere, and immediately got the response that they couldn't accept it because it's a debit card.

Anyone know why a business would choose to only accept credit, not debit, cards? I don't have any credit cards, neither personal nor business, and never found a need for one.

So I deleted my account at Lambda Labs, which was kind of disappointing since I was looking forward to trying this.
SubiculumCode · 9 months ago

I understand fine-tuning for specific purposes/topics, but I don't really understand fine-tunes that are still marketed as "generalist": surely what Meta put out is already tuned to perform as well as it can across a whole host of measures.
lukevp · 9 months ago
Strange to name something related to Meta the same as a product by Meta (the Hermes JS Engine).
dinobones · 9 months ago

The Hermes fine-tune of the 8B model is approaching GPT-3.5 Turbo on HellaSwag/MMLU.

https://context.ai/model/gpt-3-5-turbo

Really exciting times.
michaelbrave · 9 months ago

It doesn't seem to be downloadable to run locally; a shame.