TechEcho

Three major LLM releases in 24 hours

129 points · by helloplanets · about 1 year ago

8 comments

loudmax · about 1 year ago

For those primarily interested in open weight models, that Mixtral 8x22B is really intriguing. The Mistral models have tended to outperform other models with similar parameter counts.

Still, 281GB is huge. That's at the higher end of what we see from other open weight models, and it's not going to fit on anybody's homelab franken-GPU rig. Assuming that 281GB is fp16, it should quantize down to roughly 70GB at 4 bits. Still too big for any consumer-grade GPU, but accessible on a workstation with enough system RAM. Mixtral 8x7B runs surprisingly fast, even on CPUs. Hopefully this 8x22B model will perform similarly.

EDIT: Available here in GGUF format: https://huggingface.co/MaziyarPanahi/Mixtral-8x22B-v0.1-GGUF

The 2-bit quantization comes to 52GB, so worse than my napkin math suggested. Looking forward to giving it a try on my desktop though.
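The napkin math in that comment can be sketched as a quick estimate. Note this is a rough lower bound: real quantized files (e.g. GGUF) carry per-block scale and zero-point metadata on top of the packed weights, which is part of why the published 2-bit file (52GB) is larger than the naive figure.

```python
def quantized_size_gb(fp16_size_gb: float, bits: int) -> float:
    """Naive size estimate: weights scale with bits per parameter
    relative to fp16 (16 bits). Ignores quantization metadata."""
    return fp16_size_gb * bits / 16

est_4bit = quantized_size_gb(281, 4)  # ~70 GB, matching the comment's estimate
est_2bit = quantized_size_gb(281, 2)  # ~35 GB naive; the real GGUF is 52 GB
```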
lambdaba · about 1 year ago

Does anyone know what's up with the models that aren't available in Europe? There isn't any transparency about this.
mg · about 1 year ago

Are any of these stable? I mean, when using temperature=0, do you get the same reply for the same prompt?

I am using gpt-4-1106-preview quite a lot, but it is hard to optimize prompts when you cannot build a test suite of questions and correct replies against which you can test and improve the instruction prompt. Even when using temperature=0, gpt-4-1106-preview outputs different answers for the same prompt.
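The stability check described there can be sketched as a small harness: call the same prompt repeatedly and count distinct outputs. The `generate` callable is a stand-in for whatever the test suite wraps, e.g. an API call with temperature=0 (and a fixed seed, where the provider supports one); the lambda below is just a deterministic placeholder so the sketch runs on its own.

```python
from collections import Counter

def check_determinism(generate, prompt: str, runs: int = 5):
    """Call `generate` repeatedly with the same prompt and count
    distinct outputs. Returns (is_stable, Counter of outputs)."""
    outputs = Counter(generate(prompt) for _ in range(runs))
    return len(outputs) == 1, outputs

# A deterministic stand-in passes the check; a real endpoint at
# temperature=0 may still fail it, which is the behavior described above.
stable, counts = check_determinism(lambda p: p.upper(), "hello")
```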
jeswin · about 1 year ago

Does Gemini have a prepaid mode?

I like that both OpenAI and Anthropic default to the prepaid mode; I can safely experiment without worrying about selecting a large file by mistake (or worse, a runaway automated process).
novaRom · about 1 year ago

Cohere's Command R+ is an unimpressive model: it agrees with me every time I try to push back with something like "But are you sure? ...", and it also reports "last update in January 2023".

Mixtral 8x22B is interesting because 8x7B was one of the best models overall for me a few months ago, in particular for common knowledge, engineering and high-level math, and multilingual skills like translation and grammatically nicer rewrites.
tarruda · about 1 year ago

One of the most attractive features of Mistral's open models is that you can build a product on top of their API and switch to a self-hosted version if the need arises, such as a customer requesting to run on-prem due to privacy requirements, or the API service being taken down.
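The hosted-to-self-hosted switch can be sketched as a one-line config change. The URLs below are illustrative assumptions: many self-hosted inference servers (e.g. vLLM, llama.cpp's server) expose an OpenAI-compatible `/v1` API, which is what makes the swap a config change rather than a rewrite, since the request payload stays the same either way.

```python
def chat_endpoint(self_hosted: bool) -> dict:
    """Pick the chat-completion endpoint; the payload shape is
    identical either way, only base_url and credentials change."""
    if self_hosted:
        # Hypothetical local vLLM/llama.cpp server with an
        # OpenAI-compatible API; no real key needed.
        return {"base_url": "http://localhost:8000/v1", "api_key": "unused"}
    return {"base_url": "https://api.mistral.ai/v1", "api_key": "<secret>"}
```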
rel2thr · about 1 year ago

Is the point of system prompts just to avoid prompt injection, or are they supposed to get better outputs too?

I have never found a need for them. For example, instead of the system prompt in the article, just prompting with "Write hello 3 different ways in Spanish" works fine for me.
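The two styles being compared can be sketched as message lists in the common chat format. A plausible motivation for the split (an assumption, not something the thread confirms): models are typically trained to weight the system turn and to keep following it across a multi-turn conversation, which matters less for one-shot prompts like this one.

```python
# Instructions inline in the user turn (the commenter's approach):
inline = [
    {"role": "user",
     "content": "Write hello 3 different ways in Spanish."},
]

# The same request split into a system turn plus a user turn:
with_system = [
    {"role": "system", "content": "Reply only in Spanish."},
    {"role": "user", "content": "Write hello 3 different ways."},
]
```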
inference-lord · about 1 year ago
I guess this is what the singularity looks like?