New Mixtral HQQ Quantized 4-bit/2-bit configuration

5 points by ibuildthings over 1 year ago

1 comment

ibuildthings over 1 year ago
We are releasing new 2-bit Mixtral models. These use a mixed HQQ 4-bit/2-bit configuration (4-bit attention, 2-bit MoE experts, as the model names indicate), resulting in a significantly improved model (perplexity 4.69 vs. 5.90) with a negligible 0.20 GB VRAM increase.

Base: https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-v0.1-hf-attn-4bit-moe-2bit-HQQ

Instruct: https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bit-HQQ

Shout-out to Artem Eliseev and Denis Mazur for suggesting this idea (https://github.com/mobiusml/hqq/issues/2).
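
For anyone wondering why the VRAM cost is so small: going by the "attn-4bit-moe-2bit" model names, only the attention projections are raised to 4-bit, and those are a small share of Mixtral's weights. A rough sketch of the arithmetic follows. The layer shapes are assumptions taken from the public Mixtral-8x7B config, not from this post, and the helper names are made up for illustration. It counts raw weight bits only and ignores the group-wise scale/zero-point metadata, so the result is an order-of-magnitude figure and should not be expected to reproduce the reported 0.20 GB.

```python
# Back-of-envelope estimate; shapes assumed from the public Mixtral-8x7B config.
HIDDEN = 4096          # hidden_size
INTERMEDIATE = 14336   # intermediate_size of each expert MLP
LAYERS = 32            # num_hidden_layers
HEADS = 32             # num_attention_heads
KV_HEADS = 8           # num_key_value_heads (grouped-query attention)
EXPERTS = 8            # num_local_experts
HEAD_DIM = HIDDEN // HEADS


def attention_params_per_layer() -> int:
    """Weights in q_proj, k_proj, v_proj and o_proj of one layer."""
    q = HIDDEN * HEADS * HEAD_DIM
    k = HIDDEN * KV_HEADS * HEAD_DIM
    v = HIDDEN * KV_HEADS * HEAD_DIM
    o = HEADS * HEAD_DIM * HIDDEN
    return q + k + v + o


def expert_params_per_layer() -> int:
    """Weights in the three projection matrices of every expert in one layer."""
    return EXPERTS * 3 * HIDDEN * INTERMEDIATE


def extra_weight_bytes(n_params: int, bits_from: int, bits_to: int) -> int:
    """Extra storage for the raw weights only (no quantization metadata)."""
    return n_params * (bits_to - bits_from) // 8


attn_total = attention_params_per_layer() * LAYERS
moe_total = expert_params_per_layer() * LAYERS
attn_share = attn_total / (attn_total + moe_total)
extra_bytes = extra_weight_bytes(attn_total, bits_from=2, bits_to=4)

print(f"attention weights: {attn_total:,}")
print(f"expert weights:    {moe_total:,}")
print(f"attention share:   {attn_share:.1%}")
print(f"extra raw weight storage at 4-bit: {extra_bytes / 1e9:.2f} GB")
```

Under these assumptions attention is roughly 3% of the quantized linear weights, so spending two more bits on it adds a few hundred megabytes at most, while the experts, which hold the bulk of the parameters, stay at 2-bit.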