Show HN: Realtime LLM Chat on an 8GB Nvidia GPU

1 point by z991 about 2 years ago
Demo runs on a laptop 3070 Ti / 8GB. GPU memory doesn't go above 6GB, so it might run on an even smaller GPU. It uses a 4-bit, 7B-parameter alpaca_lora model, and performance is significantly worse than ChatGPT, as you'd expect.
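
The post doesn't name the exact toolchain, but a common way to fit a 7B Alpaca-LoRA model into under 8GB of VRAM is 4-bit (NF4) quantization. A minimal sketch using Hugging Face transformers, bitsandbytes, and peft follows; the base model and adapter repo names, prompt format, and generation settings are illustrative assumptions, not the poster's actual setup:

```python
# Sketch: load a 7B LLaMA base in 4-bit and attach an Alpaca LoRA adapter.
# Repo names and settings below are assumptions for illustration only.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TextStreamer,
)
from peft import PeftModel

BASE = "huggyllama/llama-7b"       # assumed 7B base model
ADAPTER = "tloen/alpaca-lora-7b"   # assumed Alpaca LoRA adapter

# NF4 quantization keeps the 7B weights around 4GB on the GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)

# Stream tokens as they are generated for a realtime chat feel.
streamer = TextStreamer(tokenizer, skip_prompt=True)
prompt = "### Instruction:\nExplain LoRA in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
model.generate(**inputs, max_new_tokens=128, streamer=streamer)
```

With this kind of setup, weights plus KV cache typically stay below the ~6GB the poster reports, which is consistent with it fitting on an 8GB laptop GPU.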

no comments