I gave Llama 3 Instruct 8B Q8 a try using LM Studio on a 16 GB MacBook Pro with an M2. Not sure what to do at first, I asked it to be a dungeon master to get some creative conversation going, and I'm satisfied with both the performance and the creativity, if not impressed (link: https://pastebin.com/raw/iM4U8skk if you're interested).

Configuration:

- laptop: MacBook Pro M2 with 16 GB RAM
- context length: 8192 (max)
- GPU layers: 33 (max)
- CPU threads: 8

Response:

- time to first token: ~2 s by the end of the conversation (4,892 total tokens)
- speed: ~7-8 tok/s
- memory usage: 13 GB (system total)
- memory pressure: slightly over 50% (>90% when coding with containers)

Now, these results feel on par with ChatGPT 4. I compared a few general-knowledge questions and coding problems, including some niche libraries, and it seems to hold up very well against ChatGPT 4 as well.

I'd like to compare notes and hear your opinions. Is it possible to run something Copilot-like with a local server?
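On the local-server question: LM Studio can start a local HTTP server that speaks an OpenAI-compatible chat completions API (by default on port 1234), so editor integrations that accept a custom OpenAI base URL can point at it. A minimal stdlib-only sketch of talking to that server; the `local-model` name is a placeholder, since LM Studio serves whichever model is currently loaded:

```python
import json
import urllib.request

# LM Studio's local server exposes an OpenAI-compatible API,
# by default at http://localhost:1234/v1 (assumed default port).
LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat completion payload for the local server."""
    return {
        "model": model,  # placeholder: LM Studio uses the loaded model
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """POST a chat request to the local server and return the reply text."""
    req = urllib.request.Request(
        LM_STUDIO_URL,
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example usage (with the LM Studio server running):
# print(ask("Write a Python one-liner that reverses a string."))
```

That's not Copilot's inline-completion UX, but for chat-style coding help it gets you a fully local endpoint.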