
Fact-checking DeepSeek hype with DeepSeek

4 points by omega3, 4 months ago
I came across this LinkedIn post[0] from a "Sr Director, Distinguished Engineer, CTO Office @ Google".

The post itself was already discussed on HN: https://news.ycombinator.com/item?id=42852866

> this is my "aha" moment in AI. An AI model that improves other models and itself! with significant proven gains for inference,. Deepseek analyzed the LlaMA code and realized that the KV cache was not optimal so the reasoning engine suggested a way to optimize it. Technically it was probably inspired by Mistral OSS implementation where KV cache uses dynamical allocation. this technique is well know but LlaMA missed it. So, deepseek suggested a code rewrite which was 99% correct with humans only needed to validate and test the code. This improvement results in 2.8x gain for inference on OSS LlaMA. What is even more impressive is that the same change can be adopted to improve DeepSeek itself with initial confirmed gains more that 3x, Effectively a model that is self improving because able to apply previous knowledge when appropriate, augmenting humans and connecting the dots. that's quite something to me. No hype, just pure technological paradigm shift.

Pretty much every sentence contains a factual error. I gave DeepSeek the context and the post itself, and asked it to fact-check it:

> The LinkedIn post contains several inaccuracies and exaggerations when compared to the actual news and code provided:

- KV Cache Optimization Claim: The post presents KV cache optimization as the main improvement, but the provided code and news specifically discuss SIMD optimizations for quantization functions (q4_K, q6_K, etc.) in llama.cpp. There is no mention of KV cache changes in the technical details; this appears to be a conflated claim.
- Performance Gains: While the PR shows a 2x speed improvement for WASM through SIMD optimizations, the post inflates this to a "2.8x gain for inference on OSS LLaMA" and "3x gains for DeepSeek itself". These specific numbers are not substantiated in the provided materials.
- Self-Improvement Claim: The post states the model can improve itself, but the evidence shows that humans wrote detailed prompts (multiple iterations were needed), humans validated and tested all code, and the model only converted existing ARM NEON code to WASM SIMD.
- Mistral Inspiration: The post references Mistral's KV cache implementation, but the actual work uses Mistral only as a reference for model registration patterns and focuses on quantization optimizations rather than attention mechanisms.
- Code Generation Accuracy: While the post claims "99% correct code", the technical details show multiple prompt iterations (4-8 retries for complex cases), human debugging of the q6_K implementation, and final code that still required human validation.

Key Verdict: The LinkedIn post significantly exaggerates and misrepresents the actual technical work. The real achievement was using DeepSeek-R1 to help convert ARM NEON code to WASM SIMD through iterative prompting, resulting in 2x speed improvements for specific quantization functions, not a fundamental architectural improvement to LLaMA or true self-improvement capability.

The post mixes different technical concepts (KV cache optimization vs. SIMD quantization) and inflates the AI's role in the process. While impressive, the actual work is narrower and more human-guided than portrayed.

[0] https://www.linkedin.com/posts/searchguy_this-is-my-aha-moment-in-ai-an-ai-model-activity-7290244226766823425-OUQk?utm_source=share&utm_medium=member_desktop
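For readers who want a concrete picture of what "converting ARM NEON code to WASM SIMD" means, here is a minimal sketch, not code from the actual PR: the same f32 dot-product kernel spelled once with NEON intrinsics and once with WASM SIMD128 intrinsics. The function name dot_f32 and its structure are assumptions for illustration; the real llama.cpp work targeted the q4_K/q6_K quantized kernels, which are considerably more involved.

    /* Illustrative only: one kernel, two intrinsic dialects (hypothetical
       example, not the actual q4_K/q6_K code from the llama.cpp PR). */
    #include <stddef.h>

    #if defined(__ARM_NEON)
    #include <arm_neon.h>

    float dot_f32(const float *a, const float *b, size_t n) {
        float32x4_t acc = vdupq_n_f32(0.0f);
        size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            /* acc += a[i..i+3] * b[i..i+3], four lanes at a time */
            acc = vmlaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
        }
        float sum = vaddvq_f32(acc);            /* horizontal add (AArch64) */
        for (; i < n; ++i) sum += a[i] * b[i];  /* scalar tail */
        return sum;
    }

    #elif defined(__wasm_simd128__)
    #include <wasm_simd128.h>

    float dot_f32(const float *a, const float *b, size_t n) {
        v128_t acc = wasm_f32x4_splat(0.0f);
        size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            /* Same loop structure; only the intrinsic spellings change. */
            v128_t va = wasm_v128_load(a + i);
            v128_t vb = wasm_v128_load(b + i);
            acc = wasm_f32x4_add(acc, wasm_f32x4_mul(va, vb));
        }
        float sum = wasm_f32x4_extract_lane(acc, 0) + wasm_f32x4_extract_lane(acc, 1)
                  + wasm_f32x4_extract_lane(acc, 2) + wasm_f32x4_extract_lane(acc, 3);
        for (; i < n; ++i) sum += a[i] * b[i];  /* scalar tail */
        return sum;
    }
    #endif

The port is largely a one-for-one re-spelling of intrinsics plus a different horizontal reduction, which is consistent with the description above: an iteratively prompted model can do most of the mechanical conversion, with humans validating and testing the result.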

No comments yet