TensorFlow machine learning now optimized for the Snapdragon 835 and Hexagon 682

177 points by rahulchowdhury, over 8 years ago

14 comments

Scaevolus, over 8 years ago
Qualcomm's Snapdragon chips ship with a Hexagon DSP core, which is optimized for high-throughput numerical calculations -- not the branch-heavy code you'll see in most general-purpose applications.

TensorFlow does lots of matrix multiplies. The Hexagon chip can do 8 multiplies each cycle, and runs multiple threads on each core. The benchmark isn't clear, but it's likely that _one_ Hexagon instruction can replace multiple normal ARM instructions for the inner loop.

You can see some more on how the Hexagon DSP works here: http://pages.cs.wisc.edu/~danav/pubs/qcom/hexagon_hotchips2013.pdf
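To make the inner-loop point concrete, here is a minimal Python sketch that counts loop steps for a dot product done one multiply at a time versus eight at a time. It is only a model of the idea: the `LANES` value is taken from the "8 multiplies each cycle" figure above, the function names are hypothetical, and nothing here is real Hexagon or ARM code.

```python
# Illustrative model only: counts loop steps, not real hardware cycles.
LANES = 8  # assumption: taken from the "8 multiplies each cycle" figure


def dot_scalar(a, b):
    """One multiply-accumulate per step, like a plain scalar inner loop."""
    total = 0
    steps = 0
    for x, y in zip(a, b):
        total += x * y
        steps += 1
    return total, steps


def dot_vector(a, b, lanes=LANES):
    """Handle `lanes` element pairs per step, like a wide MAC instruction."""
    total = 0
    steps = 0
    for i in range(0, len(a), lanes):
        total += sum(x * y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        steps += 1
    return total, steps


a = list(range(64))
b = list(range(64, 128))
scalar_total, scalar_steps = dot_scalar(a, b)
vector_total, vector_steps = dot_vector(a, b)
```

Both versions produce the same sum, but the wide version takes one eighth as many steps for this 64-element input. Real speedups depend on memory bandwidth, threading, and data types, which this sketch ignores.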
fithisux, over 8 years ago
Unfortunately, this DSP is not FOSS; you need an SDK with binary components to use it. Hopefully some day we will have a cross-DSP standard, or at least documentation, in order to use such chips. OpenCL could also acquire a DSP profile.
dharma1, over 8 years ago
Nice, looks like about a 10x speedup for this classification task.

I think there are big gains to be made in lower precision inference too. Lots of people are doing interesting work in that area; check out these guys: https://xnor.ai/ and https://arxiv.org/abs/1603.05279
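As a rough illustration of why very low precision can be fast, the sketch below computes a dot product of two vectors whose entries are all +1 or -1 using only XOR, NOT, and a bit count, which is the general idea behind binary networks such as the one in the linked arXiv paper. The helper names `pack` and `binary_dot` are hypothetical, and this is not code from that paper or from xnor.ai.

```python
def pack(values):
    """Pack a list of +1/-1 values into an int: +1 -> bit 1, -1 -> bit 0."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks positions where the signs agree (product +1); every other
    position contributes -1, so the result is matches - (n - matches).
    """
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
reference = sum(x * y for x, y in zip(a, b))
fast = binary_dot(pack(a), pack(b), len(a))
```

Here `fast` matches the ordinary multiply-and-add `reference`, but it is computed with bitwise operations on a single integer, which is why this style of inference maps well onto simple hardware.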
nharada, over 8 years ago
Are the two devices running the same model? The article claims the DSP has higher confidence, but I don't see why that would be the case. I suppose one could work at a higher precision, but that wouldn't make sense if they're comparing performance.
apadmarao, over 8 years ago
I have a basic understanding of machine learning and absolutely no understanding of TensorFlow.

Can someone help me understand what is going on here?

Are we just doing prediction for a model on a mobile device instead of in the cloud? If so, for what kinds of scenarios is this useful?
mcintyre1994, over 8 years ago
In some of these examples the Hexagon DSP detects the object first but with low confidence, and then the CPU detects it later with a higher confidence than the Hexagon DSP has yet reached.

If you were using this for a real purpose, would you only consider an object identified at a certain confidence? If so, then the CPU is surprisingly more performant in some of these examples, despite taking longer to detect the object at all.
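The question above can be phrased as "time to first reading at or above a threshold". The sketch below shows that measurement in Python. The readings are invented purely for illustration and are not taken from the demo video, and `first_time_above` is a hypothetical helper name.

```python
# Hypothetical (time_ms, confidence) readings, invented for illustration only.
dsp_readings = [(50, 0.30), (100, 0.45), (150, 0.55), (200, 0.62)]
cpu_readings = [(180, 0.85), (360, 0.88)]


def first_time_above(readings, threshold):
    """Return the time of the first reading whose confidence meets the
    threshold, or None if no reading ever does."""
    for time_ms, confidence in readings:
        if confidence >= threshold:
            return time_ms
    return None


THRESHOLD = 0.8  # assumed acceptance threshold
dsp_time = first_time_above(dsp_readings, THRESHOLD)
cpu_time = first_time_above(cpu_readings, THRESHOLD)
```

With these made-up numbers the faster stream never crosses the threshold while the slower one does on its first reading, which is the situation the comment describes. Whether that happens in practice depends on the model and the threshold chosen.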
marclave, over 8 years ago
This is absolutely crazy... The response time is unbelievable.
sliken, over 8 years ago
What kinds of "AI" are likely to be viable to run on a Snapdragon 835+682?

Recognizing faces? Voice? Handwriting? Captions for photos? Natural language queries (like Google's AI assistant)? Positioning by recognizing landmarks? Simple autonomous driving (say RC cars)? Flying (quad rotors or RC planes)? Cars?

Or I guess a better question... will this change anything except decrease your need for a good network?
visarga, over 8 years ago
This is the new trend: a dedicated AI coprocessor. Fast and less power hungry.
ant6n, over 8 years ago
I for one am curious how large the image classification neural net is (in MB). I came across an image classifier (VGG16) in an ML course that was a 500MB file, although the format may have been very inefficient.

If it's a 100MB file, you'd basically have to ship it with the operating system.
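A back-of-the-envelope check on that file size: model size is roughly parameter count times bytes per parameter. The sketch below assumes about 138 million parameters for VGG16, a commonly cited round figure that is an assumption here, and ignores file-format overhead. `size_mb` is a hypothetical helper name.

```python
PARAMS = 138_000_000  # assumption: commonly cited approximate VGG16 size


def size_mb(params, bits_per_param):
    """Approximate weight storage in decimal megabytes."""
    return params * bits_per_param / 8 / 1_000_000


float32_mb = size_mb(PARAMS, 32)  # 32-bit floating point weights
int8_mb = size_mb(PARAMS, 8)      # 8-bit quantized weights
```

Under these assumptions the 32-bit weights alone come to about 552 MB, so a 500MB file is not necessarily an inefficient format, and 8-bit weights would still be about 138 MB.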
nswanberg, over 8 years ago
Is this available now or just announced? I've searched their site and forums but can't find anything that's been released, including for the 820, aside from some lower-level SDKs (comma.ai's openpilot uses these lower-level SDKs in its closed-source portion).
sandGorgon, over 8 years ago
Can someone explain what Qualcomm built here? Is this CUDA for ARM?
nojvek, over 8 years ago
I wonder how this compares to Apple's GPU on the iPhone 7.

Having Siri do local voice and image recognition would be killer. I hate the current latency of the AI agents.
ferongr, over 8 years ago
Hopefully the SoC will run with a recent kernel.