
Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine

161 points, posted by j_juggernaut about 9 years ago

8 comments

waleedka, about 9 years ago
At a glance:

- Only supports fully connected layers for now. No convnets or RNNs.
- Requires a GPU. No option to run on CPU, not even for development.
- Setup instructions for Ubuntu only. No Mac or Windows.
- Uses JSON to define the network architecture, which limits what you can build (see the sketch after this comment).
- Takes in data in NetCDF format only.
- Very little documentation.
- The name is bad. I'm not going to remember how to spell DSSTNE.

It seems like a very early proof of concept. I wouldn't expect it to be useful to most people at this point. Built-in support for sparse vectors is interesting, but not a strong selling point by itself. I hope Amazon continues to develop it. Or, even better, contributes to one of the existing, more mature frameworks.
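For readers who haven't seen config-driven model definition, here is a rough sketch of what "defining the network architecture in JSON" means. The field names are invented for illustration and are not DSSTNE's actual schema:

    import json

    # Hypothetical, invented schema -- illustrative of the declarative style,
    # not DSSTNE's real JSON format.
    network = {
        "Name": "recommender",
        "ErrorFunction": "CrossEntropy",
        "Layers": [
            {"Name": "Input",  "Kind": "Input",  "N": 100000, "Sparse": True},
            {"Name": "Hidden", "Kind": "Hidden", "N": 1024,   "Activation": "Sigmoid"},
            {"Name": "Output", "Kind": "Output", "N": 100000, "Sparse": True},
        ],
    }

    print(json.dumps(network, indent=2))

The limitation the comment points to is that a fixed schema like this can only express topologies its authors anticipated, whereas a programmatic API can compose arbitrary graphs.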
throwaway6497, about 9 years ago
Amazon is turning over a new leaf. They stopped publishing to any major conferences after their last significant paper, Dynamo.

My perception of Amazon is that they take everything from open source but don't actively give back. Amazon and open source never went hand in hand. Making their deep learning framework open source is cool. Kudos to the team that managed to do this. I am sure that internally it must have been a huge struggle to get approval from the execs.

[Edit: Grammar]
ktamura, about 9 years ago
First TensorFlow and now this. Tensor is quickly becoming a mathematical-term-that-sounds-familiar-to-developers-but-most-don't-actually-know-what-it-is.

Another example is topology =)
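For anyone in the "don't actually know what it is" camp: in deep learning frameworks, a tensor is, for practical purposes, just an n-dimensional array (the mathematical definition is stricter). A minimal NumPy illustration:

    import numpy as np

    # In framework usage, "tensor" effectively means an n-dimensional array.
    scalar = np.array(3.0)                # rank 0
    vector = np.zeros(5)                  # rank 1
    matrix = np.zeros((5, 4))             # rank 2
    images = np.zeros((32, 224, 224, 3))  # rank 4: batch, height, width, channels

    for t in (scalar, vector, matrix, images):
        print(t.ndim, t.shape)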
scottlegrand, about 9 years ago
Lead author of DSSTNE here...

1. DSSTNE was designed two years ago specifically for product recommendations from Amazon's catalog. At that time there was no TensorFlow, only Theano and Torch. DSSTNE differentiated itself from those two frameworks by optimizing for sparse data (a toy illustration follows this comment) and for neural networks that span multiple GPUs. What it is not, currently, is another framework for running AlexNet/VGG/GoogLeNet etc., but about 500 lines of code plus cuDNN could change that if the demand exists. Implementing Krizhevsky's "one weird trick" is mostly trivial, since the harder model-parallel part has already been written.

2. DSSTNE does not yet explicitly support RNNs, but it does support shared weights, and that's more than enough to build an unrolled RNN (see the sketch below). We tried a few, in fact. cuDNN 5 can be used to add LSTM support in a couple hundred lines of code. But since (I believe) the LSTM in cuDNN is a black box, it cannot be spread across multiple GPUs. Not too hard to write from the ground up, though.

3. There are a huge number of collaborators and people behind the scenes who made this happen. I'd love to acknowledge them openly, but I'm not sure they want their names known.

4. Say what you want about Amazon, and they're not perfect, but they let us build this from the ground up and have now given it away. Google, which hired me away from NVIDIA in 2011 (another one of those offers I couldn't refuse), OTOH blind-allocated me into search and would not let me work with GPUs, despite my being one of the founding members of NVIDIA's CUDA team, because they had not yet seen them as useful. I didn't stay there long. DSSTNE is 100% fresh code, warts and all, and I thank Amazon both for letting me work on a project like this and for OSSing the code.

5. NetCDF is a nice, efficient format for big data files (a minimal example follows this comment). What other formats would you suggest we support here?

6. I was boarding a plane when they finally released this. I will be benchmarking it in the next few days. TL;DR spoilers: near-perfect scaling for hidden layers with 1000 or so hidden units per GPU in use, and effectively free sparse input layers, because both activation and weight-gradient calculation have custom sparse kernels.

7. The JSON format made sense in 2014, but IMO what this engine needs now is a TensorFlow graph importer. Since the engine builds networks from a rather simple underlying C struct, this isn't particularly hard, but it does require supporting some additional functionality to be 100% compatible.

8. I left Amazon 4 months ago after getting an offer I couldn't refuse. I was the sole GPU coder on this project. I can count the number of people I'd trust with an engine like this on two hands, and most of them are already building deep learning engines elsewhere. I'm happy to add whatever functionality is desired here. CNN and RNN support seem like two good first steps, and the spec already accounts for this.

9. Ditto for a Python interface, easily implemented IMO through the Python C/C++ extension mechanism: https://docs.python.org/2/extending/extending.html

Anyway, it's late, and it's turned out to be a fantastic day, seeing the project on which I spent nearly two years go OSS.
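Point 1's "optimizing for sparse data" is easiest to see with a toy example: a user's interaction vector over a catalog of a million products is almost entirely zeros, so storing only the non-zero entries saves memory and compute. A minimal sketch with scipy.sparse (all numbers invented):

    import numpy as np
    from scipy.sparse import csr_matrix

    n_products = 1_000_000
    purchased = [17, 4_242, 987_654]  # indices of products one user interacted with

    # Dense representation: a million float64s, almost all zero (~8 MB per user).
    dense = np.zeros(n_products)
    dense[purchased] = 1.0

    # Sparse representation: store only the three non-zero (row, col, value) triples.
    sparse = csr_matrix(
        (np.ones(len(purchased)), ([0] * len(purchased), purchased)),
        shape=(1, n_products),
    )
    print(sparse.nnz, "stored values instead of", n_products)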
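And point 2, weight sharing as a route to RNNs: an unrolled RNN is just a feed-forward network in which every time step applies the same weight matrices. A minimal NumPy sketch (illustrative only, not DSSTNE code):

    import numpy as np

    def unrolled_rnn(xs, W_xh, W_hh, b):
        """Vanilla RNN, unrolled over time: one 'layer' per step,
        with every step sharing the same weights W_xh, W_hh, b."""
        h = np.zeros(W_hh.shape[0])
        for x in xs:
            h = np.tanh(W_xh @ x + W_hh @ h + b)
        return h

    rng = np.random.default_rng(0)
    xs = rng.normal(size=(3, 4))     # 3 time steps of 4-dim input
    W_xh = rng.normal(size=(8, 4))   # input-to-hidden weights (shared)
    W_hh = rng.normal(size=(8, 8))   # hidden-to-hidden weights (shared)
    b = np.zeros(8)

    print(unrolled_rnn(xs, W_xh, W_hh, b).shape)  # (8,)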
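Finally, point 5: NetCDF is a self-describing binary format built around named dimensions and variables. A minimal write-then-read sketch using the Python netCDF4 package (file and variable names invented):

    import numpy as np
    from netCDF4 import Dataset  # pip install netCDF4

    # Write a small dense matrix.
    with Dataset("example.nc", "w") as ds:
        ds.createDimension("examples", 100)
        ds.createDimension("features", 16)
        var = ds.createVariable("data", "f4", ("examples", "features"))
        var[:] = np.random.rand(100, 16).astype("f4")

    # Read it back.
    with Dataset("example.nc", "r") as ds:
        data = ds.variables["data"][:]
        print(data.shape)  # (100, 16)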
jbandela1, about 9 years ago
Deep learning systems are becoming C++11's halo projects. Here are some deep learning libraries from the Internet Big 4:

- Amazon DSSTNE: https://github.com/amznlabs/amazon-dsstne
- Google TensorFlow: https://github.com/tensorflow/tensorflow/
- Microsoft CNTK: https://github.com/Microsoft/CNTK/
- Facebook fbcunn: https://github.com/facebook/fbcunn/

They all utilize C++11 or later. Just as Hadoop pushed Java in the big-data, map-reduce realm, I think these libraries will push C++11 in the deep learning realm.
vr3690, about 9 years ago
I get that the acronym is easy to pronounce with the suggested word, but why not just use the suggested word (destiny) as the name instead of the acronym? So much easier to read and write. They could explain the name's origin in Readme.md.
nate_martin, about 9 years ago
Maybe someone who works on deep learning could comment on what this provides vs. other open-source systems like Theano, TensorFlow, Torch, etc.
Giorgi, about 9 years ago
So... what is the application for this (other than buzzwords)?