How is this on the front page? It's completely incoherent.<p>For anyone actually interested in interesting techniques for multi-GPU DNN training, <a href="http://arxiv.org/pdf/1404.5997v2.pdf" rel="nofollow">http://arxiv.org/pdf/1404.5997v2.pdf</a> and the references therein are probably a good start.
The exposition is not very clear. What exactly do you mean when you say "No edges will be communicated over the network, only half of the nodes."? I'm puzzled, because a few sentences later you claim "The only network IO that would be required would be sending each edge value to its respective node in Q." So the edge values <i>are</i> communicated after all?<p>From what I've understood, you're suggesting that for every node in a layer, you colocate its incident edges on the same machine?
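To make my reading concrete, here is a minimal sketch of what I <i>think</i> is being proposed (all names here are my own, not from the post): nodes in the layer Q are partitioned across machines, edges are stored on the machine owning their destination node, and only the edge <i>values</i> cross the network:

```python
# Hypothetical sketch of my reading of the scheme: nodes in layer Q are
# partitioned across machines; each edge value is routed to the machine
# owning its destination node. Edge values DO cross the network; the edge
# structure itself stays local to the owning machine.

def owner(node_id: int, num_machines: int) -> int:
    """Assign each node in Q to a machine (simple modulo partition)."""
    return node_id % num_machines

def route_edge_values(edges, num_machines):
    """Group edge values by the machine owning each destination node.

    `edges` is a list of (src, dst, value) triples; the result maps each
    machine id to the (dst, value) pairs it must receive over the network.
    """
    outbox = {m: [] for m in range(num_machines)}
    for src, dst, value in edges:
        outbox[owner(dst, num_machines)].append((dst, value))
    return outbox

# Toy example: 4 edges into layer Q, 2 machines.
edges = [(0, 0, 0.5), (1, 1, -0.2), (2, 2, 1.0), (3, 3, 0.1)]
print(route_edge_values(edges, 2))
```

If this is the intended design, then the network IO scales with the number of edges crossing machine boundaries, not with the number of nodes, which is exactly what makes the quoted claims seem contradictory to me.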