Very cool work. I'm happy to see more people thinking about deep networks along these lines.
This seems very similar to a recent paper posted to arXiv back in November:<p>"Learning to Generate Chairs with Convolutional Neural Networks".
<a href="http://arxiv.org/abs/1411.5928" rel="nofollow">http://arxiv.org/abs/1411.5928</a><p>They also have a very cool video of the generation process:
<a href="https://youtu.be/QCSW4isBDL0" rel="nofollow">https://youtu.be/QCSW4isBDL0</a><p>It's very interesting to see two groups independently developing almost identical networks for inverse graphics tasks, both using pose, shape, and view parameters to guide learning. I think continuing in this direction could provide a lot of insight into how these deep networks work, and could lead to improvements on recognition tasks too.<p>@tejask - You should probably cite the above paper. Thanks for providing code, awesome!