I'm enthusiastic that Intel is working on optimizing Torch's Tensor and nn libraries, and on optimizing neural nets in general.<p>This repo is most definitely not ready for primetime.<p>Two comments:<p>1. I haven't seen an Intel team that does non-trivial DNN optimizations; almost all of the (multiple) teams that have worked on optimizing neural networks have done trivial things like adding OpenMP pragmas to existing code or integrating MKL into existing frameworks. Adding OpenMP optimizations usually isn't that relevant, because CPUs are typically used for neural-net inference on servers where multiple processes are already running; per-core efficiency matters much more than cute threadpool optimizations like OpenMP, which look great only in micro-benchmarks.<p>The meaty and practically relevant optimizations have come from folks in the community like Andrew Lavin and Marat Dukhan ( <a href="https://github.com/Maratyszcza/NNPACK" rel="nofollow">https://github.com/Maratyszcza/NNPACK</a> ), whose work focuses on single-core efficiency as well as multi-core scaling.<p>2. The way this fork integrates MKLDNN into Torch is not great: it adds MKLDNN-specific fields into the core Tensor / Storage structs, which is poor design from a broader perspective. I will try to reach out to the authors and help make the integration cleaner.<p>p.s.: torch already installs into a self-contained folder, as pointed out by @gcr