This is a very interesting idea. DenseNets often come with some terrible memory gotchas that have bitten me over the past 7-8 years or so, so part of me is leaning back waiting for some memory-usage shoe to drop that isn't spelled out in the paper (even with the activation patterns!).

However, maybe that's not the case here. I have a bit of a history of messing with residuals in neural networks, so seeing more work on them is good. Fast-training networks are a mild obsession of mine as well, and very useful to the field. Here's hoping it pans out as a motif; curious to see where it goes.
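For anyone wondering what I mean by the memory gotchas: the classic one is that a naive DenseNet block materializes the concatenation of all previous feature maps as the input to each layer, so stored activations grow roughly quadratically with block depth. A rough back-of-the-envelope sketch below; the growth rate, spatial size, and initial channel count are just typical ImageNet-ish numbers I picked, not anything from the paper, and the function name is mine.

```python
# Rough sketch (my assumptions, not the paper's): activation memory of the
# concatenated inputs in a naive DenseNet block, per image, in fp32.
def naive_densenet_concat_floats(num_layers, growth_rate=32, h=56, w=56, init_channels=64):
    total = 0
    channels = init_channels
    for _ in range(num_layers):
        # each layer's concatenated input gets materialized as a fresh tensor
        total += channels * h * w
        channels += growth_rate
    return total

for layers in (6, 12, 24, 48):
    floats = naive_densenet_concat_floats(layers)
    print(f"{layers:2d} layers -> ~{floats * 4 / 1e6:.0f} MB of concat activations per image")
```

This is the blowup the "memory-efficient DenseNets" trick (recomputing the concatenations in the backward pass) was invented to dodge, which is why I keep an eye out for it whenever dense connectivity shows up again.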