The ML community has done an amazing job disseminating information about the field, making it accessible to large numbers of people like myself who aren't professionally involved in it.<p>That said, the one area that has stumped me, and where I haven't been able to find good clean code to learn from, is feature extraction. I was wondering if anyone here can point me in the direction of some clean code that a non-expert in the field could learn from.<p>I'm definitely aware that there are books / tutorials on the subject, but none of that has "made it click" for me yet. To be honest, I feel like most of my real "advances" have come while looking at code (after familiarizing myself at a high level with the theory and math).<p>For example, the TensorFlow Playground source code appears to be a gold mine filled with good clean code that a novice can grok.<p>EDIT: If beggars can be choosers, I'm most interested in seeing a practical implementation that uses a clustering algorithm (such as k-means) to build up a set of features from image data. Such a technique is discussed in the following video.<p>https://www.youtube.com/watch?v=wZfVBwOO0-k
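To make the EDIT above concrete, here is a minimal sketch of one common way to learn image features with k-means: sample small patches, normalize them, cluster them, then describe each image by how strongly its patches respond to each centroid. Everything specific here is an assumption for illustration and not necessarily what the video does: the random arrays standing in for real grayscale images, the patch size, the number of centroids, the soft "triangle" encoding, and the sum-pooling over the whole image. Whitening, which is often used in this kind of pipeline, is left out to keep it short.

```python
import numpy as np

def extract_patches(images, patch_size, n_patches, rng):
    """Sample random square patches and flatten each into a row vector."""
    n, h, w = images.shape
    idx = rng.integers(0, n, size=n_patches)
    ys = rng.integers(0, h - patch_size + 1, size=n_patches)
    xs = rng.integers(0, w - patch_size + 1, size=n_patches)
    patches = np.stack([
        images[i, y:y + patch_size, x:x + patch_size].ravel()
        for i, y, x in zip(idx, ys, xs)
    ])
    return patches.astype(np.float64)

def normalize(patches, eps=1e-8):
    """Per-patch brightness and contrast normalization."""
    patches = patches - patches.mean(axis=1, keepdims=True)
    return patches / (patches.std(axis=1, keepdims=True) + eps)

def pairwise_dist(x, centroids):
    """Euclidean distance from every row of x to every centroid, shape (n, k)."""
    d2 = ((x ** 2).sum(axis=1)[:, None]
          - 2.0 * x @ centroids.T
          + (centroids ** 2).sum(axis=1)[None, :])
    return np.sqrt(np.maximum(d2, 0.0))

def kmeans(x, k, n_iter, rng):
    """Plain Lloyd's algorithm; returns the k centroids."""
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = pairwise_dist(x, centroids).argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids

def encode(patches, centroids):
    """Soft 'triangle' code: max(0, mean distance - distance to each centroid)."""
    d = pairwise_dist(patches, centroids)
    return np.maximum(0.0, d.mean(axis=1, keepdims=True) - d)

def image_features(image, centroids, patch_size, stride):
    """Encode every patch on a grid, then sum-pool the codes over the image."""
    h, w = image.shape
    patches = np.stack([
        image[y:y + patch_size, x:x + patch_size].ravel()
        for y in range(0, h - patch_size + 1, stride)
        for x in range(0, w - patch_size + 1, stride)
    ]).astype(np.float64)
    return encode(normalize(patches), centroids).sum(axis=0)

rng = np.random.default_rng(0)
images = rng.random((50, 16, 16))  # stand-in for real grayscale images
patches = normalize(extract_patches(images, patch_size=6, n_patches=2000, rng=rng))
centroids = kmeans(patches, k=16, n_iter=10, rng=rng)
features = np.stack([image_features(im, centroids, 6, 2) for im in images])
print(features.shape)  # one 16-dimensional feature vector per image
```

The resulting `features` matrix is what you would hand to an ordinary classifier (logistic regression, an SVM, and so on). On random noise the centroids are meaningless; on real images they tend to look like small edge and blob detectors.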
What task are you learning the features for? If it's classification, instead of treating it as a two-step problem (feature generation followed by classification), I suggest that you just use a multi-layer neural network with a sigmoid function at the final layer (or a softmax, if there are more than two classes) so that you directly predict the output class from raw pixels. This way the feature engineering is taken care of by the algorithm: the weights of the hidden layers act as the learned features.<p>To give an idea of what I mean, see: <a href="http://karpathy.github.io/2015/10/25/selfie/" rel="nofollow">http://karpathy.github.io/2015/10/25/selfie/</a>
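As a rough illustration of the suggestion above, here is a minimal sketch of a network with one hidden layer and a single sigmoid output, trained end to end on raw pixels. The dataset is a made-up toy (8x8 random "images", labeled by whether the left half is brighter than the right), and the layer size, learning rate, and step count are arbitrary assumptions. It is plain NumPy with hand-written gradients so that nothing is hidden; a real project would more likely use a framework and a convolutional network like the one in the linked post.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy stand-in for an image dataset (an assumption for illustration):
# the label is 1 when the left half of the image is brighter than the right.
n, side = 400, 8
images = rng.random((n, side, side))
left = images[:, :, :side // 2].mean(axis=(1, 2))
right = images[:, :, side // 2:].mean(axis=(1, 2))
y = (left > right).astype(np.float64)[:, None]

x = images.reshape(n, -1) - 0.5  # raw pixels, flattened and centered

hidden = 16
w1 = rng.normal(0.0, 1.0 / np.sqrt(x.shape[1]), (x.shape[1], hidden))
b1 = np.zeros(hidden)
w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1))
b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ w1 + b1)    # hidden layer: the learned features
    p = sigmoid(h @ w2 + b2)    # predicted probability of class 1
    return h, p

lr = 0.5
for step in range(2000):
    h, p = forward(x)
    # Gradient of the mean binary cross-entropy w.r.t. the pre-sigmoid output.
    dlogits = (p - y) / n
    dw2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    # Backpropagate through tanh: d/dz tanh(z) = 1 - tanh(z)^2.
    dh = (dlogits @ w2.T) * (1.0 - h ** 2)
    dw1 = x.T @ dh
    db1 = dh.sum(axis=0)
    # In-place gradient descent step.
    w1 -= lr * dw1
    b1 -= lr * db1
    w2 -= lr * dw2
    b2 -= lr * db2

_, p = forward(x)
accuracy = float(((p > 0.5) == (y > 0.5)).mean())
print("training accuracy:", accuracy)
```

There is no separate feature-extraction step here: the columns of `w1` play that role, and they are adjusted by the same gradient signal that trains the classifier. For more than two classes you would swap the single sigmoid for a softmax over one output per class.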