I found this line confusing:<p>> The printed lines above show that both algorithms capture more than 50% of the variance exhibited in the data using only 4 of the 50 stocks.<p>Based on the sklearn PCA documentation [1] this has nothing to do with the coefficients on individual stocks, and for PCA should read more like: "[...] capture more than 50% of the variance exhibited in the data using only 4 components [...]" which is not the same thing.<p>1. <a href="http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html" rel="nofollow">http://scikit-learn.org/stable/modules/generated/sklearn.dec...</a>
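A minimal sketch of the distinction, assuming synthetic data in place of the article's 50 stocks (the factor structure, noise level, and all variable names are made up for illustration). In sklearn, `explained_variance_ratio_` is reported per component, and each row of `components_` carries a weight for every input column, so "4 components" still involves all 50 stocks.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the article's data (assumption): 500 days x 50 "stocks"
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 4))    # 4 hidden common factors
loadings = rng.normal(size=(4, 50))    # every stock loads on every factor
returns = factors @ loadings + 0.5 * rng.normal(size=(500, 50))

pca = PCA(n_components=4).fit(returns)

# Variance explained is a property of the components, not of individual stocks.
explained = pca.explained_variance_ratio_.sum()

# Each component is a weighted combination of all 50 stocks:
# count the stocks with a non-zero weight in each component.
stocks_per_component = (np.abs(pca.components_) > 1e-12).sum(axis=1)

print(f"variance explained by 4 components: {explained:.2f}")
print("stocks contributing to each component:", stocks_per_component)
```

Reading the loadings this way makes it clear why "4 components" and "4 of the 50 stocks" are different claims.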
Does it even make sense to run PCA on the percentage change of a stock? To me it would make more sense to use it on physical properties of the underlying company. PCA helps you reduce a higher-dimensional space to a lower-dimensional one so you can group stocks together. I am a little confused by what the author is trying to do.
I wish people were better acquainted with the literature, e.g. <a href="https://www.nowpublishers.com/article/Details/ECO-002" rel="nofollow">https://www.nowpublishers.com/article/Details/ECO-002</a><p>(Ed: yeah, that's just a sample of the book but has a large bibliography at the end.)
I can't seem to make the COD reach 1.0<p><pre><code> >>> selector.ordered_cods
[0.43298218, ... , 0.5068577, 0.5068577]
</code></pre>
Would you consider this a problem/bug?
Another technique for unsupervised feature selection is Principal Feature Analysis (PFA): <a href="http://venom.cs.utsa.edu/dmz/techrep/2007/CS-TR-2007-011.pdf" rel="nofollow">http://venom.cs.utsa.edu/dmz/techrep/2007/CS-TR-2007-011.pdf</a>
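A rough sketch of the PFA idea as I understand it, not the paper's reference implementation: run PCA, treat each original feature's row of loadings as a point, cluster those points with k-means, and keep the feature closest to each cluster center. The function name `principal_feature_analysis`, its parameters, and the random test data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def principal_feature_analysis(X, n_features, n_components):
    """Return indices of n_features columns of X chosen PFA-style (sketch)."""
    # Each row of `loadings` describes one original feature in component space.
    loadings = PCA(n_components=n_components).fit(X).components_.T
    km = KMeans(n_clusters=n_features, n_init=10, random_state=0).fit(loadings)

    selected = []
    for k in range(n_features):
        # Only consider features assigned to cluster k, so no index repeats.
        members = np.flatnonzero(km.labels_ == k)
        dists = np.linalg.norm(loadings[members] - km.cluster_centers_[k], axis=1)
        selected.append(int(members[np.argmin(dists)]))
    return sorted(selected)


# Illustrative usage on random data (assumption): 200 samples, 10 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
chosen = principal_feature_analysis(X, n_features=3, n_components=3)
print(chosen)
```

Unlike plain PCA, the output is a subset of the original columns, which is closer to what "using only 4 of the 50 stocks" would mean.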
This dataset could be interesting as it consists of stocks and cryptos <a href="https://vectorspace.ai/recommend/datasets" rel="nofollow">https://vectorspace.ai/recommend/datasets</a>
This title seems a bit confusing, since PCA is a form of unsupervised feature selection (or rather, feature weighting).<p>The title seems like it has the form "<Specific method> vs <Broader category method fits in>".