Linear compression in Python: PCA vs unsupervised feature selection

77 points by efavdb, nearly 7 years ago

7 comments

wjn0, nearly 7 years ago
I found this line confusing:

> The printed lines above show that both algorithms capture more than 50% of the variance exhibited in the data using only 4 of the 50 stocks.

Based on the sklearn PCA documentation [1], this has nothing to do with the coefficients on individual stocks, and for PCA it should read more like: "[...] capture more than 50% of the variance exhibited in the data using only 4 components [...]", which is not the same thing.

[1] http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
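The distinction between "4 components" and "4 stocks" can be illustrated with a minimal sketch. It assumes synthetic data standing in for a days-by-stocks return matrix (not the article's dataset) and computes PCA directly from an SVD with NumPy rather than through sklearn. Each of the top 4 components carries a weight for all 50 columns, so the variance they capture is not attributable to 4 individual stocks.

```python
import numpy as np

# Synthetic stand-in for a (days x stocks) matrix of returns; not the article's data.
rng = np.random.default_rng(0)
n_days, n_stocks, n_factors = 500, 50, 4
factors = rng.normal(size=(n_days, n_factors))
mixing = rng.normal(size=(n_factors, n_stocks))
X = factors @ mixing + 0.5 * rng.normal(size=(n_days, n_stocks))

# PCA via SVD of the column-centered data.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_ratio = s**2 / np.sum(s**2)

k = 4
top_k_share = explained_ratio[:k].sum()

# Each row of Vt is one component: a vector of weights, one per column (stock).
nonzero_weights = np.count_nonzero(np.abs(Vt[:k]) > 1e-12, axis=1)

print(f"share of variance captured by {k} components: {top_k_share:.3f}")
print("columns with nonzero weight in each component:", nonzero_weights)
```

In this sketch every component draws on all 50 columns, which is the point of the comment above: a statement about 4 components is not a statement about 4 stocks.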
samfisher83, nearly 7 years ago
Does it even make sense to run PCA on the percentage change of a stock? To me it would make more sense to use it with physical properties of the underlying company. PCA helps you reduce a higher-dimensional space to a lower-dimensional one so you can group stocks together. I am a little confused by what the author is trying to do.
thanatropism, nearly 7 years ago
I wish people were better acquainted with the literature, e.g. https://www.nowpublishers.com/article/Details/ECO-002

(Ed: yeah, that's just a sample of the book but has a large bibliography at the end.)
rubatuga, nearly 7 years ago
I can't seem to make the COD reach 1.0

    >>> selector.ordered_cods
    [0.43298218, ... , 0.5068577, 0.5068577]

Do you think this is a problem/bug?
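For reference, here is a minimal sketch of greedy forward selection that tracks a coefficient of determination (COD) at each step. It assumes the COD is defined as the fraction of total variance, across all centered columns, explained by a least-squares fit on the selected columns. The names `cod` and `forward_select` are hypothetical and are not the API of the library used in the question. Under this definition the value is non-decreasing and reaches 1.0 once every column has been selected; whether the library in question uses the same definition is not something the truncated output above can confirm.

```python
import numpy as np

def cod(Xc, selected):
    """Fraction of total variance in Xc explained by a linear fit on the selected columns."""
    if not selected:
        return 0.0
    S = Xc[:, selected]
    coef, *_ = np.linalg.lstsq(S, Xc, rcond=None)
    resid = Xc - S @ coef
    return 1.0 - np.sum(resid**2) / np.sum(Xc**2)

def forward_select(X):
    """Greedy forward selection: at each step add the column that raises the COD most."""
    Xc = X - X.mean(axis=0)
    selected, cods = [], []
    remaining = list(range(X.shape[1]))
    while remaining:
        best = max(remaining, key=lambda j: cod(Xc, selected + [j]))
        selected.append(best)
        remaining.remove(best)
        cods.append(cod(Xc, selected))
    return selected, cods

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:, 5] = X[:, 0] + 0.1 * rng.normal(size=200)  # one nearly redundant column

order, cods = forward_select(X)
print("selection order:", order)
print("COD after each step:", np.round(cods, 3))
```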
squigs25, nearly 7 years ago
Another technique for unsupervised feature selection is Principal Feature Analysis (PFA): http://venom.cs.utsa.edu/dmz/techrep/2007/CS-TR-2007-011.pdf
octopod, nearly 7 years ago
This dataset could be interesting as it consists of stocks and cryptos: https://vectorspace.ai/recommend/datasets
closed, nearly 7 years ago
This title seems a bit confusing, since PCA is a form of unsupervised feature selection (or rather, feature weighting).

The title seems like it has the form "<Specific method> vs <Broader category method fits in>".
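One way to see the "weighting" versus "selection" framing is to write both as a linear map `X @ W`. The sketch below is an illustration on random data, with an arbitrary choice of kept columns. PCA uses a dense weight matrix, while column selection uses a 0/1 matrix that copies the chosen columns unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Xc = X - X.mean(axis=0)

# PCA: W holds dense weights, so every input column contributes to every output.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
W_pca = Vt[:2].T                 # shape (5, 2)

# Selection: W is a 0/1 matrix that passes two chosen columns through unchanged.
keep = [1, 3]                    # arbitrary choice, for illustration only
W_sel = np.eye(5)[:, keep]       # shape (5, 2)

Z_pca = Xc @ W_pca               # weighted combinations of all columns
Z_sel = Xc @ W_sel               # identical to Xc[:, keep]

print("selection equals column indexing:", np.allclose(Z_sel, Xc[:, keep]))
```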