How to build your own feature store for ML

114 points by LexSiga about 5 years ago

5 comments

StonyRhetoric about 5 years ago
This is a good idea: every ML operation should have something like this to store, organize, and version data, check for drift, do time-travel, handle backups/replication, et cetera.

But to borrow from Steve Jobs, I think this is a feature, not a product. If you've already done the hard work of setting up a data lake or data warehouse in a cloud provider, the cloud provider can give you backups and replication, and even some time-travel. Using something like Delta Lake, or even just the standard Kimball DW audit columns, will get you point-in-time queries. Feature versioning is just query versioning in source control, and if you have a schema, you can version it with views if you need to. If you don't have a data lake or data warehouse ... well, you'll still need to gather and clean all your data before you put it into a feature store, and that's where 90% of the work is.

I'd love to learn more, and I'm sure I'm missing something, but it seems that they're re-solving the solved part: data storage and versioning. Checking for drift and data integrity is a nice bonus, but again, there are lots of libraries for that. I guess I could see it being beneficial for ML shops that don't have modern development practices, but if you don't have that, you have bigger problems anyway.
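As a rough illustration of the point-in-time queries mentioned above, here is a minimal sketch using audit columns in the Kimball style. The table name `customer_features`, the columns `valid_from` and `valid_to`, the feature `avg_basket`, and the helper `feature_as_of` are all hypothetical names chosen for this example; SQLite stands in for whatever warehouse is actually in use.

```python
import sqlite3


def feature_as_of(conn, customer_id, as_of):
    """Return the feature value that was current at `as_of` (ISO date string).

    Assumes each row carries hypothetical audit columns: `valid_from`
    (inclusive) and `valid_to` (exclusive, NULL for the current row).
    """
    row = conn.execute(
        """
        SELECT avg_basket
        FROM customer_features
        WHERE customer_id = ?
          AND valid_from <= ?
          AND (valid_to IS NULL OR valid_to > ?)
        """,
        (customer_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


# In-memory database with two versions of one customer's feature value.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_features ("
    "customer_id INTEGER, avg_basket REAL, valid_from TEXT, valid_to TEXT)"
)
conn.executemany(
    "INSERT INTO customer_features VALUES (?, ?, ?, ?)",
    [
        (1, 20.0, "2020-01-01", "2020-03-01"),  # superseded row
        (1, 35.5, "2020-03-01", None),          # current row
    ],
)

# Value as it was known in mid-February, before the update.
print(feature_as_of(conn, 1, "2020-02-15"))
```

ISO-formatted date strings compare correctly as text, which is why plain `TEXT` columns are enough for this sketch. Building a training set this way avoids leaking feature values that were written after the label's timestamp.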
LexSiga about 5 years ago
As this topic will inevitably become more trendy, here are some additional interesting resources on the subject:

- https://www.quora.com/What-are-the-implementation-challenges-of-a-machine-learning-feature-store
- http://featurestore.org/ (a list of some of the available feature stores)
tristanz about 5 years ago
A great collection of real-world case studies and various implementations can be found here: http://featurestore.org/
encyclopedia about 5 years ago
This is similar to generating feature vectors for dataset augmentation, as described here: https://vectorspace.ai/covid19.html
jamesblonde about 5 years ago
I'm the author. Let me know if you have any questions.