Empowering data scientists with a feature store

39 points by yiksanchan, over 3 years ago

4 comments

tronbabylove, over 3 years ago
Interesting, thanks for sharing.

How do you handle historical backfill for new features? That is, a feature that can be updated in streaming fashion but whose initial value depends on data from the last X years, e.g., the total number of courses completed since sign-up.

Also, who is responsible for keeping the Flink jobs running: the data scientists, or do you have a separate streaming platform team?
s_Hogg, over 3 years ago
This thing reads like it was written a few years ago, to my mind (source: I've been working in ML most of a decade now).

Disintermediation of data pipeline creation is definitely nothing new at this point, and the technologies aren't that novel either. I'd be surprised that this is on the front page, but it takes time for the lessons in this article to be learnt by enough people that they become humdrum.

Above all, it reminds me of a consultant friend telling me he had two clients who built feature stores: one with an open-ended goal of enabling people and one because they had some specific things they wanted to achieve. The outcomes they got were as dissimilar as their motives!
snidane, over 3 years ago
I'm struggling to understand what the feature store is.

Is it another name for an OLAP or BI cube? I.e., a huge precomputed group-by query with rollups.

The only new thing I see is that it combines both historical and recent data. Kinda like an OLAP cube with lambda architecture.
ibgeek, over 3 years ago
There are some nice insights and engineering ideas in here. Thanks for writing this and sharing!