Empowering data scientists with a feature store

39 points by yiksanchan over 3 years ago

4 comments

tronbabylove over 3 years ago
Interesting, thanks for sharing.

How do you handle historical backfill for new features? As in, some feature that can be updated in streaming fashion but whose initial value depends on data from the last X years, e.g., total # of courses completed since sign-up.

Also, who is responsible for keeping the Flink jobs running: the data scientists, or do you have a separate streaming platform team?
s_Hogg over 3 years ago
This reads like it was written a few years ago, to my mind (source: I've been working in ML for most of a decade now).

Disintermediating data pipeline creation is nothing new at this point, and the technologies aren't that novel either. I'd be surprised that this is on the front page, but it takes time for the lessons in this article to be learnt by enough people that they become humdrum.

Above all, it reminds me of a consultant friend telling me he had two clients who built feature stores - one with an open-ended goal of enabling people and one because they had some specific things they wanted to achieve. The outcomes they got were as dissimilar as their motives!
snidane over 3 years ago
I'm struggling to understand what the feature store is.

Is it another name for an OLAP or BI cube? I.e., a huge precomputed GROUP BY query with rollups.

The only new thing I see is that it combines both historical and recent data. Kinda like an OLAP cube with lambda architecture.
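
For what it's worth, the mental model in the comment above (a precomputed group-by over historical data, topped up with recent events at read time) can be sketched roughly as below. This is only an illustration under stated assumptions, not how the article's feature store works: the event tuples, `batch_view`, and `serve_feature` are all hypothetical names made up for the example.

```python
from collections import Counter

# Hypothetical event records as (user_id, day) pairs. The "historical" list
# stands in for data already processed by a batch job; the "recent" list
# stands in for events that arrived after the last batch run.
historical_events = [("u1", 1), ("u1", 2), ("u2", 2)]
recent_events = [("u1", 3), ("u3", 3)]


def batch_view(events):
    """Precompute a per-user count, i.e. a small group-by over history."""
    return Counter(user for user, _ in events)


def serve_feature(user, batch, recent):
    """Combine the precomputed count with events not yet in the batch view."""
    recent_count = sum(1 for u, _ in recent if u == user)
    return batch.get(user, 0) + recent_count


batch = batch_view(historical_events)
features = {u: serve_feature(u, batch, recent_events) for u in ("u1", "u2", "u3")}
print(features)
```

The batch view plays the role of the cube-like rollup, and the read-time merge is the lambda-style part; a real system would also have to handle deduplication and late-arriving events, which this sketch ignores.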
ibgeek over 3 years ago
There are some nice insights and engineering ideas in here. Thanks for writing this and sharing!