Tech Echo (科技回声)

2 comments

pitah1, more than 1 year ago

Grats on building this out. I think there is a lot of potential in this space. I very much understand the challenges of financial/regulatory reporting and data quality :).

A couple of things I have noticed. You mention "automates root cause analysis". By this I assume you mean showing which rows have caused the metrics to go out of bounds? Or is there something else I'm missing?

How do users define metrics? I've found this to be a challenge, especially given you may be giving this tool to non-technical users.

Does this support real-time data sources such as Kafka?

Do you plan on supporting cross-dataset validations (e.g. relationships such as an account_id for a transaction should exist in the accounts table)?
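To make the cross-dataset question concrete, here is a minimal sketch of that kind of check in plain Python. It is not Datadrift's API; `find_orphans` and the sample rows are hypothetical names used only for illustration, and it assumes every row carries the key column.

```python
from typing import Any, Iterable, Mapping


def find_orphans(
    child_rows: Iterable[Mapping[str, Any]],
    parent_rows: Iterable[Mapping[str, Any]],
    key: str,
) -> list[Mapping[str, Any]]:
    """Return child rows whose `key` value has no match in the parent rows."""
    # Collect the parent keys once so each child lookup is a set membership test.
    parent_keys = {row[key] for row in parent_rows}
    return [row for row in child_rows if row[key] not in parent_keys]


# Hypothetical sample data: one transaction points at an account that does not exist.
accounts = [{"account_id": "a1"}, {"account_id": "a2"}]
transactions = [
    {"id": "t1", "account_id": "a1"},
    {"id": "t2", "account_id": "a9"},
]

orphans = find_orphans(transactions, accounts, "account_id")
```

Here `orphans` holds only transaction `t2`, since `a9` is absent from the accounts rows.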

maxisaurus, more than 1 year ago

Hey HN,

Sammy and Lucas here. We are building an open-source framework that monitors your metrics, sends alerts when anomalies are detected and automates root cause analysis. Think of Datadrift as a simple & open-source Monte Carlo for the semantic layer era. The repo is at https://github.com/data-drift/data-drift

Datadrift started as an internal tool built at our former company, a large European B2B Fintech. We had data reliability challenges impacting key metrics used for financial and regulatory reporting.

However, when we tried existing data quality tools we were always frustrated. They provide row-level static testing (e.g. uniqueness or nullness), which does not address time-varying metrics like revenues. And commercial observability solutions cost $manyK a month and bring compliance and security overhead.

We designed Datadrift to solve these problems. Datadrift works by simply adding a monitor where your metric is computed. It then understands how your metric is computed and on which upstream tables it depends. When an issue occurs, it pinpoints exactly which rows were updated and introduced the change.

You can also set up alerting and customise it. For example, you can decide to open a GitHub issue and assign it to the analyst owning the revenue metric when a +10% change is detected. We tried to make it easy to customise and developer friendly.

We are thinking of adding features around root cause analysis automation and issue pattern analysis to help data teams improve metrics quality over time. We’d love to hear your feature requests.

Datadrift is built with Python and Go, and licensed under GPL.

Our docs are here: https://github.com/data-drift/data-drift?tab=readme-ov-file#quickstart

Dev setup and demo: https://app.claap.io/sammyt/drift-db-demo-a18-c-ApwBh9kt4p-07oQMdsIzt_e

We’re very eager to get your feedback!
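For readers who want a feel for the idea, here is a rough sketch of the general technique described above: compare two snapshots of the rows behind a metric, flag a relative change above a threshold, and list the rows that drove it. This is an illustration only, not Datadrift's code or API; `RowChange`, `diff_snapshots`, `relative_change`, and the invoice data are all hypothetical, and the metric is assumed to be a simple sum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowChange:
    row_id: str
    before: float  # 0.0 if the row was absent from the earlier snapshot
    after: float   # 0.0 if the row is absent from the later snapshot

    @property
    def delta(self) -> float:
        return self.after - self.before


def diff_snapshots(before: dict[str, float], after: dict[str, float]) -> list[RowChange]:
    """Return added, removed, or updated rows, largest absolute impact first."""
    changes = []
    for row_id in before.keys() | after.keys():
        old = before.get(row_id, 0.0)
        new = after.get(row_id, 0.0)
        if old != new:
            changes.append(RowChange(row_id, old, new))
    # Sort by impact, then by id so the order is deterministic.
    return sorted(changes, key=lambda c: (-abs(c.delta), c.row_id))


def relative_change(before: dict[str, float], after: dict[str, float]) -> float:
    """Relative change of the summed metric between the two snapshots."""
    old_total = sum(before.values())
    if old_total == 0:
        raise ValueError("earlier metric total is zero; relative change is undefined")
    return (sum(after.values()) - old_total) / old_total


# Hypothetical revenue rows keyed by invoice id, captured at two points in time.
snapshot_before = {"inv-1": 100.0, "inv-2": 200.0, "inv-3": 50.0}
snapshot_after = {"inv-1": 100.0, "inv-2": 260.0, "inv-3": 50.0, "inv-4": 15.0}

THRESHOLD = 0.10  # the +10% example from the post
change = relative_change(snapshot_before, snapshot_after)
culprits = diff_snapshots(snapshot_before, snapshot_after) if abs(change) > THRESHOLD else []
```

With this sample data the metric moves from 350 to 425, so the threshold is exceeded and `culprits` lists `inv-2` (updated) ahead of `inv-4` (added). An alerting hook, such as opening an issue, would go where `culprits` is consumed.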

Show HN: Open-Source Observability for the Semantic Layer
