Kedro: open-source library for production-ready machine learning code

74 points by ereli1 almost 6 years ago

6 comments

FridgeSeal almost 6 years ago
> Machine learning models which can be deployed effortlessly and operate unattended are far more likely to achieve commercial objectives.

The likelihood of achieving commercial objectives is tied to the commercial usefulness and accuracy of your analysis and predictions, not the ease of deployment or, even more curiously, the ability to be left unattended.
prepend almost 6 years ago
I really like how they implemented the data catalog [0] so that it’s YAML-based and also has a paths-style cascading method of files that can be common across or within teams as well as personal for individual projects. I think this makes it easy to build tools on top of it for meta-analysis (how many data sets are used, etc.) and even visualization using a variety of tools, rather than having the metadata management tied to a system or product.

Are there other techniques for data catalogs that are file-based, or at least open-standard-based, and that scale all the way up from an individual developer?

[0] https://kedro.readthedocs.io/en/latest/04_user_guide/04_data_catalog.html
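As a rough illustration of the cascading idea described in this comment, here is a minimal sketch in plain Python. It is not Kedro's loader or API: the three layers (team, project, personal), the dataset names, the file paths, and the `merge_catalogs` and `count_by_type` helpers are all hypothetical, and plain dicts stand in for parsed YAML files.

```python
from collections import Counter


def merge_catalogs(*layers):
    """Merge catalog layers in order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        for name, entry in layer.items():
            # Shallow-merge per dataset so a layer can override a single field.
            merged[name] = {**merged.get(name, {}), **entry}
    return merged


def count_by_type(catalog):
    """Example of simple meta-analysis: how many datasets of each type."""
    return Counter(entry["type"] for entry in catalog.values())


# Hypothetical layers, standing in for YAML files shared at different scopes.
team = {
    "customers": {"type": "csv", "filepath": "s3://team-bucket/customers.csv"},
    "orders": {"type": "parquet", "filepath": "s3://team-bucket/orders.parquet"},
}
project = {
    "features": {"type": "parquet", "filepath": "data/features.parquet"},
}
personal = {
    # Personal override: point one dataset at a local sample file.
    "customers": {"filepath": "data/local/customers_sample.csv"},
}

catalog = merge_catalogs(team, project, personal)
print(catalog["customers"])
print(count_by_type(catalog))
```

Because the merged catalog is just data, any tool that can read the files can compute statistics or draw a lineage view from it, which is the property the comment highlights.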
domenicrosati almost 6 years ago
Conjecture: production quality of ML code has mostly to do with how heuristics are designed and battle-tested, and almost nothing to do with how the training/inference pipeline is constructed.
bserial almost 6 years ago
I’m curious whether anyone can say how this compares to Dagster, since both libraries seem to rely on deploying to engines like Airflow.
wokwokwok almost 6 years ago
tldr, if you really dig past the marketing (from the FAQ (1)):

> We see Airflow and Luigi as complementary frameworks: Airflow and Luigi are tools that handle deployment, scheduling, monitoring and alerting. Kedro is the worker that should execute a series of tasks, and report to the Airflow and Luigi managers.

> Create the data transformation steps as pure Python functions

Personally, I feel mystified why you would use something like this rather than a more mature product like, say, Spark, that natively supports clustering, etc., which is what I would really like to see in the FAQ.

Is it a processing solution? Not really, since it suggests you can offload the heavy lifting to an engine, e.g. Spark. An orchestrator? Apparently not, because that's a complementary product. So... it's like, a configuration management tool?

Pretty hard for me to see the use case.

1. https://kedro.readthedocs.io/en/latest/06_resources/01_faq.html#how-does-kedro-compare-to-other-projects
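For readers unfamiliar with the "pure Python functions" idea quoted from the FAQ, it can be sketched without any framework. This is an illustration only, not Kedro's API: the `clean` and `scale` steps, the `PIPELINE` list, and the `run_pipeline` helper are hypothetical names, and a real tool would add I/O, dependency resolution, and error handling on top.

```python
def clean(raw):
    """Pure step: drop missing values, returning a new list."""
    return [x for x in raw if x is not None]


def scale(cleaned):
    """Pure step: rescale values to the 0-1 range.

    Assumes at least two distinct values, otherwise hi - lo is zero.
    """
    lo, hi = min(cleaned), max(cleaned)
    return [(x - lo) / (hi - lo) for x in cleaned]


# Each step is declared as (function, input name, output name).
PIPELINE = [
    (clean, "raw", "cleaned"),
    (scale, "cleaned", "scaled"),
]


def run_pipeline(steps, data):
    """Run the steps in order, passing named datasets between them."""
    store = dict(data)  # copy so the caller's dict is not mutated
    for func, src, dst in steps:
        store[dst] = func(store[src])
    return store


result = run_pipeline(PIPELINE, {"raw": [2, None, 4, 6]})
print(result["scaled"])  # [0.0, 0.5, 1.0]
```

Since the steps only map inputs to outputs, the same functions could in principle be handed to an external scheduler as tasks, which is the division of labour the quoted FAQ passage describes.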
coverman almost 6 years ago
Starting to see a lot of these frameworks pop up to simplify deployment of machine learning models. I’m really hoping one or two start to stand out...but it doesn’t feel like this one.