Hi HN,

I’ve been a data engineer for over a decade and have slowly been working on a way for one person to run a data platform (mostly to automate my job).

When I was starting out, I found the wide range of tools required to build data products overwhelming: infrastructure setup, app deployment, pipeline development, creating metrics and dashboards, and finally integrating ML into the product. No wonder companies have 100s of people working on this. So last year I quit the best job in the world to figure this out.

I’d like to introduce Phidata: Building Blocks for Data Engineering.

It works like this:
1. You start with a codebase that has common data tools like Airflow, Superset and Jupyter pre-configured. Infrastructure and Apps are defined as python objects (rough sketch at the end of this post).
2. Build data products (tables, metrics, models) in python or SQL. Test locally using docker and run in production on AWS.
3. Infrastructure, Apps and Data Products live in the same codebase. Teams working together share code and dependencies in a pythonic way.

Using phidata, I’ve been running multiple data platforms and have automated most of my boilerplate code using a specially trained GPT-3 data assistant.

If you work with data and are looking for a better development experience, check out [phidata.com](https://www.phidata.com/) or message me, I’d love your feedback.

Ashpreet
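To make the workflow above concrete, here is a minimal, hypothetical sketch of what "infrastructure, apps and data products as Python objects in one codebase" could look like. The class and field names (App, Table, DataPlatform) are illustrative assumptions, not phidata's actual API; see the docs linked above for the real interface.

```python
# Hypothetical sketch -- these classes are illustrative, not phidata's actual API.
# The idea: apps, infrastructure and data products are plain Python objects in one
# codebase, so the same definitions can be run locally (docker) or deployed to AWS.
from dataclasses import dataclass, field
from typing import List


@dataclass
class App:
    """A long-running service (e.g. Airflow, Superset) run as a container."""
    name: str
    image: str
    port: int


@dataclass
class Table:
    """A data product: a table built by a SQL (or Python) transformation."""
    name: str
    sql: str


@dataclass
class DataPlatform:
    """Everything the platform needs, declared in code and version-controlled."""
    name: str
    apps: List[App] = field(default_factory=list)
    tables: List[Table] = field(default_factory=list)


platform = DataPlatform(
    name="analytics",
    apps=[
        App(name="airflow", image="apache/airflow:2.5.1", port=8080),
        App(name="superset", image="apache/superset:latest", port=8088),
        App(name="jupyter", image="jupyter/minimal-notebook:latest", port=8888),
    ],
    tables=[
        Table(
            name="daily_active_users",
            sql="SELECT event_date, COUNT(DISTINCT user_id) AS dau "
                "FROM events GROUP BY event_date",
        ),
    ],
)
```

The point of the sketch is only the shape: one codebase of Python objects that tooling can render into a local docker environment or an AWS deployment, which is what keeps a single person able to run the whole platform.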
I had the chance to try phidata and I totally recommend it. While the stack itself is solid and easy to install, the fact that you get zero-to-sixty automation with all the integrations already set up makes the whole experience seamless and much more productive.