Hey HN community,<p>I recently pushed an update to my GitHub repo, "Practical Data Engineering: A Hands-On Real-Estate Project Guide". This open-source project tackles real-world data engineering challenges while exploring a range of technologies. It guides you through building a data application that collects, enriches, and visualizes real-estate data, potentially helping you find your dream property.<p>The project covers web scraping with Beautiful Soup, processing data with Spark and Delta Lake, visualizing with Apache Superset, and much more, all orchestrated on Kubernetes for scalability.<p>I started this project back in November 2020, mainly to learn and teach data engineering. Three years on, I'm fascinated that although the data engineering space moves extremely fast, the core of the project, built on carefully chosen tools from the Open Data Stack, is still relevant. The blog post about this project is my most searched post on Google, which motivated me to update it.<p>I upgraded tools like Dagster to their latest versions and explored new additions like delta-rs, which lets you interact with Delta tables directly from Python.<p><a href="https://github.com/sspaeti-com/practical-data-engineering">https://github.com/sspaeti-com/practical-data-engineering</a><p>I look forward to your thoughts and to seeing what you would build differently. My future plans are to add Rill Developer as a code-first BI tool and to bring DuckDB or Polars into the mix.