Python ETL with Airbyte and Pathway

3 points by janchorowski about 1 year ago

2 comments

janchorowski about 1 year ago
Now you can use Airbyte source connectors to process data in memory with Python.

We integrated Airbyte connectors with Pathway, a Python stream processing framework, using the airbyte-serverless project. We believe ETL pipelines are coming back, with many use cases in AI (RAG pipelines), ETL for unstructured data, and pipelines that deal with PII. In this article, we show how to stream data from GitHub using Airbyte and remove PII with Pathway. We are curious to hear your feedback on the implementation, and any other use cases you can think of that come from decoupling the extract and load steps.
Arimbr about 1 year ago
Interesting implementation! For complex stream and text processing, I also prefer processing data in memory with Python (ETL) rather than SQL in the warehouse (ELT).
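
For readers who want a concrete picture of the "transform in memory between extract and load" idea discussed in this thread, here is a minimal sketch in plain Python. It does not use the Airbyte or Pathway APIs; `extract`, `scrub_record`, `transform`, and `load` are hypothetical names, the sample records are made up, and the email regex is a simplification rather than a real PII detector.

```python
import re
from typing import Any, Dict, Iterable, Iterator, List

# Simplified email pattern: an assumption for illustration, not a complete PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical field names treated as PII and dropped entirely.
PII_FIELDS = {"email", "name"}


def extract() -> Iterator[Dict[str, Any]]:
    """Stand-in for a source connector: yields records one at a time."""
    yield {
        "sha": "abc123",
        "name": "Jane Doe",
        "email": "jane@example.com",
        "message": "Fix bug, ping jane@example.com",
    }
    yield {
        "sha": "def456",
        "name": "John Roe",
        "email": "john@example.com",
        "message": "Update docs",
    }


def scrub_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Drop PII fields and redact email addresses found in free-text values."""
    clean: Dict[str, Any] = {}
    for key, value in record.items():
        if key in PII_FIELDS:
            continue  # remove the whole field
        if isinstance(value, str):
            value = EMAIL_RE.sub("[REDACTED]", value)
        clean[key] = value
    return clean


def transform(records: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Apply the scrubber lazily, so records are processed as they arrive."""
    for record in records:
        yield scrub_record(record)


def load(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Stand-in for a sink: here it just collects the cleaned records."""
    return list(records)


if __name__ == "__main__":
    for row in load(transform(extract())):
        print(row)
```

Because the scrubbing happens before `load`, the sink never sees the raw PII, which is the main practical difference from an ELT flow where raw records land in the warehouse first.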