Show HN: Pathway – Build Mission Critical ETL and RAG in Python (NATO, F1 Used)

73 points by janchorowski 12 months ago
Hi HN data folks,

I am excited to share Pathway, a Python data processing framework we built for ETL and RAG pipelines.

https://github.com/pathwaycom/pathway

We started Pathway to solve event processing for IoT and geospatial indexing. Think freight train operations in unmapped depots bringing key merchandise from China to Europe. This was not something we could use Flink or Elastic for.

Then we added more connectors for streaming ETL (Kafka, Postgres CDC…), data indexing (yay vectors!), and LLM wrappers for RAG. Today Pathway provides a data indexing layer for live data updates, stateless and stateful data transformations over streams, and retrieval of structured and unstructured data.

Pathway ships with a Python API and a Rust runtime based on Differential Dataflow to perform incremental computation. The entire pipeline is kept in memory and can be easily deployed with Docker and Kubernetes (pipelines-as-code).

We built Pathway to support enterprises like F1 teams and NATO in building mission-critical data pipelines. We do this by putting security and performance first. For example, you can build and deploy self-hosted RAG pipelines with local LLMs and Pathway's in-memory vector index, so no data ever leaves your infrastructure. Pathway connectors and transformations work with live data by default, so you can avoid expensive reprocessing and rely on fresh data.

You can install Pathway with pip and Docker, and get started with templates and notebooks: https://pathway.com/developers/showcases

We also host demo RAG pipelines implemented 100% in Pathway. Feel free to interact with their API endpoints: https://pathway.com/solutions/rag-pipelines#try-it-out

We'd love to hear what you think of Pathway!
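For readers new to the "incremental computation" mentioned in the post, here is a rough plain-Python sketch of the general idea: results are maintained by applying insert/retract deltas instead of recomputing over all past data. This is not Pathway's API or its Rust runtime. The class name `IncrementalSum` and the `(key, value, diff)` update format are assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical update format: (key, value, diff),
# where diff is +1 for an insertion and -1 for a retraction.
Update = Tuple[str, float, int]


class IncrementalSum:
    """Keeps a per-key running sum by applying deltas, never rescanning history."""

    def __init__(self) -> None:
        self.sums: Dict[str, float] = defaultdict(float)
        self.counts: Dict[str, int] = defaultdict(int)

    def apply(self, updates: Iterable[Update]) -> Dict[str, Optional[float]]:
        """Apply a batch of updates; return the new sum for each key touched.

        A key whose rows have all been retracted is reported as None
        and dropped from the internal state.
        """
        touched = set()
        for key, value, diff in updates:
            self.sums[key] += value * diff
            self.counts[key] += diff
            touched.add(key)

        result: Dict[str, Optional[float]] = {}
        for key in touched:
            if self.counts[key] == 0:
                # No live rows remain for this key: forget it entirely.
                del self.sums[key]
                del self.counts[key]
                result[key] = None
            else:
                result[key] = self.sums[key]
        return result
```

Each call to `apply` does work proportional to the size of the incoming batch, not to the total data seen so far, which is the property that lets a streaming pipeline avoid reprocessing.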

8 comments

threecheese 11 months ago
I am curious about your hosting; the Community plan notes "8 GB RAM - 4 cores"; is there some element to Pathway that is always hosted and would utilize this capacity - even for local deployments? Or is this just "Hey, if you want to play around on Pathway hardware, this is how much you can use"? This looks amazing, and I am wondering where "the rub" is :)
pipboyguy 12 months ago
I've built DE and AI solutions based on Pathway for multiple clients. It's robust and fast.
sriyansh7 11 months ago
Congrats on the launch! If I understood it correctly, you also build vector indexes on the fly on live data? Curious - what use cases are you seeing for RAG on streaming data?
snowpid 12 months ago
Good old "Enterprise" NATO! Always good for a surprise
Arimbr 12 months ago
If the entire pipeline and the vector index are kept in memory... does Pathway still persist state somewhere?
articsputnik 12 months ago
Great job on Pathway. It's impressive to see a Python tool for ETL and RAG tasks with such strong features. The Python API and Rust runtime for quick updates look interesting. Focusing on security and performance, especially with self-hosted RAG pipelines, is fantastic. Excited to see how this OSS repo grows.
devnull777 12 months ago
Looks nice! The examples on your site look easy to reproduce!

BTW, super nice and clear website!
alexmarquardt 12 months ago
Looks awesome!