
Show HN: Trench – Open-source analytics infrastructure

155 points by pancomplex 7 months ago
Hey HN! I want to share a new open source project I've been working on called Trench (https://trench.dev). It's open source analytics infrastructure for tracking events, page views, and identifying users, and it's built on top of ClickHouse and Kafka.

https://github.com/frigadehq/trench

I built Trench because the Postgres table we used for tracking events at our startup (http://frigade.com/) was getting expensive and becoming a performance bottleneck as we scaled to millions of end users.

Many companies run into the same problem we did (e.g. Stripe, Heroku: https://brandur.org/fragments/events). They often start by adding a basic events table to their relational database, which works at first but can become an issue as the application scales. It's usually the biggest table in the database, the slowest one to query, and the longest one to back up.

With Trench, we've put together a single Docker image that gives you a production-ready tracking event table built for scale and speed. When we migrated our tracking table from Postgres to Trench, we saw a 42% reduction in cost to serve on our primary Postgres cluster, and all lag spikes from autoscaling under high traffic were eliminated.

Here are some of the core features:

* Fully compliant with the Segment tracking spec, e.g. track(), identify(), group(), etc.
* Can handle thousands of events per second on a single node
* Query tracking data in real time with read-after-write guarantees
* Send data anywhere with throttled and batched webhooks
* Single production-ready Docker image. No need to manage and roll your own Kafka/ClickHouse/Node.js/etc.
* Easily plugs into any cloud-hosted ClickHouse and Kafka solution, e.g. ClickHouse Cloud, Confluent

Trench can be used for a range of use cases. Here are some possibilities:

1. Real-time monitoring and alerting: Set up real-time alerts and monitoring for your services by tracking custom events like errors, usage spikes, or specific user actions, and send that data anywhere with Trench's webhooks.
2. Event replay and debugging: Capture all user interactions in real time for event replay.
3. A/B testing platform: Capture events from different users and groups in real time. Segment users by querying in real time and serve the right experiences to the right users.
4. Product analytics for SaaS applications: Embed Trench into your existing SaaS product to power user audit logs or tracking scripts on your end users' websites.
5. Build a custom RAG model: Easily query event data and give users answers in real time. LLMs are really good at writing SQL.

The project is open source and MIT-licensed. If there's interest, we're thinking about adding support for Elasticsearch, direct data integrations (e.g. Redshift, S3, etc.), and an admin interface for creating queries, webhooks, etc.

Have you experienced the same issues with your events tables? I'd love to hear what HN thinks about the project.
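For readers who haven't used the Segment spec mentioned above, a track() call is just a small JSON payload sent over HTTP. The TypeScript sketch below shows the general shape; the endpoint URL, port, auth header, and batch envelope are illustrative assumptions rather than Trench's documented API, and only the Segment-style field names (type, event, userId, properties) come from the spec the post references.

    // Sketch of sending a Segment-style track() call to a self-hosted tracking
    // endpoint. The URL, port, and auth header below are hypothetical; consult
    // the Trench docs for the real ingestion API.
    const TRACKING_URL = "http://localhost:4000/events"; // assumed endpoint
    const API_KEY = "your-public-api-key";               // assumed auth scheme

    interface TrackEvent {
      type: "track";
      event: string;
      userId: string;
      properties?: Record<string, unknown>;
      timestamp?: string;
    }

    async function track(event: TrackEvent): Promise<void> {
      const res = await fetch(TRACKING_URL, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${API_KEY}`,
        },
        // Segment-style payloads are usually sent in batches; a single-element
        // array keeps the example small.
        body: JSON.stringify({ events: [event] }),
      });
      if (!res.ok) {
        throw new Error(`Tracking request failed: ${res.status}`);
      }
    }

    // Example usage: record a sign-up event for one user.
    track({
      type: "track",
      event: "user_signed_up",
      userId: "user_123",
      properties: { plan: "free", country: "Denmark" },
      timestamp: new Date().toISOString(),
    }).catch(console.error);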

13 comments

bosky101 7 months ago
1) Appreciate the single image to get started, but am particularly curious how you handle different events of a new user going to different nodes.

2) Any admin interface, or just the REST API?

3) A little bit on the ClickHouse table and engine choices?

4) Stats on ingesting and querying at the same time?

5) Node doesn't support the ClickHouse TCP interface. This was a major bottleneck even with batching of 50k events (or 30 secs, whichever comes first).

6) CH indexes?

7) How are events partitioned to a Kafka partition? By userId? Any assumptions on minimum fields?

Will try porting our in-house marketing automation backend (PostHog frontend compatible) to this and see how it goes (150M+ events per day).

Kudos all around. Love all 3 of your technology choices.
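Point 5 refers to batching 50k events or flushing every 30 seconds, whichever comes first. As a rough TypeScript sketch of that size-or-time flush pattern (the EventBatcher class and sendBatch callback are hypothetical, not part of Trench or the commenter's backend):

    // Size-or-time batching: flush when the buffer reaches maxBatchSize events
    // or when maxWaitMs has elapsed since the first buffered event.
    type Event = Record<string, unknown>;

    class EventBatcher {
      private buffer: Event[] = [];
      private timer: ReturnType<typeof setTimeout> | null = null;

      constructor(
        private readonly sendBatch: (events: Event[]) => Promise<void>,
        private readonly maxBatchSize = 50_000, // 50k events, as in the comment
        private readonly maxWaitMs = 30_000,    // or 30 seconds, whichever comes first
      ) {}

      add(event: Event): void {
        this.buffer.push(event);
        if (this.buffer.length >= this.maxBatchSize) {
          void this.flush();
        } else if (!this.timer) {
          this.timer = setTimeout(() => void this.flush(), this.maxWaitMs);
        }
      }

      async flush(): Promise<void> {
        if (this.timer) {
          clearTimeout(this.timer);
          this.timer = null;
        }
        if (this.buffer.length === 0) return;
        const batch = this.buffer;
        this.buffer = [];
        await this.sendBatch(batch); // e.g. one bulk INSERT over ClickHouse's HTTP interface
      }
    }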
hitradostava 7 months ago
Looks interesting, we solved this problem with Kinesis Firehose, S3 and Athena. Pricing is cheap, you can run any arbitrary SQL query and there is zero infrastructure to maintain.
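For context, the ingestion side of a Firehose-to-S3-to-Athena pipeline like this is only a few lines; a minimal TypeScript sketch using the AWS SDK for JavaScript v3, with a placeholder delivery stream name and event shape:

    // Sketch: push a tracking event into a Kinesis Data Firehose delivery stream
    // that lands in S3, where Athena can query it with SQL. The region, stream
    // name, and event shape are placeholders.
    import { FirehoseClient, PutRecordCommand } from "@aws-sdk/client-firehose";

    const firehose = new FirehoseClient({ region: "us-east-1" });

    async function trackEvent(event: Record<string, unknown>): Promise<void> {
      await firehose.send(
        new PutRecordCommand({
          DeliveryStreamName: "tracking-events", // placeholder stream name
          // Newline-delimited JSON keeps the S3 objects easy for Athena to parse.
          Record: { Data: new TextEncoder().encode(JSON.stringify(event) + "\n") },
        })
      );
    }

    trackEvent({ event: "page_view", userId: "user_123", ts: Date.now() })
      .catch(console.error);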
antman 7 months ago
How does it scale? Can you spin up multiple containers? For upcoming features, auto-archiving old data to cloud storage would be great.
Attummm 7 months ago
Looks great, but what is missing for me are use cases.

Why should I use it? What are the unique selling points of your project?
codegeek 7 months ago
Looks good. I'm in the market for something like this and I just ran it locally. How do I visualize data? Is Grafana not included by default?

Also, a minor issue in your docs: there is an extra comma in the sample JSON under the sample event. The fragment below:

      "properties": {
        "totalAccounts": 4,
        "country": "Denmark"
      },
    }]

I had to remove that comma at the end.
d_watt 7 months ago
Looks super interesting. Any positioning thoughts on this vs https://jitsu.com?
brody_slade_ai 7 months ago
I've been exploring open source data analytics software and it's been a game-changer. I mean, the flexibility and cost savings are huge perks. I've been looking into Apache Spark and KNIME, and they both seem like great options.
Incipient 7 months ago
> LLMs are really good at writing SQL

Unfortunately not my experience. Possibly not well prompted, but trying to get VS Code Copilot to generate anything involving semi-basic joins falls quite flat.
oulipo 7 months ago
What is the advantage of this rather than using a Postgres plugin for ClickHouse and S3 storage of the data to build a kind of data warehouse, which wouldn't require the bloat of Kafka?
remram 7 months ago
If you don't mind me asking, why the name "Trench"?
asdev 7 months ago
How is this different from PostHog?
oulipo 7 months ago
Could this be used to log IoT object events? Or is it more for app analytics?
biddendidden 7 months ago
I _totally_ associate 'trench' with 'analytics'. Oh, perhaps the author associates it with 'infrastructure'? Just stupid.