Show HN: Log collector that runs on a $4 VPS

118 points by Nevin1901, over 2 years ago
Hey guys, I'm building erlog to try to solve problems with logging. While trying to add logs to my application, I couldn't find any lightweight log platform that was easy to set up without adding tons of dependencies to my code or configuring 10,000 files.

ErLog is just a simple Go web server that batch-inserts JSON logs into an sqlite3 database. Through tuning sqlite3 and batching inserts, I find I can get around 8k log insertions/sec, which is fast enough for small projects.

This is just an MVP, and I plan to add more features once I talk to users. If anyone has any problems with logging, feel free to leave a comment and I'd love to help you out.
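The batching idea in the post can be sketched roughly as follows. This is not ErLog's code (which is written in Go); it is a minimal Python illustration using the standard-library `sqlite3` module. The `LogBatcher` class, the `logs` table, the batch size, and the two PRAGMA settings are all assumptions made for the example.

```python
import json
import sqlite3


class LogBatcher:
    """Buffers JSON log records and writes each full batch in one transaction."""

    def __init__(self, path=":memory:", batch_size=500):
        self.conn = sqlite3.connect(path)
        # WAL and synchronous=NORMAL are common SQLite settings for write-heavy
        # workloads; whether ErLog uses them is an assumption.
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute("PRAGMA synchronous=NORMAL")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS logs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
        )
        self.batch_size = batch_size
        self.pending = []

    def add(self, record):
        """Queue one record; write the batch once it reaches batch_size."""
        self.pending.append((json.dumps(record),))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Insert all queued records inside a single transaction."""
        if not self.pending:
            return
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany("INSERT INTO logs (body) VALUES (?)", self.pending)
        self.pending = []

    def count(self):
        """Number of rows already written to the table."""
        return self.conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
```

A real server would also flush on a timer, so that a quiet period does not leave records sitting in memory.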

20 comments

Dachande663, over 2 years ago
I’ve found the hard part is not so much the collection of logs (especially at this scale), but the eventual querying. If you’ve got an unknown set of fields being logged, queries very quickly devolve into lots of slow table scans or needing materialised views that start hampering your ingest rate.

I settled on a happy/ok midpoint recently whereby I dump logs in a Redis queue using Filebeat, as it’s very simple. Then I have a really simple queue consumer that dumps the logs into ClickHouse using a schema Uber detailed (split keys and values), so queries can be pretty quick even over arbitrary fields. 30,00 logs an hour and I can normally search for anything in under a second.
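One way to picture the "split keys and values" idea: each log record is flattened into parallel lists of keys and values, which could then be stored in array columns. The sketch below is only an illustration of that shape in Python; the function names and the `string_keys` / `number_keys` column names are hypothetical, and it does not claim to reproduce the schema the comment refers to.

```python
def flatten(obj, prefix=""):
    """Yield (dotted_key, value) pairs for every leaf in a nested dict."""
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def split_keys_values(record):
    """Return parallel key/value lists, keeping string and numeric values apart."""
    row = {"string_keys": [], "string_values": [], "number_keys": [], "number_values": []}
    for key, value in flatten(record):
        # bool is a subclass of int in Python, so exclude it from the numeric branch
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            row["number_keys"].append(key)
            row["number_values"].append(float(value))
        else:
            row["string_keys"].append(key)
            row["string_values"].append(str(value))
    return row
```

Because every record ends up with the same four columns regardless of which fields it carries, new fields need no schema change.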
folmar, over 2 years ago
Sorry, but I don't see the selling point yet. Rsyslog has the omlibdbi module, which sends your data to SQLite. It can consume pretty much any standard protocol on input, is already available, and is battle-proven.
unxdfa, over 2 years ago
I see your idea, but you could drop the JSON and use rsyslogd + logrotate + grep? You can grep 10 gig files on a $5 VPS easily and quickly! I can't speak for a $4 one ;)
Thaxll, over 2 years ago
You could have just used Filebeat? It's also in Go and it's pretty easy to use.

https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-httpjson.html
rsdbdr203, over 2 years ago
This is exactly why I built log-store. It can easily handle 60k logs/sec, but I think the query interface is more important: commands to help you extract value from your logs, including custom commands written in Python.

Free through '23 is my motto... Just a solo founder looking for feedback.
remram, over 2 years ago
May be more widely applicable for personal servers: lnav, an advanced log file viewer for the terminal: https://lnav.org/

It uses SQLite internally but can parse log files in many formats on the fly. C++, BSD license, discussed 1 month ago: https://news.ycombinator.com/item?id=34243520
keroro, over 2 years ago
If anyone's looking for similar services, I'm using vector.dev to move logs around & it works great & has a ton of sources/destinations pre-configured.
Hamuko, over 2 years ago
I feel like if you're going to use "$4 VPS" as a quantifier, you could at least specify which $4 VPS is being used.
withinboredom, over 2 years ago
Neat! Have you considered using query params instead of bodies, then just piping the access logs to a spool (no program actually on the server, just return an empty file)? Then your program can just read from the spool and dump them into SQLite.

That should tremendously improve throughput, at the expense of some latency.
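The reading side of that suggestion might look something like this: pull the query string out of each access-log line and turn it into a record. This is a minimal Python sketch that assumes the web server writes the usual common/combined log format; the `/log` path and the `level` and `msg` fields in the usage example are hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Matches the quoted request line of a common/combined-format access log entry.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def record_from_access_line(line):
    """Return the query parameters of one access-log line as a dict, or None."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    query = urlsplit(match.group(1)).query
    # parse_qsl percent-decodes values and turns '+' into a space
    return dict(parse_qsl(query))
```

For example, a line whose request is `"GET /log?level=error&msg=disk+full HTTP/1.1"` yields `{"level": "error", "msg": "disk full"}`. One trade-off of this design is that URL length limits cap how large a single log record can be.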
aninteger, over 2 years ago
I'm doing something similar with a $5 VPS, but with fastcgi/C++/sqlite3. I then have a cron job that aggregates error logs, generates a summary, and posts it to a Slack channel. Personally I wish I didn't have to write it, but it works.
andymac4182, over 2 years ago
I have been using https://datalust.co/ to handle this. It scales really well down and up to how much you want to spend. It comes with existing integrations with a lot of libraries and formats, and a CLI to push data from file-based logs to their service.

They have just added a new parser and query engine written in Rust to get the best performance out of your instance. https://news.ycombinator.com/item?id=34758674
cnkk, over 2 years ago
I have been using vector.dev for a long time now. It is also easy to set up, and it looks similar to your idea.
ilyt, over 2 years ago
...uh, just rsyslog and files? I think it can even write to SQLite.
marcrosoft, over 2 years ago
Woah, cool. I did the same thing. I made a poor man’s small-scale Splunk replacement with SQLite, JSON, and Go. I used the built-in JSON and full-text search extensions.
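As a rough idea of what querying JSON logs with SQLite's built-in JSON functions can look like, here is a small Python sketch using `json_extract`. It assumes the SQLite library bundled with your Python has the JSON functions enabled (they are built in by default in recent SQLite releases). The `logs` table, the helper names, and the `service` / `level` fields are hypothetical, and full-text search is not shown.

```python
import json
import sqlite3


def load_logs(records):
    """Store each record as a JSON string in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (body TEXT NOT NULL)")
    with conn:
        conn.executemany(
            "INSERT INTO logs (body) VALUES (?)",
            [(json.dumps(r),) for r in records],
        )
    return conn


def errors_by_service(conn):
    """Count error-level logs per service by reading fields out of the JSON text."""
    return conn.execute(
        "SELECT json_extract(body, '$.service') AS service, COUNT(*) "
        "FROM logs "
        "WHERE json_extract(body, '$.level') = 'error' "
        "GROUP BY service ORDER BY service"
    ).fetchall()
```

Without an index on the extracted expression, a query like this scans the whole table, which ties back to the querying concerns raised earlier in the thread.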
Weryj, over 2 years ago
I run a self-hosted version of Sentry.io on a NUC at home and a relay on a VPS, then use Tailscale to connect the two.

If you have an old computer at home, using a VPS as the gateway is always a good option.

Edit: you can then use the VPS as an exit node for internet.
harisamin, over 2 years ago
Ah cool! Somewhat related: I built a JSON log query tool recently using Rust and SQLite. Didn’t build the server part of it.

https://github.com/hamin/jlq
arjvik, over 2 years ago
I'm working on a project where I'm handling simultaneous connections to a bunch of peers. What's the best way to log messages to trace the flow of requests through my system when multiple code paths are running asynchronously (NodeJS, so I can't simply get a thread ID)?
int0x2e, over 2 years ago
I strongly urge people to try something like Application Insights. It's not dirt cheap, but not that expensive, and lets you collect anything you'd want and query your telemetry/logs retroactively extremely flexibly. It's just great.
maybesimpler, over 2 years ago
You could also not write your own server. Just configure OpenResty and write some simple Lua to push to the Redis queue. Then consume the queue in your language of choice and write to your store (ClickHouse).
vbezhenar, over 2 years ago
Logs must be stored in S3; it's a no-brainer. Disk storage is too expensive. A logging system should be designed for S3 from the ground up, IMO.