Ask HN: Any books/resources on logging and monitoring?

3 points by hgl, about 5 years ago
I've been running a few distributed web services, but apart from some rudimentary nginx access logs, I have no idea how they're functioning.

I have the following goals and questions about implementing a logging & monitoring system to get better insight into them:

- What are the best practices for instrumenting source code to collect general logs and exceptions?
- How do I determine whether the services and databases are performing efficiently? More specifically, what can I do to discover whether they are doing unnecessary work or have any hotspots?
- Are the servers they run on overloaded? If so, what is overloading them?
- How do I know if someone is trying to break into the servers?
- How can I be alerted whenever one of the bad things mentioned above happens?

And then there is the business logic side of things, like how many users are online, how many transactions are currently being processed, etc. I don't suppose directly querying the production database is a good idea.

My own research online surfaced a great many tools like Prometheus, the ELK stack, Fluentd, Nagios, Bugsnag, New Relic, Datadog, etc., which overwhelmed me, and I reckon that without a good understanding of logging and monitoring in general, I'm likely to pick the wrong tools.

This feels like a really big topic. Any books/resources that have a comprehensive introduction?

2 comments

sid-, about 5 years ago
https://dzone.com/articles/distributed-tracing-with-zipkin-and-elk
https://github.com/jaegertracing/jaeger
A quick search yielded these interesting projects.
jdale27, about 5 years ago
The Google SRE book (online here: https://landing.google.com/sre/sre-book/toc/index.html) might be useful, specifically chapters 6 and 10, on monitoring and alerting.