I’ve been looking to instrument my app with OpenTelemetry, and I was wondering what your experiences with it have been.<p>Would you use it for small/medium-sized projects?<p>What was the most annoying part about setting it up?<p>What’s your general experience with the OpenTelemetry ecosystem?<p>I’m curious about any opinions you might have.
What I wanted: A small library that just prints to stdout in a standard format. No network code in that library. A separate executable that actually sends the logs to wherever needed, and whose vulnerabilities can be fixed independently from my app.<p>What I got: The Kubernetes of observability. Load a bunch of libraries that connect to the network from inside your app. If you can't use automatic instrumentation, then you have to set up your exporters, providers, processors, and whatever else. Everything feels too brittle. Its big surface area means I fear the next refactor, where every setting will get a different name and functions get deprecated even if the actual telemetry format doesn't change.
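To make the "what I wanted" half concrete, here is a minimal sketch of that shape: a helper that writes one JSON object per line to stdout and contains no network code. It assumes a separate process reads the program's stdout and ships it elsewhere. The function name `emit` and the field names are made up for illustration and are not part of any OpenTelemetry API.

```python
import json
import sys
import time


def emit(event, **fields):
    """Write one JSON object per line to stdout and return the line.

    No sockets and no exporters: shipping the output is left to a
    separate process that reads this program's stdout.
    (`emit` and its field names are hypothetical, for illustration only.)
    """
    record = {"ts": time.time(), "event": event, **fields}
    # default=str keeps the helper from failing on values such as datetimes.
    line = json.dumps(record, sort_keys=True, default=str)
    sys.stdout.write(line + "\n")
    sys.stdout.flush()
    return line


if __name__ == "__main__":
    emit("checkout.completed", order_id="A-1001", duration_ms=182, status="ok")
```

The trade-off is the one the comment describes: the app stays free of network dependencies, but batching, retries, and delivery all become the job of the separate shipper.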
I was initially very skeptical, but I spent the effort researching what it is and also getting up to date with the Observability movement.<p>What I found, in as few bullets as possible:<p>* The hype is obnoxious and unfortunately hides the really useful stuff. Here is my summary of the Observability movement in one sentence:<p><pre><code> Record telemetry as unique, high-context events (unlike typical logs), containing many keys (as an alternative to having too many metrics), and provide a convenient way to query them instead of trying to frame everything in a dashboard.
</code></pre>
* The metrics and logging stuff is almost an afterthought. Think of it as a bonus to the distributed tracing story.<p>* Distributed tracing is really, really useful for debugging and should be what application developers focus on first in their monitoring story. Don't expect monitoring tools designed for infrastructure to help you much. The reason is simple - debugging is inherently reactive; most other tooling suggests you take proactive approaches, which are useful for known issues. When you are debugging production systems, you are trying to understand previously unknown issues.<p>* OTel is really simple; often automatic instrumentation is not worth the trouble, and simpler libraries where you manually instrument your code are not hard at all to use. My experience is with Clojure, so YMMV.<p>* OTel backends can range from very affordable to very expensive, so shop around. This is a benefit of having a standard!<p>I think distributed tracing is perfectly suitable for small projects, even one-service ones; I think one should add it to a project before even thinking about deployment. And it is mandatory for micro-services.
I always had in my back pocket a goal to learn OpenTelemetry, but some recent unfavorable HN commentary left me surprised and deflated.<p>I wouldn’t mind reading some more critical review.
<a href="https://news.ycombinator.com/item?id=37295097">https://news.ycombinator.com/item?id=37295097</a>
A lot of people complaining about OpenTelemetry assume that it’s a “fancy logging API” and are disappointed when they discover that it is something else.<p>It’s actually a vendor-agnostic replacement for the client side of DataDog, New Relic, or Azure App Insights.<p>It’s complicated because those tools are complicated.<p>It’s especially complicated because it needs to support the special needs of library vendors, third party plugins, and framework-level integrations.<p>So no, it’s never going to be “simple” in the same way there will never be a simple replacement for something as complex as a Word document.<p>No, ASCII won’t cut it. Yes it’s simple and lightweight, but not what people actually want.
OpenTelemetry is kind of a standard for logs with a bunch of libraries. These libraries have different standards (in terms of optimization rigor). The whole thing is not well orchestrated together. So OT might be good for a certain language/stack and completely suck for another. Also, your instrumentation software might be OT-compatible but have nothing whatsoever to do with OT the organization.<p>At this point, you'd be shopping for cloud providers and using the one that makes the most sense in terms of performance and ease of integration.
> Would you use it for small/medium sized projects?<p>Depends what your requirements are - what's the problem you're trying to solve? The size doesn't matter. You may want to instrument a 3-line script and ignore a massive app. It's fine either way.<p>> What was the most annoying part about setting it up?<p>That's going to be highly dependent on which part of OpenTelemetry you use and how. It's a protocol with lots of possible uses and implementations. You'll need to be more specific about what you're expecting to send, from where, and to what system.<p>In general, if you need it, it works. But what I'm mostly trying to say is that you'll need to do some more research and info gathering on your own to ask questions that others can answer. This one is at the level of "what's your opinion on cars?"
I really like it. Especially together with otel-collector, it’s easy to get logs, tracing and metrics in a standardised manner that allows you to forward any metrics/tracing to SRE teams or partners.<p>For example, send your logs to Splunk, send your traces to New Relic, and send your metrics to Prometheus. Also send a filtered, limited set of traces and logs to a partner that wants the details. It’s great.<p>The only annoying bit is that not all functionality described in the spec is implemented in each language. For example, exemplars are not supported in JavaScript/Node.<p>I do hope that the big parties (Datadog, Dynatrace, New Relic) will support semantic conventions better and rely less on their own agents.
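The fan-out described above can be modelled in a few lines of plain Python. This is only a sketch of the routing idea, not how the collector is configured (the collector expresses this as pipelines in its configuration file). The destination names, the `partner_visible` flag, and the function names are hypothetical, and nothing here talks to Splunk, New Relic, or Prometheus.

```python
from collections import defaultdict

# Hypothetical destination names taken from the comment above.
ROUTES = {
    "logs": ["splunk"],
    "traces": ["new_relic"],
    "metrics": ["prometheus"],
}


def partner_filter(item):
    """Hypothetical rule: share only traces and logs flagged as partner-visible."""
    return item["signal"] in ("traces", "logs") and item.get("partner_visible", False)


def route(items):
    """Return a mapping of destination name -> list of items sent there."""
    outbox = defaultdict(list)
    for item in items:
        # Every item goes to the destinations configured for its signal type.
        for destination in ROUTES.get(item["signal"], []):
            outbox[destination].append(item)
        # A filtered subset is additionally copied to the partner.
        if partner_filter(item):
            outbox["partner"].append(item)
    return dict(outbox)


if __name__ == "__main__":
    sample = [
        {"signal": "logs", "body": "order placed", "partner_visible": True},
        {"signal": "traces", "name": "GET /orders"},
        {"signal": "metrics", "name": "http.requests", "value": 1},
    ]
    for destination, batch in route(sample).items():
        print(destination, len(batch))
```

The point of doing this in a separate component is that the application emits each signal once, and the choice of who receives what can change without touching application code.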
We use Tempo and Python. Implementation was super fast, and it now lets us optimize our API endpoints.
My conclusion: OpenTelemetry is great, go for it, and find a lib to easily instrument your code.
I looked at it a few years ago because the small company I was working at wanted a cheaper replacement for New Relic.<p>OpenTelemetry was...dogshit. It was fucking awful. The support for async code was abysmal, which meant any well written application was a nightmare to onboard for it. Maybe this has changed, but three years ago it was terrible.