As a maintainer and end-user, my answer to this is...yes and no. It's important to clarify that stability, something mentioned in the article, has several major definitions:<p>- Stability in the specification<p>- Stability in semantic conventions<p>- Stability in the protocol representation<p>- Stability in SDKs that can generate data<p>- Stability in the Collector that can receive, process, and export that data<p>Unfortunately, many people interpret "stable" in one of those categories as "stable for everything", and then get really annoyed when they find their language doesn't actually have stable support (or any support!) for that concept.<p>What I'm most proud of in 2023 is all of the little things we made progress on with components that engineers have to materially deal with. On the website, we documented what feels like a million little things and clarified tons of concepts that people told us were confusing. Across all the SDKs, we fixed tons of little bugs, added more and more instrumentations, and completed the unsexy work to make metrics generation stable across most of our 11+ languages. The Collector added oodles and oodles of support for different data sources, and OTTL went from a neat component to a rock-solid general-purpose data transformation tool.<p>There's so much more work to do, but I'm really happy about the progress.
The biggest issue with OpenTelemetry is how aggressively it's being pushed despite not being mature enough. The AWS X-Ray team frequently suggests switching to OTel on bug reports and feature requests, but the performance and resource overhead of the OTel Collector for Lambda is just awful right now. It doesn't make sense for any performance-sensitive workload.<p>Beyond that, it gives off an "over-engineered" vibe. It's probably not, and a unified standard that has to work across so many different variations is inherently going to need a lot of abstractions, but it feels so much more difficult to go through OpenTelemetry compared to an opinionated observability SaaS.
Depends on who you ask.<p>I am glad that the observability sector has standardized on a common protocol, but my god are the reference implementations lacking.
OpenTelemetry is a great concept, but in my experience it's not quite there yet. The docs especially fall into the common trap of handling the happy-path, hello-world quickstarts, then becoming increasingly useless as you try to get beyond that to real-life use cases. Given the inherent complexity that comes from trying to unify different approaches around one standard, things that should be simple sometimes end up more difficult than they need to be. I'm sure it will keep improving.
Give me something that isn't based on protobufs at the wire/request level. CBOR with CDDL would make for a fully standards-based approach that can work at any size of the stack.
As a relative outsider to the observability space, I have always wondered this:<p>Is observability/telemetry only about engineering-related issues (performance, downtime, bottlenecks, etc.), or does it include the "phone-home" type of telemetry (user usage statistics, user journeys)? Looking through the websites of most of the observability SaaSes, they seem to talk only about the first. Then how do people solve the second? Is it with manual logging to the server from the client?