Hmm, as someone who uses and de facto manages Datadog for a sizable org in their day job, I'm not sure that OpenTelemetry PR fiasco gives an up-to-date picture.<p>There was definitely a period where Datadog seemed to be talking up OTel but not actually doing anything of note to support that ecosystem.<p>I'd say in the last year or two, they've done a bit of a 180 and embraced it quite a lot.<p>One major change is that they not only added support for the W3C Trace Context header format but actually set it as the default over their own proprietary format.<p>The reason that's a pretty big deal is that W3C Trace Context is a MUST for OTel clients to implement, so it goes a long way to making interoperability (and migration) pretty painless.<p>Prior to that, you could use OTel, but the actual distributed aspect (spans between services linking up) probably wouldn't work, as OTel services wouldn't recognise the Datadog header format and vice versa.<p>There are, of course, still some features that you would miss out on by using OTel over the Datadog SDKs; for example, I don't believe the profiler would necessarily work, but that's a tradeoff to be made.
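To make the interop mechanics concrete, here is a minimal sketch of W3C Trace Context propagation using the OpenTelemetry Python SDK (the opentelemetry-api/opentelemetry-sdk packages); the Datadog env var mentioned in the comments below is quoted from memory, so treat it as an assumption:

```python
# Minimal sketch: W3C Trace Context propagation with the OpenTelemetry
# Python SDK (pip install opentelemetry-sdk).
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.trace.propagation.tracecontext import (
    TraceContextTextMapPropagator,
)

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer("interop-demo")
propagator = TraceContextTextMapPropagator()

# Service A: inject the current span context as a W3C `traceparent`
# header on an outbound request.
with tracer.start_as_current_span("outbound-call"):
    headers = {}
    propagator.inject(headers)
    print(headers)  # e.g. {'traceparent': '00-<trace-id>-<span-id>-01'}

# Service B: extract the context from the incoming headers so its span
# joins the same trace. A Datadog-instrumented service configured to
# speak W3C Trace Context (reportedly DD_TRACE_PROPAGATION_STYLE=tracecontext
# -- an assumption, check the dd-trace docs) would link up the same way.
ctx = propagator.extract(headers)
with tracer.start_as_current_span("inbound-handler", context=ctx):
    pass
```

Because both sides agree on the `traceparent` header, it doesn't matter whether the other end is an OTel SDK or a Datadog SDK running in W3C mode.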
In the last decade, at every company I've worked for, the biggest issue by far was vendor lock-in and how teams coped with tools that did not evolve with their problems over time, while the cost of switching stayed high.<p>I know that OpenTelemetry has its own issues, but between low ergonomics and amenities on one hand and lock-in on the other, nowadays I will choose the first.
This post speaks to a larger issue: cloud vendors are driven to extract as much money from you, the customer, as possible. They are not evil or malicious; they are commercial enterprises, and cloud is built on consumption economics. There is no incentive to make it easy for you to move to another cloud. Replace 'observability space' with SIEM/SOAR, first-party databases (Spanner/Cosmos DB), or many PaaS offerings, and the themes still apply. Pushing proprietary solutions on you is an effective means of making customers stick. I am not passing judgement here; there is some value to turnkey solutions, but it depends on your business. Datadog in particular is a bit insidious, as they have a multi-cloud proprietary service that can follow your workloads across clouds (even so far as to be essentially first party on Azure via Azure Native ISV Services).
>1. You start with Datadog [...]
2. [...] your expenses have skyrocketed
3. You want to migrate your way out[...]<p>Step 3 doesn't seem logical here... Who doesn't try to control spend on something they've already invested in?<p>Step 3 should be to audit what you are spending the most on, and work out how to manage that in a way that's still useful.<p>I've seen so many people not understand things like custom metrics billing or log/trace retention and get burned by it.<p>If you're using a tool like Datadog, you should understand its billing structure a bit. I couldn't imagine setting up a Redshift instance without understanding and tracking Redshift spending, and then, once it got expensive, just switching to RDS or something without even taking a look.
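For what it's worth, a rough sketch of that kind of audit against Datadog's usage metering API; the endpoint path and response fields here are from memory, so verify them against the current API docs before relying on this:

```python
# Rough sketch of a custom-metrics spend audit via Datadog's usage
# metering API. Endpoint path and response fields are from memory --
# an assumption to verify against the current API docs.
import os
import requests

headers = {
    "DD-API-KEY": os.environ["DD_API_KEY"],
    "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
}

# Which custom metrics dominate the custom-metrics line item?
resp = requests.get(
    "https://api.datadoghq.com/api/v1/usage/top_avg_metrics",
    headers=headers,
    params={"month": "2024-01-01"},
)
resp.raise_for_status()
for m in resp.json().get("usage", []):
    print(m.get("metric_name"), m.get("avg_metric_hour"))
```

Running something like this monthly tells you which metrics to trim or re-tag long before the bill forces the conversation.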
The vendor lock-in I've seen isn't on the collection side; it's on the display and interpretation side, specifically the dashboards. Vendors offering tools to really slice and dice the collected telemetry and display it in visualizations that wouldn't make Edward Tufte pull his hair out can make accepting vendor lock-in a serious option.
The article seems to suggest <a href="https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/5836">https://github.com/open-telemetry/opentelemetry-collector-co...</a> was silently killed, yet it appears to have been merged in January. Am I missing something?
FULLY BIASED COMMENT:<p>We (the bacalhau.org[0] project) are interested in helping with this - one of our philosophies has been that part of the problem is with that first step. By first moving everything into a lake of some kind, you end up giving up lots of optionality. Even basic things like aggregation, filtering, windowing, etc. now need to live in the "locked-in" tool, which is exactly the wrong first step to take.<p>SHOW HN: We have a solution that uses DuckDB to do some of this initial work[1], which can save you 70% or more on total data throughput. Further, it allows you to do interesting things like eventing, multi-homing observability data, etc. A sketch of the idea follows below.<p>I'd be very interested to hear any/all thoughts!<p>[0] <a href="https://github.com/bacalhau-project/bacalhau">https://github.com/bacalhau-project/bacalhau</a><p>[1] <a href="https://blog.bacalhau.org/p/bacalhau-x-duckdb-deploying-applications" rel="nofollow noreferrer">https://blog.bacalhau.org/p/bacalhau-x-duckdb-deploying-appl...</a><p>Disclosure: I co-founded the Bacalhau project.
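To illustrate the kind of reduction meant here, a minimal standalone DuckDB sketch (file names and log schema are hypothetical, and this is not the actual Bacalhau pipeline code):

```python
# Minimal sketch of node-local reduction with DuckDB
# (pip install duckdb). File names and schema are hypothetical.
import duckdb

con = duckdb.connect()

# Drop debug noise and roll up to per-service, per-minute counts
# before anything leaves the node -- often a large cut in the total
# volume of data shipped to the lake or vendor.
con.sql("""
    COPY (
        SELECT
            service,
            date_trunc('minute', CAST(ts AS TIMESTAMP)) AS minute,
            count(*)                                    AS events,
            count(*) FILTER (WHERE level = 'ERROR')     AS errors
        FROM read_json_auto('raw_logs.jsonl')
        WHERE level <> 'DEBUG'
        GROUP BY 1, 2
    ) TO 'rollup.parquet' (FORMAT parquet)
""")
```

The design point is that the rollup, not the raw firehose, is what gets multi-homed downstream, so no single destination has to be the system of record for the raw data.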
I've bounced around Splunk, New Relic, Sentry and Datadog over the years. Most recently, I was working with Java and used the open-source, vendor-neutral application observability facade Micrometer[1] to test out and confirm which APM we wanted to go with.<p>[1] <a href="https://micrometer.io" rel="nofollow noreferrer">https://micrometer.io</a>
Does Keep have an open core business model, where they host your stuff using a proprietary control plane for a fee and introduce proprietary features around the edges?