Migrating to OpenTelemetry

257 points by kkoppenhaver, over 1 year ago

13 comments

CSMastermind, over 1 year ago

> The data collected from these streams is sent to several vendors including Datadog (for application logs and metrics), Honeycomb (for traces), and Google Cloud Logging (for infrastructure logs).

It sounds like they were in a place that a lot of companies are in, where they don't have a single pane of glass for observability. One of the main benefits, if not the main benefit, I've gotten out of Datadog is having everything in Datadog so that it's all connected and I can easily jump from a trace to logs, for instance.

One of the terrible mistakes I see companies make with this tooling is fragmenting like this. Everyone has their own personal preference for tooling, and ultimately the collective experience is significantly worse than the sum of its parts.

tapoxi, over 1 year ago

I made this switch very recently. For our Java apps it was as simple as loading the otel agent in place of the Datadog SDK, basically "-javaagent:/opt/otel/opentelemetry-javaagent.jar" in our args.

The collector (which processes and ships metrics) can be installed in K8S through Helm or an operator, and we just added a variable to our charts so the agent can be pointed at the collector. The collector speaks OTLP, which is the fancy combined metrics/traces/logs protocol the OTEL SDKs/agents use, but it also speaks Prometheus, Zipkin, etc. to give you an easy migration path. We currently ship to Datadog as well as an internal service, with the end goal being migrating off of Datadog gradually.

MajimasEyepatch, over 1 year ago

It's interesting that you're using both Honeycomb and Datadog. With everything migrated to OTel, would there be advantages to consolidating on just Honeycomb (or Datadog)? Have you found they're useful for different things, or is there enough overlap that you could use just one or the other?

Jedd, over 1 year ago

The killer feature of OpenTelemetry for us is brokering (with ETL).

Partly this lets us easily re-route & duplicate telemetry, partly it means changes to backend products in the future won't be a big disruption.

For metrics we're a mostly telegraf->prometheus->grafana mimir shop: telegraf because it's rock solid and feature-rich, prometheus because there's no real competition in that tier, and mimir because of scale & self-host options.

Our scale problem means most online pricing calculators generate overflow errors.

Our non-security log destination preference is Loki, for similar reasons to Mimir, though a SIEM it definitely is not.

Tracing goes to a vendor, but we're looking to bring that back to grafana Tempo. Product maturity is a long way off commercial APM offerings, but it feels like the feature-set is about 70% there and converging rapidly. Off-the-shelf tracing products have an appealingly low cost of entry, which only briefly defers lock-in & pricing shocks.

nevon, over 1 year ago

I would love to save a few hundred thousand a year by running the OTel collector instead of Datadog agents, just on the cost-per-host alone. Unfortunately that would also mean giving up Datadog APM and NPM, as far as I can tell, which have been really valuable. Going back to just metrics and traces would feel like quite the step backwards and be a hard sell.

nullify88, over 1 year ago

One thing that's slightly off-putting about OpenTelemetry is how resource attributes don't get included as prometheus labels for metrics. Instead they are on an info metric, which requires a join to enrich the metric you are interested in.

Luckily the prometheus exporters have a switch to enable this behaviour, but there's talk of removing this functionality because it breaks the spec. If you were to use the OpenTelemetry protocol into something like Mimir, you don't have the option of enabling that behaviour unless you use prometheus remote write.

Our developers aren't fans of that.

https://opentelemetry.io/docs/specs/otel/compatibility/prometheus_and_openmetrics/#resource-attributes-1
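
To make the join described above concrete, here is a minimal sketch in plain Python. It only imitates the label matching that a PromQL query would perform when multiplying a metric by the info metric with `on (...)` and `group_left (...)`. The series names, label values, and the `join_with_info` helper are all hypothetical and chosen for illustration; it assumes the info series shares the `job` and `instance` labels with the metric.

```python
from typing import Dict, List, Sequence, Tuple

Labels = Dict[str, str]
Sample = Tuple[Labels, float]


def join_with_info(
    metric: List[Sample],
    info: List[Sample],
    on: Sequence[str],
    copy: Sequence[str],
) -> List[Sample]:
    """Copy selected labels from the matching info series onto each metric series."""
    # Index the info series by the labels used for matching.
    index = {tuple(labels.get(k) for k in on): labels for labels, _ in info}
    enriched = []
    for labels, value in metric:
        match = index.get(tuple(labels.get(k) for k in on))
        if match is None:
            continue  # series without a matching info series drop out of the result
        extra = {k: match[k] for k in copy if k in match}
        enriched.append(({**labels, **extra}, value))
    return enriched


# Hypothetical sample data: one request counter and its info series.
requests = [
    ({"job": "checkout", "instance": "10.0.0.1:8080", "code": "200"}, 42.0),
]
target_info = [
    (
        {
            "job": "checkout",
            "instance": "10.0.0.1:8080",
            "k8s_cluster_name": "prod-eu",
            "service_version": "1.4.2",
        },
        1.0,
    ),
]

enriched = join_with_info(
    requests, target_info, on=("job", "instance"), copy=("k8s_cluster_name",)
)
print(enriched)
```

The point of the sketch is the extra step: the resource attribute only appears on the request series after the lookup against the info series, which is the overhead the comment objects to.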

roskilli, over 1 year ago

> Moreover, we encountered some rough edges in the metrics-related functionality of the Go SDK referenced above. Ultimately, we had to write a conversion layer on top of the OTel metrics API that allowed for simple, Prometheus-like counters, gauges, and histograms.

I have encountered this a lot from teams attempting to use the metrics SDK.

Are you open to commenting on specifics here, and also on what kind of shim you had to put in front of the SDK? It would be great to keep getting feedback so that we as a community have a good idea of what remains before it's possible to use the SDK for real-world production use cases in anger. Just wiring up the setup in your app used to be fairly painful, but that has gotten somewhat better over the last 12-24 months. I'd also love to hear what is currently causing compatibility issues with the metric types themselves when using the SDK, which requires a shim, and what the shim is doing to achieve compatibility.

caust1c, over 1 year ago

Curious about the code implemented for logs! Hopefully that's something that can be shared at some point. Also curious if it integrates with `log/slog` :-)

Congrats too! As I understand it from stories I've heard from others, migrating to OTel is no easy undertaking.

throwaway084t95, over 1 year ago

What is the "first principles" argument that observability decomposes into logs, metrics, and tracing? I see this dogma accepted everywhere, but I'm inquisitive about it.

tsamba, over 1 year ago

Interesting read. What did you find easier about using GCP's log tooling for your internal system logs, rather than the OTel collector?

shoelessone, over 1 year ago

I really, really want to use OTel for a small project but have always had a really tough time finding a path that is cheap or free for a personal project.

In theory you can send telemetry data with OTel to CloudWatch, but I've struggled to connect the dots with the front-end application (e.g. React/Next.js).

jon-wood, over 1 year ago

At the risk of being downvoted (probably justly) for having a moan, can we please have a moratorium on every blog post needing to have a generally irrelevant picture attached to it? On opening this page I can see 28 words that are actually relevant, because almost the entire view is consumed by a huge picture of a graph and the padding around it.

This is endemic now. It doesn't matter what someone is writing about, there'll be some pointless stock photo taking up half the page. There'll probably be some more throughout the page. Stop it, please.

k__, over 1 year ago

I had the impression that logs and metrics are a pre-observability thing.