Multiple times in my career I've joined a new job and had to build a kind of bootstrapped OSS-based observability platform (typically a mixture of Prometheus and Grafana), which can come with more overhead than you'd think and doesn't usually provide much of the analytical stuff without stitching a bunch of things together. Datadog gives you everything you could possibly want out of the box, but as others have stated, bills can balloon. For instance, if you run spot instances or have a lot of hosts coming up and down, I believe it bills you for each one even if it's short-lived.<p>I've been working with an enterprise license for a year, and while I don't really hear much about cost, some simple considerations in the design of the infrastructure it supports seem to have prevented a ballooning bill (so far).<p>So for me, without the engineering time or buy-in, building a whole homegrown observability platform from OSS tools like this (and all the quirks that can come with them) ends up being a lot more expensive than just sucking it up and buying an enterprise plan. At least so far.<p>If I had the option to do it from scratch how I wanted, with no time or budget constraints, I'd of course prefer not to be beholden to a major SaaS company that charges by hard-to-predict units like "per host", because it's quite easy for these services to bury themselves so deep into your infrastructure that you just bite the bullet on whatever inevitable rug pull or price increase comes next. That has happened to me before while managing enterprise HashiCorp Vault.
I always liked Datadog as a product but it's also true that it is simply way too expensive if you don't spend significant time cost optimizing.
But hosting it myself doesn't really seem like a great solution; I'd rather invest time in making my app robust than in making my monitoring stable.
Is that all Datadog is?<p>I read the horror stories, the monthly bills of tens of thousands for one server, and just assumed there was something more substantial to the product, like they did something groundbreaking or novel. I never cared enough to actually look and see what they did.<p>I use uptime-kuma - <a href="https://github.com/louislam/uptime-kuma">https://github.com/louislam/uptime-kuma</a> - it obviously does a fraction of what these other things do, but it does everything I need.
What does it use for integrations/workflows? The frontend seems to be theirs, but the backend doesn't seem to be in the repos. I've seen more solutions like this boasting ‘5000+’ integrations, but I cannot find the code for that (I might have missed it).
Question for those in the observability space: do moment-in-time observations preserve all of the dimensions of the event, and if so, how do most observability platforms compress the high volume of (ostensibly) low-rank data?
I may be cynical here, but I find that all open source Datadog alternatives are mostly frontend-focused with an out-of-the-box database, and that setup does not scale well. It's not easy to maintain, scale, shard, etc. Am I wrong?<p>P.S. I am all for OSS.
Hey! If you’re looking for something open-source-friendly with really straightforward costs, check out Coralogix.com.<p>Great features for logs, metrics & traces, total compatibility with OpenTelemetry, cost optimization tools built in (Datadog leavers typically save around 50%), and much more!<p>Check out our site, and you can find me on LinkedIn (or indeed reply here!) if you want to ask further questions.<p><a href="https://www.coralogix.com" rel="nofollow">https://www.coralogix.com</a>
We've been pretty happy with just a ClickHouse DB and sending metrics directly from our API servers to the ClickHouse HTTP interface <a href="https://clickhouse.com/docs/en/interfaces/http" rel="nofollow">https://clickhouse.com/docs/en/interfaces/http</a> . Hook up Grafana and you have a nice raw-SQL (our team loves SQL) Grafana dashboard.
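For anyone curious what "sending metrics directly to the ClickHouse HTTP interface" might look like, here is a minimal sketch. It is not the commenter's actual setup: the table name `api_metrics`, its columns, and the `build_insert_request` helper are hypothetical, and it assumes a ClickHouse server on the default HTTP port 8123. It only builds the request; the send is left as a comment.

```python
import json
import urllib.parse
import urllib.request


def build_insert_request(base_url, table, rows):
    """Build (but do not send) an HTTP POST that inserts rows into ClickHouse."""
    # The INSERT statement goes in the `query` URL parameter; the row data
    # goes in the request body, one JSON object per line (JSONEachRow).
    query = f"INSERT INTO {table} FORMAT JSONEachRow"
    params = urllib.parse.urlencode({"query": query}, quote_via=urllib.parse.quote)
    body = "\n".join(json.dumps(row) for row in rows).encode("utf-8")
    return urllib.request.Request(f"{base_url}/?{params}", data=body, method="POST")


# Hypothetical table and columns, for illustration only.
rows = [
    {"ts": "2024-01-01 00:00:00", "route": "/checkout", "latency_ms": 42.5},
    {"ts": "2024-01-01 00:00:01", "route": "/login", "latency_ms": 17.0},
]
req = build_insert_request("http://localhost:8123", "api_metrics", rows)
# To actually send it: urllib.request.urlopen(req, timeout=2)
```

In practice you would probably batch rows in memory and flush every few seconds rather than issue one request per metric, since ClickHouse generally prefers fewer, larger inserts.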
I can see some commits mentioning telemetry but it's not at all mentioned on the GitHub README. Strange.<p>It looks solid and I'd try it if the need arises.
Do you not offer Arm images for the various services? For folks wanting to run this stack, I'd imagine some of them are interested in running on Arm for better cost optimization. Maybe it doesn't matter for how lightweight the services may be.
If using the Helm chart to install, does it also automatically monitor the cluster that OneUptime is installed on? I didn't see any Kubernetes integration docs.
<a href="https://oneuptime.com/" rel="nofollow">https://oneuptime.com/</a> also offers it as a managed service to compete with Datadog.
A lot of interesting OSS observability products have come out in recent years. One of the more impressive (and curious, for many reasons) IMHO is OpenObserve: <a href="https://github.com/openobserve/openobserve">https://github.com/openobserve/openobserve</a> .<p>As opposed to just assembling a stack, they are implementing just about the whole backend shebang from scratch.