Hi Hacker News! Shahar and Tal from Keep here.<p>A few months ago, we introduced Keep here on HN (<a href="https://news.ycombinator.com/item?id=34806482">https://news.ycombinator.com/item?id=34806482</a>) as an “open source alerting CLI” and got some interesting feedback - mainly around UI, automation, and supporting more tools. We were VERY early back then, and we understood that although the current DX around creating alerts is not great, it's not that critical and developers don’t need another tool just for that.<p>But we did find something else.<p>While talking to developers and devops, we found that a lot of companies use many tools that generate alerts - from Cloudwatch, Prometheus, Grafana, and Datadog to tools such as Zabbix or Nagios. We definitely agree consolidation in the observability space is a real thing, but while talking to those companies we feel that there are still real use cases for having more than one tool (for example, according to Grafana’s 2023 observability survey, 52% of companies use more than 6 observability tools <a href="https://grafana.com/observability-survey-2023/" rel="nofollow noreferrer">https://grafana.com/observability-survey-2023/</a>).<p>So with that in mind, we rebuilt Keep with a simple mindset: (1) Integrate with every tool that triggers alerts - either the tool pushes alerts to Keep via webhooks or routing policies, or Keep pulls alerts via the tool's API. (2) Create a simple abstraction layer to run workflows on top of these alerts. 
(3) Maintain a great developer experience - open source, API-first, workflows as code and generally having a developer mindset while building Keep.<p>While we were rebuilding Keep, Datadog released their workflow automation tool (<a href="https://docs.datadoghq.com/service_management/workflows/" rel="nofollow noreferrer">https://docs.datadoghq.com/service_management/workflows/</a>), which made us realize that this is exactly what we solve - but for everyone who uses tools other than Datadog.<p>A short demo of Keep with a simple use case: <a href="https://www.youtube.com/watch?v=FPMRCZM8ZYg">https://www.youtube.com/watch?v=FPMRCZM8ZYg</a><p>You can try it yourself by signing into <a href="https://platform.keephq.dev">https://platform.keephq.dev</a><p>As always, we invite you to try Keep, and we are eager to hear any feedback.
I'm looking at this and thinking, "you know what, this could be an awesome personal tool as well".<p>This is definitely outside of the use cases described but I can definitely see myself hooking this up in an IFTTT style to funnel things into my todo systems using the HTTP provider.<p>Will poke around this soon.
IMO the readme docs make it seem confusingly like it's built-in/really well integrated with Actions, because the syntax is so similar. It takes some light digging to find it's actually entirely separate (but similar) and run as `keep run --alerts-file=path`, from GH Actions or anything else at all, because it's a separate file parsed by a third-party program that just so happens to have a similar syntax.<p>Nice tool though, looks useful, added to the list.
Since this is 2023 and we are releasing things that solve X and Y problems in YML, I do want to take the opportunity to question whether solving problems for X or Y in YML is really the thing we should be building businesses around these days. I’ve spent the greater part of the last year or so undoing the pain of “reasonably complex GHA in YML” in my organization. It’s one of those things that sounds great conceptually, and works really well in simple cases, but once your use case evolves beyond even remotely simple (for example, abstracting and maintaining this code in an engineering org in the tens of people, not even hundreds), it is a slow-growing cancer that ends up being a huge time suck, an unmaintainable, untestable mess, and technical debt for your org.
Haha, my team maintains something just like this internally - ours is called “info-radiator”. Great idea for a product. You should add an Amazon referral link to a small Lenovo tablet and some Velcro for developers to have a dedicated Keep monitor!
We have a similar internal tool, and it's also called keep!<p>Besides alerts, it also tracks and displays things such as which MongoDB server is the primary, or which ElasticSearch node is the controller.
Looks really interesting! Does the self-hosted version support OAuth or other authentication methods to manage users through an external identity provider?
Maybe don’t call a reliability tool “GitHub Actions for X”, which doesn’t exactly inspire confidence? Having used relatively high frequency GitHub Actions for alerts, I get more random errors from Actions (Actions down, failed git pull, failed cache pull, etc.) than actual alerts.