I've been playing with ELK (Elasticsearch, Logstash, Kibana) stacks for the past couple of months, and that has given me great insight into what my product is doing in a clustered state in real time. That, coupled with a few monitoring scripts, has finally gotten me past feeling helpless. A few people have said I should check out Grafana, and I've seen several paid services in this area too (Logentries, for example).<p>I also use chatbots to notify me when things are going wrong or, say, when a customer signs up.<p>How do you keep track of your products, keep monitoring costs low, and know what's going on?
I've gotten really into Slack recently.<p>I just log everything ye olde fashioned way (text files) and then post anything important/noteworthy to a couple of Slack channels.<p>It's really easy to build custom integrations, and Slack is running on all of our devices, so as long as we are all subscribed to the relevant channels I can target <i>@channel</i> in my integration and we know straight away if something bad is happening.<p>I've used the ELK stack on previous projects; it works really well and isn't that difficult to use (there are some great guides on DigitalOcean), and when I get the time I'm going to set it all up for my current project too.<p>The only thing is you have to host it somewhere, so it's more a matter of justifying the budget for that.
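A minimal sketch of the kind of integration described above, assuming a Slack incoming webhook. The webhook URL, the `NOTEWORTHY` markers, and all function names here are hypothetical placeholders rather than anything from the original setup; `<!channel>` is how Slack message text expresses an @channel mention.

```python
import json
import urllib.request

# Hypothetical placeholder: a real incoming-webhook URL comes from your Slack workspace.
WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"

# Assumed convention: log lines containing one of these markers are worth posting.
NOTEWORTHY = ("ERROR", "CRITICAL")


def is_noteworthy(line):
    """Return True if the log line contains one of the assumed markers."""
    return any(marker in line for marker in NOTEWORTHY)


def build_payload(line, notify_channel=False):
    """Build the JSON body for an incoming webhook, optionally pinging @channel."""
    prefix = "<!channel> " if notify_channel else ""
    return {"text": prefix + line.strip()}


def scan_log(lines):
    """Turn noteworthy log lines into payloads; only CRITICAL lines ping @channel."""
    return [
        build_payload(line, notify_channel="CRITICAL" in line)
        for line in lines
        if is_noteworthy(line)
    ]


def post_to_slack(payload, url=WEBHOOK_URL):
    """POST one payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

In practice you would read lines from the text log (for example with `open(path)`), pass them to `scan_log`, and call `post_to_slack` on each resulting payload. Keeping payload construction separate from the network call makes the filtering easy to test without touching Slack.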
ElastAlert + Kibana + Slack is great if you use Logstash/streamstash to aggregate logs internally. I also combine New Relic and Slack for incident management.
For log aggregation I use Elasticsearch + Logstash + Kibana. For statistics I use InfluxDB + Telegraf + Grafana. For service alerts I use Consul + Consul Alerts. Notifications get sent to PagerDuty and/or Slack from one of Kibana, Grafana, or Consul.<p>The Grafana graphs are especially fun to watch, and Kibana has a nice Logentries-esque plugin for searching and tailing logs.<p>It's quite the system to set up, but once it's running, everything else (i.e. the rest of your app) is much more manageable.