Dev/Sec/Ops here. We're a small MSP running on GCP and AWS across multiple regions, using AWS CloudWatch and CloudTrail logs plus Stackdriver for logging, alerting, errors, etc. We're finding this logging infrastructure of limited use. Wondering if it's just me!<p>As DevOps, I don't find the Stackdriver logging UI very useful to scroll through. For one thing, it is really slow. For another, I find the service names and acronyms very annoying. Not just in Stackdriver, but in AWS too.<p>- Other than debugging, what purpose does logging serve?<p>- What tools do you use to slice and dice logging data to make sense of it?<p>- How do you extract the actual errors when alerting?
Metrics can only get you so far. Logging is the primary way to put what was happening at a given time into context and to correlate it with other events. A metric is always just a point in time on a single system. Metrics slice the cake vertically; if you want a horizontal view across multiple systems, you need to integrate your logs.<p>Only by unifying your logs into your monitoring system and actively aggregating that information can you start to get a complete picture of what is going on.<p>New Relic has been doing some good work in this space, and integrating logging information into their system is their single biggest area of growth right now.
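To make the "horizontal view" idea a bit more concrete, here is a rough Python sketch of what unifying logs buys you: merge per-system log streams into one timeline, then pull everything that happened around the moment a metric alert fired. The log entries, system names, and the helpers `unify` and `around` are all made up for illustration; a real setup would pull from CloudWatch, Stackdriver, or whatever aggregator you use rather than in-memory lists.

```python
import heapq
from datetime import datetime, timedelta

# Hypothetical per-system log streams, each already sorted by time.
# Entries are (ISO timestamp, system name, message); all values are made up.
web_logs = [
    ("2020-01-01T12:00:01", "web", "GET /checkout 200"),
    ("2020-01-01T12:00:05", "web", "GET /checkout 502"),
]
db_logs = [
    ("2020-01-01T12:00:04", "db", "ERROR connection pool exhausted"),
    ("2020-01-01T12:00:09", "db", "connection pool recovered"),
]

def unify(*streams):
    """Merge time-sorted streams into one timeline ordered by timestamp."""
    return list(heapq.merge(*streams, key=lambda e: e[0]))

def around(timeline, when, seconds):
    """Return entries from every system within `seconds` of `when`."""
    center = datetime.fromisoformat(when)
    window = timedelta(seconds=seconds)
    return [e for e in timeline
            if abs(datetime.fromisoformat(e[0]) - center) <= window]

timeline = unify(web_logs, db_logs)
# Suppose a metric alert (say, an error-rate spike) fired at 12:00:05.
context = around(timeline, "2020-01-01T12:00:05", 2)
for ts, system, message in context:
    print(ts, system, message)
```

The metric alone only tells you the web tier started failing; the merged timeline puts the database error from one second earlier right next to it, which is the kind of cross-system correlation a single system's metrics can't give you.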