This is a software design question: I have different components which do similar operations (similar, not identical) and I want to normalize the logs that each component emits so I can make diagnosing issues easier. What approach should I take?<p>One thing I have thought about is using an error code for each known fault and then emitting a message based on that code (think M$ error codes): the same message parameters for all components, but with different values.<p>I'm more interested in hearing how this has been addressed by HNers who have had to build this functionality. If there are references, please share.
For inspiration, look at how the syslog API works. You want to make each log message a 3-tuple (timestamp, sender, message-text) and log that somewhere, be it a CSV file, a database table, stderr or whatever. If you are thinking about complicating it more, e.g. by introducing additional fields to the 3-tuple, then remember that the logs don't have to be 100% perfect. They only need to be consistent enough for the user who is grepping them.<p>Logging error codes is not a good idea since they add a layer of indirection. The text "file not found" is always clearer than "error code: 412".<p>Another thing to keep in mind is to make your log messages distinct. If a user reports the error "Connection reset by peer" and there are 28 places in your code which can cause that message, then that is harder to debug than if the error can only be logged from exactly one location.
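A minimal sketch of the 3-tuple idea, in Python. The function name `log_event`, the sender names and the message texts are made up for illustration, and an in-memory buffer stands in for the CSV file:

```python
import csv
import io
from datetime import datetime, timezone


def log_event(writer, sender, message, now=None):
    """Write one (timestamp, sender, message-text) row to a csv writer."""
    ts = (now or datetime.now(timezone.utc)).isoformat()
    writer.writerow([ts, sender, message])


# An in-memory buffer stands in for a real log file here.
buf = io.StringIO()
writer = csv.writer(buf)

# Each message names the operation that failed, so a report can be traced
# back to a single call site instead of a generic error string.
log_event(writer, "uploader", "upload failed: file not found: /tmp/a.txt")
log_event(writer, "fetcher", "index fetch failed: connection reset by peer")
```

Because every row has the same three columns, the output stays greppable no matter which component wrote it.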
If it were my problem, I'd probably start by normalizing the log files <i>after</i> they were produced. Which is to say, I'd start by writing a program to normalize the log files rather than modifying the programs producing them.<p>The reason I'd approach it that way is that it would decouple the process of designing a new log format from the process of refactoring all the code generating the logs. I'd get to the point where I knew what was worth doing and could come up with a plan for doing it before touching any of the components. Then I could implement changes as it made sense rather than all at once.<p>Good luck.
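A rough sketch of what such a normalizer could look like, in Python. The component names (`billing`, `auth`) and their line formats are invented for the example; real components would each need their own pattern:

```python
import re

# Hypothetical per-component line formats; real ones will differ.
PATTERNS = {
    "billing": re.compile(r"^\[(?P<ts>[^\]]+)\] (?P<msg>.*)$"),
    "auth": re.compile(r"^(?P<ts>\S+) auth: (?P<msg>.*)$"),
}


def normalize(component, line):
    """Return (timestamp, sender, message), or None if the line does not match."""
    m = PATTERNS[component].match(line.rstrip("\n"))
    if m is None:
        return None
    return (m.group("ts"), component, m.group("msg"))


def normalize_file(component, lines):
    """Normalize an iterable of raw lines, skipping the ones that do not parse."""
    return [t for t in (normalize(component, ln) for ln in lines) if t is not None]
```

The lines that come back as `None` are useful in their own right: they show which messages the new format does not cover yet, before any component is touched.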
We log every request by simply shipping JSON to a hosted Elasticsearch cluster, which comes with Kibana to search and visualize all requests. We can log whatever we want, make a graph of whatever we want, and find whatever we want. We have full control yet don't worry about managing anything. We just use it when debugging errors, and to alert us to errors.
Don't use error codes. Use structured messages that are easy to serialize
to JSON. Then you can just put in any relevant fields and be done with
it.
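One way this could look with Python's standard `logging` module. The `JsonFormatter` class, the `fields` key and the sample event are assumptions made for the sketch, not an established convention:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "sender": record.name,
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # "fields" is a hypothetical key passed through `extra=`; it carries
        # whatever structured context the call site wants to attach.
        # Note: a field named "ts", "sender", "level" or "event" would
        # overwrite the corresponding default above.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


logger = logging.getLogger("uploader")
handler = logging.StreamHandler()  # writes to stderr by default
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)

logger.error("file_not_found", extra={"fields": {"path": "/tmp/a.txt", "attempt": 2}})
```

Each component can attach different fields to the same kind of event, and the result is still one JSON object per line that any JSON-aware tool can filter.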