
Saving millions on logging: Finding relevant savings

58 points by zek over 2 years ago

5 comments

jeffbee over 2 years ago
It feels like logging is misunderstood. Critical revenue or audit logs need to be centralized, but debug logs don't. Logging debug logs to local storage and deleting them after nobody looks at them (the lifecycle of at least 99.999% of informational log statements) costs almost nothing. Another benefit is that pushing your predicate out to your edge nodes works far better than trying to get acceptable performance from central logging facilities. So I don't understand why people waste so much money on centralized informational logs.
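The split this comment describes can be sketched with Python's standard `logging` module: debug records stay in a size-bounded local file (rotation deletes them, giving the cheap lifecycle), while the level predicate runs at the edge so only WARNING-and-above reaches the "central" sink. The logger name, paths, and the in-memory stand-in for a central collector are all illustrative, not from the article.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

log_dir = tempfile.mkdtemp()
local_path = os.path.join(log_dir, "debug.log")

logger = logging.getLogger("svc")  # hypothetical service logger
logger.setLevel(logging.DEBUG)

# Local-only debug sink: storage is bounded by maxBytes * backupCount,
# and rotation silently discards old files nobody ever read.
local = RotatingFileHandler(local_path, maxBytes=1_000_000, backupCount=3)
local.setLevel(logging.DEBUG)
logger.addHandler(local)

# Stand-in for a central collector: the predicate (level >= WARNING)
# is evaluated on the node, so informational records never leave it.
central_records = []

class CentralHandler(logging.Handler):
    def emit(self, record):
        central_records.append(record.getMessage())

logger.addHandler(CentralHandler(level=logging.WARNING))

logger.debug("cache miss for key=abc")  # written locally only
logger.warning("disk 90% full")         # written locally and shipped
```

After this runs, the local file holds both records while the central list holds only the warning, which is the cost asymmetry the comment is pointing at.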
olliej over 2 years ago
I am lost at how you can have 20% of your storage costs be for logging and not immediately say that, at minimum, you are persisting too many logs, and probably logging too much in the first place.

I get that modern tech companies log every movement and interaction a user has with an app, far beyond any amount that is reasonable, but surely at some point you can go "we probably don't need this".

It shouldn't be a matter of "let's compress the logs"; it should be "are we even using these logs?"
mnkmnk over 2 years ago
Unlike JSON, ORC requires batching rows before writing to disk, because it does a lot of computation: maintaining indexes, encoding columns (run-length, dictionary), calculating statistics, maintaining bloom filters, compressing columns, etc. Doing this at the source, where you are more interested in serving an individual request as quickly as possible, doesn't look like a good idea. If you want the ORC files to be useful, you need to batch a lot of rows together; otherwise you don't get the benefits of columnar storage. So logs in the happy path will be delayed, and in the unhappy path, if the process crashes, recent logs are gone. JSON isn't really a bad logging format, and it can be stored temporarily and then asynchronously converted to a columnar format.

I'm looking forward to the next post.
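Why batching matters for a columnar format can be shown with a toy example (this is not the ORC format itself, just the underlying idea): row-at-a-time JSON lines expose no cross-row redundancy, while transposing a batch into columns lets run-length encoding collapse repeated values. The sample rows and field names are made up.

```python
import json
from itertools import groupby

# A small batch of log rows; real batches would be thousands of rows.
rows = [
    {"level": "INFO", "ms": 12},
    {"level": "INFO", "ms": 15},
    {"level": "INFO", "ms": 11},
    {"level": "ERROR", "ms": 90},
]

# Row-oriented (JSON lines): each record is encoded independently,
# so repeated values across rows cannot be exploited.
jsonl = "\n".join(json.dumps(r) for r in rows)

def rle(values):
    """Run-length encode a column: [(value, run_length), ...]."""
    return [(v, len(list(g))) for v, g in groupby(values)]

# Column-oriented: transpose the batch, then encode each column.
columns = {k: [r[k] for r in rows] for k in rows[0]}
encoded = {k: rle(v) for k, v in columns.items()}
```

With a batch of one row, `rle` degenerates to `[(value, 1)]` for every column, which is the comment's point: the encoding only pays off once many rows are buffered together, at the cost of delayed (and crash-vulnerable) writes.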
orf over 2 years ago
Intercepting network traffic like this is an interesting approach to the problem.

If each service has a unique IAM role, which it definitely should, wouldn't you be able to track this via a combination of CloudTrail and proper resource tags?
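The attribution this comment suggests can be sketched without any packet interception: when every service assumes its own IAM role, each CloudTrail record carries that role's ARN in `userIdentity`, so write events (a proxy for logging traffic) can be counted per service. The event shapes below are simplified and the service names are invented; only the ARN layout follows the real CloudTrail record format.

```python
from collections import Counter

# Simplified CloudTrail-style records (real records have many more fields).
events = [
    {"eventName": "PutObject",
     "userIdentity": {"arn": "arn:aws:sts::123:assumed-role/checkout-svc/i-1"}},
    {"eventName": "PutObject",
     "userIdentity": {"arn": "arn:aws:sts::123:assumed-role/search-svc/i-2"}},
    {"eventName": "PutObject",
     "userIdentity": {"arn": "arn:aws:sts::123:assumed-role/checkout-svc/i-3"}},
]

def role_of(event):
    # arn:aws:sts::123:assumed-role/<role-name>/<session> -> <role-name>
    return event["userIdentity"]["arn"].split("/")[1]

# Attribute S3 write events to the service (role) that issued them.
writes_by_service = Counter(
    role_of(e) for e in events if e["eventName"] == "PutObject"
)
```

Grouping by assumed-role name gives a per-service tally, which combined with resource tags on the buckets would answer "which service is generating this volume" without touching the data path.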
nrivoli over 2 years ago
I would like to know more about the Kubernetes databases: what kind, the challenges, how the fault domains are configured, etc.

Also, it's not clear to me how intercepting calls helped you figure out the offending services.