I think what the article glosses over is data migrations: what do you do when you need to change the format of your data? If you retain logs in Kafka indefinitely as the source of truth, then migrating materialized data to a new format means you must either 1) support every previous form of the materialized data, so that operations from the log are guaranteed to replay safely against it, or 2) keep a single form of the materialized data and hope you have enough test coverage that some unexpectedly old data doesn't silently corrupt it.<p>Event sourcing is useful, but using it as a source-of-truth data store in itself, instead of as e.g. an occasional journalling mechanism, seems pretty fraught.
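To make option 1 concrete, here is a minimal sketch of "upcasting" old events at replay time, so that only the current format ever reaches the materialized view. Everything in it (the event shapes, the `version` field, the function names) is hypothetical and chosen for illustration; none of it comes from the article or from any Kafka API.

```python
# Hypothetical event shapes: v1 stores a full name, v2 splits it in two.
CURRENT_VERSION = 2


def upcast_v1_to_v2(event):
    """Lift a v1 event to v2 by splitting the name on the first space."""
    first, _, last = event["name"].partition(" ")
    return {"version": 2, "id": event["id"], "first_name": first, "last_name": last}


# One upcaster per historical version; each lifts an event one version forward.
UPCASTERS = {1: upcast_v1_to_v2}


def upcast(event):
    """Apply upcasters in sequence until the event is at the current version."""
    while event["version"] < CURRENT_VERSION:
        event = UPCASTERS[event["version"]](event)
    return event


def replay(log):
    """Rebuild the materialized view (id -> record) from the full log."""
    view = {}
    for raw in log:
        event = upcast(raw)
        view[event["id"]] = {
            "first_name": event["first_name"],
            "last_name": event["last_name"],
        }
    return view


log = [
    {"version": 1, "id": 7, "name": "Ada Lovelace"},
    {"version": 2, "id": 8, "first_name": "Alan", "last_name": "Turing"},
]
view = replay(log)
```

The cost is the one described above: every format that ever hit the log needs an upcaster kept alive forever, and an event with an unknown version fails loudly here (a `KeyError`) rather than corrupting the view.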
Kreps has a gift for writing: this is clear, well organized, and far more fun to read than the topic has any right to be. Hopefully he'll retire after Confluent and finally start writing novels.
IMO using Kafka for long-term storage is not the greatest idea. It is expensive to keep CPU and RAM constantly on top of data that is going to be cold most of the time. There is no DML, which means mistakes are expensive (from an engineering POV). And while the whole event sourcing paradigm can work quite well in narrow domains, with teams fully aware of the implications of what they are doing, in practice, in large orgs, it is hard to scale (from a people perspective).
Well, maybe for non-critical data. Multi-region Kafka clustering is not easy. There are much better and cheaper data storage options that can provide eventual consistency.
This article seems to propose log compaction as the answer to the question of size (i.e. how much historical data has to be kept around, and how much that is going to cost). However, log compaction is not well suited to many use cases: storing partial updates or diffs in the log, storing many trillions of tiny entries (as keys), multiple messages on the log corresponding to related (contingent) updates, and so on.<p>Those problems are tractable but hard to solve. Log compaction is not a silver bullet, and unless you think really hard about how your data changes over time, you may end up storing more of it than you expect if you use the log as an eternal source of truth, compaction or not.
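The partial-update case can be sketched in a few lines. This is a toy in-memory model, not Kafka itself: `compact` only mimics the guarantee that compaction retains at least the latest record per key, and the keys and patch shapes are made up for the example.

```python
def compact(log):
    """Keep only the last record for each key, preserving log order."""
    last_index = {key: i for i, (key, _) in enumerate(log)}
    return [(key, value) for i, (key, value) in enumerate(log) if last_index[key] == i]


def materialize(log):
    """Fold partial updates (dict patches) into per-key state."""
    state = {}
    for key, patch in log:
        state.setdefault(key, {}).update(patch)
    return state


# Each record is a partial update, not the full state for its key.
log = [
    ("user-1", {"email": "a@example.com"}),
    ("user-1", {"plan": "pro"}),
    ("user-2", {"email": "b@example.com"}),
]

full = materialize(log)               # user-1 has both email and plan
compacted = materialize(compact(log))  # user-1 has lost its email
```

Under these assumptions, replaying the compacted log silently drops `user-1`'s email, because the earlier patch was discarded. So you either write the full state on every change (bigger messages) or leave the topic uncompacted (unbounded growth), which is the size trade-off in question.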