I have been running Vector in several k8s clusters since July. I'm doing some non-trivial transforms over several different types of sources and sinks.

The config makes this easy, but my favorite part is that the CPU and MEM of the Vector processes barely even register on my metrics charts. I can't even tell you what their actual and requested resources are, because I haven't bothered to look in a while.

It's one thing I never have to worry about. I could use more of those.
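For flavor, a stripped-down sketch of what one of those pipelines looks like, assuming a single source, transform, and sink (component names here are illustrative; the real configs have several of each):

    # vector.toml -- minimal sketch, not a real production config
    [sources.pod_logs]
    type = "kubernetes_logs"            # tail container logs on each node

    [transforms.structured]
    type = "remap"                      # VRL: parse the JSON body into a structured event
    inputs = ["pod_logs"]
    source = '''
    # drops events whose body isn't valid JSON -- fine for a sketch
    . = parse_json!(string!(.message))
    '''

    [sinks.out]
    type = "console"                    # stand-in for the real sinks
    inputs = ["structured"]
    encoding.codec = "json"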
The world of observability seems to be converging on some key ideas. I think Vector's data model, where logs are converted into structured events, is a key idea that wasn't always obvious. I think the remaining big idea is what to do about high-cardinality data. Most solutions pre-aggregate the data, which doesn't tolerate high-cardinality tags. Solutions like Honeycomb and Datadog logs are stream-based and do tolerate high-cardinality tags, but with limitations on what can be done with them. It will be interesting to see whether the streaming-based solutions become the final standard. Vector warns about high-cardinality labels, just like others do; I'm not sure if that is a limitation of Vector or just of the sink.

I think the interesting part is the overlap between observability metrics for operations and expanding into BI metrics with the same tools.
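To make the cardinality point concrete: the warnings typically show up the moment you template an unbounded field into a metric tag, e.g. with a log_to_metric transform. A sketch, where the source and the status/user_id fields are hypothetical:

    # sketch: where high-cardinality labels enter a pipeline
    [sources.app]
    type = "demo_logs"                  # hypothetical stand-in for a real log source
    format = "json"

    [transforms.request_count]
    type = "log_to_metric"
    inputs = ["app"]

    [[transforms.request_count.metrics]]
    type = "counter"
    field = "message"                   # count one per matching event
    name = "requests_total"
    tags.status = "{{status}}"          # a handful of values: fine in any backend
    tags.user_id = "{{user_id}}"        # unbounded values: this is what triggers the warnings

Pre-aggregating backends have to keep one time series per unique tag combination, so an unbounded tag like user_id blows up their index; stream-based backends just store the events and pay the cost at query time.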
Vector still insists that Kafka is for logs only. Vector is a fantastic project, and we have so many use cases for it, but the thing that trips us up is not being able to send metrics via Kafka without first transforming them to logs.

Edit: Awesome, my complaint that you couldn't scrape the Prometheus federation endpoint has been fixed.
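On the metrics-over-Kafka point: the stopgap is exactly that transform step, re-encoding each metric as a log event with metric_to_log in front of the kafka sink. A sketch, with placeholder broker and topic:

    # sketch: the metrics-to-kafka workaround
    [sources.host]
    type = "host_metrics"                     # any metrics source

    [transforms.metrics_as_logs]
    type = "metric_to_log"                    # re-encode each metric as a log event
    inputs = ["host"]

    [sinks.kafka_metrics]
    type = "kafka"                            # accepts log events, hence the transform
    inputs = ["metrics_as_logs"]
    bootstrap_servers = "kafka-broker:9092"   # placeholder
    topic = "vector-metrics"                  # placeholder
    encoding.codec = "json"

It works, but the metric semantics are lost on the consumer side, which is presumably the point of the complaint.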