As the title says, what techniques do you use to find bottlenecks in streaming, high-throughput applications? What are general techniques for resolving the bottlenecks (fine-grained locking, flamegraphs, efficient data structures)?
You're probably familiar with most of the tools; I've rarely seen a stack that didn't just add Prometheus endpoints when it wanted better visibility. As for resolving these problems, it's good to start with the basics. First check whether something is actually going wrong; if not, solve it through software (optimizing, or switching to a faster alternative) or hardware (allocating larger machines or scaling out to more of them).<p>Unless you're more specific, it's really hard to offer concise advice here.
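To make the "better visibility" part concrete, here is a rough sketch of per-stage timing in a streaming loop, using only the Python standard library. The stage names (`decode`, `enrich`), the `StageTimer` class, and the 2 ms sleep standing in for slow work are all made up for illustration; in a real service the same totals would more likely be exported as Prometheus counters or histograms than kept in a dict.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time and item counts per pipeline stage."""

    def __init__(self):
        self.seconds = defaultdict(float)
        self.items = defaultdict(int)

    def run(self, stage, func, item):
        # Time a single call and charge it to the named stage.
        start = time.perf_counter()
        result = func(item)
        self.seconds[stage] += time.perf_counter() - start
        self.items[stage] += 1
        return result

    def slowest(self):
        # Stage with the largest accumulated wall-clock time.
        return max(self.seconds, key=self.seconds.get)


def decode(raw):
    # Hypothetical cheap stage: split a CSV-like line into fields.
    return raw.strip().split(",")


def enrich(fields):
    # Hypothetical expensive stage; the sleep stands in for slow work.
    time.sleep(0.002)
    return fields + ["ok"]


def process(lines):
    timer = StageTimer()
    out = []
    for line in lines:
        fields = timer.run("decode", decode, line)
        out.append(timer.run("enrich", enrich, fields))
    return out, timer


if __name__ == "__main__":
    _, timer = process(["a,b\n"] * 50)
    for stage, total in timer.seconds.items():
        print(f"{stage}: {total:.4f}s over {timer.items[stage]} items")
    print("slowest stage:", timer.slowest())
```

Once one stage clearly dominates, that is the place to try the software or hardware fixes mentioned above.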
The language and operating system in use will have a big influence on this. What are they?<p>E.g., for C programs I use sampling-based profiling (which the Mac makes super easy with the “Time Profiler” Instrument) to find the bottleneck(s) and then drill down into the associated code sections. Different languages and OSs have different tools for identifying bottlenecks.<p>If performance is network-bound, that’s a very different set of considerations (with which I have no experience).
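For anyone unsure what "sampling-based" means in practice, here is a toy sketch of the idea in Python rather than C. It is not how Time Profiler works internally, just the same principle: a background thread periodically looks at which function the main thread is in and counts the hits. `StackSampler`, `parse`, and `checksum` are hypothetical names, and `sys._current_frames()` is a CPython-specific call, so treat this as an assumption-laden illustration rather than a real profiler.

```python
import collections
import sys
import threading
import time


class StackSampler:
    """Periodically records which function a target thread is executing."""

    def __init__(self, thread_id, interval=0.005):
        self.thread_id = thread_id
        self.interval = interval
        self.counts = collections.Counter()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            # CPython-specific: maps thread id -> that thread's current frame.
            frame = sys._current_frames().get(self.thread_id)
            if frame is not None:
                self.counts[frame.f_code.co_name] += 1
            time.sleep(self.interval)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()


def parse(n):
    # Hypothetical cheap stage.
    return [str(i) for i in range(n)]


def checksum(items):
    # Hypothetical hot spot: a nested loop over every character.
    total = 0
    for item in items:
        for ch in item:
            total = (total * 31 + ord(ch)) % 1_000_003
    return total


def profile_workload(rounds=20):
    # Sample the calling thread while it runs the workload.
    with StackSampler(threading.get_ident()) as sampler:
        for _ in range(rounds):
            checksum(parse(20_000))
    return sampler.counts


if __name__ == "__main__":
    for name, hits in profile_workload().most_common(5):
        print(f"{hits:5d}  {name}")
```

The function with the most samples is where the time goes, and that is the one to drill into. Real tools record the whole call stack per sample, which is what flamegraphs are built from.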