科技回声

6 comments

haberman, more than 3 years ago

The visualization tools presented look really nice, but they seem to present program execution as sequential and linear, a model that seems likely to break down at these time scales (tens of cycles).

Modern processors will look hundreds of instructions into the future and try to start executing them as soon as possible. Branches are predicted far in advance of when they can actually be evaluated. Many instructions can be executing simultaneously. A clean, tidy flame graph showing 1-3ns slices (~5 cycles) cannot help but be a vast simplification of what the CPU is really doing.

The linked page about Processor Trace says this:

> instruction data (control flow) is perfectly accurate but timing information is less accurate

The article mentions using magic-trace to detect changes in inlining decisions made by the compiler. This is a case where it will shine, since PT can perfectly capture the control flow, and it doesn't necessarily rely on having perfect timestamps for everything.

temikus, more than 3 years ago

Oh man, this is major. I would’ve loved to have something like that 10 years ago when CPU was a bit more precious. Still very useful today, just not to the same extent.

signa11, more than 3 years ago

tracy (https://github.com/wolfpld/tracy), mentioned in this article as well, is for some reason criminally underused and largely unknown in the wider community.

carlmr, more than 3 years ago

The best I've found so far is gperftools (in contrast to perf, you get good results even with -O3 optimization and heavily inlined code). This seems to be much more accurate, but I'm not sure we can handle that amount of data, because we usually take longer traces.

gnufx, more than 3 years ago

This says post-Skylake, but both my SKX workstation and my i5-6200U laptop have 1 in /sys/bus/event_source/devices/intel_pt/caps/psb_cyc, which seems to be the condition, though I haven't tried to use it.
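For anyone who wants to run the same check on their own machine, here is a minimal sketch in Python. It only reads the sysfs file quoted above; the function name `read_pt_capability` is made up for illustration, and treating a nonzero `psb_cyc` value as "supported" follows the comment's guess about the condition rather than anything confirmed by the magic-trace documentation. The value is parsed as hexadecimal on the assumption that the kernel prints these capability fields in hex, which makes no difference for a plain 0 or 1.

```python
from pathlib import Path
from typing import Optional

# Path quoted in the comment above; present only on Linux with the intel_pt PMU.
PSB_CYC_CAP = Path("/sys/bus/event_source/devices/intel_pt/caps/psb_cyc")


def read_pt_capability(path: Path = PSB_CYC_CAP) -> Optional[bool]:
    """Return True/False for the capability flag, or None if it cannot be read.

    Hypothetical helper: it just reads one sysfs file and reports whether
    the value in it is nonzero.
    """
    try:
        text = path.read_text().strip()
    except OSError:
        # File missing or unreadable, e.g. no Intel PT support or not Linux.
        return None
    try:
        # Assumption: the value is printed in hex; "0" and "1" parse the same.
        return int(text, 16) != 0
    except ValueError:
        return None


if __name__ == "__main__":
    cap = read_pt_capability()
    if cap is None:
        print("intel_pt psb_cyc capability not found")
    else:
        print(f"psb_cyc: {'yes' if cap else 'no'}")
```

A `None` result means the file could not be read at all, which is a different situation from the file being present and containing 0.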

silverlake, more than 3 years ago

Doesn’t VTune support Processor Trace? Some VMs support PT, and AWS has support as well.

Magic-trace: Diagnose tricky performance issues with Intel Processor Trace
