科技回声

8 comments

cperciva, almost 8 years ago

When looking at code performance, it's important to remember that conditional branches are almost always cheap inside microbenchmarks, because the CPU can figure out when the same branches get taken on every loop... but far more expensive in the real world. A similar issue applies to cache: Your code might fit inside the L1 cache in your benchmarks, but when it's used in the real world you get cache misses since the rest of the program accesses data too.

Veedrac, almost 8 years ago

> Then, a bit later, we need to very quickly finish populating the structs.

I am finding it extremely hard to envision a circumstance where this is a bottleneck for anything. Care to clarify the context?

I also find the struct layout really odd; why not just move c?

Your benchmarks are also probably broken; branch predictors use global state so will almost certainly predict fine the way you've used things. You need to repopulate a significantly-sized array each time with randomly chosen values. You can't use the same array because it'll be learnt, and you can't use a short array because it'll be predicted globally.

BenjiWiebe, almost 8 years ago

Quick upvote for a well-designed, mobile-friendly site. The last few HN posts I read were awful on mobile.

JoachimSchipper, almost 8 years ago

This is not so much "outsmarting the compiler" as "working with the compiler" - tweaking code to trigger certain optimizations/generate particular output. The missing part, of course, is showing that these optimizations actually help...

Dreami, almost 8 years ago

How do this wrapper and each variant connect together? I supposed he would include a pointer to the actual struct in the wrapper, but I just see payload (which I'm going to assume needs to be written to that array, with padding, in the actual struct).

For me, a wrapper includes the original thing and just wraps stuff around it. How does this work here? May also be a question for the author, I suppose...

bloaf, almost 8 years ago

So why wouldn't you just put c before the arr[padding#] chars?

dmh2000, almost 8 years ago

gcc warning: ISO C++ forbids zero-size array ‘payload’ [-Wpedantic]

clang warning: flexible array members are a C99 feature [-Wc99-extensions]

Is this still the case?

taneq, almost 8 years ago

Outsmarting the compiler: A short story about optimisation:

"Don't."

The end.

(Basically, as the story shows, with this kind of micro-optimization you may or may not beat the compiler but you're almost certainly wasting your time compared with more effective optimization methods, like rethinking the problem.)

Outsmarting the compiler: Short story about optimization
