TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

SIMD Matters: Graph Coloring

189 点作者 matt_d10 个月前

13 条评论

mikewarot10 个月前
&gt;This is the magic of graph coloring and enables me to solve multiple contact constraints simultaneously without a race condition.<p>&lt;TANGENT&gt; This hits me, like a ton of bricks, as one of the most elegant ways to describe why I add 2 phases of clocking (&quot;like colors on a chessboard&quot; is the phrase I&#x27;ve been using) to my BitGrid[1] hobby project.<p>I wonder what other classes of problems this could solve. This feels oddly parallel, like a mapping of the Langlands program into computer science.[2]<p>[1] <a href="https:&#x2F;&#x2F;esolangs.org&#x2F;wiki&#x2F;Bitgrid" rel="nofollow">https:&#x2F;&#x2F;esolangs.org&#x2F;wiki&#x2F;Bitgrid</a><p>[2] <a href="https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Langlands_program" rel="nofollow">https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Langlands_program</a>
评论 #41320948 未加载
a1o10 个月前
&gt; The conventional wisdom is that it is difficult to achieve real gains from SIMD.<p>This has been my experience often times I misunderstood how much can be gained by using SIMD, and preparing the data to be &quot;eaten&quot; by SIMD instructions is not trivial. I have many times attempted to use just to profile and see it didn&#x27;t improve things at all and made the code really hard to understand.<p>Kudos for Erin here this is really hard work and it&#x27;s great it paid off well and gave good results here!
评论 #41318881 未加载
mgaunard10 个月前
The main problem of SIMD remains the same: it requires working with structures of arrays rather than arrays of structures, which makes it more difficult to combine and layer objects on top of each other.
评论 #41321052 未加载
评论 #41318478 未加载
评论 #41324119 未加载
评论 #41318725 未加载
评论 #41320529 未加载
评论 #41317914 未加载
patrick45110 个月前
&gt; I also implemented all the wide math functions to work with this. It seems that I have arranged all the data perfectly for the compiler to use automatic vectorization. But it seems this doesn’t really happen to a sufficient degree to compete with my hand written SSE2.<p>I will keep this example in mind the next time somebody trots out the line that you should just trust the compiler.
unwind10 个月前
Very cool and informative article, and I love that Box2d is in C nowadays that really makes it clear. Great job!<p>I saw a small typo:<p><pre><code> &#x2F;&#x2F; wide float typedef b2FloatW __m128; </code></pre> The `typedef` is backwards, the alias and the underlying type name are in the wrong order and need to be swapped around.
openasocket10 个月前
They didn&#x27;t really explain why graph coloring can be done quickly in this case. I&#x27;m not familiar with graph coloring theory, but I know the general graph coloring problems are NP-complete. I&#x27;m guessing the reason you can use a greedy algorithm to get a decent coloring quickly is because you know the graph is planar, and because you don&#x27;t actually care about getting a minimal coloring.
评论 #41324162 未加载
评论 #41325064 未加载
评论 #41324179 未加载
intalentive10 个月前
Is the AVX2 implementation using the full 256-bit register width? If so, I am surprised that it is only 14% faster than SSE. If not, I would like to see how the results compare with the rest of the tests.
评论 #41324493 未加载
xoranth10 个月前
General questions for gamedevs here. How useful is SIMD given that now we have compute shaders on the GPU? If so, what workloads still require SIMD&#x2F;why would you choose one over the other?
评论 #41318746 未加载
评论 #41322109 未加载
评论 #41318670 未加载
noelwelsh10 个月前
&gt; This is a bit ironic because on x64 all math is SIMD math, it is just inefficient SIMD math.<p>I don&#x27;t understand this statement at the end of the article? Can anyone explain? TIA.
评论 #41318325 未加载
评论 #41318308 未加载
评论 #41318273 未加载
CreepGin10 个月前
I really appreciate the write-up and showing the most important code snippets.<p>Intuitively, this feels like a narrower version of using Z-order curve.
ccppurcell10 个月前
This is a total longshot but my PhD and research has been in graph algorithms including colouring, and I&#x27;m looking for a career change. If anyone has any advice at all (roles or companies that use these skills, tools to learn, etc) I&#x27;d be very interested.<p>Caveats: my knowledge is mostly theoretical (eg proving np-hardness or algorithm existence results) but I&#x27;m very good at thinking algorithmically. I have only hobbyist programming skills but I am a fast learner. Thanks!
评论 #41321411 未加载
评论 #41321766 未加载
baq10 个月前
&gt; The Apple M2 smokes!<p>I was about this surprised when I made a jupyter notebook with a few gigs of numbers shuffled around and xgboosted and after I was done prototyping on an M1 Air and ran it on my serious box (a 12700k) it was actually slower, and noticeably.
评论 #41321772 未加载
vkazanov10 个月前
The other problem with simd is that in modern cpu-centric languages it often requires a rewrite for every vector width.<p>And for 80% of the cases by the point there is enough vectorizable data for a programmer to look into simd, a gpu can provide 1000%+ of perf AND a certain level of portability.<p>So right now simd is a niche tool for super low-level things: certain decompression algos, bits of math here and there, solvers, etc.<p>And it also takes a lot of space on your cpu die. Like, A LOT.
评论 #41320377 未加载
评论 #41317790 未加载
评论 #41318915 未加载
评论 #41320947 未加载
评论 #41317789 未加载
评论 #41317862 未加载
评论 #41318151 未加载
评论 #41318123 未加载