TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

What are the ways compilers recognize complex patterns?

124 点作者 azeemba10 个月前

4 条评论

pizlonator10 个月前
Canonical forms, smart stuff, a lot of hardcoding.<p>Canonical forms are the really important part. Simple example: there are multiple ways a program might say “X * 2”. You could shift left by 1. You could multiple by 2. You could add X to itself. The idea of canonical forms is that in one pass, the compiler will pattern match all the ways you can do this and reduce all of them to the same canonical form - say, left shift by 1. Then, subsequent passes that want to catch more complex uses of that construct only have to look for one version of it (left shift 1) and not all three.<p>Here’s a more complex case. Ternary expressions in C and if-else statements have the same representation in llvm IR generated by clang: basic blocks and branches. There are multiple ways of representing the data flow (could use allocas and stores&#x2F;loads or SSA data flow) but both the sroa and mem2reg passes will canonicalize to SSA. And, last I checked llvm says that the preferred canonical form of a if-then-else is a select (I.e. conditional move) whenever the two are equivalent. So, no matter what you use to write the equivalent of std::min - macros, templates, whatever, coding style don’t matter - you will end up eventually with a select instruction whose predicate is a comparison. Then - if your CPU supports doing min in a single instruction, it’s trivial for the instruction selector to just look for that kind of select. This happens not because every way of writing min is hardcoded, but because multiple rounds of canonicalization (clang using basic blocks and branches for both if&#x2F;else and ternaries, sroa and mem2reg preferring SSA, and if conversion preferring select) gets you there.<p>A lot of this is hardcoding, but it’s not the boring “hardcode everything” kind of approach, but rather, it’s about using multiple phases that each produce increasingly canonical code that makes subsequent pattern matching simpler.
评论 #41033457 未加载
评论 #41031000 未加载
评论 #41033752 未加载
评论 #41038228 未加载
评论 #41036496 未加载
CalChris10 个月前
The C code is a low level implementation of population count. AggressiveInstCombine.cpp first matches and <i>raises</i> the IR for this into llvm.ctpop.i64.<p><a href="https:&#x2F;&#x2F;godbolt.org&#x2F;z&#x2F;8ronKz3Eb" rel="nofollow">https:&#x2F;&#x2F;godbolt.org&#x2F;z&#x2F;8ronKz3Eb</a><p>Later an x86 backend can re-<i>lower</i> this into a popcnt instruction or to CPOP on a RISC-V backend or to CNT on ARMv8 or ….<p><a href="https:&#x2F;&#x2F;godbolt.org&#x2F;z&#x2F;4zvWs6rzr" rel="nofollow">https:&#x2F;&#x2F;godbolt.org&#x2F;z&#x2F;4zvWs6rzr</a><p>It can also be re-lowered to roughly those instructions on machines lacking population count instructions.<p><a href="https:&#x2F;&#x2F;godbolt.org&#x2F;z&#x2F;aYsc9dz7f" rel="nofollow">https:&#x2F;&#x2F;godbolt.org&#x2F;z&#x2F;aYsc9dz7f</a>
userbinator10 个月前
Imagine if x86 had POPCNT since the beginning, implemented in microcode at first, and optimised it over time to be faster and use more available hardware. There would be no need for this sort of &quot;decompiler&quot; in a compiler nor would software need recompilation for each CPU model.
评论 #41030911 未加载
评论 #41031169 未加载
评论 #41034185 未加载
评论 #41031241 未加载
评论 #41038014 未加载
评论 #41031468 未加载
评论 #41031182 未加载
评论 #41030864 未加载
rurban10 个月前
In general: a tree matcher.<p>In this case: hardcoded search.