In 2010, I tried to use callgrind to profile a project on Arm, after having used it to great effect on x86, and discovered that because of the variety of ways to return from (and call!) functions on Arm, callgrind was unable to reliably identify function call and return sites. It created cycles in the call graph and even failed to record a function's self measurements correctly (because it could not tell when you left that function).<p>The problem boiled down to the valgrind frontend code that splits things up into basic blocks being unable to treat a single instruction as both a conditional jump and a function call / return at the same time. That never happens on x86, but it is possible (and totally normal) on 32-bit Arm. Sadly, I ran out of time to try to re-architect this code and had to move on to other projects.<p>Over 12 years later, it looks like it never did get fixed: <a href="https://bugs.kde.org/show_bug.cgi?id=252091" rel="nofollow">https://bugs.kde.org/show_bug.cgi?id=252091</a>
Thankfully, PC is no longer a GPR in ARM64. Making PC a GPR seems elegant at first glance, but when you actually dive into it and see how it affects processor implementations and how it affects the code you write, it turns out to be extremely messy and inconvenient. Good riddance PC as GPR, don’t let the door hit you on the way out.
Early versions of ARM (ARM 1/2, optional in 3/4) had a combined program counter / status register; since there was only a 26-bit address space and instructions are always 32-bit word aligned, the top 6 and bottom 2 bits were used for the status register.<p>So, if you're still developing for an ARM1, not all of these are equivalent. MOV/POP/etc will set the PC and the status register; B/BL will leave the status register bits alone.<p>* edit: MOV/MOVS determined if the status bits are written to R15.
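To make the layout concrete, here is a minimal Python sketch that splits a 26-bit-mode R15 value into its fields. The function name `split_r15` and the dictionary keys are made up for illustration. The flag order (N, Z, C, V, I, F in bits 31 to 26, mode in bits 1 to 0) is the standard layout for these early ARMs, but treat this as a sketch rather than a reference.

```python
def split_r15(r15):
    """Split a 26-bit-mode R15 into PC and status fields (illustrative helper)."""
    r15 &= 0xFFFFFFFF
    flags = r15 >> 26          # top 6 bits: N Z C V I F
    return {
        "N": (flags >> 5) & 1,
        "Z": (flags >> 4) & 1,
        "C": (flags >> 3) & 1,
        "V": (flags >> 2) & 1,
        "I": (flags >> 1) & 1,
        "F": flags & 1,
        "pc": r15 & 0x03FFFFFC,  # bits 25..2: word-aligned 26-bit address
        "mode": r15 & 0b11,      # bottom 2 bits: processor mode
    }

# Example value chosen by hand: N, I and F set, PC = 0x8000, mode bits = 3.
fields = split_r15(0x8C008003)
print(hex(fields["pc"]), fields["mode"], fields["N"], fields["Z"])
```

Because the address and the status bits share one register, anything that writes all 32 bits of R15 replaces both at once, which is why the choice of instruction matters here.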
Method 1 (popping PC off the stack) and Method 3 (mov pc,lr) do not work on the earliest ARM processors that support THUMB, as they will not switch to THUMB mode without executing a BX instruction.<p>Checking reference manuals:<p>ARMV4T (ARM7TDMI/ARM9TDMI): Does NOT switch to THUMB mode automatically<p>ARMV5T: Loads into PC (LDR/LDM/POP) DO switch to THUMB mode automatically; MOV PC, LR does NOT<p>ARMV7: Does switch to THUMB mode automatically
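For anyone unfamiliar with what BX actually does, a rough Python model: bit 0 of the target address selects the instruction set, and the rest is the address to branch to. The function name `bx` is hypothetical and this ignores edge cases such as misaligned ARM targets.

```python
def bx(target):
    """Model BX Rm: return (new_pc, thumb_state). Illustrative only."""
    thumb = bool(target & 1)          # bit 0 set means "continue in Thumb state"
    new_pc = target & 0xFFFFFFFE      # bit 0 is not part of the address
    return new_pc, thumb

# A return address pointing at Thumb code has bit 0 set.
pc, thumb = bx(0x8001)
print(hex(pc), thumb)

# A return address pointing at ARM code has bit 0 clear.
pc_arm, thumb_arm = bx(0x8000)
print(hex(pc_arm), thumb_arm)
```

A write to PC that does not interwork performs the branch but leaves the instruction-set state alone, so the core would try to decode Thumb code as ARM instructions.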
For older architectures, you really want to use the BX instruction unless you can guarantee you're not switching execution mode.<p>As a bit of pointless trivia, MOV PC, PC does not cause an infinite loop - it skips the instruction immediately following.
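The MOV PC, PC trivia falls out of the fact that, in ARM state, reading PC yields the address of the current instruction plus 8. A tiny Python sketch of the arithmetic (the names `ARM_PC_READ_OFFSET` and `mov_pc_pc` are invented for this example, and it assumes ARM state with 4-byte instructions):

```python
ARM_PC_READ_OFFSET = 8   # ARM state: a read of PC sees instruction address + 8
ARM_INSTR_SIZE = 4       # every ARM-state instruction is 4 bytes

def mov_pc_pc(instr_addr):
    """Address where execution continues after MOV PC, PC at instr_addr."""
    return instr_addr + ARM_PC_READ_OFFSET

addr = 0x1000
next_addr = mov_pc_pc(addr)
skipped = addr + ARM_INSTR_SIZE   # the one instruction that never runs
print(hex(next_addr), hex(skipped))
```

So the instruction at 0x1004 is skipped and execution carries on at 0x1008.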
It's sad (but very reasonable) that newer architectures have tried to be significantly <i>less</i> flexible about control flow for security reasons, in the direction of "there is only one way to call a function, and only one way to return from a function, and you have to tell the system where your functions and returns are so that someone can't call or return into the middle or can't leave the function at an unexpected point" (and, of course, no self-modifying code, and even discouraging JITs).
As other commenters have mentioned, exploiting this will confuse other tools and debuggers. Also, it tends to play havoc with branch prediction, so there may be performance penalties.
fwiw, if you're using ARM assembly on an Apple device there are a few differences and one of them is how you pass arguments.<p><a href="https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms" rel="nofollow">https://developer.apple.com/documentation/xcode/writing-arm6...</a>