Nice article, but it doesn't quite deliver. It says the trick is not "black magic", but then defines debugging in terms of ptrace syscalls and describes the API a little, without giving a clue as to how ptrace actually works. So ptrace is essentially black magic.<p>And this is not really an explanation of "how a debugger works," or even "how gdb works." ptrace is just one of several debug targets for gdb. There are simulators, core files, various embedded monitors, VxWorks, Windows, gdb remote debug servers over various interfaces, and on and on. ptrace is irrelevant to those other targets.
One of the more interesting things about the ARM Cortex-M series is that debugging is "built in" to the CPU core on all licensed processors. No hacks required. That's something I'm sure x86 machines would have had, if transistors had been as cheap then as they are now. Of course, early on Intel made even <i>more</i> margin on versions of the processor used for in-circuit emulation, by 'bonding out' access to internal trace registers on otherwise unused pads.
I'm not too knowledgeable about this subject, but I've been interested in learning how native code debuggers work for a long time. One thing I wonder is, if the debugger inserts an invalid instruction or a hardware breakpoint instruction into the code at runtime, wouldn't all of the in-memory code need to be reallocated and recalculated in order to make room for the new instruction and recalculate jump addresses? How is this handled?
As someone who used to play with debugger implementations a bunch, it's nice to see some articles digging into this.<p>The only feedback I would give is to remove the shadow on your text. I had to manually disable the shadow before I was able to read it :).
Also relevant, and a good read to boot:
<a href="http://www.cs.tufts.edu/~nr/pubs/retargetable-abstract.html" rel="nofollow">http://www.cs.tufts.edu/~nr/pubs/retargetable-abstract.html</a>
Based on the title of the article, I expected it to describe very general principles for writing debuggers, but it seems very specific to gdb. Are things similar in, say, Python or Java?
So how are the ptrace functions implemented? Is the "hack" of inserting invalid instructions used even for single stepping? (Though hardware breakpoints are probably easier?)
Didn't read the article (I find the topic rather bland), but debugging on x86-64 is quite simple: you have your debug registers (DR0-DR3 hold the trigger addresses, and DR7 holds the enable bits and conditions: execute, write, or read/write), and the CPU raises a debug exception (#DB, interrupt 1) when a condition is satisfied. This approach is limited to 4 breakpoints. Most modern debuggers do software breakpoints: when you set a breakpoint for a particular line or instruction, the debugger replaces the first byte of the instruction with an int3 instruction (interrupt instructions are usually two bytes wide, 0xCD followed by the vector number, but int3 has its own one-byte encoding, 0xCC, so technically int3 and "int 3" are different). Regardless, the debugger usually stores the original instruction byte in a table so it can put it back when the int3 is actually hit. I suppose one could do this differently by causing a page fault (a simple bit switch from present to not present in the page table) and then reading the CR2 register to get the faulting address, whether that is the code being executed or the data being accessed. One point I forgot to mention is that the x86 has hardware support for single-stepping instructions (a simple flag, the trap flag in EFLAGS). But all of these methods require operating system support.
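To make the byte-swap idea concrete, here is a rough Python sketch of the bookkeeping only. It patches a plain bytearray standing in for the debuggee's code, so there is no ptrace or real process involved. The class and method names (BreakpointTable, set, clear, on_trap) are made up for illustration and are not taken from gdb or any other debugger. Note that the breakpoint overwrites a byte rather than inserting one, so nothing moves and no jump targets need recalculating.

```python
INT3 = 0xCC  # one-byte x86 breakpoint opcode


class BreakpointTable:
    """Toy model of a debugger's software-breakpoint bookkeeping (hypothetical names)."""

    def __init__(self, memory: bytearray):
        self.memory = memory  # stands in for the debuggee's code bytes
        self.saved = {}       # address -> original byte that int3 replaced

    def set(self, addr: int) -> None:
        if addr in self.saved:
            return  # already set; avoid saving 0xCC as the "original" byte
        self.saved[addr] = self.memory[addr]
        self.memory[addr] = INT3  # overwrite in place; length is unchanged

    def clear(self, addr: int) -> None:
        # Put the original byte back and forget the breakpoint.
        self.memory[addr] = self.saved.pop(addr)

    def on_trap(self, pc_after_trap: int) -> int:
        """Restore the original byte and return the address to resume at."""
        # int3 is one byte long, so the reported PC is one past the breakpoint.
        addr = pc_after_trap - 1
        self.clear(addr)
        return addr


# push rbp; mov rbp, rsp; nop; pop rbp; ret
original = b"\x55\x48\x89\xe5\x90\x5d\xc3"
code = bytearray(original)
table = BreakpointTable(code)

table.set(1)                     # breakpoint on "mov rbp, rsp"
patched = bytes(code)            # snapshot with the int3 in place
resume_at = table.on_trap(2)     # trap reports PC just past the int3
```

After the trap is handled the code bytes are back to their original values and execution would resume at the breakpoint address. A real debugger would typically single-step the restored instruction and then write the int3 back so the breakpoint stays armed; that part is left out here.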