As someone with considerable experience using x86 Asm (over a decade), I always find it nice to see more articles on it, but a few things about this one struck me as odd.<p><i>The basis of the algorithm to use was decided to be a space-optimised method using a dynamic programming approach. This effectively means using an array on the stack to hold our values while giving O(n) time and space complexity.</i><p>That isn't "space-optimised" at all; when I think of a Fibonacci example I think of O(1) space, since only the last two values are ever needed. Unless it was chosen specifically to demonstrate array indexing (in which case it would be better to use an algorithm that actually requires an array, like sorting), the choice of algorithm is unusual. The same applies to finding the length of argv[1] before doing the equivalent of an atoi() on it: if you're going to read through the bytes of the string anyway, you might as well accumulate and convert as you go. Oddly enough, the length appears to be unused too.<p>The unexplained and useless nop is also puzzling. Why is it there? What is it for? The same goes for the many other instructions which aren't actually contributing to the calculation, and which made me think it was compiler output when I first glanced over the code. Writing a program that performs the given task is relatively straightforward even in Asm, but your solution is unnecessarily complex.<p><i>Making things easier to understand using function calls</i><p>I consider this a premature abstraction. A comment or even a label will do just fine to delimit the blocks of functionality, as your "functions" are called only once. Your task has 3 parts [basically atoi(), fib(), itoa()+output] so you will have 3 blocks of code. It cannot be any simpler than that. A better goal is "making things easier to understand by avoiding unnecessary code".<p>Here is how I would write the first 2 blocks; I wrote this without testing at all so it could have errors, but it is really quite simple:<p><pre><code> mov esi, [esp+8] ; get argv[1]
xor eax, eax ; clear the upper bits
xor ecx, ecx ; will hold converted value
atoiloop:
lodsb ; get a character
cmp al, '0'
jb fibinit
cmp al, '9'
ja fibinit
imul ecx, ecx, 10
lea ecx, [ecx+eax-'0']
jmp short atoiloop
fibinit:
jecxz invalid ; can't do fib(0)
xor eax, eax
inc eax ; eax = 1
cdq ; edx = 0
fibloop:
xadd eax, edx
loop fibloop
; after n iterations: edx = fib(n), eax = fib(n+1), taking fib(1) = fib(2) = 1
</code></pre>
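<p>For anyone who wants to sanity-check the register flow without an assembler at hand, here is a rough C model of the two blocks. It is only a sketch: the function names parse_decimal and fib_pair are made up for illustration, the variables are named after the registers they stand in for, and the jecxz guard is not modelled (callers are assumed to pass n >= 1, and n = 0 simply runs zero iterations here).

```c
#include <stdint.h>

/* Model of the atoiloop block: accumulate decimal digits into "ecx" and
   stop at the first byte that is not a digit (including the final NUL).
   Hypothetical helper name, not from the original article. */
static uint32_t parse_decimal(const char *s)
{
    uint32_t ecx = 0;                            /* xor ecx, ecx */
    for (;;) {
        unsigned char al = (unsigned char)*s++;  /* lodsb */
        if (al < '0' || al > '9')                /* jb / ja fibinit */
            break;
        ecx = ecx * 10 + (uint32_t)(al - '0');   /* imul, then lea */
    }
    return ecx;
}

/* Model of the fibloop block: O(1) space, two values carried per step.
   Starts from eax = 1, edx = 0 and applies xadd n times.
   Hypothetical helper name, not from the original article. */
static void fib_pair(uint32_t n, uint32_t *eax_out, uint32_t *edx_out)
{
    uint32_t eax = 1;                            /* xor eax, eax / inc eax */
    uint32_t edx = 0;                            /* cdq */
    while (n-- > 0) {                            /* loop fibloop */
        uint32_t sum = eax + edx;                /* xadd eax, edx:       */
        edx = eax;                               /*   edx gets old eax   */
        eax = sum;                               /*   eax gets the sum   */
    }
    *eax_out = eax;
    *edx_out = edx;
}
```

Calling fib_pair(parse_decimal(argv[1]), &a, &d) mirrors the listing: the count comes out of the digit loop and goes straight into the Fibonacci loop, with no array and no separate length pass. Arithmetic wraps modulo 2^32, the same as the 32-bit registers would.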
Observe that I chose these specific registers for a reason: esi/al will be used by lodsb, ecx by the looping instructions, and edx for xadd. This is one of the things that compilers are not very good at, and also why carefully written Asm can easily beat them.<p>If you want to learn more Asm, I recommend using any assembler besides GAS and studying some examples from advanced users: <a href="http://www.hugi.scene.org/compo/compoold.htm" rel="nofollow">http://www.hugi.scene.org/compo/compoold.htm</a> The demoscene, in its smaller-size competitions, has lots of particularly knowledgeable users too.