I know I might be barking up the "just a showcase" tree, but what's the memory complexity of this? E.g. a nice property of multi-pass assemblers is that they only need to keep a list of symbols in memory rather than the whole program. Not really a concern on modern systems with large memories, but those neat Haskell tricks usually carry some extra O(n)s - in addition to (avoidable) extra constant factors from hidden linked lists and immutability. Aka the infamous Haskell quicksort: much shorter and clearer, but also not really a quicksort.
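For reference, and with the caveat that this is the well-worn tutorial version rather than anything from the article, the "quicksort" being alluded to is roughly:

```haskell
-- The classic two-line Haskell "quicksort": a correct sort, but it
-- allocates fresh lists at every level of recursion instead of
-- partitioning in place, which is the whole point of real quicksort.
qsort :: Ord a => [a] -> [a]
qsort []     = []
qsort (p:xs) = qsort [x | x <- xs, x < p] ++ [p] ++ qsort [x | x <- xs, x >= p]
```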
One pass, but... I'm willing to bet there is some transitional internal state that has to be looped over to do this, subsuming the "pass" inside lazy evaluation. You have to know the as-yet-indeterminate "where" to jump to, which depends on subsequent state. So you preserve execution state, keep calculating, and return at the end to fill in the gaps.
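To make that suspicion concrete, here's a minimal sketch (not the article's code; toy names and a toy instruction set) of the usual lazy knot-tying trick: the symbol table produced by the single traversal is consumed, lazily, by that same traversal, so forward references get filled in without an explicit second pass:

```haskell
import qualified Data.Map as M

-- A toy item: a label definition, a 1-byte opcode, or a jump whose
-- emitted byte is simply the target's address (hypothetical encoding).
data Item = Label String | Instr Int | JumpTo String

assemble :: [Item] -> [Int]
assemble items = code
  where
    -- 'table' is looked up while it is still being built; laziness
    -- makes this work because addresses never depend on the values
    -- being looked up, only on the sizes of earlier items.
    (code, table) = go 0 items

    go :: Int -> [Item] -> ([Int], M.Map String Int)
    go _  []              = ([], M.empty)
    go pc (Label l  : xs) = let (c, t) = go pc xs       in (c, M.insert l pc t)
    go pc (Instr b  : xs) = let (c, t) = go (pc + 1) xs in (b : c, t)
    go pc (JumpTo l : xs) = let (c, t) = go (pc + 1) xs in (table M.! l : c, t)
```

The catch, and presumably the interesting part here, is that once jump *sizes* depend on the distances being computed, the addresses feed back into themselves and a naive knot like this no longer resolves on its own.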
Or you could just be pessimistic in your offset calculation and assume that all "in between" unconditional jumps are 5 bytes long. I know, I know, not proper. But I doubt that, in real code, the simpler algorithm would make you miss many opportunities for 2-byte jumps.
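As a toy illustration of that pessimistic approach (made-up types, not from the article): if every unconditional jump is charged its long 5-byte form up front, every offset is known in a single strict pass, and the only cost is the occasional nearby target that could have used the 2-byte form:

```haskell
-- Pessimistic sizing: charge every jump the long rel32 form.
data Ins = Plain Int     -- an instruction with a known byte size
         | Jump String   -- an unconditional jump to a label

size :: Ins -> Int
size (Plain n) = n
size (Jump _)  = 5       -- always reserve 5 bytes (opcode + rel32)

-- Running byte offsets, computable with no look-ahead
-- (the last element is the total size of the program).
offsets :: [Ins] -> [Int]
offsets = scanl (\pc i -> pc + size i) 0
```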
Yeah, this is a fun problem. This is where the majority of my work went when writing an x86 virus that lived in the NOP areas between functions. Compilers produced those blocks of NOPs after function bodies to align the following function's address to a 16-byte boundary. I chained all of those cavities together with jumps, distributing as much code as would fit and then putting a jump to the next cavity at the end. The trick is that you could fit more instructions (3 more bytes?) if you had a shorter jump.

I still want to write my own assembler some day.
This is basically irrelevant now that better ISAs like RISC-V keep instructions to a couple of fixed sizes (2 or 4 bytes), so the fancy algorithm here isn't necessary.