"Twice the speed of the GBC" is a bit misleading.<p>Clock rate of the ARM7TDMI is indeed around double the GBC (GBA runs at 16.78Mz, while GBC runs at 8.4MHz), but cycles-per-instruction is far lower on the GBA's ARM7TDMI than the GBC's Z80-like processor.<p>On GBA, most instructions take 1 cycle to execute (when running from fast memory). Not all instructions take one cycle, memory Read/Write instructions, branches, and multiplying takes more than one cycle.<p>On GBC, an instruction basically takes 4 cycles per memory access. This includes the instruction fetch itself, each other byte of the instruction, each memory read/write performed, then 4 additional cycles if the instruction performed 16-bit math. (Also stuff for branches too)<p>But GBA doesn't always run code from fast memory. It gets the worst-case performance when executing code directly from the cartridge. When running 16-bit THUMB code, it takes 5 cycles. When running 32-bit ARM code, it takes 8 cycles. This means that a game needs to copy code into fast memory if it wants to run at a high performance.<p>So with the full penalties that come from directly executing code from the cartridge, and you're comparing the simplest instructions, it does end up being only twice as fast. But when running code from fast memory, it's around 16 times faster.
One clear trend in mobile devices since the GBA era is the increase in screen size. The Switch's screen is so much bigger than the GBA's. You also see it on phones, with popular models approaching if not exceeding 7", a size that used to be reserved for small tablets. Where do we stop? Don't get me wrong, I like the bigger screens, but portability is suffering and most pockets no longer fit today's "portable" devices.
I played so much Advance Wars in high school that I would see the map whenever I closed my eyes, and it showed up in all my dreams.<p>The original GBA is the most comfortable handheld I've played. It fit perfectly in my hands and my pocket.
What do you mean, "No more HDMA tricks"? The GBA literally does DMA transfers that are carried out automatically at every horizontal-blanking interval. They're just encoded differently from how the SNES did it: no scanline numbers, just raw data that gets transferred each scanline as the horizontal-blanking period happens.
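To make the encoding difference concrete, here is a toy model in Python. It is not hardware code and touches no real registers. It assumes one value is copied to a fixed destination at each horizontal blank while the source pointer advances, over 160 visible scanlines. The names `simulate_hblank_dma` and `expand_counted_table` are hypothetical, and the (line count, value) format is only a simplified stand-in for a table that carries scanline numbers.

```python
VISIBLE_SCANLINES = 160  # visible lines on the GBA LCD


def simulate_hblank_dma(table, initial):
    """Return the register value in effect while each scanline is drawn.

    Toy model: at the HBlank after each line, the next raw table entry is
    copied to a fixed destination and the source pointer moves forward.
    """
    register = initial
    per_line = []
    src = 0
    for _ in range(VISIBLE_SCANLINES):
        per_line.append(register)     # value used for this scanline
        if src < len(table):
            register = table[src]     # HBlank: copy one entry, same destination
            src += 1                  # source address increments
    return per_line


def expand_counted_table(entries):
    """Flatten simplified (line_count, value) pairs into one value per line."""
    flat = []
    for count, value in entries:
        flat.extend([value] * count)
    return flat


if __name__ == "__main__":
    # A table with line counts has to be flattened first, because the
    # raw-data style needs one entry for every scanline.
    raw = expand_counted_table([(2, 10), (1, 20)])
    print(simulate_hblank_dma(raw, initial=0)[:5])
```

The practical consequence is that an effect that only changes a few times per frame still needs a full per-scanline table in the raw-data style, which costs a bit more memory but keeps the transfer itself trivial.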