For AES-NI, you are probably already using some sort of assembly to call the instructions. In that same assembly, you could wipe the key and plaintext as the last step.<p>For the stack, if you can estimate how large the function's stack allocation can be (which shouldn't be too hard for most functions), you could, after it returns, call a separate assembly function that allocates a larger stack frame and wipes it (don't forget about the red zone too!). IIRC, OpenSSL tries to do that, using a horrible-looking piece of voodoo code.<p>For the registers, the same stack-wiping function could also zero all the ones the ABI says a called function can overwrite. The others, if used at all by the cryptographic function, have already been restored before returning to the caller.<p>Yes, it's not completely portable due to the tiny amount of assembly, but the usefulness of portable code comes not from it being 100% portable but from reducing the amount of machine- and compiler-specific code to a minimum. Write one stack- and register-wipe function in assembly, one "memset and I mean it" function using either inline assembly or a separate assembly file, and the rest of your code doesn't have to change at all when porting to a new system.
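<p>To make the "memset and I mean it" idea concrete, here is a rough sketch of the inline-assembly variant. It assumes GCC or Clang (the extended asm syntax is compiler-specific), and the names secure_wipe and count_nonzero_after_wipe are made up for illustration, not taken from any library. It only covers the buffer wipe; the stack- and register-wipe function would still need real, per-architecture assembly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero n bytes at p in a way the optimizer
 * should not be able to drop as a dead store. Assumes GCC/Clang. */
static void secure_wipe(void *p, size_t n)
{
    memset(p, 0, n);
    /* Empty asm statement that takes p as an input and declares a
     * "memory" clobber. The compiler must assume the asm may read the
     * bytes p points to, so the memset above has to actually happen. */
    __asm__ __volatile__("" : : "r"(p) : "memory");
}

/* Hypothetical demo: fill a fake key, wipe it, and count the bytes
 * that are still non-zero afterwards. */
int count_nonzero_after_wipe(void)
{
    unsigned char key[32];
    size_t i;
    int nonzero = 0;

    memset(key, 0xAA, sizeof key);   /* stand-in for real key material */
    secure_wipe(key, sizeof key);

    for (i = 0; i < sizeof key; i++)
        nonzero += (key[i] != 0);
    return nonzero;
}
```

The one compiler-specific line is the asm statement, so porting means replacing only secure_wipe (for example with a separate assembly file) and leaving its callers alone.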