Writing a Self-Mutating x86_64 C Program (2013)

84 points by Cieplak over 6 years ago

7 comments

simias over 6 years ago
I've written "self-modifying" (really JITed) code for several architectures, mainly ARM, and when I had to do it for amd64 I was very much surprised by how straightforward it was.

On ARM you have to be very careful to handle the cache correctly when you write self-modifying code, because when you access memory using a regular load or store it's obviously treated like data and goes through the data cache, while the instructions are fetched through the instruction cache. So when you write an opcode you have to be careful to flush it out of the data cache (at least up to the point where the caches unify, typically L2 on ARM) and then invalidate the icache to make sure that you get the opcode back.

On modern x86-64 architectures, which typically have a very advanced cache system, I expected to have to deal with that as well. As this article shows, you don't. You just write whatever and you can execute it straight after. When you think about it, it's a rather complicated thing for the hardware to implement. I wonder why they do it that way instead of relying on the software to issue flushes in the (relatively rare) situations where a hazard is possible.
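
A minimal sketch of the write-then-execute pattern described in the comment above, assuming Linux and GCC or Clang. The names `run_generated_code` and `jit_fn` are hypothetical, and the six code bytes are a hand-assembled `mov eax, 42; ret`, not anything taken from the article. The `__builtin___clear_cache` call is the portable way to request the data-cache clean and instruction-cache invalidate that ARM needs; on x86-64 compilers typically expand it to nothing, which matches the observation that no explicit flush is required there.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical name for a generated function that takes no arguments. */
typedef int (*jit_fn)(void);

/* Writes a tiny function into fresh memory and calls it.
 * Returns the generated function's result, or -1 if the buffer cannot be
 * mapped or the build target is not x86-64. */
int run_generated_code(void)
{
#if defined(__x86_64__)
    /* x86-64 encoding of: mov eax, 42 ; ret */
    static const uint8_t code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
    const size_t len = 4096;

    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* The bytes are written with ordinary data stores. */
    memcpy(buf, code, sizeof code);

    /* Required on ARM and similar targets; usually compiles to nothing
     * on x86-64, where the hardware keeps instruction fetch coherent. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof code);

    /* Object-to-function pointer casts are outside ISO C but supported
     * on POSIX systems. */
    int result = ((jit_fn)buf)();

    munmap(buf, len);
    return result;
#else
    return -1;                     /* the byte sequence above is x86-64 only */
#endif
}
```

Calling `run_generated_code()` from `main` on an x86-64 Linux system that permits writable and executable mappings should return 42. Keeping the cache call in place costs nothing on x86-64 and leaves only the opcode bytes to change when porting.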
benj111 over 6 years ago
Are there any languages that could actually make use of self-modifying code?

Machines wouldn't have the same problems reasoning about it as humans would.

Or is it a question of compilers not being good enough until processor tech made the optimisation not worth it?
rootbear over 6 years ago
I once read about a clever use of self-modifying x86 code. The 8086 and 8088 are nearly identical chips, with the difference being that the 8086 has 16-bit I/O and the 8088 has 8-bit. The only way for a program to know which chip it's running on is to write a bit of self-modifying code that takes advantage of this difference in I/O size. Both chips use prefetch, but they prefetch words, not bytes, and 8086 words are 16-bit, so it fetches twice as many bytes as the 8088. Thus, one can modify a location in RAM just after the current instruction and that change will be seen on the 8088, but not the 8086, which has already prefetched the previous value.

This is all from memory of something I read, probably on Usenet, ages ago. My apologies in advance if I messed up the details.

I wonder if any of the various x86 emulators out there get this difference right.
mempko over 6 years ago
This is super cool, though it makes me want to go back and write some Lisp.
crimsonalucard over 6 years ago
Isn't this technique used for viruses to escape detection?
Avery3R over 6 years ago
I don't understand why people still use AT&T syntax asm. It's so much harder to read than Intel.
snek over 6 years ago
I was on board until `PROT_READ | PROT_WRITE | PROT_EXEC`
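
The objection above is presumably to memory that is writable and executable at the same time. A common alternative, sketched here under the assumption of Linux and GCC or Clang, is to map the buffer read/write, fill it, and only then switch it to read/execute with `mprotect`. This applies to a freshly generated buffer rather than to a program patching its own text as in the article, and the names `run_with_wx_separation` and `gen_fn` are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical name for a generated function that takes no arguments. */
typedef int (*gen_fn)(void);

/* Fills a buffer while it is writable, then makes it executable but no
 * longer writable before calling into it.
 * Returns the generated function's result, or -1 on any failure or when
 * the build target is not x86-64. */
int run_with_wx_separation(void)
{
#if defined(__x86_64__)
    /* x86-64 encoding of: mov eax, 7 ; ret */
    static const uint8_t code[] = { 0xb8, 0x07, 0x00, 0x00, 0x00, 0xc3 };
    const size_t len = 4096;

    /* Step 1: writable, not executable. */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, code, sizeof code);

    /* Step 2: executable, no longer writable. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    int result = ((gen_fn)buf)();

    munmap(buf, len);
    return result;
#else
    return -1;                     /* the byte sequence above is x86-64 only */
#endif
}
```

On a permissive x86-64 Linux system `run_with_wx_separation()` should return 7. The trade-off is one extra system call per permission change, in exchange for the region never being writable and executable at once.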