Ask HN: How are you using LLMs for traversing decompiler output?

105 points by mjbale116 4 months ago
I need to reverse a binary made years ago, and I have zero experience with cpp, so I think it would be a good experiment to get an LLM to help me in any way.

20 comments

carom 4 months ago
Binary Ninja has an AI integration called Sidekick. It has a free trial, but I'm not sure it can be used in the free web version. [1]

In my experience, the off-the-shelf LLMs (e.g. ChatGPT) do a pretty poor job with assembly; they cannot reason about the stack or stack frames well.

I think your job will be the same with or without AI: figuring out the data structures and data types a function is operating on, and naming variables.

What are you reverse engineering for? For example, getting a full compilable decompilation has different goals than finding vulnerabilities or patching a bug.

1. https://sidekick.binary.ninja/
JosephRedfern 4 months ago
These guys are building foundational models for this purpose: https://reveng.ai/. The results are quite compelling, and they have plugins for your favourite reverse engineering tools.
netsec_burn 4 months ago
I made a site to use LLMs to help me with reverse engineering. The output is surprisingly readable, even with C++ classes. Let me know any feedback you might have: https://decompiler.zeroday.engineering/
__alexander 4 months ago
Do you have experience reverse engineering? If not, LLMs are not going to help much. LLMs are useful for aiding the analysis but they don’t do the analysis.
lumb63 4 months ago
It has nothing to do with LLMs, but Ghidra is a wonderful tool.
Dwedit 4 months ago
Have you tried Ghidra yet? If you still have your debug symbols, then it can do a really good job.
flashgordon 4 months ago
Interesting. Wouldn't this actually be a deterministic problem based on graph analysis? I'd have thought LLMs would be more effective taking the output of some graph recognizer and then identifying what those higher-level constructs map to.
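
To make the "graph recognizer" idea concrete, here is a minimal sketch of the deterministic part: finding loop back edges in a control-flow graph with a depth-first search. The CFG and its block names are made up for illustration; a real one would come from a disassembler or decompiler. In a reducible CFG, each edge found this way marks a loop, and that structural fact is the kind of thing that could then be handed to an LLM for naming.

```python
from typing import Dict, List, Set, Tuple


def find_back_edges(cfg: Dict[str, List[str]], entry: str) -> Set[Tuple[str, str]]:
    """Return edges (src, dst) whose target is still on the current DFS path."""
    back_edges: Set[Tuple[str, str]] = set()
    state: Dict[str, str] = {}  # node -> "visiting" or "done"

    def visit(node: str) -> None:
        state[node] = "visiting"
        for succ in cfg.get(node, []):
            if state.get(succ) == "visiting":
                # succ is an ancestor on the DFS path, so this edge closes a cycle
                back_edges.add((node, succ))
            elif succ not in state:
                visit(succ)
        state[node] = "done"

    visit(entry)
    return back_edges


# Hypothetical CFG of a simple while loop: cond -> body -> cond, cond -> exit
cfg = {
    "entry": ["cond"],
    "cond": ["body", "exit"],
    "body": ["cond"],
    "exit": [],
}

print(find_back_edges(cfg, "entry"))  # {('body', 'cond')}
```

The recursive DFS is fine for a sketch; very large functions would probably need an explicit stack instead.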
rgovostes 4 months ago
The LLM4Decompile project (https://github.com/albertan017/LLM4Decompile) provides some open models for binary to C decompilation and Ghidra pseudocode refinement, along with some training sets.

RevEng.ai, linked a few times already, discusses their approach here: https://blog.reveng.ai/training-an-llm-to-decompile-assembly-code/
mahaloz 4 months ago
I like using it for library function comments, variable name recovery, and sometimes types. The comments are usually hit or miss, but I find the variable names to be a bit better than auto-generated ones. I implement most of this in my decompiler plugin: https://github.com/mahaloz/DAILA; check it out if you are interested :).
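
As a rough illustration of the variable name recovery step (this is not how DAILA does it; the function and sample data below are hypothetical), one way to apply a model's suggested names to decompiler pseudocode is a single whole-identifier regex pass, so that renaming `a1` does not touch `a10`:

```python
import re
from typing import Dict


def apply_renames(pseudocode: str, renames: Dict[str, str]) -> str:
    """Replace whole identifiers in pseudocode according to a rename map."""
    if not renames:
        return pseudocode
    # \b anchors keep "a1" from matching inside "a10"; one pass avoids chained renames
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in renames) + r")\b")
    return pattern.sub(lambda m: renames[m.group(1)], pseudocode)


# Hypothetical decompiler output and hypothetical LLM suggestions
pseudocode = "int sub_401000(char *a1, int a2) { int v3 = a2 * 2; return a1[v3] + a10; }"
suggested = {"sub_401000": "read_entry", "a1": "table", "a2": "index", "v3": "offset"}

print(apply_renames(pseudocode, suggested))
```

A text-level rename like this ignores scoping and string literals; inside a real decompiler it is safer to go through the tool's own rename API so cross-references stay consistent.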
stackghost 4 months ago
The Advent of Cyber side quest this year needed some Ghidra and I found Pickman's Model was pretty good at helping me craft a heap exploit from a decompilation.
jkstill 4 months ago
I've only played with this, but it was impressive.

https://ghidra-sre.org/
userbinator 4 months ago
Unfortunately LLMs are not good at precision and details, which is exactly what you need for the sort of analysis you're trying to do.
apatheticonion 4 months ago
Inspired by the work out there that reverse engineers game engines, I've always wanted to try my hand at reverse engineering to contribute to the world of game preservation.

Is it actually legal to decompile a game engine from executables/DLL files, write new sources by making sense of the output, and rewrite it such that it can be compiled targeting modern APIs?

I feel like that must be illegal.
feznyng 4 months ago
You could use the LLM to help you write utility scripts for whatever disassembler you’re using, e.g. Python for IDA. That might work better than feeding it raw assembly.

Game RE communities also have all sorts of neat utilities for decompiling large cpp binaries. Skyrim’s community is pretty active with Ghidra/IDA.

Guessing you’re not lucky enough to have a PDB?
klmitchell2 4 months ago
https://github.com/radareorg/r2ai
sitkack 4 months ago
Do you know the compiler and what the source possibly looks like? I found LLMs are pretty good at recovering code from binaries; they need help, though.

If you are able to run the program and collect traces, that will help a ton.
svilen_dobrev 4 months ago
cpp? that's a preprocessor. u mean c++?

LLM won't help you much if u can't understand what it's talking about.

Manual way is, given ELF (linux executable format) somexe,

$ strings somexe
$ objdump -d somexe
$ objdump -s -j .rodata somexe

then look+ponder over the results,

and/or running ghidra (as mouse'd UI) over it.. which may help somewhat but not 100%

Have in mind, that objdump and ghidra have opposite ways of showing assembly transfer/multi-operand instructions - one has mov dest,target , other has mov target,dest - for same code.

no idea on (recent) windoze front. IDA ?
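
For anyone wondering what the first command in that list actually does, here is a minimal Python sketch that approximates `strings`: it scans raw bytes for runs of printable ASCII of at least four characters. The sample bytes are made up for illustration, and the real tool has more options (encodings, section selection, offsets) than this covers.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII (0x20-0x7e) at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Hypothetical bytes standing in for the contents of an ELF file
sample = (
    b"\x7fELF\x02\x01\x01\x00"
    b"/lib64/ld-linux-x86-64.so.2\x00"
    b"\x01\x02usage: %s <file>\x00ab\x00"
)

for s in extract_strings(sample):
    print(s)
```

On a real binary you would pass in `open("somexe", "rb").read()` instead of the sample. Note that "ELF" and "ab" are skipped here because they are shorter than four characters.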
u53rn4m3 4 months ago
RevEng.AI have their own foundational AI models for decompilation with English language summaries.
seba_dos1 4 months ago
Good luck. If that's how you're approaching it, you're going to need it.
ianhawes 4 months ago
Highly recommend it. I reversed an app with o1 Pro Mode and the analysis of the obfuscated C# code matched up accurately with what I eventually discovered by manually reversing.