
Using LLMs to Generate Fuzzers

156 points by moyix about 1 year ago

6 comments

ttul about 1 year ago

I read a lot of niggling comments here about whether Claude was really being smart in writing this GIF fuzzer. Of course it was trained on fuzzer source code. Of course it has read every blog post about esoteric boundary conditions in GIF parsers.

But to bring all of those things together and translate the concepts into working Python code is astonishing. We have just forgotten that a year ago, this achievement would have blown our minds.

I recently had to write an email to my kid's school so that he could get some more support for a learning disability. I fed Claude 3 Opus a copy of his 35-page psychometric testing report along with a couple of his recent report cards and asked it to draft the email for me, making reference to things in the three documents provided. I also suggested it pay special attention to one of the testing results.

The first email draft was ready to send. Sure, I tweaked a thing or two, but this saved me half an hour of digging through dense material written by a psychologist. After verifying that there were no factual errors, I hit "Send." To me, it's still magic.
smusamashah about 1 year ago

I have kind of a pet peeve with people testing LLMs like this these days.

They take whatever it spits out on the first attempt, and then they go on to extrapolate that into all kinds of conclusions. They forget that the output it generated is based on a random seed; a new attempt (with a new seed) is going to give a totally different answer.

If the author had retried that prompt, the new attempt might have generated better code or much worse code. You cannot draw conclusions from just one answer.
popinman322 about 1 year ago

You could likely also combine the LLM with a coverage tool to provide additional guidance when regenerating the fuzzer: "Your fuzzer missed lines XX-YY in the code. Explain why you think the fuzzer missed those lines, describe inputs that might reach those lines in the code, and then update the fuzzer code to match your observations."

This approach could likely also be combined with RL; the code coverage provides a decent reward signal.
planetis about 1 year ago

This seems to overlook that the language model was trained on a large corpus of code, which probably includes structured fuzzers for file formats such as GIF. Plus, the scope of the "unknown" format introduced is limited.
dmazzoni about 1 year ago

Why wouldn't you have an LLM write some code that uses something like libFuzzer instead?

That way you get an efficient, robust coverage-driven fuzzing engine, rather than having the LLM poorly reinvent the wheel on that part of the code. Let the LLM help write the boilerplate code for you.
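A minimal sketch of that division of labor, in Python via the Atheris binding to libFuzzer: the LLM writes only the small harness and target glue, while libFuzzer supplies the coverage-guided mutation engine. `parse_gif_header` here is a hypothetical stand-in for the code under test, not anything from the article.

```python
import sys

def parse_gif_header(data: bytes) -> bool:
    # Hypothetical code under test: accept only well-formed GIF signatures.
    if len(data) < 6:
        return False
    return data[:6] in (b"GIF87a", b"GIF89a")

def TestOneInput(data: bytes) -> None:
    # Entry point libFuzzer calls with each mutated input; crashes and
    # hangs here are what the fuzzer reports.
    parse_gif_header(data)

if __name__ == "__main__":
    # Atheris (pip install atheris) wraps libFuzzer's coverage-guided
    # engine for Python, so none of that machinery has to be reinvented.
    import atheris
    atheris.instrument_all()
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

Run as a script, this hands control to libFuzzer, which mutates inputs toward new coverage; the LLM's job reduces to writing `TestOneInput` and any format-aware seed inputs.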
aaron695 about 1 year ago

I don't understand why we are getting LLMs to generate *code* to create fuzzing data as a 'thing'.

Logically, LLMs should be quite good at creating the fuzzing data directly.

To state the obvious reason why: it's too expensive to use LLMs directly, and this way works, since they found "4 memory safety bugs and one hang".

But the future we are heading toward should be LLMs directly pentesting/testing the code. That is where it's interesting and new.