Identifying Authorship Style in Malicious Binaries: Techniques, Challenges

42 points by adulau over 4 years ago

2 comments

pattusk over 4 years ago
Reading through the paper, this reminds me of the close reading/distant reading paradigms in literary studies. They're effectively trying to build a machine learning model to reproduce the task of authorship attribution that analysts typically perform by hand. I have to give them kudos for their comprehensiveness in the feature engineering part and their attention to the numerous traps of authorship attribution (code reuse, multiple authors...).

Yet one question that is not really considered is the political stakes of authorship attribution. When you look at the "suspected locations" of the malware authors, it's quite clear that they're mostly located in rogue states. But we also know that some of these attributions can be politically motivated rather than empirically grounded (Sony hacks). In the same way that language models reproduce racist/sexist language, this model might thus reproduce geopolitical bias in its authorship attribution.
LambdaTrain over 4 years ago
Authorship style identification in natural language rests on good intuition that one can work with. However, such a notion in binary executables sounds like total nonsense. The only thing that might make it work is code reuse within a single organization, which is a classic feature to look at for malware detection.

So I have no idea what new information this arXiv paper provides other than introducing an academic topic full of fancy terminology.