Interval Parsing Grammars for File Format Parsing (2023) [pdf]

63 points by vitplister, about 1 year ago

5 comments

kstenerud, about 1 year ago
I built a grammar to tackle these sorts of problems when I had trouble writing formal grammar notations for my binary data format. It's even got a syntax highlighter.

https://dogma-lang.org/

So far it's been able to describe 90% of what's out there. Some examples:

- 802.3 layer 2 Ethernet: https://github.com/kstenerud/dogma/blob/master/v1/examples/802.3_layer2.dogma
- Microsoft ICO format: https://github.com/kstenerud/dogma/blob/master/v1/examples/ico.dogma
- Android Dex v39: https://github.com/kstenerud/dogma/blob/master/v1/examples/dex_v39.dogma
- IPv4: https://github.com/kstenerud/dogma/blob/master/v1/examples/ipv4.dogma
- DNS query: https://github.com/kstenerud/dogma/blob/master/v1/examples/dns_query.dogma
- Microsoft Minidump: https://github.com/kstenerud/dogma/blob/master/v1/examples/minidump.dogma
- Concise Binary Encoding: https://github.com/kstenerud/concise-encoding/blob/master/cbe.dogma
- Concise Text Encoding: https://github.com/kstenerud/concise-encoding/blob/master/cte.dogma
nickpsecurity, about 1 year ago
Greg Morrisett is one of those people who is consistently involved in neat work. Typed x86, Cyclone, and SAFE architecture come to mind. It's best to just read all the papers of such people. I found a list of his:

https://www.cs.cornell.edu/~jgm/jgm.html
mdaniel, about 1 year ago
My searches for any existing publication of those code snippets didn't shake out, so I waited for the 5GB download of the docker .tar and pulled out the files that ended in .ipg. I'm very cognizant that's not the whole story, but that's what I had the energy to do for now. I really wanted to see the PDF one, since that's actually the heuristic I use for evaluating any such "I can describe binary files" framework because that file format is ... special

https://gist.github.com/mdaniel/cdf52de6a8aa8982d591da82b160a229

    tar -xOf ipg-pldi-ae.tar e4a01dbc5b413d9709f0cf716cdb848725893b5f97e0870e09fd83e16839dfad/layer.tar \
    | tar -xvf - \
      home/opam/pldi-ae/IPG/spec/dns/ipg/dns.ipg \
      home/opam/pldi-ae/IPG/spec/elf/ipg/elf.ipg \
      home/opam/pldi-ae/IPG/spec/elf/ipg/readelf.ipg \
      home/opam/pldi-ae/IPG/spec/gif/ipg/gif.ipg \
      home/opam/pldi-ae/IPG/spec/ipv4/ipg/ipv4.ipg \
      home/opam/pldi-ae/IPG/spec/pe/ipg/pe.ipg \
      home/opam/pldi-ae/IPG/spec/zip/blackbox/unzip.ipg \
      home/opam/pldi-ae/IPG/spec/zip/ipg/zip.ipg

Because Gist wouldn't let me use "/" in the filenames, I just replaced them with _ after killing the home/opam part; I left the rest so hopefully they'll show up in search results, since pldi-ae and IPG are pretty distinct.
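For readers who would rather script that extraction, here is a minimal sketch using Python's standard `tarfile` module. It is an illustration, not what the commenter ran: the function names `extract_ipg_members` and `gist_name` are hypothetical, and unlike the `tar` command above it selects files by the `.ipg` suffix rather than by an explicit list of paths. It assumes the image archive is an uncompressed or transparently compressed tar on disk that contains the layer as a nested tar member.

```python
import tarfile


def extract_ipg_members(image_tar, layer_member, suffix=".ipg"):
    """Return {path: bytes} for files ending in `suffix` inside a layer tar
    that is itself stored as a member of the image tar."""
    found = {}
    with tarfile.open(image_tar) as image:
        # Open the nested layer archive as a file object, without unpacking it to disk.
        layer_file = image.extractfile(layer_member)
        if layer_file is None:
            raise ValueError(f"{layer_member!r} is not a regular file in the archive")
        with tarfile.open(fileobj=layer_file) as layer:
            for member in layer:
                if member.isfile() and member.name.endswith(suffix):
                    found[member.name] = layer.extractfile(member).read()
    return found


def gist_name(path, prefix="home/opam/"):
    """Drop the leading prefix and replace '/' with '_', mirroring the
    renaming described in the comment (hypothetical helper)."""
    if path.startswith(prefix):
        path = path[len(prefix):]
    return path.replace("/", "_")
```

Under these assumptions, `gist_name("home/opam/pldi-ae/IPG/spec/dns/ipg/dns.ipg")` yields `pldi-ae_IPG_spec_dns_ipg_dns.ipg`, which keeps the distinctive `pldi-ae` and `IPG` path components searchable.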
khaledh, about 1 year ago
BinaryLang (for Nim) has similar features[1]. I've written a very compact ELF parser with it[2]. Notice that the last struct has array elements that skip over content based on offsets specified in the header.

[1] https://github.com/sealmove/binarylang

[2] https://github.com/khaledh/elfdump/blob/master/elfparse.nim
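The "skip over content based on offsets specified in the header" idea can be illustrated without BinaryLang. The following is a minimal Python sketch, not a port of the linked Nim parser: it assumes a little-endian ELF64 file, ignores the extended section numbering case, and the names `Section` and `read_sections` are made up for this example. The header supplies `e_shoff`, `e_shentsize`, and `e_shnum`; each section header in turn supplies the offset and size of its own content.

```python
import struct
from typing import NamedTuple

# ELF64 little-endian layouts (64 bytes each).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
ELF64_SECTION = struct.Struct("<IIQQQQIIQQ")
SHT_NOBITS = 8  # sections of this type occupy no bytes in the file


class Section(NamedTuple):
    sh_type: int
    offset: int
    size: int
    data: bytes


def read_sections(blob: bytes) -> list:
    """Follow the header's offsets to the section table, then to each section's bytes."""
    fields = ELF64_HEADER.unpack_from(blob, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    e_shoff, e_shentsize, e_shnum = fields[6], fields[11], fields[12]

    sections = []
    for i in range(e_shnum):
        # Jump straight to entry i of the table; nothing in between is scanned.
        entry = ELF64_SECTION.unpack_from(blob, e_shoff + i * e_shentsize)
        sh_type, sh_offset, sh_size = entry[1], entry[4], entry[5]
        data = b"" if sh_type == SHT_NOBITS else blob[sh_offset:sh_offset + sh_size]
        sections.append(Section(sh_type, sh_offset, sh_size, data))
    return sections
```

A call such as `read_sections(open(path, "rb").read())` returns one `Section` per table entry. The point of the sketch is that the parser never reads the file linearly: every slice is selected by a value parsed earlier.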
dataflow, about 1 year ago
Wow. The introduction itself blows my mind. I would've never thought of even trying to specify a formal grammar for binary file formats, let alone come up with an algorithm to handle context sensitive ones that arise in practice. Seems awesome, especially this bit:

> To the best of our knowledge, IPGs support all syntactic and parsing-based properties in common file formats and can reduce discrepancies between a file format specification and an implementation, as well as discrepancies between different implementations.
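To make "context sensitive" concrete: in many binary formats, how many bytes belong to a field depends on a length parsed just before it, which a context-free grammar cannot express directly. The sketch below is a toy illustration of that general idea in Python, not IPG syntax and not the paper's algorithm. The record layout (1-byte tag, 2-byte big-endian length, body) and the container tag `0x01` are invented for this example. Each nested parse is confined to the interval `[start, end)` that its parent computed from the length field.

```python
import struct

CONTAINER_TAG = 0x01  # hypothetical tag: the body is itself a sequence of records


def parse_records(buf: bytes, start: int, end: int) -> list:
    """Parse records lying entirely within buf[start:end].

    Each record is: tag (1 byte), length (unsigned 16-bit, big-endian), body.
    """
    records = []
    pos = start
    while pos < end:
        if pos + 3 > end:
            raise ValueError("truncated record header")
        tag, length = struct.unpack_from(">BH", buf, pos)
        body_start = pos + 3
        body_end = body_start + length  # the interval is decided by parsed data
        if body_end > end:
            raise ValueError("record body exceeds enclosing interval")
        if tag == CONTAINER_TAG:
            # The nested parse may not read past body_end.
            value = parse_records(buf, body_start, body_end)
        else:
            value = buf[body_start:body_end]
        records.append((tag, value))
        pos = body_end
    return records
```

Bounding each sub-parse by an explicit interval is what lets a malformed length be rejected at the point where it is read, rather than silently consuming bytes that belong to a sibling record.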