TechEcho

9 comments

evanjrowley about 1 year ago

This is way over my head, but I was reminded of "The C language is purely functional" by Conal Elliott: <a href="http://conal.net/blog/posts/the-c-language-is-purely-functional" rel="nofollow">http://conal.net/blog/posts/the-c-language-is-purely-functional</a>

ksherlock about 1 year ago

The source code: <a href="https://github.com/appleseedlab/superc/">https://github.com/appleseedlab/superc/</a>

DriftRegion about 1 year ago

Figure 1 spoke to me. It's an expanded syntax tree that branches depending on the value of a preprocessor definition "CONFIG...X". I've often found myself doing the kind of code archeology that this paper seems to be trying to automate: exploring all the configuration possibilities implied by the codebase / build system. A C program that makes heavy use of the preprocessor is generally harder for both humans and static analysis to grok because 1. the C preprocessor syntax is different from C, 2. the inputs are not necessarily bounded by what appears in the source files alone ("-DCONFIG...X=foo" passed in from the build system), and 3. the resulting program and its control flow may be quite different depending on preprocessor options. As a simple example, embedded systems often define an "ASSERT(X)" macro as either a no-op, an infinite loop, a print statement or the like.

This is definitely a niche space, but I see a clear use in large, portable and configurable C codebases (e.g. Linux kernel, FreeRTOS) for providing better visibility into the configuration system.
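To make the "ASSERT(X)" example concrete, here is a minimal sketch of how one macro name can expand to three different programs. The `ASSERT_MODE` knob, the `assert_failures` counter and the `checked_divide` helper are all hypothetical names invented for this illustration; they are not taken from the paper or from any particular codebase.

```c
#include <stdio.h>

/* Hypothetical configuration knob. A build system might override it
   with something like -DASSERT_MODE=0 on the compiler command line. */
#ifndef ASSERT_MODE
#define ASSERT_MODE 1
#endif

/* Counts failed checks in the "print" configuration (illustration only). */
static int assert_failures = 0;

#if ASSERT_MODE == 0
/* No-op: the check vanishes and X is never evaluated. */
#define ASSERT(X) ((void)0)
#elif ASSERT_MODE == 1
/* Print statement: report the failure and keep running. */
#define ASSERT(X)                                              \
    do {                                                       \
        if (!(X)) {                                            \
            assert_failures++;                                 \
            fprintf(stderr, "ASSERT failed: %s (%s:%d)\n",     \
                    #X, __FILE__, __LINE__);                   \
        }                                                      \
    } while (0)
#else
/* Infinite loop: park the CPU so a debugger can be attached. */
#define ASSERT(X) do { if (!(X)) { for (;;) { } } } while (0)
#endif

/* Divides a by b; returns 0 on success and -1 if b is zero. */
static int checked_divide(int a, int b, int *out)
{
    ASSERT(b != 0);
    if (b == 0)
        return -1;
    *out = a / b;
    return 0;
}
```

The body of `checked_divide` is the same text in every configuration, yet its control flow differs: with `ASSERT_MODE` set to 0 the check is gone, with 1 it logs and continues, and with any other value a zero divisor never returns. A tool that only sees one preprocessed variant misses the other two.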

mncharity about 1 year ago

Fwiw, ~20 years ago my experience was that preprocessor use in open-source C code was very idiomatic, and iirc, a simple backtracking parser with idioms was sufficient to parse everything I tried it against, including the Linux kernel.

kazinator about 1 year ago

By the way, GNU Bison implements generalized LR (GLR) parsing by something that can be called "fork merge LR". The documentation states that Bison's GLR algorithm resolves ambiguities by forking parallel parses, which then merge. It's not the same as forking due to a preprocessor conditional, but worth mentioning.

mdaniel about 1 year ago

I am obviously not able to understand what specific problem this is solving, based on the title "Parsing All of C", when the preprocessor is apparently left intact by design:
<pre><code>static int mousedev_open(struct inode *inode, struct file *file)
{
  int i;

#ifdef CONFIG_INPUT_MOUSEDEV_PSAUX
  if (imajor(inode) == 10)
    i = 31;
  else
#endif
  i = iminor(inode) - 32;

  return 0;
}

(b) The preprocessed source preserving all configurations
</code></pre>
and my experience with C is that there are an untold number of "unbound" tokens that are designed to be injected by -D or auto-generated config.h files, so presumably this works closer to the "ready for compilation" phase versus something one could use to make tree-sitter better (as an example).
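One way to read the snippet quoted above is to write out by hand the two programs it stands for. The sketch below does that under stated assumptions: `struct inode_stub` and the simplified `imajor`/`iminor` helpers are hypothetical stand-ins for the kernel's types, the function names are made up, and each function returns `i` (the quoted code returns 0) purely so the difference is observable.

```c
/* Hypothetical stand-in for the kernel's struct inode. */
struct inode_stub {
    unsigned major;
    unsigned minor;
};

/* Simplified stand-ins for the kernel helpers of the same names. */
static unsigned imajor(const struct inode_stub *inode) { return inode->major; }
static unsigned iminor(const struct inode_stub *inode) { return inode->minor; }

/* Variant seen when CONFIG_INPUT_MOUSEDEV_PSAUX is defined:
   the assignment from iminor() is the else branch of the if. */
static int mousedev_index_psaux(const struct inode_stub *inode)
{
    int i;
    if (imajor(inode) == 10)
        i = 31;
    else
        i = (int)iminor(inode) - 32;
    return i;
}

/* Variant seen when CONFIG_INPUT_MOUSEDEV_PSAUX is not defined:
   the same assignment is a free-standing statement. */
static int mousedev_index_plain(const struct inode_stub *inode)
{
    int i;
    i = (int)iminor(inode) - 32;
    return i;
}
```

The statement `i = iminor(inode) - 32;` sits in a different place in the syntax tree depending on the configuration, because the `#ifdef` block ends on a dangling `else`. A parser that runs before the preprocessor cannot treat the conditional as a self-contained statement; it has to keep both readings alive, which appears to be what the quoted figure's "preserving all configurations" refers to.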

lacraig2 about 1 year ago

This looks really useful, but it seems like an uphill battle even reproducing given the lack of updates in almost the last decade.

kazinator about 1 year ago

> In exploring configuration-preserving parsing, we focus on performance.

Why, because this goose is so thoroughly cooked that all that is left is optimizing for speed?

There is a lot of misplaced focus on performance in CS academia, and also in software.

Suppose we have some accurate tool that does something useful with a C program, but it takes 5 minutes to run instead of 5 seconds. So what? Someone still wants to use it. Suppose the program is used by millions of people, and that 5 minute run only has to be repeated half a dozen times during development.

Get it right, and get it in people's hands should be the priorities, and not necessarily in that order.

dzdt about 1 year ago

This is (2012). I don't see that it has been discussed before here though. I guess it didn't make much of a splash.

SuperC: Parsing All of C by Taming the Preprocessor [pdf] (2012)
