Pnut: A C to POSIX shell compiler you can trust

193 points by feeley 10 months ago

26 comments

1vuio0pswjnm7 10 months ago
"Because Pnut can be distributed as a human-readable shell script (`pnut.sh`), it can serve as the basis for a reproducible build system. With a POSIX compliant shell, `pnut.sh` is sufficiently powerful to compile itself and, with some effort, [TCC](https://bellard.org/tcc/). Because TCC can be used to bootstrap GCC, this makes it possible to bootstrap a fully featured build toolchain from only human-readable source files and a POSIX shell.

Because Pnut doesn't support certain C features used in TCC, Pnut features a native code backend that supports a larger subset of C99. We call this compiler `pnut-exe`, and it can be compiled using `pnut.sh`. This makes it possible to compile `pnut-exe.c` using `pnut.sh`, and then compile TCC, all from a POSIX shell."

Is there anywhere we can see a step-by-step demo of this process?

Curious if the authors tried NetBSD or OpenBSD, or using another small C compiler, e.g., pcc.

Historically, tcc was problematic for NetBSD and its forks. Not sure about today, but tcc is *still* in NetBSD pkgsrc WIP, which suggests problems remain.

theamk 10 months ago
If you are wondering how it handles C-only functions... it does not.

open(..., O_RDWR | O_EXCL) -> runtime error: `echo "Unknow file mode" ; exit 1`

lseek(fd, 1, SEEK_HOLE); -> invalid code (uses undefined _lseek)

socket(AF_UNIX, SOCK_STREAM, 0); -> same (uses undefined _socket)

Looking closer at the "cp" and "cat" examples, the write() call does not handle errors at all. Forget about partial writes, it does not even return -1 on failures.

"Compiler you can Trust", indeed... maybe you can trust it to get all the details wrong?
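
For reference, the behavior the comment above expects from write() on a hosted POSIX system can be sketched as follows. This is a minimal illustration of handling errors and partial writes in plain C, not code from Pnut or its examples; the helper name `write_all` is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly `len` bytes from `buf` to `fd`.
 * Returns 0 on success, or -1 with errno set on failure. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return -1;      /* real error: report it to the caller */
        }
        p += n;             /* partial write: skip what was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would check the result, for example `if (write_all(1, msg, strlen(msg)) != 0) perror("write");`, which is the failure path the comment says is missing from the generated scripts.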

cozzyd 10 months ago
Can finally port systemd to shell to quell the rebellion.

okaleniuk 10 months ago
I love things like these because they shake our perception of normal loose. And who said our perception of normal doesn't deserve a good shake?

A C to shell compiler might seem impractical, but you know what is even more impractical? Having a separate language for a build system. And yet, here we are. Using Shell, Make or CMake to build a C program is only acceptable because it has always been so. It's a "perceived normality" in the C world.

There is no good reason, however, that CMake isn't a C library. With the build system being a library, we could write, read, and, most importantly, debug build scripts just like any other part of the buildable. We already have includeOS, why not includeMake?

wahern 10 months ago
This is very cool, regardless of how serious it was intended to be taken. Before base-64 encoders/decoders became more common as preinstalled commands in the environments I found myself on, I wrote a base64 utility in mostly pure POSIX shell:

    https://25thandClement.com/~william/2023/base64.sh

If this project had existed I might have opted to compile my C-based base-64 encoder and decoder routines, suitably tweaked for pnut's limitations.

I say base64.sh is mostly pure not because it relies on shell extensions, but because the only non-builtins it depends on are od(1) or, alternatively, dd(1) to assist with binary I/O. And preferably od(1), as reading certain control characters, like NUL, into a shell variable is especially dubious. The encoder is designed to operate on a stream of decimal encoded bytes. (See decimals_fast for using od to encode stdin to decimals, and decimals_slow for using dd for the same.)

It looks like pnut uses `read -r` for reading input. In addition to NULs and related raw byte issues, I was worried about chunking issues (e.g. truncation or errors) on binary data, e.g. no newlines within LINE_BUF bytes. Have you tested binary I/O much? Relatedly, how many different shell implementations have you tested your core scheme with? In addition to bash, dash, and various incarnations of /bin/sh on the BSDs, I also tested base64.sh with Solaris' system shells (ksh88 and ksh93 derivatives), as well as AIX's (ksh88 derivative). AIX had some odd quirks with pipelines even with plain text I/O. (Unfortunately Polar Home is gone now, so I have no easy way to play with AIX; maybe that's for the better.)
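
To make the "C-based base-64 encoder" idea above concrete, here is a small sketch of such a routine. It is not wahern's code and is not tailored to Pnut: it uses arrays and unsigned types, which other comments in this thread report as unsupported, so it assumes an ordinary hosted C99 compiler. The names `B64_ALPHABET` and `base64_encode` are hypothetical.

```c
#include <stddef.h>

static const char B64_ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode `len` bytes from `in` as base64 into `out`, NUL-terminated.
 * `out` must have room for 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the number of characters written, excluding the NUL. */
size_t base64_encode(const unsigned char *in, size_t len, char *out)
{
    size_t i = 0, o = 0;

    /* Each full 3-byte group becomes 4 output characters. */
    while (len - i >= 3) {
        unsigned long v = ((unsigned long)in[i] << 16)
                        | ((unsigned long)in[i + 1] << 8)
                        | (unsigned long)in[i + 2];
        out[o++] = B64_ALPHABET[(v >> 18) & 63];
        out[o++] = B64_ALPHABET[(v >> 12) & 63];
        out[o++] = B64_ALPHABET[(v >> 6) & 63];
        out[o++] = B64_ALPHABET[v & 63];
        i += 3;
    }

    /* A trailing 1- or 2-byte group is padded with '='. */
    if (len - i == 1) {
        unsigned long v = (unsigned long)in[i] << 16;
        out[o++] = B64_ALPHABET[(v >> 18) & 63];
        out[o++] = B64_ALPHABET[(v >> 12) & 63];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {
        unsigned long v = ((unsigned long)in[i] << 16)
                        | ((unsigned long)in[i + 1] << 8);
        out[o++] = B64_ALPHABET[(v >> 18) & 63];
        out[o++] = B64_ALPHABET[(v >> 12) & 63];
        out[o++] = B64_ALPHABET[(v >> 6) & 63];
        out[o++] = '=';
    }

    out[o] = '\0';
    return o;
}
```

Because the routine only ever sees byte values as integers, it sidesteps the NUL-in-a-shell-variable problem the comment describes; the hard part in shell is getting those integers in and out, not the encoding itself.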

voidUpdate 10 months ago
When I'm told that "I can trust" something that I feel like I had no reason to distrust, it makes me feel even more suspicious of it.

akoboldfrying 10 months ago
I was puzzled by the example C function containing pointers. Do I understand correctly that you implement pointers in shell by having a shell variable _0 for the first "byte" of "memory", a shell variable _1 for the second, etc.?
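
The scheme the question describes can be modeled in C itself. The sketch below is only an illustration of "memory as numbered cells, pointers as plain integers"; the thread does not confirm that Pnut works exactly this way, and every name here (`heap`, `cell_alloc`, `cell_load`, `cell_store`, `cell_sum`) is hypothetical.

```c
#define HEAP_CELLS 1024

/* Flat "memory": cell k plays the role of a shell variable named _k. */
static int heap[HEAP_CELLS];
static int heap_next = 1;   /* address 0 is reserved so it can act as NULL */

/* Reserve `n` consecutive cells and return the address of the first,
 * or 0 if the request cannot be satisfied. An "address" is just an int. */
int cell_alloc(int n)
{
    int addr;

    if (n <= 0 || n > HEAP_CELLS - heap_next)
        return 0;
    addr = heap_next;
    heap_next += n;
    return addr;
}

/* Counterparts of `*addr` and `*addr = value` when a pointer is an integer. */
int cell_load(int addr)               { return heap[addr]; }
void cell_store(int addr, int value)  { heap[addr] = value; }

/* Sum `len` cells starting at `base`. `base + i` is ordinary integer
 * arithmetic, which is all pointer arithmetic amounts to in this model. */
int cell_sum(int base, int len)
{
    int total = 0;
    int i;

    for (i = 0; i < len; i++)
        total += cell_load(base + i);
    return total;
}
```

Under this model an `int` parameter and an `int*` parameter are indistinguishable at run time, since both are integers, which would be consistent with the identical output for `test1` and `test2` reported elsewhere in this thread.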

rubicks 10 months ago
I can't wait to see the shell equivalents for ptrace, setjmp, and dlopen.

metadat 10 months ago
Also see this related submission from May 2024:

*Amber: Programming language compiled to Bash* https://news.ycombinator.com/item?id=40431835 (318 comments)

---

Pnut doesn't seem to differentiate between `int` and `int*` function parameters. That's weird, and doesn't come across as trustworthy at all! Shouldn't the use of pointers be disallowed instead?

```
int test1(int a, int len) {
  return a;
}

int test2(int* a, int len) {
  return a;
}
```

Both compile to the exact same thing:

```
: $((len = a = 0))
_test1() {
  let a $2; let len $3
  : $(($1 = a))
  endlet $1 len a
}

: $((len = a = 0))
_test2() {
  let a $2; let len $3
  : $(($1 = a))
  endlet $1 len a
}
```

The "runtime library" portion at the bottom of every script is nigh unreadable.

Even still, it's a cool concept.

teo_zero 10 months ago
Just to be clear, the input must be written in a subset of C, because many constructs are not recognized, like unsigned types, static variables, [] arrays, etc.

Is there a plan to remove such limitations?

itvision 10 months ago
Instantly make your C code 200 times slower without any effort!

andrewf 10 months ago
Looking forward to the point where this can build autoconf. It's great that the generated ./configure script is portable, but if I want to make substantial changes to the project I need to find an autoconf binary for my machine (and version differences can be quite substantial).

kazinator 10 months ago
This is not useful if it doesn't call external libraries.

Even POSIX standard ones. Chokes on:

```
#include <glob.h>

int main() // must be (); (void) results in syntax error.
{
  glob_t gb; // syntax error here
  glob("abc", 0, NULL, &gb);
  return 0;
}
```

Nobody needs entirely self-contained C programs with no libraries to be turned into shell scripts; Unix people switch to C when there is a library function they need to call for which there is no command in /bin or /usr/bin.

If I reduce it to:

```
#include <glob.h>

int main()
{
  glob("abc", 0, NULL, 0);
  return 0;
}
```

it "compiles" into something with a main function like:

```
_main() {
  defstr __str_0 "abc"
  _glob __ $__str_0 0 $_NULL 0
  : $(($1 = 0))
}
```

but what good is that without a definition of _glob?
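
For comparison, this is roughly what the library call in the comment above looks like when built with an ordinary hosted compiler against the POSIX C library. It is a sketch of standard `glob()` usage, not something Pnut is reported to handle; the wrapper name `count_matches` is hypothetical.

```c
#include <glob.h>
#include <stddef.h>

/* Hypothetical wrapper: count the paths matching a shell-style pattern.
 * Returns the match count (0 if nothing matches) or -1 on other errors. */
long count_matches(const char *pattern)
{
    glob_t gb;
    long count;
    int rc = glob(pattern, 0, NULL, &gb);

    if (rc == GLOB_NOMATCH)
        return 0;       /* no path matched the pattern */
    if (rc != 0)
        return -1;      /* GLOB_NOSPACE or GLOB_ABORTED; no cleanup attempted here */

    count = (long)gb.gl_pathc;
    globfree(&gb);      /* release the path list allocated by glob() */
    return count;
}
```

The work is done entirely inside the C library, which is the point of the comment: without a shell-side definition of `_glob`, a translated script has nothing to call.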

forrestthewoods 10 months ago
Hrmmm. But why?

Quite frankly I think Bash scripting is awful and frequently wish shell scripts were written in a real and debuggable language. For anything non-trivial that is.

I feel like I’d rather write C and compile it with Cosmopolitan C to give me a cross-platform binary than this.

Neat project. Definitely clever. But it’s headed in the opposite direction from what I’d prefer...

vermon 10 months ago
If the end goal is portability for C, would Cosmopolitan Libc be a better choice because it supports a lot more features and probably runs faster?

iod 10 months ago
I am sorry if this comes off as negative, but every example provided on the site, when compiled and then fed into ShellCheck¹, generates warnings about non-portable and ambiguous constructs in the script. What exactly are we supposed to trust?

¹ https://www.shellcheck.net

osmsucks 10 months ago
I'm writing something similar, but it's based on its own scripting language. The idea of transpiling C sounds appealing but impractical: how do they plan to compile, say, things using mmap, setjmp, pthreads, ...? It would be better to clearly promise only a restricted subset of C.

kxndnenfn 10 months ago
This is quite interesting! Without having dug deeper into it, seeing the human readable output I assume quite different semantics from C?

The C to shell transpiler I'm aware of will output unreadable code (elvm using 8cc with sh backend).

dsp_person 10 months ago
I use linux-vt-setcolors in my startup, which would be a bit more convenient if it was a shell script instead of C, but it uses ioctl.

Trying to compile with this tool fails with "comp_glo_decl: unexpected declaration".

Retr0id 10 months ago
Can it do wrapping arithmetic?

The `sum` example doesn't seem to do wrapping, but signed int overflow is technically UB so I guess they're fine not to.

Switching it to `unsigned int` gives me:

code.c:1:1 syntax error: unsupported type
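
The distinction drawn above can be shown in standard C: unsigned arithmetic is defined to wrap, while signed overflow is undefined, so wrapping signed addition has to be built from unsigned operations. This sketch assumes a hosted compiler with `<stdint.h>` and says nothing about what Pnut generates; the function names `add_wrap_u32` and `add_wrap_i32` are hypothetical.

```c
#include <stdint.h>

/* Unsigned addition wraps modulo 2^32 by definition. */
uint32_t add_wrap_u32(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Signed addition with two's-complement wraparound and no undefined
 * behavior: add as unsigned, then map the result back explicitly. */
int32_t add_wrap_i32(int32_t a, int32_t b)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b;

    if (sum <= INT32_MAX)
        return (int32_t)sum;
    /* sum is in [2^31, 2^32 - 1]: subtract 2^31 in unsigned arithmetic,
     * then shift into the negative range without ever overflowing. */
    return (int32_t)(sum - INT32_MAX - 1) + INT32_MIN;
}
```

A translator that maps C `int` onto shell arithmetic, whose integers are at least as wide as a C `long` under POSIX, would need something like an explicit mask or the mapping above to reproduce 32-bit wraparound.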

yencabulator 10 months ago
It seems to have practically no error checking. Try compiling

```
int why(int unused) {
  wat_why_does_this_compile;
  no_error_checking();
}
```

atilaneves 10 months ago
I'm still figuring out why anyone would want to write a shell script in C. That sounds like torture to me.

JoshTriplett 10 months ago
Several times I've found myself wishing for the reverse: a shell-to-binary compiler or JIT.

layer8 10 months ago
Can you trust that it faithfully reproduces undefined behavior? ;)

gojomybeloved 10 months ago
Love this!

o11c 10 months ago
It's a bad sign when I immediately look at the screenshot and see quoting bugs.