Writing C software without the standard library

492 points by andxor over 8 years ago

29 comments

userbinator over 8 years ago
<i>A value in the range between -4095 and -1 indicates an error, it is -errno.</i><p>The syscall&#x2F;errno stuff has always seemed unusual, inelegant, and inefficient --- instead of just returning a negative error code directly, the function returns the vague &quot;an error has occurred&quot; -1, and you have to then check errno separately after that. It only adds insult to injury when you realise that the kernel itself isn&#x27;t doing it, but the syscall wrappers. And thanks to POSIX standardising this mechanism, the alternative will likely never get much adoption; of course, if you write your own syscall wrappers like this article, then you can skip that bloat.<p><i>For now this guide is linux-only, but I will be writing a windows version when I feel like firing up a virtual machine.</i><p>Unfortunately the Windows syscalls are not officially documented and even less stable than on Linux, changing even between service packs.<p><a href="http:&#x2F;&#x2F;j00ru.vexillium.org&#x2F;ntapi&#x2F;" rel="nofollow">http:&#x2F;&#x2F;j00ru.vexillium.org&#x2F;ntapi&#x2F;</a><p><a href="http:&#x2F;&#x2F;j00ru.vexillium.org&#x2F;ntapi_64&#x2F;" rel="nofollow">http:&#x2F;&#x2F;j00ru.vexillium.org&#x2F;ntapi_64&#x2F;</a><p>At least on Linux the first few (i.e. the oldest, most common and useful) syscalls have not really moved around over the years:<p><a href="https:&#x2F;&#x2F;filippo.io&#x2F;linux-syscall-table&#x2F;" rel="nofollow">https:&#x2F;&#x2F;filippo.io&#x2F;linux-syscall-table&#x2F;</a>
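The convention quoted above (a raw return value between -4095 and -1 is `-errno`) can be checked with a single unsigned comparison. This is a minimal sketch, not code from the article: the helper names `raw_is_error` and `raw_errno` are made up for illustration, and the -4095 bound is taken from the quoted text.

```c
/* Hypothetical helpers; the names are not from the article or any libc. */

/* A raw Linux syscall return value in [-4095, -1] encodes -errno.
   Viewed as unsigned, that range is the top 4095 values of the type. */
static int raw_is_error(long ret)
{
    return (unsigned long)ret >= (unsigned long)-4095L;
}

/* Returns the errno value carried by a failing raw return, or 0 on success. */
static int raw_errno(long ret)
{
    return raw_is_error(ret) ? (int)-ret : 0;
}
```

The unsigned comparison matters for calls such as mmap, whose successful results can be large addresses that a plain `ret < 0` test might misread as failures.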
vxNsr over 8 years ago
He claims that your code will be easy to port but then goes straight to Linux system calls.<p>Still, I like the idea. This is something that should be covered in a CS 102 type course. I know way too many CS guys who have no idea how to debug, let alone how their code is being implemented.
leeter over 8 years ago
A few thoughts:<p>* The space savings are moot, as other processes such as daemons are going to load libc into virtual memory anyway, and the kernel shares libc&#x27;s pages among all processes.<p>* This adds a lot of LOC you have to maintain instead of shoving it off on the compiler&#x2F;libc vendor, which increases the chance of bugs.<p>* This will prevent the use of vDSOs to optimize high-volume system calls like gettimeofday.<p>* It&#x27;s still probably good to know how these happen, even if you&#x27;re not doing them yourself.<p>* The only place this would really see benefit is in a single-process environment; however, in those cases I would suggest a unikernel anyway for simplicity&#x27;s sake.
dvfjsdhgfv over 8 years ago
The guy is definitely a fan of old-school minimalism: <a href="http:&#x2F;&#x2F;weeb.ddns.net&#x2F;0&#x2F;articles&#x2F;modern_software_is_at_its_worst.txt" rel="nofollow">http:&#x2F;&#x2F;weeb.ddns.net&#x2F;0&#x2F;articles&#x2F;modern_software_is_at_its_wo...</a> I have to say I miss the old days of Gopher, too. It was so much easier to focus on the content back then.
nwmcsween over 8 years ago
The comment section where gcc puts in ident info can be omitted with -fno-ident and syscall(2) is usually a very thin wrapper[0]. If you follow the musl syscall(2) it simply maps errors to errno[1] and uses the fancy count-args-in-macro[2] to call off the respective $arch&#x2F;syscall_arch.h[3] syscall$n numbered functions.<p>[0] <a href="https:&#x2F;&#x2F;git.musl-libc.org&#x2F;cgit&#x2F;musl&#x2F;tree&#x2F;src&#x2F;misc&#x2F;syscall.c" rel="nofollow">https:&#x2F;&#x2F;git.musl-libc.org&#x2F;cgit&#x2F;musl&#x2F;tree&#x2F;src&#x2F;misc&#x2F;syscall.c</a><p>[1] <a href="https:&#x2F;&#x2F;git.musl-libc.org&#x2F;cgit&#x2F;musl&#x2F;tree&#x2F;src&#x2F;internal&#x2F;syscall_ret.c" rel="nofollow">https:&#x2F;&#x2F;git.musl-libc.org&#x2F;cgit&#x2F;musl&#x2F;tree&#x2F;src&#x2F;internal&#x2F;syscal...</a><p>[2] <a href="https:&#x2F;&#x2F;git.musl-libc.org&#x2F;cgit&#x2F;musl&#x2F;tree&#x2F;src&#x2F;internal&#x2F;syscall.h" rel="nofollow">https:&#x2F;&#x2F;git.musl-libc.org&#x2F;cgit&#x2F;musl&#x2F;tree&#x2F;src&#x2F;internal&#x2F;syscal...</a><p>[3] <a href="https:&#x2F;&#x2F;git.musl-libc.org&#x2F;cgit&#x2F;musl&#x2F;tree&#x2F;arch&#x2F;x86_64&#x2F;syscall_arch.h" rel="nofollow">https:&#x2F;&#x2F;git.musl-libc.org&#x2F;cgit&#x2F;musl&#x2F;tree&#x2F;arch&#x2F;x86_64&#x2F;syscall...</a>
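The count-args-in-macro trick mentioned above can be sketched in a few lines. This is an illustration of the general technique only: the macro and function names (`NARGS`, `DEMO_CALL`, `demo_call0` and so on) are invented here and are not musl's, and the stub functions merely report how many arguments followed the call number instead of entering the kernel.

```c
/* Hypothetical names throughout; musl's real macros differ in detail. */

/* NARGS(num, a, b, ...) expands to how many arguments follow the first one.
   The variable arguments shift the number list so the right digit lands on n. */
#define NARGS_X(a, b, c, d, e, f, g, n, ...) n
#define NARGS(...) NARGS_X(__VA_ARGS__, 6, 5, 4, 3, 2, 1, 0, )

/* Two-step concatenation so NARGS(...) is expanded before pasting. */
#define CONCAT_X(a, b) a##b
#define CONCAT(a, b) CONCAT_X(a, b)

/* DEMO_CALL(num, ...) dispatches to demo_call0, demo_call1 or demo_call2. */
#define DEMO_CALL(...) CONCAT(demo_call, NARGS(__VA_ARGS__))(__VA_ARGS__)

/* Stand-ins for per-arity syscall stubs; each returns its extra-argument count. */
static int demo_call0(long n) { (void)n; return 0; }
static int demo_call1(long n, long a) { (void)n; (void)a; return 1; }
static int demo_call2(long n, long a, long b) { (void)n; (void)a; (void)b; return 2; }
```

With this in place, `DEMO_CALL(60)` expands to `demo_call0(60)` and `DEMO_CALL(1, 10, 20)` to `demo_call2(1, 10, 20)`, so the caller never spells out the arity.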
beeforpork over 8 years ago
If the asm was written a little more cleverly, the syscalls would avoid almost all moves, because the compiler&#x27;d put everything in place:<p><pre><code>_syscall5:
    mov %r9, %r10    # 6th C argument (D) becomes syscall argument 4
_syscall3:
    mov %rcx, %rax   # 4th C argument (NUM) is the syscall number
    syscall
    ret
</code></pre> And then:<p><pre><code>extern unsigned long _syscall3(
    unsigned long, unsigned long, unsigned long, unsigned long);
extern unsigned long _syscall5(
    unsigned long, unsigned long, unsigned long, unsigned long,
    unsigned long, unsigned long);

/* NUM is passed 4th so it arrives in %rcx; A, B, C are already in %rdi, %rsi, %rdx. */
#define syscall0(NUM)           _syscall3(0,0,0,NUM)
#define syscall1(NUM,A)         _syscall3(A,0,0,NUM)
#define syscall2(NUM,A,B)       _syscall3(A,B,0,NUM)
#define syscall3(NUM,A,B,C)     _syscall3(A,B,C,NUM)
#define syscall4(NUM,A,B,C,D)   _syscall5(A,B,C,NUM,0,D)
#define syscall5(NUM,A,B,C,D,E) _syscall5(A,B,C,NUM,E,D)</code></pre>
oso2k over 8 years ago
A little self-promotion, but mostly because it addresses some of the other commenters&#x27; concerns about malloc (or the lower-level API around sbrk): a couple of years ago I wrote rt0 [0], a small (mostly minimal) C runtime for i386 &amp; amd64 that makes it easier to replace libc &amp; crt0 (as long as you have the kernel headers installed). Also, as part of the examples, I wrote wrappers around the sbrk syscall. Pretty easy to do and all documented in the repo. I expect to eventually port the lib to ARM (Raspberry Pi) and aarch64. There are also lots of references to other small C runtimes. I&#x27;ll be adding this one as well.<p>[0] <a href="https:&#x2F;&#x2F;github.com&#x2F;lpsantil&#x2F;rt0" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;lpsantil&#x2F;rt0</a>
pawadu over 8 years ago
I think this is unnecessary when you have &lt;stdint.h&gt;:<p><pre><code>typedef unsigned long int u64;
typedef unsigned int u32;
...
</code></pre> If you define your own types like this, you may need to revise them when you switch architecture or even compiler.<p>Now you could argue that this is part of the standard library, but I actually see it as a part of the standard C language.
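The point about `<stdint.h>` can be made concrete. The header is one of those a freestanding C implementation has to provide, so it stays available when libc is left out. Below is a minimal sketch, assuming a C11 compiler for `_Static_assert`; the alias names `u64` and `u32` are the ones from the snippet quoted above.

```c
#include <limits.h>  /* CHAR_BIT; also a freestanding header */
#include <stdint.h>  /* fixed-width integer types */

/* Same short names as the hand-written typedefs, but the widths no
   longer depend on what `long` or `int` happen to be on the target. */
typedef uint64_t u64;
typedef uint32_t u32;

/* Compile-time checks: the build fails instead of silently changing size. */
_Static_assert(sizeof(u64) * CHAR_BIT == 64, "u64 must be 64 bits wide");
_Static_assert(sizeof(u32) * CHAR_BIT == 32, "u32 must be 32 bits wide");
```

By contrast, `unsigned long int` is 64 bits on x86-64 Linux but 32 bits on 64-bit Windows, which is the kind of revision the comment warns about.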
nathan_f77 over 8 years ago
&gt; When we learn C, we are taught that main is the first function called in a C program. In reality, main is simply a convention of the standard library.<p>Well, I&#x27;ve already learned something new. I assumed that convention was from the compiler. This is a great resource.
gibsjose over 8 years ago
While this seems mainly useful as an academic exercise, the `printf &quot;#include &lt;unistd.h&gt;&quot; | gcc -E - | grep size_t` bit to easily grep in header files was worth the read.
rikkus over 8 years ago
It&#x27;s interesting to read the sources[1] of lots of djb&#x27;s[2] code, as he often works around problems with (or perhaps dislikes the style of) standard libraries by re-implementing parts.<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;abh&#x2F;djbdns&#x2F;blob&#x2F;master&#x2F;str_len.c" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;abh&#x2F;djbdns&#x2F;blob&#x2F;master&#x2F;str_len.c</a> [2] <a href="http:&#x2F;&#x2F;cr.yp.to&#x2F;djb.html" rel="nofollow">http:&#x2F;&#x2F;cr.yp.to&#x2F;djb.html</a>
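Reimplementing a small libc routine along these lines takes only a few lines of code. The following is a sketch in the same spirit as the linked str_len.c, not a copy of it; the name `my_str_len` is made up for this example.

```c
#include <stddef.h>  /* size_t; a freestanding header, so no libc is required */

/* Hypothetical strlen replacement: counts bytes before the terminating NUL. */
static size_t my_str_len(const char *s)
{
    const char *t = s;
    while (*t)
        ++t;
    return (size_t)(t - s);
}
```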
capnfantasic over 8 years ago
Fantastic until you need to malloc. You&#x27;re reimplementing libc, but at least you know what&#x27;s going on at every level.
coreyp_1 over 8 years ago
It&#x27;s posts like this (and the accompanying comments) that make me realize how much I still have left to learn!<p>One of the reasons that I love HN is how informative you all are!
lolisamurai over 8 years ago
The server is getting hit pretty hard right now, did not expect this much traffic. In the meantime, you can find a bbcode mirror of the guide here: <a href="https:&#x2F;&#x2F;ccplz.net&#x2F;threads&#x2F;writing-c-software-without-the-standard-library-linux-edition.69623&#x2F;" rel="nofollow">https:&#x2F;&#x2F;ccplz.net&#x2F;threads&#x2F;writing-c-software-without-the-sta...</a>
kriro over 8 years ago
Some of the reasons that he mentions for avoiding the standard library could also be mitigated by using another library like dietlibc (I played around with it back in the day, last release seems to be from 2013): <a href="https:&#x2F;&#x2F;www.fefe.de&#x2F;dietlibc&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.fefe.de&#x2F;dietlibc&#x2F;</a>
partycoder over 8 years ago
This is required if you do systems programming (e.g., kernel development).
flukus over 8 years ago
&gt; Executables are incredibly small (the http mirror server for my gopherspace is powered by a 10kb executable).<p>Is this ever a real issue, even on any embedded system in the last 20 years?
nitwit005 over 8 years ago
I&#x27;ve tried this myself. What you&#x27;ll run into is that you tend to need a few things that are non-trivial:<p>An implementation of malloc&#x2F;free<p>Functions to parse and print floats (somewhat system dependent)<p>Assembly implementations of any trigonometric functions used<p>While there is code that goes to that effort (The Go runtime comes to mind), it&#x27;s quite a pain for &quot;normal&quot; code.
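To give a sense of the smallest possible starting point for the first item, here is a bump allocator sketch over a fixed static buffer. It is an illustration under stated assumptions, not what the article or any commenter uses: the names `arena` and `bump_alloc` and the 4096-byte size are invented, there is no free, and a real replacement would obtain memory from the kernel with brk or mmap. It assumes C11 for `_Alignas`.

```c
#include <stddef.h>  /* size_t, NULL; a freestanding header */

/* Hypothetical fixed-size arena standing in for memory obtained from the kernel. */
static _Alignas(16) unsigned char arena[4096];
static size_t arena_used;

/* Returns 16-byte-aligned memory from the arena, or NULL when it is exhausted.
   Nothing is ever freed; this is the simplest scheme that can work at all. */
static void *bump_alloc(size_t size)
{
    /* Round the current offset up to the next multiple of 16. */
    size_t start = (arena_used + 15u) & ~(size_t)15u;

    /* start never exceeds sizeof(arena), so the subtraction cannot wrap. */
    if (size > sizeof(arena) - start)
        return NULL;

    arena_used = start + size;
    return &arena[start];
}
```

Supporting free, reuse of released blocks, and growth beyond the initial buffer is where most of the non-trivial work mentioned above comes in.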
bogomipz over 8 years ago
I had a question about this sentence:<p>&quot;It&#x27;s often necessary to either push useless data or simply align the stack pointer when the pushed values don&#x27;t happen to be aligned.&quot;<p>That&#x27;s kind of hand-wavy. How do we &quot;simply align the stack pointer&quot;?
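One common answer, which is assumed here to be what the article has in mind, is to mask off the low bits of the stack pointer, for example with `and rsp, -16` on x86-64. Because the stack grows toward lower addresses, rounding down only moves the pointer further into unused space. The arithmetic is sketched in C below; the function name `align_down_16` is hypothetical.

```c
#include <stdint.h>  /* uintptr_t */

/* Rounds an address down to a multiple of 16 by clearing its low four bits.
   This is the same operation `and rsp, -16` applies to the stack pointer. */
static uintptr_t align_down_16(uintptr_t addr)
{
    return addr & ~(uintptr_t)15;
}
```

For instance, an address ending in 0x38 becomes one ending in 0x30, while an address that is already a multiple of 16 is left unchanged.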
sytelus over 8 years ago
It would be nice to wrap this up in a lightweight libc. There is uSTL for C++: <a href="https:&#x2F;&#x2F;msharov.github.io&#x2F;ustl" rel="nofollow">https:&#x2F;&#x2F;msharov.github.io&#x2F;ustl</a>
jxy over 8 years ago
It&#x27;s a very good learning process. But once your project scales up, you are essentially writing your own libc.<p>And there is no portability. It only works with the specific architecture&#x27;s calling convention and the specific C compiler.
dispose13432 over 8 years ago
&gt; xor rbp,rbp &#x2F;* xoring a value with itself = 0 *&#x2F;<p>Is this faster than a (const) mov ?
thewavelength over 8 years ago
What is necessary to do this with C++? Is there a tutorial available on the web?
DaiPlusPlus over 8 years ago
Your first paragraph makes me wish this site supported Markdown.
clifanatic over 8 years ago
Interesting - my McAfee web washer blocked this site. Don&#x27;t know why.
taocipian over 8 years ago
The C standard library is not perfect, but it is good enough.
00k over 8 years ago
An essential function of the standard library is to wrap the syscalls. Besides that, you can make a living without the library. But why would you do that?
eliangidoni over 8 years ago
I can&#x27;t believe this post has 409 points. Are we in the 80&#x27;s again?
SFJulie over 8 years ago
Myth busted: printf(&quot;Hello world&quot;) is simple and is a relevant C program for a beginner.<p>The &quot;hello world&quot; example is just the first step in annihilating your capacity to understand how things work, by relying on institutional black magic that may be wrong<p>(see all the scanf bugs that have been living in C code for so long, and all the bugs that come from respecting the old man&#x27;s wisdom)