So a story: I've been a kernel hack since Unix V6, made a living doing it one way or another for over half my life ... learning to think about concurrency, time, interrupts, race conditions etc is hard, very hard - I got pretty good at it ... but then my career took a diversion, I designed chips for a decade or so, everything is concurrency, at the lowest levels .... after a while I came back to doing kernel stuff and found that with this new background all that hard stuff was trivial and obvious.<p>Mostly you just have to steep your brain in it for long enough
> However, printk can block (while allocating memory)<p>No, printk() is magic. It can be called even in NMI context, which is a worse place. Quoting <a href="https://lwn.net/Articles/800946/" rel="nofollow">https://lwn.net/Articles/800946/</a>, "[...] kernel code must be able to call printk() from any context. Calls from atomic context prevent it from blocking; calls from non-maskable interrupts (NMIs) can even rule out the use of spinlocks. [...]"
eBPF is honestly the first thing to try <i>before</i> writing a module.<p>I'm glad to see you used a VM. That's the first step in the right direction. Others have mentioned that you should've used printk(), which is true.<p>I'll mention that you can also run the kernel in a debugger: <a href="https://www.kernel.org/doc/html/latest/dev-tools/gdb-kernel-debugging.html" rel="nofollow">https://www.kernel.org/doc/html/latest/dev-tools/gdb-kernel-...</a>
Linux has some debug options that could have probably helped here. It's a good idea to enable them when developing new code.<p><a href="https://megous.com/dl/tmp/b6e8f550de4539a8.png" rel="nofollow">https://megous.com/dl/tmp/b6e8f550de4539a8.png</a>
Hi HN, this was my first attempt at writing any sort of kernel code. I would love to hear your thoughts on this experience and on the fixes I applied, especially from anyone with more Linux experience than me :)
I see the word “nightmare” used a lot in this article.<p>I wonder if I am the only one who loves debugging difficult/weird problems. It’s something like trying to solve a puzzle. Knowing that the system will never deceive me (it will not be the system’s fault if I get deceived), and that a perfectly reasonable explanation exists for what I observe, helps me not give up.
You probably already did this, but for the audience: one of the best ways to make sure you're using a function reasonably is to use elixir.bootlin.com to look at other uses and make sure you're using the function similarly. For instance, check out <a href="https://elixir.bootlin.com/linux/latest/A/ident/for_each_process" rel="nofollow">https://elixir.bootlin.com/linux/latest/A/ident/for_each_pro...</a> .
My knee-jerk reaction on reading this article and seeing a kernel module near 'nodejs' was to grumble and say "wtf they clearly didn't need a kernel module for this". But upon reading deeper I see that accessing the kernel is kinda appropriate.<p>Regardless of whether you end up using eBPF or a .ko like you already have, you may have a yet simpler option. By leveraging the loader you can do an interposition trick with LD_PRELOAD to hook C library accesses. Maybe this is all you need in order to "help students understand system calls such as open, close, dup2, fork, pipe, and others."<p>Just a suggestion. Carry on, good show.
Takes me back to the days of ATM device driver debugging.
I’ve written 9 kernel drivers.
All in all, a dedicated standalone terminal attached to the serial port of the target is still your best friend.
Great post, also love what you are trying to do with C playground, this is awesome!<p>I've recently been trying to build something similar, visualizing fork/execve/read/write calls, but using the strace output of a binary, which is much less challenging.
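For readers curious what the strace-based route might look like, here is a minimal sketch of turning one line of strace output into a structured event. It assumes the common `name(args) = ret` line shape and the `[pid N]` prefix that `strace -f` prints to the terminal. The struct and the function name `parse_strace_line` are hypothetical, not taken from any project mentioned in this thread. Unfinished/resumed lines, signal lines and exit lines are simply rejected.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical event record for illustration only. */
struct strace_event {
    long pid;       /* -1 when the line carries no "[pid N]" prefix */
    char name[32];  /* syscall name, e.g. "dup2" */
    long ret;       /* numeric return value */
};

/* Parses one strace line; returns 0 on success, -1 if the line
 * does not look like a completed "name(args) = ret" call. */
int parse_strace_line(const char *line, struct strace_event *ev)
{
    const char *paren, *eq = NULL, *p;
    char *end;
    size_t n, i;

    ev->pid = -1;
    if (strncmp(line, "[pid", 4) == 0) {
        const char *close = strchr(line, ']');
        if (!close || sscanf(line, "[pid %ld]", &ev->pid) != 1)
            return -1;
        line = close + 1;
        while (*line == ' ')
            line++;
    }

    /* The syscall name is everything before the first '('. */
    paren = strchr(line, '(');
    if (!paren || paren == line)
        return -1;
    n = (size_t)(paren - line);
    if (n >= sizeof(ev->name))
        return -1;
    for (i = 0; i < n; i++)
        if (!isalnum((unsigned char)line[i]) && line[i] != '_')
            return -1;
    memcpy(ev->name, line, n);
    ev->name[n] = '\0';

    /* Use the last " = " so string arguments containing it are skipped. */
    for (p = paren; (p = strstr(p, " = ")) != NULL; p += 3)
        eq = p;
    if (!eq)
        return -1;

    /* Base 0 also accepts hex results such as mmap addresses. */
    ev->ret = strtol(eq + 3, &end, 0);
    return (end == eq + 3) ? -1 : 0;
}
```

Feeding each line of the trace through a function like this yields a stream of (pid, syscall, result) tuples that a visualizer could consume. Argument parsing is deliberately left out of the sketch.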
Great story! I've had a lot of debugging nightmares, but thankfully never anything as bad as that.<p>One thing that looks fishy is this branch:<p><pre><code>if (container_tasks_len == max_container_tasks) {
    printk("cplayground: ERROR: container_tasks list hit capacity! We "
           "may be missing processes from the procfile output.\n");
    break;
}
</code></pre>
Since you said printk can block, why isn't calling it in the RCU critical section a bug? Is it because you immediately break afterwards and don't try to reference the next task?
Great article! Reminds me of when I was working on a bug in a phone kernel and adding its equivalent of printk() made the bug disappear! Lauterbach time!
Back in the Windows NT/2000 days, IIS executed as part of the kernel, debugging ISAPI extensions was an exercise in patience every time a programming error crashed the kernel and a reboot was in order.
Free Book <a href="https://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html" rel="nofollow">https://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html</a>
You can do most or all of that by reading /proc/<pid>/fdinfo/<fd> and /proc/<pid>/fd/<fd> or by making system calls on the affected fds (which you can do e.g. by injecting code with LD_PRELOAD or ptrace or with nsenter with fd namespace or equivalent C code).<p>Even if you write a kernel driver, iterating over all tasks in the system is a terrible design (there may be millions), not to mention "determining if a task belongs to a C playground program" in the kernel (obviously the kernel should have no knowledge about such specifics).<p>Of course, if a developer cannot even produce a reasonable overall design, it's not surprising that they aren't capable of writing correct code.
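As a rough illustration of the user-space route via /proc described above, the sketch below reads the `pos:` and `flags:` lines from /proc/&lt;pid&gt;/fdinfo/&lt;fd&gt;. The struct and the function names `read_fd_info` and `read_own_fd_info` are hypothetical helpers invented for this example. It assumes a Linux system with procfs mounted at /proc and sufficient permission to inspect the target process.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record for illustration only. */
struct fd_info {
    long long pos;        /* current file offset */
    unsigned long flags;  /* open flags; the kernel prints them in octal */
};

/* Reads /proc/<pid>/fdinfo/<fd>. Returns 0 on success, -1 if the file
 * cannot be opened or lacks the expected "pos:" and "flags:" lines. */
int read_fd_info(pid_t pid, int fd, struct fd_info *out)
{
    char path[64];
    char line[256];
    int have_pos = 0, have_flags = 0;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/fdinfo/%d", (long)pid, fd);
    f = fopen(path, "r");
    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f)) {
        /* Whitespace in the format also matches the tab after the colon. */
        if (sscanf(line, "pos: %lld", &out->pos) == 1)
            have_pos = 1;
        else if (sscanf(line, "flags: %lo", &out->flags) == 1)
            have_flags = 1;
    }
    fclose(f);
    return (have_pos && have_flags) ? 0 : -1;
}

/* Convenience wrapper for inspecting the calling process. */
int read_own_fd_info(int fd, struct fd_info *out)
{
    return read_fd_info(getpid(), fd, out);
}
```

The access mode can then be recovered with `info.flags & O_ACCMODE`, and the target of the descriptor with `readlink()` on /proc/&lt;pid&gt;/fd/&lt;fd&gt;. Whether this covers everything the kernel module reports is not something this sketch establishes.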