My First Kernel Module: A Debugging Nightmare

133 points by ksml over 4 years ago

16 comments

Taniwha over 4 years ago

So a story: I've been a kernel hack since Unix V6, and made a living doing it one way or another for over half my life ... learning to think about concurrency, time, interrupts, race conditions etc. is hard, very hard. I got pretty good at it ... but then my career took a diversion: I designed chips for a decade or so, where everything is concurrency, at the lowest levels ... after a while I came back to doing kernel stuff and found that with this new background all that hard stuff was trivial and obvious.

Mostly you just have to steep your brain in it for long enough.

cesarb over 4 years ago

> However, printk can block (while allocating memory)

No, printk() is magic. It can be called even in NMI context, which is a worse place. Quoting https://lwn.net/Articles/800946/, "[...] kernel code must be able to call printk() from any context. Calls from atomic context prevent it from blocking; calls from non-maskable interrupts (NMIs) can even rule out the use of spinlocks. [...]"

lallysingh over 4 years ago

eBPF is honestly the first thing to try before writing a module.

I'm glad to see you used a VM. That's the first step in the right direction. Others have mentioned that you should've used printk(), which is true.

I'll mention that you can also run the kernel in a debugger: https://www.kernel.org/doc/html/latest/dev-tools/gdb-kernel-debugging.html

megous over 4 years ago

Linux has some debug options that could have probably helped here. It's a good idea to enable them when developing new code.

https://megous.com/dl/tmp/b6e8f550de4539a8.png

ksml over 4 years ago

Hi HN, this was my first attempt at writing any sort of kernel code. I would love to hear your thoughts on this experience and on the fixes I applied, especially from anyone with more Linux experience than me :)

noncoml over 4 years ago

I see the word “nightmare” used a lot in this article.

I wonder if I am the only one who loves debugging difficult/weird problems. It’s something like trying to solve a puzzle. Knowing that the system will never deceive me (it will not be the system’s fault if I get deceived), and that a perfectly reasonable explanation exists for what I observe, helps me not give up.

sweettea over 4 years ago

You probably already did this, but for the audience: one of the best ways to make sure you're using a function reasonably is to use elixir.bootlin.com to look at other uses and make sure you're using the function similarly. For instance, check out https://elixir.bootlin.com/linux/latest/A/ident/for_each_process.

wyldfire over 4 years ago

My knee-jerk reaction on reading this article and seeing a kernel module near 'nodejs' was to grumble and say "wtf, they clearly didn't need a kernel module for this". But upon reading deeper I see that accessing the kernel is kinda appropriate.

Regardless of whether you end up using eBPF or a .ko like you already have, you may have a yet simpler option. By leveraging the loader you can do an interposition trick with LD_PRELOAD to hook C library accesses. Maybe this is all you need in order to "help students understand system calls such as open, close, dup2, fork, pipe, and others."

Just a suggestion. Carry on, good show.
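
For readers who have not seen the LD_PRELOAD trick, here is a minimal sketch of a shim that logs every close() call. The file name `shim.c` and the counter `shim_close_calls` are made up for illustration. To stay free of libdl, the sketch forwards to the kernel with a raw syscall(); the more common way to reach the real libc function is dlsym(RTLD_NEXT, "close").

```c
/* shim.c: hypothetical LD_PRELOAD shim that logs every close() call. */
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Number of close() calls seen so far (illustrative bookkeeping only). */
int shim_close_calls = 0;

/*
 * Same signature as libc's close(). When this object is preloaded, the
 * dynamic loader resolves the program's close() calls to this function
 * before it ever looks at libc.
 */
int close(int fd)
{
    shim_close_calls++;

    /* Go straight to the kernel so the hook cannot recurse into itself. */
    int ret = (int)syscall(SYS_close, fd);
    int saved_errno = errno;

    /* dprintf() writes directly to fd 2, bypassing stdio buffering. */
    dprintf(STDERR_FILENO, "[shim] close(%d) = %d\n", fd, ret);

    /* Restore errno so callers see the result of close(), not of logging. */
    errno = saved_errno;
    return ret;
}
```

Assuming gcc on Linux, this would be built with something like `gcc -shared -fPIC shim.c -o shim.so` and used as `LD_PRELOAD=./shim.so ./program`, where `./program` stands for whatever binary is being observed. Interposition only sees calls that go through the dynamic C library, so statically linked binaries and programs that issue raw system calls are not covered.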

egberts1 over 4 years ago

Takes me back to the days of ATM device driver debugging. I’ve written 9 kernel drivers. All in all, a dedicated standalone terminal attached to the serial port of the target is still your best friend.

lhoursquentin over 4 years ago

Great post, also love what you are trying to do with C playground, this is awesome!

I've recently been trying to build something similar, visualizing forks/execve/read/write, but using the strace output of a binary, which is much less challenging.
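
As a rough illustration of the strace route (not the commenter's actual code), the sketch below pulls the syscall name and return value out of one line of default strace output, which has the shape `name(args) = ret`. The struct and function names are hypothetical, and the sketch assumes no `-f` pid prefix; signal lines, exit lines, and unfinished or resumed calls are simply rejected.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* One completed syscall as printed by strace (illustrative type). */
struct strace_event {
    char name[32];  /* syscall name, e.g. "fork" or "execve" */
    long ret;       /* numeric return value printed after " = " */
};

/* Returns 0 on success, -1 if the line is not a completed syscall. */
int parse_strace_line(const char *line, struct strace_event *ev)
{
    /* The syscall name is everything before the first '('. */
    const char *paren = strchr(line, '(');
    if (paren == NULL || paren == line)
        return -1;

    size_t name_len = (size_t)(paren - line);
    if (name_len >= sizeof ev->name)
        return -1;
    for (size_t i = 0; i < name_len; i++) {
        if (!isalnum((unsigned char)line[i]) && line[i] != '_')
            return -1;
    }

    /* Use the last " = " so the same text inside a string argument is skipped. */
    const char *eq = NULL;
    for (const char *p = strstr(paren, " = "); p != NULL; p = strstr(p + 1, " = "))
        eq = p;
    if (eq == NULL)
        return -1;

    /* strace pads the result column, so skip spaces back to the closing ')'. */
    const char *q = eq;
    while (q > paren && *q == ' ')
        q--;
    if (*q != ')')
        return -1;

    /* Base 0 accepts both decimal results and hex ones such as mmap addresses. */
    char *end;
    long ret = strtol(eq + 3, &end, 0);
    if (end == eq + 3)
        return -1;  /* e.g. "= ?" when the process exited inside the call */

    memcpy(ev->name, line, name_len);
    ev->name[name_len] = '\0';
    ev->ret = ret;
    return 0;
}
```

A visualizer could feed each line of a trace through a parser like this and keep only the events it cares about (fork, execve, read, write, and so on).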

nosefrog over 4 years ago

Great story! I've had a lot of debugging nightmares, but thankfully never anything as bad as that.

One thing that looks fishy is this branch:

<pre><code>
if (container_tasks_len == max_container_tasks) {
    printk("cplayground: ERROR: container_tasks list hit capacity! We "
           "may be missing processes from the procfile output.\n");
    break;
}
</code></pre>

Since you said printk can block, why isn't calling it in the RCU critical section a bug? Is it because you immediately break afterwards and don't try to reference the next task?

secondcoming over 4 years ago

Great article! Reminds me of when I was working on a bug in a phone kernel and adding its equivalent of printk() made the bug disappear! Lauterbach time!

pjmlp over 4 years ago

Back in the Windows NT/2000 days, IIS executed as part of the kernel; debugging ISAPI extensions was an exercise in patience, since every programming error crashed the kernel and a reboot was in order.

known over 4 years ago

Free book: https://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html

foxhlchen over 4 years ago

Nice article, but I think OP should use debugfs instead of /proc. debugfs is designed for this purpose.

devit over 4 years ago

You can do most or all of that by reading /proc/<pid>/fdinfo/<fd> and /proc/<pid>/fd/<fd> or by making system calls on the affected fds (which you can do e.g. by injecting code with LD_PRELOAD or ptrace or with nsenter with fd namespace or equivalent C code).

Even if you write a kernel driver, iterating over all tasks in the system is a terrible design (there may be millions), not to mention "determining if a task belongs to a C playground program" in the kernel (obviously the kernel should have no knowledge about such specifics).

Of course, if a developer cannot even produce a reasonable overall design, it's not surprising that they aren't capable of writing correct code.
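
A minimal sketch of the /proc approach, assuming Linux with procfs mounted at /proc. The helper names `fd_target` and `fd_info` are invented for this example. Inspecting another process's descriptors this way generally requires running as the same user or with suitable privileges.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Resolve what descriptor `fd` of process `pid` refers to by reading the
 * symlink /proc/<pid>/fd/<fd>. The result looks like "/dev/null",
 * "pipe:[12345]" or "socket:[67890]". Returns the length, or -1 on error.
 */
ssize_t fd_target(pid_t pid, int fd, char *buf, size_t len)
{
    char path[64];

    if (len == 0)
        return -1;
    snprintf(path, sizeof path, "/proc/%ld/fd/%d", (long)pid, fd);

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    ssize_t n = readlink(path, buf, len - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}

/*
 * Copy the text of /proc/<pid>/fdinfo/<fd> into buf. The file holds
 * "pos:", "flags:" and "mnt_id:" lines, among others.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t fd_info(pid_t pid, int fd, char *buf, size_t len)
{
    char path[64];

    if (len == 0)
        return -1;
    snprintf(path, sizeof path, "/proc/%ld/fdinfo/%d", (long)pid, fd);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, len - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (ssize_t)n;
}
```

Two descriptors that resolve to the same `pipe:[inode]` string are ends of the same pipe, which is the kind of relationship a visualizer would want to draw.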