This is a great post. Recently I have been trying to re-learn and understand Linux (specifically Ubuntu) using monitoring tools.
In my opinion htop and Facebook's osquery are the two best available tools for understanding how an operating system and its processes work. The osquery approach of recording all OS data in the form of relational tables (with PIDs as keys, etc.) is very useful.<p><a href="https://hisham.hm/htop/" rel="nofollow">https://hisham.hm/htop/</a><p><a href="https://osquery.io/" rel="nofollow">https://osquery.io/</a><p>The osquery query packs are especially useful: <a href="https://osquery.io/docs/packs/" rel="nofollow">https://osquery.io/docs/packs/</a><p>Here is an incomplete draft of a similar post:
<a href="https://github.com/AKSHAYUBHAT/TopDownGuideToLinux" rel="nofollow">https://github.com/AKSHAYUBHAT/TopDownGuideToLinux</a>
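<p>To make the relational idea above concrete, here is a small Python sketch that uses the standard `sqlite3` module rather than osquery itself. The table layouts and sample rows are made up for illustration and are not meant to reproduce osquery's actual schema; the point is only that once PIDs are keys, cross-referencing process data becomes a join:

```python
import sqlite3

# In-memory database standing in for an osquery-style relational view.
# Table names, columns, and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processes (pid INTEGER PRIMARY KEY, name TEXT, parent INTEGER);
    CREATE TABLE listening_ports (pid INTEGER, port INTEGER);
""")
conn.executemany(
    "INSERT INTO processes VALUES (?, ?, ?)",
    [(1, "init", 0), (812, "sshd", 1), (1290, "nginx", 1)],
)
conn.executemany(
    "INSERT INTO listening_ports VALUES (?, ?)",
    [(812, 22), (1290, 80), (1290, 443)],
)

# Which process is listening on which port? Join on the PID key.
rows = conn.execute("""
    SELECT p.name, l.port
    FROM listening_ports AS l
    JOIN processes AS p ON p.pid = l.pid
    ORDER BY l.port
""").fetchall()
print(rows)
```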
Very nice write-up, and a great way to dive deep into an interesting system! But if you plan on maintaining a project like this long term, I would recommend using one of the many existing libraries, like <a href="https://github.com/prometheus/procfs" rel="nofollow">https://github.com/prometheus/procfs</a> or <a href="http://pythonhosted.org/psutil/" rel="nofollow">http://pythonhosted.org/psutil/</a><p>There can be a lot of edge cases, and inevitably things will change in the future. Centralising the work of parsing /proc files goes a long way and helps keep things sane for maintenance.
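<p>As one illustration of the kind of edge case meant here (an example chosen for this sketch, not one taken from the post): the second field of `/proc/[pid]/stat` is the command name in parentheses, and that name may itself contain spaces or parentheses, so a plain whitespace split can misalign every later field. A minimal Python sketch, where `parse_stat` is a hypothetical helper name and the sample line is invented:

```python
def parse_stat(raw: str) -> dict:
    """Parse the first few fields of a /proc/[pid]/stat line.

    The comm field is wrapped in parentheses and may contain spaces or
    ')' characters, so split around the first '(' and the LAST ')'.
    """
    start = raw.index("(")
    end = raw.rindex(")")
    pid = int(raw[:start].strip())
    comm = raw[start + 1:end]
    rest = raw[end + 1:].split()
    return {
        "pid": pid,
        "comm": comm,
        "state": rest[0],      # third field, per man proc
        "ppid": int(rest[1]),  # fourth field, per man proc
    }


# Hypothetical sample line for a process named "my (odd) name".
sample = "4242 (my (odd) name) S 1 4242 4242 0 -1 4194304"
info = parse_stat(sample)
print(info)

# A naive split puts part of the name where the state should be.
naive_state = sample.split()[2]
print(naive_state)
```

A library that centralises this parsing only has to get such details right once.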
It's worth noting that `man proc` has fairly thorough, though not complete, documentation of the various files. It's a great read for learning which files are available.
I wrote something to do similar parsing of process state recently. It seems nuts to me that you can't get this all in one call. The naïve way of `fopen()`ing the files you need has a race condition if the PID is reused between two calls to `fopen()`.<p>Admittedly, probably rare. But why route through calls to `fopen()` and `read()` when you could just provide a function that returns OS-defined structs?
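<p>One way to narrow that race, sketched in Python under some assumptions: open the `/proc/<pid>` directory once and read every file relative to that descriptor, so each read is resolved against the same /proc entry instead of re-resolving the numeric PID in a path. As far as I understand, on Linux reads through such a descriptor fail once the process has exited rather than returning data for a new process that was given the same PID, but treat that as an assumption to verify. `read_proc_files` is a hypothetical helper name.

```python
import os


def read_proc_files(pid: int, names=("stat", "status")) -> dict:
    """Read several /proc/<pid> files through one directory descriptor."""
    # Resolve the PID to a /proc entry exactly once.
    dir_fd = os.open(f"/proc/{pid}", os.O_RDONLY | os.O_DIRECTORY)
    try:
        result = {}
        for name in names:
            # Open each file relative to the directory descriptor,
            # not via a fresh "/proc/<pid>/<name>" path lookup.
            fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
            with os.fdopen(fd, "r", errors="replace") as f:
                result[name] = f.read()
        return result
    finally:
        os.close(dir_fd)
```

Calling `read_proc_files(os.getpid())` returns a dict keyed by file name. There is still a window between choosing the PID and opening its directory, so this does not replace a single call that returns OS-defined structs; it only keeps the individual reads consistent with each other.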
Minor nitpick: calling it a clone of the Unix 'ps' wouldn't be exactly right, since, as I understand it, /proc is Linux-specific. On the other hand, how did the Unix 'ps', or the 'ps' in other Unix clones, work? Is there an alternative method of exposing process data to userspace instead of using a VFS like procfs?
I wrote a small ps clone as a side project, and "man proc" was invaluable in understanding what everything meant.<p>There was interesting work happening on a proposed newer API though, "task_diag": <a href="https://lwn.net/Articles/685791/" rel="nofollow">https://lwn.net/Articles/685791/</a> <a href="https://criu.org/Task-diag" rel="nofollow">https://criu.org/Task-diag</a>.