It's nice to get the textual description, but pretty much every specific detail of the extended explanation teased out at the end is more or less incorrect while nonetheless sounding very believable. In essence, what happened at the end was that GPT-3 was asked to write an OOM-killer-inspired story. I think this should be a cautionary tale against trying to use GPT-3 to provide commentary beyond a high-level summary.

This isn't a slight against the short-summary technique, which seems very cool.

Details: oom_adj isn't a flag; it's an int that can disable the OOM killer on a per-process-leader basis but can also be used to reduce the "badness" of a process when considering what to kill. oom_adj is also deprecated and has been replaced by oom_score_adj. The OOM algorithm isn't called RSS. It doesn't seem to have been explicitly named, but the function that performs the key calculation is named oom_badness. This function assigns an integer "badness" to each process. A process's resident set size *is* an important part of calculating badness, but badness is affected by several other factors (what they are depends on the kernel version, but they include the adjustment parameter). RSS is not part of the OOM calculation "by default" -- it's always included unless OOM is disabled entirely. RSS isn't a comparison of reserved physical memory against current virtual size; it's just the amount of RAM currently occupied by a process (i.e. not in swap or on disk). The OOM killer doesn't compare RSS against virtual size. RSS doesn't trigger the OOM killer. RSS isn't an algorithm.

Another interesting aspect of this, of course, is that GPT-3 likely wasn't trained on any specific kernel version, but on a large number of versions depending on which part of the Internet it happened to be reading. This means it probably can't give a good account of any single version of fast-changing parts of the kernel like the OOM killer.

Source: https://github.com/torvalds/linux/blob/master/mm/oom_kill.c
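If you want to see the distinction between RSS and badness for yourself, here's a rough Python sketch (mine, not from the article, and not tied to any kernel version) that lists processes by the kernel's own oom_score, which is what oom_badness exposes via procfs, alongside oom_score_adj and RSS. The procfs field layout is taken from the documentation; treat it as illustrative.

```python
#!/usr/bin/env python3
"""List the top OOM candidates by the kernel's own badness score."""
import os

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")

def read_int(path):
    with open(path) as f:
        return int(f.read().strip())

def rss_bytes(pid):
    # Second field of /proc/<pid>/statm is the resident set size in pages.
    with open(f"/proc/{pid}/statm") as f:
        return int(f.read().split()[1]) * PAGE_SIZE

rows = []
for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        rows.append((
            read_int(f"/proc/{pid}/oom_score"),      # kernel-computed badness
            read_int(f"/proc/{pid}/oom_score_adj"),  # -1000 disables OOM for this process
            rss_bytes(pid),
            pid,
        ))
    except (FileNotFoundError, ProcessLookupError, PermissionError):
        continue  # process exited or is inaccessible; skip it

for score, adj, rss, pid in sorted(rows, reverse=True)[:10]:
    print(f"pid={pid:>7} oom_score={score:>5} oom_score_adj={adj:>5} rss={rss // 1024} KiB")
```

Run it on a busy box and you'll see that the process with the largest RSS is usually, but not always, the one with the highest score -- the adjustment shifts things around.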
This is pretty cool!
However, these two samples are very simple to solve. I'd love an "AI" to find root causes for problems that are not obvious.
Just throw the whole log collection at it and let it solve all the issues. One can dream ;)
That's cool and all, but I'm pretty sure what we really want to see is:

"The expert described what had happened, in the form of a Haiku:"
So what do you do when GPT generates nonsense? Because it sometimes will, at least during my experiments, create something that is irrelevant or just plain wrong and would require human intervention. In other words, what is an acceptable failure rate for these summaries you generate?
Super interesting. I wonder what other latent domain-specific intelligence GPT-3 picked up during training that can be surfaced with plain text in and text out. Like a flash cards generator?
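Something along these lines seems easy to try. A hedged sketch, assuming the legacy Completion endpoint from when GPT-3 launched; the prompt wording, engine name, and stop token are my own guesses, not anything from the article:

```python
"""Zero-shot flash-card generation from a passage of text."""
import openai  # pip install openai (older, pre-1.0 API); needs OPENAI_API_KEY set

PROMPT_TEMPLATE = """Turn the passage below into question/answer flash cards.

Passage:
{passage}

Flash cards:
Q:"""

def flash_cards(passage: str) -> str:
    # A few handwritten example cards in the prompt would likely help; this is zero-shot.
    response = openai.Completion.create(
        engine="davinci",        # original GPT-3 engine name (assumption)
        prompt=PROMPT_TEMPLATE.format(passage=passage),
        max_tokens=256,
        temperature=0.3,         # keep it factual-ish rather than creative
        stop=["\n\n"],
    )
    return "Q:" + response.choices[0].text

if __name__ == "__main__":
    print(flash_cards("The OOM killer picks a victim by computing a badness "
                      "score for each process, largely based on its RSS."))
```

Given the top comment here, I'd expect the cards to need human review before going into a deck.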
This is interesting - I worked on a similar use case by parsing and tokenizing ZooKeeper logs, converting the logs to integer sequences, and training on those sequences to predict whether a service was going to experience a fault and, from that, what the cause of the fault was or would be. It wasn't too successful, but it definitely showed me how difficult it can be to work backwards from logs to root cause, especially with limited data.
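For anyone curious what the log-to-integer-sequence step looks like, here's a minimal sketch. The tokenization scheme (mask numbers and hex ids, split on whitespace, build a vocabulary, map each line to token ids) is my own guess at a reasonable pipeline, not what I actually used back then:

```python
"""Turn raw log lines into integer sequences suitable for a sequence model."""
import re

def tokenize(line: str) -> list[str]:
    # Mask hex ids first, then decimal numbers, so timestamps/ports/ids
    # don't blow up the vocabulary.
    line = re.sub(r"0x[0-9a-fA-F]+", "<hex>", line)
    line = re.sub(r"\d+", "<num>", line)
    return line.lower().split()

def build_vocab(lines) -> dict[str, int]:
    vocab = {"<unk>": 0}
    for line in lines:
        for tok in tokenize(line):
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(line: str, vocab: dict[str, int]) -> list[int]:
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(line)]

if __name__ == "__main__":
    logs = [
        "2021-03-01 12:00:01 INFO quorum peer 3 elected leader",
        "2021-03-01 12:00:05 WARN connection to peer 2 timed out after 4000 ms",
    ]
    vocab = build_vocab(logs)
    for line in logs:
        print(encode(line, vocab))
```

The hard part wasn't this step; it was that the sequences leading up to a fault often looked indistinguishable from healthy ones.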
Is there a reason I'd use this approach over a process mining / log mining system? I feel like it needs me to guess the right question to get an answer.