
Using GPT-3 for plain language incident root cause from logs

106 points by stochastimus over 4 years ago

8 comments
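For context on the submission itself: below is a minimal sketch of how a log-to-plain-language prompt might have been sent to GPT-3 with the completions API of that era (the pre-1.0 `openai` Python library). The prompt wording, log excerpt, engine choice, and parameters are illustrative assumptions, not the author's actual code.

```python
# Hypothetical sketch: ask GPT-3 to summarize an incident from raw log lines.
# Uses the pre-1.0 openai library's Completion endpoint, current at the time.
import openai

openai.api_key = "sk-..."  # your API key

LOG_EXCERPT = """\
kernel: Out of memory: Kill process 4712 (java) score 891 or sacrifice child
kernel: Killed process 4712 (java) total-vm:14326004kB, anon-rss:7012340kB
"""

prompt = (
    "The following are log lines from a production incident:\n\n"
    f"{LOG_EXCERPT}\n"
    "An expert explained the root cause in plain language:\n"
)

resp = openai.Completion.create(
    engine="davinci",   # base GPT-3 model available in early 2021
    prompt=prompt,
    max_tokens=80,
    temperature=0.2,    # keep the summary conservative
    stop=["\n\n"],
)
print(resp.choices[0].text.strip())
```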

wzdd over 4 years ago
It's nice to get the textual description, but pretty much every specific detail of the extended explanation teased out at the end includes things which are more or less incorrect but which nonetheless sound very believable. In essence, what happened at the end was GPT-3 was asked to write an OOM-killer-inspired story. I think this should be a cautionary tale against trying to use GPT-3 to provide commentary beyond a high-level summary.

This isn't a slight against the short-summary technique, which seems very cool.

Details: oom_adj isn't a flag, it's an int which can disable OOM on a per-process-leader basis but can also be used to reduce the "badness" of a process when considering what to kill. Oom_adj is also deprecated and has been replaced by oom_score_adj. The OOM algorithm isn't called RSS. It doesn't seem to have been explicitly named, but the function which performs the key calculation is named oom_badness. This function assigns an integer "badness" to each process. A process' resident set size *is* an important part of calculating badness, but it's affected by several other factors (what they are depends on kernel version but they include the adjustment parameter). RSS is not (part of) the OOM calculation "by default" -- it's always included unless OOM is disabled entirely. RSS isn't a comparison of reserved physical memory against current virtual size, it's just the amount of RAM currently occupied by a process (i.e. not in swap or on disk). The OOM killer doesn't compare RSS against virtual size. RSS doesn't trigger the OOM killer. RSS isn't an algorithm.

Another interesting aspect of this, of course, is that GPT-3 likely wasn't trained on any specific kernel version, but on a large number of versions depending on which part of the Internet it happened to be reading. This means that it probably can't give a good account of any single version of fast-changing parts of the kernel like the OOM killer.

Source: https://github.com/torvalds/linux/blob/master/mm/oom_kill.c
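To make the terminology above concrete: a small sketch that reads the real per-process OOM interfaces from /proc on a Linux box. The /proc file paths and the meaning of each value are genuine kernel interfaces; the helper function itself is illustrative.

```python
# Minimal sketch: inspect the per-process OOM knobs described above.
# Assumes a Linux system with /proc mounted.
import os

def oom_info(pid: int) -> dict:
    """Read the OOM-related tunables and RSS for one process."""
    base = f"/proc/{pid}"
    info = {}
    # oom_score_adj replaces the deprecated oom_adj; range is -1000..1000,
    # where -1000 disables OOM killing for the process entirely.
    with open(f"{base}/oom_score_adj") as f:
        info["oom_score_adj"] = int(f.read())
    # oom_score is the kernel's current "badness" for this process,
    # computed by oom_badness() in mm/oom_kill.c.
    with open(f"{base}/oom_score") as f:
        info["oom_score"] = int(f.read())
    # VmRSS in /proc/<pid>/status is the resident set size: the RAM the
    # process currently occupies, not its total virtual size.
    with open(f"{base}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                info["rss_kb"] = int(line.split()[1])
                break
    return info

if __name__ == "__main__":
    print(oom_info(os.getpid()))
```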
bbu over 4 years ago
This is pretty cool! However, these two samples are very simple to solve. I'd love an "AI" to find root causes for problems that are not obvious. Just throw the whole log collection at it and let it solve all the issues. One can dream ;)
mckirk over 4 years ago
That's cool and all, but I'm pretty sure what we really want to see is

"The expert described what had happened, in the form of a Haiku:"
ativzzz over 4 years ago
So what do you do when GPT generates nonsense? Because it sometimes will, at least during my experiments, create something that is irrelevant or just plain wrong and would require human intervention. In other words, what is an acceptable failure rate for these summaries you generate?
brianjunyinchan over 4 years ago
Super interesting. I wonder what other latent domain-specific intelligence GPT-3 picked up during training, that is parseable with text in and text out. Like a flash cards generator?
EQVEYWDCHQ over 4 years ago
This is interesting - I worked on a similar use case by parsing and tokenizing ZooKeeper logs, then converting logs to integer sequences and trying to determine whether or not services were going to experience a fault by training on said sequences, and thus determining what the cause of the fault was/would be. Wasn't too successful but definitely showed me how difficult it can be to work backwards from logs to root cause, esp. with limited data.
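A hedged sketch of the preprocessing step this comment describes, i.e. turning raw log lines into integer sequences a model can train on. The normalization regexes, sample log lines, and vocabulary scheme are assumptions for illustration, not the commenter's actual pipeline.

```python
# Illustrative sketch: convert log lines to integer token sequences.
import re

def tokenize(line: str) -> list[str]:
    # Normalize volatile fields (timestamps, numbers) so that structurally
    # identical log lines map to the same tokens.
    line = re.sub(r"\d{4}-\d{2}-\d{2} [\d:,.]+", "<TIME>", line)
    line = re.sub(r"\b\d+\b", "<NUM>", line)
    return line.split()

def encode(lines: list[str], vocab: dict[str, int]) -> list[int]:
    """Map each token to an integer id, growing the vocabulary as we go."""
    seq = []
    for line in lines:
        for tok in tokenize(line):
            seq.append(vocab.setdefault(tok, len(vocab)))
    return seq

vocab: dict[str, int] = {}
logs = [
    "2021-01-12 09:15:03,114 WARN SendWorker: leaving thread id 3",
    "2021-01-12 09:15:04,201 WARN RecvWorker: connection broken for id 3",
]
print(encode(logs, vocab))  # integer sequence suitable as model input
```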
king_magic over 4 years ago
I'm fairly bearish on GPT-3, but this is actually a pretty cool application.
jacques_chester over 4 years ago
Is there a reason I'd use this approach over a process mining / log mining system? I feel like it needs me to guess the right question to get an answer.