
Logs Are Streams, Not Files (2011)

52 points by fbuilesv almost 11 years ago

5 comments

colmmacc, almost 11 years ago

From the article:

> a better conceptual model is to treat logs as time-ordered streams

At scale it's probably better still to re-think logs as weakly-ordered lossy streams. One form of weak ordering is the inevitable jitter that comes with having multiple processes, threads, or machines; without some kind of global lock (which would be impactful to performance) it stops being possible to have a true before/after relationship between individual log entries.

Another form of weak ordering is that it's very common for log entries to be recorded only at the end of an operation, irrespective of its duration; so a single instantaneous entry really represents a time-span of activity with all sorts of fuzzy before/after/concurrent-to relationships to other entries.

But maybe the most overlooked kind of weak ordering is one that is rarely found in logging systems, but is highly desirable: log streams should ideally be processed in LIFO order. If you're building some kind of analytical or visualisation system or near-real-time processor for log data, you care most about "now". Inevitably there are processing queues and batches and so on to deal with, but practically every logging system just orders the entries by "time" and handles those queues as FIFO. If a backlog arises, you must wait for the old data to process before seeing the new. Change these queues and batching systems to LIFOs and you get really powerful behavior: recent data always takes priority, but you can still backfill historical gaps. Unix files are particularly poorly suited to this pattern, though: even though a stack is a simple data structure, it's not something that you can easily emulate with a file-system and command-line tools.
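The LIFO idea can be sketched with a toy buffer (a hypothetical `LifoLogBuffer`, not part of any real logging system): fresh entries are always handed out first, and a backlog is drained as backfill instead of delaying new data behind old.

```python
import collections


class LifoLogBuffer:
    """Toy buffer that hands out the newest log entries first.

    Unlike a FIFO queue, a backlog never delays fresh data: recent
    entries are processed immediately and older ones are backfilled
    whenever the consumer catches up.
    """

    def __init__(self):
        self._stack = collections.deque()  # newest entries at the right end

    def push(self, entry):
        self._stack.append(entry)

    def pop_newest(self):
        # LIFO: the most recent entry takes priority over the backlog.
        return self._stack.pop() if self._stack else None


buf = LifoLogBuffer()
for ts in [1, 2, 3]:          # a backlog of old entries arrives...
    buf.push(f"event@{ts}")
buf.push("event@99")          # ...then one fresh entry

assert buf.pop_newest() == "event@99"  # freshest first
assert buf.pop_newest() == "event@3"   # then backfill, newest-old first
```

With a FIFO in the same situation, `event@99` would sit behind the entire backlog.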
falcolas, almost 11 years ago

> Programs that send their logs directly to a logfile lose all the power and flexibility of unix streams.

That's because if they push their data to stdout, and it's not caught by a pipe, the program will halt when the OS stdout buffer is filled.

> How many programs end up re-writing log rotation, for example?

This one is because files over a certain size cause certain file management systems, or kernels, to break logging. If they didn't rotate the files, the system would become unresponsive at worst, or the program would go down at best. Plus, if you take care of rotation and compression yourself (either directly or through a logrotated conf), you don't have to worry about filling a disk and causing an outage.

In short, logging is hard, because systems are managed by people. And people rarely get the logging setups right the first time.
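As a minimal sketch of the size-capped rotation logic this comment says programs keep re-implementing, Python's standard library already ships it as `logging.handlers.RotatingFileHandler` (the file name, sizes, and message text below are arbitrary choices for the demo):

```python
import logging
import logging.handlers
import os
import tempfile

# Size-capped rotation: once app.log exceeds maxBytes it is renamed to
# app.log.1 (older copies shift to .2 and .3, the oldest is dropped),
# so a chatty process never fills the disk with one ever-growing file.
logdir = tempfile.mkdtemp()
path = os.path.join(logdir, "app.log")

handler = logging.handlers.RotatingFileHandler(path, maxBytes=1024, backupCount=3)
logger = logging.getLogger("rotation-demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

for i in range(200):
    logger.info("message %d padded to make the file grow quickly", i)

# At most backupCount + 1 files ever exist: app.log, .1, .2, .3
print(sorted(os.listdir(logdir)))
```

A standalone `logrotate` config achieves the same effect externally, with the added option of compressing the rotated copies.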
philsnow, almost 11 years ago

sink/drain, not source/sink? Does anybody use "sink" to mean the place where stuff comes out of (from a particular system's perspective) rather than the place where stuff goes?
farva, almost 11 years ago

It's not like there's really a difference between the two, under *nix.
antocv, almost 11 years ago
Files are streams.