Osquery: Expose the operating system as a relational database

533 points by jamesgpearce over 10 years ago

42 comments

ibdknox over 10 years ago
This is really interesting and very cool to see. The approach we're taking with Eve (gory details at [1]) is that you can treat everything as relational, and doing so provides lots of benefits. One thing that wasn't clear, though, was how you extend that notion down to the OS level, for both performance and semantic reasons. It's encouraging to see someone with requirements as deep as Facebook's find that this strategy works in that context.

The next step would be *manipulating* the OS as relations. E.g. an insert into the process table allows you to actually spawn a process. It would start to get really interesting from there...

[1]: http://incidentalcomplexity.com/2014/10/16/retrospective/
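The "insert spawns a process" idea can be sketched outside osquery. What follows is only an illustration, assuming Python's `sqlite3` module and a made-up `processes` table; the SQL function `spawn`, the trigger `spawn_on_insert`, and the helper `open_process_table` are hypothetical names, not part of osquery or Eve.

```python
import shlex
import sqlite3
import subprocess

# Keep handles so callers can wait() on children later (illustrative only).
_children = []


def spawn(cmdline):
    """Start cmdline as a child process and return its pid."""
    child = subprocess.Popen(shlex.split(cmdline))
    _children.append(child)
    return child.pid


def open_process_table(spawner=spawn):
    """Return a connection whose `processes` table spawns on INSERT."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python callable to SQL under the made-up name spawn().
    conn.create_function("spawn", 1, spawner)
    conn.executescript(
        """
        CREATE TABLE processes (pid INTEGER, cmdline TEXT NOT NULL);
        -- TEMP keeps the trigger private to this connection, which is
        -- also the only place where spawn() is registered.
        CREATE TEMP TRIGGER spawn_on_insert AFTER INSERT ON processes
        BEGIN
            UPDATE processes SET pid = spawn(NEW.cmdline)
            WHERE rowid = NEW.rowid;
        END;
        """
    )
    return conn
```

Passing a different `spawner` (for example a function that only records the command) makes it possible to try the table without starting real processes. A real system would also need to decide what a DELETE or UPDATE on such a row should mean.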
zwischenzug over 10 years ago
Available as a docker image:

```
docker run -t -i imiell/osquery /bin/bash
root@81fbc2076e1c/# osqueryi
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
osquery - being built, with love, at Facebook
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
osquery> select * from processes;
+----------+-------------------------+-----------+-------+---------+------------+---------------+----------------+-----------+-------------+------------+--------+
| name     | path                    | cmdline   | pid   | on_disk | wired_size | resident_size | phys_footprint | user_time | system_time | start_time | parent |
+----------+-------------------------+-----------+-------+---------+------------+---------------+----------------+-----------+-------------+------------+--------+
| bash     |                         | /bin/bash | 1     | -1      |            | 1764          | 18276          | 17        | 18          | 95476444   | 0      |
| osqueryi | /usr/local/bin/osqueryi | osqueryi  | 19380 | 1       |            | 4312          | 110652         | 225       | 327         | 96321589   | 1      |
+----------+-------------------------+-----------+-------+---------+------------+---------------+----------------+-----------+-------------+------------+--------+
osquery>
```
MrBuddyCasino over 10 years ago
Cool, so basically it brings something like WQL to *nix, because this is something that exists in Windows already:

```
SELECT * FROM Win32_LogicalDisk WHERE FreeSpace < 2097152
```
peterwwillis over 10 years ago
It's hard for me to grok this project's design goals. I mean, the basic idea is simple enough to understand: I want to run SQL queries on metadata about my hosts. I've built and run several different iterations of that same idea, but they didn't require thousands of lines of C++.

The usual implementation is simple: take any host monitor (say, collectd) that can export key/value pairs from a host, or take a log stream over the network and pair it with a host monitor/log scraper to create key/value pairs. Then insert into an SQL engine while appending to a log for a historical record (or PTA/PITR/whatever, I'm not a DBA). Separately you can create a database application to query/modify the database as needed.

But we're talking like, a handful of Python scripts that don't ever change except to add new search features. This seems like a big departure from the simplicity of that approach. Am I missing something?
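The "key/value pairs into an SQL engine" approach described above could look roughly like this. It is a sketch under assumptions: SQLite stands in for the SQL engine, the key/value pairs arrive as a Python dict rather than from collectd, and the `metrics` table plus the `open_store`, `record`, and `latest` helpers are invented for illustration.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) the append-only metrics table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics "
        "(ts REAL, host TEXT, key TEXT, value TEXT)"
    )
    return conn


def record(conn, host, metrics, now=None):
    """Append one timestamped row per key/value pair reported by a host."""
    ts = time.time() if now is None else now
    conn.executemany(
        "INSERT INTO metrics (ts, host, key, value) VALUES (?, ?, ?, ?)",
        [(ts, host, key, str(value)) for key, value in metrics.items()],
    )


def latest(conn, key):
    """Return {host: value} using each host's most recent sample of key."""
    rows = conn.execute(
        """
        SELECT host, value FROM metrics AS m
        WHERE key = ? AND ts = (
            SELECT MAX(ts) FROM metrics WHERE host = m.host AND key = m.key
        )
        """,
        (key,),
    )
    return dict(rows)
```

Because rows are only ever appended, the same table doubles as the historical record; `latest` is just one of many queries that could be run against it.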
gooseyard over 10 years ago
Akamai has been using a system like this since 1999 or so, nicely documented in this presentation to LISA some years ago:

http://www.akamai.com/dl/technical_publications/lisa_2010.pdf
asb over 10 years ago
There was a paper at EuroSys this year, "Relational access to Unix kernel data structures", which seems to attempt to offer similar functionality. A comparison between the two would be interesting.

HTML version of the paper here: http://www.dmst.aueb.gr/dds/pubs/conf/2014-EuroSys-PicoQL-kernel/html/FSLB14.html
Paywalled version: http://dl.acm.org/citation.cfm?id=2592802
mapleoin over 10 years ago
Welcome to 2014 where cross-platform means "Ubuntu, CentOS and Mac OS X".
gtrubetskoy over 10 years ago
This is not exactly the same, but similar in some ways to IBM's S/38, OS/400 or whatever they call it now... In this OS (and last I touched it was 15 years ago, so things may have changed) there were no "files": everything was a database table, and that was the only way you could store anything, and it was how you for the most part interoperated with the system, i.e. the OS was essentially a relational DB. http://en.wikipedia.org/wiki/IBM_i
sanj over 10 years ago
This reminds me of an old project to kill daemons with a shotgun: http://www.cs.unm.edu/~dlchao/flake/doom/
pothibo over 10 years ago
This is quite cool. I'm installing it right now on my system.

From the wiki, it says it will soon be available on Homebrew as well.

https://github.com/facebook/osquery/wiki/install-os-x
tdicola over 10 years ago
Are there instructions on how to build it? I'm looking around the GitHub page and don't see anything like what dependencies it needs, what infrastructure it uses (looks like CMake?), etc. It's cool to distribute it as a Vagrant source, but I'd like to compile and run this on a Raspberry Pi, which doesn't run Vagrant. Supplying some basic build instructions would really help.
sehrope over 10 years ago
This is pretty neat. I'm a big fan of SQL in general and being able to query system stats like this feels pretty natural to me.

A long time back I created something similar to this atop Oracle[1]. It used a Java function calling out to system functions to get similar data sets (*I/O usage, memory usage, etc*). It was definitely a hack, but a really pleasant one to use.

Be cool to see a foreign data wrapper for PostgreSQL[2] that exposes similar functionality. I'm guessing it'd be pretty easy to put together as you'd only need to expose the data sets themselves as set-returning functions. PostgreSQL would handle the rest. Though I guess that would limit its usefulness to servers that have PG already installed. Having it be separate like this lets you drop it on any server (*looks like it's cross platform too!*).

[1]: *I don't remember exactly when but I think 10g had just been released.*

[2]: http://www.postgresql.org/docs/9.3/static/postgres-fdw.html
dougabug over 10 years ago
After a lost decade of misplaced vitriol directed at relational models and SQL, I'm personally heartened to see a trend back to human-readable query languages over RPC as an interface, and to sensible representation of information in relational form suitable for ad hoc queries/discovery, versus complex implementation-dependent deep-hierarchy spaghetti.
thibauts over 10 years ago
That's a very good candidate for a postgres FDW ...

https://wiki.postgresql.org/wiki/Foreign_data_wrappers
zobzu over 10 years ago
Google has GRR and Mozilla has MIG (http://mig.mozilla.org/)

I think it's interesting to see that MIG is in Go and thus cross platform "by default". It also seems to be more privacy-compliant.

osquery's SQL is sexy however.

That said, I'm also wary of a single piece of software that basically gives you control over absolutely everything (control everyone's laptop, etc. silently and quickly. That's the best rootkit ever. You won't even detect if it's being compromised because it's a trusted piece of the OS!)
Pxtl over 10 years ago
I just wish there were better relational languages than SQL for accessing/manipulating this stuff. Relational logic is great. SQL is... okay.
adl over 10 years ago
I built the .deb for Ubuntu 14.10 (downloaded the project, the Vagrant image, etc, the works. 1.1GB in total according to du -h)

It's here if anyone wants to try it: osquery-0.0.1-trusty.amd64.deb (11 MB)

https://drive.google.com/file/d/0B3ROVJqBXqYAOVNTTkhqQzNUa0k/view?usp=sharing
chacham15 over 10 years ago
Is it possible to create triggers with this kind of emulated database? E.g. insert a row into the notifications table when free space drops below 10%, or, to use their example, when SELECT name, path, pid FROM processes WHERE on_disk = 0 actually returns a row.
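Whether osquery itself supports triggers is not answered here, but the behaviour asked about can be approximated by polling. The sketch below assumes Python's `sqlite3` module with an ordinary stand-in `processes` table (real data would come from the OS); `make_demo_db`, `check_and_notify`, and the `notifications` schema are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build stand-in tables; the rows here are invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processes (name TEXT, path TEXT, pid INTEGER, on_disk INTEGER);
        CREATE TABLE notifications (rule TEXT, detail TEXT);
        INSERT INTO processes VALUES ('bash', '/bin/bash', 1, 1);
        INSERT INTO processes VALUES ('ghost', '/tmp/ghost', 77, 0);
        """
    )
    return conn


def check_and_notify(conn, rule, query):
    """Run query once; log one notification per returned row."""
    rows = conn.execute(query).fetchall()
    conn.executemany(
        "INSERT INTO notifications (rule, detail) VALUES (?, ?)",
        [(rule, repr(row)) for row in rows],
    )
    return len(rows)
```

Calling `check_and_notify` from a scheduler (cron, or a loop with a sleep) gives trigger-like behaviour at the cost of a polling delay; deduplicating repeated matches between runs is left out of the sketch.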
rdtsc over 10 years ago
Any relationship or inspiration from BeOS's file system?

http://en.wikipedia.org/wiki/Be_File_System

I remember from back in the day that was one of the really cool features of Be.
mongol over 10 years ago
Something similar using SQLite virtual tables and the proc filesystem: https://github.com/claes/osql
gdulli over 10 years ago
I'd like to try this out, but is there a way to install it that's as simple as all of the other Linux software I've ever installed before, and doesn't require installing Vagrant, installing VirtualBox, downloading an Ubuntu image, creating a whole VM (which failed) so I can then make a package I can install? I can usually try these things out without making a whole thing of it.
godisdad over 10 years ago
Nifty, but not incredibly novel. SQLite's VFSes have been around for some time, albeit in smaller breadth and scope. I think one thing this kind of glosses over is the notion of transactions: what if load/fs contents change between independent parts of your query? Are they memoized, recomputed, etc?

Having said all that, I'm going to install it and try it out because it's new and shiny.
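How osquery handles this is not stated in the thread. One way to get a consistent view, sketched here as an assumption rather than a description of osquery, is to copy the live data into a temporary table once and run every part of the query against that copy. The `snapshot` helper and the table layout are hypothetical; Python's `sqlite3` module is used for illustration.

```python
import sqlite3


def snapshot(conn, name, columns, rows):
    """Copy rows from a live source into a temp table, replacing any old copy.

    `name` and `columns` are interpolated into SQL, so they must come from
    trusted code, never from user input.
    """
    conn.execute(f"DROP TABLE IF EXISTS temp.{name}")
    conn.execute(f"CREATE TEMP TABLE {name} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO temp.{name} VALUES ({placeholders})", rows)
```

After `snapshot(conn, "processes", ["name", "pid"], read_live_rows())`, where `read_live_rows` is whatever function collects the data, a self-join or a pair of subqueries over `processes` sees identical rows even if the system changes in the meantime. The trade-off is staleness: the copy is only as fresh as the last call.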
amelius over 10 years ago
Wouldn't it be cool if there were a library that could build relational APIs for any problem (including an optimizing SQL compiler)...
eli over 10 years ago
Reminds me of LogParser, a funky Microsoft tool that let you run queries against log files, directories, the registry, etc. I think it was a skunkworks project. Apparently still exists: http://www.microsoft.com/en-us/download/details.aspx?id=24659
brianpgordon over 10 years ago
This is neat, but why is Facebook making it? I may be too used to working at startups, but it seems to me that "look Mom, SQL!" isn't nearly worth the cost of the engineer hours it must have taken to bring this project to maturity.

I guess it does buy community goodwill to throw handfuls of money off the Facebook float...
gluczywo over 10 years ago
This is an excellent idea. While I try to avoid unnecessary abstractions (yes, I'm looking at you, Docker), having a consistent, cross-platform and familiar API for OS instrumentation seems like a big boon. At a low complexity cost there is a chance to offload from the admin's memory the idiosyncrasies of OS monitoring details.
annnnd over 10 years ago
Looks nice! That said, SQL in this case is just a way to look at data. Given that most network devices (and printers and UPSs and ...) in existence use SNMP, it would be nice to have an (SQL?) engine which would query devices via SNMP in background... If I understand correctly, this solution is tied to servers only.
spo81rty over 10 years ago
I wish something like this shipped with every version of Linux, just like WMI does on Windows. This is awesome to see.
karavelov over 10 years ago
There is something similar in AWS. The main difference is that it is distributed and can query and aggregate across clusters of thousands of machines. Also, I think it is not built on SQLite but implements a new DB engine.
jacquesm over 10 years ago
Is this reading OS data structures synchronously or asynchronously?
stefanobaghino over 10 years ago
Brilliant idea, this can easily expose metrics in interesting ways to a whole lot of people who happen to know SQL better than the /proc filesystem.
njx over 10 years ago
Where are the JDBC drivers?

You could then deploy these frameworks on a bunch of servers, and an external monitor can independently query via SQL "How u doing;"
mixedbit over 10 years ago
A system with SQL interface != a relational database
falcolas over 10 years ago
You know, this wouldn't be too hard to implement as a storage engine for MySQL... What an intriguing idea.
politician over 10 years ago
Would've preferred to see the from-where-select style rather than the normal way of writing SQL.
elwell over 10 years ago
Great, so now we can have SQL Injection at the OS level.

/sarcasm/
pron over 10 years ago
Nice! Can we have a JDBC driver for this?
avifreedman over 10 years ago
marpaia, did you consider using OpenTSDB?

At CloudHelix, we did a Postgres FDW to OpenTSDB, which gives a time dimension as well.

That was an issue at Akamai: how to get historic as well as realtime with Akamai's Query system ([WARNING: PDF direct download] http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0CCAQFjAA&url=http%3A%2F%2Fwww.akamai.com%2Fdl%2Ftechnical_publications%2Flisa_2010.pdf&ei=1JVRVP6FJdj6oQT544GYBQ&usg=AFQjCNHZe0KJLDl4e8t3mY_-SnaC8umDwg&bvm=bv.78597519,d.cGU)

Interesting stuff though! Maybe a FDW could connect Postgres to osquery, which could allow joining with local tables or other FDW-accessible data.

The FDW approach to OpenTSDB looks like:

    select to_timestamp(atime::float), value,
           hstore(regexp_split_to_array(tags, ',')) as hs
    from chf_realtime
    where i_start_time >= now() - interval'1 min'
      and agg = 'sum'
      and metric = 'df.bytes.percentused'
      and tags = 'host=*,mount=/|/data|/ssd' ;

          to_timestamp      | value |                           hs
    ------------------------+-------+---------------------------------------------------------
     2014-10-30 01:15:33+00 |    84 | "host"=>"XY4.iad1", "mount"=>"/", "fstype"=>"xfs"
     2014-10-30 01:16:33+00 |    84 | "host"=>"XY4.iad1", "mount"=>"/", "fstype"=>"xfs"
     2014-10-30 01:15:33+00 |     9 | "host"=>"XY.iad1", "mount"=>"/data", "fstype"=>"btrfs"
     2014-10-30 01:16:33+00 |     9 | "host"=>"XY.iad1", "mount"=>"/data", "fstype"=>"btrfs"
     2014-10-30 01:15:33+00 |    49 | "host"=>"XY.iad1", "mount"=>"/ssd", "fstype"=>"btrfs"
     2014-10-30 01:16:33+00 |    49 | "host"=>"XY.iad1", "mount"=>"/ssd", "fstype"=>"btrfs"
     2014-10-30 01:14:55+00 |    63 | "host"=>"XY.iad1", "mount"=>"/", "fstype"=>"xfs"
     2014-10-30 01:15:55+00 |    63 | "host"=>"XY.iad1", "mount"=>"/", "fstype"=>"xfs"
     2014-10-30 01:14:55+00 |     1 | "host"=>"XY.iad1", "mount"=>"/data", "fstype"=>"btrfs"
     2014-10-30 01:15:55+00 |    21 | "host"=>"XY.iad1", "mount"=>"/ssd", "fstype"=>"xfs"
     2014-10-30 01:14:55+00 |    21 | "host"=>"XY.iad1", "mount"=>"/ssd", "fstype"=>"xfs"
     2014-10-30 01:15:50+00 |    63 | "host"=>"XY.iad1", "mount"=>"/", "fstype"=>"xfs"
     2014-10-30 01:15:50+00 |     8 | "host"=>"XY.iad1", "mount"=>"/ssd", "fstype"=>"btrfs"
     2014-10-30 01:14:56+00 |    89 | "host"=>"XY.iad1", "mount"=>"/", "fstype"=>"xfs"
     2014-10-30 01:15:56+00 |    89 | "host"=>"XY.iad1", "mount"=>"/", "fstype"=>"xfs"
     2014-10-30 01:14:56+00 |    55 | "host"=>"XY.iad1", "mount"=>"/ssd", "fstype"=>"xfs"
     2014-10-30 01:15:56+00 |    55 | "host"=>"XY.iad1", "mount"=>"/ssd", "fstype"=>"xfs"
    (17 rows)
peetle over 10 years ago
Hey? LINQ? What?
innguest over 10 years ago
The original wiki had much to say, over its many discussions, about TOP - table-oriented programming.

Today we know via Category Theory that tables are Turing complete and are actually quite synonymous with CT itself.

In other words, thinking of computation in terms of tables with rows and columns and relationships between tables is an interesting and promising (given CT) approach to computing that has been discussed in the past but then left largely unexplored.
rpm33 over 10 years ago
This is amazing. Installing it right away.
mapcars over 10 years ago
Wow, these guys really should spend some time learning history, and Plan 9 specifically.

I mean, how do they think people will write and rewrite programs using SQL when files are already here?