TechEcho
A tech news platform built with Next.js, providing global tech news and discussions.

© 2025 TechEcho. All rights reserved.

Crab – SQL for your filesystem

127 points by cogs over 9 years ago

14 comments

jewel over 9 years ago
> Just try these in Bash or PowerShell!
> select fullpath from files where fullpath like '%sublime%' and fullpath like '%settings%' and fullpath not like '%backup%';

This isn't a very good example, because it's trivial to do in bash:

    locate sublime | grep settings | grep -v backup

(Replace `locate sublime` with `find / | grep sublime` if locate's results are too old.)

> select fullpath, bytes from files order by bytes desc limit 5;

This is better. Here it is in bash:

    find / -type f -exec stat -c '%s %n' {} \; | sort -nr | head -n 5

Cherry-picking another one that stood out to me:

> select writeln('/Users/SJohnson/dictionary2.txt', data) from fileslines where fullpath = '/Users/SJohnson/dictionary.txt' order by data;

    cd /Users/SJohnson/; sort dictionary.txt > dictionary2.txt

Some of the rest of the examples are trivial in bash, and others look potentially useful. Of course, they are trying to demonstrate its capabilities, so the examples are contrived. I can see how this would be useful for someone who doesn't know the command line, but as someone who is proficient in both, I find SQL pretty verbose.

In the real world I'd switch to a scripting language for some of the more complex cases, since they'd be rare.

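For what it's worth, the scripting-language route mentioned above could look something like this for the "five largest files" query. This is only a sketch in Python, not anything Crab ships; the function name `largest_files` and its parameters are made up for illustration.

```python
import heapq
import os
import stat


def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, similar to `find -type f`
                info = os.lstat(path)
            except OSError:
                # The file vanished or is unreadable; skip it
                continue
            if stat.S_ISREG(info.st_mode):
                found.append((info.st_size, path))
    # nlargest avoids sorting the whole list when only a few rows are needed
    return heapq.nlargest(limit, found)
```

Calling `largest_files("/")` walks the whole filesystem, so it is no faster than the `find` pipeline; the point is only that the filter and the ordering live in one readable place.
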
cphoover over 9 years ago
Reminds me of Facebook's https://osquery.io/. Does this offer anything different?

thrownaway2424 over 9 years ago
As a reminder, we actually did have a filesystem with integrated SQL(-like) query features in BeFS, in 1997.

gourneau over 9 years ago
This is pretty cool; I have found myself wanting a tool like this for ages. However, does anyone know of a pure open-source alternative (just for Linux)?

justin_vanw over 9 years ago
This is a brilliant idea!

You can do most of the same kinds of things via find and grep and some shell foo, but honestly, who can remember all of that? Maybe someone smarter than me, but every time I need it I am reading man pages.

The find syntax to get files modified more than 20 minutes ago? How can you remember that? But modified > now() - interval '5 minutes' (well, that's Postgres, but still), I can remember it and I haven't used it in 2 or 3 years, because it's slightly less arbitrary and doesn't have 8 different gotchas.

EDIT:

find . -mmin -5 # that gets you files modified in the last 5 minutes. The part I can't ever remember:

find . -mmin +5 # that gives you files last modified more than 5 minutes ago

find . -mmin 5 # apparently this is files modified exactly 5 minutes ago? The fact that this syntax exists (and is different from +5) seems absurd to me. What is the resolution? It must be minutes. This option exists only to confuse people.

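To make the comparison concrete, here is a rough Python sketch that loads file metadata into an in-memory SQLite table and expresses the two `-mmin` cases as plain SQL comparisons. It is not how Crab works internally (the thread doesn't say); the `files`, `fullpath`, and `bytes` names are borrowed from the Crab examples quoted earlier, while the `modified` column and the function names are assumptions. The exact rounding that `find` applies differs between implementations, so treat these as approximations of `-mmin -N` and `-mmin +N`.

```python
import os
import sqlite3
import time


def index_files(root):
    """Load path, size, and mtime of every file under `root` into a table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (fullpath TEXT, bytes INTEGER, modified REAL)")
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # unreadable or vanished; skip it
            db.execute(
                "INSERT INTO files VALUES (?, ?, ?)",
                (path, info.st_size, info.st_mtime),
            )
    return db


def modified_within(db, minutes, now=None):
    """Roughly `find -mmin -N`: modified less than `minutes` minutes ago."""
    now = time.time() if now is None else now
    rows = db.execute(
        "SELECT fullpath FROM files WHERE modified > ? ORDER BY fullpath",
        (now - minutes * 60,),
    )
    return [row[0] for row in rows]


def modified_before(db, minutes, now=None):
    """Roughly `find -mmin +N`: modified more than `minutes` minutes ago."""
    now = time.time() if now is None else now
    rows = db.execute(
        "SELECT fullpath FROM files WHERE modified < ? ORDER BY fullpath",
        (now - minutes * 60,),
    )
    return [row[0] for row in rows]
```

Usage would be along the lines of `db = index_files(".")` followed by `modified_within(db, 5)`. The direction of the comparison is spelled out in the query, which is the readability argument made above.
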
oconnore over 9 years ago
"Multi platform" ... only runs on OSX.

eka808 over 9 years ago
I like to use ad hoc LINQ queries with LINQPad to get this type of stuff done.

Ex: Directory.GetFiles(theDirectory).Take(50).GroupBy(a => ...)

brixon over 9 years ago
Log Parser does this too (Windows only). http://www.microsoft.com/en-us/download/details.aspx?id=24659

rufugee over 9 years ago
I've been looking for something like this (and thinking about developing something if I can't find a satisfactory solution) to use across many different hosts to identify duplicate files, etc. I've got media spread across many different Linux and OS X machines. Can Crab handle this?

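Whether Crab covers the multi-host case isn't answered in this thread. As a rough sketch of one way to approach it without Crab, each host could produce a manifest of content hashes, and the merged manifests could then be grouped by hash. Everything here (the function names, the `host` label, the choice of SHA-256) is an assumption for illustration, and collecting the manifests from each machine is left out.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root, host):
    """Yield (digest, host, path) for each non-symlink file under `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # don't count a symlink as a second copy
            try:
                yield file_digest(path), host, path
            except OSError:
                continue  # unreadable or vanished; skip it


def duplicates(entries):
    """Group manifest entries by digest; keep groups with more than one file."""
    groups = defaultdict(list)
    for digest, host, path in entries:
        groups[digest].append((host, path))
    return {d: locs for d, locs in groups.items() if len(locs) > 1}
```

Run `manifest()` on each machine, concatenate the results, and pass them to `duplicates()`. Hashing every file is slow on large media collections; comparing sizes first and hashing only the candidates would cut the work considerably.
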
g4nt1 over 9 years ago
Makes me think of txt-sushi (http://keithsheppard.name/txt-sushi/). Can be pretty useful instead of using awk.

crivabene over 9 years ago
Reminds me of WinFS, which was cancelled, but I always thought it was a brilliant concept.

https://en.wikipedia.org/wiki/WinFS

bechampion over 9 years ago
IMHO ... doesn't replace the good old for, find, grep, etc.

otterley over 9 years ago
Does this maintain an inode metadata index as well? Otherwise how will you avoid stat'ing the whole filesystem (or a branch thereof)?

Does it handle extended attributes?

zrail over 9 years ago
Yay for commercial open source!