Hey everyone,<p>I recently created fdir, mostly out of curiosity about how fast a program written in Node.js could be. As it happened, I (accidentally) created the fastest directory crawler in the Node.js environment. fdir can easily crawl around 1 million files, distributed across about 100k directories, in under 1 second (your mileage may vary depending on hardware).<p>It's also < 1kb in size (gzipped) and supports all Node versions (> 6).<p>Feel free to give it a run and ask me any questions :D<p>Blog post: <a href="https://dev.to/thecodrr/how-i-wrote-the-fastest-directory-crawler-ever-3p9c" rel="nofollow">https://dev.to/thecodrr/how-i-wrote-the-fastest-directory-cr...</a><p>Take care,
thecodrr
This here looks wrong: <a href="https://github.com/thecodrr/fdir/blob/master/index.js#L86" rel="nofollow">https://github.com/thecodrr/fdir/blob/master/index.js#L86</a>
It's calling a blocking method from async code in Node <10.<p>Another thing I noticed: it looks like you only handle `dirent.isDirectory()`, but not `dirent.isSymbolicLink()` (meaning the library won't find files in symlinked folders, e.g. lerna node_modules).
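<p>To illustrate the symlink point, here is a minimal sketch of a sync crawler that also descends into symlinked directories. It is not fdir's code: `crawlSync` is a hypothetical helper, it assumes Node 10.10+ (for `withFileTypes`), and the `realpathSync` cycle guard is just one possible way to avoid looping forever on links that point back up the tree.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper (not part of fdir): synchronously collect file paths,
// descending into symlinked directories as well as real ones.
function crawlSync(root, seen = new Set(), files = []) {
  // Resolve the real path so each physical directory is visited only once,
  // which also protects against symlink cycles.
  const real = fs.realpathSync(root);
  if (seen.has(real)) return files;
  seen.add(real);

  for (const dirent of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, dirent.name);
    if (dirent.isDirectory()) {
      crawlSync(full, seen, files);
    } else if (dirent.isSymbolicLink()) {
      // A Dirent describes the link itself, so stat the target to learn its type.
      let target;
      try {
        target = fs.statSync(full);
      } catch (err) {
        continue; // broken link: skip it
      }
      if (target.isDirectory()) crawlSync(full, seen, files);
      else if (target.isFile()) files.push(full);
    } else if (dirent.isFile()) {
      files.push(full);
    }
  }
  return files;
}
```

Something like `console.log(crawlSync("node_modules").length)` would print the total count. The extra `statSync` per symlink and the `realpathSync` per directory cost time, so a crawler tuned for speed may reasonably want this behind an option.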
I took it for a ride.<p>The test scripts all list files recursively and synchronously, starting from the node_modules dir (like in the benchmark), exclude dirs, and print a total count.<p>Here are the results calculated with hyperfine (<a href="https://github.com/sharkdp/hyperfine" rel="nofollow">https://github.com/sharkdp/hyperfine</a>):<p><pre><code> hyperfine "bash test.sh" --warmup 5
Benchmark #1: bash test.sh
Time (mean ± σ): 7.5 ms ± 0.5 ms [User: 4.5 ms, System: 4.2 ms]
Range (min … max): 6.9 ms … 10.5 ms 332 runs
hyperfine "perl test.pl" --warmup 5
Benchmark #1: perl test.pl
Time (mean ± σ): 25.6 ms ± 1.2 ms [User: 16.8 ms, System: 8.8 ms]
Range (min … max): 24.0 ms … 30.8 ms 97 runs
hyperfine "python3.7 test.py" --warmup 5
Benchmark #1: python3.7 test.py
Time (mean ± σ): 43.4 ms ± 1.4 ms [User: 32.6 ms, System: 10.9 ms]
Range (min … max): 40.9 ms … 46.8 ms 66 runs
hyperfine "ruby test.rb" --warmup 5
Benchmark #1: ruby test.rb
Time (mean ± σ): 66.5 ms ± 2.0 ms [User: 52.1 ms, System: 14.4 ms]
Range (min … max): 63.2 ms … 70.3 ms 42 runs
hyperfine "node test.js" --warmup 5
Benchmark #1: node test.js
Time (mean ± σ): 83.7 ms ± 4.0 ms [User: 74.7 ms, System: 15.6 ms]
Range (min … max): 79.4 ms … 95.3 ms 36 runs
</code></pre>
Here are the results of a hello world with each runtime for comparison:<p><pre><code> hyperfine "bash test.sh" --warmup 5
Benchmark #1: bash test.sh
Time (mean ± σ): 1.2 ms ± 0.3 ms [User: 1.1 ms, System: 0.3 ms]
Range (min … max): 0.9 ms … 3.8 ms 1521 runs
hyperfine "perl test.pl" --warmup 5
Benchmark #1: perl test.pl
Time (mean ± σ): 1.3 ms ± 0.3 ms [User: 1.3 ms, System: 0.3 ms]
Range (min … max): 1.0 ms … 5.3 ms 1103 runs
hyperfine "python3.7 test.py" --warmup 5
Benchmark #1: python3.7 test.py
Time (mean ± σ): 19.5 ms ± 0.9 ms [User: 16.2 ms, System: 3.4 ms]
Range (min … max): 18.3 ms … 23.7 ms 144 runs
hyperfine "ruby test.rb" --warmup 5
Benchmark #1: ruby test.rb
Time (mean ± σ): 55.2 ms ± 2.2 ms [User: 47.2 ms, System: 8.1 ms]
Range (min … max): 52.2 ms … 61.9 ms 51 runs
hyperfine "node test.js" --warmup 5
Benchmark #1: node test.js
Time (mean ± σ): 55.4 ms ± 1.8 ms [User: 49.5 ms, System: 7.0 ms]
Range (min … max): 53.0 ms … 60.0 ms 53 runs
</code></pre>
Now, my machine is not set up for a clean benchmark. The disk cache is warmed up. Hyperthreading is on. Other software is running.<p>Plus the scripts all found a slightly different number of files :) I suspect they all treat symlinks/dotted dirs differently, and I didn't take the time to normalize, although I don't think this accounts for the difference.<p>Still, the result is a bit interesting. The non-JS tests are not using any 3rd party libs. Ruby and Node seem to have the same cost for VM startup.<p>Frankly, I'm quite surprised that node is last, especially with such heavily optimized code. I expected V8 code to be blazing fast, as it's made by Google and written in C++.
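<p>To separate VM startup from the crawl itself, one rough option is to subtract each runtime's hello-world mean from its crawl mean. The sketch below uses only the means reported above; it ignores the variance and the differing file counts, so treat the result as an estimate rather than a measurement.

```javascript
// Mean wall-clock times in ms, copied from the hyperfine runs above.
const crawlMs = { bash: 7.5, perl: 25.6, python: 43.4, ruby: 66.5, node: 83.7 };
const helloMs = { bash: 1.2, perl: 1.3, python: 19.5, ruby: 55.2, node: 55.4 };

// Rough estimate of the crawl cost alone: crawl mean minus startup mean.
const netMs = {};
for (const runtime of Object.keys(crawlMs)) {
  netMs[runtime] = (crawlMs[runtime] - helloMs[runtime]).toFixed(1);
}

console.log(netMs);
```

By this estimate the crawl itself costs about 6.3 ms in bash, 24.3 ms in perl, 23.9 ms in python, 11.3 ms in ruby and 28.3 ms in node. Node is still last, but most of its gap to perl and python comes from startup.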