Compare to native options #66
I considered going native before, but the benefit of a zero-dependency, pure-JavaScript library is far greater: I don't have to deal with platform differences or worry about building the module on all platforms. N-API is a nightmare. Right now, fdir just works and it is fast. Aside from that, I am also not proficient enough in Rust, C, or C++ to write high-performance code. I am sure native would be faster. I would love to see a benchmark, though.

---
Alright, curiosity got the better of me and I benchmarked it. I am no expert in Rust, so this code is probably sub-par:

```rust
#![deny(clippy::all)]

#[macro_use]
extern crate napi_derive;

use napi::{CallContext, JsObject, JsUndefined, Result};
use jwalk::{Parallelism, WalkDir};

#[module_exports]
fn init(mut exports: JsObject) -> Result<()> {
    exports.create_named_method("jwalk", jwalk)?;
    Ok(())
}

#[js_function(1)]
fn jwalk(ctx: CallContext) -> Result<JsUndefined> {
    // let dir = ctx.get::<JsString>(0)?.into_utf8()?;
    let mut entries: Vec<String> = vec![];
    for entry in WalkDir::new("/")
        .parallelism(Parallelism::RayonNewPool(8))
        .skip_hidden(false)
    {
        match entry {
            Ok(entry) => entries.push(entry.path().display().to_string()),
            Err(_err) => {
                println!("failed to access entry");
            }
        }
    }
    ctx.env.get_undefined()
}
```

I tried to be as fair as possible, but my limited knowledge of Rust probably doesn't help. Maybe @Brooooooklyn can take a look and advise some improvements to the Rust code. The benchmark code:

```js
const { fdir } = require("./index.js");
const { jwalk } = require("fdir-bench");

(async () => {
  console.time("fdir");
  for (let i = 0; i < 5; ++i) {
    await new fdir()
      .withBasePath()
      .withDirs()
      .crawl("/")
      .withPromise();
  }
  console.timeEnd("fdir");

  console.time("jwalk");
  for (let i = 0; i < 5; ++i) {
    jwalk("/");
  }
  console.timeEnd("jwalk");
})();
```

Of course, this is very simplistic but it does the job.

Results

Number of entries found: …

Thoughts: Well, …

---
Try this code to avoid re-allocation in Rust:

```rust
let result: Vec<String> = WalkDir::new("/")
    .parallelism(Parallelism::RayonNewPool(8))
    .skip_hidden(false)
    .filter_map(|dir_entry| dir_entry.ok().map(|entry| entry.path().display().to_string()))
    .collect();
```

---
Unfortunately, I am getting this:

```
error[E0599]: the method `filter_map` exists for struct `WalkDirGeneric<((), ())>`, but its trait bounds were not satisfied
   --> src/lib.rs:25:10
    |
25  |         .filter_map(|dir_entry| {
    |          ^^^^^^^^^^ method cannot be called on `WalkDirGeneric<((), ())>` due to unsatisfied trait bounds
    |
   ::: /home/thecodrr/.cargo/registry/src/github.com-1ecc6299db9ec823/jwalk-0.5.1/src/lib.rs:158:1
    |
158 | pub struct WalkDirGeneric<C: ClientState> {
    | ----------------------------------------- doesn't satisfy `WalkDirGeneric<((), ())>: Iterator`
    |
    = note: the following trait bounds were not satisfied:
            `WalkDirGeneric<((), ())>: Iterator`
            which is required by `&mut WalkDirGeneric<((), ())>: Iterator`
```

---
Ah, had to do …

---
Any difference in results? Also might be useful to add a pattern filter such as `**/package.json`.

---
Did some testing of my own.

MacBook Pro 16 - 2.4 GHz 8-Core Intel Core i9

- jwalk - cargo run (unoptimized + debuginfo), 16 threads = 3738ms
- jwalk - cargo run --release, 8 threads = 2363ms
- jwalk - napi (returning results): 817,100 files
- fdir: 773,467 files
- jwalk - cargo run - filter node_modules
- jwalk - cargo run - filter node_modules + only package.json
- fdir - filter node_modules
- fdir - filter node_modules + only package.json

It's actually very impressive how close they are considering one is running natively... for a large number of files. Although when globbing is involved, …

Tried using both …

I also found that using the Rust library …
My use case is to speed up the …

Going to try out …

Could also re-use the glob libs from …

Maybe worth looking at using …

---
@vjpr Can you share your benchmark code? I'd like to take a look and see what is causing that huge bottleneck.

---
After a couple of "optimizations", here are the results:

These results are the average of 5 iterations on the root directory containing 1.6 million files & about 300k directories.

---
@vjpr So I went ahead and did a benchmark with …

And I did the same thing with …

Both returned around the same amount (26k) of results. Now, maybe I did it wrong, but clearly …

Here's the code for the above:

```js
await new fdir()
  .withBasePath()
  .filter((path) => path.indexOf("package.json") > -1)
  .crawl("/")
  .withPromise();
```

One thing I noticed: using …

---
@thecodrr What repo were you using to test? Also just to note, default …

---
I didn't use any repo. I ran it in my root. Yes, that is 1.6 million files & 300k directories, bigger than 99% of repositories, which means more pattern failures. If you run it inside … Which repo did you test on?

Yes, I know. That's what surprised me. No idea why a stack size of 2 gave slightly better perf. The difference is negligible on small directories, though.

---
I'm using a large internal monorepo. We should find a nice, large public Node.js monorepo to use for tests.

So for me, increasing …

What machine are you using? I'm still yet to test out …

---
What do you suggest? React? Jest?

Very interesting.

Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz with 16 GB RAM and of course an SSD (a mediocre one). And I am on Linux.

---
So …

```rust
fn get_overrides(source: String) -> Override {
    let mut override_builder = OverrideBuilder::new(source.clone());
    let exclude_patterns = vec!["**/package.json", "!**/node_modules/**"];
    for pattern in exclude_patterns {
        override_builder
            .add(pattern)
            .map_err(|e| anyhow!("Malformed exclude pattern: {}", e));
    }
    let overrides = override_builder
        .build()
        .map_err(|_| anyhow!("Mismatch in exclude patterns"));
    overrides.unwrap()
}

pub fn walk_parallel(source: String) -> Vec<String> {
    let overrides = get_overrides(source.clone());
    let (tx, rx) = mpsc::channel();
    WalkBuilder::new(source)
        .hidden(false)
        .overrides(overrides)
        .standard_filters(false)
        .threads(cmp::min(12, num_cpus::get()))
        .build_parallel()
        .run(move || {
            let tx = tx.clone();
            Box::new(move |dir_entry_result| {
                if let Ok(dir_entry) = dir_entry_result {
                    tx.send(dir_entry.file_name().to_owned()).unwrap();
                }
                ignore::WalkState::Continue
            })
        });
    let metadatas: Vec<_> = rx.into_iter().map(|dir| dir.into_string().unwrap()).collect();
    // metadatas.sort_by(|a, b| a.len().cmp(&b.len()));
    metadatas
}
```

I'm putting together a repo now.

---
There was some discussion here about macOS APFS being slower: …

Although it could be just the per-directory lock for …

---
That's a lot of effort 😆. I put together a very, very rough parallel directory crawler in Node using …

---
Nice. Cluster seems like the way to go to reach Node.js max perf, although for CLI cold-start use cases it would depend on how long it takes the process pool to spin up, which might negate the benefit. My particular use case is the pnpm package manager. Many commands need to read the entire tree on startup, so if I could get that down to <500ms for large monorepos it would be great. FYI, not sure workers will help.

---
You'll have to give an example of a large monorepo so I can test it. To the best of my knowledge, you can easily traverse a fairly large monorepo in under 500ms. That's excluding symlink resolution; I haven't tested with symlink resolution, but it should be much, much slower.

Yes, but child_process might. I'll have to check and see. I don't expect a huge difference, mostly because libuv has all its threads maxed out most of the time. But it doesn't hurt to try.

This is a real concern. The main bottleneck is …

---
The only potential alternative would be WASI, but the Node.js implementation just uses …

If we don't restrict ourselves to cold start, we could use a persistent daemon that watches the file system for changes. That is a way to get instant speed.

I think that covers the entire solution space for reading dir trees :D

---
My benchmarks:

- robocopy.exe with /log:output.txt - 7s

On a volume with around 1,000,000 files.

---
@PlkMarudny what's your code for fdir, and what are you benchmarking through?

---
I am interested in how much faster it could be using native code, via `child_process` or N-API.

Someone already mentioned Rust's https://github.com/jessegrosjean/jwalk. Some other Rust discussion here. Would be cool to have a `napi-rs` wrapper on it and see the speed diff.

Rust

- "What's the fastest way to read a lot of files?" - `crossbeam` for parallelism and `ignore` for the recursive dir walker.
- `ignore` and a lot of `ripgrep` util crates.
- `ignore` and `walkdir`.

C

- gitstatusd
- the_silver_searcher

JS Glob Libs

- globrex
- micromatch, can also use picomatch

JS Matcher Libs

WASI

- No implementation exists yet.
- Would be good to try with AssemblyScript.