Ideas on speeding up no-op scenario #261

KAction · 2023-12-10T06:33:46Z

This is output of treefmt run on clean checkout of my $dayjob repository:

>> treefmt

traversed 3563 files
matched 1102 files to formatters
left with 0 files after cache
of whom 0 files were re-formatted
all of this in 357ms

And I was wondering, why it takes so slow to do nothing with less than 4k
files. Instead of doing proper profiling, I just run strace treefmt and
counted the syscalls. And here what I found:

For every formatter treefmt runs a bunch of statx(2) in attempt to
detect if formatter changed. Such detection takes time and is inevitably
prone to both false-positives and false-negatives.

I suggest skipping it and leaving it to the user to bust the cache when he
upgrades his toolchain. That would save N * M system calls, where N is
amount of formatters and M is amount of entries in your $PATH.

link to strace snippet

treefmt executes statx(2) on each file that is matched by at least one
rule. From content of eval-cache I infer that file is considered
unmodified if it has the same mtime and size.

This is known pain point of build systems, and there is no kernel interface
to make it any more efficient. Luckily, there is userspace daemon
watchman that can more efficiently
answer the question "what files were changed".

For example, in my case it takes watchman only 20ms to tell me what files
were changed (I intentionally changed one file):

 $ time watchman since . c:1702188097:2344484:1:60 '*.nix'
 {
     "is_fresh_instance": false,
     "files": [
         {
             "cclock": "c:1702188097:2344484:1:2",
             "new": false,
             "ctime": 1702188990,
             "dev": 66306,
             "mtime": 1702188990,
             "uid": 1000,
             "mode": 33188,
             "size": 380,
             "oclock": "c:1702188097:2344484:1:71",
             "nlink": 1,
             "ino": 27656322,
             "gid": 100,
             "exists": true,
             "name": "shell.nix"
         }
     ],
     "version": "2023.08.14.00",
     "clock": "c:1702188097:2344484:1:85",
     "debug": {
         "cookie_files": [
             "/home/kaction/devel/dayjob/backend/package/.watchman-cookie-despise-2344484-38"
         ]
     }
 }

 ________________________________________________________
 Executed in   20.58 millis    fish           external
    usr time    6.25 millis    0.00 micros    6.25 millis
    sys time    2.04 millis  792.00 micros    1.25 millis

Not great, but still 15x improvement. Probably can be improved further by
skipping watchman command line tool and talking Unix socket interface
directly.

There seems to be Rust watchman client library already

This is just a gut feeling, not backed by counting of syscalls. treefmt
uses toml format to store the cache. The toml format is A config file format for humans. Some suitable binary format, like sqlite3, gdbm or cdb
should probably be more efficient.

Thoughts?

The text was updated successfully, but these errors were encountered:

adrian-gierakowski · 2023-12-10T09:03:24Z

I suggest skipping it and leaving it to the user to bust the cache when he
upgrades his toolchain.

This would be a major pain for people who frequently upgrade their nixpkgs. If anything, this should be opt-in

zimbatm · 2023-12-10T12:12:03Z

There are a lot of different points but overall I'm open to changing the algorithms. The current approach is designed as a local maxima between speed, accuracy and code complexity so that I could get something out. I fully expect to have to re-visit this for users with larger repos. The main thing that I care about is that we maintain the current user experience where the user doesn't have to think if they have to bust the cache or not. The whole premise of this project is to relieve the user from cognitive overhead.

Replacing the cache file format is a no-brainer. The only reason we are using this format is that we were already using the TOML dependency. I'm not opposed to using something more efficient. But it's also not a priority as it's not the main driver for slowdowns.

Regarding the change tracking, mtime isn't fully accurate, but humans don't change files at high frequency so it's generally OK. The accurate method would be to track the checksum of each file, and that's much slower. We could introduce a daemon, but that would increase the code complexity and add new failure scenarios. I know there has been a lot of research going into speeding up git status in large repos, so it would be good to look there first (actually it would be interesting to see the speed comparison with git status).

zimbatm · 2023-12-10T12:31:22Z

Git has a fsmonitor. That's a similar idea than what you mentioned.

Before going there, there are some internal optimisations to do. On nixpkgs:

$ git ls-files | wc -l
38435
$ time git status >/dev/null
real	0m0.233s
user	0m0.079s
sys	0m0.273s
$ cat treefmt.toml 
[formatter.echo]
command = "echo"
includes = [ "*.*" ]
$ time treefmt 
0 files changed in 1s (found 38389, matched 38389, cache misses 0)

real	0m1.318s
user	0m0.878s
sys	0m0.442s

KAction · 2023-12-17T19:14:43Z

I did some experiments with cache format, and results are not as straightforward as I expected. Loading msgpack instead of toml for my case is 2ms faster, but writing msgpack is 1.5ms slower.

Anecodically, for project having 4k files and 7 different formatters, this change can reduce time it takes for treefmt(1) to figure out that it has nothing to do from ~350ms down to ~8m. If "WATCHMAN_SOCK" environment variable is not set and "watchman(1)" command are not available, "treefmt" transparently falls back on walking the file system. Setting the environment variable is highly recommended, since is avoids subprocess invocation. Ref: numtide#261

brianmcgee · 2024-08-19T15:50:42Z

V2 has a new concept of walkers, with default performance greatly improved:

# nixpkgs with no cache
traversed 42264 files
emitted 42264 files for processing
formatted 35462 files (623 changed) in 19.7s

# nixpkgs with hot cache
traversed 42264 files
emitted 0 files for processing
formatted 0 files (0 changed) in 283ms

With the new walker pattern it should be straightforward to have an implementation for watchman.

KAction added the enhancement New feature or request label Dec 10, 2023

KAction changed the title ~~Idea on speeding up no-op scenario~~ Ideas on speeding up no-op scenario Dec 10, 2023

KAction mentioned this issue Dec 17, 2023

feat: Add support for using watchman(1) instead of walking filesystem #263

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ideas on speeding up no-op scenario #261

Ideas on speeding up no-op scenario #261

KAction commented Dec 10, 2023

adrian-gierakowski commented Dec 10, 2023

zimbatm commented Dec 10, 2023

zimbatm commented Dec 10, 2023

KAction commented Dec 17, 2023

brianmcgee commented Aug 19, 2024

Ideas on speeding up no-op scenario #261

Ideas on speeding up no-op scenario #261

Comments

KAction commented Dec 10, 2023

adrian-gierakowski commented Dec 10, 2023

zimbatm commented Dec 10, 2023

zimbatm commented Dec 10, 2023

KAction commented Dec 17, 2023

brianmcgee commented Aug 19, 2024