Use a union source accessor to put chroot stores in the logical location #12512

Merged on Feb 20, 2025 (10 commits)

@edolstra (Member) commented Feb 18, 2025

Motivation

When using a chroot store, paths that are in the chroot store (such as those resulting from IFD or similar) should be represented in the evaluator using their logical path (e.g. /nix/store/foo), not their physical path (e.g. /tmp/chroot/nix/store/foo). This PR does that - it removes all uses of toRealPath() in the evaluator. To make chroot stores still work, the physical store path is "mounted" onto the logical store location of rootFS. And to handle the case where we also need to access a file in the real /nix/store, we use a union source accessor that provides a union view of the two stores.
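
To illustrate the mount-plus-union idea described above, here is a minimal, self-contained sketch. It is not the PR's code: `ToyAccessor`, `MapAccessor`, `MountedAccessor`, `UnionAccessor`, and `makeExampleUnion` are hypothetical stand-ins invented for this example, and the store paths are made up. The sketch assumes that the first layer containing a path wins.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a source accessor (hypothetical, not nix::SourceAccessor).
struct ToyAccessor {
    virtual ~ToyAccessor() = default;
    // Returns the file contents, or nullopt if the path does not exist here.
    virtual std::optional<std::string> read(const std::string & path) const = 0;
};

// A "filesystem" backed by an in-memory map from absolute path to contents.
struct MapAccessor : ToyAccessor {
    std::map<std::string, std::string> files;
    explicit MapAccessor(std::map<std::string, std::string> f) : files(std::move(f)) {}
    std::optional<std::string> read(const std::string & path) const override {
        auto it = files.find(path);
        if (it == files.end()) return std::nullopt;
        return it->second;
    }
};

// Exposes the subtree `physicalDir` of `underlying` at the location `logicalDir`.
struct MountedAccessor : ToyAccessor {
    std::string logicalDir, physicalDir;
    std::shared_ptr<ToyAccessor> underlying;
    MountedAccessor(std::string logical, std::string physical, std::shared_ptr<ToyAccessor> u)
        : logicalDir(std::move(logical)), physicalDir(std::move(physical)), underlying(std::move(u)) {}
    std::optional<std::string> read(const std::string & path) const override {
        const std::string prefix = logicalDir + "/";
        // Paths outside the mount point do not exist in this accessor.
        if (path.compare(0, prefix.size(), prefix) != 0) return std::nullopt;
        // Rewrite the logical prefix to the physical one.
        return underlying->read(physicalDir + path.substr(logicalDir.size()));
    }
};

// Tries each layer in order; the first layer that has the path wins.
struct UnionAccessor : ToyAccessor {
    std::vector<std::shared_ptr<ToyAccessor>> layers;
    explicit UnionAccessor(std::vector<std::shared_ptr<ToyAccessor>> l) : layers(std::move(l)) {}
    std::optional<std::string> read(const std::string & path) const override {
        for (const auto & layer : layers)
            if (auto contents = layer->read(path)) return contents;
        return std::nullopt;
    }
};

// Usage: the host filesystem unioned with the chroot store mounted at /nix/store.
inline UnionAccessor makeExampleUnion() {
    auto host = std::make_shared<MapAccessor>(std::map<std::string, std::string>{
        {"/nix/store/aaa-default.nix", "host file"},
        {"/tmp/chroot/nix/store/bbb-ifd", "chroot file"}});
    auto storeFS = std::make_shared<MountedAccessor>("/nix/store", "/tmp/chroot/nix/store", host);
    std::vector<std::shared_ptr<ToyAccessor>> layers{host, storeFS};
    return UnionAccessor(std::move(layers));
}
```

With this toy setup, a path that only exists in the chroot store resolves under its logical name (`/nix/store/bbb-ifd`), while a file in the real `/nix/store` stays reachable through the first layer.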

Fixes #11503.

@github-actions github-actions bot added the fetching Networking with the outside (non-Nix) world, input locking label Feb 18, 2025
@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label Feb 18, 2025
@infinisil (Member)

I believe this does fix the problem! Though when I tried it out, the installer test failed with a different issue:

$ nixos-rebuild switch
error: path '/nix/var/nix/profiles/per-user/root/channels' is a symlink
full error
subtest: Check whether nixos-rebuild works
target: must succeed: nixos-rebuild switch >&2
target # building Nix...
target # building the system configuration...
target # error:
target #        … while calling the 'head' builtin
target #          at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/attrsets.nix:1:35738:
target #        … while evaluating the attribute 'value'
target #          at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/modules.nix:1:35221:
target #        … while evaluating the option `system.build.toplevel':
target # 
target #        … while evaluating definitions from `/nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/system/activation/top-level.nix':
target # 
target #        … while evaluating the option `system.systemBuilderArgs':
target # 
target #        … while evaluating definitions from `/nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/system/activation/activatable-system.nix':
target # 
target #        … while evaluating the option `system.activationScripts.etc.text':
target # 
target #        … while evaluating definitions from `/nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/system/etc/etc-activation.nix':
target # 
target #        … while evaluating definitions from `/nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/system/etc/etc.nix':
target # 
target #        … while evaluating the option `environment.etc.dbus-1.source':
target # 
target #        (stack trace truncated; use '--show-trace' to show the full, detailed trace)
target # 
target #        error: path '/nix/var/nix/profiles/per-user/root/channels' is a symlink
target: output: 
Test "Check whether nixos-rebuild works" failed with error: "command `nixos-rebuild switch >&2` failed (exit code 1)"
cleanup
kill machine (pid 91)
qemu-system-x86_64: terminating on signal 15 from pid 8 (/nix/store/0l539chjmcq5kdd43j6dgdjky4sjl7hl-python3-3.12.8/bin/python3.12)
kill vlan (pid 9)
(finished: cleanup, in 0.03 seconds)
error: builder for '/nix/store/skx3rv9y9mk98r8a2r42lwr0hlcn5nks-vm-test-run-installer-simpleUefiSystemdBoot.drv' failed with exit code 1;

To test this I used

nix-build -A nixosTests.installer.simpleUefiSystemdBoot

with the following (non-reproducible) diff (which includes changing the Nix version for the test, and reverts NixOS/nixpkgs#369459) on a recent Nixpkgs version:

diff --git a/nixos/tests/installer.nix b/nixos/tests/installer.nix
index 6be3346d9850..4b4e12ccd73b 100644
--- a/nixos/tests/installer.nix
+++ b/nixos/tests/installer.nix
@@ -761,6 +761,7 @@ let
               ++ optionals clevisTest [ pkgs.klibc ]
               ++ optional systemdStage1 pkgs.chroot-realpath;
 
+            nix.package = (builtins.getFlake "github:DeterminateSystems/nix-src/store-fs").outputs.packages.x86_64-linux.default;
             nix.settings = {
               substituters = mkForce [ ];
               hashed-mirrors = null;
diff --git a/pkgs/by-name/ni/nixos-firewall-tool/package.nix b/pkgs/by-name/ni/nixos-firewall-tool/package.nix
index b928487c5277..a5493d495876 100644
--- a/pkgs/by-name/ni/nixos-firewall-tool/package.nix
+++ b/pkgs/by-name/ni/nixos-firewall-tool/package.nix
@@ -6,10 +6,13 @@
   shellcheck-minimal,
 }:
 
-stdenvNoCC.mkDerivation {
+stdenvNoCC.mkDerivation rec {
   name = "nixos-firewall-tool";
 
-  src = builtins.filterSource (name: _: !(lib.hasSuffix ".nix" name)) ./.;
+  src = lib.fileset.toSource {
+    root = ./.;
+    fileset = lib.fileset.fileFilter (file: !file.hasExt "nix") ./.;
+  };
 
   strictDeps = true;
   buildInputs = [ bash ];

@edolstra (Member Author)

The tests do expose a problem with rootFS and storeFS not being equal, e.g.

error: lib.fileset.intersect: Expected file sets to have the same filesystem root, but first argument has root "/" while second argument has root "/".

and

error: lib.path.hasStorePathPrefix: Argument has a filesystem root (/) that's not /, which is currently not supported.

I suppose we can add a hack to make the root of storeFS equal to the root of rootFS...

@roberth (Member) commented Feb 19, 2025

An alternate solution is to mount the logical store as an overlay-mounted onto rootFS.
That solves the equality problem (the combined overlay accessor is the only evaluator-facing fs) and it might also solve problems that aren't solved by the string context-based eager decision.
We don't have overlay-mounting logic yet, but we'd only need overlayability for one directory, and it can be shallow, if that helps. Or something that simply only works for store mounts.

@edolstra (Member Author)

An alternate solution is to mount the logical store as an overlay-mounted onto rootFS.

That wouldn't work for the case where we're accessing files in both the physical and logical Nix store. For instance, the test case in #11503 is evaluating a default.nix in the real /nix/store, while writing/accessing stuff from the chroot store.

@roberth (Member) commented Feb 19, 2025

It would have to be an overlay mount, not a normal mount. That way you get a store path from either of the underlying file systems. I don't think we need to care about which one, as long as it can find a path that exists.

I suppose it might resolve a store path to the wrong fs, if it exists in both, which might be weird if it's an input addressed path with an impurity, or when it picks the one on a worse file system (e.g. case insensitive), but besides those corner cases, it seems like it'd be pretty good at picking a path that exists.

evaluating a file from the physical /nix/store while
using a chroot store. */
auto realStoreDir = dirOf(store->toRealPath(StorePath::dummy));
if (store->storeDir != realStoreDir) {
Member

I think we'll want this always

Suggested change
- if (store->storeDir != realStoreDir) {

If we're not confident, I'd want a flag to make it unconditional so the team and any other volunteers can dogfood it.

@edolstra (Member Author), Feb 19, 2025

The test suite succeeds with this enabled unconditionally, but for performance I don't think we should always enable it. For instance, every maybeLstat() call will now result in two maybeLstat() calls to the underlying accessors if the file doesn't exist. Though since PosixSourceAccessor has some caching it's probably not too bad.
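
The cost being described can be made concrete with a small counting sketch. This is an illustration under stated assumptions, not Nix code: `CountingLayer` and `unionExists` are hypothetical names, `exists()` stands in for an accessor's `maybeLstat()`, and no caching is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy layer that records how many existence checks it receives
// (a hypothetical stand-in for an accessor's maybeLstat()).
struct CountingLayer {
    std::set<std::string> paths;
    mutable std::size_t lookups = 0;
    explicit CountingLayer(std::set<std::string> p) : paths(std::move(p)) {}
    bool exists(const std::string & path) const {
        ++lookups;
        return paths.count(path) > 0;
    }
};

// Union lookup: ask each layer in order, stopping at the first hit.
inline bool unionExists(const std::vector<const CountingLayer *> & layers, const std::string & path) {
    for (const CountingLayer * layer : layers)
        if (layer->exists(path)) return true;
    return false;
}
```

In this model a hit in the first layer costs one lookup, whereas a path that exists nowhere costs one lookup per layer, which is the "two calls on a miss" behaviour mentioned above.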

Member

I haven't read the code yet, but "double lstat" sounds like the wrong solution to me. We need to stop assuming the store is accessible within rootFS at all if we want both good performance and hygiene.

Comment on lines 260 to 266
    auto storeFS = makeMountedSourceAccessor(
        {
            {CanonPath::root, makeEmptySourceAccessor()},
            {CanonPath(store->storeDir), makeFSSourceAccessor(realStoreDir)}
        });
    accessor = makeUnionSourceAccessor({accessor, storeFS});
}
@Ericson2314 (Member), Feb 19, 2025

Can we use Store::getFSAccessor for this? Also, I want Store::getFSAccessor to be rooted in the store dir. I don't like all these "union in place" things that both this and Store::getFSAccessor do. I want to do CanonPath(storePath.to_string()) and have that just work with storeFS (note that the canon path will be e.g. w4l4xvw461ywc4ia3accj5i3hh50n4r8-nix-2.24.1 in this case).
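
A rough sketch of what "rooted in the store dir" could mean, purely as an illustration: `StoreRootedAccessor` and `physicalPath` are hypothetical names and do not correspond to any existing Nix API. The accessor takes a store-relative path (the store path's base name plus an optional sub-path) and never sees the logical `/nix/store` prefix at all.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical accessor whose root *is* the store directory, so callers
// pass "<hash>-<name>[/sub/path]" rather than "/nix/store/<hash>-<name>...".
struct StoreRootedAccessor {
    std::string realStoreDir; // physical location, e.g. inside a chroot
    explicit StoreRootedAccessor(std::string dir) : realStoreDir(std::move(dir)) {}

    // Maps a store-relative path to the physical path that would be opened.
    std::string physicalPath(const std::string & storeRelative) const {
        return realStoreDir + "/" + storeRelative;
    }
};
```

Under this design the logical store location never appears in a lookup, so nothing has to be mounted or unioned in place to reach a store object.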

Member

CC @puffnfresh who I have been talking to about this same stuff

@edolstra (Member Author) commented Feb 19, 2025

@infinisil I think that error is coming from Nix 2.24.12, since that's what the installed NixOS system in the VM has.

It's no longer needed now that all store paths inside the evaluator
are logical rather than real paths.
@github-actions github-actions bot added the new-cli Relating to the "nix" command label Feb 19, 2025
@edolstra edolstra changed the title Add a storeFS accessor for paths resulting from IFD Use a union source accessor to put chroot stores in the logical location Feb 19, 2025
Note that in pure mode, we don't need to use the union FS even when
using a chroot store, since the user shouldn't have access to the
physical /nix/store.
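
The resulting choice of evaluator filesystem can be summarised as a small decision function. This is a sketch of one reading of the discussion, not the merged code: `EvalFS` and `chooseEvalFS` are hypothetical names, and the assumption that pure mode sees only the store accessor is an interpretation of the notes in this thread.

```cpp
#include <cassert>
#include <string>

// Which view of the filesystem the evaluator gets (hypothetical model).
enum class EvalFS { RootOnly, UnionOfRootAndStore, StoreOnly };

inline EvalFS chooseEvalFS(bool pureEval, const std::string & storeDir, const std::string & realStoreDir) {
    // Assumption: pure evaluation has no access to the physical root
    // filesystem, so no union is needed even for a chroot store.
    if (pureEval) return EvalFS::StoreOnly;
    // Impure evaluation with a chroot store: logical and physical store
    // locations differ, so both views must be visible at once.
    if (storeDir != realStoreDir) return EvalFS::UnionOfRootAndStore;
    // Ordinary store: the root filesystem already contains the store.
    return EvalFS::RootOnly;
}
```

For example, `chooseEvalFS(false, "/nix/store", "/tmp/chroot/nix/store")` selects the union in this model, while the same store in pure mode does not.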
@Ericson2314 (Member)

Nix Meeting #214 notes:

  • Long intricate discussion.

  • Conclusion: @edolstra will try to avoid the UnionSourceAccessor in pure mode


Some thoughts from me, paraphrasing the long discussion:

I like this path a lot. Here is my thinking:

Background: I like designs which are more openfd / capability-flavored, and I don't like global ambient authority with union/overlay and bind/shadowing mounts --- I think the mounting tricks are a poor substitute for just not having the ambient authority at all.

I think that whatever we do for this bug fix would be tantamount to an instant-stable experimental feature. So I was concerned about having a complex mounting-based fix in all code paths. This would be more design debt in the name of back-compat, getting us further away from a nice capability-based design.

At the same time, @edolstra convinced me that not doing the union mechanism was unworkable breakage in the general case. The issue is I had forgotten about coerceToPath --- in other words, I had forgotten that there are, thanks to coercions, many more ways to create path values than path literals (and, someday, builtins.fetchTree). This means that, like it or not, there are tons of ways to get in-store and in-root-filesystem paths today, and those path values need to stay comparable.

However, in the pure-eval case, we don't do arbitrary file access. So while we still have the situation of many path values within the store, we only have that situation --- not arbitrary root-FS paths. That means that --- for somewhat opposite reasons than I initially realized --- we can without breakage avoid a union FS in the pure-eval case.

This makes me quite happy that we can achieve all goals:

  1. Impure eval doesn't break. Full stop.

  2. Impure eval works better with --store, and to the extent I don't like the solution, it's impure eval's fault, not --store's fault.

  3. Pure eval doesn't get sullied by any of this

  4. Pure eval is, in implementation, in fact better than before. (We could have previously outright denied it root fs access, but we hadn't done so. Now we have.) Not only is this not more tech debt in the name of back-compat (as I feared), it is less.

  5. While pure eval is not all the way to a full capability-based workflow (e.g. carefully crafted absolute path literals in conjunction with tricks that exploit the fact that store path white-listing is an effect are possible), it is taking a significant step in that direction.

So in conclusion, hooray! I was quite nervous about all this, and now I feel quite the opposite :).

Comment on lines -2488 to -2496
path = {state.rootFS, CanonPath(state.toRealPath(rewriteStrings(path.path.abs(), rewrites), context))};

try {
    auto [storePath, subPath] = state.store->toStorePath(path.path.abs());
    // FIXME: we should scanForReferences on the path before adding it
    refs = state.store->queryPathInfo(storePath)->references;
    path = {state.rootFS, CanonPath(state.store->toRealPath(storePath) + subPath)};
} catch (Error &) { // FIXME: should be InvalidPathError
}
Member

Is the idea that refs is unused? If so, let's also delete the variable above? If not... what's going on?

Member

This is a regression in....ea38605?

Member Author

Yeah this appears to be dead code. I think the idea was to propagate the references, but that "feature" was never really well thought out. Maybe we should warn/fail if the path has references...

Member

We can investigate what happened here later. IIRC there were some regressions with this too.

@Ericson2314 (Member) left a comment

Looks good modulo the one question

(all but approval :))

@Ericson2314 Ericson2314 merged commit 782c63f into NixOS:master Feb 20, 2025
12 checks passed
@edolstra edolstra deleted the store-fs branch February 20, 2025 00:54
@ElvishJerricco (Contributor)

@infinisil btw I tested this with the nixpkgs installer test using these changes: NixOS/nixpkgs@master...ElvishJerricco:nixpkgs:push-uysonulrlluq

I reverted the nixos-firewall-tool fileset change and changed the revision of nix 2.26 to this merge revision, but I also had to add the nix package's buildInputs to the installer's extraDependencies. Because the nix_2_26 attr itself is only a buildEnv, it has preferLocalBuild = true;, which means the installation target needs to build the buildEnv itself, including building / substituting all its buildInputs. Thanks to the checkInputs in the overrideAttrs on the buildEnv in the nix-everything expression, that includes some test derivations that have no output files and therefore are not linked in the buildEnv output and therefore not in the installer. Since the empty outputs are considered buildInputs to the nix_2_26 derivation, these test derivations need to be built on the target, which is not possible because the inputs necessary to build them are not in the installer.

That's why you got that error. Adding nix_2_26.buildInputs to extraDependencies in order to have the empty test outputs in the installer fixes it. And indeed this PR has resolved the issue with filesets breaking installer tests.

Labels: fetching, new-cli, with-tests

Linked issue (may be closed by merging this pull request): builtins.path and builtins.filterSource use wrong filter paths with --store argument

5 participants