Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path map command lines lazily during expansion #25368

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Feb 24, 2025

This removes a peak memory inefficiency of path mapping compared to regular execution.

@fmeum fmeum force-pushed the lazy-path-mapping branch from ae62071 to f905c9a Compare February 24, 2025 19:46
@fmeum fmeum force-pushed the lazy-path-mapping branch from 3722f1f to 10d042e Compare February 24, 2025 21:58
@fmeum fmeum requested a review from justinhorvitz February 24, 2025 22:00
@fmeum fmeum marked this pull request as ready for review February 24, 2025 22:00
@fmeum fmeum requested a review from a team as a code owner February 24, 2025 22:00
@fmeum fmeum requested review from aranguyen and removed request for a team February 24, 2025 22:00
@github-actions github-actions bot added team-Performance Issues for Performance teams team-Configurability platforms, toolchains, cquery, select(), config transitions awaiting-review PR is awaiting review from an assigned reviewer labels Feb 24, 2025
@fmeum fmeum requested a review from gregestren February 25, 2025 18:47
@justinhorvitz
Copy link
Contributor

@gregestren @aranguyen are there still plans to enable path mapping in google? Trying to decide how closely I should look at this PR.

@gregestren
Copy link
Contributor

@gregestren @aranguyen are there still plans to enable path mapping in google? Trying to decide how closely I should look at this PR.

A measurement team just added it as a candidate to estimate value. If that plays out with positive results, then yes. See b/399115992.

@justinhorvitz
Copy link
Contributor

@fmeum what is the bazel motivation for this change? It's a lot of complexity for a feature that we're not sure whether we are going to enable for blaze. I'm leaning toward deferring the review until after we make a go/no-go decision.

@fmeum
Copy link
Collaborator Author

fmeum commented Mar 4, 2025

@justinhorvitz While path mapping hasn't been enabled inside Google yet, it's grown pretty popular (for an experimental feature) in Bazel, especially since I added support for C++ actions. #22658 has a number of user stories.

The concrete motivation for this PR is that I had a large scale user reach out to me about path mapping. They encountered OOMs with path mapping enabled that weren't showing up without it. As far as I can tell, this PR should resolve the last remaining memory inefficiency of path mapping, which should help them adopt it.

If you encounter any issues in this part of the code in the future, I'm always open to being assigned to them and preparing fixes that you would then only need to review.

Let me know if you need further information on the state of path mapping in Bazel.

@gregestren
Copy link
Contributor

I agree this feature has already proven value for Bazel and is getting more popular. I can bear more weight for that part of the review but I don't think we should discourage Bazel adoption.

@justinhorvitz
Copy link
Contributor

Thanks for the context. I'll look for time to review this soon.

Copy link
Contributor

@justinhorvitz justinhorvitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending what I have before I leave for the day. Didn't quite get through all of StarlarkCustomCommandLine yet.

/**
* Returns the length of the mapped path minus the length of the unmapped path.
*
* <p>Implementations should provide a more efficient implementation that avoids allocations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything that uses this default impl? Maybe just test implementations?

*
* <p>Implementations should provide a more efficient implementation that avoids allocations.
*/
default int computeExecPathLengthDiff(Artifact.DerivedArtifact artifact) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import DerivedArtifact instead of qualifying it. A qualifier is not needed if the inner type is unambiguous.

stringValues.set(addIndex++, val);
Object /* String | DerivedArtifact */ val = values.get(i);
// If the path mapper is a no-op, an artifact behaves just like its (trivially mapped)
// exec path string.If the path mapper is not a no-op, mapped paths are always distinct
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing space before "If"

// exec path string.If the path mapper is not a no-op, mapped paths are always distinct
// from unmapped paths. We can thus uniquify based on the mapped exec path string in each
// case.
String valToDeduplicateBy =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why deduplicate based on mapped strings instead of unmapped exec paths? Are you just trying to preserve existing behavior, or is it ever expected that two unique artifacts map to the same path (or even weirder, an artifact maps to a path that matches a raw string argument)?

@@ -1423,8 +1432,13 @@ void addPreprocessedArg(PreprocessedArg arg) {
numArgs += arg.numArgs();
}

void addString(String arg) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you keep this method? Some callers still have a String reference, for which it helps performance to avoid the type check, and enhances readability that it's definitely a string.

String stringValue =
StarlarkCustomCommandLine.expandToCommandLine(object, pathMapper, mainRepoMapping);
builder.addPreprocessedArg(new PreprocessedSingleFormattedArg(formatStr, stringValue));
switch (StarlarkCustomCommandLine.expandToCommandLine(object, mainRepoMapping)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary qualifier to the enclosing class.

@@ -105,6 +105,16 @@ default String getMappedExecPathString(ActionInput artifact) {
return map(artifact.getExecPath()).getPathString();
}

/**
* Returns the length of the mapped path minus the length of the unmapped path.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got confused - this is actually supposed to return a negative number when the mapped path is shorter (what we expect). I think it makes more sense to deal with positive numbers. What do you think about making this return |unmapped| - |mapped| instead? Either way, please make it super clear in the documentation.

@@ -145,7 +145,9 @@ static Optional<PathMapper> tryCreate(AbstractAction action, boolean isStarlarkA

@Override
public String getMappedExecPathString(ActionInput artifact) {
if (isSupportedInputType(artifact) && isOutputPath(artifact, outputRoot)) {
if (artifact instanceof DerivedArtifact
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this easier to read with the helper method. Perhaps keep an isSupported(artifact) that does both the type check and calls isOutputPath.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review PR is awaiting review from an assigned reviewer team-Configurability platforms, toolchains, cquery, select(), config transitions team-Performance Issues for Performance teams
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants