Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support target based fix rule without file partitioning #21935

Open
chris-smith-zocdoc opened this issue Feb 8, 2025 · 5 comments
Open

Support target based fix rule without file partitioning #21935

chris-smith-zocdoc opened this issue Feb 8, 2025 · 5 comments

Comments

@chris-smith-zocdoc
Copy link
Contributor

Is your feature request related to a problem? Please describe.
I don't believe its currently possible to write a fix rule (or at least I can't figure it out) that operates on the complete set of files in a target.

FixTargetsRequest exists, but is subject to file partitioning because of this line

yield from cls.partitioner_type.default_rules(cls, by_file=True)

For my plugin's use case, we cannot partition the target by file, the fix rule needs to operate on all files in the target.

Describe the solution you'd like
FixTargetsRequest should by able to opt out of file based partitioning.

Describe alternatives you've considered
We've currently have a work around to operate on all files in the fix rule but only return the subset that was requested for by the request. This means the rule is running more than once (in multiples of batch_size) which is inefficient. So an additional workaround of setting a higher batch_size is also required, which may or may not be desirable

Currently if you try and write a fix request with a custom _get_rules which flips by_file to False cls.partitioner_type.default_rules(cls, by_file=False), an exception is currently thrown by the fix goal.

Traceback (most recent call last):
  File "pants/core/goals/fix.py", line 381, in fix
    return await _do_fix(
  File "pants/core/goals/fix.py", line 305, in _do_fix
    partitions_by_request_type = await _get_partitions_by_request_type(
  File "pants/core/goals/lint.py", line 389, in _get_partitions_by_request_type
    all_partitions = await MultiGet(
  File "pants/engine/internals/selectors.py", line 376, in MultiGet
    return await _MultiGet(tuple(__arg0))
  File "pants/core/goals/lint.py", line 390, in <genexpr>
    partition_request_get(request_type) for request_type in filtered_core_request_types
  File "pants/core/goals/lint.py", line 383, in partition_request_get
    assert partition_request_type in file_partitioners
AssertionError

Additional context

Somewhat related to this pylint issue #21686

Conversation where I was figuring out how to implement FixTargetsRequest https://pantsbuild.slack.com/archives/C01CQHVDMMW/p1718312447982689

@benjyw
Copy link
Contributor

benjyw commented Feb 8, 2025

The "workaround", such as it is, is to not use FixTargetsRequest at all and just write your plugin using lower level abstractions... Not very convenient, but possible. Probably allowing some control over batching (including no batching at all) makes sense.

But there are some important design questions to tease out here. For example, how does this interact with change tracking? If you use --changed-since does the fixer run on just the changed files, or must it run on all the files in the repo? Is the problem here that the fixer is inefficient if run multiple times, or that it's actually incorrect for it not to see all the possible files in a single pass?

Could you describe the nature of this fixer in more detail?

@chris-smith-zocdoc
Copy link
Contributor Author

The "workaround", such as it is, is to not use FixTargetsRequest at all and just write your plugin using lower level abstractions

Is it possible for me to attach the rule to the fix and lint goals using this approach? I'd still like my fixer to be run during pants lint :: or pants fix ::

If you use --changed-since does the fixer run on just the changed files, or must it run on all the files in the repo?

It would run on the target that has the file as a source. Dotnet is a compiled language and the structure of our plugin is that each target maps to a dotnet project, the output of which is an assembly (which can be referenced by other projects/targets)

Is the problem here that the fixer is inefficient if run multiple times, or that it's actually incorrect for it not to see all the possible files in a single pass?

Its incorrect to not see all the files that are part of the target. Our plugin works around this by ignoring what pants partitions and locates the full set.

Could you describe the nature of this fixer in more detail?

The tool in question is dotnet format

I'm not an expert in this, but I can give my rough understanding from the few analyzers I've written. The roslyn analyzers are compiler plugins that run during the compilation process (so full source for the project must be in the sandbox), and the fixers mutate the AST to produce an updated source file. When we're running the fixers with pants, we're running in "batch mode", which is "fix everything that is broken". So the single dotnet format invocation can update many .cs source files.

@benjyw
Copy link
Contributor

benjyw commented Feb 9, 2025

Gotcha. So the fine-grained per-file-oriented idioms of standard Pants aren't a good fit here.

I suspect that you'll need to use lower-level plugin concepts, because the FixTargetRequests (which you can think of us just a convenience wrapper around those lower-level concepts) makes use of the assumption that it can look at individual files.

What is the target type you're, er, targeting? Is it a standard target type or a custom one in your plugin? Note that something like java_sources is a target generator that generates a bunch of underlying java_source targets, one per file. In that case information about the original target is lost. But in your case that information is critical, so I assume this is a custom target type?

And yes, you should be able to plug a more raw plugin implementation into the standard fix goal.

@chris-smith-zocdoc
Copy link
Contributor Author

Gotcha. So the fine-grained per-file-oriented idioms of standard Pants aren't a good fit here.

Yeah our plugin isn't advanced enough yet to do automatic dependency inference at the file level, and certain language features (runtime reflection) make it significantly more challenging to implement correctly.

The approach our plugin uses is to wrap invocations to MSBuild (dotnet cli) which is itself a build system and package manager, similar to what https://github.com/tgolsson/pants-cargo-porcelain does with Cargo and Rust

What is the target type you're, er, targeting? Is it a standard target type or a custom one in your plugin?

This is a custom target type dotnet_project in our plugin to support dotnet. The target has a sources field which contains all the sources under the current directory, ie **/*.cs. We can have many of these targets in a git repository, but they don't share source files, the trees cannot overlap.

And yes, you should be able to plug a more raw plugin implementation into the standard fix goal.

Thanks, I'll look more into this.

@benjyw
Copy link
Contributor

benjyw commented Feb 9, 2025

Ah, so this should be pretty possible. Maybe start by writing it as a plugin that provides a new goal myfmt or something, and doesn't use the existing formatter machinery, just to see that it all works as expected. Then we can figure out how to wire it in to fmt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants