Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow --python-setup-requirement-constraints to work with inline requirements #10723

Closed
Eric-Arellano opened this issue Sep 2, 2020 · 15 comments · Fixed by #11724
Closed

Allow --python-setup-requirement-constraints to work with inline requirements #10723

Eric-Arellano opened this issue Sep 2, 2020 · 15 comments · Fixed by #11724

Comments

@Eric-Arellano
Copy link
Contributor

Eric-Arellano commented Sep 2, 2020

Problem

We need to be able to have constraints files work with constraints declared inline, e.g. through strings in a BUILD file, like this:

_python_constraints(
  constraints=[
     'foo',
     'bar==1.2',
  ],
)

Why? This blocks macros for Pipenv and Poetry working with constraints files #10655. Those need to be able to generate a target "in-memory" for the constraints, rather than writing to disk and saving that file to VCS.

For now, we will continue to support only one lockfile. You should probably only be able to do inline constraints, or do an actual file on disk. It doesn't make sense yet to be able to mix both.

Note that a single _python_constraints target is meant to represent the entire resolve, rather than one single constraint. This will be a key detail when we add multiple lockfiles: users choose the entire lockfile they want, rather than mixing pieces of different ones.

Approach

(Edit history: this originally considered always requiring a python_constraints for the [python-setup] option, rather than allowing for a file still. We allow either to maintain API flexibility because things will likely change soon for multiple lockfiles, and we don't want to break the API multiple times.)

  1. Define a new target type in backend/python/target_types.py through the Target API. See https://www.pantsbuild.org/v2.4/docs/target-api-new-targets.

    • The only fields should be COMMON_TARGET_FIELDS and a new field like Constraints(StringSequence). It doesn't make sense to have Dependencies, for example.
    • Let's name the alias _python_constraints for now, i.e. a private API, so that we can keep flexibility with changing things. For now, this is solely intended for macro consumption.
    • Register it in backend/python/register.py.
  2. Update python_setup.py for --requirement-constraints.

  • We will eventually want to update the help message and metavar, but let's not do that yet until we decide to graduate the target to be public.
  • Update the docstring for the property requirement_constraints to explain it could either be a file path or a target address.

TODO: finish updating step 3

  1. Update pex.py and pex_from_targets.py. This uses the Rules API, so check out https://www.pantsbuild.org/v2.4/docs/rules-api-concepts if you haven't yet.
    • I recommend starting with pex.py. pex_from_targets.py is a little tricky, and can be approached after pex.py through factoring up a new helper @rule.
    • While you're testing, you can use ./pants --isort-skip --black-skip fmt src/python/pants/util/strutil.py. Because you're changing pex.py, it will cause all the PEXes to be rebuilt for things like Docformatter. Docformatter builds extremely quickly, so this will be the fastest way to iterate. You can also add logger.info() calls to pex.py to debug things.
    • We need to detect if the user is using a file vs. a target address. This is tricky, and is the main motivation for the open question. For the sake of developing this, I recommend simply assuming that it's a target. We can work out this piece once we're ready, if we decide it's even relevant.
    • To resolve the target, use await Get(WrappedTarget, AddressInput, AddressInput.parse(python_setup.requirement_constraints). This will resolve the raw address string like 3rdparty/python:constraints into a proper Target. (WrappedTarget has a field target: Target)
    • Now that you have a WrappedTarget, you can use the Target API, e.g. wrapped_tgt.target[PythonConstraints].value to get the raw string sequence.
    • Pex/pip require reading the constraints from a file actually created. For Pants, we use a "Digest" to reference that file. Rather than constraint_file_digest = await Get(Digest, PathGlobs), we can use await Get(Digest, CreateDigest([FileContent("constraints.generated.txt", formatted_constraints_from_previous_step.encode())]). See https://www.pantsbuild.org/v2.0/docs/rules-api-file-system for what this all means.
    • Update the line argv.extend(["--constraints", python_setup.requirement_constraints]) to instead be argv.extend(["--constraints", "constraints.generated.txt"]) .
@benjyw
Copy link
Contributor

benjyw commented Sep 2, 2020

I think always having a target is better. It's no big hardship to require a target, and consistency is important. And no particular need to restrict the number of sources to 0 or 1, even,

@Eric-Arellano
Copy link
Contributor Author

And no particular need to restrict the number of sources to 0 or 1, even,

For now, we will continue to support only one lockfile. You should probably only be able to do inline constraints, or do an actual file on disk. It doesn't make sense yet to be able to mix both.

@benjyw
Copy link
Contributor

benjyw commented Sep 2, 2020

AFAICT, there is no need to restrict to a single lockfile now, just to a single set of constraints, now represented by this target. They can be spread over multiple lockfiles. No?

@Eric-Arellano
Copy link
Contributor Author

AFAICT, there is no need to restrict to a single lockfile now, just to a single set of constraints, now represented by this target. They can be spread over multiple lockfiles. No?

Yes, they can be. But why?

The one use case I was imagining was that you have your inlined constraints from Pipfile.lock, and you want to augment it with additional constraints. But that wouldn't be helped by loosening the restriction on # of source files - the macro would generate a new target, distinct from your hardcoded one, and you would have the same issue of --python-setup-requirement-constraints only allowing for a single target.

I can't imagine why you'd have two distinct constraints files if they are both meant to act as one. Two constraints files in general? Absolutely. But if they end up getting merged by Pants into one file, then I don't see the reason, and that merging could be confusing as it violates expectations.

@benjyw
Copy link
Contributor

benjyw commented Sep 2, 2020

I just don't see the need to impose artificial constraints. If the "unit" of resolution is a target, then let that target do whatever... Maybe people want to split the lockfile just to make it easy to edit or manage in some way, for example. pip supports multiple constraints files in a single run, so they are effectively a single resolve, just split across multiple files for whatever reason, so we probably should too.

In other words, let's lean in to "a resolve is represented by a target", and then there's no need to fuss about how many files and/or inline constraints that target encapsulates.

@Eric-Arellano
Copy link
Contributor Author

pip supports multiple constraints files in a single run, so they are effectively a single resolve, just split across multiple files for whatever reason, so we probably should too.

That's true. What we can do then is have this new implementation call await Get(HydratedSources, HydrateSourcesRequest(constraints_tgt[Sources]), which will get us each explicit file. Then, we can generate our constraints.generated.txt. Finally, we pass the option --constraints-file multiple times to Pex, once for each distinct file.

(One of my concerns was thinking that we had to manually merge the files ourselves. The fact that we can just pass it through to Pex makes me feel better.)

@benjyw
Copy link
Contributor

benjyw commented Sep 2, 2020

Yep! The true underlying concept is a "resolve", not a file. Can probably even support sources and inline constraints in the same target?

@benjyw
Copy link
Contributor

benjyw commented Sep 2, 2020

@jsirois could you weigh in briefly on the idea of a python_constraints target, since this interacts with the future work of having multiple resolves? Thx.

@jsirois
Copy link
Contributor

jsirois commented Sep 11, 2020

I haven't thought about the inline python_constraints yet, but stepping to the side, it seems this could all be done by standard targets / rules.

Why? This blocks macros for Pipenv and Poetry working with constraints files #10655. Those need to be able to generate a target "in-memory" for the constraints, rather than writing to disk and saving that file to VCS.

Since we only actually need a constraints.txt materialized to a chroot for Pex to see, couldn't we just directly materialize one there?

@Eric-Arellano
Copy link
Contributor Author

Since we only actually need a constraints.txt materialized to a chroot for Pex to see, couldn't we just directly materialize one there?

Yes, and this is the plan. But we need a mechanism for the very generic pex.py code to know that it should create a file, rather than reading from disk. For now, a python_constraints target is the best idea I could come up with that would allow the macro to communicate which constraints it needs to be used.

@jsirois
Copy link
Contributor

jsirois commented Sep 11, 2020

For now, a python_constraints target is the best idea I could come up with that would allow the macro to communicate which constraints it needs to be used.

My point is why does it need to be modeled as a macro? Is that just aping the precedent of requirements.txt parsing or is there some deeper reason we need to use a macro?

@Eric-Arellano
Copy link
Contributor Author

My point is why does it need to be modeled as a macro? Is that just aping the precedent of requirements.txt parsing or is there some deeper reason we need to use a macro?

For robust Pipenv/Poetry support, we have two requirements:

  • Convert the lockfile into a constraints.txt file understood by Pex / --python-setup-requirement-constraints.
  • Convert each top-level requirement defined in Pipfile / pyproject.toml into a distinct python_requirement_library, similar to the python_requiremrents() macro.

We might not need need a macro for the lockfile part, I'm not sure. I think we do need it for the second requirement, though, as we need to generate multiple targets dynamically.

@jsirois
Copy link
Contributor

jsirois commented Sep 11, 2020

I think we do need it for the second requirement, though, as we need to generate multiple targets dynamically.

I'm studiously trying to avoid implementing this but also nudge, so please forgive the pushing: Aren't all targets dynamically generated via rules? We parse BUILD files in rules now - that's no longer a special step like it was in v1. Couldn't a rule parse pipenv files to generate targets?

@stuhood
Copy link
Member

stuhood commented Sep 14, 2020

Aren't all targets dynamically generated via rules? We parse BUILD files in rules now - that's no longer a special step like it was in v1. Couldn't a rule parse pipenv files to generate targets?

This is roughly how we expect that a @rule and Target aware macro system would work (a @union that requires a @rule to provide one or more symbols from a BUILD file), but nothing like that is implemented yet.

@Eric-Arellano
Copy link
Contributor Author

Eric-Arellano commented Mar 17, 2021

Hm, I've been working on this and a bit stuck. @stuhood and I decided that we should continue to support [python-setup].requirement_constraints being a file glob, rather than requiring everyone to use _python_constraints. There is going to be too much churn with multiple lockfiles for us to make this deprecation, only to risk a possible change next month.

But it's tricky to have the option either be a file or a target address, as our code is currently factored, for example, to fail if a target cannot be resolved invalid.

I'm thinking, instead, add a new option --requirement-constraints-target, which is mutually exclusive with --requirement-constraints.

We can tighten it up once we figure out what multiple lockfiles looks like from a user perspective.

Eric-Arellano added a commit that referenced this issue Mar 18, 2021
…ry macro (#11724)

Closes #10723. This adds a new `_python_constraints` target, which allows defining the constraints as a list of strings in a BUILD file, rather than needing a constraints file on disk. This is useful for macros because they can generate this target.

As decided in that issue, we do not require using this new target - users can continue to use a file on disk via `[python-setup].requirement_constraints`.

It's not yet clear how this new target type will be used as we add support for multiple lockfiles, so the target is made private for now and its main intended use is macros like Pipenv and Poetry.

[ci skip-rust]
[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants