
Port to rattler-build #1796

Open
wants to merge 59 commits into base: branch-25.04

Conversation

@gforsyth (Contributor) commented Jan 27, 2025

Summary:

recipe.yaml

build_*.sh

  • We use --no-build-id to allow sccache to look in a predictable place, see: https://rattler.build/latest/tips_and_tricks/#using-sccache-or-ccache-with-rattler-build
  • Depending on whether rapids-is-release-build succeeds, we include either rapidsai (release) or rapidsai-nightly (non-release) in the channel listing (see the sketch after this list)
  • Channels must be specified at the command-line
  • We remove the build_cache directory after building so it doesn't get packaged up with the other artifacts and uploaded to S3
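
Putting those pieces together, the C++ build invocation ends up looking roughly like the sketch below (a minimal sketch of the flow only; the recipe path and the extra conda-forge channel are assumptions, not copied from ci/build_cpp.sh):

  # Pick which RAPIDS channel to resolve dependencies from based on the build type.
  if rapids-is-release-build; then
    RAPIDS_CHANNEL=rapidsai
  else
    RAPIDS_CHANNEL=rapidsai-nightly
  fi

  # Channels are passed on the command line; --no-build-id keeps the work
  # directory path stable so sccache can find its cache.
  rattler-build build \
    --recipe conda/recipes/librmm \
    --channel "${RAPIDS_CHANNEL}" \
    --channel conda-forge \
    --no-build-id \
    --output-dir "${RAPIDS_CONDA_BLD_OUTPUT_DIR}"

  # Remove the build cache so it is not packaged with the artifacts uploaded to S3.
  rm -rf "${RAPIDS_CONDA_BLD_OUTPUT_DIR}/build_cache"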

xref: rapidsai/build-planning#47


copy-pr-bot bot commented Jan 27, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


{% if cuda_major != "11" %}
- cuda-cudart-dev
{% endif %}
- {{ "cuda-cudart-dev" if cuda_major == "12" else "cuda-version" }}
Contributor Author

conda-recipe-manager doesn't like != as a comparison operator

Member

Maybe we could use not and == instead?

Suggested change
- {{ "cuda-cudart-dev" if cuda_major == "12" else "cuda-version" }}
- {{ "cudatoolkit" if not (cuda_major == "12") else "cuda-version" }}

Contributor Author

Yep, that works!

Contributor

We want to write this in a way that only mentions CUDA 11, so that the condition can be trivially deleted when adding future major version support.

Member

Yeah, that was the idea. I just goofed on the syntax. Took another go below:

#1796 (comment)
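
For reference, rattler-build recipes express this kind of selector as a YAML if/then list entry rather than an inline Jinja ternary. A minimal sketch (the surrounding requirements/run section is an assumption for illustration; the if/then entry mirrors the form shown later in this diff):

  requirements:
    run:
      # The condition mentions only CUDA 11, so the entry can be deleted
      # wholesale once CUDA 11 support is dropped.
      - if: cuda_major == "11"
        then: cuda-cudart-dev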

Comment on lines 5 to 13
version: ${{ env.get("RAPIDS_PACKAGE_VERSION") }}
cuda_version: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[:2] | join(".") }}
cuda_major: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[0] }}
date_string: ${{ env.get("RAPIDS_DATE_STRING") }}
Contributor Author

I grabbed these from rapidsai/cugraph#4551, as conda-recipe-manager doesn't currently support converting from the extended Jinja2 syntax to the subset that rattler supports.
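
For contrast, a hedged illustration of the conda-build-style Jinja these context entries replace (not copied from this repo's meta.yaml; shown only to make clear why an automated conversion is hard):

  {% set cuda_version = '.'.join(environ['RAPIDS_CUDA_VERSION'].split('.')[:2]) %}
  {% set cuda_major = cuda_version.split('.')[0] %}
  {% set date_string = environ['RAPIDS_DATE_STRING'] %}

The set statements and Python method calls like '.'.join(...) have no direct equivalent in the minijinja subset that rattler-build evaluates, which is roughly limited to the filter pipelines shown above.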

@github-actions bot added the ci label Jan 29, 2025
@gforsyth marked this pull request as ready for review January 29, 2025 19:48
@gforsyth requested review from a team as code owners January 29, 2025 19:48
@gforsyth
Contributor Author

Well, if we remove fmt from host but leave it in run, then things get copied over correctly.

Still missing all the nvtx3 files, and even if I use the always_include_files directive for include/nvtx3/*, nothing shows up, so something is off:

🐚 colordiff nightly-librmm rattler-librmm-nohost
15,31d14
< include/nvtx3/nvToolsExtCuda.h
< include/nvtx3/nvToolsExtCudaRt.h
< include/nvtx3/nvToolsExt.h
< include/nvtx3/nvToolsExtOpenCL.h
< include/nvtx3/nvToolsExtSync.h
< include/nvtx3/nvtx3.hpp
< include/nvtx3/nvtxDetail/nvtxImplCore.h
< include/nvtx3/nvtxDetail/nvtxImplCudaRt_v3.h
< include/nvtx3/nvtxDetail/nvtxImplCuda_v3.h
< include/nvtx3/nvtxDetail/nvtxImpl.h
< include/nvtx3/nvtxDetail/nvtxImplOpenCL_v3.h
< include/nvtx3/nvtxDetail/nvtxImplSync_v3.h
< include/nvtx3/nvtxDetail/nvtxInitDecls.h
< include/nvtx3/nvtxDetail/nvtxInitDefs.h
< include/nvtx3/nvtxDetail/nvtxInit.h
< include/nvtx3/nvtxDetail/nvtxLinkOnce.h
< include/nvtx3/nvtxDetail/nvtxTypes.h
1642,1644d1624
< lib/cmake/nvtx3/nvtx3-config.cmake
< lib/cmake/nvtx3/nvtx3-config-version.cmake
< lib/cmake/nvtx3/nvtx3-targets.cmake

@vyasr
Contributor

vyasr commented Feb 10, 2025

well, if we remove fmt from host but leave it in run, then things get copied over correctly.

OK, interesting. That certainly implicates clobbering to some degree.

still missing all the nvtx3 files and even if I use the always_include_files directive for include/nvtx3/*, nothing shows up, so something is off

Maybe worth double-checking if there are any nvtx-related files in the build environment coming from other packages. Just to rule out clobbering as a possible root cause.
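
One quick way to do that check from inside the build script, as a sketch (the search paths are guesses; $PREFIX and $BUILD_PREFIX are the standard build-time environment variables):

  # List any nvtx3 headers already present in the host/build prefixes before
  # librmm's own install step runs.
  find "${BUILD_PREFIX}" "${PREFIX}" -path '*include/nvtx3/*' -print 2>/dev/null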

@gforsyth
Contributor Author

Update: From looking at their docs, it does seem like there is some intentional compatibility baked in. It's not clear how much to expect without looking at the code, though.

It does seem to respect the CBC selectors:

with $RAPIDS_CUDA_VERSION=11.8

      "variant": {
        "c_compiler_version": "11",
        "c_stdlib": "sysroot",
        "c_stdlib_version": "2.28",
        "cmake_version": ">=3.26.4,!=3.30.0",
        "cuda_compiler": "nvcc",
        "cxx_compiler_version": "11",
        "librmm": "25.04.00 cuda11_250210_5d0a2446",
        "target_platform": "linux-64"
      },

with $RAPIDS_CUDA_VERSION=12.8

      "variant": {
        "c_compiler_version": "13",
        "c_stdlib": "sysroot",
        "c_stdlib_version": "2.28",
        "cmake_version": ">=3.26.4,!=3.30.0",
        "cuda_compiler": "cuda-nvcc",
        "cxx_compiler_version": "13",
        "librmm": "25.04.00 cuda12_250210_5d0a2446",
        "target_platform": "linux-64"
      },

@gforsyth
Contributor Author

OK, so I'm not sure what's up with my local build environment, but I just pulled down the latest .conda package and looked over the manifest, and it's identical to the rmm nightly build.

I'll investigate whether my changes are even required for the fmt fix.

@gforsyth
Contributor Author

OK, so the changes removing fmt from the host environment ARE required to make sure those headers and shared objects get included.

The nvtx issue seems to be a problem local to my environment; it's working fine in CI.

- "test -d \"${PREFIX}/include/rmm\""
about:
homepage: ${{ load_from_file("python/librmm/pyproject.toml").project.urls.Homepage }}
license: ${{ load_from_file("python/librmm/pyproject.toml").project.license.text | replace(" ", "-") }}
Contributor
@bdice Feb 10, 2025

We should probably change python/librmm/pyproject.toml to use Apache-2.0 as an SPDX identifier. https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license

Like this:

- license = { text = "Apache 2.0" }
+ license = "Apache-2.0"

@vyasr Do you know why we chose to write it this way in #1529?

Contributor

We should probably fix this across the board -- I would suggest raising this with ops (in case they know of restrictions that I'm not aware of). If ops is supportive, let's open a build-planning issue and audit this for all repos. Non-OSS repos may need a different solution, but Apache/BSD-3 repos should be fixable.

Contributor

Then this will look like:

Suggested change
license: ${{ load_from_file("python/librmm/pyproject.toml").project.license.text | replace(" ", "-") }}
license: ${{ load_from_file("python/librmm/pyproject.toml").project.license }}

Contributor

IIRC the current choices were made in order to guarantee compatibility with wheeltamer. @raydouglass may remember the exact list of "allowed" licenses. Given that we no longer run wheeltamer before releases, that is a moot point, but I don't know whether any scans we have since reinstated could still have a problem with this.

Contributor

Good -- that's exactly what I wondered. If we can adopt a normal SPDX license identifier in our pyproject.toml files, we absolutely should. I am okay with that being a follow-up to this PR, though.

Member

Yes, previously we used Apache 2.0 to comply with wheeltamer. Wheeltamer now supports the correct SPDX Apache-2.0, so it would be good to switch to the SPDX identifier across RAPIDS.

Contributor

OK, sweet. Then we can go ahead and synchronize the values in conda and pip so that we don't need any modification from one to the other.

Contributor

Following up: I opened rapidsai/build-planning#152 to propose switching to SPDX license expressions (like license = "Apache-2.0") but they're not supported by setuptools yet.

To simplify things, we should update our pyproject.toml files to license = { text = "Apache-2.0" } as we roll out rattler-build. @jakirkham proposed that here: rapidsai/build-planning#152 (comment)

We should avoid using replace(" ", "-") in the new recipes and just fix the source where needed.

Contributor Author

Pushed up a commit adding the - in the pyproject.toml files and removing the replace

conda/recipes/rmm/recipe.yaml: three outdated review threads (resolved)
- python:
    imports:
      - rmm
    pip_check: false
Contributor

Thanks for the details @vyasr. Let's copy-paste this into a build-planning issue or something that we can use to plan future work. I think we do have worthwhile action items here -- passing pip check would be a very nice-to-have validation of our packaging.
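
For reference, enabling that validation would be a one-line change to the test block quoted above; a sketch, assuming the rest of the python test entry stays as it is today:

  tests:
    - python:
        imports:
          - rmm
        pip_check: true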

@bdice
Contributor

bdice commented Feb 10, 2025

@gforsyth Conflicts will need to be resolved with #1808 -- we basically need to drop spdlog / fmt dependencies and add rapids-logger.

Contributor
@bdice left a comment

Approving with a minor fix and one question about making all env vars required.

Follow-up work:

  • SPDX licenses in pyproject.toml files
  • Enabling pip check

conda/recipes/librmm/recipe.yaml: two outdated review threads (resolved)
conda/recipes/rmm/recipe.yaml: outdated review thread (resolved)
ci/build_cpp.sh (outdated)
Comment on lines 30 to 31
-c rapidsai \
-c rapidsai-nightly \
Contributor

I wonder if we actually need both of these channels or if we should tighten things up using rapids-is-release-build to select one or the other.

Contributor

Agreed. We could probably use logic similar to https://github.com/rapidsai/gha-tools/blob/main/tools/rapids-configure-conda-channels to define an environment variable with channels in https://github.com/rapidsai/gha-tools/blob/main/tools/rapids-configure-rattler.

tl;dr We need to avoid using rapidsai-nightly or dask/label/dev when rapids-is-release-build returns true.

- if: cuda_major == "11"
  then: cuda-cudart-dev
about:
  homepage: ${{ load_from_file("python/librmm/pyproject.toml").project.urls.Homepage }}
Contributor

Can we load_from_file once into something in the context and then access that data? I don't know if rattler is smart enough to cache the file on its own.

Contributor Author

Doesn't seem to work. In all of their documentation examples, they repeatedly call load_from_file on the same file.

Contributor

Weird. I opened prefix-dev/rattler-build#1423. This seems like a bug in the Jinja parsing.

@gforsyth changed the title from "port to rattler-build" to "Port to rattler-build" Feb 11, 2025
@gforsyth
Contributor Author

@jakirkham @vyasr anything else you'd like to see here?

Comment on lines 88 to 91
about:
  homepage: ${{ load_from_file("python/rmm/pyproject.toml").project.urls.Homepage }}
  license: ${{ load_from_file("python/rmm/pyproject.toml").project.license.text | replace(" ", "-") }}
  summary: ${{ load_from_file("python/rmm/pyproject.toml").project.description }}
Member

Is it possible to assign ${{ load_from_file("python/rmm/pyproject.toml").project }} to a global variable and extract each value from it?

Member

Suggested change
about:
homepage: ${{ load_from_file("python/rmm/pyproject.toml").project.urls.Homepage }}
license: ${{ load_from_file("python/rmm/pyproject.toml").project.license.text | replace(" ", "-") }}
summary: ${{ load_from_file("python/rmm/pyproject.toml").project.description }}
about:
homepage: ${{ project.urls.Homepage }}
license: ${{ project.license.text }}
summary: ${{ project.description }}

Contributor Author

Not currently, no. prefix-dev/rattler-build#1423

Member
@jakirkham left a comment

Thanks Gil! 🙏

Am wondering if we can simplify the templating with a context variable

cuda_major: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] }}
date_string: ${{ env.get("RAPIDS_DATE_STRING") }}
py_version: ${{ env.get("RAPIDS_PY_VERSION") }}
head_rev: ${{ git.head_rev(".")[:8] }}
Member

Maybe something like this?

Suggested change
head_rev: ${{ git.head_rev(".")[:8] }}
head_rev: ${{ git.head_rev(".")[:8] }}
project: ${{ load_from_file("python/rmm/pyproject.toml").project }}


@github-actions bot added the Python (Related to RMM Python API) label Feb 11, 2025
Contributor
@vyasr left a comment

One last small set of questions, then this LGTM!

Comment on lines +22 to +26
if rapids-is-release-build; then
  RAPIDS_CHANNEL=rapidsai
else
  RAPIDS_CHANNEL=rapidsai-nightly
fi
Contributor

Very minor nit: I foresee this propagating everywhere. Should we turn this into a GHA tool? We could even make it return an array of usable channels so that we also have a centralized place for removing the nvidia channel once we no longer need it.

Contributor Author

Yeah, we should. I'll get that set up
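
A rough sketch of what such a gha-tools helper could look like (the script name, exact channel list, and output format are all assumptions to be settled in the gha-tools PR):

  #!/bin/bash
  # Hypothetical helper: print "-c <channel>" flags for the channels appropriate
  # to this build, dropping nightly-only channels for release builds.
  set -euo pipefail

  channels=("rapidsai" "conda-forge")
  if ! rapids-is-release-build; then
    channels=("rapidsai-nightly" "dask/label/dev" "conda-forge")
  fi

  for channel in "${channels[@]}"; do
    printf '%s %s ' "-c" "${channel}"
  done
  echo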

Comment on lines +42 to +44
# remove build_cache directory
rm -rf "$RAPIDS_CONDA_BLD_OUTPUT_DIR"/build_cache

version: ${{ version }}
build:
  string: cuda${{ cuda_major }}_${{ date_string }}_${{ head_rev }}
  script: install_librmm.sh
Contributor

A final thought: should we just inline these scripts? They're pretty much all one-liners. I find that the current split makes most recipes harder to parse, not easier. Curious what other reviewers think, but I consider inlining <3 scripts preferable to having a separate file.

Contributor Author

I think that's a good idea and would remove one more layer of indirection
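
For concreteness, inlining would turn the snippet quoted above into something like this (a sketch; the actual contents of install_librmm.sh are assumed here to be a one-line cmake install):

  build:
    string: cuda${{ cuda_major }}_${{ date_string }}_${{ head_rev }}
    script:
      - cmake --install cpp/build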

Contributor

I consider inlining <3 scripts preferable

I ❤️ inline scripts too.

Contributor

This may be different for cudf or other repos — some of the scripts are multiple lines or have a TON of flags.

Contributor Author

I'd say we inline where it's a simple one-liner and if it's more complicated than that (or has a bunch of flags) we stick with the standalone install script.

Contributor

I consider inlining <3 scripts preferable

I ❤️ inline scripts too.

Whoops lol I hope at least one of you guessed that I meant "<3 line scripts" 😂

Yeah I'm fine doing this on a case-by-case basis based on the rough 3-line heuristic.

Labels: ci, conda, improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), Python (Related to RMM Python API)
Project status: Review