[BUG] 25.02 submodule sync failed apply patch disable_shared_agg.patch #2911

Closed
pxLi opened this issue Feb 12, 2025 · 2 comments
Labels: ? - Needs Triage; bug (Something isn't working)

Comments

@pxLi
Collaborator

pxLi commented Feb 12, 2025

Describe the bug
First seen in spark-rapids-jni_submodule-sync-pre_release, run 1013, with cudf commit 4b2ce98745187f1367cc8c461c6a6a18f42e2d1b.


[2025-02-11T22:59:16.409Z] [INFO] --- maven-antrun-plugin:3.0.0:run (cudf patch) @ spark-rapids-jni ---
[2025-02-11T22:59:16.409Z] [INFO] Executing tasks
[2025-02-11T22:59:16.409Z] [INFO]      [exec] /home/jenkins/agent/workspace/jenkins-spark-rapids-jni_submodule-sync-pre_release-1013/thirdparty/cudf /home/jenkins/agent/workspace/jenkins-spark-rapids-jni_submodule-sync-pre_release-1013
[2025-02-11T22:59:16.409Z] [INFO]      [exec] patching with: /home/jenkins/agent/workspace/jenkins-spark-rapids-jni_submodule-sync-pre_release-1013/patches/disable_shared_agg.patch
[2025-02-11T22:59:16.409Z] [INFO]      [exec] Checking patch cpp/src/groupby/hash/compute_aggregations.cuh...
[2025-02-11T22:59:16.409Z] [INFO]      [exec] error: while searching for:
[2025-02-11T22:59:16.409Z] [INFO]      [exec]   auto const available_shmem_size = get_available_shared_memory_size(grid_size);
[2025-02-11T22:59:16.409Z] [INFO]      [exec]   auto const offsets_buffer_size  = compute_shmem_offsets_size(flattened_values.num_columns()) * 2;
[2025-02-11T22:59:16.409Z] [INFO]      [exec]   auto const data_buffer_size     = available_shmem_size - offsets_buffer_size;
[2025-02-11T22:59:16.409Z] [INFO]      [exec]   auto const is_shared_memory_compatible = std::all_of(
[2025-02-11T22:59:16.409Z] [INFO]      [exec]     requests.begin(), requests.end(), [&](cudf::groupby::aggregation_request const& request) {
[2025-02-11T22:59:16.409Z] [INFO]      [exec]       if (cudf::is_dictionary(request.values.type())) { return false; }
[2025-02-11T22:59:16.409Z] [INFO]      [exec]       // Ensure there is enough buffer space to store local aggregations up to the max cardinality
[2025-02-11T22:59:16.409Z] [INFO]      [exec]       // for shared memory aggregations
[2025-02-11T22:59:16.409Z] [INFO]      [exec]       auto const size = cudf::type_dispatcher<cudf::dispatch_storage_type>(request.values.type(),
[2025-02-11T22:59:16.409Z] [INFO]      [exec]                                                                            size_of_functor{});
[2025-02-11T22:59:16.409Z] [INFO]      [exec]       return static_cast<size_type>(data_buffer_size) >= (size * GROUPBY_CARDINALITY_THRESHOLD);
[2025-02-11T22:59:16.409Z] [INFO]      [exec]     });
[2025-02-11T22:59:16.409Z] [INFO]      [exec] 
[2025-02-11T22:59:16.409Z] [INFO]      [exec]   // Performs naive global memory aggregations when the workload is not compatible with shared
[2025-02-11T22:59:16.409Z] [INFO]      [exec]   // memory, such as when aggregating dictionary columns or when there is insufficient dynamic
[2025-02-11T22:59:16.409Z] [INFO]      [exec] 
[2025-02-11T22:59:16.409Z] [INFO]      [exec] error: patch failed: cpp/src/groupby/hash/compute_aggregations.cuh:69
[2025-02-11T22:59:16.409Z] [INFO]      [exec] error: cpp/src/groupby/hash/compute_aggregations.cuh: patch does not apply
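For triaging logs like the one above, a small helper can pull out which file and hunk position the patch step rejected. This is only a sketch: the function name `find_failed_hunks` is hypothetical and not part of spark-rapids-jni, and it assumes the log keeps the `error: patch failed: <path>:<line>` wording seen here, optionally preceded by bracketed Jenkins/Maven prefixes.

```python
import re

# Strips leading bracketed prefixes such as
# "[2025-02-11T22:59:16.409Z] [INFO]      [exec] " (assumed log layout).
_PREFIX = re.compile(r"^(?:\[[^\]]*\]\s*)*")

# Matches "error: patch failed: <path>:<line>" as it appears in the log above.
_FAILED_HUNK = re.compile(r"^error: patch failed: (?P<path>.+):(?P<line>\d+)$")


def find_failed_hunks(log_text):
    """Return a list of (path, line) pairs, one per 'patch failed' log line.

    Hypothetical helper for illustration; not part of the project's tooling.
    """
    failures = []
    for raw in log_text.splitlines():
        # Drop the timestamp/level/task prefixes, then any stray whitespace.
        message = _PREFIX.sub("", raw).strip()
        match = _FAILED_HUNK.match(message)
        if match:
            failures.append((match.group("path"), int(match.group("line"))))
    return failures
```

Run against the log in this issue, it should report the single failing hunk in `cpp/src/groupby/hash/compute_aggregations.cuh` at line 69. To confirm locally, and assuming the build applies the patch with `git apply`, running `git apply --check` with the patch file from inside `thirdparty/cudf` at the same cudf commit would be one way to reproduce the error.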


@pxLi pxLi added the "? - Needs Triage" and "bug" labels Feb 12, 2025
@pxLi
Collaborator Author

pxLi commented Feb 12, 2025

Context: #2909, #2910

pxLi pushed a commit that referenced this issue Feb 12, 2025
fix #2911

This fixes submodule sync for branch 25.02, reverting a patch for cudf
that is no longer relevant.

---------

Signed-off-by: Nghia Truong <[email protected]>
@pxLi
Collaborator Author

pxLi commented Feb 12, 2025

Closed with #2912.

@pxLi pxLi closed this as completed Feb 12, 2025