Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(feat): calculate_qc_metrics with dask #3307

Merged
merged 21 commits into from
Nov 7, 2024
Merged

(feat): calculate_qc_metrics with dask #3307

merged 21 commits into from
Nov 7, 2024

Conversation

ilan-gold
Copy link
Contributor

  • Release notes not necessary because:

@ilan-gold ilan-gold changed the title Ig/dask qc (feat): calculate_qc_metrics with dask Oct 22, 2024
@@ -432,7 +456,7 @@ def top_segment_proportions_dense(
return values / sums[:, None]


@numba.njit(cache=True, parallel=True)
# @numba.njit(cache=True, parallel=True)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the current blocker. Without this commented, I just get a "quiet" failure. Asked on zulip: https://scverse.zulipchat.com/#narrow/channel/456188-Dask/topic/Numba.20in.20Dask/near/478248965

@ilan-gold ilan-gold marked this pull request as draft October 22, 2024 12:42
@ilan-gold ilan-gold added this to the 1.11.0 milestone Oct 22, 2024
Copy link

codecov bot commented Oct 22, 2024

Codecov Report

Attention: Patch coverage is 95.45455% with 2 lines in your changes missing coverage. Please review.

Project coverage is 77.23%. Comparing base (2f0afac) to head (65c74cf).
Report is 52 commits behind head on main.

Files with missing lines Patch % Lines
src/scanpy/preprocessing/_qc.py 94.28% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3307      +/-   ##
==========================================
+ Coverage   77.21%   77.23%   +0.02%     
==========================================
  Files         111      111              
  Lines       12597    12618      +21     
==========================================
+ Hits         9727     9746      +19     
- Misses       2870     2872       +2     
Files with missing lines Coverage Δ
src/scanpy/_utils/__init__.py 75.96% <100.00%> (+0.44%) ⬆️
src/scanpy/preprocessing/_qc.py 94.19% <94.28%> (-0.92%) ⬇️

@ilan-gold
Copy link
Contributor Author

Apparently the problem I saw here with numba within dask is a bit more widespread: #3317

@ilan-gold ilan-gold marked this pull request as ready for review October 25, 2024 08:52
@ilan-gold ilan-gold marked this pull request as draft October 25, 2024 08:53
@ilan-gold ilan-gold marked this pull request as ready for review October 25, 2024 10:42
Copy link
Member

@flying-sheep flying-sheep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great!

The diff for test_qc_metrics is a bit hard to follow. Am I right that you simply

  1. pulled out test_qc_metrics_no_log1p and test_qc_metrics_idempotent
  2. parametrized it with all array types
  3. added with maybe_dask_process_context():

Thanks for pulling the tests apart!

@ilan-gold
Copy link
Contributor Author

Am I right that you simply

You are right!

@flying-sheep
Copy link
Member

great! then apart from my suggestions and the crash problem, this looks ready.

Copy link
Member

@flying-sheep flying-sheep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beautiful. we should decide how to deal with the missing f-literals, otherwise LGTM

@ilan-gold ilan-gold mentioned this pull request Oct 31, 2024
19 tasks
@flying-sheep
Copy link
Member

flying-sheep commented Nov 5, 2024

OK, #3335 is ready.

We should manually check if it can get rid of maybe_dask_process_context

/edit: it’s not ready

@flying-sheep flying-sheep merged commit 2f15f79 into main Nov 7, 2024
14 checks passed
@flying-sheep flying-sheep deleted the ig/dask_qc branch November 7, 2024 11:44
kaushalprasadhial pushed a commit to sanchit-misra/scanpy that referenced this pull request Feb 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

calculate_qc_metrics for dask
2 participants