[ENH] single join fully implemented in numba #1304
🚀 Deployed on https://deploy-preview-1304--pyjanitor.netlify.app

Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##              dev    #1304      +/-   ##
==========================================
+ Coverage   92.85%   94.61%   +1.76%
==========================================
  Files          78       78
  Lines        4142     4254     +112
==========================================
+ Hits         3846     4025     +179
+ Misses        296      229      -67
36f3eeb to c8a366b
Pre-approving pending docstrings!
@settings(deadline=None, max_examples=10)
@given(df=conditional_df(), right=conditional_right())
def test_single_condition_greater_than_floats_keep_last_numba(df, right):
    """Test output for a single condition. "<"."""
@samukweku I took the liberty of asking ChatGPT for a better docstring on this test.
@pytest.mark.turtle
@settings(deadline=None, max_examples=10)
@given(df=conditional_df(), right=conditional_right())
def test_single_condition_greater_than_floats_keep_last_numba(df, right):
    """
    Test the functionality of conditional_join with a single 'greater than' condition on floating-point data, while keeping the last match using Numba.

    This test sorts and filters dataframes 'df' and 'right' by columns 'B' and 'Numeric' respectively, removing NaN values. It then performs a backward merge_asof operation on these sorted dataframes. The expected outcome is a dataframe where each row from 'df' is merged with the last row from 'right' where 'Numeric' is greater than 'B'.

    The actual outcome is produced by the conditional_join method with a 'greater than' condition, left join type, sorted by appearance, keeping the last match, and utilizing Numba for performance optimization. The test asserts that the actual dataframe matches the expected dataframe, ensuring correct functionality of the conditional_join under these specific parameters.
    """
    # Test implementation continues...
Is it accurate? If so, I might begin writing a template for testing! Also, if it is accurate, could you update the docstrings for these two tests please? (I will get them generated for the other one.)
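For anyone checking the docstring against the intended behavior, here is a minimal pure-Python sketch of what "keep the last match" means for a single greater-than condition. It is not the pyjanitor or Numba implementation, and the helper name `keep_last_greater_than` is hypothetical. It assumes the condition is read as left value > right value, that NaN values never match, and that the right-hand values are sorted ascending first, so the last match is the largest right value strictly below the left value (the same row a backward as-of lookup without exact matches would pick).

```python
from bisect import bisect_left
import math


def keep_last_greater_than(left_values, right_values):
    """For each left value b, return the last right value r with b > r.

    Hypothetical helper for illustration only. "Last" is taken with
    respect to ascending order of the right values, so the result is the
    largest right value strictly below b, or None when nothing matches.
    """
    # Drop NaN on the right, then sort so a binary search can be used.
    right_sorted = sorted(r for r in right_values if not math.isnan(r))
    result = []
    for b in left_values:
        if math.isnan(b):
            # A NaN on the left cannot satisfy b > r.
            result.append(None)
            continue
        # bisect_left returns how many right values are strictly less than b.
        pos = bisect_left(right_sorted, b)
        result.append(right_sorted[pos - 1] if pos else None)
    return result


matches = keep_last_greater_than([1.5, 0.0, 3.0], [2.0, 1.0, float("nan"), 3.0])
print(matches)
```

Under this reading the match for a left value is always a strictly smaller right value, which is worth comparing with the "where 'Numeric' is greater than 'B'" wording in the generated docstring.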
@settings(deadline=None, max_examples=10)
@given(df=conditional_df(), right=conditional_right())
def test_single_condition_greater_than_floats_keep_last(df, right):
    """Test output for a single condition. "<"."""
@pytest.mark.turtle
@settings(deadline=None, max_examples=10)
@given(df=conditional_df(), right=conditional_right())
def test_single_condition_greater_than_floats_keep_last(df, right):
    """
    Test the functionality of conditional_join with a single 'greater than' condition on floating-point data, while keeping the last match without Numba.

    This test sorts and filters dataframes 'df' and 'right' by columns 'B' and 'Numeric' respectively, removing NaN values. It then performs a backward merge_asof operation on these sorted dataframes. The expected outcome is a dataframe where each row from 'df' is merged with the last row from 'right' where 'Numeric' is greater than 'B'.

    The actual outcome is produced by the conditional_join method with a 'greater than' condition, left join type, sorted by appearance, keeping the last match, without utilizing Numba for performance optimization. The test asserts that the actual dataframe matches the expected dataframe, ensuring correct functionality of the conditional_join under these specific parameters.
    """
    # Test implementation continues...
feels like a lot of docstring info for tests, no?
You have a good point, actually. I'm still a bit conflicted on whether to be verbose on test docstrings.
@samukweku are we good to merge? I think we should, please let me know.
@ericmjl yes it is ok to merge
Thank you very much, @samukweku!
PR Description
Please describe the changes proposed in the pull request:
This PR resolves #1302 .