[SPARK-36989][TESTS][PYTHON] Add type hints data tests #34296

Closed
wants to merge 13 commits into from

Conversation

zero323
Member

@zero323 zero323 commented Oct 15, 2021

What changes were proposed in this pull request?

This PR:

In case of failure, a message similar to the following one

starting mypy annotations test...
annotations passed mypy checks.

starting mypy data test...
annotations failed data checks:
============================= test session starts ==============================
platform linux -- Python 3.9.7, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /path/to/spark/python, configfile: pyproject.toml
plugins: mypy-plugins-1.9.2
collected 37 items

python/pyspark/ml/tests/typing/test_classification.yml ..                [  5%]
python/pyspark/ml/tests/typing/test_evaluation.yml .                     [  8%]
python/pyspark/ml/tests/typing/test_feature.yml .                        [ 10%]
python/pyspark/ml/tests/typing/test_param.yml .                          [ 13%]
python/pyspark/ml/tests/typing/test_readable.yml .                       [ 16%]
python/pyspark/ml/tests/typing/test_regression.yml ..                    [ 21%]
python/pyspark/sql/tests/typing/test_column.yml F                        [ 24%]
python/pyspark/sql/tests/typing/test_dataframe.yml .......               [ 43%]
python/pyspark/sql/tests/typing/test_functions.yml .                     [ 45%]
python/pyspark/sql/tests/typing/test_pandas_compatibility.yml ..         [ 51%]
python/pyspark/sql/tests/typing/test_readwriter.yml ..                   [ 56%]
python/pyspark/sql/tests/typing/test_session.yml .....                   [ 70%]
python/pyspark/sql/tests/typing/test_udf.yml .......                     [ 89%]
python/pyspark/tests/typing/test_context.yml .                           [ 91%]
python/pyspark/tests/typing/test_core.yml .                              [ 94%]
python/pyspark/tests/typing/test_rdd.yml .                               [ 97%]
python/pyspark/tests/typing/test_resultiterable.yml .                    [100%]

=================================== FAILURES ===================================
______________________________ colDateTimeCompare ______________________________
/path/to/spark/python/pyspark/sql/tests/typing/test_column.yml:39: 
E   pytest_mypy_plugins.utils.TypecheckAssertionError: Invalid output: 
E   Actual:
E     main:20: note: Revealed type is "pyspark.sql.column.Column" (diff)
E   Expected:
E     main:20: note: Revealed type is "datetime.date*" (diff)
E   Alignment of first line difference:
E     E: ...ote: Revealed type is "datetime.date*"
E     A: ...ote: Revealed type is "pyspark.sql.column.Column"
E                                  ^
=========================== short test summary info ============================
FAILED python/pyspark/sql/tests/typing/test_column.yml::colDateTimeCompare - 
======================== 1 failed, 36 passed in 56.13s =========================

will be displayed.
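
To help read the report above: a data test pairs a snippet with the type checker output it is expected to produce, and it fails when the actual output differs. The following is a minimal sketch of that comparison, not the plugin's implementation; `first_mismatch` is a hypothetical helper, and the two strings are copied from the failure shown above.

```python
from typing import List, Optional, Tuple


def first_mismatch(
    expected: List[str], actual: List[str]
) -> Optional[Tuple[int, int]]:
    """Return (line index, column) of the first difference, or None if equal."""
    for i, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            # Column of the first differing character on that line.
            col = next(
                (j for j, (x, y) in enumerate(zip(e, a)) if x != y),
                min(len(e), len(a)),
            )
            return i, col
    if len(expected) != len(actual):
        # One output has extra lines; report the first unmatched line.
        return min(len(expected), len(actual)), 0
    return None


expected = ['main:20: note: Revealed type is "datetime.date*"']
actual = ['main:20: note: Revealed type is "pyspark.sql.column.Column"']

mismatch = first_mismatch(expected, actual)
if mismatch is not None:
    line, col = mismatch
    print(f"line {line}: outputs diverge at column {col}")
    print("E:", expected[line])
    print("A:", actual[line])
    print(" " * (3 + col) + "^")  # caret under the first differing character
```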

Why are the changes needed?

Currently, type annotations are tested primarily for integrity and, to a lesser extent, against the actual API. Testing against examples is a work in progress (SPARK-36997). Data tests allow us to improve coverage and to test negative cases (code that should fail type checker validation).
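
As an illustration of positive versus negative cases, here is a minimal sketch using a hypothetical function `scale` (not part of PySpark). A data test would assert both that the valid call type-checks and that the invalid call, left as a comment here so the snippet still runs, is reported by the type checker.

```python
from typing import List


def scale(values: List[float], factor: float) -> List[float]:
    """Hypothetical annotated function, used only for illustration."""
    return [v * factor for v in values]


# Positive case: the call matches the annotations, so a type checker accepts it.
doubled = scale([1.0, 2.5], 2.0)
print(doubled)

# Negative case: a str factor contradicts the annotation. A type checker
# such as mypy should reject this call; a data test asserts that it does.
# scale([1.0, 2.5], "2")
```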

Does this PR introduce any user-facing change?

No.

How was this patch tested?

By running the linter tests with the additions proposed in this PR.

@zero323
Member Author

zero323 commented Oct 15, 2021

cc @HyukjinKwon @itholic @ueshin @xinrong-databricks FYI

@SparkQA

SparkQA commented Oct 15, 2021

Test build #144306 has finished for PR 34296 at commit 736a91a.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48785/

@SparkQA

SparkQA commented Oct 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48786/

@SparkQA

SparkQA commented Oct 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48785/

@SparkQA

SparkQA commented Oct 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48786/

@SparkQA

SparkQA commented Oct 15, 2021

Test build #144307 has finished for PR 34296 at commit 6f55c43.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

thanks @zero323 for working on this!

@SparkQA

SparkQA commented Oct 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48810/

@SparkQA

SparkQA commented Oct 17, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48810/

@SparkQA

SparkQA commented Oct 17, 2021

Test build #144331 has finished for PR 34296 at commit 149996e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class Foo(Params):
  • class Bar(Params):

@SparkQA

SparkQA commented Oct 18, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48846/

@SparkQA

SparkQA commented Oct 18, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48846/

@SparkQA

SparkQA commented Oct 18, 2021

Test build #144371 has finished for PR 34296 at commit 62018ce.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class Foo(Params):
  • class Bar(Params):

@SparkQA

SparkQA commented Oct 18, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48848/

@zero323
Member Author

zero323 commented Oct 18, 2021

Quick update:

All test files from pyspark-stubs were migrated and the dev scripts are prepared.

However, I hit typeddjango/pytest-mypy-plugins#83, which has a serious impact on test performance when numpy is installed (the whole suite takes more than 20 minutes, compared to ~50 seconds when numpy is not present).

@SparkQA

SparkQA commented Oct 18, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48848/

@SparkQA

SparkQA commented Oct 18, 2021

Test build #144374 has finished for PR 34296 at commit 13ad5e3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48930/

@SparkQA

SparkQA commented Oct 20, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48930/

@SparkQA

SparkQA commented Oct 20, 2021

Test build #144457 has finished for PR 34296 at commit d3d3755.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

echo "starting mypy annotations test..."
PYTEST_REPORT=$( ($MYPY_BUILD \
--config-file python/mypy.ini \
--cache-dir /tmp/.mypy_cache/ \
Member Author

If we set --cache-dir here explicitly, we can reuse the cache for the data tests.

Alternatively, we can use $PWD as the --mypy-testing-base in the data tests, but I'd prefer to avoid that, because temporary files are written there.
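
A minimal sketch of the layout described above, written in Python only to show how the paths line up. It assumes, as the comment implies, that pytest-mypy-plugins keeps its incremental cache in a `.mypy_cache` directory under `--mypy-testing-base`; the helper names and target paths are illustrative, and the actual invocation in the dev scripts may differ.

```python
import os
import tempfile
from typing import List


def annotations_cmd(base: str) -> List[str]:
    """mypy run that writes its cache under the shared base directory."""
    cache_dir = os.path.join(base, ".mypy_cache")
    return [
        "mypy",
        "--config-file", "python/mypy.ini",
        "--cache-dir", cache_dir,
        "python/pyspark",  # illustrative target
    ]


def data_tests_cmd(base: str) -> List[str]:
    """pytest run whose mypy plugin uses the same base directory."""
    return [
        "python", "-m", "pytest",
        "--mypy-testing-base", base,
        "--mypy-ini-file", "python/mypy.ini",
        "python/pyspark",  # illustrative target
    ]


# Both commands point at the same base, so the second run can find
# the cache written by the first (under the assumption stated above).
base = tempfile.gettempdir()
print(" ".join(annotations_cmd(base)))
print(" ".join(data_tests_cmd(base)))
```

Keeping the base in a temporary directory rather than `$PWD` matches the preference stated in the comment, since the plugin writes temporary files there.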

@SparkQA

SparkQA commented Oct 23, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49028/

@SparkQA

SparkQA commented Oct 23, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49028/

@SparkQA

SparkQA commented Oct 23, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49029/

@SparkQA

SparkQA commented Oct 23, 2021

Test build #144557 has finished for PR 34296 at commit b3e3f25.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 23, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49029/

@SparkQA

SparkQA commented Oct 23, 2021

Test build #144558 has finished for PR 34296 at commit 1ae0494.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 23, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49031/

@SparkQA

SparkQA commented Oct 23, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49031/

@zero323 zero323 marked this pull request as ready for review October 23, 2021 21:05
@SparkQA

SparkQA commented Oct 23, 2021

Test build #144560 has finished for PR 34296 at commit e052edc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zero323 zero323 closed this in 6d3cfed Oct 26, 2021
@zero323
Member Author

zero323 commented Oct 26, 2021

Merged to master.

@zero323 zero323 deleted the SPARK-36989 branch October 26, 2021 08:34