Recent Dependabot messages suggested updating various Python modules. I elected to go further and update Python.
Using pyspark 3.3.2 on Python 3.11, the following code:
df = spark.read.json(rdd)
produced a long error that ended with:
  File "/usr/local/lib/python3.11/site-packages/pyspark/serializers.py", line 468, in dumps
    raise pickle.PicklingError(msg)
_pickle.PicklingError: Could not serialize object: IndexError: tuple index out of range
When executing the example_code.py script in Python 3.13 with pyspark 3.4.0, the following error occurs:
Traceback (most recent call last):
  File "/workspaces/test311/example_code.py", line 1, in <module>
    from pyspark import SparkConf
  File "/usr/local/lib/python3.13/site-packages/pyspark/__init__.py", line 65, in <module>
    from pyspark.broadcast import Broadcast
  File "/usr/local/lib/python3.13/site-packages/pyspark/broadcast.py", line 38, in <module>
    from typing.io import BinaryIO  # type: ignore[import]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'typing.io'; 'typing' is not a package
The typing.io namespace was removed in Python 3.13, so the import on line 38 of pyspark/broadcast.py no longer resolves there.
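For anyone who wants to check their own interpreter, here is a minimal sketch (not pyspark's actual fix). It assumes only the standard library; the helper name legacy_typing_io_available is made up for illustration. It shows that BinaryIO can be imported straight from typing on any version, and probes whether the old typing.io path still exists.

```python
import sys

# Portable import: BinaryIO lives in the top-level typing module, so this
# line works whether or not the legacy typing.io namespace is present.
from typing import BinaryIO


def legacy_typing_io_available() -> bool:
    """Hypothetical helper: report whether `typing.io` can still be imported."""
    try:
        import typing.io  # noqa: F401  (may emit a DeprecationWarning on older versions)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so the failure
        # shown in the traceback above is caught here.
        return False
    return True


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    print(f"Python {major}.{minor}: typing.io importable = {legacy_typing_io_available()}")
    print(f"BinaryIO imported from typing: {BinaryIO!r}")
```

If the helper returns False, any installed package that still does "from typing.io import BinaryIO" at import time will fail the same way as the traceback above.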