Python 3.14: PEP-784 compression.zstd #14129

Merged
merged 15 commits into from
Jun 2, 2025

Rogdham
Contributor

@Rogdham Rogdham commented May 24, 2025

Second PR about PEP-784 in stdlib (first PR):

  • Add type hints for _zstd
  • Add type hints for compression.zstd

@Rogdham
Contributor Author

Rogdham commented May 24, 2025

@emmatyping: feel free to try this out, or simply comment on the use of Mapping[CompressionParameter, int] and Mapping[DecompressionParameter, int] as type hints for the options parameter. Any feedback would be appreciated!

@emmatyping
Member

Note that I typed the options as Mapping[CompressionParameter, int] / Mapping[DecompressionParameter, int], while a Mapping[int, int] is technically accepted.

I think we should type that it accepts Mapping[int, int] as well. I consider that a supported usage.

@Rogdham
Copy link
Contributor Author

Rogdham commented May 29, 2025

Now that CPython 3.14.0 beta 2 is released, the CI is green and the PR ready for review!

@Rogdham Rogdham marked this pull request as ready for review May 29, 2025 07:41
Collaborator

@srittau srittau left a comment

Thanks, a few remarks below. Also I notice that many defaults are ..., although they have basic defaults in the implementation. In these cases, we include the actual defaults; ... is only used for complicated cases.

stdlib/_zstd.pyi Outdated
Comment on lines 44 to 46
CONTINUE: Final[_ZstdCompressorContinue] = 0
FLUSH_BLOCK: Final[_ZstdCompressorFlushBlock] = 1
FLUSH_FRAME: Final[_ZstdCompressorFlushFrame] = 2
Collaborator

The Final implies the Literal:

Suggested change
- CONTINUE: Final[_ZstdCompressorContinue] = 0
- FLUSH_BLOCK: Final[_ZstdCompressorFlushBlock] = 1
- FLUSH_FRAME: Final[_ZstdCompressorFlushFrame] = 2
+ CONTINUE: Final = 0
+ FLUSH_BLOCK: Final = 1
+ FLUSH_FRAME: Final = 2

Contributor Author

That was to address a suggestion of @emmatyping of making sure the value is only defined once.

Do you want me to revert all occurrences of _ZstdCompressorContinue?

e.g. replace _ZstdCompressorContinue | _ZstdCompressorFlushBlock | _ZstdCompressorFlushFrame down below with Literal[0, 1, 2]

Collaborator

The rest is fine as is. This was just to point out the redundancy here.
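
A short sketch of the redundancy being pointed out: a name declared with a bare Final and initialised with a literal is inferred by type checkers as that literal type, so it can still be passed where a Literal is required. The mode_name function is a hypothetical consumer added only for illustration.

```python
from typing import Final, Literal

# A Final name initialised with a literal is inferred as that Literal type,
# so no explicit Literal[...] annotation is needed on the constants.
CONTINUE: Final = 0
FLUSH_BLOCK: Final = 1
FLUSH_FRAME: Final = 2


def mode_name(mode: Literal[0, 1, 2]) -> str:
    """Hypothetical consumer that only accepts the three literal values."""
    return {0: "continue", 1: "flush_block", 2: "flush_frame"}[mode]


# A type checker accepts these calls because each constant is Literal-typed.
names = [mode_name(CONTINUE), mode_name(FLUSH_BLOCK), mode_name(FLUSH_FRAME)]
```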

stdlib/_zstd.pyi Outdated
def get_frame_info(frame_buffer: ReadableBuffer) -> tuple[int, int]: ...
def get_frame_size(frame_buffer: ReadableBuffer) -> int: ...
def get_param_bounds(parameter: int, is_compress: bool) -> tuple[int, int]: ...
def set_parameter_types(c_parameter_type: type[Any], d_parameter_type: type[Any]) -> None: ...
Collaborator

Why use Any here? I assume that only certain types are allowed? In that case, we need a comment explaining that.

Member

Yeah these should be specifically type[CompressionParameter] and type[DecompressionParameter].

Suggested change
- def set_parameter_types(c_parameter_type: type[Any], d_parameter_type: type[Any]) -> None: ...
+ def set_parameter_types(c_parameter_type: type[CompressionParameter], d_parameter_type: type[DecompressionParameter]) -> None: ...

Contributor Author

If my understanding is correct, this function takes any types, and will set them as the valid types to be used as keys (on top of int) for the options parameters.

In the implementation of zstd, a single call is made explicitly with CompressionParameter and DecompressionParameter, but it could have been something else.

In any case, it is in a private _zstd module, so I don't think it matters a lot.

Happy to change this with Emma's suggestion.

Member

While they can take any type, I would argue it is wrong to do so. I guess you could specify type[IntEnum] perhaps if you didn't want to specify the specific types?

Contributor Author

I think you are right, and typing with type[CompressionParameter]/type[DecompressionParameter] better shows the intent.

I will go with that.

Comment on lines 16 to 17
_PathOrFileBinary: TypeAlias = StrOrBytesPath | IO[bytes]
_PathOrFileText: TypeAlias = StrOrBytesPath | IO[str]
Collaborator

We try to avoid the unwieldy pseudo-protocol IO and its sub-classes, especially in argument annotations. In this case, appropriate protocols seem to be fairly easy to construct.

Contributor Author

I took this from the lzma.pyi file so I thought it was ok. Tell me what you think.

Collaborator

The lzma stubs probably predate the introduction of protocols and need to be updated as well at some point.
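
As a sketch of the protocol-based approach suggested in this thread, an argument can be annotated with a small Protocol that lists only the methods the function actually calls, instead of IO[bytes]. The names _Reader and read_header are hypothetical and are not the protocols that ended up in the stubs.

```python
import io
from typing import Protocol


class _Reader(Protocol):
    """Hypothetical protocol: only the method the consumer actually calls."""

    def read(self, size: int = -1, /) -> bytes: ...


def read_header(fp: _Reader, length: int = 4) -> bytes:
    """Illustrative consumer that needs nothing beyond read()."""
    return fp.read(length)


# io.BytesIO satisfies the protocol structurally; no inheritance is required.
header = read_header(io.BytesIO(b"abcdefgh"))
```

Any object with a compatible read() method is accepted, which is looser and more accurate than requiring the full IO interface.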

encoding: str | None = ...,
errors: str | None = ...,
newline: str | None = ...,
) -> TextIO: ...
Collaborator

Same here. The implementation seems to return TextIOWrapper, so we should just return this concrete type.

Contributor Author

I took this from the lzma.pyi file so I thought it was ok.

This one is straightforward to change though. Do you want me to change it in lzma.pyi as well?

Collaborator

Changing it in lzma as well would be appreciated.

encoding: str | None = ...,
errors: str | None = ...,
newline: str | None = ...,
) -> TextIO: ...
Collaborator

Same here.

@Rogdham
Contributor Author

Rogdham commented May 31, 2025

Also I notice that many defaults are ..., although they have basic defaults in the implementation.

I thought those were implementation details with no use for type checking, but I will change that.


Likewise, is it needed to define the values for the constants?

For example:

# currently
ZSTD_CLEVEL_DEFAULT: Final[int]

# possible change
ZSTD_CLEVEL_DEFAULT: Final = 3

@srittau
Collaborator

srittau commented May 31, 2025

Likewise, is it needed to define the values for the constants?

Generally we do that nowadays.

@Rogdham
Contributor Author

Rogdham commented May 31, 2025

I think I have addressed all your points, feel free to review again, especially the protocols.

I am not sure what the issue with mypy_primer is, feel free to provide insights if you have any!

@emmatyping
Member

I am not sure what the issue with mypy_primer is, feel free to provide insights if you have any!

Definitely looks unrelated. I searched through the codebase and there aren't any references to compression.zstd.

@srittau srittau closed this Jun 2, 2025
@srittau srittau reopened this Jun 2, 2025
@srittau
Collaborator

srittau commented Jun 2, 2025

The primer errors are a mypy hiccup. I've reopened the PR to trigger another run.

Collaborator

@srittau srittau left a comment

LGTM, let's see what primer says, but I don't expect any relevant output.

Contributor

github-actions bot commented Jun 2, 2025

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@srittau srittau merged commit 798f332 into python:main Jun 2, 2025
127 checks passed
@Rogdham Rogdham deleted the pep-784_compression-zstd branch June 2, 2025 17:04