fix: `finfo_object`, `iinfo_object`, `_array` to typing.Protocol #857

34j · 2024-11-24T07:39:16Z

Closes #856

rgommers · 2024-11-24T09:44:18Z

@34j thanks a lot for working on improving static typing. Did you see gh-589 though? This PR seems to reinvent parts of that PR. It's nontrivial to review a large PR and get static typing right, but I prefer to push gh-589 forward if there is energy for that. Would you be interested in checking if that addresses your issue and needs?

34j · 2024-11-24T09:56:37Z

@rgommers
Yes, I noticed it an hour ago. I think this PR is now almost a superset of #589 (thanks to @nstarman), in that all classes are protocols, with some changes.
I think dropping Optional and Union belongs in separate PRs, so this PR does not include those changes.

lucascolley · 2024-11-24T21:03:04Z

@jorenham you may have opinions on the activity here

jorenham · 2024-11-24T21:59:58Z

> @jorenham you may have opinions on the activity here

I certainly do. But before I share them, would you mind telling me what the exact purpose is of these "stubs" (which usually refer to .pyi files)?

lucascolley · 2024-11-24T22:12:37Z

> what the exact purpose is of these "stubs" (which usually refer to .pyi files)?

Not sure about the exact purpose, but they aren't (just) type stubs - they're primarily for housing the (sort of typed) signatures and docstrings from which the docs pages for individual functions are generated. I suppose they are "stubs" in the sense of having no function implementations.

jorenham · 2024-11-24T22:57:31Z

> Not sure about the exact purpose

Then I think that the first step should be to figure that out, and document it.
If you don't know what you are building, then my typing-related opinions will all be based on baseless assumptions and personal preferences, which probably won't be very helpful.
So the only rational opinion I can give you at this point, is that I think that you should at least be able to answer the following questions:

Will it be used as a verification suite for the static type annotations of array-api libraries?
Will there be a "baseline" package that these protocols should at least be compatible with, e.g. numpy?
Should it follow the official python typing specification, and if so, how will that be verified?
Will it be distributed as a (standalone- or bundled-) package, and if so, should these types also be usable at runtime?

lucascolley · 2024-11-24T23:02:11Z

> If you don't know what you are building, then my typing-related opinions will all be based on baseless assumptions and personal preferences, which probably won't be very helpful.

Disclaimer: I'm sure the array API maintainers 'know what they are building' - I've contributed to the repo, but not majorly.

34j · 2024-11-25T01:32:28Z

Perhaps the maintainers are mostly interested in discussing the contents of array-api and this may not be so important.

However, I have noticed a number of issues while working on this implementation and the #858 implementation, and would like to offer some observations.

[no-protocol] TypeVar is frequently misused: many functions use a TypeVar in only the arguments or only the return value, not in both. TypeVar can be used correctly only with a generic Protocol; however, according to Array namespace #685 (comment), that approach was shelved because Protocol is "harder to understand".
What should be done depends on whether the people who find Protocols “difficult to understand” include the contributors to this repository.
- If the contributors can understand Protocols, the Protocol-based implementation should live in this repository (and be uploaded to PyPI), and the CI could then automatically generate the function code and upload it to a separate “starter” repository.
- Otherwise, vice versa. (feat: automatically generate Protocol for array-api-namespace #858)
[no-separation] If this is the starting point for creating an array-api compatible library, it is bad that the mysterious TypeVars and _types.py with unknown Protocols are included, and that sub-modules other than linalg and fft are not hidden (they lack the _ prefix, unlike in array-api-strict). The repository should be split using automatic code generation, preferably into a “template” and a stub package for developing array-api.
[no-ellipsis] Ellipsis should be added if there is no internal implementation of the function.
[no-pypi] Again, since array-api is not uploaded to PyPI, it cannot be used as a stub package or to check if a library is array-api compatible.
[pre-commit] pre-commit is not set up properly, making it hard to develop; pyupgrade (for Python 3.9+ typing like list[str]) and ruff (or autoflake and isort) should be added.

https://github.com/asottile/pyupgrade?tab=readme-ov-file#pep-585-typing-rewrites
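As a rough illustration of the [no-protocol] and [no-ellipsis] points above, here is a minimal sketch of a TypeVar that links an argument to a return value through a generic Protocol. All names (`TArray`, `SupportsSqrt`, `ListNamespace`, `apply_sqrt`) are hypothetical and are not taken from this PR or from the stubs.

```python
from typing import Protocol, TypeVar

# Hypothetical TypeVar standing in for "the array type of a namespace".
TArray = TypeVar("TArray")


class SupportsSqrt(Protocol[TArray]):
    """Hypothetical namespace protocol, generic in its array type."""

    def sqrt(self, x: TArray, /) -> TArray:
        ...  # signature only; the ellipsis marks the missing implementation


class ListNamespace:
    """Toy namespace whose "arrays" are plain lists of floats."""

    def sqrt(self, x: list[float], /) -> list[float]:
        return [value ** 0.5 for value in x]


def apply_sqrt(xp: SupportsSqrt[TArray], x: TArray) -> TArray:
    # The TypeVar appears in both the arguments and the return type,
    # so a type checker can bind it (here to list[float]).
    return xp.sqrt(x)


result = apply_sqrt(ListNamespace(), [4.0, 9.0])
```

`ListNamespace` never inherits from `SupportsSqrt`; it matches structurally, which is the property that makes Protocols attractive for describing third-party array libraries.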

Thanks for taking a look anyway

betatim · 2024-11-25T08:58:42Z

Big 👍 to @jorenham's suggestion of answering the question: "What problem does this solve (for users of the array API standard)?" For me, "users of the standard" are both people creating array-providing libraries like NumPy or array-api-strict, and people using such a library to build something else (e.g. scikit-learn).

asmeurer · 2024-11-25T19:37:00Z

Just to be clear, the primary function of these "stubs" is so that they can be automatically included in the standard via Sphinx autodoc. We used to just have everything in RST files, but using autodoc is nicer because it lets us write the signatures in pure Python. It also allows people to just copy-paste a signature when implementing a function (e.g., I do this all the time when adding something to array-api-strict or array-api-compat). The test suite also uses this package to automatically generate some tests, although that's something that can be refactored concurrently.
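For readers unfamiliar with this style of "stub", the following is a minimal sketch of a docstring-only function and of how tooling can recover its signature and documentation through introspection. The function `example_mean` and its simplified signature are hypothetical, not copied from the standard, and the snippet uses only the standard-library `inspect` module rather than Sphinx itself.

```python
import inspect
from typing import Optional


def example_mean(x: "array", /, *, axis: Optional[int] = None) -> "array":
    """
    Hypothetical stub: computes the arithmetic mean of ``x`` along ``axis``.

    There is deliberately no body beyond this docstring.
    """


# Both the signature and the docstring are available without any
# implementation, which is what makes such a module usable for docs
# and for copy-pasting signatures.
signature = inspect.signature(example_mean)
parameter_names = list(signature.parameters)
summary_line = inspect.getdoc(example_mean).splitlines()[0]
```

Because the annotations are plain strings such as `"array"`, the module does not need a real array type to be importable.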

Reusing them for typing is probably fine, as long as that particular usage is maintained. This module isn't really ever "imported" or anything. This package could be refactored into something installable from PyPI (this was discussed previously at #472). It's not clear to me how that would or should work since it's not a runnable package. And there's also complexity there since there are multiple versions of the standard.

asmeurer · 2024-11-25T19:37:23Z

src/array_api_stubs/_draft/_types.py

@@ -147,3 +145,1229 @@ def dtypes(
        "max rank": Optional[int],
    },
 )
+
+
+class Array(Protocol[array, "dtype", "device", PyCapsule]):  # type: ignore


I don't see a good reason to move this from array_object.py.

asmeurer · 2024-11-25T19:49:11Z

Also, since this is primarily used for documentation, the other thing we need to make sure is that the type signatures always remain readable. That means that we should always spell out types exactly (e.g., it would not be preferred to split out common union types into variables since those would be opaque in the documentation), and ideally the types used in the signatures should always match the types spelled out in the docstring.
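A small sketch of the trade-off described here, using two hypothetical stubs (`sum_with_alias`, `sum_spelled_out`) and a hypothetical alias `Axis`. The two annotations are the same object as far as the typing machinery is concerned; the difference is only in what a reader of the signature has to look up.

```python
from typing import Optional, Tuple, Union

# Hypothetical alias: shorter to write, but a reader of the signature
# has to look up what it stands for.
Axis = Optional[Union[int, Tuple[int, ...]]]


def sum_with_alias(x: "array", /, *, axis: Axis = None) -> "array":
    """Hypothetical stub whose signature hides the union behind a name."""


def sum_spelled_out(
    x: "array", /, *, axis: Optional[Union[int, Tuple[int, ...]]] = None
) -> "array":
    """Hypothetical stub whose signature spells the type out in full."""


# The evaluated annotations compare equal, so spelling the type out
# costs nothing in terms of typing semantics.
same_annotation = (
    sum_with_alias.__annotations__["axis"]
    == sum_spelled_out.__annotations__["axis"]
)
```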

asmeurer · 2024-11-25T19:53:11Z

src/array_api_stubs/_draft/_types.py

@@ -90,7 +88,7 @@ def __len__(self, /) -> int:
        ...


-class Info(Protocol):
+class Info(Protocol[device]):


Can you explain what the [device] here means after Protocol? The Info namespace object itself does not depend on the device (__array_namespace_info__ does not take a device parameter).
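For context, this is what a type parameter after `Protocol` generally expresses. The sketch below is one possible reading, not necessarily what the PR intends: the parameter does not make the object depend on a device at runtime, it only lets a type checker track which device type the methods return. `TDevice`, `CpuInfo`, and `get_default` are hypothetical names; `default_device()` is the method of the same name from the standard's inspection API.

```python
from typing import Protocol, TypeVar

# Covariant because it is only used in return positions in this sketch.
TDevice = TypeVar("TDevice", covariant=True)


class Info(Protocol[TDevice]):
    """Sketch of an inspection-namespace protocol, generic in the device type."""

    def default_device(self) -> TDevice:
        ...


class CpuInfo:
    """Toy info object whose devices are plain strings."""

    def default_device(self) -> str:
        return "cpu"


def get_default(info: Info[TDevice]) -> TDevice:
    # A type checker infers TDevice = str for CpuInfo.
    return info.default_device()


device_name = get_default(CpuInfo())
```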

jorenham · 2024-11-25T21:58:35Z

> It also allows people to just copy-paste a signature when implementing a function

Then it's probably a good idea to conform to the official typing specification, and validate this using static type checkers like basedpyright (a stricter pyright fork) and basedmypy (a mypy fork that is less broken)

> It's not clear to me how that would or should work since it's not a runnable package. And there's also complexity there since there are multiple versions of the standard.

In an ideal scenario, it could be used like this:

# roughly based on the `scipy.fft._helper` stubs in `scipy-stubs`:
# https://github.com/jorenham/scipy-stubs/blob/master/scipy-stubs/fft/_helper.pyi

from typing import Unpack, overload
import array_api.typing.v2023 as xpt

### utility type aliases

# generic dtype protocol (requires a non-trivial spec), with an
# (invariant) type parameter for its "kind"
type NumericDType = xpt.DType[int] | xpt.DType[float] | xpt.DType[complex]

# shape-types are integer tuples, which is what numpy currently
# uses to annotate the shape-type of `ndarray`
type Size = int  # positive integer
type AtLeast0D = tuple[Size, ...]
type AtLeast1D = tuple[Size, *AtLeast0D]  # rejects `()`

# runtime-checkable generic array protocol, so that instances 
# of e.g. `numpy.ndarray` are assignable to it
type NumericTensor[DeviceT: xpt.Device = xpt.Device] = xpt.Array[
    AtLeast1D,     # shape-type argument
    NumericDType,  # dtype-type argument
    DeviceT,       # device-type _parameter_ (optional)
]

### public signatures 

@overload
def fftshift[T: NumericTensor](x: T, axes: Axes | None = None) -> T: ...
# <four other overloads for numpy array-likes>


# currently impossible; requires higher-kinded-typing
# https://github.com/python/typing/issues/548
@overload
def fftfreq[
    SizeT: Size,
    ArrayT: xpt.Array,
    DTypeT: xpt.DType,
    DeviceT: xpt.Device,
](
    n: SizeT, 
    d: float,
    *,
    # obtain the array-, dtype-, and device-types by matching
    # against the generic array-namespace Protocol, and binding
    # to its (hypothetically generic) type-parameters
    xp: xpt.Namespace[ArrayT, DTypeT, DeviceT],
    device: DeviceT,
) -> ArrayT[        # the array-type from `xp`, that is
    tuple[SizeT],   # 1-dimensional and of size `n`,
    DTypeT[float],  # with a real floating-point dtype, and
    DeviceT[None],  # allocated on the default device
]: ...
# <two other overloads for the default `xp=None` case>

caution: completely off-topic rant ahead

So you could probably already tell; but I've actually been thinking quite a lot about this 😅. I've been planning on building something like this in optype, so that I can add array-api support in scipy-stubs.

But unfortunately, my first attempt at building this has kinda failed, mostly because I didn't realize that the current spec doesn't include a way to distinguish the different kinds of dtypes from one another using static typing. So there's no way to type something like the DType[float] from the example.

Either way, I'll probably give it another go in a couple of months or so. And if this time it'll actually succeed, I'll make sure to open an issue or PR so that we can figure out the next steps.

TLDR-ish

So I guess I'm trying to say that typing the array-api is very difficult, and that I think it would help if the spec itself were more aware of the static typing challenges. So in that sense, these "stubs" might actually be a very good way to achieve this. For example by trying to use these "stubs" to annotate a simple toy example (like the one in my example), and in such a way that static type-checkers understand it.

I realize that this might come across as if I'm trying to make this into some school exercise or something. But the reason I'm suggesting this is that there have been way too many times where I thought that I understood some python-typing concept after reading about it, followed by a realization that I completely misunderstood it (often after introducing several bugs along the way) 😅.

asmeurer · 2024-11-25T22:15:49Z

We should consider whether it would make more sense to put the installable typing stuff in array-api-strict (or even in a completely new separate package). That way it remains separate from the "stubs" in the standard, which, as I noted, exist primarily for documentation. It would imply some manual copying from the standard into whatever package, but that is already what we have been doing to create array-api-strict and array-api-compat (and even the test suite), and it has worked reasonably well.

The advantage of keeping them separate is we wouldn't have to worry about potential conflicts between typing correctness and readability of the stubs as standard documentation, or potential issues that might arise with Sphinx. It also keeps the stubs here very simple (just functions with basic docstrings), and keeps the code complexity implied by typing stuff elsewhere.

It would also mean that the (English) text of the standard remains the single source of truth about what is specified. If the Python typing stubs end up disagreeing with that somehow, either because of a bug or because Python typing doesn't support some feature, the standard would be the thing that is correct. I think this is important because if there are two sources of truth (the text and typing "implementation" of the standard), there should be one final one in the case of any ambiguity, and I strongly feel that the final source of truth should be English text, not some reference code.

Having it in a separate package would also allow people who are the experts wrt typing to be able to maintain that package more effectively, without necessarily having to have their changes go through the review process of this repo, which tends to be a little more stringent/slower.

jorenham · 2024-11-25T23:01:17Z

I totally agree with you on that @asmeurer. And having independent release cycles seems also like a good idea in this case 😛.

jorenham · 2024-11-25T23:04:02Z

Oh and while we're still off-topic:
I think I just figured out a way around the "untype-able dtype-type-type and device-type problem" (without having to change the spec, that is): jorenham/optype#25

betatim · 2024-11-26T13:19:22Z

> It would also mean that the (English) text of the standard remains the single source of truth about what is specified.

👍 to this. We should not craft the spec in a certain way just because we can't write it down with Python's static type system.

jorenham · 2024-11-26T17:31:59Z

> > It would also mean that the (English) text of the standard remains the single source of truth about what is specified.
>
> 👍 to this. We should not craft the spec in a certain way just because we can't write it down with Python's static type system.

Not necessarily, no. But it could help to actively take it into consideration when making design decisions.

kgryte · 2025-06-23T07:23:25Z

@34j With your recent work and the ongoing efforts in array-api-typing, are you okay closing this PR out?

General consensus among the workgroup (which you are welcome to attend! :) ) is to push the typing work independently of this repository, and efforts have now been made along these lines.

fix: finfo_object, iinfo_object, _array to typing.Protocol

1810177

rgommers added the "topic: Static Typing" label Nov 24, 2024

34j force-pushed the fix/dataclass-to-protocol branch 2 times, most recently from 7175de0 to ac20ed5 Compare November 24, 2024 10:13

docs: fix docs again

f3c9eb4

34j force-pushed the fix/dataclass-to-protocol branch from 788ba04 to f3c9eb4 Compare November 24, 2024 10:35

fix: fix typing

782b15a

34j mentioned this pull request Nov 24, 2024

feat: automatically generate Protocol for array-api-namespace #858

Closed

asmeurer reviewed Nov 25, 2024

34j marked this pull request as draft November 26, 2024 04:50

jorenham mentioned this pull request Feb 6, 2025

Why do we prefer protocols in type stubs? #899

Closed

jorenham mentioned this pull request Apr 7, 2025

TYP: compatibility with array API standard's typing numpy/numpy#28665

Closed

34j closed this Jun 23, 2025
