gh-127295: ctypes: Switch field accessors to fixed-width integers #127297
Replace `formattable` by a switch. Generate some repetitive parts of handling individual C types.
(Yes, this is wrong as-is; hopefully due to a silly mistake on my part.)
@ZeroIntensity @picnixz @serhiy-storchaka, would one of you be interested in looking at these changes?
I'll look into it but let me first clean-up my backlog (so probably in a few hours or tomorrow)
No rush :)
I'll take a look later today :)
A first round of review.
Co-authored-by: Bénédikt Tran <[email protected]>
This mostly LGTM. My main concern is that this adds some new thread safety issues with global variables, but knowing ctypes, it probably didn't work before anyway.
cc @skirpichev, you might be interested in looking at the integer-related parts of this PR.
static PyObject *
g_set(void *ptr, PyObject *value, Py_ssize_t size)
{
    assert(NUM_BITS(size) || (size == sizeof(long double)));
This is repeated a lot; is it worth making it its own macro?
I prefer this for now.
One advantage of writing this out is that it's much clearer if you land on this line in a debugger. With this one line, the trade-off isn't worth it.
(I'll be the first to admit that the macros this PR adds are a pain to debug, but, IMO they also remove enough duplication to be worth it.)
@serhiy-storchaka, should I wait for your review?
I did one final pass and I'm happy to say that this LGTM.
Thank you! If there are no objections, I plan to merge on Wednesday, and work on the next PR in this area.
    if (PyLong_Check(value)                                        \
        && PyUnstable_Long_IsCompact((PyLongObject *)value))       \
    {                                                              \
        val = (CTYPE)PyUnstable_Long_CompactValue(                 \
            (PyLongObject *)value);                                \
    }                                                              \
    else {                                                         \
        Py_ssize_t res = PyLong_AsNativeBytes(                     \
PyLong_AsNativeBytes() already has a quick path for compact values. Did you check performance impact without first if statement?
Yes, without the `if` it's slightly but measurably slower. With the benchmarks in the OP: (Perhaps that would be worth the simplification, but that's for another PR & discussion.)
Thank you for the reviews!
…rs (pythonGH-127297)

This should be a pure refactoring, without user-visible behaviour changes.

Before this change, ctypes uses traditional native C types, usually identified by [`struct` format characters][struct-chars] when a short (and identifier-friendly) name is needed:

- `signed char` (`b`) / `unsigned char` (`B`)
- `short` (`h`) / `unsigned short` (`H`)
- `int` (`i`) / `unsigned int` (`I`)
- `long` (`l`) / `unsigned long` (`L`)
- `long long` (`q`) / `unsigned long long` (`Q`)

These map to C99 fixed-width types, which this PR switches to:

- `int8_t`/`uint8_t`
- `int16_t`/`uint16_t`
- `int32_t`/`uint32_t`
- `int64_t`/`uint64_t`

The C standard doesn't guarantee that the “traditional” types must map to the fixints. But, [`ctypes` currently requires it][swapdefs], so the assumption won't break anything.

By “map” I mean that the *size* of the types matches. The *alignment* requirements might not. This needs to be kept in mind but is not an issue in `ctypes` accessors, which [explicitly handle unaligned memory][memcpy] for the integer types.

Note that there are 5 “traditional” C type sizes, but 4 fixed-width ones. Two of the former are functionally identical to one another; which ones they are is platform-specific (e.g. `int`==`long`==`int32_t`.) This means that one of the [current][current-impls-1] [implementations][current-impls-2] is redundant on any given platform.

The fixint types are parametrized by the number of bytes/bits, and one bit for signedness. This makes it easier to autogenerate code for them or to write generic macros (though generic API like [`PyLong_AsNativeBytes`][PyLong_AsNativeBytes] is problematic for performance reasons -- especially compared to a `memcpy` with compile-time-constant size).

When one has a *different* integer type, determining the corresponding fixint means a `sizeof` and signedness check. This is easier and more robust than the current implementations (see [`wchar_t`][sizeof-wchar_t] or [`_Bool`][sizeof-bool]).

[swapdefs]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L420-L444
[struct-chars]: https://docs.python.org/3/library/struct.html#format-characters
[current-impls-1]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L470-L653
[current-impls-2]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L703-L944
[memcpy]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L613
[PyLong_AsNativeBytes]: https://docs.python.org/3/c-api/long.html#c.PyLong_AsNativeBytes
[sizeof-wchar_t]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L1547-L1555
[sizeof-bool]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L1562-L1572

Co-authored-by: Bénédikt Tran <[email protected]>
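
To make the commit message's points about `memcpy`, generic macros, and the `sizeof`/signedness check more concrete, here is a minimal sketch. It is not the code from `cfield.c`: the names `IS_SIGNED`, `BIT_WIDTH`, `DEFINE_ACCESSORS`, `get_*` and `set_*` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; they are not taken from cfield.c. */

/* 1 if the integer type TYPE is signed, 0 otherwise. */
#define IS_SIGNED(TYPE) (((TYPE)-1) < (TYPE)0)

/* Width of TYPE in bits; together with IS_SIGNED this picks a fixint. */
#define BIT_WIDTH(TYPE) (sizeof(TYPE) * CHAR_BIT)

/* Generate a getter/setter pair for one fixed-width type.
 * memcpy with a compile-time-constant size is used so that `ptr`
 * does not have to be aligned for CTYPE. */
#define DEFINE_ACCESSORS(NAME, CTYPE)                           \
    static inline CTYPE get_##NAME(const void *ptr)             \
    {                                                           \
        CTYPE val;                                              \
        memcpy(&val, ptr, sizeof(val));                         \
        return val;                                             \
    }                                                           \
    static inline void set_##NAME(void *ptr, CTYPE val)         \
    {                                                           \
        memcpy(ptr, &val, sizeof(val));                         \
    }

/* One line per (width, signedness) combination. */
DEFINE_ACCESSORS(i8,  int8_t)
DEFINE_ACCESSORS(u8,  uint8_t)
DEFINE_ACCESSORS(i16, int16_t)
DEFINE_ACCESSORS(u16, uint16_t)
DEFINE_ACCESSORS(i32, int32_t)
DEFINE_ACCESSORS(u32, uint32_t)
DEFINE_ACCESSORS(i64, int64_t)
DEFINE_ACCESSORS(u64, uint64_t)
```

With this sketch, a caller holding some other integer type (say `wchar_t`) could evaluate `BIT_WIDTH(wchar_t)` and `IS_SIGNED(wchar_t)` to decide which of the eight accessor pairs applies, and `set_i32(buf + 1, v)` works even though `buf + 1` is misaligned for `int32_t`.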
See the issue.
This refactoring has a minuscule but consistent performance benefit (1.01x geometric mean; edit: 46 cases slower & 415 faster). See my bench script & results. Thanks @vstinner for `pyperf` and instructions for isolating CPU cores!

For the repetitive parts, this uses a combination of macros and Argument Clinic for code generation (inline, so one can easily inspect the results). This is the most readable/maintainable of various approaches I tried.
(No, I'm not a fan of the giant macros, but it beats both code generated from f-strings that lack syntax highlighting, and external multiple-include files. Oh, and the `///////////` lines make it easy to see misplaced backslashes.)

Several related changes are included:

- […] `assert`s in the wild, to see if I missed someone who's relying on the detail.)
- […] `switch` in `_ctypes_get_fielddesc`, rather than a linear search. (This requires that there are now several boring chunks with a line for each of the format codes, `sbBcdCEFgfhHiIlLqQPzuUZXvO`, but Argument Clinic makes this bearable.)
- […] `struct fielddesc` to move all the accessors together, making code generation a bit easier
- […] `BSTR` in function names to its code char `X`, to match the other accessors
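
As a rough illustration of the `switch`-versus-linear-search change listed above, the sketch below looks up a descriptor by format code both ways. The struct, table, and function names are hypothetical and heavily simplified; the real `struct fielddesc` and `_ctypes_get_fielddesc` carry more fields and cover every format code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for ctypes' struct fielddesc. */
struct fielddesc_sketch {
    char code;            /* struct-style format character */
    const char *c_name;   /* traditional C type it denotes */
};

/* Only four codes are shown; the real list is much longer. */
static const struct fielddesc_sketch descs[] = {
    {'b', "signed char"},
    {'B', "unsigned char"},
    {'h', "short"},
    {'H', "unsigned short"},
};

/* Old style: walk the table until the code matches. */
static inline const struct fielddesc_sketch *
lookup_linear(char code)
{
    for (size_t i = 0; i < sizeof(descs) / sizeof(descs[0]); i++) {
        if (descs[i].code == code) {
            return &descs[i];
        }
    }
    return NULL;
}

/* New style: one `case` line per format code; the compiler is free
 * to turn this into a jump table or a comparison tree. */
static inline const struct fielddesc_sketch *
lookup_switch(char code)
{
    switch (code) {
    case 'b': return &descs[0];
    case 'B': return &descs[1];
    case 'h': return &descs[2];
    case 'H': return &descs[3];
    default:  return NULL;
    }
}
```

The cost of the `switch` form is the extra "boring" line per format code, which is the repetition the description says Argument Clinic generates.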