Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-127295: ctypes: Switch field accessors to fixed-width integers #127297

Merged
merged 20 commits into from
Dec 20, 2024

Conversation

encukou
Copy link
Member

@encukou encukou commented Nov 26, 2024

See the issue.

This refactoring has a miniscule but consistent performance benefit (1.01x geometric mean; edit: 46 cases slower & 415 faster). See my bench script & results. Thanks @vstinner for pyperf and instructions for isolating CPU cores!

For the repetitive parts, this uses a combination of macros and Argument Clinic for code generation (inline, so one can easily inspect the results). This is the most readable/maintainable of various approaches I tried.
(No, I'm not a fan of the giant macros, but it beats both code generated from f-strings that lack syntax highlighting, and external multiple-include files. Oh, and the /////////// lines make it easy to see misplaced backslashes.)

Several related changes are included:

  • Pass a meaningful size argument, rather than zero, to non-bitfield accessors. (This would be great to have for size-generic functions, and it should be a private implementation detail. I want to do this now and get the asserts in the wild, to see if I missed someone who's relying on the detail.)
  • Use a switch in _ctypes_get_fielddesc, rather than a linear search. (This requires that there are now several boring chunks with a line for each of the format codes, sbBcdCEFgfhHiIlLqQPzuUZXvO, but Argument Clinic makes this bearable.)
  • Rearrange struct fielddesc to move all the accessors together, making code generation a bit easier
  • Motley consistency improvements, like changing BSTR in function names to its code char X, to match the other accessors

Replace formattable by a switch.

Generate some repetitive parts of handling individual C types.
@encukou
Copy link
Member Author

encukou commented Nov 26, 2024

(Yes, this is wrong as-is; hopefully due to a silly mistake on my part.)

@encukou encukou marked this pull request as ready for review November 27, 2024 08:42
@encukou encukou added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 27, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @encukou for commit fcae66b 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 27, 2024
@encukou
Copy link
Member Author

encukou commented Dec 6, 2024

@ZeroIntensity @picnixz @serhiy-storchaka, would one of you be interested in looking at these changes?

@serhiy-storchaka serhiy-storchaka self-requested a review December 6, 2024 10:40
@picnixz
Copy link
Member

picnixz commented Dec 6, 2024

I'll look into it but let me first clean-up my backlog (so probably in a few hours or tomorrow)

@picnixz picnixz self-requested a review December 6, 2024 10:40
@encukou
Copy link
Member Author

encukou commented Dec 6, 2024

No rush :)

@ZeroIntensity ZeroIntensity self-requested a review December 6, 2024 12:50
@ZeroIntensity
Copy link
Member

I'll take a look later today :)

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A first round of review.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly LGTM. My main concern is that this adds some new thread safety issues with global variables, but knowing ctypes, it probably didn't work before anyway.

cc @skirpichev, you might be interested in looking at the integer-related parts of this PR.

static PyObject *
g_set(void *ptr, PyObject *value, Py_ssize_t size)
{
assert(NUM_BITS(size) || (size == sizeof(long double)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is repeated a lot; is it worth making it its own macro?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this for now.
One advantage of writing this out is that it's much clearer if you land on this line in a debugger. With this one line, the trade-off isn't worth it.

(I'll be the first to admit that the macros this PR adds are a pain to debug, but, IMO they also remove enough duplication to be worth it.)

@encukou
Copy link
Member Author

encukou commented Dec 13, 2024

@serhiy-storchaka, should I wait for your review?

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did one final pass and I'm happy to say that this LGTM.

@encukou
Copy link
Member Author

encukou commented Dec 16, 2024

Thank you!

If there are no objections, I plan to merge on Wednesday, and work on the next PR in this area.

Comment on lines +389 to +396
if (PyLong_Check(value) \
&& PyUnstable_Long_IsCompact((PyLongObject *)value)) \
{ \
val = (CTYPE)PyUnstable_Long_CompactValue( \
(PyLongObject *)value); \
} \
else { \
Py_ssize_t res = PyLong_AsNativeBytes( \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyLong_AsNativeBytes() already has a quick path for compact values. Did you check performance impact without first if statement?

@encukou
Copy link
Member Author

encukou commented Dec 18, 2024

Yes, without the if it's slightly but measurably slower. With the benchmarks in the OP: Geometric mean: 1.02x slower; 526 slower cases, 87 faster, 635 not significant.

(Perhaps that would be worth the simplification, but that's for another PR & discussion.)

@encukou encukou added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Dec 18, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @encukou for commit 5155293 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Dec 18, 2024
@encukou encukou merged commit 78ffba4 into python:main Dec 20, 2024
39 checks passed
@encukou encukou deleted the ctypes-fixint branch December 20, 2024 13:28
@encukou
Copy link
Member Author

encukou commented Dec 20, 2024

Thank you for the reviews!

srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Dec 23, 2024
…rs (pythonGH-127297)

This should be a pure refactoring, without user-visible behaviour changes.

Before this change, ctypes uses traditional native C types, usually identified 
by [`struct` format characters][struct-chars] when a short (and 
identifier-friendly) name is needed:
- `signed char` (`b`) / `unsigned char` (`B`)
- `short` (`h`) / `unsigned short` (`h`)
- `int` (`i`) / `unsigned int` (`i`)
- `long` (`l`) / `unsigned long` (`l`)
- `long long` (`q`) / `unsigned long long` (`q`)

These map to C99 fixed-width types, which this PR switches to: - 
- `int8_t`/`uint8_t`
- `int16_t`/`uint16_t`
- `int32_t`/`uint32_t`
- `int64_t`/`uint64_t`

The C standard doesn't guarantee that the “traditional” types must map to the 
fixints. But, [`ctypes` currently requires it][swapdefs], so the assumption won't 
break anything.

By “map” I mean that the *size* of the types matches. The *alignment* 
requirements might not. This needs to be kept in mind but is not an issue in 
`ctypes` accessors, which [explicitly handle unaligned memory][memcpy] for the 
integer types.

Note that there are 5 “traditional” C type sizes, but 4 fixed-width ones. Two of 
the former are functionally identical to one another; which ones they are is 
platform-specific (e.g. `int`==`long`==`int32_t`.) This means that one of the 
[current][current-impls-1] [implementations][current-impls-2] is redundant on 
any given platform.


The fixint types are parametrized by the number of bytes/bits, and one bit for 
signedness. This makes it easier to autogenerate code for them or to write 
generic macros (though generic API like 
[`PyLong_AsNativeBytes`][PyLong_AsNativeBytes] is problematic for performance 
reasons -- especially compared to a `memcpy` with compile-time-constant size).

When one has a *different* integer type, determining the corresponding fixint  
means a `sizeof` and signedness check. This is easier and more robust than the 
current implementations (see [`wchar_t`][sizeof-wchar_t] or 
[`_Bool`][sizeof-bool]).


[swapdefs]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L420-L444
[struct-chars]: https://docs.python.org/3/library/struct.html#format-characters
[current-impls-1]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L470-L653
[current-impls-2]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L703-L944
[memcpy]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L613
[PyLong_AsNativeBytes]: https://docs.python.org/3/c-api/long.html#c.PyLong_AsNativeBytes
[sizeof-wchar_t]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L1547-L1555
[sizeof-bool]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L1562-L1572


Co-authored-by: Bénédikt Tran <[email protected]>
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Jan 8, 2025
…rs (pythonGH-127297)

This should be a pure refactoring, without user-visible behaviour changes.

Before this change, ctypes uses traditional native C types, usually identified 
by [`struct` format characters][struct-chars] when a short (and 
identifier-friendly) name is needed:
- `signed char` (`b`) / `unsigned char` (`B`)
- `short` (`h`) / `unsigned short` (`h`)
- `int` (`i`) / `unsigned int` (`i`)
- `long` (`l`) / `unsigned long` (`l`)
- `long long` (`q`) / `unsigned long long` (`q`)

These map to C99 fixed-width types, which this PR switches to: - 
- `int8_t`/`uint8_t`
- `int16_t`/`uint16_t`
- `int32_t`/`uint32_t`
- `int64_t`/`uint64_t`

The C standard doesn't guarantee that the “traditional” types must map to the 
fixints. But, [`ctypes` currently requires it][swapdefs], so the assumption won't 
break anything.

By “map” I mean that the *size* of the types matches. The *alignment* 
requirements might not. This needs to be kept in mind but is not an issue in 
`ctypes` accessors, which [explicitly handle unaligned memory][memcpy] for the 
integer types.

Note that there are 5 “traditional” C type sizes, but 4 fixed-width ones. Two of 
the former are functionally identical to one another; which ones they are is 
platform-specific (e.g. `int`==`long`==`int32_t`.) This means that one of the 
[current][current-impls-1] [implementations][current-impls-2] is redundant on 
any given platform.


The fixint types are parametrized by the number of bytes/bits, and one bit for 
signedness. This makes it easier to autogenerate code for them or to write 
generic macros (though generic API like 
[`PyLong_AsNativeBytes`][PyLong_AsNativeBytes] is problematic for performance 
reasons -- especially compared to a `memcpy` with compile-time-constant size).

When one has a *different* integer type, determining the corresponding fixint  
means a `sizeof` and signedness check. This is easier and more robust than the 
current implementations (see [`wchar_t`][sizeof-wchar_t] or 
[`_Bool`][sizeof-bool]).


[swapdefs]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L420-L444
[struct-chars]: https://docs.python.org/3/library/struct.html#format-characters
[current-impls-1]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L470-L653
[current-impls-2]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L703-L944
[memcpy]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L613
[PyLong_AsNativeBytes]: https://docs.python.org/3/c-api/long.html#c.PyLong_AsNativeBytes
[sizeof-wchar_t]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L1547-L1555
[sizeof-bool]: https://github.com/python/cpython/blob/v3.13.0/Modules/_ctypes/cfield.c#L1562-L1572


Co-authored-by: Bénédikt Tran <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants