Add IBufferProtocol to mmap #1866

Merged
merged 3 commits into from
Feb 1, 2025

Conversation

slozier
Contributor

@slozier slozier commented Jan 6, 2025

Add IBufferProtocol to mmap. Throws a NotImplementedException if the length does not fit in an int.
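
A minimal sketch of that kind of guard, for illustration only. The helper name ToBufferLength and the message text are hypothetical, and the PR's actual check may look different:

```csharp
using System;

static class LengthGuardSketch {
    // Hypothetical helper (not from the PR): narrows a mapping length to int,
    // or throws when the mapping is too large for an int/Span-based buffer.
    public static int ToBufferLength(long length) {
        if (length > int.MaxValue) {
            throw new NotImplementedException("mmap is too large to export as a buffer");
        }
        return (int)length;
    }
}
```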

For the case of bigger files I think we'll have to rework IPythonBuffer. My initial thoughts on changes would be:

  • Use nint in place of int where applicable (probably where CPython uses Py_ssize_t).
  • Change AsSpan/AsReadOnlySpan methods to have an nint start argument.
  • Fix up everything to use nint arithmetic instead of int.

Marking as draft since I need to do more testing (and maybe write some tests). @BCSharp I'm not sure if you looked at/thought about this before so if you have any feedback would appreciate it.

Related to #1408

@BCSharp
Member

BCSharp commented Jan 8, 2025

Yes, I have been thinking about the buffer protocol for large blobs, though not in the context of mmap. I didn't know that mmap implements the buffer protocol, but of course, why not.

I was thinking of supporting Numpy's ND arrays, which can be really big too. Also on some .NET versions, CLI arrays can be up to 4GB. The latter is exotic and the former still a little bit away, but mmap will be a good testing ground for large buffers.

The idea is to keep it easy for the various types that implement the buffer protocol to support, and relatively easy for consumers to use. I am wary of using too much nint, since the interesting .NET API is predominantly int/Span-based, and I would like to avoid clients needing too many narrowing casts just to do useful things with the buffer. The casts would have to be checked, or guarded with ifs, which is ugly and error-prone. In the end, however, it all depends on usage. Now that the protocol is used in a number of places, it will be easier to accommodate the most common usage patterns. For instance, I've noticed that almost every client requests BufferFlags.Simple.

I see the following three major consuming patterns that should be easy for consumers to implement:

  1. Some consumers have innate limitations that make them unable to consume buffers larger than 2GB no matter what. For instance, constructors or initializers of builtin types, like bytes, bytearray, etc. For them, the current interface form (int/span-based) is the most convenient. If given a buffer larger than they can handle, OverflowException should be automatically thrown, just as currently an exception is thrown if the requested buffer type is not supported.

  2. Some consumers can handle buffers larger than 2GB but prefer to do it in span-size chunks because they are making various .NET API calls with the data. For instance scanning for bytes, regex, reading/writing/copying data, encoding/decoding, encrypting/decrypting, etc. The interface should make it easy and convenient for them to consume a given buffer.

  3. Some consumers should be fully capable of handling memory data of any size. For instance memoryview. The interface should allow those clients to easily access the whole blob randomly.

I was thinking along these lines:

  • Add an optional parameter start, and perhaps count too, to AsSpan/AsReadOnlySpan methods, like one of your ideas. This should be easy for exporters to implement, though not always convenient to consume.
  • OR: Add Apply/ReadOnlyApply that takes a lambda or a delegate and optional start/count, which will repeatedly invoke the lambda with the appropriate and successive span, until the end of the designated data range. This should be convenient for consumers in Group 2, but cumbersome for each exporter to implement.
  • OR: Go with the optional parameter to AsSpan/AsReadOnlySpan, and provide Apply as an extension method. This will have the best of both worlds.
  • Extend IBufferProtocol with GetLongBuffer (or GetNativeBuffer?) that returns IPythonLongBuffer, which looks just like IPythonBuffer but everything is nint-based (like another of your ideas). Also AsSpan/AsReadOnlySpan return a new ref struct type that is nint-based. This will be convenient for consumers in Group 3. Provide a helper method to easily implement GetLongBuffer for exporters that never export anything bigger than 2GB. I think we still cannot use default implementations in interfaces, can we?
  • OR: A modification to the point above: drop AsSpan/AsReadOnlySpan, which are just convenience methods around Pin. Since the consumers in Group 3 are few and far between, it may be just simpler to let them fiddle with unsafe pointers.
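
To make the "optional start/count plus Apply as an extension method" option more concrete, here is a rough sketch. Everything in it is hypothetical: ILargeBufferSketch stands in for a possible extended IPythonBuffer, long is used where nint was discussed, and ReadOnlySpanAction is a custom delegate because a span cannot be a type argument of Action<T> in older C# versions. It is not the IronPython API, just one way the idea could look:

```csharp
using System;

// Hypothetical interface: a buffer whose length may exceed int.MaxValue,
// exposing span-sized windows through start/count parameters.
interface ILargeBufferSketch {
    long Length { get; }
    ReadOnlySpan<byte> AsReadOnlySpan(long start, int count);
}

// Custom delegate type for the per-chunk callback.
delegate void ReadOnlySpanAction(ReadOnlySpan<byte> chunk);

static class LargeBufferExtensions {
    // Invokes the action on successive spans until the requested range is consumed.
    // A negative count means "to the end of the buffer".
    public static void ReadOnlyApply(this ILargeBufferSketch buffer, ReadOnlySpanAction action,
                                     long start = 0, long count = -1, int maxChunk = int.MaxValue) {
        if (maxChunk <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunk));
        if (start < 0 || start > buffer.Length) throw new ArgumentOutOfRangeException(nameof(start));
        if (count < 0) count = buffer.Length - start;
        if (count > buffer.Length - start) throw new ArgumentOutOfRangeException(nameof(count));

        long end = start + count;
        while (start < end) {
            // Each chunk is at most maxChunk bytes, so it always fits in an int.
            int chunk = (int)Math.Min(end - start, maxChunk);
            action(buffer.AsReadOnlySpan(start, chunk));
            start += chunk;
        }
    }
}

// Tiny in-memory exporter used only to exercise the sketch.
sealed class ArrayBufferSketch : ILargeBufferSketch {
    private readonly byte[] _data;
    public ArrayBufferSketch(byte[] data) => _data = data;
    public long Length => _data.Length;
    public ReadOnlySpan<byte> AsReadOnlySpan(long start, int count)
        => new ReadOnlySpan<byte>(_data, (int)start, count);
}
```

With this shape, an exporter only implements the windowed AsReadOnlySpan, while a Group 2 consumer passes a lambda and never deals with offsets or narrowing casts itself.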

The file descriptor work seems to be finally coming to an end, so once you merge this PR I could play with these ideas in code to get some better insights.

@slozier slozier marked this pull request as ready for review January 31, 2025 13:28
@slozier slozier requested a review from BCSharp January 31, 2025 13:30
Member

@BCSharp BCSharp left a comment

While I was reviewing this PR I noticed that there is one case that I failed to address in my PR #1891, so this comment is not about the changes you submitted, but you may want to fix it together with them. It is about the series of ifs on lines 589-603 in TryAddRef.

There is the fourth case missing that should go after the three existing:

if (exclusive && ((oldState & StateBits.RefCount) > StateBits.RefCount)) {
    // mmap in non-exclusive use, temporarily no exclusive use allowed
    reason = StateBits.Exclusive;
    return false;
}

Comment on lines +656 to +670
private int InterlockedOrState(int value) {
#if NET5_0_OR_GREATER
    return Interlocked.Or(ref _state, value);
#else
    int current = _state;
    while (true) {
        int newValue = current | value;
        int oldValue = Interlocked.CompareExchange(ref _state, newValue, current);
        if (oldValue == current) {
            return oldValue;
        }
        current = oldValue;
    }
#endif
}
Member

I like that you factored it out. The while (true) form is more efficient than do {...} while because it accesses the volatile field only once per loop.

Contributor Author

Was not an intentional performance optimization. I just copy/pasted the .NET implementation. 😄
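
For comparison, here is a sketch of the two loop shapes discussed above, side by side. The class and method names are hypothetical, and the field here is a plain int rather than the mmap's actual state field. Both methods OR a value into the state and return the previous state; the difference is that the do {...} while form re-reads the field on every retry, while the while (true) form reuses the value that CompareExchange already returned:

```csharp
using System.Threading;

sealed class StateSketch {
    private int _state;

    public int State => Volatile.Read(ref _state);

    // do {...} while form: the field is read again at the top of every iteration.
    public int OrDoWhile(int value) {
        int current, newValue;
        do {
            current = _state;
            newValue = current | value;
        } while (Interlocked.CompareExchange(ref _state, newValue, current) != current);
        return current;
    }

    // while (true) form: the field is read once up front; on a failed exchange
    // the retry uses the value returned by CompareExchange instead of re-reading.
    public int OrWhileTrue(int value) {
        int current = _state;
        while (true) {
            int newValue = current | value;
            int oldValue = Interlocked.CompareExchange(ref _state, newValue, current);
            if (oldValue == current) {
                return oldValue;
            }
            current = oldValue;
        }
    }
}
```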

Comment on lines +640 to +642
if ((newState & StateBits.RefCount) == StateBits.RefCountOne) {
    newState &= ~StateBits.Exporting;
}
Member

The interesting consequence of this implementation is that the Exporting flag may not be reset right after the last export has been released, namely when there are still some non-exclusive calls in progress. However, it will be reset as soon as all of them exit mmap. I am OK with that since the non-exclusive calls are supposed to be quick and transient, but there is a corner case when mmap is being used so intensely that this bit may not be reset for a while. As a result, trying to resize in such a state will result in BufferError rather than EAGAIN even when there are no extant exports. Reordering the tests in the MmapLocker constructor would "fix" that, but at the expense of further deviating from CPython's error handling. I think that a 100% fix would require maintaining a separate counter for the number of exports in the mmap object, which is probably not worth the effort and added complexity.

Contributor Author

Indeed. I couldn't think of a way to do it without having a separate counter for exports which would probably result in a lot more complexity. I figured most calls are quick enough that it should drain down to one in most reasonable scenarios.

@slozier
Contributor Author

slozier commented Jan 31, 2025

There is the fourth case missing that should go after the three existing:

if (exclusive && ((oldState & StateBits.RefCount) > StateBits.RefCount)) {
    // mmap in non-exclusive use, temporarily no exclusive use allowed
    reason = StateBits.Exclusive;
    return false;
}

Hmm, I thought StateBits.RefCount was negative. Should it be comparing to StateBits.RefCountOne?

@BCSharp
Member

BCSharp commented Jan 31, 2025

Hmm, I thought StateBits.RefCount was negative. Should it be comparing to StateBits.RefCountOne?

Yes! Sorry...
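
A small sketch of why the comparison matters. The bit layout below is hypothetical (the real StateBits values are not shown in this thread); it only assumes that the RefCount mask covers the sign bit and is therefore negative as an int:

```csharp
using System;

static class StateBitsSketch {
    // Hypothetical layout: low 8 bits are flags, upper 24 bits hold the reference count.
    public const int RefCount = unchecked((int)0xFFFFFF00); // mask; equals -256 as an int
    public const int RefCountOne = 0x100;                    // a reference count of one

    // Comparison as first proposed: against the (negative) mask itself.
    // Under this layout it is true for every small count, including zero.
    public static bool GreaterThanMask(int state) => (state & RefCount) > RefCount;

    // Corrected comparison: true only when more than one reference is held.
    public static bool MoreThanOneRef(int state) => (state & RefCount) > RefCountOne;
}
```

Under these assumptions, a state with no references at all still passes the first test, while the second only passes once the count reaches two.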

@slozier
Contributor Author

slozier commented Jan 31, 2025

Just a quick note in case I forget (probably won't but who knows). Noticed an issue with the resize on Linux (presumably also applicable to macOS) not respecting the offset. Hopefully I can do a PR later this evening...

@BCSharp
Member

BCSharp commented Feb 1, 2025

Hopefully I can do a PR later this evening...

If you are at it with another PR, a suggestion: mark the Windows P/Invoke with SupportedOSPlatform("windows")?
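
For reference, a sketch of what such an annotation could look like, assuming .NET 5 or later (where SupportedOSPlatformAttribute and OperatingSystem.IsWindows() are available). The class name and the choice of GetCurrentProcessId are illustrative only; the thread does not say which P/Invoke declarations are meant:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Versioning;

static class NativeMethodsSketch {
    // Illustrative Windows-only P/Invoke; the attribute lets the platform
    // compatibility analyzer flag unguarded calls from cross-platform code.
    [SupportedOSPlatform("windows")]
    [DllImport("kernel32.dll")]
    private static extern uint GetCurrentProcessId();

    // Callers guard the call with an OS check, which the analyzer recognizes.
    public static uint? TryGetWindowsProcessId() {
        if (OperatingSystem.IsWindows()) {
            return GetCurrentProcessId();
        }
        return null;
    }
}
```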

@slozier slozier merged commit 144146d into IronLanguages:main Feb 1, 2025
8 checks passed
@slozier slozier deleted the mmap_buffer branch February 1, 2025 01:30