-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add IBufferProtocol to mmap #1866
Conversation
Yes, I have been thinking about the buffer protocol for large blobs, though not in the context of I was thinking of supporting Numpy's ND arrays, which can be really big too. Also on some .NET versions, CLI arrays can be up to 4GB. The latter is exotic and the former still a little bit away, but The idea is to make it still easy to support by various types that implement the buffer protocol, and relatively easy to use by consumers. I am wary of using too much of I see the following three major consuming patterns that should be easy for consumers to implement:
I was thinking along these lines:
The file descriptor work seems to be finally coming to an end, so once you merge this PR I could play with these ideas in code to get some better insights. |
0bb4dfe
to
a472936
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I was reviewing this PR I noticed that there is one case that I failed to address in my PR #1891, so this comment is not about the changes you submit, but you may want to fix it together with the changes. It is about the series if if
s on lines 589-603 in TryAddRef
.
There is the fourth case missing that should go after the three existing:
if (exclusive && ((oldState & StateBits.RefCount) > StateBits.RefCount)) {
// mmap in non-exclusive use, temporarily no exclusive use allowed
reason = StateBits.Exclusive;
return false;
}
private int InterlockedOrState(int value) { | ||
#if NET5_0_OR_GREATER | ||
return Interlocked.Or(ref _state, value); | ||
#else | ||
int current = _state; | ||
while (true) { | ||
int newValue = current | value; | ||
int oldValue = Interlocked.CompareExchange(ref _state, newValue, current); | ||
if (oldValue == current) { | ||
return oldValue; | ||
} | ||
current = oldValue; | ||
} | ||
#endif | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that you factored it out. The while (true)
form is more efficient than do {...} while
because it accesses the volatile field only once per loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was not an intentional performance optimization. I just copy/pasted the .NET implementation. 😄
if ((newState & StateBits.RefCount) == StateBits.RefCountOne) { | ||
newState &= ~StateBits.Exporting; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interesting consequence of this implementation is that the Exporting
flag may not be reset right after the last export has been released, but when there are sill some non-exclusive calls in progress. However, it will be reset as soon as all of them exit mmap. I am OK with that since the non-exclusive calls are supposed to be quick and transient, but there is a corner case when mmap is so intensely being used that this bit may not be reset for a while. As a result, trying resize
in such state will result in BufferError
rather than EAGAIN
even when there are no extant exports. Reordering the tests in the MmapLocker
constructor would "fix" that, but at the expense of further deviating from CPython's error handling. I think that a 100% fix would require maintaining a separate number of exports counter in the mmap object, which is probably not worth the effort and added complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. I couldn't think of a way to do it without having a separate counter for exports which would probably result in a lot more complexity. I figured most calls are quick enough that it should drain down to one in most reasonable scenarios.
Hmm, I thought |
Yes! Sorry... |
Just a quick note in case I forget (probably won't but who knows). Noticed an issue with the |
If you are at it with another PR, a suggestion: mark the Windows P/Invoke with |
Add
IBufferProtocol
tommap
. Throws aNotImplementedException
if the length does not fit in aint
.For the case of bigger files I think we'll have to rework
IPythonBuffer
. My initial thoughts on changes would be:nint
in place ofint
where applicable (probably where CPython usesPy_ssize_t
).AsSpan
/AsReadOnlySpan
methods to have annint start
argument.nint
arithmetic instead ofint
.Marking as draft since I need to do more testing (and maybe write some tests). @BCSharp I'm not sure if you looked at/thought about this before so if you have any feedback would appreciate it.
Related to #1408