-
Notifications
You must be signed in to change notification settings - Fork 179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add basic introduction to opcache and map_ptr #134
base: master
Are you sure you want to change the base?
Add basic introduction to opcache and map_ptr #134
Conversation
This looks basically fine to me, but I think from a didactic perspective, you should be focusing on INIT (non-shm) / NEW (shm) and GET/SET. The IMM variants are a performance optimization in case only the shm representation is used. I think it only gets used for mutable_data. If people use these macros, they generally want to reach for the non-IMM versions. FYI I also have some description of SHM and map pointers at https://www.npopov.com/2021/10/13/How-opcache-works.html. |
Yeah, the order is a bit off. I'll try to fix that.
Thanks! I think I actually never read that one. |
location of the exact element. | ||
|
||
The map ptr can also store real pointers. This can be useful when the data structure doesn't live in SHM and thus | ||
doesn't need to store the value in a separate map. The fact that pointers are aligned by their size is used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is actually wrong. We're not just storing pointers to pointers but also pointers to other data structures. I think ZEND_MAP_PTR_DEF(bool*, foo)
would fail as its alignment is 1 byte and it thus could be stored in 0xABC1 without packing. So we're relying on the fact that the structures we're pointing themselves to have an alignment of >1. Maybe ZEND_MAP_PTR_DEF
could even assert that.
@nikic Can you confirm this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally yes, with the caveat that there are both type alignment and allocator alignment guarantees. E.g. all ZMM allocations are 4/8 byte aligned, and the system allocator has similar guarantees. You couldn't store a pointer into the middle of a string though, which might not be sufficiently aligned.
differentiate between offsets and real pointers. The offsets start at 1, 9, 17, etc (assuming a 64-bit pointer size). | ||
However, pointers (unless padding was removed) would not get aligned this way. Instead, they would be stored in | ||
addresses that end with 0x0 or 0x8 given that they are 8-byte aligned. The macro ``ZEND_MAP_PTR_IS_OFFSET`` will be used | ||
internally to check if the ``0b1`` byte is set. If it is, the value stored is an offset. Otherwise it's a direct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
byte -> bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! That makes sense. I guess we're also never using ZEND_MAP_PTR_DEF
to store an offset into a structure.
struct Foo {
bool a;
bool b;
};
struct Bar {
ZEND_MAP_PTR_DEF(bool*, b);
};
struct Foo *foo = emalloc(sizeof(struct Foo));
bool *b = &foo->b; // b would have the 0b1 bit set
struct Bar *bar = emalloc(sizeof(struct Bar));
ZEND_MAP_PTR_INIT(bar->b, b); // bar->b would be interpreted as an offset
I'll adjust the explanation accordingly.
EDIT: Oops, I meant to post that on the comment abovel
}; | ||
|
||
``ZEND_MAP_PTR_INIT`` assigns a value directly to the underlying offset field. This must only be done when the structure | ||
is not yet in SHM. In practice, this marco should usually only be used to initialize the map ptr to ``NULL``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
marco -> macro. I also wouldn't say this writes the offset field, it writes the pointer. It does not need to be NULL for the non-SHM use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to point to the fact that usually code should be agnostic on whether an offset or pointer is stored, so you should usually do ZEND_MAP_PTR_INIT(x->y, NULL)
and then ZEND_MAP_PTR_SET(x->y, ptr)
. Maybe I can clarify that ZEND_MAP_PTR_INIT
can be used when you know the structure doesn't live in SHM as a performance optimization.
|
||
* ``zend_persistent_script.dynamic_members`` stores information about the state of the cached script | ||
* ``zend_class_entry.inheritance_cache`` is extended if a new combination of parent class and interfaces is encountered | ||
* ``zend_op_array.handler`` can be replaced at runtime by the tracing JIT to start executing JITted code | ||
* ``zend_op.handler`` can be replaced at runtime by the tracing JIT to start executing JITted code | ||
* ``zend_string.gc.refcount`` for persistent strings can be replaced with map ptr offset to cache class lookups by name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* ``zend_string.gc.refcount`` for persistent strings can be replaced with map ptr offset to cache class lookups by name | |
* ``zend_string.gc.refcount`` for persistent strings can be replaced with a map pointer (map_ptr) offset to cache the lookup of classes by name |
Found this sentence somewhat confusing
``ZEND_MAP_PTR_NEW`` can be used to generate a new offset and store it in the given field. This must only be done when | ||
the structure is not yet in SHM. This step can be skipped if the data structure is not intended to be stored in SHM and | ||
thus doesn't need any pointer indirection. | ||
|
||
:: | ||
|
||
ZEND_MAP_PTR_NEW(ce->mutable_data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few caveats with ZEND_MAP_PTR_NEW
that I think are worth mentioning:
One is that it is only safe to use ZEND_MAP_PTR_NEW
during compilation of user scripts. This is because opcache needs to update ZCSG(map_ptr_last)
so that all processes are aware of the new offsets, which it does after compilations. It is also safe to use during the startup phase, but not in the MINIT() of dl()'ed modules.
An other one is that offsets should be allocated sparingly, and never on non persistent structures, as offsets are never freed or reused (and this eventually affects the memory usage of all processes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One is that it is only safe to use ZEND_MAP_PTR_NEW during compilation of user scripts.
I hinted at this with:
This must only be done when the structure is not yet in SHM.
I'll see if I can reword this to make it clearer. The fact that we're modifying ZCSG(map_ptr_last)
also means we need an exclusive write lock.
It took me a while to understand
map_ptr
, thus my motivation to document it. I'm not sure if my explanation is easy to follow, please let me know if there's anything I can improve.