-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FilterX copy-on-write for mutable objects #330
Conversation
427be8e
to
4a0223c
Compare
I've written a longer PR description. The diagram may be hard to understand due to the lack of context, but we can talk about it IRL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you! 🚀
Notes for the other reviewers:
I checked the unwrap_ro
usages, I kindly ask you to do the same to make sure the RO version is not mutated anywhere.
The merge-base changed after approval.
This looks awesome. I would like to recheck the downcast commit one more time, then approve and merge will come :) |
Calling I am considering freezing those objects or replacing their |
The merge-base changed after approval.
Approve. @MrAnno Could you kindly rebase this to main? GitHub does not let me merge this way. Thanks! |
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Needed for introducing FilterXRef. Signed-off-by: László Várady <[email protected]>
Deep-copy assignment (the `.something = `, `["something"] = `, `something =`, `something +=` operators) is an essential feature of the FilterX language, making it easy to trace and understand how values change. However, copy (especially deep-copy) is expensive, so this PR aims to fix the performance bottleneck by introducing copy-on-write for mutable objects. To achieve CoW semantics without introducing endless changes in the `filterx` folder, `FilterXRef` has been introduced. References are currently not part of the FilterX language (hopefully, they never will be). `FilterXRef` is used only internally to reference the same `FilterXObject` from multiple locations (variables, other types of assignments). For this reason, `FilterXRef` pretends to be a `FilterXObject`, but it's not a real FilterX type. `FilterXRef` transparently propagates operations (virtual function calls) to the referenced value, but it also makes sure that the value is copied on the first write operation in case multiple references exist. `FilterXRef` is created only for mutable objects, and only when they are needed: around assignment-like operators; this and its `FilterXObject` behavior was the key to avoid polluting all code paths with references. Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
The function will be extended with FilterXRef-related functionality, which requires a separate unit to avoid circular dependency between FilterX objects and refs. Signed-off-by: László Várady <[email protected]>
FilterXRef can mimic the functionality of its referenced value, but only through virtual methods. filterx_object_is_type() is usually followed by an implicit type cast, which is impossible to implement transparently (without the unwrap() call), so this safety check has been added. This is far from ideal, but doing downcasts is the original and real problem here. Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
"lhs_object" is "self". Signed-off-by: László Várady <[email protected]>
`FilterXRef` is created only for mutable objects, and only when they are needed: around assignment-like operators; this and its `FilterXObject` behavior is the key to avoid polluting all code paths with references. Full CoW on subscript and attr operators (`.something`, `["something"]`) of JSON objects is currently not implemented. I've tried to make it work, but it requires a complete redesign of how we cache JSON subobjects (or alternatively, we need an internal representation for dicts and lists without using the json-c library). This will be implemented later. For now, such operators make unnecessary copies (CoW only partially works on them). Signed-off-by: László Várady <[email protected]>
FilterXRef can mimic the functionality of its referenced value, but only through virtual methods. Where (implicit) downcasts are used, it is impossible to be transparent about references, so these unwrap methods have to be used. This is far from ideal, but doing downcasts based on is_type() is the original and real problem here. In such cases, `filterx_ref_unwrap()` must be called to extract the original object. The extracted object must only be used locally (short-term, without referencing or storing it anywhere). Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Signed-off-by: László Várady <[email protected]>
Done, thanks. |
Depends on #346
Copy-on-write and
FilterXRef
Deep-copy assignment (the
.something =
,["something"] =
,something =
,something +=
operators) is an essential feature of the FilterX language, making it easy to trace and understand how values change.However, copy (especially deep-copy) is expensive, so this PR aims to fix the performance bottleneck by introducing copy-on-write for mutable objects.
To achieve CoW semantics without introducing endless changes in the
filterx
folder,FilterXRef
has been introduced.References are currently not part of the FilterX language (hopefully, they never will be).
FilterXRef
is used only internally to reference the sameFilterXObject
from multiple locations (variables, other types of assignments).For this reason,
FilterXRef
pretends to be aFilterXObject
, but it's not a real FilterX type.FilterXRef
transparently propagates operations (virtual function calls) to the referenced value, but it also makes sure that the value is copied on the first write operation in case multiple references exist.FilterXRef
is created only for mutable objects, and only when they are needed: around assignment-like operators;this and its
FilterXObject
behavior was the key to avoid polluting all code paths with references.Implementation concerns
Unfortunately,
FilterXRef
can only be transparent when virtual calls are used.We use a lot of
downcasting
andisinstanceof()
calls for logical and performance reasons.Where explicit downcasting is needed,
filterx_ref_unwrap()
must be called to extract the original object. The extracted object must only be used locally (short-term, without referencing or storing it anywhere).This is ugly and hard to maintain, but downcasting is the original problem here. A safety check has been added to
filterx_object_is_type()
in order to detect if we forget to callunwrap()
.(And since
FilterXRef
is only for mutable types, I could avoid polluting the codebase with those calls where immutable downcasts were used.)Everything sounds simple in theory, but debugging this with GDB is a nightmare if you try to catch a bug.
Current limitations
Full CoW on subscript and attr operators (
.something
,["something"]
) of JSON objects is currently not implemented.I've tried to make it work, but it requires a complete redesign of how we cache JSON subobjects (or alternatively, we need an internal representation for dicts and lists without using the json-c library). This will be implemented later.
For now, such operators make unnecessary copies (CoW only partially works on them).