-
Notifications
You must be signed in to change notification settings - Fork 1.3k
[ntuple] fix schema evolution with streamer fields #18451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[ntuple] fix schema evolution with streamer fields #18451
Conversation
Register streamer infos of the extra type info when connecting streamer field to page source.
What do you mean by "fix de-s11n of streamer info records " ? |
diagRAII.requiredDiag(kError, "TBufferFile::ReadVersion", "Could not find the StreamerInfo with a checksum of", | ||
false /* matchFullMessage */); | ||
diagRAII.requiredDiag(kError, "TBufferFile::CheckByteCount", "object of class TemperatureKelvin read too few bytes", | ||
false /* matchFullMessage */); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand those error message. What is the case being tested here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I write a TemperatureCelsius
and then try to read it as TemperatureKelvin
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Humm ... why isn't the error message closer to "Can not convert (or read) a TemperatureCelsius
(in)to a TemperatureKelvin
" ? Without the corresponding checks the failure could be 'serious'. If I understood correctly, the user could have a class A
with version 2
which contains a long
then request the reading a class B
with also version 2
which contains a pointer the the I/O would succeed but not apply any check nor conversion and thus have a random number stored in the pointer. Did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a proper exception is part of the TODO
written above. I don't understand your example, how is this different from the case tested here that even has compatible class layouts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am probably mis-reading the actual test but ...
"TBufferFile::ReadVersion", "Could not find the StreamerInfo with a checksum of",
This message should appear only if there is something wrong with the file and the StreamerInfo record has not been properly stored. Seeing this indicates either a file corruption or a serious deficiency in ROOT (or maybe the user's code)
"TBufferFile::CheckByteCount", "object of class TemperatureKelvin read too few bytes",
This message should only appear if there is an unexpected impedance mismatch between the StreamerInfo and the data onfile. Seeing this indicates either a file corruption or a serious deficiency in ROOT (or maybe the user's code).
As far as my examples is concern, let me try to clarify it. With:
class A {
int value;
};
class B {
int value;
};
class C {
int value;
};
class D {
long value;
};
i.e. 3 distinct classes with the exact same layout one with a slight difference (And because they have a different name, 4 different CheckSums).
and with a dictionary for all 4 and the following rules:
#pragma read sourceClass="A" targetClass="C";
#pragma read sourceClass="A" targetClass="D" checksum="[correct_value_for_A]" source="int value" \
target="value" code="{ value = onfile.value*100; } "
you have the following consequences:
Reading From / Onfile | Reading Into / In Memory | |
---|---|---|
A | A | allowed |
A | B | NOT allowed, the user did not say those 2 types were equivalent |
A | C | allowed - need a trivial conversion StreamerInfo |
A | D | allowed (see note (1)) - need a non-trivial conversion StreamerInfo |
In the not allowed cases, we should get an error message akin to Can not read a "A" into a "B"
Note (1), since the user specified a checksum, if the file also contains instance of A with a different schema (eg. class A { long value; }
from a different release of the software), the conversion is not allowed. In this particular example, the user may or may not have wanted to multiply value
when reading those other As
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO(jblomer): this should fail with an exception when we connect the page source
humm ... Maybe the TODO
and I agree :) ... Assuming that is correct, I am strongly suggesting that it is actually 'urgent' to implement the TODO
as the current situation (if I understand correctly) will be very confusing to users (likely they will think that there is a file corruption (or bug in ROOT) rather than a bug in their code).
int fInt = 0; | ||
int fAdded = 137; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To test the indirect collection of StreamerInfo, there should be here a pointer to another class (possibly initialized with an instance of a derived class).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already test that with
root/tree/ntuple/test/rfield_streamer.cxx
Line 126 in 031f6bd
TEST(RField, StreamerPoly) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically the test linked will succeed whether or not the StreamerInfo
are stored ... since they are still in memory from the already loaded dictionary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.e. like here to test the feature, one need 2 separate processes and 2 distinct schemas/versions.
Test Results 17 files 17 suites 3d 18h 24m 9s ⏱️ Results for commit 04b27d7. |
Two bugs are fixed in |
@@ -612,6 +612,7 @@ private: | |||
REntryRange fEntryRange; ///< Used by the cluster pool to prevent reading beyond the given range | |||
bool fHasStructure = false; ///< Set to true once `LoadStructure()` is called | |||
bool fIsAttached = false; ///< Set to true once `Attach()` is called | |||
bool fHasStreamerInfosRegistered = false; ///< Set to true when ReisterStreamerInfos() is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool fHasStreamerInfosRegistered = false; ///< Set to true when ReisterStreamerInfos() is called. | |
bool fHasStreamerInfosRegistered = false; ///< Set to true when RegisterStreamerInfos() is called. |
// Builds the streamer info records from the descriptor's extra type info section. This is necessary when | ||
// connecting streamer fields so that emulated classes can be read. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Builds the streamer info records from the descriptor's extra type info section. This is necessary when | |
// connecting streamer fields so that emulated classes can be read. | |
/// Builds the streamer info records from the descriptor's extra type info section. This is necessary when | |
/// connecting streamer fields so that emulated classes can be read. |
TEST(RNTupleEvolution, StreamerField) | ||
{ | ||
FileRaii fileGuard("test_ntuple_evolution_streamer_field.root"); | ||
fileGuard.PreserveFile(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be needed, probably a left-over from debugging
diagRAII.requiredDiag(kError, "TBufferFile::ReadVersion", "Could not find the StreamerInfo with a checksum of", | ||
false /* matchFullMessage */); | ||
diagRAII.requiredDiag(kError, "TBufferFile::CheckByteCount", "object of class TemperatureKelvin read too few bytes", | ||
false /* matchFullMessage */); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a proper exception is part of the TODO
written above. I don't understand your example, how is this different from the case tested here that even has compatible class layouts?
No description provided.