feat: EDM4HEPSchema and Newstyle FCCSchema #1245
base: master
tests/test_nanoevents_edm4hep.py
Outdated
assert field in delayed_fields

def test_MC_daughters(delayed_events):
these tests should also check correctness!
#

def test_KaonParent_to_PionDaughters_Loop(eager_events):
ah ok, I guess correctness is checked here in the end.
Perhaps some more basic checks in the prior tests would still be useful?
Yeah, test_KaonParent_to_PionDaughters_Loop attempts to check the correctness of Parents and Daughters relations. Sure, I can add more basic tests
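As an illustration of the kind of basic correctness check discussed here, the following is a minimal sketch in plain Python. It does not use the schema or the test file from this PR: the toy event, the list-of-index-lists layout, and the helper names `relations_are_consistent` and `daughter_pdgs` are all hypothetical stand-ins for whatever the resolved Parents and Daughters relations look like in practice.

```python
# Hypothetical toy event: plain index lists standing in for the resolved
# parents/daughters relations of an MC particle collection.
pdg = [310, 211, -211]          # K0_S, pi+, pi-
daughters = [[1, 2], [], []]    # particle 0 decays to particles 1 and 2
parents = [[], [0], [0]]        # particles 1 and 2 both come from particle 0


def relations_are_consistent(parents, daughters):
    """Return True if every parent->daughter edge has a matching
    daughter->parent edge, and vice versa."""
    for i, kids in enumerate(daughters):
        for k in kids:
            if i not in parents[k]:
                return False
    for i, moms in enumerate(parents):
        for m in moms:
            if i not in daughters[m]:
                return False
    return True


def daughter_pdgs(pdg, daughters, parent_pdg):
    """Sorted daughter PDG IDs for every particle with the given PDG ID."""
    return [
        sorted(pdg[k] for k in daughters[i])
        for i, p in enumerate(pdg)
        if p == parent_pdg
    ]
```

A basic test could then assert `relations_are_consistent(parents, daughters)` on a small known sample before the more involved kaon-to-pion loop test runs.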
Can you rebase all the commits into one so that the 12 MB file isn't in the repo history at all? Thanks!
9934f04 to eeeeef5
Done
@prayagyadav is there anything more you want to do on this PR?
yeah, gotta sort out some features which I missed. Also, I have to add more comments ...
OK - ping me when you are done! It's otherwise looking good so far.
…adable with the EDM4HEPSchema_v00_10_05
Hi, @lgray. I am sorry for the radio silence.
…oot file (Don't merge yet: This needs the new feature:'infer typename in uproot.dask' to be merged for this to work)
for more information, see https://pre-commit.ci
…into edm4hep-schema
@prayagyadav you should be able to determine the C++ just from the file metadata, right? If so, if you dig through uproot a bit you can extract it without needing to resort to materializing an array. Otherwise, let's wait for that PR to go through and an uproot release. Also, @nsmith- had some interesting things to say about dealing with streamers when we were talking yesterday. That may help here.
@lgray Using this script that I found, I can list out all the relevant metadata for a given EDM4HEP-based file. I found out that some generations of FCC samples have the typename info in the Unfortunately, the older generations like the Spring2021 campaign do not have the typename info in metadata. Another issue is that the collection typenames are incorrect! Maybe it's just the problem with this script, but I am not sure. In the long term, it would be very beneficial to get access to the Collection IDs which are stored in the metadata. Is there a way to access the
Regarding the format of this metadata: I would like to point out that it is technically not covered by the stability guarantees we give in other places; see also the documentation, or a concrete example of how we want to change this: AIDASoft/podio#711. If you want to have the ground truth of the generation, you can get the actual full definition of the datamodel as a JSON-encoded string from
Maybe this is because the script gives you the names of the classes in the user-layer (see documentation), while what actually lands in the files comes from the POD layer. So if you see
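To make the JSON-encoded datamodel definition mentioned above more concrete, here is a minimal sketch of decoding such a string with the standard library. Where the string lives in the file is not shown here, and the JSON layout below is an assumption: it is a hand-written, heavily trimmed stand-in that simply mirrors the YAML sections (`datatypes`, `Members`, `OneToManyRelations`). The helper `relation_names` is hypothetical.

```python
import json

# Hypothetical, hand-written stand-in for a JSON-encoded datamodel
# definition. The real layout may differ from this sketch.
definition_json = """
{
  "datatypes": {
    "edm4hep::MCParticle": {
      "Members": ["int32_t PDG // PDG code of the particle"],
      "OneToManyRelations": [
        "edm4hep::MCParticle parents // parents of this particle",
        "edm4hep::MCParticle daughters // daughters of this particle"
      ]
    }
  }
}
"""

definition = json.loads(definition_json)


def relation_names(definition, datatype, kind="OneToManyRelations"):
    """List relation names of one kind for a datatype.

    Assumes each entry has the form 'type name // comment'.
    """
    entries = definition["datatypes"][datatype].get(kind, [])
    return [entry.split("//")[0].split()[1] for entry in entries]
```

Under these assumptions, `relation_names(definition, "edm4hep::MCParticle")` lists the relation names declared for that datatype, without relying on the collection typenames reported by a metadata script.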
@tmadlener - instead the output of the script seems shuffled? I guess
Ah right, didn't catch that, sorry. From a quick look, I think this So Looking at
@lgray @davidlange6 @gomber
Here is a clean draft for the EDM4HEPSchema (edm4hep1) and FCCSchema based on the same. I have not yet managed to add many comments and descriptions, but I plan to add them eventually.
Workings:
The EDM4HEPSchema reads the edm4hep.yaml file from the assets directory. I felt this was necessary to add maximum functionality to the schema. Reading the specifications of all the 'components' and 'datatypes' from the yaml file helps to identify the 'members' (for example, energy is a member of the edm4hep::ReconstructedParticle datatype) and which members correspond to the various types of cross-branch relations in EDM4HEP: vector members, OneToOneRelations, OneToManyRelations and Links.
The Schema fetches the comments in the edm4hep.yaml file and assigns them as docstrings to the relevant branches.
The EDM4HEPSchema supports all these relations (with Links needing some manual boilerplate code from the user).
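The idea of reading member declarations and their comments from the datamodel definition can be sketched as follows. This is not the code in this PR: the dict below is a hand-written stand-in for what a YAML loader would return for one datatype, its entries are illustrative rather than copied from edm4hep.yaml, and `parse_member` and `describe_datatype` are hypothetical helper names. Links are not covered by this sketch.

```python
# Hand-written stand-in for one parsed 'datatypes' entry. The declarations
# follow the 'type name // comment' pattern; contents are illustrative.
datatype_spec = {
    "Description": "Reconstructed Particle",
    "Members": [
        "int32_t PDG // PDG of the reconstructed particle",
        "float energy // energy of the reconstructed particle",
    ],
    "OneToOneRelations": [
        "edm4hep::Vertex decayVertex // decay vertex",
    ],
    "OneToManyRelations": [
        "edm4hep::Cluster clusters // clusters used for this particle",
    ],
}

# Section names under which member declarations are looked up.
MEMBER_KINDS = (
    "Members",
    "VectorMembers",
    "OneToOneRelations",
    "OneToManyRelations",
)


def parse_member(declaration):
    """Split 'type name // comment' into its three parts."""
    decl, _, doc = declaration.partition("//")
    type_name, member_name = decl.strip().rsplit(None, 1)
    return {"type": type_name, "name": member_name, "doc": doc.strip()}


def describe_datatype(spec):
    """Map each member name to its kind, C++ type, and docstring."""
    described = {}
    for kind in MEMBER_KINDS:
        for declaration in spec.get(kind, []):
            member = parse_member(declaration)
            described[member["name"]] = {
                "kind": kind,
                "type": member["type"],
                "doc": member["doc"],
            }
    return described


described = describe_datatype(datatype_spec)
```

With a mapping like `described`, a schema could look up whether a branch is a plain member or a relation and attach the `doc` text as its docstring.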
The version of the edm4hep.yaml file used is here. Please note that the way Links are represented in EDM4HEP has changed in the latest commit. @tmadlener can comment more on this. In any case, it seems necessary to find a way to track the changes from edm4hep.yaml, so that the COFFEA EDM4HEPSchema does not become obsolete after a few version changes.
Link to example Notebooks:
Tests:
Other comments:
ExtraCode sections mentioned in edm4hep.yaml. They appear to be declarations for C++ methods specific to certain collections.