Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comments #87

Open
rbeyer opened this issue Mar 23, 2021 · 2 comments
Open

Comments #87

rbeyer opened this issue Mar 23, 2021 · 2 comments

Comments

@rbeyer
Copy link
Member

rbeyer commented Mar 23, 2021

Is your feature request related to a problem? Please describe.
The pvl library quietly ignores comments when it reads PVL-text, and has no mechanism for programmatically adding comments to a Python structure that could be given to the dumpers (pvl.dump() and pvl/dumps()).

There was some discussion of this in #35, but that was before the architecture change to pvl 1.0.0 and there was never a PR.

Thoughts
The decoders currently have a "hook" for recognizing comments (so they can be skipped over), but this could be used to capture comments. The PVL Blue Book indicates that comments can appear ... anywhere. They can even be between things that you might not expect:

distance /* comments */ = /* can */ 1 /* be */ <AU> /* anywhere */

When I first understood this, it made my brain hurt. The above PVL-text would be parsed into a single key-vale pair (with units), but had four comments associated with it. To properly keep track of their location, you'd have to do a tremendous amount of book-keeping.

Fortunately, the ODL/PDS3 specification doesn't allow this sort of nonsense, and requires that comments be on a line by themselves, or must come at the end of a line, like so:

/* This is a comment 
that is all by itself */
distance = 1 <AU> /* This comment is at the end of a line */

Since there is almost no usage of straight PVL, and the only practical usage is ODL/PDS3 and variants thereof, I think that a comment system that robustly handled the ODL/PDS3 comment situation, but not the PVL situation would be fine. I think the right behavior would be to recognize the interspersed comments in the PVL example, but to parse them each into "separate comments" in the data structure that were associated with the parameter-value pairing in the right order but not try to preserve their location amongst the elements (it could be done, but I don't think there's a lot of practical use).

To realize this in code requires a new container object that would be returned by the loaders and accepted by the dumpers. We would either need to modify our pvl.collections.OrderedMultiDict or the experimental pvl.collections.PVLMultiDict (based on multidict.MultiDict) to be able to handle keeping track of comment objects and associating them with the right position in the dict-like. Users would need access to them, and they would need to behave correctly if you reordered things, or needed to build a new dict-like from scratch and add your own comments to it, etc.

This seems hard, and I'm loathe to try and do this from scratch. Fortunately, we may be able to collapse this to a previously-solved problem. The TOML language allows comments in similar relation to key-value pairs that ODL/PDS3 does. The toml Python package has the same behavior as pvl, it just skips over comments and doesn't try to parse them (hey, I said it was hard). However, the tomlkit Python package does, and has a container.Container object that we might just be able to use without alteration. It is a dict-like that should function similar to our OrderedMultiDict, but has "document" aspects like managing comments, and even allowing management of blank lines to help preserve readability, while still presenting dict-like functionality for the "data."

I don't have a strong need for this, so I'm unlikely to work this any time soon, but I just wanted to put this down as a seed for future development.

@michaelaye
Copy link
Member

michaelaye commented Mar 24, 2021 via email

@michaelaye
Copy link
Member

I use tomlkit now to keep the config file the same between updates.
How about storing comments into another dict under label['comments']['original_key'] ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants