Request: Parsed structures note their locations in the input text #4
Comments
Hi. Because parsing is incremental, a node can't know its start/end offset in the character stream -- if it could, an insertion would require updating that piece of data in every node to the right of the insertion. Assuming your leaves are just strings, something like that should work.
(untested code) Does it help?
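A minimal sketch of deriving offsets after the fact, written in Python rather than parsley's Clojure and not taken from parsley itself. It assumes a node is a dict with hypothetical `tag` and `content` keys and that leaves are plain strings; `annotate` is an illustrative name, not a parsley function. Offsets are computed by walking the tree left to right and summing leaf lengths, so nothing has to be stored in the nodes during parsing.

```python
def annotate(node, start=0):
    """Return (annotated_copy, end_offset) for a tree whose leaves are strings.

    Assumed shape (hypothetical): {"tag": ..., "content": [child, ...]}.
    """
    if isinstance(node, str):
        end = start + len(node)
        return {"text": node, "start": start, "end": end}, end
    children = []
    offset = start
    for child in node["content"]:
        annotated, offset = annotate(child, offset)
        children.append(annotated)
    copy = {"tag": node["tag"], "content": children, "start": start, "end": offset}
    return copy, offset


# Example tree for the input text "(foo bar)".
tree = {"tag": "root", "content": [
    {"tag": "list", "content": [
        "(",
        {"tag": "sym", "content": ["foo"]},
        " ",
        {"tag": "sym", "content": ["bar"]},
        ")",
    ]},
]}

annotated, total = annotate(tree)
bar = annotated["content"][0]["content"][3]
print(bar["tag"], bar["start"], bar["end"])
```

The walk is linear in the size of the tree, so it suits a one-off pass over a finished parse better than repeated use after every edit.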
Yeah, I was kinda thinking it could do that update on everything to the right when it needed to, but I think you are also aiming to keep a logarithmic runtime on the incremental parsing, so that would ruin it. I had not thought of the scheme you are suggesting, because I thought that the content nodes would possibly throw away information during the parse, especially with the - suffix. If the parse tree can always be used to recreate the input, then something like this would definitely suit me. I'll try it out and see if it works.
Parsley keeps all characters. The - suffix only instructs the parser not to create a node and to inline its children in its parent node -- so only inner nodes are elided, not their children, which are reparented.
I see. Sorry for the false issue, I only took a real look at parsley this morning, and it's different from what I'm used to in a number of ways, so I haven't digested it all yet. I do think it might be nice to have that offsets scheme you mentioned built in, or more readily available to users at least, so I'll leave the issue open if you want to talk about doc updates or code tweaks related to that, or you can go ahead and close it if not.
Well, offsets are a special case of what I call "views" (aggregated properties on nodes/leaves), which I'm at last working on at the moment (because of sjacket) -- I helped @laurentpetit introduce them in an ad hoc manner in paredit.clj, but now they are going to be a built-in facility of parsley.
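To illustrate what an aggregated property could look like, here is a rough Python sketch. It is an assumption about the general idea, not parsley's actual views API: `make_view`, the `tag`/`content` node shape, and the identity-keyed cache are all hypothetical. A view is defined by a function on leaves and a function that combines child values; results are cached per node object, so a subtree shared between the old and new parse trees is not recomputed.

```python
def make_view(leaf_fn, combine):
    """Build a cached, bottom-up aggregate over a tree with string leaves."""
    cache = {}  # id(node) -> (node, value); holding the node keeps its id stable

    def view(node):
        if isinstance(node, str):
            return leaf_fn(node)
        hit = cache.get(id(node))
        if hit is not None and hit[0] is node:
            return hit[1]
        value = combine([view(child) for child in node["content"]])
        cache[id(node)] = (node, value)
        return value

    return view


# Two example views: text length and newline count.
length = make_view(len, sum)
newlines = make_view(lambda s: s.count("\n"), sum)

# "shared" stands for a subtree reused unchanged after an incremental edit.
shared = {"tag": "sym", "content": ["foo"]}
old_tree = {"tag": "root", "content": ["(", shared, ")"]}
new_tree = {"tag": "root", "content": [
    "(", shared, "\n", {"tag": "sym", "content": ["bar"]}, ")",
]}

print(length(old_tree), length(new_tree), newlines(new_tree))
```

With a length view available on every node, a start offset is the sum of the lengths of everything to the left, which never needs rewriting in nodes that an edit did not touch.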
Oh, excellent. I was thinking it'd be a shame if something like offsets was something that people had to reinvent/discover on their own every time someone needed it, but it sounds like you're working on a more general facility.
finding a node by its offset: https://github.com/cgrand/parsley/blob/master/test/net/cgrand/parsley/test.clj#L91
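For readers who do not want to follow the link, the lookup can be sketched as follows. This is an illustrative Python version, not a translation of the linked test: `path_to_offset` and the `tag`/`content` node shape are assumptions. It descends from the root, skipping each child whose text ends before the target offset and subtracting that child's length along the way.

```python
def length(node):
    """Number of characters covered by a node (leaves are strings)."""
    if isinstance(node, str):
        return len(node)
    return sum(length(child) for child in node["content"])


def path_to_offset(node, offset):
    """Return the nodes from the root down to the leaf containing `offset`."""
    if offset < 0:
        raise IndexError("offset must be non-negative")
    path = [node]
    while not isinstance(node, str):
        for child in node["content"]:
            n = length(child)
            if offset < n:
                node = child
                path.append(child)
                break
            offset -= n  # target lies to the right of this child
        else:
            raise IndexError("offset is past the end of the text")
    return path


# Example tree for the input text "(foo bar)".
tree = {"tag": "root", "content": [
    {"tag": "list", "content": [
        "(",
        {"tag": "sym", "content": ["foo"]},
        " ",
        {"tag": "sym", "content": ["bar"]},
        ")",
    ]},
]}

path = path_to_offset(tree, 6)
print([n["tag"] for n in path[:-1]], repr(path[-1]))
```

Here `length` is recomputed on every call for brevity; if each node carried a cached length, the descent would only inspect the children along one root-to-leaf path.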
As far as I can tell, and I'm embarrassed to admit that I'm quite over my head in your amazing sci-fi incremental parser code, there's no way you can make parsley generate the parse tree so that it notes the location in the text of each production. For example, it would be nice to be able to have each node point to the location of the first and last char that got parsed into that production. Or even just the first character's location. I don't think it's possible to do this with make-node, as that information does not get passed into the arguments.
Is this something that could be done to the parser? I'm not sure if something about its incremental parsing would prevent this.