-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some updates for DOM-first code selection #3640
Conversation
* Ensure resolution has happened when reconciling model with problems requested * Make getOrBuildAST taking focalPosition as input and pass focalPosition when processing long files (improve performance) * Support more cases in codeSelect
@@ -60,7 +60,7 @@ public IJavaElement[] codeSelect(int offset, int length) throws JavaModelExcepti | |||
if (offset + length > this.unit.getSource().length()) { | |||
throw new JavaModelException(new IndexOutOfBoundsException(offset + length), IJavaModelStatusConstants.INDEX_OUT_OF_BOUNDS); | |||
} | |||
org.eclipse.jdt.core.dom.CompilationUnit currentAST = this.unit.getOrBuildAST(this.owner); | |||
org.eclipse.jdt.core.dom.CompilationUnit currentAST = this.unit.getOrBuildAST(this.owner, this.unit.getBuffer().getLength() < 50000 ? -1 : offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the reason for not applying focal position always (ie. not just when the buffer is >= 50000) ? Is there something that's lost out on large files? Intuitively I'm just thinking that if we need to return a list of IJavaElement
s that are partly covered by a selection, then do we even need statement nodes completely outside the scope to make that decision ? Maybe if the selection goes across different scopes ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current rationale (that we can discuss as we have a lot of choices here) is to enable some caching. If we always care about focalPosition, then all the doms we get are partial and resolve faster (that is per se not too bad), but in case of codeSelect, we may prefer to build the whole DOM once and reuse it as long as the file is not modified. The size of the buffer was used as a criterion to decide whether to enable caching and ignore focalPosition, or whether to prefer a partial but non-reusable DOM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Ok, it just hit me. This also caches the AST, and returning a cached AST is more efficient that computing a partial (setFocalPosition
) one, but if we always compute a partial, we never get to cache a proper one. So may as well limit ourselves to cases where we know the full parse is particularly bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So may as well limit ourselves to cases where we know the full parse is particularly bad.
Yes, that's the idea, but I'm not sure what is the best resolution of "particularly bad" concretely. I went for document size because it's easy and cheap to get, but I guess some better conditions could be identified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe @datho7561 has some ideas from his look into parsing about some "easy" checks other than document size that can indicate parsing will be expensive.
The only other thing I've looked at is the usage of setFocalPosition(-1)
in regular cases as a way of saying, "don't set the focal position and do a full AST parse". It still sets the CompilationUnitResolver.PARTIAL
bit, and useSearcher
to true
, but these are only meaningfully used for determining whether we set IGNORE_METHOD_BODIES
. In other words we could never set IGNORE_METHOD_BODIES
on a full AST parse. Almost identical to what setFocalPosition
would do to begin with except even the elements around the focal position would be ignored if in a method body. Are you fine with this ? Assuming I understood the logic, it is a subtle difference to be aware of. Update: I mean at the very least it's unexpected behaviour going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
particularly bad
lambdas and switch expressions (these require some fancy and costly hacks in the parser to work), and deeply nested expressions. Lambdas and switch expressions could be identified by searching for ->
in the file text. However, that check itself could take a while for a large file. I don't know how much time this would save in practice though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me. Everything seems isolated from basic ECJ operations.
What it does
Improve DOM-based codeSelection in several ways:
How to test
Ensure no regression.
Author checklist