Some updates for DOM-first code selection #3640

mickaelistria · 2025-01-30T15:13:58Z

What it does

Improve DOM-based codeSelection in several ways:

Ensure resolution has happened when reconciling model with problems requested
Make getOrBuildAST taking focalPosition as input and pass focalPosition when processing long files (improve performance)
Support more cases in codeSelect

How to test

Ensure no regression.

Author checklist

I have thoroughly tested my changes
The change is following the coding conventions
I have signed the Eclipse Contributor Agreement (ECA)

* Ensure resolution has happened when reconciling model with problems requested * Make getOrBuildAST taking focalPosition as input and pass focalPosition when processing long files (improve performance) * Support more cases in codeSelect

rgrunber · 2025-01-30T20:30:20Z

org.eclipse.jdt.core/codeassist/org/eclipse/jdt/internal/codeassist/DOMCodeSelector.java

@@ -60,7 +60,7 @@ public IJavaElement[] codeSelect(int offset, int length) throws JavaModelExcepti
 		if (offset + length > this.unit.getSource().length()) {
 			throw new JavaModelException(new IndexOutOfBoundsException(offset + length), IJavaModelStatusConstants.INDEX_OUT_OF_BOUNDS);
 		}
-		org.eclipse.jdt.core.dom.CompilationUnit currentAST = this.unit.getOrBuildAST(this.owner);
+		org.eclipse.jdt.core.dom.CompilationUnit currentAST = this.unit.getOrBuildAST(this.owner, this.unit.getBuffer().getLength() < 50000 ? -1 : offset);


What would be the reason for not applying focal position always (ie. not just when the buffer is >= 50000) ? Is there something that's lost out on large files? Intuitively I'm just thinking that if we need to return a list of IJavaElements that are partly covered by a selection, then do we even need statement nodes completely outside the scope to make that decision ? Maybe if the selection goes across different scopes ?

The current rationale (that we can discuss as we have a lot of choices here) is to enable some caching. If we always care about focalPosition, then all the doms we get are partial and resolve faster (that is per se not too bad), but in case of codeSelect, we may prefer to build the whole DOM once and reuse it as long as the file is not modified. The size of the buffer was used as a criterion to decide whether to enable caching and ignore focalPosition, or whether to prefer a partial but non-reusable DOM.

Right. Ok, it just hit me. This also caches the AST, and returning a cached AST is more efficient that computing a partial (setFocalPosition) one, but if we always compute a partial, we never get to cache a proper one. So may as well limit ourselves to cases where we know the full parse is particularly bad.

So may as well limit ourselves to cases where we know the full parse is particularly bad.

Yes, that's the idea, but I'm not sure what is the best resolution of "particularly bad" concretely. I went for document size because it's easy and cheap to get, but I guess some better conditions could be identified.

Maybe @datho7561 has some ideas from his look into parsing about some "easy" checks other than document size that can indicate parsing will be expensive.

The only other thing I've looked at is the usage of setFocalPosition(-1) in regular cases as a way of saying, "don't set the focal position and do a full AST parse". It still sets the CompilationUnitResolver.PARTIAL bit, and useSearcher to true, but these are only meaningfully used for determining whether we set IGNORE_METHOD_BODIES. In other words we could never set IGNORE_METHOD_BODIES on a full AST parse. Almost identical to what setFocalPosition would do to begin with except even the elements around the focal position would be ignored if in a method body. Are you fine with this ? Assuming I understood the logic, it is a subtle difference to be aware of. Update: I mean at the very least it's unexpected behaviour going forward.

particularly bad

lambdas and switch expressions (these require some fancy and costly hacks in the parser to work), and deeply nested expressions. Lambdas and switch expressions could be identified by searching for -> in the file text. However, that check itself could take a while for a large file. I don't know how much time this would save in practice though.

rgrunber

Looks fine to me. Everything seems isolated from basic ECJ operations.

Some updates for DOM-first code selection

c0194ce

* Ensure resolution has happened when reconciling model with problems requested * Make getOrBuildAST taking focalPosition as input and pass focalPosition when processing long files (improve performance) * Support more cases in codeSelect

mickaelistria mentioned this pull request Jan 30, 2025

DOM-based completion crashing in Eclipse IDE; must push latest getOrBuildAST upstream eclipse-jdtls/eclipse-jdt-core-incubator#1197

Open

rgrunber reviewed Jan 30, 2025

View reviewed changes

rgrunber added this to the 4.35 M3 milestone Feb 3, 2025

rgrunber approved these changes Feb 3, 2025

View reviewed changes

rgrunber assigned mickaelistria Feb 3, 2025

rgrunber merged commit 9827871 into eclipse-jdt:master Feb 3, 2025
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Some updates for DOM-first code selection #3640

Some updates for DOM-first code selection #3640

mickaelistria commented Jan 30, 2025

rgrunber Jan 30, 2025 •

edited

Loading

mickaelistria Jan 31, 2025

rgrunber Jan 31, 2025

mickaelistria Jan 31, 2025

rgrunber Jan 31, 2025 •

edited

Loading

datho7561 Jan 31, 2025

rgrunber left a comment

Some updates for DOM-first code selection #3640

Some updates for DOM-first code selection #3640

Conversation

mickaelistria commented Jan 30, 2025

What it does

How to test

Author checklist

rgrunber Jan 30, 2025 • edited Loading

Choose a reason for hiding this comment

mickaelistria Jan 31, 2025

Choose a reason for hiding this comment

rgrunber Jan 31, 2025

Choose a reason for hiding this comment

mickaelistria Jan 31, 2025

Choose a reason for hiding this comment

rgrunber Jan 31, 2025 • edited Loading

Choose a reason for hiding this comment

datho7561 Jan 31, 2025

Choose a reason for hiding this comment

rgrunber left a comment

Choose a reason for hiding this comment

rgrunber Jan 30, 2025 •

edited

Loading

rgrunber Jan 31, 2025 •

edited

Loading