Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Selection syntax] Generalized autocomplete functionality #26273

Merged
merged 16 commits into from
Dec 10, 2024

Conversation

salazarm
Copy link
Contributor

@salazarm salazarm commented Dec 4, 2024

Summary & Motivation

This PR introduces generalized autocomplete functionality that can be re-used across grammars that follow the base structure of our asset selection syntax grammar.

As an input the autocomplete factory takes the names of attributes, their values, and the list of functions available in the grammar.

To implement auto-complete I opted to create a new grammar so that I can allow certain things to still be valid inputs.
Benefits of separate grammar:

  • IncompleteOrExpression, IncompleteAndExpression, IncompleteFunctionCall, etc. allow our syntax highlighting and autocomplete visitors to visit the full expression even if parts of it are malformed.
  • DownTraversal and UpTraversal are distinguished simplifying the visitor logic
  • Whitespace is differentiated between afterExpressionWhitespace and afterLogicalOperatorWhitespace, etc. greatly simplifying the visitors that provide suggestions in whitespace. (Instead of needing a visit function for every possible expression we instead have the whitespace tokens handle creating suggestions).
  • orToken, andToken, notToken simplifies visitor logic, rather having the various expressions that contain these tokens handle what to do when the cursor is over them, we can instead have the token itself handle what happens and reduce the amount of visitor logic.
  • Separate tokens for FunctionName AttributeName AttributeValue - also reduces visitor logic

Generally speaking, my strategy for the grammar was to make leaf nodes that are shared among the various expression in order to allow those leaf nodes to make suggestions and reduce the number of visit functions required in the visitor (avoid combinatorial explosion of cases).

The reason for doing this.cursorIndex - 1 in the nodeIncludesCursor function is that I want the suggestions to be determined by the token PRECEEDING the cursor, so I want the token immediately before the cursor to return "true" for nodeIncludesCursor.

I can probably clean up the grammar a bit, but quite fatigued for this PR. In a future PR I'll look to simplify the grammar even more but what I have now passes all the tests I've written.

How I Tested These Changes

Extensive jest testing and real testing.

Screenshot 2024-12-04 at 11 20 58 AM

Copy link

github-actions bot commented Dec 4, 2024

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-b2aoranul-elementl.vercel.app
https://salazarm-autocomplete.core-storybook.dagster-docs.io

Built with commit 224ed83.
This pull request is being automatically deployed with vercel-action

@salazarm salazarm force-pushed the salazarm/autocomplete branch from c188430 to 0ed15f1 Compare December 5, 2024 16:49
Comment on lines +507 to +552
return function (cm: CodeMirror.Editor, _options: CodeMirror.ShowHintOptions): any {
const line = cm.getLine(0);
const {parseTrees} = parseInput(line);

let start = 0;
const actualCursorIndex = cm.getCursor().ch;

let visitorWithAutoComplete;
if (!parseTrees.length) {
// Special case empty string to add unmatched value results
visitorWithAutoComplete = new SelectionAutoCompleteVisitor({
attributesMap,
functions,
nameBase,
line,
cursorIndex: actualCursorIndex,
});
start = actualCursorIndex;
visitorWithAutoComplete.addUnmatchedValueResults('');
} else {
for (const {tree, line} of parseTrees) {
const cursorIndex = actualCursorIndex - start;
const visitor = new SelectionAutoCompleteVisitor({
attributesMap,
functions,
nameBase,
line,
cursorIndex,
});
visitor.visit(tree);
const length = line.length;
if (cursorIndex <= length) {
visitorWithAutoComplete = visitor;
break;
}
start += length;
}
}
if (visitorWithAutoComplete) {
return {
list: visitorWithAutoComplete.list,
from: cm.posFromIndex(start + visitorWithAutoComplete.startReplacementIndex),
to: cm.posFromIndex(start + visitorWithAutoComplete.stopReplacementIndex),
};
}
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add tests for this error scenario

Comment on lines 40 to 73
// Initialize ANTLR input stream, lexer, and parser
const inputStream = CharStreams.fromString(substring);
const lexer = new SelectionAutoCompleteLexer(inputStream);
const tokenStream = new CommonTokenStream(lexer);
const parser = new SelectionAutoCompleteParser(tokenStream);

// Attach custom error listener
const errorListener = new CustomErrorListener();
parser.removeErrorListeners();
parser.addErrorListener(errorListener);

// Set the error handler to bail on error to prevent infinite loops
parser.errorHandler = new BailErrorStrategy();

let tree: ParseTree | null = null;
try {
// Parse using the 'expr' rule instead of 'start' to allow partial parsing
tree = parser.expr();
parseTrees.push({tree, line: substring});

// Advance currentPosition to the end of the parsed input
const lastToken = tokenStream.get(tokenStream.index - 1);
currentPosition += lastToken.stopIndex + 1;
} catch (e) {
// Parsing error occurred
const currentErrors = errorListener.getErrors();

if (currentErrors.length > 0) {
const error = currentErrors[0]!;
const errorCharPos = error.column;
const errorIndex = currentPosition + errorCharPos;

// Parse up to the error
const validInput = input.substring(currentPosition, errorIndex);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sure this is covered by tests


export const removeQuotesFromString = (value: string) => {
if (value.length > 1 && value[0] === '"' && value[value.length - 1] === '"') {
return value.slice(1, value.length - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value.slice(1, -1)?

Copy link
Collaborator

@bengotow bengotow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very cool 🙌 Nicely done! Cool that this can be attached to a CodeMirror parse tree visitor, I think I'd missed the implementation of that with the old AssetSelection one.

@salazarm salazarm merged commit c6928b7 into master Dec 10, 2024
1 of 2 checks passed
@salazarm salazarm deleted the salazarm/autocomplete branch December 10, 2024 20:15
pskinnerthyme pushed a commit to pskinnerthyme/dagster that referenced this pull request Dec 16, 2024
…#26273)

## Summary & Motivation

This PR introduces generalized autocomplete functionality that can be
re-used across grammars that follow the base structure of our asset
selection syntax grammar.

As an input the autocomplete factory takes the names of attributes,
their values, and the list of functions available in the grammar.

To implement auto-complete I opted to create a new grammar so that I can
allow certain things to still be valid inputs.
Benefits of separate grammar:

- `IncompleteOrExpression`, `IncompleteAndExpression`,
`IncompleteFunctionCall`, etc. allow our syntax highlighting and
autocomplete visitors to visit the full expression even if parts of it
are malformed.
- DownTraversal and UpTraversal are distinguished simplifying the
visitor logic
- Whitespace is differentiated between `afterExpressionWhitespace` and
`afterLogicalOperatorWhitespace`, etc. greatly simplifying the visitors
that provide suggestions in whitespace. (Instead of needing a visit
function for every possible expression we instead have the whitespace
tokens handle creating suggestions).
- `orToken`, `andToken`, `notToken` simplifies visitor logic, rather
having the various expressions that contain these tokens handle what to
do when the cursor is over them, we can instead have the token itself
handle what happens and reduce the amount of visitor logic.
- Separate tokens for `FunctionName` `AttributeName` `AttributeValue` -
also reduces visitor logic

Generally speaking, my strategy for the grammar was to make leaf nodes
that are shared among the various expression in order to allow those
leaf nodes to make suggestions and reduce the number of visit functions
required in the visitor (avoid combinatorial explosion of cases).

The reason for doing `this.cursorIndex - 1` in the `nodeIncludesCursor`
function is that I want the suggestions to be determined by the token
PRECEEDING the cursor, so I want the token immediately before the cursor
to return "true" for `nodeIncludesCursor`.

I can probably clean up the grammar a bit, but quite fatigued for this
PR. In a future PR I'll look to simplify the grammar even more but what
I have now passes all the tests I've written.

## How I Tested These Changes

Extensive jest testing and real testing.

<img width="937" alt="Screenshot 2024-12-04 at 11 20 58 AM"
src="https://github.com/user-attachments/assets/c29e702e-6048-40cb-ada6-04bb1a1a2a66">
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants