can a callback return/append different tokens? #433
Comments
I ended up using a secondary enum instead of different token types.
Hi @spearphishing, could you share your solution with us please :-) ?
Instead of using an extra token type, I attach a secondary enum to the token:

    use logos::{Lexer, Logos};

    #[derive(Debug, PartialEq)]
    pub enum MultiLineTokenType {
        Valid,
        Broken,
    }

    #[derive(Logos, Debug, PartialEq)]
    #[logos(skip r"[ \t\n\f]+")]
    pub enum Token {
        // ...
        #[regex(r"--\[[=]*\[", |lex| parse_multi_line_token(lex, "--["))]
        LongBracketComment(MultiLineTokenType),
    }

    // this function is also used for parsing multi-line strings
    // multi-line strings behave the same way as comments, just with a different prefix
    fn parse_multi_line_token(lex: &mut Lexer<Token>, prefix: &str) -> MultiLineTokenType {
        // The matched slice looks like `--[==[`; drop the prefix to inspect the rest.
        let Some(opening) = lex.slice().strip_prefix(prefix) else {
            return MultiLineTokenType::Broken;
        };
        // Count the `=` signs up to the second `[`.
        let mut equals_count = 0;
        for ch in opening.chars() {
            match ch {
                '=' => equals_count += 1,
                '[' => break,
                _ => return MultiLineTokenType::Broken,
            }
        }
        let closing_delimiter = format!("]{}]", "=".repeat(equals_count));
        // Search by byte offset instead of bumping one byte at a time:
        // `bump` panics if it stops in the middle of a multi-byte UTF-8 character.
        let found = lex.remainder().find(closing_delimiter.as_str());
        match found {
            Some(pos) => {
                lex.bump(pos + closing_delimiter.len());
                MultiLineTokenType::Valid
            }
            None => {
                // Unterminated: consume the rest of the input.
                let rest = lex.remainder().len();
                lex.bump(rest);
                MultiLineTokenType::Broken
            }
        }
    }

Hopefully this can help others in the situation I was in. ❤️
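For anyone who wants to try the delimiter matching without pulling in logos, here is a minimal sketch using only the standard library. The names `Scan` and `scan_long_comment` are hypothetical and do not come from this thread, and instead of advancing a lexer the function just reports how many bytes the comment spans:

```rust
/// Outcome of scanning one Lua long-bracket comment (hypothetical type).
#[derive(Debug, PartialEq)]
enum Scan {
    /// Properly closed; `len` is the byte length of the whole comment.
    Valid { len: usize },
    /// Malformed opener or missing closing delimiter.
    Broken,
}

/// Scans `input`, which is expected to start with `--[`, zero or more `=`, then `[`.
fn scan_long_comment(input: &str) -> Scan {
    let Some(rest) = input.strip_prefix("--[") else {
        return Scan::Broken;
    };
    // Count the `=` signs between the two opening brackets.
    let level = rest.bytes().take_while(|&b| b == b'=').count();
    if rest.as_bytes().get(level) != Some(&b'[') {
        return Scan::Broken;
    }
    // The opener is pure ASCII, so this offset is a valid char boundary.
    let body_start = "--[".len() + level + 1;
    // The closing delimiter must carry the same number of `=` signs.
    let closing = format!("]{}]", "=".repeat(level));
    // `str::find` returns a byte offset on a char boundary, so non-ASCII
    // comment text is handled without stepping byte by byte.
    match input[body_start..].find(&closing) {
        Some(pos) => Scan::Valid { len: body_start + pos + closing.len() },
        None => Scan::Broken,
    }
}
```

Calling `scan_long_comment("--[[a]]")` yields `Scan::Valid { len: 7 }`, while an unterminated input such as `"--[[ never closed"` yields `Scan::Broken`. A `]]` inside a `--[==[` comment does not end it, because the number of `=` signs has to match.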
Thanks! Another example solution (though not exactly the same problem) is given in #432.
I am attempting to parse Lua type comments with the code below:
With this code, the below tokens are returned:
What I'm aiming for is to be able to return the BrokenComment token inside of the parse_long_bracket callback somehow. Is this possible?
Or maybe I'm doing something wrong here, as this is my first time doing any sort of lexical analysis.