-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] No panic for illegal self-referential f-string annotations #14629
base: main
Are you sure you want to change the base?
Conversation
|
c352470
to
4aba390
Compare
I definitely agree with this, if it's possible. But I haven't thought at all about how we might redesign our infrastructure to make this possible! |
crates/red_knot_python_semantic/resources/mdtest/literal/literal.md
Outdated
Show resolved
Hide resolved
@@ -27,13 +27,17 @@ static EXPECTED_DIAGNOSTICS: &[&str] = &[ | |||
// We don't support `*` imports yet: | |||
"error[unresolved-import] /src/tomllib/_parser.py:7:29 Module `collections.abc` has no member `Iterable`", | |||
// We don't support terminal statements in control flow yet: | |||
"error[annotation-with-invalid-expression] /src/tomllib/_parser.py:57:71 Invalid expression in type expression", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several type annotations on this line, and it's hard to tell which annotation this is referring to without counting the columns. Could we improve the error message so it specifies which variable or annotation the diagnostic is complaining about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including arbitrary expressions is something that I'm always worried about because it can lead to very long messages, but including the name might be an option or David can come up with something that avoids this.
What should help here is when we have rich diagnostics that also show a code frame
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I wrote that error message, I was looking for something like a ast::Expr::human_readable_name()
method that would return "lambda expression" for a Expr::Lambda
etc., but didn't find anything.
I'm happy to improve this error message though once we agree that this is even the right approach to begin with.
What should help here is when we have rich diagnostics that also show a code frame
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should help here is when we have rich diagnostics that also show a code frame
That should help for sure, but it's also useful to have important information in the summary line for when a user is using --output-format=concise
. I found --output-format=concise
to be very useful when filing PRs to upgrade projects to Ruff 0.8, because there were many new violations on some codebases, and it was very hard to scroll through them all when using the default verbose output.
Yeah, I'm not sure about this either. I think there are two reasons why we want to infer types for all sub-expressions:
I think inferring |
Yeah, I agree. I'm having a hard time thinking of how we could enforce this invariant in the type system. (At least without major re-architecting.) I think I am reaching the conclusion that we should just have a default fallback to I think this seems like the right answer to me in particular due to invalid syntax, because a) we can't reasonably expect to cover the full space of possible invalid syntax with any testing strategy, and b) I think, in general,
I'm not sure it isn't the best option for the second, too. How symbols in type expressions are interpreted is contextual, and if we've departed from a valid context, I think it's reasonable to say that we really don't know what the type of sub-expressions should be. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be more inclined to switch to a default fallback of Unknown at lookup time (instead of panic), rather than adding all this infrastructure to walk a tree and store an explicit Unknown for everything in it.
But I don't feel strongly about it, and won't object if this PR is merged as-is.
But remember that we won't just be using red-knot from the linter for its type-inference capabilities, but also for its more general "symbol-aware" capabilities. For example, take PYI062. This rule tries to detect duplicate members in a literal slice. When it's ported to red-knot, it will need to work largely off the AST, as it does now, because in our red-knot representation of a |
Yes, I understood all of that from your first comment. This is the assumption I'm not sure I agree with:
I'm not sure it's reasonable to expect that rule to fire inside an arbitrary invalid context, because the meaning of these expressions is contextual. For example, the first "argument" to |
Yes, I think it should! It's just as much an error to have duplicate elements in a >>> from typing import Literal
>>> Literal[1, 1]
typing.Literal[1] So even if the |
Yes, I see that it would be nice to correctly infer the type of a reference to something like I wish we weren't constrained by Salsa query granularity, so we didn't have to do the extra work of type inference on expressions that are useless for type checking, just because a lint rule or an LSP hover might (but probably in many cases never will) ask for it. |
Right. Though the consolation here is that invalid annotations should generally be pretty rare in well-written code, so the "extra cost" of inferring sub-expressions in these invalid annotations should be pretty minimal, I'd wager. |
Summary
I went down another rabbit hole trying to fix all sorts of panics related to illegal type expressions. Some highlights from the linter corpus:
This fixes the last two remaining entries in
KNOWN_FAILURES
except for those related to salsa-cycles. But I really don't know if the approach taken here — to inferType::Unknown
for sub-expressions of illegal expressions — is the right one. And even if it is, I'm not sure if my implementation of it is fully correct (see for example the exception for expressions with an inner scope). I thought I'd still open a PR so we can discuss the general problem.Side remark: After having fixed a few of those panics over the last weeks, my impression is that our approach to upholding the invariant "we infer and store a type for each and every expression, even if illegal in this context" is extremely brittle. The only way we enforce it right now is through the corpus tests. But there is an exponentially large number of branches in type inference (or alternatively: an exponentially large number of weird edge cases like the above), which makes me think that we might need another way to enforce this constraint, e.g. at the (Rust) type-system level instead of the test level.
Test Plan