[red-knot] partially migrate to new `Diagnostic` types, delete old diagnostic code #17130

BurntSushi · 2025-04-01T18:59:24Z

This "surgically" replaces Red Knot's use of "old" diagnostic types
with the new Diagnostic type. It is surgical because it doesn't
change how Red Knot type checking actually creates diagnostics.
That is, the InferContext APIs to report diagnostics all remain
the same. What this PR does is change out the implementation of the
InferContext APIs to use the new Diagnostic types. This lays
the ground work for refactoring InferContext to expose the new
Diagnostic data model.

(I had originally planned for this refactor to include moving Red Knot
over to the new Diagnostic data model as well, but this got rather
big on its own. So I thought it would be better to split the work up.)

As discussed with @MichaReiser, this PR also removes the "fake" linear
typing for Diagnostic. That is, you'll no longer get a panic if you
drop a Diagnostic without printing it. It is thought that Salsa
specifically can make this akward to deal with, and there may be cases
in the future where diagnostics get dropped without us having an
opportunity to explicitly ignore them. In any case, while it is perhaps
still a good thing to have, it is not well motivated. Notably, Ruff
doesn't seem to have a problem of accidentally ignoring diagnostics. If
we do end up seeing this become a problem, we can think more carefully
about how to re-introduce the linear typing fakery.

Ref #16809

github-actions · 2025-04-01T19:01:20Z

`mypy_primer` results

No ecosystem changes detected ✅

github-actions · 2025-04-01T19:14:27Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

MichaReiser

Nice, thank you

MichaReiser · 2025-04-02T06:52:57Z

crates/red_knot_project/src/metadata/options.rs

-
-    fn severity(&self) -> Severity {
-        self.severity
+    pub(crate) fn to_diagnostic(&self) -> Diagnostic {


Should this be into_diagnostic to avoid cloning self or is it to_diagnostic because most salsa queries only return refs?

The Diagnostic API doesn't really support accepting ownership of things, so you have to clone anyway. For example:

/// Attach a message to this annotation. /// /// An annotation without a message will still have a presence in /// rendering. In particular, it will highlight the span association with /// this annotation in some way. /// /// When a message is attached to an annotation, then it will be associated /// with the highlighted span in some way during rendering. pub fn message<'a>(self, message: impl std::fmt::Display + 'a) -> Annotation { let message = Some(message.to_string().into_boxed_str()); Annotation { message, ..self } }

We may want to add alternative APIs to Diagnostic (and its constituent types) that accept ownership to avoid clones, but this isn't doing any more work AFAIK than the status quo:

ruff/crates/red_knot_python_semantic/src/types/context.rs

Lines 153 to 160 in a15404a

self.diagnostics.borrow_mut().push(TypeCheckDiagnostic {

file: self.file,

id,

message: message.to_string(),

range: ranged.range(),

severity,

secondary_messages,

});

I know we've been talking about an IntoDiagnostic trait, but it might actually make more sense for it to be a ToDiagnostic trait. Or, alternatively, an IntoDiagnostic trait if we decide to add ownership consuming construction APIs to Diagnostic. I think I'd prefer that decision to be lead by profiling/benchmarking. (I guess this could matter in cases where Red Knot reports an avalanche of diagnostics. But even then, I'm skeptical that these sorts of copies are going to be anything but marginal compared to whatever other work Red Knot is doing in cases like that. But I'm mostly just guessing.)

MichaReiser · 2025-04-02T06:56:14Z

crates/red_knot_python_semantic/src/types/context.rs

+        let config = ruff_db::diagnostic::DisplayDiagnosticConfig::default();
+        for d in &*self.diagnostics.borrow_mut() {
+            d.print(self.db.upcast(), &config, vec![]).unwrap();
+        }
+


I don't think this is necessary anymore, now that we removed the "drop bomb" from diagnostics.

Oops! I thought I caught everything. This was actually scaffolding code I was using to aide the refactor. Nice catch. Removed.

MichaReiser · 2025-04-02T07:04:39Z

crates/red_knot_python_semantic/src/types/diagnostic.rs

-    pub(crate) fn push(&mut self, diagnostic: TypeCheckDiagnostic) {
-        self.diagnostics.push(Arc::new(diagnostic));
+    pub(crate) fn push(&mut self, diagnostic: Diagnostic) {
+        self.diagnostics.push(diagnostic);
    }

    pub(super) fn extend(&mut self, other: &TypeCheckDiagnostics) {


I'm not sure what the solution for this is but it may still be desired to wrap the Diagnostics in self.diagnostics in an Arc because we now create multiple deep-clones of every diagnostic, which I suspect can be fairly expensive (diagnostics aren't lightweight structs).

The diagnostics in Red Knot are collected as part of type inference and type inference is done at three different levels:

Per scope (calls into expressions and definition inference)

Per definition (class, function, variable), calls into expression inference

Per expression (only some expressions)

and 2. both have to "copy over" the diagnostics from the inner inferences operations. That means, a diagnostic created when infering an expression may be copied twice: Once to aggregate the diagnostics at the definition level and once when aggregating the diagnostics at a scope level.

We later perform one last copy (clone) inside check_types

This is why the old implementation used an Arc to make those clones as cheap as possible. Alternatives are to use salsa::interned (but that comes with extra bookkeeping but it has the advantage that it's arena allocated) or salsa accumulated (which now avoids cloning internally and only requires cloning at the boundaries)

Unless we say that cloning Diagnostics isn't a performance concern relative to the cost of generating and printing them

Now that I'm through with my review: I see that you made Diagnostic use an Arc which addresses this concern. I liked that the "old" new diagnostic was salsa agnositc in the sense that it only used the Arc where necessary but I think this also makes sense.

It does have the downside that Ruff's diagnostics get Arced where this isn't technically necessary. It will be interesting to see how this impacts performance.

Yeah I like sticking with Arc for now. I do this sort of thing all the time in my Rust libraries. It generally works very well.

You are of course correct that using Arc for these sorts of things isn't free. But I try to look at what the marginal cost of the Arc is. You're only getting a new Arc when you create a diagnostic, and creating a diagnostic is already likely going to require allocs and formatting machinery and all that stuff.

I will be very happy to eat crow and optimize if we can find use cases for which getting rid of these allocs helps perf. But I'll note that, at some point, you do have to actually create strings somewhere in order to create a diagnostic, even if it had ownership accepting APIs. e.g., Diagnostic::new(blah, blah, format_args!("some message about {foo}")) has to do a to_string() on that std::fmt::Arguments somewhere. And unless we want a lifetime parameter infecting the Diagnostic types (which I would guess would be unworkable), it probably makes the most sense to do it at construction time.

I think even in alternative designs (e.g., a Diagnostic trait), there have to be clones and formatting happening somewhere, although precisely where might get shifted around a bit.

crates/red_knot_python_semantic/src/types/infer.rs

MichaReiser · 2025-04-02T07:12:26Z

crates/red_knot_test/src/lib.rs

 ) -> String {
    let display_config = DisplayDiagnosticConfig::default().color(false);

-    let mut snapshot = String::new();
+    let mut snapshot: Vec<u8> = Vec::new();


I believe you mentioned in one of your reviews that it was a very intentional decision to implement print over std::io::Write. I don't remember the details but I noticed that it results in many places where it's necessary to use Vec<u8> in combination with a String::from_utf8().unwrap which is somewhat unfortunate.

I don't really have a solution for this, it's just something I noticed (e.g. also in my tests).

Yeah... right... good point. The reason was related to the linear type fakery. Namely, using std::io::Write as a destination makes it "more likely" that printing is actually going to stdout or similar, and not just some string in memory that could get accidentally ignored.

I don't think I have any strong feelings about reverting this back to just a Display impl or some such (the internal renderer already works that way, so it's an overall small change). One note though is that the unwrap() calls you're referencing I believe only occur in tests.

I'm leaning towards offering a Display impl. But might do that in a follow-up PR.

One note though is that the unwrap() calls you're referencing I believe only occur in tests.

This was wrong! See: #17130 (comment)

MichaReiser · 2025-04-02T07:14:44Z

crates/red_knot_test/Cargo.toml

@@ -13,7 +13,7 @@ license.workspace = true
 [dependencies]
 red_knot_python_semantic = { workspace = true, features = ["serde"] }
 red_knot_vendored = { workspace = true }
-ruff_db = { workspace = true, features = ["testing"] }
+ruff_db = { workspace = true, features = ["os", "testing"] }


It's not clear to me what changed that the os feature is now required or is this a feature we already missed declaring as dependency before?

It was missed before. It came up when I was trying to run tests via -p red_knot_test I believe. Sorry, I meant to call this out in its own commit, but it slipped through.

crates/red_knot_wasm/src/lib.rs

MichaReiser · 2025-04-02T07:17:27Z

crates/red_knot_wasm/src/lib.rs

+            .borrow()
+            .print(workspace.db.upcast(), &config, &mut buf)
+            .unwrap();
+        String::from_utf8(buf).unwrap().into()


You could also change the return type to Result without breaking the JS Api if you want to avoid the unwrap here

Ah, and this is a non-test instance of the unwrap() that I previously claimed was only in tests. Nice.

OK, that convinces me to switch back to a Display impl, kinda like how the old diagnostics work.

But I'll do that in a follow-up PR if that's okay with you. (Which will fix this too.)

crates/ruff_db/src/diagnostic/mod.rs

MichaReiser · 2025-04-02T07:19:34Z

crates/ruff_db/src/diagnostic/mod.rs

+/// at time of writing, `ruff_db` depends on `ruff_python_parser` instead of
+/// the other way around. And since we want to do this conversion in a couple
+/// places, it makes sense to centralize it _somewhere_. So it's here for now.
+pub fn create_parse_diagnostic(file: File, err: &ruff_python_parser::ParseError) -> Diagnostic {


A use case for your IntoDiagnostic trait ;)

Hmmm. I don't think this on its own asks for generics. If it weren't for the dependency graph (which we briefly talked about fixing at the on-site), then this could "just" be a concrete method on ParseError and things would be fine.

I removed this to see how much code was depending internally on the `&[Arc<TypeCheckDiagnostic>]` representation. Thankfully, it was just one place. So I just removed the `Deref` impl in favor of adding an explicit `iter` method. In general, I think using `Deref` for things like this is _somewhat_ of an abuse. The tip-off is if there are `&self` or `&mut self` methods on the type, then it's probably not a good candidate for `Deref`.

I did this mostly because it wasn't buying us much, and I'm trying to simplify the public API of the types I'd like to refactor in order to make the refactor simpler. If we really want something like this, we can re-add it later.

…ages This is temporary to scaffold the refactor. The main idea is that we want to take the `InferContext` API, *as it is*, and migrate that to the new diagnostic data model *internally*. Then we can rip out the old stuff and iterate on the API.

The switch to `Arc` was done because Salsa sometimes requires cloning a `Diagnostic` (or something that contains a `Diagnostic`). And so it probably makes sense to make this cheap. Since `Diagnostic` exposes a mutable API, we adopt "clone on write" semantics. Although, it's more like, "clone on write when the `Arc` has more than one reference." In the common case of creating a `Diagnostic` and then immediately mutating it, no additional copies should be made over the status quo. We also drop the linear type fakery. Its interaction with Salsa is somewhat awkward, and it has been suggested that there will be points where diagnostics will be dropped unceremoniously without an opportunity to tag them as having been ignored. Moreover, this machinery was added out of "good sense" and isn't actually motivated by real world problems with accidentally ignoring diagnostics. So that makes it easier, I think, to just kick this out entirely instead of trying to find a way to make it work.

Now that we don't need to update the `printed` flag, this can just be an immutable borrow. (Arguably this should have been an immutable borrow even initially, but I didn't want to introduce interior mutability without a more compelling justification.)

This change just brings diagnostic rendering into parity with the status quo.

Previously, this was only available in the old renderer. To avoid regressions, we just copy it to the new renderer. We don't bother with DRY because the old renderer will be deleted very soon.

This replaces things like `TypeCheckDiagnostic` with the new Diagnostic` type. This is a "surgical" replacement where we retain the existing API of of diagnostic reporting such that _most_ of Red Knot doesn't need to be changed to support this update. But it will enable us to start using the new diagnostic renderer and to delete the old renderer. It also paves the path for exposing the new `Diagnostic` data model to the broader Red Knot codebase.

We do keep around `OldSecondaryDiagnosticMessage`, since that's part of the Red Knot `InferContext` API. But it's a rather simple type, and we'll be able to delete it entirely once `InferContext` exposes the new `Diagnostic` type directly. Since we aren't consuming `OldSecondaryDiagnosticMessage` any more, we can now accept a slice instead of a vec. (Thanks Clippy.)

I put this in its own commit in case all of the information removed here was controversial. But it *looks* stale to me. At the very least, `TypeCheckDiagnostic` no longer exists, so that would need to be fixed. And it doesn't really make sense to me (at this point) to make `Diagnostic` a Salsa struct, particularly since we are keen on using it in Ruff (at some point).

This just adds an extra blank line. I think these tests were written against the new renderer before it was used by Red Knot's `main` function. Once I did that, I saw that it was missing a blank line, and so I added it to match the status quo. But that means these snapshots have become stale. So this commit updates them.

It was already using this approach internally, so this is "just" a matter of rejiggering the public API of `Diagnostic`. We were previously writing directly to a `std::io::Write` since it was thought that this worked better with the linear typing fakery. Namely, it increased confidence that the diagnostic rendering was actually written somewhere useful, instead of just being converted to a string that could potentially get lost. For reasons discussed in #17130, the linear type fakery was removed. And so there is less of a reason to require a `std::io::Write` implementation for diagnostic rendering. Indeed, this would sometimes result in `unwrap()` calls when one wants to convert to a `String`.

BurntSushi requested review from carljm, AlexWaygood, sharkdp, dcreager and MichaReiser as code owners April 1, 2025 18:59

BurntSushi added red-knot Multi-file analysis & type inference diagnostics Related to reporting of diagnostics. labels Apr 1, 2025

MichaReiser approved these changes Apr 2, 2025

View reviewed changes

AlexWaygood removed their request for review April 2, 2025 13:13

BurntSushi added 10 commits April 2, 2025 09:21

red_knot_python_semantic: remove WithDiagnostic trait

0896e42

I did this mostly because it wasn't buying us much, and I'm trying to simplify the public API of the types I'd like to refactor in order to make the refactor simpler. If we really want something like this, we can re-add it later.

ruff_db: tweak line terminators emitted by diagnostic rendering

6a8cee6

This change just brings diagnostic rendering into parity with the status quo.

ruff_db: port concise diagnostic rendering to new renderer

c0376c2

Previously, this was only available in the old renderer. To avoid regressions, we just copy it to the new renderer. We don't bother with DRY because the old renderer will be deleted very soon.

BurntSushi force-pushed the ag/red-knot-new-diagnostics branch from d248dc4 to 3db884d Compare April 2, 2025 13:44

BurntSushi merged commit 28c7e72 into main Apr 2, 2025
23 checks passed

BurntSushi deleted the ag/red-knot-new-diagnostics branch April 2, 2025 14:10

BurntSushi mentioned this pull request Apr 2, 2025

ruff_db: switch diagnostic rendering over to std::fmt::Display #17154

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[red-knot] partially migrate to new `Diagnostic` types, delete old diagnostic code #17130

[red-knot] partially migrate to new `Diagnostic` types, delete old diagnostic code #17130

BurntSushi commented Apr 1, 2025

github-actions bot commented Apr 1, 2025 •

edited

Loading

github-actions bot commented Apr 1, 2025 •

edited

Loading

MichaReiser left a comment

MichaReiser Apr 2, 2025

BurntSushi Apr 2, 2025 •

edited

Loading

MichaReiser Apr 2, 2025

BurntSushi Apr 2, 2025

MichaReiser Apr 2, 2025

MichaReiser Apr 2, 2025

BurntSushi Apr 2, 2025

MichaReiser Apr 2, 2025

BurntSushi Apr 2, 2025

BurntSushi Apr 2, 2025

BurntSushi Apr 2, 2025

BurntSushi Apr 2, 2025

MichaReiser Apr 2, 2025

BurntSushi Apr 2, 2025

MichaReiser Apr 2, 2025

BurntSushi Apr 2, 2025

MichaReiser Apr 2, 2025

MichaReiser Apr 2, 2025

BurntSushi Apr 2, 2025

	self.diagnostics.borrow_mut().push(TypeCheckDiagnostic {
	file: self.file,
	id,
	message: message.to_string(),
	range: ranged.range(),
	severity,
	secondary_messages,
	});

[red-knot] partially migrate to new Diagnostic types, delete old diagnostic code #17130

[red-knot] partially migrate to new Diagnostic types, delete old diagnostic code #17130

Conversation

BurntSushi commented Apr 1, 2025

github-actions bot commented Apr 1, 2025 • edited Loading

mypy_primer results

github-actions bot commented Apr 1, 2025 • edited Loading

ruff-ecosystem results

Linter (stable)

Linter (preview)

MichaReiser left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

BurntSushi Apr 2, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

[red-knot] partially migrate to new `Diagnostic` types, delete old diagnostic code #17130

[red-knot] partially migrate to new `Diagnostic` types, delete old diagnostic code #17130

github-actions bot commented Apr 1, 2025 •

edited

Loading

`mypy_primer` results

github-actions bot commented Apr 1, 2025 •

edited

Loading

`ruff-ecosystem` results

BurntSushi Apr 2, 2025 •

edited

Loading