Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Invalid rule provided as rule RUF102 with --fix #17138

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

maxmynter
Copy link
Contributor

Closes #17084

Summary

This PR adds a new rule (RUF102) to detect and fix invalid rule codes in noqa comments.
Invalid rule codes in noqa directives serve no purpose and may indicate outdated code suppressions.

This extends the previous behaviour originating from crates/ruff_linter/src/noqa.rs which would only emit a warnigs.
With this rule a --fix is available.

The rule:

  1. Analyzes all noqa directives to identify invalid rule codes
  2. Provides autofix functionality to:
    • Remove the entire comment if all codes are invalid
    • Remove only the invalid codes when mixed with valid codes
  3. Preserves original comment formatting and whitespace where possible

Example cases:

  • # noqa: XYZ111 → Remove entire comment (keep empty line)
  • # noqa: XYZ222, XYZ333 → Remove entire comment (keep empty line)
  • # noqa: F401, INVALID123 → Keep only valid codes (# noqa: F401)

Test Plan

  • Added tests in crates/ruff_linter/resources/test/fixtures/ruff/RUF102.py covering different example cases.

Notes

  • This does not handle cases where parsing fails. E.g. # noqa: NON_EXISTENT, ANOTHER_INVALID causes a LexicalError and the diagnostic is not propagated and we cannot handle the diagnostic. I am also unsure what proper fix handling would be and making the user aware we don't understand the codes is probably the best bet.
  • The rule is added to the Preview rule group as it's a new addition

Questions

  • Should we remove the warnings, now that we have a rule?
  • Is the current fix behavior appropriate for all cases, particularly the handling of whitespace and line deletions?
  • I'm new to the codebase; let me know if there are rule utilities which could have used but didn't.

@ntBre ntBre added the rule Implementing or modifying a lint rule label Apr 2, 2025
Copy link
Contributor

github-actions bot commented Apr 2, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+2 -0 violations, +0 -0 fixes in 1 projects; 54 projects unchanged)

apache/airflow (+2 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ providers/edge/src/airflow/providers/edge/worker_api/auth.py:34:78: RUF102 [*] Invalid rule code in `# noqa`: TCH001
+ providers/edge/src/airflow/providers/edge/worker_api/datamodels.py:28:72: RUF102 [*] Invalid rule code in `# noqa`: TCH001

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
RUF102 2 2 0 0 0

@maxmynter
Copy link
Contributor Author

maxmynter commented Apr 2, 2025

Sorry for the CI issues, I thought that clippy runs with checking the pre-commit hooks.

I thought I generated the docs with cargo dev generate-all and they are in the comment in the rule file. Am I supposed to do that differently?
Edit: I missed a # and had the wrong header level in the docstring. It should work now.

@MichaReiser MichaReiser added the preview Related to preview mode features label Apr 2, 2025
};

let mut invalid_code_refs = vec![];
let mut valid_codes = vec![];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having to always collect all codes only to be able to generate the fix isn't ideal for performance (this requires a fair amount of writes).

I'd rewrite this to only track if all codes are invalid and then re-iterate over the noqa codes when generating the fix and skip over the known invalid codes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adressed in 52abe14

);

let original_text = locator.slice(line.range());
if let Some(comment_start) = original_text.find('#') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work with # test # noqa: invalid? Could we use line.directive.range (assuming directive is the Codes variant) here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the directive suggestion. It's much cleaner (adressed in f7014db).

Regarding multiple comments on a line it removes the invalid parts

  • # test # noqa: INVALID111# test
  • # test # noqa: INVALID111, VALID111# test # noqa: VALID111

I've added tests for this in 5cc8305.

I guess this is the desired behaviour. Having stale comments isn't nice, but auto removing ones that should stay is worse. What do you think.

Note: I'm writing it with uppercase ASCII plus a number because otherwise it's not parsed as a rule code; in this case the comment remains untouched.


for code in directive.iter() {
let code_str = code.as_str();
if Rule::from_code(code.as_str()).is_err() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you test that this rule respects https://docs.astral.sh/ruff/settings/#lint_external

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't. I added a fix (3f50765) and test (e734d3d).

Comment on lines 145 to 146
let prefix = &original_text[..prefix_end];
let codes_part = &original_text[prefix_end..];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think there's a split_at(prefix_end) method that gives you both the before and after parts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adressed in 2be49a0

format!(
"{}{}{}",
prefix,
&codes_part[..whitespace_len],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will panic if there is any non-ASCII whitespace because you pass the character count instead of the length in bytes. You'll have to use find or similar to find the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adressed in e3f4e07

@maxmynter maxmynter requested a review from MichaReiser April 2, 2025 21:03
@@ -314,6 +315,20 @@ mod tests {
Ok(())
}

#[test]
fn ruf102() -> Result<()> {
Copy link
Member

@MichaReiser MichaReiser Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIt:

Suggested change
fn ruf102() -> Result<()> {
fn invalid_rule_code_external_rules() -> Result<()> {

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for incorporating external. This mostly looks good. The main thing we've to figure out is how to handle noqa comments where more than one code is invalid. Maybe @dylwil3 has an opinion on this?

/// ```python
/// import os # noqa: E402
/// ```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

/// ```python
/// import os # noqa: E402
/// ```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to link to the external setting similar to unused-noqa: https://docs.astral.sh/ruff/rules/unused-noqa/

Comment on lines +64 to +65
if external.iter().any(|ext| code_str.starts_with(ext))
|| Rule::from_code(code.as_str()).is_ok()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if external.iter().any(|ext| code_str.starts_with(ext))
|| Rule::from_code(code.as_str()).is_ok()
if Rule::from_code(code.as_str()).is_ok() || external.iter().any(|ext| code_str.starts_with(ext))

This might be faster because it checks the common case first.

Comment on lines +64 to +65
if external.iter().any(|ext| code_str.starts_with(ext))
|| Rule::from_code(code.as_str()).is_ok()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if external.iter().any(|ext| code_str.starts_with(ext))
|| Rule::from_code(code.as_str()).is_ok()
if external.iter().any(|ext| code_str.starts_with(ext))
|| Rule::from_code(code_str)).is_ok()

Comment on lines +84 to +85
if external.iter().any(|ext| code_str.starts_with(ext))
|| Rule::from_code(code_str).is_ok()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe consider moving this to a small helper function and use it here and when checking above.

Comment on lines +99 to +102
let invalid_codes = directive
.iter()
.map(crate::noqa::Code::as_str)
.collect::<Vec<_>>()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we reuse the collected invalid_code_refs and pass them to this function instead of collecting them again?

}
}

fn handle_all_codes_invalid(diagnostics: &mut Vec<Diagnostic>, directive: &Codes<'_>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: To reduce the number of arguments:

Suggested change
fn handle_all_codes_invalid(diagnostics: &mut Vec<Diagnostic>, directive: &Codes<'_>) {
fn create_all_codes_invalid_diagnostic(directive: &Codes<'_>) -> Diagnostic {

Comment on lines +126 to +136
for invalid_code in invalid_codes {
let mut diagnostic = Diagnostic::new(
InvalidRuleCode {
rule_code: invalid_code.as_str().to_string(),
},
invalid_code.range(),
);

diagnostic.set_fix(fix.clone());
diagnostics.push(diagnostic);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And another case for multispan diagnostics...

I think I'd be surprised that applying the fix for a specific code replaces all invalid codes and not just the one where I positioned my cursor on.

That's why I think we should either:

  • Only create one diagnostic that fixes all invalid codes at once (multispan diagnostics would be great)
  • Create a diagnostic for each invalid code, the fix only removes that one code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Related to preview mode features rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement "Invalid rule code provided to # noqa" as rule
3 participants