Replace `match_def_path` and friends with `is_item` #7647

Jarcho · 2021-09-08T20:56:51Z

This is a pair of new util functions is_item and is_any_item. These will take anything that can be turned into a DefId (currently DefId, Res, (QPath,HirId), Expr, Pat, and Ty) and anything that can refer to a specific item (currently a def path, diagnostic item and lang item). This replaces the handful of functions we currently have to do certain combinations of these.
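As a rough illustration of the shape described above (not the PR's actual code), the idea can be modelled with two abstractions: something that may resolve to a `DefId`, and something that names one specific item. Everything in this sketch is a toy stand-in: `Ctx`, `MaybeDefId`, `ItemRef` and the lookup tables are assumptions made for the example and do not use rustc's real types.

```rust
// Toy model only: these types stand in for rustc's `DefId` and `Res` and for
// Clippy's item references. None of this is the PR's actual code.
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct DefId(pub u32);

/// Stand-in for a resolution result, which may or may not point at a definition.
pub enum Res {
    Def(DefId),
    Err,
}

/// Stand-in for the lint context: here it is just two lookup tables.
pub struct Ctx {
    pub def_paths: HashMap<DefId, Vec<&'static str>>,
    pub diagnostic_items: HashMap<&'static str, DefId>,
}

/// Anything that may resolve to a `DefId` (hypothetical trait).
pub trait MaybeDefId {
    fn opt_def_id(&self) -> Option<DefId>;
}

impl MaybeDefId for DefId {
    fn opt_def_id(&self) -> Option<DefId> {
        Some(*self)
    }
}

impl MaybeDefId for Res {
    fn opt_def_id(&self) -> Option<DefId> {
        match self {
            Res::Def(id) => Some(*id),
            Res::Err => None,
        }
    }
}

/// Anything that names one specific item (hypothetical enum).
pub enum ItemRef {
    Path(&'static [&'static str]),
    Diagnostic(&'static str),
}

/// One entry point for every combination of "thing to check" and "item to look for".
pub fn is_item(cx: &Ctx, target: &impl MaybeDefId, item: &ItemRef) -> bool {
    let id = match target.opt_def_id() {
        Some(id) => id,
        None => return false,
    };
    match item {
        ItemRef::Path(path) => cx
            .def_paths
            .get(&id)
            .map_or(false, |p| p.as_slice() == *path),
        ItemRef::Diagnostic(name) => cx.diagnostic_items.get(name) == Some(&id),
    }
}
```

The point of the design is that adding a new kind of target or a new kind of item reference means adding one impl or one enum variant, rather than one more function per combination.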

The following functions are then removed:

is_qpath_def_path
is_expr_path_def_path
is_expr_diagnostic_item
match_any_def_paths
match_any_diagnostic_items
match_def_path
is_type_diagnostic_item
is_type_lang_item
match_type

The internal lint match_type_on_diagnostic_item has been changed to work on the new function. It will also check consts and statics from external crates and check for lang items.

changelog: none

rust-highfive · 2021-09-08T20:56:54Z

r? @giraffate

(rust-highfive has picked a reviewer for you, use r? to override)

flip1995 · 2021-09-09T09:53:47Z

This is a really big refactor, and one that will bitrot quickly, so I don't want to put all of the burden of reviewing this on @giraffate. So first, @rust-lang/clippy, what are your thoughts on this? And then, if you have time, please pick a part of the PR and help review it.

Jarcho · 2021-09-09T13:23:21Z

I'll split this up into two PRs. One with just the switchover to is_item and another with all the refactoring around it. Should make it easier to review.

Jarcho · 2021-09-09T16:47:10Z

Pulled out the refactorings. Only thing left is the switch to is_item and is_any_item which were all automated minus the manual fixups to the imports. Should be much easier to review now.

camsteffen · 2021-09-10T16:13:11Z

We definitely need to simplify our approach to item lookups. I've also toyed with creating utils like this. I'm a little unsure that this is the right direction though. Lately I've been thinking that we should simply move towards diagnostic items.

In particular, I would really like to have a diagnostic item "reverse lookup" added to rustc:

match cx.tcx.get_diagnostic_name(def_id)? {
    sym::vec_type => ..,
    _ => ..,
}

If we do go with this approach, I would prefer that is_any_item returns bool instead of Option<usize>. To me returning an index feels un-Rusty, like C-style for loops.

Jarcho · 2021-09-10T16:41:56Z

That would definitely work better when checking for multiple diagnostic items.

There's still a need to simplify getting the DefId in the first place. Things like this happen all the time:

if_chain! {
    if let ExprKind::Call(fn_expr, args) = e.kind;
    if let ExprKind::Path(ref p) = fn_expr.kind;
    if let Some(def_id) = cx.qpath_res(p, fn_expr.hir_id).opt_def_id();
    // `_` stands for the symbol of whichever diagnostic item is being checked
    if cx.tcx.is_diagnostic_item(_, def_id);
    then {
        // do stuff with args
    }
}

Lang items can also use a simplified check as well. cx.tcx.lang_items().item().map_or(false, |lang_id| lang_id == def_id) is a little wordy. Paths will also still need to be handled for external crates (e.g. itertools and regex) as well.
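To make the "wordy" point concrete, here is a minimal sketch of the same comparison on plain values. `DefId` is a toy stand-in, and the `Option<DefId>` argument plays the role of what a lang item lookup returns; both function names are made up for the example.

```rust
// Toy stand-in for rustc's `DefId`; not the real type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DefId(pub u32);

/// The form quoted above: map over the optional lang item and compare inside.
pub fn is_lang_item_wordy(lang_item: Option<DefId>, def_id: DefId) -> bool {
    lang_item.map_or(false, |lang_id| lang_id == def_id)
}

/// Equivalent, shorter form: compare against `Some(def_id)` directly.
pub fn is_lang_item_short(lang_item: Option<DefId>, def_id: DefId) -> bool {
    lang_item == Some(def_id)
}
```

Either form is a natural candidate for wrapping in a single helper so that call sites do not repeat it.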

> If we do go with this approach, I would prefer that is_any_item returns bool instead of Option<usize>. To me returning an index feels un-Rusty, like C-style for loops.

The name could be better (it's just wrong). There are quite a few cases where knowing which one matched is necessary though.

Edit:
Just a note about reversing the diagnostic item lookup (before I forget about this). They're also used to lookup the DefId for a specific trait (Display, Ord, DoubleEndedIterator and RangeBounds are used in clippy). Looks like this is always done to check if a type implements a specific trait. This could still be done if diagnostic item lookup is flipped, but it would require looking up the diagnostic name of every trait the type implements.
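The trade-off between the two lookup directions can be sketched with plain hash maps. This is a toy model under stated assumptions: `DiagnosticItems`, `get_id`, `get_name` and the `implements_*` helpers are hypothetical names for the example, not rustc or Clippy APIs.

```rust
use std::collections::HashMap;

// Toy stand-in for rustc's `DefId`; not the real type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct DefId(pub u32);

/// Hypothetical table holding both lookup directions.
pub struct DiagnosticItems {
    name_to_id: HashMap<&'static str, DefId>,
    id_to_name: HashMap<DefId, &'static str>,
}

impl DiagnosticItems {
    pub fn new(items: &[(&'static str, DefId)]) -> Self {
        Self {
            name_to_id: items.iter().copied().collect(),
            id_to_name: items.iter().map(|&(name, id)| (id, name)).collect(),
        }
    }

    /// Forward lookup: diagnostic name to `DefId`.
    pub fn get_id(&self, name: &str) -> Option<DefId> {
        self.name_to_id.get(name).copied()
    }

    /// Reverse lookup: `DefId` to diagnostic name.
    pub fn get_name(&self, id: DefId) -> Option<&'static str> {
        self.id_to_name.get(&id).copied()
    }
}

/// Forward direction: resolve the trait once, then test membership.
pub fn implements_forward(items: &DiagnosticItems, implemented: &[DefId], trait_name: &str) -> bool {
    items
        .get_id(trait_name)
        .map_or(false, |id| implemented.contains(&id))
}

/// Reverse direction only: every implemented trait has to be named first.
pub fn implements_reverse(items: &DiagnosticItems, implemented: &[DefId], trait_name: &str) -> bool {
    implemented
        .iter()
        .any(|&id| items.get_name(id) == Some(trait_name))
}

/// Where the reverse direction helps: one lookup, then a `match` on the name.
pub fn classify(items: &DiagnosticItems, id: DefId) -> &'static str {
    match items.get_name(id) {
        Some("Vec") => "vec",
        Some("Option") => "option",
        _ => "other",
    }
}
```

In this model the reverse lookup suits "which of several items is this?" questions, while the forward lookup stays cheaper for "does this type implement trait X?".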

bors · 2021-09-11T08:43:54Z

☔ The latest upstream changes (presumably #7663) made this pull request unmergeable. Please resolve the merge conflicts.

camsteffen · 2021-09-12T06:43:45Z

Here is a util idea I have

fn is_path_to_item(cx, impl MaybePath, impl ItemRef);

impl MaybePath is Expr | Pat | QPath. This has a smaller surface area than is_item, but it's more expressive. I would like preserving the "path to" concept in code.

This could be complemented with is_item_id(cx, impl ItemRef, DefId).

> There are quite a few cases where knowing which one matched is necessary though.

You could split into multiple invocations of is_item/is_any_item. Using indices is harder to read and more error prone IMO.
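The alternatives under discussion can be compared side by side on toy data. This is only a sketch: `DefId` is a stand-in type, and `match_index`, `matches_any`, `classify`, `VEC` and `HASH_MAP` are hypothetical names, not Clippy utils.

```rust
// Toy stand-in for rustc's `DefId`; not the real type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DefId(pub u32);

// Hypothetical ids for two well-known items.
pub const VEC: DefId = DefId(10);
pub const HASH_MAP: DefId = DefId(11);

/// Index-returning style: the caller must remember what each index means.
pub fn match_index(id: DefId, candidates: &[DefId]) -> Option<usize> {
    candidates.iter().position(|&c| c == id)
}

/// Bool-returning style, built directly on `Iterator::any`.
pub fn matches_any(id: DefId, candidates: &[DefId]) -> bool {
    candidates.iter().any(|&c| c == id)
}

#[derive(Debug, PartialEq, Eq)]
pub enum Collection {
    Vec,
    HashMap,
}

/// Split invocations: one check per item, each naming its own result.
pub fn classify(id: DefId) -> Option<Collection> {
    if id == VEC {
        Some(Collection::Vec)
    } else if id == HASH_MAP {
        Some(Collection::HashMap)
    } else {
        None
    }
}
```

With the index style, reordering the candidate slice silently changes what `Some(0)` means; the split version ties each result to the check that produced it.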

Jarcho · 2021-09-12T12:35:42Z

That still leaves Res and Ty out, but they could be fit in pretty easily.

My issue with splitting the functions is justifying why they're two separate functions. From the perspective of the person calling the function, what problem are you solving by having two functions as opposed to one? I can kind of do that for expressions and patterns just by being clear that they only work on paths rather than, for example, method calls. (Supporting method calls would make the function useless, as it couldn't disambiguate between a path to a method and a call to a method, but I could see someone using it to check a method call node without really thinking.) At this point the cost is the same either way: either you remember that paths use a different function (for is_path_to_item), or you remember that the function only works on paths for expressions and patterns (for is_item).

If we do go with splitting the functions, please don't have different argument orders. Either one works, but differing orders are just pointless friction.

camsteffen · 2021-09-12T23:55:34Z

> That still leaves Res and Ty out, but they could be fit in pretty easily.

True. I think Res is mostly an intermediate step for the other cases so it would be three widely used functions. Potentially an "any" variant for each one, but I'd also be fine with just using Iterator::any.

I'm torn on is_item. I do like the simplicity on one hand. But on the other hand it seems too ambiguously defined. It doesn't support ExprKind::MethodCall, but it could, and it's not obvious to me where that line should be drawn.

> If we do go with splitting the functions, please don't have different argument orders. Either one works, but differing orders are just pointless friction.

Agreed. I would apply the "yoda condition" rule which would put ItemRef after the thing being checked.

Jarcho · 2021-09-15T03:30:59Z

> It doesn't support ExprKind::MethodCall, but it could, and it's not obvious to me where that line should be drawn.

There is a line here. Checking for a UFCS call to Option::unwrap would look like this:

// `fn` is a keyword, so the callee is bound to `func` instead
if let ExprKind::Call(func, _) = e.kind {
    is_item(cx, func, &OPTION_UNWRAP)
} else {
    false
}

If we allow method calls this could match either Option::unwrap(foo) or foo.unwrap()() which are two very different things. So there are a small number of cases where it doesn't give a useful result. I would say knowing this would make the line clear, but the reason isn't obvious.
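The two expressions being contrasted are both valid Rust and do different things, which a small standalone example can show. The function names here are made up for the illustration.

```rust
/// A plain function used as the payload of the `Option`.
pub fn seven() -> i32 {
    7
}

/// UFCS call: the callee is the path `Option::unwrap`, and `foo` is its argument.
/// The result is the unwrapped function pointer, which has not been called yet.
pub fn ufcs_call(foo: Option<fn() -> i32>) -> fn() -> i32 {
    Option::unwrap(foo)
}

/// Call whose callee is itself a method call: `foo.unwrap()` produces a
/// function pointer, and the trailing `()` then calls it.
pub fn call_of_method_call(foo: Option<fn() -> i32>) -> i32 {
    foo.unwrap()()
}
```

In the first function the outer call node's callee is a path; in the second it is a method call expression. A check that looked through method calls would treat both outer calls as "a call to `Option::unwrap`", even though only the first one is.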

I'd be in favour of a good abstraction around method and function calls. There is match_function_call, but I would like something that matches the arguments before checking the DefId.

camsteffen · 2021-09-18T02:27:27Z

You make a compelling case for excluding ExprKind::MethodCall. It's unfortunate that the reason doesn't stem from the meaning of is_item or MaybeDefId.

I think is_item would work fine in practice. My preference still would be to have more tightly defined abstractions.

Jarcho · 2021-09-18T03:05:16Z

I can't really think of a single name that gets across subtleties like that.

Just some other things to consider:

Pat can work on PatKind::TupleStruct and PatKind::Struct as well. There's no ambiguity here as they both contain a path directly.
Expr can work with ExprKind::Struct for the same reason as above.
Various other types like AdtDef and VariantDef could also be fit in here. Not really a big deal as they have a def_id field.

camsteffen · 2021-10-05T17:08:01Z

Discussed this in the Clippy meeting today. We agreed on the following.

We'd like to move towards just using diagnostic items.
is_item feels like too much "magic" or over-abstraction
There may be opportunity to introduce new utils in this area, but before doing that, we should migrate more paths to diagnostic items and adopt get_diagnostic_name which is now landing soon.

Even if we don't adopt is_item, this PR is a valuable exploration of what is possible with Clippy utils. 👍

Jarcho · 2021-10-06T23:51:03Z

I'll pull out the fixes for the internal lint into a different PR since those will be needed anyways. Should I open an issue summarizing things and possible steps forward?

> We'd like to move towards just using diagnostic items.

I don't think this is possible for external crates (currently we use regex and itertools). Even if we could get external crates tagged with diagnostic items old versions still wouldn't have them.

> There may be opportunity to introduce new utils in this area, but before doing that, we should migrate more paths to diagnostic items and adopt get_diagnostic_name which is now landing soon.

Is that replacing diagnostic_items or in addition to it?

camsteffen · 2021-10-07T20:24:43Z

> Should I open an issue summarizing things and possible steps forward?

Go for it! For reference we have #5393 for migrating to diagnostic items and I just opened #7784 for get_diagnostic_name.

> > We'd like to move towards just using diagnostic items.
>
> I don't think this is possible for external crates (currently we use regex and itertools). Even if we could get external crates tagged with diagnostic items old versions still wouldn't have them.

True, there will be some exceptions. We can just implement those cases more "naively" instead of having utils. And/or we can separate path utils into a separate module to declutter.

> Is that replacing diagnostic_items or in addition to it?

In addition - for cases where multiple queries can be replaced with one query.

…ogiq

Fix and improve `match_type_on_diagnostic_item`

This extracts the fix for the lint out of #7647. There are still a couple of other functions to check, but at least this will get the lint working again.

The two added util functions (`is_diagnostic_item` and `is_lang_item`) are needed to handle the `DefId` of unit and tuple struct/variant constructors. The `rustc_diagnostic_item` and `lang` attributes are attached to the struct/variant `DefId`, but most of the time these items are used through their constructors, which have a different `DefId`. The two utility functions check whether the `DefId` is for a constructor and, if so, switch to the associated struct/variant `DefId`.

There does seem to be a bug on rustc's side where constructor `DefId`s from external crates return `DefKind::Variant` instead of `DefKind::Ctor()`. A workaround is in place for now.

changelog: None
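The constructor handling described above can be pictured with a toy model. All names here (`Def`, `DefKind`, `normalize_ctor`, `is_tagged_item`) are assumptions made for the sketch; the real utils work on rustc's `DefId` and `DefKind`, and this model leaves out the external-crate workaround.

```rust
// Toy stand-ins; these are not rustc's `DefId` or `DefKind`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DefId(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DefKind {
    Struct,
    Variant,
    Ctor,
}

/// A definition with its kind and, for constructors, the struct/variant it builds.
pub struct Def {
    pub id: DefId,
    pub kind: DefKind,
    pub parent: Option<DefId>,
}

/// If `def` is a constructor, switch to the struct/variant it belongs to;
/// otherwise keep its own id.
pub fn normalize_ctor(def: &Def) -> DefId {
    match (def.kind, def.parent) {
        (DefKind::Ctor, Some(parent)) => parent,
        _ => def.id,
    }
}

/// The attribute lives on the struct/variant id, so compare after normalizing.
pub fn is_tagged_item(def: &Def, tagged: DefId) -> bool {
    normalize_ctor(def) == tagged
}
```

Without the normalization step, a use through the constructor would compare the constructor's own id against the tagged struct/variant id and never match.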

rust-highfive assigned giraffate Sep 8, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Sep 8, 2021

Jarcho force-pushed the is_item branch 2 times, most recently from 3fc960d to d4cc838 Compare September 8, 2021 21:07

Add is_item and is_any_item util functions

2cea009

Jarcho force-pushed the is_item branch from d4cc838 to 1384c14 Compare September 9, 2021 14:35

Remove functions replaced by is_item and is_any_item

f3bfd20

* `is_qpath_def_path`
* `is_expr_path_def_path`
* `is_expr_diagnostic_item`
* `match_any_def_paths`
* `match_any_diagnostic_items`
* `match_def_path`
* `is_type_diagnostic_item`
* `is_type_lang_item`
* `match_type`

Jarcho force-pushed the is_item branch 4 times, most recently from 2f5b2ed to 6b64159 Compare September 9, 2021 16:01

Fix internal lint checking match_type uses

e122bf4

* Check for `is_item` instead
* Check consts and statics from external crates
* Check for lang items
* Check for inherent functions which have the same name as a field

Jarcho force-pushed the is_item branch from 6b64159 to e122bf4 Compare September 9, 2021 16:31

Jarcho mentioned this pull request Nov 12, 2021

Fix and improve match_type_on_diagnostic_item #7962

Merged

Jarcho closed this Jul 19, 2022
