Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Declarative macro_rules! derive macros #3698

Open
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

joshtriplett
Copy link
Member

@joshtriplett joshtriplett commented Sep 21, 2024

Many crates support deriving their traits with derive(Trait). Today, this
requires defining proc macros, in a separate crate, typically with several
additional dependencies adding substantial compilation time, and typically
guarded by a feature that users need to remember to enable.

However, many common cases of derives don't require any more power than an
ordinary macro_rules! macro. Supporting these common cases would allow many
crates to avoid defining proc macros, reduce dependencies and compilation time,
and provide these macros unconditionally without requiring the user to enable a
feature.

I've reviewed several existing proc-macro-based derives in the ecosystem, and
it appears that many would be able to use this feature to avoid needing proc
macros at all.

Rendered

@joshtriplett joshtriplett added the T-lang Relevant to the language team, which will review and decide on the RFC. label Sep 21, 2024
@joshtriplett
Copy link
Member Author

Nominated as a follow-up to recent lang discussions about this.

@joshtriplett joshtriplett added the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Sep 22, 2024
@liigo
Copy link
Contributor

liigo commented Sep 23, 2024

Since its call-side syntax is different from normal macro_rules! (i.e. macroname!()), the definition-side syntax should also variants e.g. macro_rules_derive! (v.s. #[proc_macro_derive(TraitName)]).

The same to #3697 where we could name it macro_rules_attribute! (v.s. #[proc_macro_attribute]).

@coolreader18
Copy link

coolreader18 commented Sep 26, 2024

One thing that would be really nice (i.e. greatly increase usability of decl macros) (though would not at all decrease the usefulness of this feature) is proper, full-featured parsing of meta fragments of decl macros. Right now, if you want to parse arguments to a decl macro in a MetaList form, you have to either enforce that the named arguments come in a specific order, or use a tt-muncher (and even then, it's not clear at first thought how you'd do it!). Obviously this is something that would introduce a whole new axis of complexity to decl macro declarations and evaluation, but being able to specify something like $[$(foo = $foo:expr)? $(bar($bar:expr))*],* and accept 0 or 1 foos and 0 or more bars in any order would be a game changer, and I can't see using attr or derive decl macros being very ergonomic for anything but the simplest cases without something like that. This RFC lets you define helper attributes for decl derive macros, but without any easy way of extracting them from out of the set of other attributes on the item, I don't see that being very useful for most people. An accessible and intuitive macro definition format tt-munchers are not.

(Shoulda brought this up at the hopes and dreams for the language session at Unconf 🙃)

@joshtriplett
Copy link
Member Author

@coolreader18 I would love to see many parsing improvements for macros, including something to address this kind of parsing difficulty. I don't think that's specific to derive or attribute macros, though it certainly makes them more useful.

@matthieu-m
Copy link

To define helper attributes, put an attributes key in the macro_derive attribute, with a comma-separated list of identifiers for helper attributes: #[macro_derive(attributes(helper))]. The derive macro can process the #[helper] attribute, along with any arguments to it, as part of the item the derive macro was applied to.

In order to avoid name collisions in helper attributes between different derives, I think it would be worth it to take a page out of serde here and require namespacing of those attributes.

For example, simple serde code:

#[derive(Debug, Deserialize)]
struct Message {
    #[serde(rename = "type")]
    type_: String,
    #[serde(default = "Message::default_payload")]
    payload: String,
}

Some names (like default) are likely to be particularly sought after, and mixing dependencies which clash on those helper names will be nightmarish.

Convention for the namespace could dictate that it be either a parent module/crate name, or the name of the trait, snake-cased.

@jplatte
Copy link
Contributor

jplatte commented Sep 26, 2024

I don't think that's a good idea. It's already the case that other libraries than serde look at serdes attributes, sometimes even in a context where the input type does not use serdes derives at the same time (but a proc-macro copies the serde attributes to some types it generates). As one example, the derive macro schemars::JsonSchema currently "registers" serde, schemars and validate, likely for good reason.

@ssokolow
Copy link

ssokolow commented Sep 26, 2024

I don't think that's a good idea. It's already the case that other libraries than serde look at serdes attributes, sometimes even in a context where the input type does not use serdes derives at the same time (but a proc-macro copies the serde attributes to some types it generates). As one example, the derive macro schemars::JsonSchema currently "registers" serde, schemars and validate, likely for good reason.

Perhaps some kind of pub annotation then? Yes, that would have made achieving the current state of Serde attribute cross-compatibility difficult, but it has been the Rust way to default to pushing back against Hyrum's law.

@epage
Copy link
Contributor

epage commented Sep 26, 2024

There has also been talk of "common" attributes, like skip.

For clap, namespacing by crate or derive name was insufficient and it now processes 4 different namespaces.

If we did encourage something by default, i think it should be derive name so there is a clear relationship.

On a simlar note of constraining users, imo derives should only produce a trait impl for the derive and considered proposing that be enforced but figured that deviating for what proc-macros provide would also be a downside. Also, I've seen with clap how it can be useful to include mostly-internal trait impls with the requested one.

@kpreid
Copy link
Contributor

kpreid commented Sep 26, 2024

On a simlar note of constraining users, imo derives should only produce a trait impl for the derive

Note that imposing such restriction would make it difficult to write derives that implement traits like IntoIterator, where each implementation typically introduces a new type that must be nameable by the impl, until associated type position impl Trait is available.

I was recently exploring this space while working on exhaust 0.2, which needs to generate two types and several impls along with the nominally derived impl, and my current strategy for being nice to the macro’s users is that the derive macro’s generated items are all inside a const _: () = { ... } block; this ensures that the macro’s expansion has no effect on the caller’s namespace, even though more items than just a trait impl are generated. One could imagine inserting that block wrapper automatically as part of the semantics of derive macros (whether declarative or procedural), but that should probably be its own design discussion separate from introducing a new kind of macro implementation.

@matthieu-m
Copy link

I don't think that's a good idea.

For clap, namespacing by crate or derive name was insufficient and it now processes 4 different namespaces.

I would note that I specifically mentioned "as a convention".

If anything, the fact that schemars also looks at serde namespace is a validation of the need for namespacing to me. Otherwise if there's a #[default = ...] it wouldn't even know whether that's for serialization/deserialization or some unrelated purpose.

I think enforcing namespacing and having a simple convention for simple usecases -- so folks can just follow along -- would work well. Perhaps a clippy lint which has to be explicitly #[expect] for the handful of usecases requiring more schemas.

There has also been talk of "common" attributes, like skip.

Common attributes are fine: a standardized meaning should not cause "mishaps".

@weiznich
Copy link
Contributor

weiznich commented Oct 4, 2024

The RFC aims to define a chunk of useful functionality that will be useful for some crates, without the expectation that every possible proc macro will immediately be a good idea to convert.

I'm honestly not sure if this will be really that useful for anyone without a lot of additional work. One advantage as the maintainer of an old crate like diesel is that I can look back in the past and see how things were back then in the "beginning". In this case: Diesel is older than stable proc-macro support, so I remember the ways how this was implemented back then before we had access to stable proc-macros. You can find the code here. Funnily the solution we used back then is very similar to what this RFC proposes, although it did not have the syntax sugar to be able to just write #[derive(MyMacro)]. There was the custom_derives crate back then, which essentially enabled the syntax proposed by this RFC, with the minor cavecat that you would need to put the struct definition into a macro call.
Niko pointed out that it really hard to parse all possible syntax variants of rust structs/types/enums in a declarative macro today. I fully agree with that. See the old code from simpler days for some estimation how complex that would be today with even more syntax variants. So having support from the compiler here seems the be necessary from my point of view. This becomes especially important if you think about what happens if the rust syntax is extended. The current approach would likely lead to quite a few such crates fail to accept the new syntax, as their macro parser cannot handle these new cases. That's not an issue with the syn based approach or if the compiler does the parsing and only invokes some sort of special macro syntax.

As another counter point diesel has a table! with relatively simple syntax that is fully under our control as crate authors. Even for that we decided to implement that parser as proc-macro, as the macro was quite complex and the error reporting was quite bad if something went wrong. (It essentially boiled down to: You macro is wrong, checkout the documentation for the correct syntax).

Now my main points are:

  • If this RFC is only for experimentation: It's already possible to do that on stable rust, you can just use the same approach as the old custom_derives approach
  • There is existing old code that essentially used this approach already. There are a lot of reasons why crates like diesel moved away from this approach. You likely find a few more examples for this by checking which other crates also used custom_derives

@joshtriplett
Copy link
Member Author

joshtriplett commented Oct 21, 2024

I've looked at the linked code, and I can understand that the full generality of the macro infrastructure that Diesel had is not something you want to have to maintain. From what I've seen in the ecosystem, many proc macro derives will not require that degree of complexity, and some will be able to switch over earlier than Diesel will want to. I've spelled out in the RFC the potential issue of pressure on crate maintainers, and included a stabilization-blocking requirement that we provide guidance for users and crate maintainers to attempt to avert that. There's a limit to how much we can do there, but we should do what we can.

Beyond that, there's a long history of potential macro improvements stalling out because scope creep or because "wait until macros 2.0", rather than taking place incrementally. There is no one-way door here, so we can ship features incrementally without blocking on having everything users may want.

@weiznich
Copy link
Contributor

Beyond that, there's a long history of potential macro improvements stalling out because scope creep or because "wait until macros 2.0", rather than taking place incrementally. There is no one-way door here, so we can ship features incrementally without blocking on having everything users may want.

I'm not sure I agree with that. Yes not shipping any features because they are not 100% perfect is bad, sure. But on the other hand: Shipping features and promising to "fix" more complex cases later on is also bad, because quite often that fixing did not happen anytime soon. (Examples: Proc-macro diagnostics, various async limitations (traits, drop, …)) That leaves crate maintainers in a situation where there might be a considerable pressure to use a new feature but it is not possible to use it in an good.

I personally would wish that the RFC either clears up that this feature is supposed to be used in special situations (and declares which ones) or that it includes a more general approach to handle at least something rather simple like #[derive(Default)] (without #[default]) or #[derive(Clone)]. We hopefully all agree that these derives are not complex in terms of code parsing is done there. I personally do not see a way to express without at least a considerable amount of parsing code expressed as macro rules. That code wouldn't be a problem if it would be maintained as part of the compiler, but it becomes a problem as soon as each crate includes it's own likely incomplete and broken version.

@joshtriplett
Copy link
Member Author

Before stabilizing this feature, we should receive feedback from crate
maintainers, and potentially make further improvements to macro_rules to make
it easier to use for their use cases. This feature will provide motivation to
evaluate many new use cases that previously weren't written using
macro_rules, and we should consider quality-of-life improvements to better
support those use cases.

@ijackson
Copy link

ISTM that this feature is very difficult to use because of (1) macro_rules's severe parsing limitations, as others have said (2) even if those were improved, the results would not be very ergonomic, since you'd still need some kind of matcher for the derive input.

These problems could be addressed by taking an approach like derive-deftly's: don't have the user write a matcher for the struct definition, and instead give them pre-canned bindings for the pieces of the input. In principle something like derive-deftly could be done in-language.

I think rust-lang's development effort could probably be better spent by trying to improve macro_rules's limitations (some of this has already been done, but there's more needed; the ambiguity rule is particularly troublesome and also nontrivial to get rid of). Until that's done, there's little value in providing merely a mildly improved invocation syntax.

IOW: In this area, the hard problems need to be tackled before we start adding sugar.

@kennytm
Copy link
Member

kennytm commented Oct 23, 2024

@ijackson I believe that derive-deftly approach is being addressed by #3714.

@joshtriplett
Copy link
Member Author

We had a @rust-lang/lang design meeting today on the set of macro RFCs. I've updated the RFC to incorporate all the feedback from that design meeting.

Per the feedback in that meeting, I'm starting an FCP to start letting people register consensus for this RFC.

@rfcbot merge

@rfcbot
Copy link
Collaborator

rfcbot commented Nov 21, 2024

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.