Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load prebuilt macros #1812

Closed
wants to merge 33 commits into from
Closed

Load prebuilt macros #1812

wants to merge 33 commits into from

Conversation

DelevoXDG
Copy link
Contributor

Closes #1768

@@ -130,6 +131,8 @@ pub trait CompilationUnitAttributes {
fn components(&self) -> &[CompilationUnitComponent];
fn digest(&self) -> String;

fn is_prebuilt(&self) -> bool;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's the right place to store this information. In this PR you should load all available pre-builts, and we will add a whitelist later on.

Copy link
Contributor Author

@DelevoXDG DelevoXDG Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand 😅

This PR does load all available prebuilts, but we can't load any binaries if none are present for the target triple, even if they're included in the whitelist.

This method merely indicated whether a proc macro has prebuilts for the current platform

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trait CompilationUnitAttributes is shared between CompilationUnit, CairoCompilationUnit and ProcMacroCompilationUnit. Shouldn't this be defined only for the last one? I don't think it's used on the former two anyway.

}

#[tracing::instrument(skip_all, level = "debug")]
fn process<F>(
packages: Vec<PackageId>,
opts: CompileOpts,
ws: &Workspace<'_>,
skip_prebuilt_proc_macros: bool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this?

Copy link
Contributor Author

@DelevoXDG DelevoXDG Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to run compile_unit with this flag:

  1. True by default - prebuilts should be skipped initially
  2. If loading prebuilts fails, we want to be able to call it without skipping prebuilts

Since process takes mut operation: F including compile_unit and later runs it, process<F> had to be updated as well.

Since we currently always run process with skip_prebuilt_proc_macros = true, I've considered not adding as flag, and just having:

    operation(compilation_units, ws, true)?;

But, that just seemed less readable to me.

name = self.id.name,
version = self.id.version,
target = target_triple,
suffix = DLL_SUFFIX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use libloading::library_filename instead, to be consistent with shared_lib_path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've considered this as well. Unfortunately, it will lead to naming conflicts, as for example library with a name libsome_macro.so will be produced both for aarch64-unknown-linux-gnu and x86_64-unknown-linux-gnu.

Therefore I used format as described in the issue:

Within this directory, binaries for all platforms will follow the following naming pattern: ${PACKAGE_NAME}_v${PACKAGE_VERSION}_${TARGET}.${so|dylib|dll}

On the second look though

some_macro_v0.1.0_aarch64-apple-darwin.dylib
some_macro_v0.1.0_aarch64-unknown-linux-gnu.so
some_macro_v0.1.0_x86_64-apple-darwin.dylib
some_macro_v0.1.0_x86_64-unknown-linux-gnu.so

we may consider dropping ${PACKAGE_NAME}_v${PACKAGE_VERSION}_ to slightly simplify these.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just meant replace DLL_SUFFIX with call to libloading::library_filename which adds the extension for you

Copy link
Contributor Author

@DelevoXDG DelevoXDG Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libloading::library_filename

It also adds DLL_PREFIX, which can be either "" or lib, which seems more confusing than helpful, especially since users need to name the files manually.

What do you think about dropping the ${PACKAGE_NAME}_v${PACKAGE_VERSION}_ prefix though? It would simplify things and prevent naming issues caused by mismatched Cargo and Scarb names or versions. The downside is that users might accidentally reuse a prebuilt binary after a version update, but this seems like a minor issue to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also adds DLL_PREFIX, which can be either "" or lib, which seems more confusing than helpful, especially since users need to name the files manually.

This is what Cargo compiles this packages into. Forcing macro authors to remove the prefix from compiled file is imho more confusing, that sticking to the conventional naming.

naming issues caused by mismatched Cargo and Scarb names or versions

What do you mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

I’m referring to the reason behind this warning.

When naming build artifacts, Cargo uses the package name specified in the Cargo manifest, which:

  1. May differ from the name specified in the Scarb manifest: Support packaging cairo-plugins #1605 (comment)
  2. Must be resolved manually via cargo_metadata: Support packaging cairo-plugins #1605 (comment)

The problem with 1. is that users will have to name packages manually, leading to confusion about whether the package name from the Cargo or Scarb manifest should be used - mistakes wills surely be made.
The problem with 2. is that using proc macros without Cargo is one of the motivations for introducing loading prebuilts. Relying on cargo_metadata would make that impossible.

This is what Cargo compiles this packages into. Forcing macro authors to remove the prefix from compiled file is imho more confusing, that sticking to the conventional naming.

While it’s true that Cargo compiles libraries with a prefix, the resulting libraries must be renamed anyway to avoid conflicts:

libsome_macro.dylib  
libsome_macro.so  
libsome_macro.dylib  
libsome_macro.so  
some_macro.dll  

Since renaming is inevitable, I argue that the simplest and most consistent naming convention - requiring minimal amount of documentation reading from the user - is:

aarch64-apple-darwin.dylib  
aarch64-unknown-linux-gnu.so  
x86_64-apple-darwin.dylib  
x86_64-unknown-linux-gnu.so  
x86_64-pc-windows-msvc.so  

This approach seems preferable to me alternatives like:

libsome_macro_v0.1.0_aarch64-apple-darwin.dylib  
libsome_macro_v0.1.0_aarch64-unknown-linux-gnu.so  
libsome_macro_v0.1.0_x86_64-apple-darwin.dylib  
libsome_macro_v0.1.0_x86_64-unknown-linux-gnu.so  
some_macro_v0.1.0_x86_64-pc-windows-msvc.so  

or:

libaarch64-apple-darwin.dylib  
libaarch64-unknown-linux-gnu.so  
libx86_64-apple-darwin.dylib  
libx86_64-unknown-linux-gnu.so  
x86_64-pc-windows-msvc.dll  

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with 1. is that users will have to name packages manually, leading to confusion about whether the package name from the Cargo or Scarb manifest should be used - mistakes wills surely be made.

True! Sounds like we should turn this warning into a hard error soon, to warn macro authors early on about the mismatch 😄

While it’s true that Cargo compiles libraries with a prefix, the resulting libraries must be renamed anyway to avoid conflicts:

Makes sense. So let's drop the prefix indeed. I would still keep the name + version, just to make sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So let's drop the prefix indeed.

👍🏻

I would still keep the name + version, just to make sure.

The current version does so, but I feel strongly that we should drop the name and version as well. They're redundant since this info is already available in the manifest, tarball name, and elsewhere, which potentially just makes things just a bit harder fot the macro developers. The only potential benefit I see is reducing the likelihood of outdated binaries being included. Unless you have any other reasons to keep it, I don't think this downside is really that important and justifies the trade-off.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just makes it less likely to ship wrong version if some build script fails, so it's an advantage for me. I don't think adding this is really a problem.

scarb/tests/build_cairo_plugin.rs Outdated Show resolved Hide resolved
@@ -1156,6 +1156,13 @@ impl ProcMacroHost {
Ok(())
}

/// Registers a prebuilt procedural macro by loading the shared library from the specified path.
pub fn register_prebuilt(&mut self, package: Package, config: &Config) -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why this needs to be a separate function. Can't ProcMacroHost::register decide which logic should be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume what you mean is ProcMacroHost::register should be making the decision on loading mechanism implicitly based on the package being in the allow-list and prebuilt binary being present.

When loading procmacro package that is present in the allowlist (#1769), Scarb will look for the /target/scarb/cairo-plugin directory and will attempt to load a matching binary if present. In any case this loading process fails, Scarb will always fall back to source-code compilation. This will happen silently without any warnings.

However, we want to be able to attempt loading from prebuilts, and if that resulted in failure for any reason, explicitly compile and load the binaries instead.

scarb/src/compiler/plugin/proc_macro/ffi.rs Outdated Show resolved Hide resolved
ws.config()
.ui()
.print(Status::new("Compiling", &plugin_unit.name()));
compile_unit(plugin_unit, ws, false)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of corrupted prebuilt file, this would cause the macro compilation to be run on each compilation unit that depends on the plugin. For instance in workspace of 5 packages, each with lib and starknet-contract targets, compilation of snforge plugin that failed to load as pre-built would be called 10 times (and print statuses to ui).
Maybe instead we can return a typed error from compile_unit_inner function, e.g. RetryWithPrerequisite ( Vec::<_>) that would list additional compilation units that need to be run before this one?
Then we can change compile_units from for unit in units to a FILO stack of CUs to compile, that in case of RetryWithPrerequisite would add the failed CU back to FILO with the returned prerequisites and try again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead we can return a typed error from compile_unit_inner function, e.g. RetryWithPrerequisite ( Vec::<_>) that would list additional compilation units that need to be run before this one?

compile_unit_inner does not load proc macros though.

I think the simplier approach would be to update load_plugins to just skip compilation if shared_lib_path already exists for the plugin. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you are right.
But I still don't like calling compile_unit in load_plugins or any other place that is not compile_units 😛 It would make the execution flow a bit confusing, both from the code perspective and from the user one. For instance, we could end up printing "Compiling A", then print "Compiling B", then compile B and only then actually start compiling A. And what if we decided we need to compile some comp. units in parallel in future?

skip compilation if shared_lib_path already exists for the plugin

This is not necessarily true. For instance, when you depend on a local proc macro by path, it will always have the same target dir location in cache (name-path/version/), but it will require recompilation with Cargo on any changes made to it.

I have another idea:

Maybe we should add an optional field with type Option<Arc<ProcMacroInstance>> and name like prebuilt or smth to ProcMacroCompilationUnit and CompilationUnitCairoPlugin? We could then load prebuilts in generate_compilation_units. If loading a prebuilt fails, we just set this field to None. The compile_unit could compile if the field is None, and just do nothing otherwise. Since it is an Arc, we would only load it once and then pass everywhere where needed. The load_plugins function can just use this field if present directly as it just needs the Arc anyway. This field would not be exposed in metadata, and we could add skip_loading_prebuilts to generate_compilation_units for the metadata command.

WDYT about this?

@DelevoXDG DelevoXDG requested a review from maciektr December 6, 2024 12:40
@DelevoXDG DelevoXDG marked this pull request as ready for review December 8, 2024 22:51
@DelevoXDG DelevoXDG requested a review from Draggu as a code owner December 8, 2024 22:51
@DelevoXDG DelevoXDG requested a review from a team as a code owner December 8, 2024 22:51
@DelevoXDG DelevoXDG force-pushed the zdobnikau/load-prebuilt-macros branch from 981d3df to e1873fd Compare December 11, 2024 10:10
Copy link
Contributor

@maciektr maciektr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great effort! 💪

Comment on lines +133 to +135
cargo test -p scarb --test build_cairo_plugin compile_with_prebuilt_plugins -- --exact
cargo test -p scarb --test build_cairo_plugin compile_with_invalid_prebuilt_plugins -- --exact
cargo test -p scarb --test proc_macro_server load_prebuilt_proc_macros -- --exact
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the same prefix for all this tests, and use the prefix to filter here, so we only run Cargo once.

name = self.id.name,
version = self.id.version,
target = target_triple,
suffix = DLL_SUFFIX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just makes it less likely to ship wrong version if some build script fails, so it's an advantage for me. I don't think adding this is really a problem.

Comment on lines 177 to +178
ignore_cairo_version: bool,
load_prebuilts: bool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create a struct, e.g. CompilationUnitsOpts to name those? Passing raw boolean flag can be confusing at times.

@@ -130,6 +131,8 @@ pub trait CompilationUnitAttributes {
fn components(&self) -> &[CompilationUnitComponent];
fn digest(&self) -> String;

fn is_prebuilt(&self) -> bool;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trait CompilationUnitAttributes is shared between CompilationUnit, CairoCompilationUnit and ProcMacroCompilationUnit. Shouldn't this be defined only for the last one? I don't think it's used on the former two anyway.

Comment on lines +257 to +262
fn is_prebuilt(&self) -> bool {
self.components
.first()
.map(|c| c.package.is_prebuilt())
.unwrap_or(false)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be just self.prebuilt.is_some()?

prebuilt_path
.exists()
.then_some(prebuilt_path)
.ok_or_else(|| anyhow!("prebuilt library not found"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the path is expected to not be defined in some cases, this should be an Option instead of Result. Let's handle the error where path missing is an actual error.

Comment on lines +95 to +97
fn is_prebuilt(&self) -> bool {
self.prebuilt_lib_path().is_ok()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I find this function more confusing than helpful, as all it does is check if path exist - it still doesn't mean that macro can be actually loaded, or that the compilation will not be run, but the name does not reflect that. Maybe we should just inline prebuilt_path().is_some() instead? Sounds short enough to me. 🤔

@@ -619,9 +626,13 @@ impl<'a> PackageSolutionCollector<'a> {
// We can safely unwrap as all packages with `PackageClass::CairoPlugin` must define plugin target.
let target = package.target(&TargetKind::CAIRO_PLUGIN).unwrap();
let props: CairoPluginProps = target.props()?;
let prebuilt = ProcMacroInstance::try_load_prebuilt(package.clone())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure, if loading the same dynamic library into the memory multiple times is legal (especially on windows). For safety, I would do it in two steps

  • First generate all CairoCompilationUnits
  • Then iter through all cairo_plugins and create ProceduralMacroCompilationUnits for them (loading the plugins)
  • And then, reiterate through CairoCompilationUnits, cloning the Arc with loaded macro into each maching CairoCompilationUnitComponent.

This way we do not need to load any macro twice.
Also, it might be worth the effort, to replicate this situation in tests - i.e. instead of a single package, have two packages relying on the same plugin in a single workspace.

// No need to fetch for buildin plugins.
if !props.builtin {
// No need to fetch for builtin or prebuilt plugins.
if !props.builtin && !package.is_prebuilt() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment saying, it will not run fetch for plugins with invalid prebuilts, even though they will need to compile with Cargo before being used.

@maciektr
Copy link
Contributor

Replaced with #1856

@maciektr maciektr closed this Dec 20, 2024
@maciektr maciektr mentioned this pull request Dec 20, 2024
github-merge-queue bot pushed a commit that referenced this pull request Dec 21, 2024
Originally submitted as
#1812

---------

Co-authored-by: Maksim Zdobnikau <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement loading of prebuilt macros
2 participants