Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rpath #2

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Add rpath #2

wants to merge 14 commits into from

Conversation

johnwparent
Copy link
Contributor

@johnwparent johnwparent commented Aug 1, 2024

Add RPath support to Windows compiler wrapper

In Summary

The Windows dynamic libraries are broken into two parts, an import library (.lib) and the actual dynamic library (.dll). The import library is the component utilized when linking a program or library against the dynamic lib, as it provides the symbol stubs, and tells the linker where to look for them at runtime via a name in the COFF format of the import library section associated with the symbol being resolved. Typically this name is resolved to the qualified paths to a dll at runtime via a complex series of filesystem lookup heuristics. However, if the name happens to be the absolute path to the respective DLL, then the linker behaves like it has an RPath and just uses that absolute path to resolve the DLL.

Normally MSVC's linker will not create import libraries like this, but if the import library is parsed and modified, this approach can be leveraged to replicate RPaths on Windows.

This PR adds this functionality to the MSVC compiler wrapper.

This PR also adds compiler wrapper side handling for relocation respective to the RPath considerations that need to be made for successful relocation of binaries.

Copy link
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly speaking

  • More docs
  • I made a number of style requests (we don't enforce style on c code here)
  • It looks like relocation is handled as a consideration, but I missed where cl performs the necessary work to update .lib when not in patch mode

Makefile Outdated Show resolved Hide resolved
@echo off
REM Usage: dll2lib [32|64] some-file.dll
REM
REM Generates some-file.lib from some-file.dll, making an intermediate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO def_gen.bat could be named more clearly, like create_import_lib_from_dll.bat (and I think this comment should specifically highlight that this is an import lib.

Ideally, this could take an abspath, and could also specify a destination (where the .lib should be put).

// Returns a single item list
StrList split(std::string s, std::string delim);

// Parse command line opts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update this comment to explain what it means if this function returns true

src/winrpath.cxx Outdated Show resolved Hide resolved
src/winrpath.h Outdated Show resolved Hide resolved

std::string LibRename::compute_def_line() {
return "/EXPORTS " + this->name + ".dll";
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) this feels a bit low level to embed in its own function.

src/winrpath.cxx Outdated Show resolved Hide resolved
if (!this->replace)
this->new_lib = name + "abs-name.lib";
else
this->new_lib = this->lib;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This member only appears to be set in this function, and only read in this function: IMO that means it would be clearer if it were not a member.

src/utils.cxx Outdated Show resolved Hide resolved
LibRename patcher(patch_args[0], patch_args[1], true);
patcher.computeDefFile();
patcher.executeLibRename();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly speaking, my understanding is that this updates the cl wrapper so that if the word "patch" appears as an argument, it is capable of performing relocation. I don't see the modification of non-relocation toolchain executions though. Also, that seems like a loose check: I think that token could appear in the invocation in other ways.

Copy link
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description needs update

  1. E.g. we'll need to re-write externals into their own simulated install prefix
  2. Overall I think it would be helpful to differentiate w/ linking/rpathing on Linux
    • e.g. for X->Y, RPATHing is about generating a lib for X that points to Y, wherever Y is
    • In our case, this approach is about generating the necessary objects in Y such that when X links to it, the objects for X refer to Y
    • and for example the compiler doesn't have options like -rpath that would make it easy to generate a dll for X with "rpaths" directly
  3. Document the overall concept of wrapper interception:
    • We put a "cl" first in PATH, so that invocations of CL get our wrappers
    • And we manage all of that with a single multiplexing binary (much like how spack cc works)
    • in light of [2], I'm trying to remember the purpose of intercepting compiler calls
      • Presumably this is to help a build-system that is not as capable of finding dependencies as CMake (point being, the wrapper interception on Windows isn't for the the rpathing part, like it is on non-Windows; or maybe I got mixed up and we do intercept link calls, but is there much difference between intercepting and doing this after the build completes?)

cd tmp/test
# create symlinks to test the different toolchains
copy ../..cl.exe cl.exe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ../../cl.exe instead?

if( !CreatePipe(&this->ChildStdOut_Rd, &this->ChildStdOut_Wd, &saAttr, 0) )
throw SpackException("Could not create Child Pipe");
if ( !SetHandleInformation(ChildStdOut_Rd, HANDLE_FLAG_INHERIT, 0) )
throw SpackException("Child pipe handle inappropriately inhereited");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) typo: inherited

}

bool ExecuteCommand::pipeChildToStdout()
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More docstrings would be helpful, e.g. in this case:

/*
 * Execute the command and then after it finishes collect all of
 * its output.
 */

Note that a single compilation can generate a lot of output, so if the pipe has limited space, this might cause a hang.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w.r.t docstring: will do

w.r.t hang: will look into async implementation

Tool = std::unique_ptr<IntelFortranInvocation>(new IntelFortranInvocation(command, cli));
}
else {
// If it's not c/c++ or fortran we're linking
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comma: // If it's not c/c++ or fortran, then we're linking

That being said, this would be more robust IMO if it explicitly checked for lang::link: then add a final else that raises an exception (this might occur if we add a lang:: but don't handle it here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a good plan.

std::unique_ptr<ToolChainInvocation> ToolChainFactory::ParseToolChain(char const* const * argv) {
std::string command(*argv);
char const* const* cli(++argv);
stripPathandExe(command);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) can you use camel case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem!

mklink /S link.exe cl.exe
mklink /S ifx.exe cl.exe
mklink /S relocate.exe cl.exe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this also has to create a link for ifort

And this would need to be able to intercept clang-cl as well (e.g. to support spack/spack#47338)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, ifort has been deprecated for a year, and either already has been or is going to soon be discontinued, and the recommendation is to use OneAPI (ifx). I don't think we've ever officially supported ifort on Windows (even the MSVC-Fortran compiler expects ifx not ifort as the compiler), so I wasn't sure we'd want to support it here.

W.r.t. Clang-cl, I'll reach out to coordinate with James, but support for clang-cl may be more involved than just adding a symlink. Depending on how this and #47338 go, if possible I'd like to get this in first and then make an edit as a part of #47338. If that lands first I'll make changes in this PR.

#include "cl.h"

void ClInvocation::loadToolchainDependentSpackVars(SpackEnvState &spackenv) {
this->spackCommand = spackenv.SpackCC.empty() ? "cl.exe" : spackenv.SpackCC;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This env var should not be empty: this wrapper should only be invoked after Spack tells the wrapper where to find the "real" C compiler.

(If you agree, I get the impression these *Invocation classes are not needed)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The invocation classes are needed primarily to support the linker and any future extensions we want to make to the specific behavior of a given compiler/linker. Further the subclasses allow us a bit more flexibility when selecting the correct toolchain.

The linker *Invocation class sets up the entire RPath hack and actually performs the steps required to produce the path'd binaries.

std::string Spack;
std::string SpackInstDir;
std::string SpackCC;
std::string SpackCXX;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a comment should be added that this isn't used right now, but would be useful if we add support for a compiler with distinct c/c++ binaries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will do.

move calc.dll tmp/calc.dll
link $(LFLAGS) /test/main.obj calc.lib /out:tester.exe
move tester.exe tmp/test/tester.exe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a CI check that runs this? Does it run now?

I'm assuming not given

  • issue w/setup_test
  • how is the link command above able to know that calc.dll is in tmp in order to generate the abspath?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tester.exe runs if setup locally (roughly following these instructions), this makefile test does not work as of yet.

@johnwparent
Copy link
Contributor Author

johnwparent commented Nov 25, 2024

@scheibelp W.r.t.:

E.g. we'll need to re-write externals into their own simulated install prefix

This is less of a factor as far as this PR is concerned (outside of the general problem of rpath'ing for transitively depended on DLLs), and it mostly an issue for the Spack side PR.

W.r.t. that, do we want to go down the rabbit hole of transient DLLs? For example, external curl -> external openssl -> various system libraries. We don't need to worry about the system libs because those will always "just work" from Spack's compilation environments, but detecting openssl's DLL may not if curl doesn't vendor/have a copy. We could attempt to locate the dependent DLLs, but there's no guarantee we've found the correct version/etc... We could also just inform the user (if a basic attempt to detect the dependent dll has failed) that they'll need to explicitly add that external to Spack. Another issue that arises is managing multiple dependent dlls that are detected by Spack. We'd almost need the concretizer to select one...

Edit: hit comment too soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants