-
Notifications
You must be signed in to change notification settings - Fork 2k
raw_exec windows: add support for setting the task user #25641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thank you for your submission! We require that all contributors sign our Contributor License Agreement ("CLA") before we can accept the contribution. Read and sign the agreement Learn more about why HashiCorp requires a CLA and what the CLA includes Have you signed the CLA already but the status is still pending? Recheck it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tomqwpl!
I haven't reviewed very closely yet, but just a couple of high-level notes...
First, are you able/willing to sign the CLA? We can't accept a non-trivial PR like this without it, but I want to save everyone some time if that's going to be an issue.
Second, what kind of testing can be reasonably done with this on GitHub Actions? We sort of let the original PR slide on expanding test coverage and hit this silly java
-driver-on-Windows bug #25648 so I'd really like to have some way of making sure this is going to be a maintainable change.
Right now, it would conflict with the #25496 changes. With that code you would use. e.g "NT AUTHORITY\LocalSystem" as the username. That doesn't work here. I could easily detect explicit use of "NT AUTHORITY\something" and revert to the original code. Discussion would be needed with @x-dvr on that and what use cases need to work there.
Unfortunately we've already shipped #25496, so we can't break the behavior that allows those names. Are those names improper in some way, or do they just not work with the behavior demonstrated here.
I'd like to include some logging, but so far I can see any logging actually come out in the nomad log output. Not sure why.
Right. The executor doesn't log to the client agent, it logs to the executor log which you can find at $datadir/alloc/$alloc_id/$task_name/executor.out
(you can find this path in the agent logs that start with "client.driver_mgr.raw_exec.executor: starting plugin". That's designed to expose the logs of the task executor to the job author, as opposed to only exposing them in the agent logs which many job authors won't have access to.
The CLA I'm trying to clear now. I'm not expecting it to be a problem, but can't say for sure yet. I'd forgotten I would need to do that, I should have done it before I created the PR. It'll probably be a week before I have any clarity on that.
I deliberately didn't include handling of NT_AUTHORITY so as to have a discussion on the best way to maintain compatibility there (hence my comment). The code was also meant to show what I'd have done had the original not existed and then leave how to deal with the merge as a separate discussion. "NT AUTHORITY" is a weird thing. LSALogonUser doesn't accept it, but "LogonUser" does. It's kind of a placeholder, though I don't know why ".\LocalSystem" wouldn't have been sufficient (it doesn't work with this code though, I tried it). The only way I know you could deal with it is just to recognise it manually. I didn't do that as it felt fragile. I'm not actually sure what the range of users the original code would accept is, whether recognising "NT AUTHORITY" is enough. I was hoping perhaps the original PR author could help out there.
Can you point at where the testing for this on Linux would be? I've looked around and can't find anything obvious. It would be useful to know what level would be appropriate to do this at, assuming it can be done. With specific reference to that issue though, I guess this is where someone would need to explain the potential impact. The name of the PR relates to "raw-exec", as did the original, but clearly the code change actually has a wider impact than that, so clearly tests would need to cover more than just "raw-exec". |
Cool, thanks. I know it's a pain, so I appreciate it.
Ok, understood. I suspect we're going to want to support the other
There are test files in https://github.com/hashicorp/nomad/tree/main/drivers/shared/executor but I guess I was thinking more in terms of what we might need to do in the test environment setup in test-windows.yml. Because the reality is that Unix users are really simple and easy for us to test the behaviors, whereas Windows users are not (as evidenced by all the code you've just contributed!). So all this new code is making syscalls and the only way we'd have to test that at all right now is running jobs in our E2E suite on Windows. |
Description
This expands the scope of what was done in #25496 to make it more generally applicable.
Note that this is currently a draft. I deliberately haven't done anything about docs or changelog, the purpose
is to show what's possible and to discuss inclusion.
Right now, it would conflict with the #25496 changes. With that code you would use. e.g "NT AUTHORITY\LocalSystem" as the username. That doesn't work here. I could easily detect explicit use of "NT AUTHORITY\something" and revert to the original code. Discussion would be needed with @x-dvr on that and what use cases need to work there.
The use here is to either use "domain\username" or "username@domain". The latter is the more modern form of Windows username. It is also possible just to use "username" for a local user, or ".\username".
I'd like to include some logging, but so far I can see any logging actually come out in the nomad log output. Not sure why.
Testing & Reproduction steps
In order to test this it's necessary to run Nomad as the LocalSystem account. Easiest method of doing that for development purposes is to use "sysinternals" psexec : https://learn.microsoft.com/en-us/sysinternals/downloads/psexec. With that, run "psexec -s cmd.exe". That gets you a command prompt running as LocalSystem from where you can run nomad. Alternatively just run "psexec -s bin\nomad agent -dev".
Because this has to be run as local system, not sure how you would run automated tests on it. Equally there's not much you can do to unit test the individual parts I don't think.
I've separated the code that does the specific "s4u" stuff into a separate package, but happy to do whatever is thought best there. I didn't want it cluttering up the executor_windows.go file.
Links
Contributor Checklist
changelog entry using the
make cl
command.ensure regressions will be caught.
and job configuration, please update the Nomad website documentation to reflect this. Refer to
the website README for docs guidelines. Please also consider whether the
change requires notes within the upgrade guide.
Reviewer Checklist
backporting document.
in the majority of situations. The main exceptions are long-lived feature branches or merges where
history should be preserved.
within the public repository.