-
-
Notifications
You must be signed in to change notification settings - Fork 941
safe mode to disable executing any external programs except git #2029
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the preview - I see now how this can work and like that it seems to be minimally invasive overall.
I assume that the testing will primarily be done by hand for ease of use but hope that some sort of 'practical' test-case could be contributed as well.
I would like to write tests, I looked around through the test suite, but it still escapes me how to structure these tests. Since these options affect how |
It really isn't easy to test it at all, and probably impossible to test it exhaustively. So I am fine admitting defeat on this one, particularly because the tests I could imagine would be so specific and spotty that they barely have any value. |
if we can only run |
However, that may not be the whole story. Although I don't think "safe mode" should prohibit the use of subprocesses in Furthermore, regarding the use of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed a couple of things related to unusual but plausible command
arguments.
FYI this is my very simple test suite: #!/usr/bin/python3
import os
import git
import tempfile
for url in [
'[email protected]:fdroid/ci-test-app.git',
'ssh://gitlab.com/fdroid/ci-test-app.git',
'git://gitlab.com/fdroid/ci-test-app.git',
]:
print('====================', url)
d = tempfile.mkdtemp(prefix='foo_py_')
repo = git.Repo.clone_from(url, d, safe=True)
# clone from fake URL should prompt for password
d = tempfile.mkdtemp(prefix='foo_py_')
for url in [
'https://github.com/asdfasdfasdf/adsfasdfasdf.git',
'https://gitlab.com/asdfasdfasdf/adsfasdfasdf.git',
'https://codeberg.org/asdfasdfasdf/adsfasdfasdf.git',
]:
try:
repo = git.Repo.clone_from(url, d, safe=True)
except git.exc.GitCommandError as e:
print('repo', e)
url = 'https://gitlab.com/fdroid/ci-test-app.git'
d = tempfile.mkdtemp(prefix='foo_py_')
print(d)
repo = git.Repo.init(d, safe=True)
origin = repo.create_remote("origin", url)
origin.fetch()
d = tempfile.mkdtemp(prefix='foo_py_')
print(d)
repo = git.Repo.clone_from(url, d, safe=True)
print(type(repo))
print('SUCCESS') |
I think I should leave it to @EliahKagan to review this PR, due to its nature of being very relevant to the security of the project. To me it still is the question if there is a good case for supporting this - I'd think the answer should be "yes" if f-droid benefits, but I fear that it lures people into a false sense of safety. Maybe |
F-Droid has been using this approach for years, and will switch to this code once it is merged. As for the name, I'm open to suggestions. I used the word "safe" following the example of parser libs I've seen, where it means features are disabled in the interest of security. Like ruamel.yaml uses "safe" mode to mean a YAML parser that does not parse anything but lists, dicts, and scalars (e.g. no objects). From what see, I think this clearly does make it safer to use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to review this in full sometime soon! This mini-review is not that. Instead, this just points out a couple of things related directly to my previous comments. One is fairly minor and pertains to the clarity and technical accuracy of exception messages. The other is more significant: the current attempt to refuse to run the command in a shell does not always work.
@@ -1210,6 +1253,51 @@ def execute( | |||
if self.GIT_PYTHON_TRACE and (self.GIT_PYTHON_TRACE != "full" or as_process): | |||
_logger.info(" ".join(redacted_command)) | |||
|
|||
if self._safe: | |||
if isinstance(command, str) or command[0] != self.GIT_PYTHON_GIT_EXECUTABLE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To give clearer error messages, I recommend moving this check of the command
type and value below the if shell
check, and also splitting this command
check into two checks with two separate messages.
The reason I recommend doing this after the if shell
check is that, if a shell would (be attempted to) be used, then the condition here could come out True
even for git
commands. Technically the message is correct in that case, in that the shell is separate from the git
executable, but I don't think users would understand what was going on.
The reasons I recommend splitting this into separate if isinstance(command, str)
and if command[0] != self.GIT_PYTHON_GIT_EXECUTABLE
checks, with two separate exception messages, are:
- On Windows, passing a string like
"git version"
is unambiguously an attempt to rungit
with the argumentversion
. So in that case the message would be incorrect (and very confusing, if someone is using GitPython in a Windows-only application and writingcommand
arguments as strings). - It's possible for the message to be incorrect even outside Windows, because someone could pass
"git"
by itself (or whatever other valueGit.GIT_PYTHON_GIT_EXECUTABLE
has been set to), either by accident or intentionally in order to see the summary of subcommands it outputs. - On any system, even if the message is correct, I think it's useful to distinguish the problem of passing a string rather than a sequence from the problem of passing a sequence that attempts to execute something that isn't a Git command.
if shell: | ||
raise UnsafeExecutionError( | ||
redacted_command, | ||
"Command cannot be executed in a shell when in safe mode.", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This blocks using a shell when the reason the shell would be used is that a shell=True
argument was passed explicitly. But it fails to block using a shell when the reason is instead that Git.USE_SHELL
has been set to True
. See the first bullet point in #2029 (comment).
We can observe our own direct subprocess invocations (provided that they are done from Python and in the usual manner). Setting it up:
import sys
import git
def popen_hook(event, args):
if event == "subprocess.Popen":
print(event, args)
sys.addaudithook(popen_hook)
g = git.Git(".", safe=True)
Due to the Git
instance being created with safe=True
, this rightly declines to execute anything:
g.execute(["git", "version"], shell=True) # Blocked as of 07a3889.
But this executes a shell, even though it should also decline for the same reason, in the same way, and preferably through the same logic automatically:
git.Git.USE_SHELL = True
g.execute(["git", "version"]) # NOT blocked as of 07a3889.
(This example is written in such a way that it demonstrates the problem on any platform, not just Windows, though exactly what unintended git …
command is run in the shell differs across systems.)
The cause of the problem is that, at this point in the code, a None
value of shell
has not yet been replaced by the value of the USE_SHELL
class attribute. The if shell
logic should appear after this if shell is None
check, instead of before:
Lines 1343 to 1349 in 07a3889
if shell is None: | |
# Get the value of USE_SHELL with no deprecation warning. Do this without | |
# warnings.catch_warnings, to avoid a race condition with application code | |
# configuring warnings. The value could be looked up in type(self).__dict__ | |
# or Git.__dict__, but those can break under some circumstances. This works | |
# the same as self.USE_SHELL in more situations; see Git.__getattribute__. | |
shell = super().__getattribute__("USE_SHELL") |
Although one option would be to move the new if self._safe
code lower, that code for checking if shell is None
(and resetting shell
to the value of the USE_SHELL
class attribute) could probably instead just be moved higher.
As described in #2020, here is the core implementation of "safe mode". The core idea is to set up operations so that external programs are not executed by
git
. This has been a major source of vulnerabilities.This means that network connections are limited to HTTPS. As much as possible, this will rewrite remote URLs to HTTPS. This is necessary so that submodules work even when they do not use HTTPS URLs, as long as they are public, HTTPS-accessible repos.
This is a draft to confirm the approach. Then I will follow up and polish everything for merging.closes #2020