INTPYTHON-527 Add Queryable Encryption support #329
Wrong commit message for 65bd15a and I don't want to force push yet. It should have said:
I'm aware that
|
It's not working as you think it is. As I said elsewhere, Does this fix the "command not supported for auto encryption: buildinfo" error? If so, it's perhaps because I'd suggest using my patch as a starting point for maintaining two connections. |
I don't disagree, but it feels a lot like
Yes it works by design, not a side effect. I'm
I made a few passes at it but did not get anywhere. I'll try again, though. |
Your "stumble" theory of how it's working isn't correct. |
Copy that, thanks! I've removed
Still working on an unencrypted connection, but perhaps the only time we need it is for the version check. |
@ShaneHarvey @Jibola @timgraham FYI here is the
And here is the error again with some additional debug:
And the full traceback:
Test settings:
This is happening in the |
- Expand generic router functionality
- Specify encrypted db_name & kms_provider in model
- Get kms_providers and key_vault_namespace from auto_encryption_opts
How about setting kms_provider and db_name per model? |
@@ -27,7 +27,7 @@ def test_encrypted_fields_map(self): | |||
{ | |||
"path": "ssn", | |||
"bsonType": "string", | |||
"queries": [{"contention": 1, "queryType": "equality"}], | |||
"queries": {"contention": 1, "queryType": "equality"}, |
FYI it's a list in the tutorial. Also for fields that support multiple query types I think it has to be a list of dictionaries, but maybe for a single query type this is OK.
It's confusing! The documentation says, "You can configure an encrypted field for either equality or range queries, but not both.", and I've seen errors like "BSON field 'create.encryptedFields.fields.queries' is the wrong type 'array'" and "pymongo.errors.EncryptedCollectionError: Exactly one query type should be specified per field".
Unless there's some case I'm missing, arguably the server/tutorial shouldn't cause ambiguity by using a list, but probably it can't change behavior due to backward compatibility.
@jordan-smith721 @Jibola Should we use a list here for future compatibility and as shown in the tutorial or dict? I know that I've tested with more than a single query type per field and gotten an appropriate error, so I'm wondering if we should anticipate fields with multiple query types in the future or if we can always assume a single query type per field, as is the case for the fields and query types we are currently supporting.
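To make the list-versus-dict question concrete, here is a minimal sketch of accepting either shape for "queries" and enforcing the single-query-type rule in one place. The helper names normalize_queries and single_query_type are hypothetical; they are not part of this PR, django-mongodb-backend, or PyMongo, and the error text simply echoes the message quoted above.

```python
def normalize_queries(queries):
    """Return the ``queries`` value as a list of dicts.

    Hypothetical helper: accepts either a single dict or a list of dicts.
    """
    if isinstance(queries, dict):
        return [queries]
    return list(queries)


def single_query_type(queries):
    """Return the one configured query type, or raise ValueError."""
    normalized = normalize_queries(queries)
    if len(normalized) != 1:
        # Mirrors the "exactly one query type per field" rule discussed above.
        raise ValueError("Exactly one query type should be specified per field")
    return normalized[0]["queryType"]


# The two shapes under discussion in this review thread.
as_dict = {"contention": 1, "queryType": "equality"}
as_list = [{"contention": 1, "queryType": "equality"}]
```

With something like this, the schema code could keep whichever shape it prefers internally while still rejecting multiple query types early, before the server does.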
I feel this design is driven too much by your desire to be able to provide a generic database router. Sure, we can provide an example router in the documentation for some use case, but a router configurable by other settings is just creating layers of indirection that ultimately result in needless complexity. I imagine the majority of projects will have their encrypted models in the same database. To that end, something like this is sufficient:
Having to declare a Similarly for |
I do have that desire … but it's driven by a desire to simplify project setup and that may or may not include a generic database router.
What is "encrypted-alias" here? The literal name of the database or something that refers to the literal name of the database?
In the short term, I'll just create a base class with
Same with KMS provider. The base class gets |
Yes, Conceptually, adding model attributes for routing decisions just isn't clean, but feel free to defer my suggestion until you want to tackle it. Arguably, We have some monkeypatching. Example:
as_mql(), kms_provider()) that aren't going to conflict with any future Django functionality isn't so bad. I'd like to try to clean up the other examples by adding hooks in a future version of Django.
|
I would love to see this happen but how do we support it with content types requiring an unencrypted connection? If we were targeting a single encrypted database and connection, then I would agree we don't need to provide any database routers for the user. Since that is currently not the case, I don't think we can tell users that a minimum requirement for using this feature is to cut/paste Python code from the docs into a file then import and use it in their settings. That said, your point about not wanting to tie a bunch of configuration together is well taken. |
You can use Django without the contrib apps. |
Apparently with any KMS provider but local, credentials must accompany the provider on a journey to `create_encrypted_collection`.
|
||
.. admonition:: Migrations support is limited | ||
|
||
:djadmin:`makemigrations` does not detect changes to encrypted fields. |
This wording was copied from changes to embedded fields, but the behavior here is different. makemigrations will detect changes to encrypted fields but trying to run those migrations will fail with server errors since changes to encrypted fields aren't supported by MongoDB.
def kms_provider(self): # noqa: ARG001 | ||
return getattr(settings, "KMS_PROVIDER", None) |
You misunderstood what I meant. The idea is that kms_provider() would be a router method with a signature similar to db_for_read() (it takes a model). The user could implement the kms_provider method on their router if they want to specify a kms_provider, either the same one for all models, or model by model.
All that said, kms_provider is an optional parameter of create_encrypted_collection() and there is no discussion of it in the design doc, so please first confirm we need it. Possibly it's only for explicit encryption? If it is needed, perhaps it will be helpful if I first sketch out the design before you dive into the implementation.
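A rough sketch of the router idea described above, under stated assumptions: the class name EncryptedRouter, the "encrypted" alias, and the "local" provider are placeholders, and kms_provider() is the proposed hook, not an existing Django router method. Only db_for_read()/db_for_write() and the model's "encrypted" attribute come from Django and this PR.

```python
class EncryptedRouter:
    """Hypothetical router: routes encrypted models and names their KMS provider."""

    # Assumed alias of the encrypted connection in DATABASES.
    encrypted_alias = "encrypted"

    def db_for_read(self, model, **hints):
        # Models flagged with ``encrypted = True`` go to the encrypted alias;
        # returning None lets other routers (or the default) decide.
        if getattr(model, "encrypted", False):
            return self.encrypted_alias
        return None

    # Reads and writes are routed identically in this sketch.
    db_for_write = db_for_read

    def kms_provider(self, model, **hints):
        # Proposed hook (not part of Django): same provider for every
        # encrypted model here, but it could branch per model instead.
        if getattr(model, "encrypted", False):
            return "local"
        return None
```

A project would list such a class in DATABASE_ROUTERS; whether the backend should call kms_provider() at all still depends on confirming that the parameter is needed.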
The documentation for ClientEncryption says, "Explicit client-side field level encryption." so I'm confused. I guess it's also needed for auto-encryption?
*, key_vault_namespace, crypt_shared_lib_path=None, kms_providers=None, schema_map=None | ||
): | ||
""" | ||
Returns an `AutoEncryptionOpts` instance for use with Queryable Encryption. |
The correct verb style (per PEP 257) is "Return".
# `supports_transactions` already checks if the server is a | ||
# replica set or sharded cluster. | ||
is_not_single = self.supports_transactions | ||
return is_enterprise and is_not_single and self.is_mongodb_7_0 |
return is_enterprise and is_not_single and self.is_mongodb_7_0 | |
# TODO: check if the server is Atlas | |
return is_enterprise and is_not_single and self.is_mongodb_7_0 |
ConnectionRouter.kms_credentials = kms_credentials | ||
ConnectionRouter.kms_provider = kms_provider |
This file is a list of database functions and ConnectionRouter isn't one of them. If we do need this, put it in routers.py.
app_config, conn.alias, include_auto_created=False | ||
): | ||
if getattr(model, "encrypted", False): | ||
encrypted_fields = self.get_encrypted_fields(model, conn) |
You can still initialize with conn.schema_editor() as editor: outside the loops and change this line to editor._get_encrypted_fields_map(model).
|
||
self.stdout.write(json.dumps(schema_map, indent=2)) | ||
|
||
def generate_encrypted_fields_schema_map(self, conn): |
There are 10 instances of conn = in Django, but they're far outnumbered by 86 instances of connection = so I'd prefer not to use the abbreviated name. I suppose I find it easier and more pleasurable not to have to read and interpret abbreviations.
(see previous attempts in #318, #319 and #323 for additional context)
With this PR I am able to get Django to create an encrypted collection when the schema code is running create_model on an EncryptedModel containing an EncryptedCharField, e.g. see db.enxcol_.encryption__person.ecoc below
Questions
- _nodb_cursor functionality in this PR or do something in init_connection_state as @timgraham suggests, or do something else?
- command not supported for auto encryption: buildinfo which happens when Django attempts to get the server version via encrypted connection, thus necessitating the need to manage both encrypted and unencrypted connections. Are most commands supported for auto encryption or not?
- EncryptedModel support for EmbeddedModel look like? What are the specific use cases for integration of EncryptedModel and EmbeddedModel? Should we be able to mixin EncryptedModel and EmbeddedModel then include that model in an EmbeddedModelField?
Todo
django_mongodb_backend.encryption
Helpers
Included helpers are also used by the test runner e.g.