[Feature Request/Discussion] Splitting tenant schemas into pieces to run in multiple processes. #550
Comments
Sounds interesting. The main issue that comes to mind is how to handle the split if other schemas are added while you're migrating all parts, or while you're migrating one part and not the other.
@AGASS007 LOL, don't do that in the tenant migrations. Your two options on avoiding that are:
But my point is, don't create or delete schemas while you're running migrations in parts. Do you have a particular use case for this? Right now, creating a new tenant is a pretty big deal for us and another team handles it; the dev team is responsible for the integrity of the whole, not of individual schemas. Edit: This is what I have working so far: https://github.com/bernardopires/django-tenant-schemas/compare/master...kingbuzzman:migration-in-parts?expand=1 (still needs tests and a bit of cleanup of the text).
In our setup, new tenants are created through an API, which can be triggered at any time. I'm fine with not addressing a potential change in the number of tenants while migrating (i.e. no built-in redundancy or extra margin in the chunking mechanism), but I think that limitation should be made explicit.
@AGASS007 thinking about your problem a little more, wouldn't this get solved by using transactions? If you wrap your create-new-tenant function in a transaction, no one will see the new tenant until it gets committed. That means you would have to be REALLY unlucky to hit this issue while these guys are running in parts and you create a new tenant. Ideally, these migrate commands would run within milliseconds of each other. My proof of concept (inside the test app):

    from customers.models import Client
    from django.db.transaction import atomic

    with atomic():
        Client.objects.create(schema_name='e', domain_url='e')

In the meantime I'm running my migrate_schemas in a different session, and it looks like it works. No false positives. If I remove the atomic, I [can potentially] get issues, depending on how many schemas you have and how many parts you're splitting them into.
@AGASS007 I've created the PR, any criticism would be appreciated.
Hi @kingbuzzman. At Txerpa we've taken a different approach to the problem: @marija-milicevic has developed a new executor based on celery, a truly distributed schema migration that is much better than our previous parallel executor. We have more than 2k schemas, and big migrations are a nightmare for us; that's why we have put so much effort into this problem.
@xgilest I don't know how I feel about delegating this task to celery. On the one hand, sure, it's a job queue... let it do it in the background... But this doesn't fix our releases. The company I work for is very much: turn the app servers off after 9pm, throw a pretty 503, do the migrations as fast as possible, and when that's done, turn the app servers back on. If we were a bit more in the mindset of zero downtime, your solution would probably work for us. But even so I'd still have some reservations:
I'm not saying any of these are deal breakers. It's just a different paradigm that scares a lot of large companies like mine. How are you dealing with all this right now? P.S. Your tests are failing.
@kingbuzzman I do understand your reservations. Here is how we are dealing with them:
P.S. Tests are failing in dts. Our changes don't break anything that wasn't already broken, but we've been unable to fix
Which is very weird, because it's not failing in other tests with Python 3.5. Any ideas on how to fix it would be appreciated.
@xgilest My one issue with the change by @marija-milicevic is the requirement of
@andreburto - @xgilest actually has a very elegant solution to this. The number of workers can be tuned per machine type, and each worker just pulls the next tenant as soon as it's ready. This means the solution works best when your tenants are not of equal size (some take longer to migrate than others) or the worker machines are not identical (in a queue setup it's easier to tune workers to a machine and just let them go). As for WHICH queue to choose, I bet it would be fairly simple to apply a strategy pattern here and write providers for any queue type: RabbitMQ, RQ, SQS, etc.
Bonus points for sorting the tenants by size (descending) so that the big tenants get picked up first. This is crude, but it should reduce the overall time the migrations take.
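To illustrate why the ordering matters, here is a rough sketch that simulates workers pulling the next tenant as soon as they are free, with and without the largest-first sort. The function name `simulate_makespan`, the tenant names, and the duration estimates are all made up for illustration; nothing here is part of django-tenant-schemas or of the celery executor discussed above.

```python
import heapq
from typing import Dict


def simulate_makespan(durations: Dict[str, float], workers: int, largest_first: bool) -> float:
    """Simulate `workers` identical workers that each pull the next tenant
    from a shared queue as soon as they finish the previous one, and return
    the time at which the last worker finishes."""
    if largest_first:
        # Biggest estimated migrations are queued first.
        order = sorted(durations, key=lambda name: durations[name], reverse=True)
    else:
        # Otherwise keep the order in which the tenants were listed.
        order = list(durations)

    # Min-heap of the times at which each worker becomes idle.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for schema in order:
        start = heapq.heappop(free_at)  # the worker that is free earliest
        heapq.heappush(free_at, start + durations[schema])
    return max(free_at)


# Hypothetical per-tenant migration time estimates (arbitrary units).
estimated = {"small_a": 1, "small_b": 1, "small_c": 1, "small_d": 1, "big": 4}

unsorted_total = simulate_makespan(estimated, workers=2, largest_first=False)
sorted_total = simulate_makespan(estimated, workers=2, largest_first=True)
```

In this toy case the big tenant is queued last when unsorted, so one worker sits idle while the other finishes it; sorting by size first lets the small tenants fill in alongside it. Real gains depend on how good the size estimates are.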
The need has arisen to distribute our schema migrations. We have 150+ schemas, and while the parallel executor is a great add-on that greatly speeds up our process, I'd like to propose a way to break migrations down further.
What i'm proposing is something like:
All the schemas would be retrieved, sorted by their pk, and split into parts; the "part" you ask for would be the one that runs, selected with a good ol' Python slice.
Naturally the parts wouldn't all run on the same machine: my automated deployment would figure out how many servers I have to deploy the code to, and divide the work among them once the public schema has been migrated.
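A minimal sketch of the slicing idea, assuming a hypothetical helper `select_part` and plain schema names standing in for tenants already sorted by pk. This is not the code from the linked branch, just one way the split could work:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_part(items: Sequence[T], part: int, total_parts: int) -> List[T]:
    """Return the 1-based `part` out of `total_parts` contiguous slices of
    `items`. Every item lands in exactly one part, and part sizes differ by
    at most one."""
    if total_parts < 1 or not 1 <= part <= total_parts:
        raise ValueError("part must be between 1 and total_parts")
    n = len(items)
    # Integer arithmetic keeps the boundaries consistent across machines,
    # as long as every machine sees the same sorted list.
    start = (part - 1) * n // total_parts
    end = part * n // total_parts
    return list(items[start:end])


# Stand-in for tenant schema names already sorted by pk (hypothetical data).
schemas = ["tenant_%d" % i for i in range(1, 11)]

# Each of three servers would ask for its own part.
parts = [select_part(schemas, p, 3) for p in (1, 2, 3)]
```

Each server only needs to know its own part number and the total number of parts. The split is only stable if all servers see the same list of schemas, which is the concern raised in the comments about tenants being created mid-migration.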
What do you guys think?