This repository has been archived by the owner on Jan 8, 2021. It is now read-only.

Cleanup of old user accounts - Issue#218 #230

Open
wants to merge 4 commits into
base: master
Changes from 3 commits
19 changes: 19 additions & 0 deletions piplmesh/account/tasks.py
@@ -0,0 +1,19 @@
from django.conf import settings
from django.utils import timezone

from celery import task

from piplmesh.account import models as account_models
from piplmesh.api import models as api_models

Member

base is not used?

@task.task
def clean_inactive_lazy_users():
Member

OK, but this we will REALLY have to improve. Right now you read ALL users from the database into Python just to find out which users not to process. ;-)

This is why databases support queries: so that you can limit what is transferred between the database and Python to only what you are actually interested in.

So please create a MongoEngine query which returns only users who have no content and who have been inactive for longer than the timeout. Then iterate over them, check is_anonymous on each one, and delete those that match.

Author

I've been trying for a few days now to find a way to build such a query, but I've had no luck. I went through every page I could find on Google about this topic, but I'm still stuck on the query. Any help would be much appreciated. I assume I have to use something like this:
api_models.Post.objects(comments__author=user), which returns the posts that have comments written by a given user. But this is just one part of the query, and I don't know how to combine posts, comments and users together in one query.

Member

With this you can get all posts by the user at the same time as all posts with comments by the user: http://mongoengine-odm.readthedocs.org/en/latest/guide/querying.html#advanced-queries

Member

(But it will not necessarily help you much here.)

Member

But you will probably have to do a server-side query. Have you tried asking the question on StackOverflow?
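
One possible shape for the query requested in this thread is sketched below. It is an illustration, not code from the PR. It assumes the field names visible in the diff (`Post.author`, `Post.comments` with an `author` on each comment, `User.connection_last_unsubscribe`) and relies on MongoEngine's `distinct()` and the `__lte` / `__nin` query operators. The function name `stale_lazy_users` and its parameters are hypothetical, and `distinct()` may return either documents or raw ids depending on the MongoEngine version, so the sketch normalizes both.

```python
import datetime


def stale_lazy_users(user_model, post_model, now, expiration_days):
    """Build a query for users with no content who have been inactive too long.

    user_model and post_model are expected to be MongoEngine document classes
    (account_models.User and api_models.Post in the PR); they are passed in
    so that this sketch does not depend on the project's modules.
    """
    # Users whose last unsubscribe is at or before the cutoff are expired.
    cutoff = now - datetime.timedelta(days=expiration_days)

    # Let the database collect the authors of posts and of embedded comments.
    authors = list(post_model.objects.distinct('author'))
    authors += list(post_model.objects.distinct('comments.author'))

    # Assumption: distinct() yields either documents (with .pk) or plain ids.
    author_ids = list({getattr(author, 'pk', author) for author in authors})

    # Only expired users without content are transferred to Python.
    return user_model.objects(
        connection_last_unsubscribe__lte=cutoff,
        pk__nin=author_ids,
    )
```

The task would then loop over the returned users, keep the authentication check in Python as the review suggests, and call `delete()` on the ones that are still lazy. The list of author ids is still sent to the database as part of the query, so this reduces the transfer rather than eliminating it.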

    users_with_content = []
    for post in api_models.Post.objects:
        users_with_content.append(post.author)
        for comment in post.comments:
            users_with_content.append(comment.author)
    users_with_content = list(set(users_with_content))
    for user in account_models.User.objects:
        if not user.is_authenticated() and (timezone.now() - user.connection_last_unsubscribe).days >= settings.LAZY_USER_EXPIRATION and user not in users_with_content:
            user.delete()
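
The expiration test in the condition above can be pulled into a small helper, sketched here as an illustration only. The name `is_expired` is hypothetical. The sketch also assumes that `connection_last_unsubscribe` may be unset (`None`) for some users, which the diff does not confirm; in that case the subtraction in the task would raise a `TypeError`, so the helper treats such users as not expired.

```python
import datetime


def is_expired(last_unsubscribe, now, expiration_days):
    """Return True if the user has been inactive for at least expiration_days."""
    # Assumption: a missing timestamp means we cannot tell, so do not expire.
    if last_unsubscribe is None:
        return False
    # Equivalent to (now - last_unsubscribe).days >= expiration_days
    # for timestamps that lie in the past.
    return now - last_unsubscribe >= datetime.timedelta(days=expiration_days)
```

In the task this would be called as `is_expired(user.connection_last_unsubscribe, timezone.now(), settings.LAZY_USER_EXPIRATION)`. Whether a user with no recorded unsubscribe time should be kept or deleted is a policy decision for the project.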
10 changes: 9 additions & 1 deletion piplmesh/settings.py
@@ -203,7 +203,9 @@
),
}

-CHECK_ONLINE_USERS_INTERVAL = 10
+CHECK_ONLINE_USERS_INTERVAL = 10 # seconds
+CLEAN_INACTIVE_USERS_INTERVAL = 1 # days
+LAZY_USER_EXPIRATION = 30 # days

CELERY_RESULT_BACKEND = 'mongodb'
CELERY_MONGODB_BACKEND_SETTINGS = {
@@ -221,6 +223,7 @@
BROKER_VHOST = 'celery'

CELERY_IMPORTS = (
'piplmesh.account.tasks',
Member

Order alphabetically.

'piplmesh.frontend.tasks',
)

@@ -230,6 +233,11 @@
        'schedule': datetime.timedelta(seconds=CHECK_ONLINE_USERS_INTERVAL),
        'args': (),
    },
    'clean_inactive_lazy_users': {
        'task': 'piplmesh.account.tasks.clean_inactive_lazy_users',
        'schedule': datetime.timedelta(days=CLEAN_INACTIVE_USERS_INTERVAL),
        'args': (),
    },
}

# A sample logging configuration. The only tangible logging