-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OSIDB-3579: Implement alert versioning to reduce database locks #856
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR looks mostly good to me, I just requested a change about the collector because I strongly believe it will take more than 1 hour to this collector do finish, but overall, good job!
eb3756c
to
985bb5e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of remarks, but otherwise LGTM! Really great job. I like the query creativity.
|
||
logger.info(f"Found {filtered_queryset.count()} stale alerts") | ||
|
||
deleted_alerts_count = filtered_queryset.delete()[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking about atomicity, do we need to somehow enforce it here? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I'm not sure about it, alerts are not modified once they go stale, so nothing should conflict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This job should be the only piece of code performing write operations on the stale alerts so there should be no danger of any conflict. I do not see a reason for handling the atomicity. And as the delete
is performed on the queryset
I would expect that it is atomic anyway 😄
3d7faf0
to
25f3bf4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is great! Most of it LGTM. I have some complains I think should be addressed regarding the location of the module and that it uses the collector framework.
class OSIDBScheduler(AppConfig): | ||
"""OSIDB scheduler service""" | ||
|
||
name = "collectors.osidb_scheduler" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that collectors
is the right place. The collectors are meant to bring some data to OSIDB from external sources which does not happen here. I assume that this module should run rather internal tasks like housekeeping here or some checks but not really a data collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, that is probably my fault as I pointed @MrMarble this way. I am aware that we don't yet have anything in OSIDB being scheduled to do work periodically without downloading data from anywhere so my initial thought was to just reuse the collectors framework.
logger = get_task_logger(__name__) | ||
|
||
|
||
class StaleAlertCollector(Collector): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The collector framework is too have for this periodic job. It was designed for the data collection so the model has attributes like collected data status, up until when the data is complete, etc. while nothing like this is necessary here. You just perform full cleanup every time this job runs and that is it. No information to store needed, no status etc. Just a simple periodic Celery task should be enough or not? We do not have any framework for simple internal tasks like this (to eg. report status on API if needed etc.) but also we currently do not have a need for one 😄 so I would think that Celery should be enough.
|
||
logger.info(f"Found {filtered_queryset.count()} stale alerts") | ||
|
||
deleted_alerts_count = filtered_queryset.delete()[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This job should be the only piece of code performing write operations on the stale alerts so there should be no danger of any conflict. I do not see a reason for handling the atomicity. And as the delete
is performed on the queryset
I would expect that it is atomic anyway 😄
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |||
## Unreleased | |||
### Changed | |||
- Use keywords from ps-constants in CVEorg collector (OSIDB-3694) | |||
- Added last_validated_dt field to AlertMixin and created_dt to alerts; | |||
optimized alert handling by matching valid alerts via timestamps | |||
and scheduling stale alert removal to reduce database locks. (OSIDB-3579) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick (feel free to ignore): The CHANGELOG should be primarily for the user (the developers have Git history) and I am not sure they could have an idea what those attribute names stand for 😄 So describing the reasons or effects of the change is more important - or sometimes the CHANGELOG record could be IMO fully omitted if we do some purely internal changes which do not affect the user or in some totally indirect way.
last_validated_dt
to theAlertMixin
andcreated_dt
to theAlert
Model.alerts.all.delete()
method call to prevent database locksAlertSerializer
to filter out alerts with acreated_dt
older than thelast_validated_dt
of the object they are relatedStaleAlertCollector
collector to clean up stale alerts every hourCloses OSIDB-3579