Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSIDB-3579: Implement alert versioning to reduce database locks #856

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

MrMarble
Copy link
Member

@MrMarble MrMarble commented Dec 10, 2024

  • Added new field last_validated_dt to the AlertMixin and created_dt to the Alert Model.
  • Remove the alerts.all.delete() method call to prevent database locks
  • Modify the AlertSerializer to filter out alerts with a created_dt older than the last_validated_dt of the object they are related
  • Created a StaleAlertCollector collector to clean up stale alerts every hour

Closes OSIDB-3579

@MrMarble MrMarble self-assigned this Dec 10, 2024
@MrMarble MrMarble marked this pull request as ready for review December 10, 2024 15:21
@MrMarble MrMarble requested a review from a team December 10, 2024 15:21
Copy link
Contributor

@costaconrado costaconrado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks mostly good to me, I just requested a change about the collector because I strongly believe it will take more than 1 hour to this collector do finish, but overall, good job!

@MrMarble MrMarble force-pushed the OSIDB-3579 branch 4 times, most recently from eb3756c to 985bb5e Compare December 11, 2024 12:29
Copy link
Contributor

@costaconrado costaconrado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@MrMarble MrMarble changed the title OSIDB-3579: Implment alert versioning to reduce database locks OSIDB-3579: Implement alert versioning to reduce database locks Dec 11, 2024
Copy link
Contributor

@JakubFrejlach JakubFrejlach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of remarks, but otherwise LGTM! Really great job. I like the query creativity.

collectors/osidb_scheduler/collectors.py Show resolved Hide resolved

logger.info(f"Found {filtered_queryset.count()} stale alerts")

deleted_alerts_count = filtered_queryset.delete()[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about atomicity, do we need to somehow enforce it here? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I'm not sure about it, alerts are not modified once they go stale, so nothing should conflict

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This job should be the only piece of code performing write operations on the stale alerts so there should be no danger of any conflict. I do not see a reason for handling the atomicity. And as the delete is performed on the queryset I would expect that it is atomic anyway 😄

collectors/osidb_scheduler/tests/test_collectors.py Outdated Show resolved Hide resolved
collectors/osidb_scheduler/tests/test_collectors.py Outdated Show resolved Hide resolved
collectors/osidb_scheduler/tests/test_collectors.py Outdated Show resolved Hide resolved
@MrMarble MrMarble force-pushed the OSIDB-3579 branch 2 times, most recently from 3d7faf0 to 25f3bf4 Compare December 11, 2024 14:40
Copy link
Contributor

@osoukup osoukup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is great! Most of it LGTM. I have some complains I think should be addressed regarding the location of the module and that it uses the collector framework.

class OSIDBScheduler(AppConfig):
"""OSIDB scheduler service"""

name = "collectors.osidb_scheduler"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think that collectors is the right place. The collectors are meant to bring some data to OSIDB from external sources which does not happen here. I assume that this module should run rather internal tasks like housekeeping here or some checks but not really a data collection.

Copy link
Contributor

@JakubFrejlach JakubFrejlach Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that is probably my fault as I pointed @MrMarble this way. I am aware that we don't yet have anything in OSIDB being scheduled to do work periodically without downloading data from anywhere so my initial thought was to just reuse the collectors framework.

logger = get_task_logger(__name__)


class StaleAlertCollector(Collector):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The collector framework is too have for this periodic job. It was designed for the data collection so the model has attributes like collected data status, up until when the data is complete, etc. while nothing like this is necessary here. You just perform full cleanup every time this job runs and that is it. No information to store needed, no status etc. Just a simple periodic Celery task should be enough or not? We do not have any framework for simple internal tasks like this (to eg. report status on API if needed etc.) but also we currently do not have a need for one 😄 so I would think that Celery should be enough.


logger.info(f"Found {filtered_queryset.count()} stale alerts")

deleted_alerts_count = filtered_queryset.delete()[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This job should be the only piece of code performing write operations on the stale alerts so there should be no danger of any conflict. I do not see a reason for handling the atomicity. And as the delete is performed on the queryset I would expect that it is atomic anyway 😄

@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## Unreleased
### Changed
- Use keywords from ps-constants in CVEorg collector (OSIDB-3694)
- Added last_validated_dt field to AlertMixin and created_dt to alerts;
optimized alert handling by matching valid alerts via timestamps
and scheduling stale alert removal to reduce database locks. (OSIDB-3579)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick (feel free to ignore): The CHANGELOG should be primarily for the user (the developers have Git history) and I am not sure they could have an idea what those attribute names stand for 😄 So describing the reasons or effects of the change is more important - or sometimes the CHANGELOG record could be IMO fully omitted if we do some purely internal changes which do not affect the user or in some totally indirect way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants