monitoring PR handling failures. #42

Merged
merged 16 commits into from
Dec 13, 2024

Oded-B (Collaborator) commented Nov 29, 2024

This PR adds a few metrics to the Telefonistka process.

Description

Name:      "open_prs",
Help:      "The total number of open PRs",

Note: Just a metric to track activity in the repo.

Name:      "open_promotion_prs",
Help:      "The total number of open PRs with promotion label",

Note: Could be used to check whether we need to enforce merging/closing stale promotion PRs.

Name:      "open_prs_with_pending_telefonistka_checks",
Help:      "The total number of open PRs with pending Telefonistka checks(excluding PRs with very recent commits)",

Note: These represent cases where Telefonistka failed totally, without even marking the checks as failed.

Name:      "commit_status_updates_total",
Help:      "The total number of commit status updates, and their status (success/pending/failure)",

Note: We can use this to track failures Telefonistka is aware of.

Type of Change

  • Bug Fix
  • New Feature
  • Breaking Change
  • Refactor
  • Documentation
  • Other (please describe)

Checklist

  • I have read the contributing guidelines
  • Existing issues have been referenced (where applicable)
  • I have verified this change is not present in other open pull requests
  • Functionality is documented
  • All code style checks pass
  • New code contribution is covered by automated tests
  • All new and existing tests pass

We monitor both explicit failures, with pr_handle_failures_total,
and cases where the Telefonistka commit status check is left in the "pending"
state (because Telefonistka exploded while handling that event).
This way we can also monitor the successes, so we can get a ratio or
just the activity rate of the repo and such.
Express const timeframes in time (not int)
@Oded-B Oded-B marked this pull request as ready for review December 4, 2024 12:39
@Oded-B Oded-B marked this pull request as draft December 4, 2024 12:45
Improve error handling
Add context timeout
@Oded-B Oded-B marked this pull request as ready for review December 4, 2024 13:54
@Oded-B Oded-B marked this pull request as draft December 5, 2024 15:47
Oded-B (Collaborator, Author) commented Dec 5, 2024

time.After(60*time.Second)
context.WithTimeout(context.Background(), 60*time.Second)
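
Both calls in the comment above are standard-library ways to bound a wait. The sketch below shows the `context.WithTimeout` variant with the limit expressed as a `time.Duration` constant rather than a bare integer. `handleTimeout`, `slowOperation`, and `runWithTimeout` are hypothetical names, not identifiers from this PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// handleTimeout is a time.Duration, so the unit travels with the value.
// The 60-second figure comes from the comment above; the name is hypothetical.
const handleTimeout = 60 * time.Second

// slowOperation stands in for work such as handling a webhook event.
// It returns ctx.Err() if the context expires before the work completes.
func slowOperation(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runWithTimeout bounds slowOperation with a context deadline.
func runWithTimeout(timeout, work time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // release the timer even when the work finishes early
	return slowOperation(ctx, work)
}

func main() {
	// Short durations keep the demo fast; real code would pass handleTimeout.
	fmt.Println(runWithTimeout(50*time.Millisecond, time.Millisecond)) // prints <nil>

	err := runWithTimeout(time.Millisecond, time.Second)
	fmt.Println(errors.Is(err, context.DeadlineExceeded)) // prints true
}
```

A context deadline propagates to any callee that accepts the context, whereas `time.After` only bounds the single `select` it appears in.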

Oded-B (Collaborator, Author) commented Dec 9, 2024

Sample from Lab:

# HELP telefonistka_github_commit_status_updates_total The total number of commit status updates, and their status (success/pending/failure)
# TYPE telefonistka_github_commit_status_updates_total counter
telefonistka_github_commit_status_updates_total{repo_slug="commercetools/k8s-manifests-poc",status="error"} 1
telefonistka_github_commit_status_updates_total{repo_slug="commercetools/k8s-manifests-poc",status="pending"} 1
# HELP telefonistka_github_open_promotion_prs The total number of open PRs with promotion label
# TYPE telefonistka_github_open_promotion_prs gauge
telefonistka_github_open_promotion_prs{repo_slug="commercetools/cf-cd-poc-code"} 0
telefonistka_github_open_promotion_prs{repo_slug="commercetools/k8s-manifests-poc"} 10
# HELP telefonistka_github_open_prs The total number of open PRs
# TYPE telefonistka_github_open_prs gauge
telefonistka_github_open_prs{repo_slug="commercetools/cf-cd-poc-code"} 0
telefonistka_github_open_prs{repo_slug="commercetools/k8s-manifests-poc"} 21
# HELP telefonistka_github_open_prs_with_pending_telefonistka_checks The total number of open PRs with pending Telefonistka checks(excluding PRs with very recent commits)
# TYPE telefonistka_github_open_prs_with_pending_telefonistka_checks gauge
telefonistka_github_open_prs_with_pending_telefonistka_checks{repo_slug="commercetools/cf-cd-poc-code"} 0
telefonistka_github_open_prs_with_pending_telefonistka_checks{repo_slug="commercetools/k8s-manifests-poc"} 0

@Oded-B Oded-B marked this pull request as ready for review December 9, 2024 13:20
yzdann
yzdann previously approved these changes Dec 11, 2024
docs/observability.md (resolved)
docs/observability.md (resolved)
internal/pkg/githubapi/pr_metrics.go (outdated, resolved)
Comment on lines +37 to +39
if err != nil {
	log.Errorf("error getting PRs for %s/%s: %v", ghOwner, repo.GetName(), err)
}

Why not check the error first and then call prom.InstrumentGhCall(resp)?
Another question: if the output of prom.InstrumentGhCall(resp) is not important, why bother assigning it to something?

Collaborator Author

The InstrumentGhCall function returns labels just for testing.
During runtime these labels are just used internally in the function.
I chose to instrument before error handling to ensure we still instrument even if we change the error handling to break out of that code path in the future, but I can change the order if you feel it makes it less readable.

I chose to instrument before error handling to ensure we still instrument even if we change the error handling to break out of that code path in the future, but I can change the order if you feel it makes it less readable.

I would add that as a comment.
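
The ordering discussed in this thread can be sketched as follows. This is an illustration only: `instrumentCall`, `listPRs`, `fetch`, and the `callCounts` map are hypothetical stand-ins (the real code uses prom.InstrumentGhCall and a GitHub client response), and a plain map replaces the Prometheus counter so the example runs with the standard library alone.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// callCounts is a hypothetical stand-in for a counter vector keyed by status.
var callCounts = map[string]int{}

// instrumentCall records one API call and returns the label it used.
// The return value exists only so tests can assert on it.
func instrumentCall(resp *http.Response) string {
	label := "unknown" // no response at all, e.g. a network error
	if resp != nil {
		label = fmt.Sprint(resp.StatusCode)
	}
	callCounts[label]++
	return label
}

// listPRs is a hypothetical API call that can be made to fail.
func listPRs(fail bool) ([]string, *http.Response, error) {
	if fail {
		return nil, &http.Response{StatusCode: 500}, errors.New("server error")
	}
	return []string{"pr-1"}, &http.Response{StatusCode: 200}, nil
}

// fetch lists PRs and instruments the call regardless of its outcome.
func fetch(fail bool) ([]string, error) {
	prs, resp, err := listPRs(fail)
	// Instrument before the error check so the call is still counted if
	// the error path is later changed to return early.
	instrumentCall(resp) // return value is ignored at runtime
	if err != nil {
		return nil, fmt.Errorf("error getting PRs: %w", err)
	}
	return prs, nil
}

func main() {
	fetch(false)
	fetch(true)
	fmt.Println(callCounts["200"], callCounts["500"]) // prints 1 1
}
```

Because the instrumentation call sits above the `if err != nil` branch, the failed request is counted even though `fetch` returns early.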

internal/pkg/githubapi/pr_metrics.go (resolved)
internal/pkg/githubapi/pr_metrics.go (resolved)
internal/pkg/githubapi/pr_metrics_test.go (outdated, resolved)
Co-authored-by: Yazdan Mohammadi <[email protected]>
@Oded-B Oded-B requested a review from yzdann December 12, 2024 10:25
@yzdann yzdann left a comment

@Oded-B Oded-B merged commit 8439b61 into main Dec 13, 2024
5 checks passed
3 participants