Ees 5687 add resource utilisation metric alerts #5464

duncan-at-hiveit · 2024-12-12T19:55:50Z

Explain the motivation for making this change. What existing problem does the pull request solve?

You might also want to think about the following:

How did you solve complex parts of the problem? Other contributors may benefit from knowing more about your solution.
Are there any potential tradeoffs that need mentioning, or edge cases that have not been covered?
Have you added a reusable piece of functionality? Consider including some code snippets to better illustrate its usage.

Related changes

Are there any changes that were closely related to this change but not directly part of the main work e.g. refactoring, fixes, etc?

Screenshots

Are there any screenshots that help illustrate the change made?

ntsim · 2024-12-13T13:05:45Z

infrastructure/templates/public-api/components/alerts/appGateways/responseTimeAlert.bicep

-module alerts '../dynamicMetricAlert.bicep' = [for name in resourceNames: {
-  name: '${name}ResponseTimeAlertModule'
+module alerts '../baseResponseTimeAlert.bicep' = {
+  name: '${resourceNames[0]}ResponseTimeAlertModule'


Given we're only using the first resourceNames item to construct the name, it feels like the abstraction isn't quite right. The implication is that we intend this to only work with a single resource, so having a list of resourceNames seems a little confusing.

Should we change this so that we only work with a single resource by changing the resourceName to a string parameter?

We're doing a similar thing in a bunch of places throughout this PR so it'd be good to change this across the board if we make this change.
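To illustrate the suggestion, here is a minimal sketch of what responseTimeAlert.bicep might look like with a single string parameter. The parameter names passed to the base module (resourceNames, tagValues) and the resourceType and metricName values are assumptions for illustration, not taken from the PR.

```bicep
// Hypothetical sketch: the wrapper accepts one resource name instead of a list.
@description('Name of the resource that this alert is being applied to.')
param resourceName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../baseResponseTimeAlert.bicep' = {
  // The module name now comes from the single parameter rather than resourceNames[0].
  name: '${resourceName}ResponseTimeAlertModule'
  params: {
    // Assumption: the base module still takes a list, so the single name is wrapped.
    resourceNames: [resourceName]
    // Assumption: placeholder values, the real ones depend on the base module.
    resourceType: 'Microsoft.Network/applicationGateways'
    metricName: 'ApplicationGatewayTotalTime'
    tagValues: tagValues
  }
}
```

With this shape the parameter list matches what the module name already implies, which is one alert module per resource.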

ntsim · 2024-12-13T15:34:17Z

infrastructure/templates/public-api/components/alerts/appServicePlans/cpuPercentageAlert.bicep

+@description('Tags with which to tag the resource in Azure.')
+param tagValues object
+
+module alerts '../baseCpuPercentageAlert.bicep' = {


Are we potentially over-abstracting by having loads of extra modules that wrap baseCpuPercentageAlert.bicep?

It's not clear what extra value we're getting from the additional wrapping, and it feels like we're just copy-pasting the same set of parameters and configuration in a bunch of different modules. It also looks like we're generating a bunch of additional resource group deployments due to the extra wrapping, which is a little less clear to debug.

I'd potentially suggest we simplify things and remove the modules that are re-using the 'base' modules like baseCpuPercentageAlert.bicep. Instead, we can just use the base modules directly and avoid the extra abstraction layer.

Similar thing would apply across this PR.
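As a rough sketch of the simplification, a parent template could reference the base module directly instead of going through cpuPercentageAlert.bicep. The caller's location, the relative module path, the parameter names on the base module, and the resourceType and metricName values are all assumptions here.

```bicep
// Hypothetical caller template that uses the base module without a wrapper.
@description('Name of the App Service plan to alert on.')
param appServicePlanName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

// One deployment per alert, with no intermediate wrapper deployment in between.
module appServicePlanCpuAlert 'components/alerts/baseCpuPercentageAlert.bicep' = {
  name: '${appServicePlanName}CpuPercentageAlertModule'
  params: {
    resourceNames: [appServicePlanName]
    // Assumption: placeholder values for an App Service plan.
    resourceType: 'Microsoft.Web/serverfarms'
    metricName: 'CpuPercentage'
    tagValues: tagValues
  }
}
```

The resource-specific configuration then lives at the call site, and each alert should show up as a single nested deployment when debugging.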

ntsim · 2024-12-13T15:37:49Z

infrastructure/templates/public-api/components/alerts/baseCpuPercentageAlert.bicep

+param resourceNames string[]
+
+@description('Names of the resources that these alerts are being applied to.')
+param resourceType string


Could create and use a new CpuResourceType that is a union of the specific resources with CPUs (rather than every resource type). This would help with auto-completion and general type safety.
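A minimal sketch of the idea, using a Bicep user-defined type made up of string literals. The literal values are assumptions based on the resources mentioned in the commit messages; the real list would come from whichever resource types the alerts actually target.

```bicep
// Hypothetical union type limited to resources that expose a CPU metric.
type CpuResourceType = 'Microsoft.Web/serverfarms' | 'Microsoft.App/containerApps'

@description('Type of the resources that these alerts are being applied to.')
param resourceType CpuResourceType
```

Passing any other string should then fail at compile time rather than at deployment. If the type needs to be shared between modules, it could instead live in a common types file marked with @export() and be brought in with an import statement, assuming the Bicep version in use supports compile-time imports.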

ntsim · 2024-12-13T15:38:52Z

infrastructure/templates/public-api/components/alerts/baseCpuPercentageAlert.bicep

+param resourceType string
+
+@description('Names of the resources that these alerts are being applied to.')
+param metricName string


As above - could create a CpuMetricName type that could assist in auto-completion and general type safety.
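A similar sketch for the metric name. The literal values below are assumptions and would need checking against the metrics each resource type actually exposes in Azure Monitor.

```bicep
// Hypothetical union type of the CPU metric names the alerts are expected to use.
type CpuMetricName = 'CpuPercentage' | 'UsageNanoCores' | 'cpu_percent'

@description('Name of the CPU metric that these alerts monitor.')
param metricName CpuMetricName
```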

ntsim · 2024-12-13T15:40:05Z

infrastructure/templates/public-api/components/alerts/baseMemoryPercentageAlert.bicep

+@description('Names of the resources that these alerts are being applied to.')
+param resourceType string
+
+@description('Names of the resources that these alerts are being applied to.')
+param metricName string


Similar to baseCpuPercentageAlert.bicep, we could create some memory-specific types that can constrain these two parameters.

ntsim · 2024-12-13T15:40:17Z

infrastructure/templates/public-api/components/alerts/baseResponseTimeAlert.bicep

+@description('Names of the resources that these alerts are being applied to.')
+param resourceType string
+
+@description('Names of the resources that these alerts are being applied to.')
+param metricName string


Similar to baseCpuPercentageAlert.bicep, we could create some response time-specific types that can constrain these two parameters.

ntsim · 2024-12-13T15:42:38Z

infrastructure/templates/public-api/components/alerts/dynamicMetricAlert.bicep

+param sensitivity Sensitivity = 'Low'

-param minFailingPeriodsToAlert int = 1
+param minFailingPeriodsToAlert int = 3

-param numberOfEvaluationPeriods int = 1
+param numberOfEvaluationPeriods int = 3


Minor, but shouldn't these changes be in the EES-5686 PR given we introduce dynamicMetricAlert.bicep there?

Doesn't make a lot of sense to have a follow-up PR that immediately changes the previous one.

duncan-at-hiveit marked this pull request as ready for review December 12, 2024 20:05

duncan-at-hiveit force-pushed the EES-5687-add-resource-utilisation-metric-alerts branch 2 times, most recently from 2cb2aa9 to 3eca3d8 on December 12, 2024 20:30

ntsim reviewed Dec 13, 2024

duncan-at-hiveit force-pushed the EES-5687-add-resource-utilisation-metric-alerts branch from 3eca3d8 to a3e1b44 on December 20, 2024 10:07

duncan-at-hiveit force-pushed the EES-5686-add-public-api-performance-metric-alerts branch 2 times, most recently from aaaccaf to 6cefbfc on December 20, 2024 17:06

duncan-at-hiveit added 6 commits December 20, 2024 17:11

EES-5687 - added CPU and memory percentage alerts for app service plans (and the Data Processor Function App)

611cd0f

EES-5687 - added CPU and memory percentage alerts for Container Apps

8f16262

EES-5687 - renamed some alerts files to be consistently named

0680b32

EES-5687 - added various additional alerts for PostgreSQL Flexible Server

7940582

EES-5687 - reduced various common alert types to a set of "base" alert configuration inherited by different resource-specific alerts

ff4f967

EES-5687 - updated references to renamed folder

dd56fa7

duncan-at-hiveit force-pushed the EES-5687-add-resource-utilisation-metric-alerts branch from a3e1b44 to dd56fa7 on December 20, 2024 17:15
