EES-5687 Add resource utilisation metric alerts #5464

Open
wants to merge 6 commits into
base: EES-5686-add-public-api-performance-metric-alerts
Original file line number Diff line number Diff line change
@@ -148,6 +148,8 @@ module apiContainerAppModule '../../components/containerApp.bicep' = {
alerts: deployAlerts ? {
restarts: true
responseTime: true
cpuPercentage: true
memoryPercentage: true
alertsGroupName: resourceNames.existingResources.alertsGroup
} : null
tagValues: tagValues
@@ -114,6 +114,8 @@ module dataProcessorFunctionAppModule '../../components/functionApp.bicep' = {
storageFirewallRules: storageFirewallRules
alerts: deployAlerts ? {
functionAppHealth: true
cpuPercentage: true
memoryPercentage: true
storageAccountAvailability: true
storageLatency: true
fileServiceAvailability: true
@@ -62,6 +62,11 @@ module postgreSqlServerModule '../../components/postgresqlDatabase.bicep' = {
availability: true
queryTime: true
transactionTime: true
clientConnectionsWaiting: true
cpuPercentage: true
diskBandwidth: true
diskIops: true
memoryPercentage: true
alertGroupName: resourceNames.existingResources.alertsGroup
} : null
tagValues: tagValues
@@ -1,32 +1,19 @@
import { Severity } from '../types.bicep'

@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('The alert severity.')
param severity Severity = 'Warning'

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../dynamicMetricAlert.bicep' = [for name in resourceNames: {
name: '${name}ResponseTimeAlertModule'
module alerts '../baseResponseTimeAlert.bicep' = {
name: '${resourceNames[0]}ResponseTimeAlertModule'
params: {
Collaborator

Given we're only using the first resourceNames item to construct the name, feels like the abstraction isn't quite right. The implication is that we intend this to only work with a single resource, so having a list of resourceNames seems a little confusing.

Should we change this so that we only work with a single resource by changing the resourceName to a string parameter?

We're doing a similar thing in a bunch of places throughout this PR so it'd be good to change this across the board if we make this change.

alertName: '${name}-response-time'
resourceIds: [resourceId('Microsoft.Network/applicationGateways', name)]
resourceNames: resourceNames
resourceType: 'Microsoft.Network/applicationGateways'
query: {
metric: 'ApplicationGatewayTotalTime'
aggregation: 'Average'
operator: 'GreaterThan'
}
evaluationFrequency: 'PT1M'
windowSize: 'PT5M'
severity: severity
metricName: 'ApplicationGatewayTotalTime'
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}]
}
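
The single-resource suggestion in the review comment above could look roughly like the following. This is a minimal sketch, not the PR's code: the `resourceName` parameter is hypothetical, and the base module is assumed to keep its current `resourceNames` array parameter, so the wrapper passes a one-element array.

```bicep
@description('Name of the resource that this alert is being applied to.')
param resourceName string // hypothetical replacement for `resourceNames string[]`

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../baseResponseTimeAlert.bicep' = {
  // The deployment name no longer depends on an arbitrary first array element.
  name: '${resourceName}ResponseTimeAlertModule'
  params: {
    // The base module still expects an array, so wrap the single name.
    resourceNames: [resourceName]
    resourceType: 'Microsoft.Network/applicationGateways'
    metricName: 'ApplicationGatewayTotalTime'
    alertsGroupName: alertsGroupName
    tagValues: tagValues
  }
}
```

If the base modules were also narrowed to a single name, the `for` loop inside them could be dropped as well; that would be a separate change.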
@@ -0,0 +1,19 @@
@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../baseCpuPercentageAlert.bicep' = {
Collaborator

Are we potentially over-abstracting by having loads of extra modules that wrap baseCpuPercentageAlert.bicep?

It's not clear what extra value we're getting from the additional wrapping, and it feels like we're just copy-pasting the same set of parameters and configuration into a bunch of different modules. It also looks like we're generating a bunch of additional resource group deployments due to the extra wrapping, which makes things a little harder to debug.

I'd potentially suggest we simplify things and remove the modules that are re-using the 'base' modules like baseCpuPercentageAlert.bicep. Instead, we can just use the base modules directly and avoid the extra abstraction layer.

Similar thing would apply across this PR.

name: '${resourceNames[0]}CpuPercentageAlertModule'
params: {
resourceNames: resourceNames
resourceType: 'Microsoft.Web/serverfarms'
metricName: 'CpuPercentage'
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}
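
To illustrate the simplification proposed in the review comment above, a component could reference the base module directly rather than going through a per-resource wrapper. The following is a sketch under assumptions: the `appServicePlanName` and `deployCpuPercentageAlert` parameters and the relative module path are hypothetical and would need to match the real component.

```bicep
@description('Name of the App Service plan to monitor. Hypothetical parameter for this sketch.')
param appServicePlanName string

@description('Whether to deploy the CPU percentage alert. Hypothetical parameter for this sketch.')
param deployCpuPercentageAlert bool = true

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

// The path is an assumption; adjust it to wherever the base module sits relative to the caller.
module cpuPercentageAlert 'alerts/baseCpuPercentageAlert.bicep' = if (deployCpuPercentageAlert) {
  name: '${appServicePlanName}CpuPercentageAlertModule'
  params: {
    resourceNames: [appServicePlanName]
    resourceType: 'Microsoft.Web/serverfarms'
    metricName: 'CpuPercentage'
    alertsGroupName: alertsGroupName
    tagValues: tagValues
  }
}
```

This removes one nested deployment per alert, at the cost of repeating the resource type and metric name at each call site.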
@@ -0,0 +1,19 @@
@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../baseMemoryPercentageAlert.bicep' = {
name: '${resourceNames[0]}MemoryPercentageAlertModule'
params: {
resourceNames: resourceNames
resourceType: 'Microsoft.Web/serverfarms'
metricName: 'MemoryPercentage'
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}
@@ -0,0 +1,38 @@
import { Severity } from 'types.bicep'

@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('Type of the resources that these alerts are being applied to.')
param resourceType string
Collaborator

Could create and use a new CpuResourceType that is a union of the specific resources with CPUs (rather than every resource type). This would help with auto-completion and general type safety.


@description('Name of the metric that these alerts monitor.')
param metricName string
Collaborator

As above - could create a CpuMetricName type that could assist in auto-completion and general type safety.


@description('The alert severity.')
param severity Severity = 'Warning'

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts 'dynamicMetricAlert.bicep' = [for name in resourceNames: {
name: '${name}CpuPercentBaseAlertModule'
params: {
alertName: '${name}-cpu-percentage'
resourceIds: [resourceId(resourceType, name)]
resourceType: resourceType
query: {
metric: metricName
aggregation: 'Average'
operator: 'GreaterThan'
}
evaluationFrequency: 'PT5M'
windowSize: 'PT15M'
severity: severity
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}]
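
The two review comments above on `resourceType` and `metricName` could be addressed with literal union types. This is a sketch only: the type names `CpuResourceType` and `CpuMetricName` come from the review comments, and the allowed values are limited to those that appear in this PR's wrapper modules. The types are declared locally here to keep the example self-contained; they could instead live in `types.bicep` alongside `Severity`.

```bicep
// Hypothetical types: only the resource types that this PR attaches CPU alerts to.
type CpuResourceType = 'Microsoft.App/containerApps' | 'Microsoft.DBforPostgreSQL/flexibleServers' | 'Microsoft.Web/serverfarms'

// Hypothetical type: the CPU metric names used by those resource types in this PR.
type CpuMetricName = 'CpuPercentage' | 'cpu_percent'

@description('Type of the resources that these alerts are being applied to.')
param resourceType CpuResourceType

@description('Name of the CPU metric that these alerts monitor.')
param metricName CpuMetricName
```

With these in place, passing an unsupported resource type or a mistyped metric name should be flagged at compile time rather than at deployment.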
@@ -0,0 +1,38 @@
import { Severity } from 'types.bicep'

@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('Type of the resources that these alerts are being applied to.')
param resourceType string

@description('Name of the metric that these alerts monitor.')
param metricName string
Comment on lines +6 to +10
Collaborator

Similar to baseCpuPercentageAlert.bicep, we could create some memory-specific types that can constrain these two parameters.


@description('The alert severity.')
param severity Severity = 'Warning'

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts 'dynamicMetricAlert.bicep' = [for name in resourceNames: {
name: '${name}MemoryPercentBaseAlertModule'
params: {
alertName: '${name}-memory-percentage'
resourceIds: [resourceId(resourceType, name)]
resourceType: resourceType
query: {
metric: metricName
aggregation: 'Average'
operator: 'GreaterThan'
}
evaluationFrequency: 'PT5M'
windowSize: 'PT15M'
severity: severity
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}]
@@ -0,0 +1,38 @@
import { Severity } from 'types.bicep'

@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('Type of the resources that these alerts are being applied to.')
param resourceType string

@description('Name of the metric that these alerts monitor.')
param metricName string
Comment on lines +6 to +10
Collaborator

Similar to baseCpuPercentageAlert.bicep, we could create some response time-specific types that can constrain these two parameters.


@description('The alert severity.')
param severity Severity = 'Warning'

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts 'dynamicMetricAlert.bicep' = [for name in resourceNames: {
name: '${name}ResponseTimeBaseAlertModule'
params: {
alertName: '${name}-response-time'
resourceIds: [resourceId(resourceType, name)]
resourceType: resourceType
query: {
metric: metricName
aggregation: 'Average'
operator: 'GreaterThan'
}
evaluationFrequency: 'PT5M'
windowSize: 'PT15M'
severity: severity
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}]
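
For the response time-specific types mentioned in the review comment above, one option is to export them from the shared types file so that the base module can import them next to `Severity`. The sketch below shows only the hypothetical additions to `types.bicep`; the type names are illustrative, and the values are the resource types and metric names that the response time wrappers in this PR pass in.

```bicep
// Hypothetical additions to types.bicep.

@export()
type ResponseTimeResourceType = 'Microsoft.App/containerApps' | 'Microsoft.Network/applicationGateways'

@export()
type ResponseTimeMetricName = 'ApplicationGatewayTotalTime' | 'ResponseTime'
```

The base module would then extend its existing import to something like `import { Severity, ResponseTimeResourceType, ResponseTimeMetricName } from 'types.bicep'` and use the two types on its `resourceType` and `metricName` parameters.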
@@ -0,0 +1,19 @@
@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../baseCpuPercentageAlert.bicep' = {
name: '${resourceNames[0]}CpuPercentageAlertModule'
params: {
resourceNames: resourceNames
resourceType: 'Microsoft.App/containerApps'
metricName: 'CpuPercentage'
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}
@@ -0,0 +1,19 @@
@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../baseMemoryPercentageAlert.bicep' = {
name: '${resourceNames[0]}MemoryPercentageAlertModule'
params: {
resourceNames: resourceNames
resourceType: 'Microsoft.App/containerApps'
metricName: 'MemoryPercentage'
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}
@@ -1,32 +1,19 @@
import { Severity } from '../types.bicep'

@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('The alert severity.')
param severity Severity = 'Warning'

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../dynamicMetricAlert.bicep' = [for name in resourceNames: {
name: '${name}LatencyAlertModule'
module alerts '../baseResponseTimeAlert.bicep' = {
name: '${resourceNames[0]}ResponseTimeAlertModule'
params: {
alertName: '${name}-latency'
resourceIds: [resourceId('Microsoft.App/containerApps', name)]
resourceNames: resourceNames
resourceType: 'Microsoft.App/containerApps'
query: {
metric: 'ResponseTime'
aggregation: 'Average'
operator: 'GreaterThan'
}
evaluationFrequency: 'PT1M'
windowSize: 'PT5M'
severity: severity
metricName: 'ResponseTime'
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}]
}
@@ -0,0 +1,32 @@
import { Severity } from '../types.bicep'

@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('The alert severity.')
param severity Severity = 'Warning'

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../dynamicMetricAlert.bicep' = [for name in resourceNames: {
name: '${name}ClientConnectionsWaitingAlertModule'
params: {
alertName: '${name}-client-connections-waiting'
resourceIds: [resourceId('Microsoft.DBforPostgreSQL/flexibleServers', name)]
resourceType: 'Microsoft.DBforPostgreSQL/flexibleServers'
query: {
metric: 'client_connections_waiting'
aggregation: 'Maximum'
operator: 'GreaterThan'
}
evaluationFrequency: 'PT5M'
windowSize: 'PT15M'
severity: severity
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}]
@@ -0,0 +1,19 @@
@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../baseCpuPercentageAlert.bicep' = {
name: '${resourceNames[0]}CpuPercentageAlertModule'
params: {
resourceNames: resourceNames
resourceType: 'Microsoft.DBforPostgreSQL/flexibleServers'
metricName: 'cpu_percent'
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}
@@ -0,0 +1,32 @@
import { Severity } from '../types.bicep'

@description('Names of the resources that these alerts are being applied to.')
param resourceNames string[]

@description('The alert severity.')
param severity Severity = 'Warning'

@description('Name of the Alerts Group used to send alert messages.')
param alertsGroupName string

@description('Tags with which to tag the resource in Azure.')
param tagValues object

module alerts '../dynamicMetricAlert.bicep' = [for name in resourceNames: {
name: '${name}DiskBandwidthAlertModule'
params: {
alertName: '${name}-disk-bandwidth'
resourceIds: [resourceId('Microsoft.DBforPostgreSQL/flexibleServers', name)]
resourceType: 'Microsoft.DBforPostgreSQL/flexibleServers'
query: {
metric: 'disk_bandwidth_consumed_percentage'
aggregation: 'Average'
operator: 'GreaterThan'
}
evaluationFrequency: 'PT5M'
windowSize: 'PT15M'
severity: severity
alertsGroupName: alertsGroupName
tagValues: tagValues
}
}]