Alert Notification Process Description¶
When creating an alert policy, Observable Insight supports configuring different notification sending intervals for alerts triggered at different levels under the same policy. However, because there are two parameters, group_interval
and repeat_interval
, in the native configuration of Alertmanager, the actual sending interval of alert notifications may deviate.
Parameter Configuration¶
In the Alertmanager configuration, set the following:
Parameter descriptions:
-
group_wait
: Sets the waiting time for sending alert notifications. When Alertmanager receives a group of alerts, if no more alerts are received within the time specified bygroup_wait
, Alertmanager will wait for a certain amount of time to collect more alerts with the same labels and content, and add all qualifying alerts to the same notification. -
group_interval
: Sets the waiting time before a group of alerts are merged into a single notification. If no more alerts from the same group are received during this time, Alertmanager sends a notification containing all received alerts. -
repeat_interval
: Sets the interval for resending alert notifications. After Alertmanager sends an alert notification to a receiver, if it continues to receive alerts with the same labels and content within the time specified byrepeat_interval
, Alertmanager will resend the alert notification.
When the group_wait
, group_interval
, and repeat_interval
parameters are set simultaneously, Alertmanager handles the alert notifications under the same group as follows:
-
When Alertmanager receives qualifying alerts, it waits for at least the time specified in the
group_wait
parameter to collect more alerts with the same labels and content, and adds all qualifying alerts to the same notification. -
If no more alerts are received during the
group_wait
time, Alertmanager sends all received alerts to the receiver after that time. If other qualifying alerts arrive during this period, Alertmanager continues to wait until all alerts are collected or until a timeout occurs. -
If more alerts with the same labels and content are received within the
group_interval
parameter, these new alerts are also merged into the previous notification and sent together. If there are still unsent alerts after thegroup_interval
time, Alertmanager starts a new timing cycle and waits for more alerts until thegroup_interval
time is reached again or new alerts are received. -
If Alertmanager keeps receiving alerts with the same labels and content within the time specified by
repeat_interval
, it will resend the alert notifications that have already been sent before. When resending alert notifications, Alertmanager does not wait forgroup_wait
orgroup_interval
, but sends notifications repeatedly according to the time interval specified byrepeat_interval
. -
If there are still unsent alerts after the
repeat_interval
time, Alertmanager starts a new timing cycle and continues to wait for new alerts with the same labels and content. This process continues until there are no new alerts or Alertmanager is stopped.
Example¶
In the following example, Alertmanager assigns all alerts with CPU usage above the threshold to a policy named "critical_alerts".
groups:
- name: critical_alerts
rules:
- alert: HighCPUUsage
expr: node_cpu_seconds_total{mode="idle"} < 50
for: 5m
labels:
severity: critical
annotations:
summary: "High CPU usage detected on instance {{ $labels.instance }}"
group_by: [rulename]
group_wait: 30s
group_interval: 5m
repeat_interval: 1h
In this case:
-
When Alertmanager receives an alert, it waits for at least 30 seconds to collect more alerts with the same labels and content, and adds them to the same notification.
-
If more alerts with the same labels and content are received within 5 minutes, these new alerts are merged into the previous notification and sent together. If there are still unsent alerts after 15 minutes, Alertmanager starts a new timing cycle and waits for more alerts until 5 minutes are reached again or new alerts are received.
-
If Alertmanager keeps receiving alerts with the same labels and content within 1 hour, it will resend the alert notifications that have already been sent before.