Alert Grouping
Learn how to create an alert group.
Alert grouping reduces noise and alert fatigue by consolidating related alerts into a single notification, making it easier for incident responders to focus on critical issues. When alerts are grouped together, responders will only be paged for the initial alert.
This improves response efficiency, enhances prioritization, and simplifies communication, ultimately leading to faster incident resolution and better overall system reliability. There are three conditions that can define how alerts are grouped:
- Alert routes: Alerts are grouped based on their route. For example, alerts routed to the same team will be grouped together.
- Time Window (also known as time-based): Alerts are grouped on a rolling window of time. For example, all new alerts triggered within a 10-minute timeframe will be grouped together.
- Content Matching: Alerts are grouped based on the value of specific fields, including title, urgency, and payload. For example, all alerts with the same alert title will be grouped together.
A common use case for alert grouping is an organization that has multiple monitors for a single service, such as a monitor for error rate, one for latency, one for CPU, and another for the database. If something goes wrong with that service, every related monitor may start sending alerts at once. With alert grouping, these alerts are grouped together so that the responder is paged only for the first alert that comes in, not for each monitor that gets triggered.
To create a new alert group in the web app:
- Navigate to Alerts > Grouping tab and click + New Alert Group.
- Enter a Name (required) and a Description (optional).
The Alert Route condition ensures that alerts will be grouped based on the triggered alert's route.
Step 1: Select your route condition.
First, select the route that should be used to consider an alert for a group.
- All services, teams and escalation policies will consider alerts regardless of their target route.
- All services will consider alerts that are routed to any service.
- All teams will consider alerts that are routed to any team.
- All escalation policies will consider alerts that are routed to any escalation policy.
- Select routes will consider alerts routed to a specific service, team, or escalation policy. For example, only group alerts that are routed to a specific team.
- To use this option, choose 'Select routes' in the first dropdown under 'Alert routes'.
- Then select the target service, team, or escalation policy that you would like to group alerts by.
Step 2: Select your group's route logic.
Next, define how the alerts should be grouped based on the alert's route. For example, you can define a group that will group alerts regardless of their routed service or only group alerts together if they're routed to the same service.
- Groups should only contain alerts for the same route: This ensures that alerts will only be grouped if they're routing to the same service, team, or escalation policy defined in step 1. For example, alerts routing to Service A will be grouped together, and alerts routing to Service B will be grouped together.
- Groups can contain alerts for any selected route: This will group alerts regardless of the destination service, team, or escalation policy defined in step 1. For example, any alert routed to any team will be grouped together.
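The two route logic options can be sketched as a choice of grouping key. This is a minimal illustration, not Rootly's implementation: the `Alert` class, the route strings such as `service:A`, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Alert:
    title: str
    route: str  # hypothetical route identifier, e.g. "service:A" or "team:payments"


def route_group_key(alert: Alert,
                    selected_routes: Optional[Set[str]],
                    same_route_only: bool) -> Optional[str]:
    """Return the key an alert is grouped under, or None if it is not considered."""
    # Step 1: the route condition decides whether the alert is considered at all.
    # None stands in for "all routes" in this sketch.
    if selected_routes is not None and alert.route not in selected_routes:
        return None
    # Step 2: the route logic decides whether routes are kept apart or mixed.
    return alert.route if same_route_only else "any-selected-route"


def group_by_route(alerts: List[Alert],
                   selected_routes: Optional[Set[str]] = None,
                   same_route_only: bool = True) -> Dict[str, List[Alert]]:
    """Bucket alerts by their route-based grouping key."""
    groups: Dict[str, List[Alert]] = {}
    for alert in alerts:
        key = route_group_key(alert, selected_routes, same_route_only)
        if key is not None:
            groups.setdefault(key, []).append(alert)
    return groups
```

With `same_route_only=True`, alerts for Service A and Service B land in separate groups. With `same_route_only=False`, every considered alert shares one group.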
- The group's time window defines how long alerts should be grouped together before creating a new group for new incoming alerts.
- The time window is rolling: it is measured from when the most recent alert was added to the group.
- For example, with a 10-minute time window, a group continues to accept new alerts until 10 minutes pass without any new alert being added to the group.
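The rolling behavior can be illustrated with a short sketch. It assumes alerts are represented only by their arrival timestamps, and that an alert arriving exactly at the window boundary still joins the group. Both are assumptions made for illustration, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def group_by_rolling_window(timestamps: List[datetime],
                            window: timedelta) -> List[List[datetime]]:
    """Split alert timestamps into groups using a rolling time window.

    An alert joins the current group if it arrives within `window` of the
    most recent alert in that group; otherwise it starts a new group.
    """
    groups: List[List[datetime]] = []
    for ts in sorted(timestamps):
        # Compare against the last alert added, not the first, so the
        # window keeps extending while alerts continue to arrive.
        if groups and ts - groups[-1][-1] <= window:
            groups[-1].append(ts)
        else:
            groups.append([ts])
    return groups
```

For example, with a 10-minute window, alerts arriving at 0, 4, and 12 minutes form one group because each arrives within 10 minutes of the previous one, while an alert at 30 minutes starts a new group.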
- Content Matching allows for more granularity to define the conditions under which alerts get grouped together.
- Alert Title can be used to group alerts that come in with the same title.
- Alert Urgency can be used to group alerts by urgency (high, medium, low).
- Payload can be used when you want to group alerts based on a specific field from your payload.
- Example: To group alerts based on a specific alert feature in your payload, the payload field path may look similar to $.alert.feature
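To make the payload example concrete, the sketch below resolves a path such as `$.alert.feature` against a payload and groups alerts by the resulting value. It handles only simple dotted paths, not full path-expression syntax. The payload shape, the helper names, and the choice to leave alerts without the field ungrouped are all assumptions for illustration.

```python
from typing import Any, Dict, List, Optional


def resolve_path(payload: Dict[str, Any], path: str) -> Optional[Any]:
    """Resolve a simple dotted path like '$.alert.feature' in a nested dict.

    Returns None when any segment of the path is missing.
    """
    if not path.startswith("$."):
        raise ValueError("path must start with '$.'")
    current: Any = payload
    for key in path[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def group_by_payload_field(alerts: List[Dict[str, Any]],
                           path: str) -> Dict[Any, List[Dict[str, Any]]]:
    """Group alerts whose payloads share the same value at `path`.

    Assumes the value at `path` is a hashable scalar such as a string.
    """
    groups: Dict[Any, List[Dict[str, Any]]] = {}
    for alert in alerts:
        value = resolve_path(alert.get("payload", {}), path)
        if value is not None:  # alerts without the field stay ungrouped here
            groups.setdefault(value, []).append(alert)
    return groups
```

Given two alerts whose payloads both contain `{"alert": {"feature": "checkout"}}`, grouping on `$.alert.feature` places them in the same group, keyed by `"checkout"`.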
The initial alert in a group is considered the group's leader. The leader is the alert that initially paged the responder. Any matching grouped alerts will become members of the leader's group.
When a subsequent alert is grouped with a leader, the leader will act as the source of truth for all grouped alerts.
- Any new alerts that match the group will be automatically grouped under the leader. They will not page the responder.
- Any status changes to the group's leader will also update the statuses of all member alerts.
- You can review any individual alert's group from the alert in the Rootly dashboard under the 'Alert Group' tab.
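The leader and member behavior described above can be modeled roughly as follows. This is a simplified sketch under stated assumptions: the class names, the `ingest` helper, and the status strings such as "triggered" and "resolved" are hypothetical and do not reflect Rootly's data model or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Alert:
    title: str
    status: str = "triggered"  # hypothetical status value


@dataclass
class AlertGroup:
    leader: Alert
    members: List[Alert] = field(default_factory=list)

    def set_leader_status(self, status: str) -> None:
        """Update the leader and propagate the new status to every member."""
        self.leader.status = status
        for member in self.members:
            member.status = status


def ingest(group: Optional[AlertGroup], alert: Alert) -> Tuple[AlertGroup, bool]:
    """Add a matching alert to a group and report whether it should page.

    The first alert becomes the leader and pages the responder.
    Later alerts become members and do not page.
    """
    if group is None:
        return AlertGroup(leader=alert), True
    group.members.append(alert)
    return group, False
```

In this model, only the first call to `ingest` returns `True` for paging, and calling `set_leader_status` on the group updates every member to match the leader.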