Circonus can create alerts for you based on defined rulesets. These rulesets let you set up thresholds that trigger alerts of different severities. Once you’ve set up a ruleset, you can add a contact group to it so that you are notified via email, SMS, Jabber, IRC, PagerDuty, or any of many other methods.
Alerts are triggered when metric values violate a rule configured in the system. To create an alert, you must create a ruleset. Start by adding a rule to one of your check's metrics:
- Select a check from the "Checks" page, and click on the down arrow in the rightmost column to expand the selected check:
- Now click "View Check Details" from this expanded view to open the check view page:
- From the check view page, you can now select a metric from the list of metrics for that check and click the down arrow to expand details for that metric:
- Click "Set Rules" to open the "Rulesets" page with information for this metric.
- From the Rulesets page, you can add new rules. You use this same interface when editing an existing ruleset, and you can also reach it from the "Monitor" section of the main menu. Rules are created on individual metrics. You may configure more than one rule per metric; together, these rules form the metric's ruleset. Rules are processed in order: the first rule found in violation triggers the alert, and rule processing stops. Click the "Add Rule +" button to open a menu:
There are four items you can change on the "Add a Rule" dialog:
- The first select box is the rule to evaluate. Here we selected "is present and higher than" to compare our incoming value with the value we provide in the next box.
- The text input field lets you enter the value with which you want to compare the metric. For example, we could say we want to be alerted if our duration exceeds 100ms.
- Next we select what severity level this rule will trigger. Circonus has 5 severity levels, 1 through 5, with 1 being the most severe.
- The last field lets us add a wait time. Circonus will create the alert in the UI immediately when a rule is violated, but it will wait the specified number of minutes before notifying any contact groups about the problem. This allows time for the issue to resolve on its own.
- Once you have configured your new rule, click "Add".
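The four dialog fields and the in-order, first-match processing described above can be sketched in a few lines of Python. This is only an illustration, not Circonus code: the `Rule` class, `present_and_higher_than`, `first_violation`, and the example thresholds are all hypothetical names and values chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    """Hypothetical stand-in for one row of the "Add a Rule" dialog."""
    criterion: Callable[[Optional[float], float], bool]  # the comparison to evaluate
    threshold: float   # value the metric is compared against
    severity: int      # 1 (most severe) through 5
    wait_minutes: int  # delay before contact groups are notified


def present_and_higher_than(value: Optional[float], threshold: float) -> bool:
    """Illustrative version of "is present and higher than"."""
    return value is not None and value > threshold


def first_violation(ruleset: List[Rule], value: Optional[float]) -> Optional[Rule]:
    """Walk the rules in order and stop at the first one that is violated."""
    for rule in ruleset:
        if rule.criterion(value, rule.threshold):
            return rule
    return None


# Duration rules in milliseconds (made-up values): the stricter rule comes first.
ruleset = [
    Rule(present_and_higher_than, 500.0, severity=1, wait_minutes=0),
    Rule(present_and_higher_than, 100.0, severity=3, wait_minutes=5),
]

hit = first_violation(ruleset, 250.0)
if hit is not None:
    print(f"severity {hit.severity}, notify after {hit.wait_minutes} min")
```

Because processing stops at the first violated rule, order matters in this sketch: if the 100 ms rule were listed first, a 900 ms duration would match it and the severity 1 rule would never be reached.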
Click the "Add Contact +" button to attach a contact group to be notified when an alert is triggered.
From the "Add a Contact Group" dialog, select the name of the contact group you want to notify, and then select which severity levels that group will be alerted about. You can attach any number of groups to a ruleset, each notified about any number of severities. If you do not attach a group, the alert will still be created for you to see in the UI.
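As a rough mental model of how severities map to contact groups, consider the following Python sketch. The group names, the `subscriptions` mapping, and the `groups_to_notify` function are hypothetical and exist only for illustration; they are not part of any Circonus API.

```python
from typing import Dict, List, Set

# Hypothetical contact groups mapped to the severity levels they subscribe to.
subscriptions: Dict[str, Set[int]] = {
    "on-call": {1, 2},
    "web-team": {1, 2, 3},
    "reports": {4, 5},
}


def groups_to_notify(severity: int, subs: Dict[str, Set[int]]) -> List[str]:
    """Return the groups subscribed to this severity, sorted by name."""
    return sorted(name for name, levels in subs.items() if severity in levels)


print(groups_to_notify(1, subscriptions))
print(groups_to_notify(3, {}))  # no groups attached: nobody is notified
```

An empty result here corresponds to the case where no group is attached: nobody is notified, but the alert itself would still exist in the UI.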
- Finally, note the "Metric Notes" section of the "Rulesets" page. In our example, we provide some brief detail about why this rule might alert, and a link to an internal wiki on how to resolve it. These notes and links will show up on the alerts in the UI, and the links are sent along with notifications to contact groups.
There are several other ruleset options that you can implement from this page:
- Latest Value - The value we last saw flow into the system for this metric.
- Derivative - If you want your rules to operate on the first order derivative, you can set this value to "derive" or "counter". Derivatives can become negative, while counters can handle rollover.
- Depends On - Dependencies let us create parent-child relationships with metrics. Establishing metric dependencies allows you to say "If the parent metric is in alert, don't tell me the child is also in alert." This prevents you from receiving redundant alerts. Circonus assumes a severity 1 alert to mean that the host is "down," so you can only create dependencies on metrics that have at least one severity 1 rule configured. Setting the parent is as simple as selecting a metric from the list. The system automatically saves this value when you change it. For example, here we have set our duration metric to depend on the HTTP status code because if we aren't serving a 200, then we don't care that our duration is too high.
- Metric Notes - You can create notes to give details on why this metric might trigger an alert. You might also provide a link to additional documentation detailing remediation actions. This is highly recommended.
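To make the "Derivative" and "Depends On" options more concrete, here is a minimal Python sketch of both ideas. It rests on assumptions that the text above does not confirm: the `counter` function treats any drop in value as a rollover and skips that sample, and the metric names `duration` and `code` are placeholders. The real behavior inside Circonus may differ.

```python
from typing import Dict, Optional, Set


def derive(v0: float, t0: float, v1: float, t1: float) -> float:
    """First-order derivative between two samples; may be negative."""
    return (v1 - v0) / (t1 - t0)


def counter(v0: float, t0: float, v1: float, t1: float) -> Optional[float]:
    """Like derive(), but a drop in value is treated as a rollover and
    the sample is skipped (assumed behavior for this sketch)."""
    rate = derive(v0, t0, v1, t1)
    return rate if rate >= 0 else None


def should_notify(metric: str, in_alert: Set[str], parents: Dict[str, str]) -> bool:
    """Suppress a child's notification while its parent metric is in alert."""
    if metric not in in_alert:
        return False
    parent = parents.get(metric)
    return parent is None or parent not in in_alert


# Hypothetical metric names: duration depends on the HTTP status code metric.
parents = {"duration": "code"}

print(derive(100, 0, 40, 60))    # value dropped, so the rate is negative
print(counter(100, 0, 40, 60))   # same samples, treated as a rollover
print(should_notify("duration", {"duration", "code"}, parents))
```

In the last call both metrics are in alert, so the child `duration` notification is suppressed while the parent `code` alert would still go out.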
You can watch a short video of the process here: Creating a Ruleset
You can also learn more about Analytics Alerts here.