Monitoring

AxonIQ Console provides a comprehensive monitoring solution for your Axon Framework applications. By defining conditions based on Framework Metrics, you can get notified when something goes wrong in your application. This way, you can take action before it becomes a problem.

Currently, monitoring is only available for Axon Framework applications. Monitoring for Axon Server instances is planned for a future release.

You can set up conditions for any metric available in AxonIQ Console. These metrics are collected automatically by the Axon Framework Client for AxonIQ Console. AxonIQ Console checks the conditions once per minute. If a threshold is exceeded, an alert is created.

Conditions

In the Monitoring tab, you can set up conditions for all instances of a resource type at once. For example, you can set up a condition that triggers an alert when the ingest latency of any event processor in any application exceeds a certain threshold.

Screenshot of the Monitoring Conditions screen in AxonIQ Console

You can add a condition to any resource type by clicking the "Add new condition" button. This adds a new condition to the list, which you can configure and then save. Each condition is a formula made up of the following parts:

| Field | Description | Possible values |
|---|---|---|
| Level | The level of the alerts, useful for filtering which integration receives which alerts | Incident, Critical, Major, Minor |
| Metric | The metric to check | Differs per resource, see Available metrics |
| Operator | The operator to use for the check | =, !=, >, <, >=, <= |
| Value | The value to compare the metric to | Any number |
| Percentile | If the metric is a timer, the percentile to check against. The 90th percentile is generally recommended | Minimum, Median, 90th, 95th, Maximum |
| Duration | The number of minutes the condition must hold before the alert is sent to the configured integrations. This helps prevent false positives | Any number |

The screen shows the condition in a readable format, which you can read as: "Create <level> when <metric> <operator> <value> for <duration> minutes". For example: "Create critical when segment claim percentage != 100% for 2 minutes". You can see this in the screen below.

Screenshot of the Monitoring Conditions screen in AxonIQ Console with a new condition being added
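To make the formula more concrete, the following minimal Python sketch models a condition and its duration check. It is not AxonIQ Console code. The `Condition` class and its field names are hypothetical, and the sketch assumes that an alert is sent only once the condition has held for `duration` consecutive one-minute checks.

```python
import operator
from dataclasses import dataclass
from typing import Sequence

# Operators offered in the condition formula, mapped to Python functions.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


@dataclass(frozen=True)
class Condition:
    """Hypothetical model of one condition row; not an AxonIQ Console type."""

    level: str      # Incident, Critical, Major or Minor
    metric: str     # for example "segment claim percentage"
    operator: str   # one of the keys in OPERATORS
    value: float    # threshold the metric is compared to
    duration: int   # minutes the condition must hold

    def describe(self) -> str:
        """Render the condition in the readable form used on the screen."""
        return (f"Create {self.level} when {self.metric} {self.operator} "
                f"{self.value:g} for {self.duration} minutes")

    def breached(self, sample: float) -> bool:
        """Check a single measurement against the threshold."""
        return OPERATORS[self.operator](sample, self.value)

    def should_alert(self, samples: Sequence[float]) -> bool:
        """Decide whether to alert, given one sample per minute, oldest first.

        Assumption: every one of the last `duration` checks must breach.
        """
        window = max(self.duration, 1)
        recent = samples[-window:]
        return len(recent) == window and all(self.breached(s) for s in recent)
```

With a duration of 2, a single bad measurement does not trigger an alert. Two consecutive bad measurements do, which is how the duration helps prevent false positives.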

You can always adjust the conditions by clicking the "Edit" button next to the condition. This makes the entire row editable. You can change any field, except the level and metric. If you want to change the level or metric, you need to delete the condition and add a new one.

Specific instances

If you want to set up conditions for a specific instance of a resource, you can do so by navigating to the resource in the application and clicking "Configure" next to the Alerts header in the top right corner. This opens a dialog where you can add a new condition for that specific instance.

Screenshot of a specific resource with the Configure button in the top right corner

Setting up conditions for a specific instance works similarly to setting up conditions for all instances. You can find a list of all available metrics and their defaults in the Available metrics section. After you add a specific condition, you can find it in the resource itself and in the Overrides section of the Monitoring tab. This way, you can easily see which resources have specific conditions set up.

Override page of the monitoring section showing a handler override

In addition, the Conditions section of the Monitoring tab will show "x override(s)" when a resource has specific conditions set up.

Screenshot of the Monitoring Conditions screen in AxonIQ Console with an override shown
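One way to picture how general conditions and overrides relate is the sketch below. It assumes that an instance-specific condition replaces the general condition for the same metric, on that instance only. This page does not spell out the exact precedence rules, so treat that as an assumption. The function name and the data layout are hypothetical as well.

```python
def effective_conditions(general, overrides, instance_id):
    """Return the conditions that apply to one resource instance.

    general:   {metric: condition} set up for all instances of a resource type
    overrides: {instance_id: {metric: condition}} set up for specific instances
    Assumption: an override wins over the general condition for the same metric.
    """
    merged = dict(general)                          # start from the general conditions
    merged.update(overrides.get(instance_id, {}))   # apply instance-specific ones
    return merged
```

Instances without an override simply keep the general conditions.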

Alerts

When a condition is met, an alert is created. You can see all alerts in the Alerts section of the Monitoring tab. Each resource page also has an Alerts section where you can see all alerts for that specific resource. In addition, every table that lists resources shows a badge with the number of alerts, as in the example below.

Screenshot of a row in a table with a badge showing the number of alerts

When you click on a row with alerts, you are taken to the resource page where you can see all alerts for that resource.

Integrations

AxonIQ Console can send alerts to various integrations. Currently, only Slack is supported. More integrations are planned for a future release.

Slack

There are three steps to set up Slack integration:

  1. Add our Slack app to your workspace

  2. Connect your Slack workspace to your AxonIQ Console workspace

  3. Set up the channels to send alerts to

Due to the dynamic nature of Slack, we cannot provide a step-by-step guide here. However, we can provide you with the information you need to set up the integration. You can find this information in the Integrations section of the Monitoring tab.

Screenshot of the Integrations section in the Monitoring tab in AxonIQ Console

The IDs and codes shown in the image above are examples and are not valid. The correct IDs and codes are unique to your workspace, and you can find them in your own workspace.
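The level of a condition is what lets you filter which integration receives which alerts. As a rough illustration of that filtering, the sketch below routes alerts to Slack channels by level. The channel names, the routing table, and the function are all hypothetical. The actual channel setup is done in the Integrations section, not in code.

```python
# Hypothetical routing table: which alert levels each Slack channel receives.
CHANNEL_LEVELS = {
    "#ops-oncall": {"Incident", "Critical"},
    "#ops-monitoring": {"Incident", "Critical", "Major", "Minor"},
}


def channels_for(level, routing=CHANNEL_LEVELS):
    """Return the channels that should receive an alert of the given level."""
    return sorted(channel for channel, levels in routing.items() if level in levels)
```

In this setup, a Minor alert reaches only the general monitoring channel, while a Critical alert reaches both channels.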

Available metrics

The following table lists all available metrics and their default thresholds. Our Solution Engineers have found these defaults to be a good starting point for monitoring. Some of them are set up for you automatically when you start using AxonIQ Console.

| Resource | Metric | Default threshold | Set up by default |
|---|---|---|---|
| Message Handler | Error Rate | > 1% | Yes, Critical |
| Message Handler | Latency (P90) | > 200 ms | Yes, Critical |
| Message Handler | Throughput | > 1000/minute | No |
| Aggregate | Error Rate | > 1% | Yes, Critical |
| Aggregate | Latency (P90) | > 200 ms | Yes, Critical |
| Aggregate | Lock Time (P90) | > 25 ms | Yes, Major |
| Aggregate | Load Time (P90) | > 100 ms | Yes, Major |
| Aggregate | Event Commit Time (P90) | > 300 ms | Yes, Major |
| Event Processor | Segment Claim Percentage | != 100% | Yes |
| Event Processor | Ingest latency | > 100 ms | Yes, Major |
| Event Processor | Commit latency | > 300 ms | Yes, Major |
| Event Processor | DLQ Size | > 0 | Yes, Critical |
| Application | Replica Count | < 1 | Yes, Critical |
| Application | CPU Usage | > 80% | Yes, Major |
| Application | Host CPU Usage | > 80% | Yes, Major |
| Application | Heap Usage | > 80% | Yes, Major |
| Application | Thread Count | > 200 | No |
| Application | Query Bus Usage | > 80% | Yes, Major |
| Application | Command Bus Usage | > 80% | Yes, Major |
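To show how the defaults in the table above translate into checks, the sketch below encodes a subset of them as data and tests a metric snapshot against them. The `DEFAULTS` layout, the `breaches` function, and the snapshot format are hypothetical and only serve as an illustration. Percentages are written as plain numbers and times are in milliseconds.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, "!=": operator.ne}

# (resource, metric): (operator, threshold, level if set up by default, else None).
# Subset of the defaults table; not an AxonIQ Console data structure.
DEFAULTS = {
    ("Message Handler", "Error Rate"): (">", 1, "Critical"),
    ("Message Handler", "Latency (P90)"): (">", 200, "Critical"),
    ("Aggregate", "Lock Time (P90)"): (">", 25, "Major"),
    ("Event Processor", "DLQ Size"): (">", 0, "Critical"),
    ("Application", "Replica Count"): ("<", 1, "Critical"),
    ("Application", "Heap Usage"): (">", 80, "Major"),
    ("Application", "Thread Count"): (">", 200, None),  # not set up by default
}


def breaches(resource, snapshot, defaults=DEFAULTS):
    """Return (metric, level) pairs whose default condition the snapshot meets.

    Only conditions that are set up by default (level is not None) are checked.
    """
    found = []
    for (res, metric), (op, threshold, level) in defaults.items():
        if res != resource or level is None or metric not in snapshot:
            continue
        if OPS[op](snapshot[metric], threshold):
            found.append((metric, level))
    return found
```

For an application with two replicas, 91% heap usage, and 450 threads, only the heap usage condition is met: the replica count is healthy, and the thread count condition is not set up by default.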