Event Processor Monitoring
Monitor your event processors when assessing the health and status of your application. You can do so by checking the Event Tracker Status or by monitoring the event processors through metrics.
Event tracker status
Since Tracking Tokens "track" the progress of a given Streaming Event Processor, they provide a sensible monitoring hook in any Axon application. Such a hook is useful, for example, when rebuilding a view model and checking whether the processor has caught up with all the events.
To that end, the StreamingEventProcessor exposes the processingStatus() method.
It returns a map where the key is the segment identifier and the value is an "Event Tracker Status".
The Event Tracker Status exposes a couple of metrics:
- The Segment it reflects the status of.
- A boolean through isCaughtUp() specifying whether it is caught up with the Event Stream.
- A boolean through isReplaying() specifying whether the given Segment is replaying.
- A boolean through isMerging() specifying whether the given Segment is merging.
- The TrackingToken of the given Segment.
- A boolean through isErrorState() specifying whether the Segment is in an error state.
- An optional Throwable if the Event Tracker reached an error state.
- An optional Long through getCurrentPosition() defining the current position of the TrackingToken.
- An optional Long through getResetPosition() defining the position of the TrackingToken at reset. This field will be null if isReplaying() returns false. It is possible to derive an estimated duration of replaying by comparing the current position with this field.
- An optional Long through mergeCompletedPosition() defining the position on the TrackingToken when merging will be completed. This field will be null if isMerging() returns false. It is possible to derive an estimated duration of merging by comparing the current position with this field.
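The comparison of the current position with the reset position mentioned in the list can be sketched as follows. This is a minimal illustration, not part of Axon: the class ReplayEstimate and its methods are hypothetical, the positions are modelled as OptionalLong (an assumption about how the optional Long values are exposed), and the calculation assumes the replay started at position 0 and that the caller supplies a throughput it measured itself.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.OptionalDouble;
import java.util.OptionalLong;

/** Hypothetical helper: estimates replay progress from two token positions. */
final class ReplayEstimate {

    private ReplayEstimate() {
    }

    /**
     * Fraction of the replay that is done, clamped to [0, 1].
     * Empty when either position is absent, for example when not replaying.
     * Assumption: the replay started at position 0.
     */
    static OptionalDouble fractionDone(OptionalLong currentPosition, OptionalLong resetPosition) {
        if (!currentPosition.isPresent() || !resetPosition.isPresent()
                || resetPosition.getAsLong() <= 0) {
            return OptionalDouble.empty();
        }
        double fraction = (double) currentPosition.getAsLong() / resetPosition.getAsLong();
        return OptionalDouble.of(Math.max(0.0, Math.min(1.0, fraction)));
    }

    /**
     * Estimated time left, given a throughput in events per second that the
     * caller measured, for example from two earlier status samples.
     */
    static Optional<Duration> timeRemaining(OptionalLong currentPosition,
                                            OptionalLong resetPosition,
                                            double eventsPerSecond) {
        if (!currentPosition.isPresent() || !resetPosition.isPresent() || eventsPerSecond <= 0) {
            return Optional.empty();
        }
        long eventsLeft = Math.max(0L, resetPosition.getAsLong() - currentPosition.getAsLong());
        return Optional.of(Duration.ofMillis(Math.round(eventsLeft / eventsPerSecond * 1000.0)));
    }

    public static void main(String[] args) {
        OptionalLong current = OptionalLong.of(500);
        OptionalLong reset = OptionalLong.of(1_000);
        System.out.println(fractionDone(current, reset));
        System.out.println(timeRemaining(current, reset, 50.0));
    }
}
```

The same arithmetic would apply to a merge by passing the merge completed position instead of the reset position. Because token positions may contain gaps, the result is only a rough estimate.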
Only segments that are currently being actively processed, or that reached an error state during previous processing, are contained in the result of processingStatus().
For a complete overview, you should retrieve the status from each instance of your application.
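As an illustration of how the per-segment statuses on one instance could be condensed into a single health value, consider the sketch below. To keep it runnable without Axon on the classpath, it uses a hypothetical SegmentStatus interface that mirrors only three of the accessors listed above. The ProcessorHealth class, the Health enum and the priority order of the checks are assumptions made for this example, not framework API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: summarizes a segment-to-status map into one value. */
final class ProcessorHealth {

    /** Stand-in mirroring a subset of the Event Tracker Status accessors. */
    interface SegmentStatus {
        boolean isCaughtUp();
        boolean isReplaying();
        boolean isErrorState();
    }

    enum Health { NO_ACTIVE_SEGMENTS, ERROR, REPLAYING, CATCHING_UP, CAUGHT_UP }

    private ProcessorHealth() {
    }

    /** Convenience factory for the stand-in status, used in main and in tests. */
    static SegmentStatus status(boolean caughtUp, boolean replaying, boolean error) {
        return new SegmentStatus() {
            @Override public boolean isCaughtUp() { return caughtUp; }
            @Override public boolean isReplaying() { return replaying; }
            @Override public boolean isErrorState() { return error; }
        };
    }

    /** Errors take priority, then replays, then segments that lag behind. */
    static Health summarize(Map<Integer, ? extends SegmentStatus> processingStatus) {
        if (processingStatus.isEmpty()) {
            // This instance currently processes no segment of the processor.
            return Health.NO_ACTIVE_SEGMENTS;
        }
        boolean anyReplaying = false;
        boolean allCaughtUp = true;
        for (SegmentStatus status : processingStatus.values()) {
            if (status.isErrorState()) {
                return Health.ERROR;
            }
            anyReplaying |= status.isReplaying();
            allCaughtUp &= status.isCaughtUp();
        }
        if (anyReplaying) {
            return Health.REPLAYING;
        }
        return allCaughtUp ? Health.CAUGHT_UP : Health.CATCHING_UP;
    }

    public static void main(String[] args) {
        Map<Integer, SegmentStatus> statuses = new HashMap<>();
        statuses.put(0, status(true, false, false));
        statuses.put(1, status(false, false, false));
        System.out.println(summarize(statuses));
    }
}
```

In a real application the map would come from processingStatus(), and the summaries of all application instances would need to be combined to cover every segment.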
Metrics
Besides querying the event processors for their status directly, the metric modules provide a way to monitor event processors as well.
The modules contain a MessageMonitor that exposes metrics about the processed messages of each processor, including capacity, latency, processing time and counters.
The exposed metrics can be scraped by the tool of your choice (for example, Prometheus) and alerting can be put in place for several useful metrics. Examples of useful monitoring:
- The latency becomes too high, indicating a long time between the moment an event was published and the moment it was handled by the processor.
- The capacity reaches a high value (for example, 0.8 when using 1 thread, indicating it is busy 80% of the time). This indicates a performance problem, or that the segment should be split to parallelize processing.
- The counter metrics can be used to calculate an average number of events processed per minute. If this drops or increases outside the normal operating parameters of your application, this warrants investigation.
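The arithmetic behind the counter and capacity checks above can be sketched as follows. A monitoring tool would normally perform these calculations on the scraped metrics; the Java version only makes the reasoning explicit. The ProcessorAlerts class, its method names and the thresholds used in main are hypothetical and chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: alert conditions derived from sampled metric values. */
final class ProcessorAlerts {

    private ProcessorAlerts() {
    }

    /** Average events per minute between two samples of a monotonic counter. */
    static double eventsPerMinute(long earlierCount, Instant earlierTime,
                                  long laterCount, Instant laterTime) {
        long elapsedMillis = Duration.between(earlierTime, laterTime).toMillis();
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("Samples must be in chronological order");
        }
        return (laterCount - earlierCount) * 60_000.0 / elapsedMillis;
    }

    /** True when the rate falls outside the range considered normal. */
    static boolean outsideNormalRange(double rate, double minExpected, double maxExpected) {
        return rate < minExpected || rate > maxExpected;
    }

    /** True when the capacity is at or above the threshold, e.g. 0.8 for one thread. */
    static boolean capacityTooHigh(double capacity, double threshold) {
        return capacity >= threshold;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        Instant end = start.plus(Duration.ofMinutes(2));
        double rate = eventsPerMinute(1_000, start, 1_600, end);
        System.out.println("events per minute: " + rate);
        System.out.println("rate alert: " + outsideNormalRange(rate, 400.0, 2_000.0));
        System.out.println("capacity alert: " + capacityTooHigh(0.85, 0.8));
    }
}
```

What counts as a normal range depends entirely on the application, so the bounds are best derived from rates observed during regular operation.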