Monitoring
This page describes the monitoring and observability features of Axon Server Proxy.
Health checks
Health endpoint
The health endpoint provides a full overview of the proxy’s status, including a component breakdown.
curl http://localhost:8080/health
Response (HTTP 200 when all components are up, HTTP 503 if any component is down):
{
  "components": {
    "proxyServer": {"status": "UP"},
    "channelFactory": {"status": "UP"}
  },
  "status": "UP"
}
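For scripts or container health checks, the HTTP status code alone is usually enough. A minimal sketch, assuming the default port shown above:

# Exits non-zero when the proxy reports DOWN (HTTP 503) or is unreachable
curl -sf http://localhost:8080/health > /dev/null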
Metrics
The proxy exposes metrics through Micrometer, available in Prometheus format.
Prometheus metrics
Scrape metrics in Prometheus text format:
curl http://localhost:8080/prometheus
Example output:
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 1.2345678E7
...
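To inspect a single metric from the command line, filter the scrape output. For example, the connection gauge referenced throughout this page:

curl -s http://localhost:8080/prometheus | grep '^axon_proxy_connections'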
Key metrics to monitor
| Metric | Description | Type |
|---|---|---|
| jvm_memory_used_bytes | JVM memory usage | Gauge |
| jvm_threads_live_threads | Number of live threads | Gauge |
| system_cpu_usage | System CPU usage | Gauge |
| process_cpu_usage | Process CPU usage | Gauge |
| jvm_gc_pause_seconds | GC pause duration | Timer |
| axon_proxy_connections | Active client connections | Gauge |
Connections endpoint
The proxy provides an endpoint to view active connections.
curl http://localhost:8080/connections
Response:
[
  {
    "connectionId": "550e8400-e29b-41d4-a716-446655440000",
    "clientId": "my-application",
    "componentName": "MyCommandHandler",
    "context": "default",
    "backend": "axonserver1:8124"
  },
  {
    "connectionId": "661f9511-f30c-52e5-b827-557766551111",
    "clientId": "another-app",
    "componentName": "QueryProcessor",
    "context": "default",
    "backend": "axonserver2:8124"
  }
]
This helps identify:
- Which applications are connected
- How many connections are active
- Which backend node each connection is routed to
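For example, jq can summarize how connections are distributed across backend nodes; this sketch uses the field names from the response above:

curl -s http://localhost:8080/connections | \
  jq 'group_by(.backend) | map({backend: .[0].backend, connections: length})'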
Logging
The proxy uses SLF4J with Logback. Log levels and output destinations are configured via a logback.xml file, not at runtime through a monitoring endpoint.
The default configuration writes colorized, timestamped output to the console.
To configure file output, place a logback.xml in the working directory or specify its path with -Dlogback.configurationFile. See Logging configuration for examples.
Prometheus integration
Prometheus configuration
Add the proxy as a scrape target in prometheus.yml:
scrape_configs:
  - job_name: 'axon-proxy'
    metrics_path: '/prometheus'
    static_configs:
      - targets: ['proxy.local:8080']
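After reloading Prometheus, you can confirm the target is healthy via the targets API. The Prometheus address below is an assumption; adjust it to your deployment:

curl -s http://prometheus.local:9090/api/v1/targets | \
  jq '.data.activeTargets[] | select(.labels.job == "axon-proxy") | .health'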
Grafana dashboard
Create a Grafana dashboard with key metrics:
JVM Panel:
- Memory usage: jvm_memory_used_bytes
- GC activity: jvm_gc_pause_seconds_count
- Thread count: jvm_threads_live_threads

Proxy Panel:
- Active connections: axon_proxy_connections

System Panel:
- CPU usage: process_cpu_usage, system_cpu_usage
- File descriptors: process_files_open_files
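Panel queries can be derived from the metrics above. As a sketch, assuming the standard Micrometer _sum/_count series for the GC timer, heap utilization and average GC pause time could be charted with:

# Heap utilization (0-1)
sum(jvm_memory_used_bytes{area="heap"}) / sum(jvm_memory_max_bytes{area="heap"})

# Average GC pause duration over the last 5 minutes
rate(jvm_gc_pause_seconds_sum[5m]) / rate(jvm_gc_pause_seconds_count[5m])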
Alerting
Prometheus alerts
Example alert rules:
groups:
  - name: axon-proxy
    interval: 30s
    rules:
      - alert: AxonProxyDown
        expr: up{job="axon-proxy"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Axon Proxy is down"
          description: "Axon Proxy {{ $labels.instance }} is unreachable"

      - alert: AxonProxyHighMemory
        expr: jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"} > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage"
          description: "Heap usage above 90% on {{ $labels.instance }}"

      - alert: AxonProxyNoConnections
        expr: axon_proxy_connections == 0
        for: 5m
        labels:
          severity: info
        annotations:
          summary: "No active connections"
          description: "No applications connected to {{ $labels.instance }}"
Performance monitoring
Key performance indicators
Monitor these KPIs:
- Connection Count: Number of active connections (axon_proxy_connections)
- Resource Usage: CPU, memory, file descriptors
- JVM Health: Heap usage, GC pause frequency and duration (JAR only)
Load testing
Use tools like ghz for gRPC load testing:
ghz --insecure \
--proto axonserver.proto \
--call io.axoniq.axonserver.grpc.control.PlatformService/GetPlatformServer \
--duration 60s \
--connections 10 \
--concurrency 50 \
localhost:8124
Monitor proxy metrics during load tests to establish baselines.
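A simple loop can capture a baseline while the test runs; this sketch samples the connection gauge every five seconds:

# Print a timestamped sample of the active-connection gauge every 5 seconds
while true; do
  echo "$(date +%T) $(curl -s http://localhost:8080/prometheus | grep '^axon_proxy_connections')"
  sleep 5
done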
Security monitoring
Monitor security-related events:
- Failed TLS handshakes (check logs)
- Certificate expiration (external monitoring)
- Unusual connection patterns
- Authentication failures (if using Axon Server access control)
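Certificate expiration can be checked from a cron job or external monitor with openssl. A sketch; the host and TLS port are assumptions based on the examples above:

# Exits with status 1 if the certificate presented by the proxy expires within 30 days
echo | openssl s_client -connect proxy.local:8124 2>/dev/null | \
  openssl x509 -noout -checkend $((30*24*3600))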