Monitoring
This page describes the monitoring and observability features of Axon Server Proxy.
Health checks
Health endpoint
The health endpoint provides a full overview of the proxy’s status, including a component breakdown.
curl http://localhost:8080/health
Response (HTTP 200 when all components are up, HTTP 503 if any component is down):
{
  "components": {
    "proxyServer": {"status": "UP"},
    "channelFactory": {"status": "UP"}
  },
  "status": "UP"
}
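For scripts or container health checks, the HTTP status code alone is usually enough. A minimal sketch, assuming the default port shown above:

# Exits non-zero when the proxy reports DOWN (HTTP 503) or is unreachable
curl -sf http://localhost:8080/health > /dev/null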
Metrics
The proxy exposes metrics through Micrometer, available in Prometheus format.
Prometheus metrics
Scrape metrics in Prometheus text format:
curl http://localhost:8080/prometheus
Example output:
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 1.2345678E7
...
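To inspect a single metric from the command line, filter the scrape output. For example, the connection gauge referenced throughout this page:

curl -s http://localhost:8080/prometheus | grep '^axon_proxy_connections'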
Key metrics to monitor
| Metric | Description | Type |
|---|---|---|
| jvm_memory_used_bytes | JVM memory usage | Gauge |
| jvm_threads_live_threads | Number of live threads | Gauge |
| system_cpu_usage | System CPU usage | Gauge |
| process_cpu_usage | Process CPU usage | Gauge |
| jvm_gc_pause_seconds | GC pause duration | Timer |
| axon_proxy_connections | Active client connections | Gauge |
Connections endpoint
The proxy provides an endpoint to view active connections.
curl http://localhost:8080/connections
Response:
[
  {
    "connectionId": "550e8400-e29b-41d4-a716-446655440000",
    "clientId": "my-application",
    "componentName": "MyCommandHandler",
    "context": "default",
    "backend": "axonserver1:8124"
  },
  {
    "connectionId": "661f9511-f30c-52e5-b827-557766551111",
    "clientId": "another-app",
    "componentName": "QueryProcessor",
    "context": "default",
    "backend": "axonserver2:8124"
  }
]
This helps identify:
- Which applications are connected
- How many connections are active
- Which backend node each connection is routed to
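For example, jq can summarize how connections are distributed across backend nodes; this sketch uses the field names from the response above:

curl -s http://localhost:8080/connections | \
  jq 'group_by(.backend) | map({backend: .[0].backend, connections: length})'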
Logging
The proxy uses SLF4J with Logback. Log levels and output destinations are configured via a logback.xml file, not at runtime through a monitoring endpoint.
The default configuration writes colorized, timestamped output to the console.
To configure file output, place a logback.xml in the working directory or specify its path with -Dlogback.configurationFile. See Logging configuration for examples.
Prometheus integration
Prometheus configuration
Add the proxy as a scrape target in prometheus.yml:
scrape_configs:
  - job_name: 'axon-proxy'
    metrics_path: '/prometheus'
    static_configs:
      - targets: ['proxy.local:8080']
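After reloading Prometheus, you can confirm the target is healthy via the targets API. The Prometheus address below is an assumption; adjust it to your deployment:

curl -s http://prometheus.local:9090/api/v1/targets | \
  jq '.data.activeTargets[] | select(.labels.job == "axon-proxy") | .health'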
Grafana dashboard
Create a Grafana dashboard with key metrics:
JVM Panel:
- Memory usage: jvm_memory_used_bytes
- GC activity: jvm_gc_pause_seconds_count
- Thread count: jvm_threads_live_threads

Proxy Panel:
- Active connections: axon_proxy_connections

System Panel:
- CPU usage: process_cpu_usage, system_cpu_usage
- File descriptors: process_files_open_files
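Panel queries can be derived from the metrics above. As a sketch, assuming the standard Micrometer _sum/_count series for the GC timer, heap utilization and average GC pause time could be charted with:

# Heap utilization (0-1)
sum(jvm_memory_used_bytes{area="heap"}) / sum(jvm_memory_max_bytes{area="heap"})

# Average GC pause duration over the last 5 minutes
rate(jvm_gc_pause_seconds_sum[5m]) / rate(jvm_gc_pause_seconds_count[5m])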
Alerting
Prometheus alerts
Example alert rules:
groups:
  - name: axon-proxy
    interval: 30s
    rules:
      - alert: AxonProxyDown
        expr: up{job="axon-proxy"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Axon Proxy is down"
          description: "Axon Proxy {{ $labels.instance }} is unreachable"

      - alert: AxonProxyHighMemory
        expr: jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"} > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage"
          description: "Heap usage above 90% on {{ $labels.instance }}"

      - alert: AxonProxyNoConnections
        expr: axon_proxy_connections == 0
        for: 5m
        labels:
          severity: info
        annotations:
          summary: "No active connections"
          description: "No applications connected to {{ $labels.instance }}"
Performance monitoring
Key performance indicators
Monitor these KPIs:
- Connection Count: Number of active connections (axon_proxy_connections)
- Resource Usage: CPU, memory, file descriptors
- JVM Health: Heap usage, GC pause frequency and duration (JAR only)
Load testing
Use tools like ghz for gRPC load testing:
ghz --insecure \
--proto axonserver.proto \
--call io.axoniq.axonserver.grpc.control.PlatformService/GetPlatformServer \
--duration 60s \
--connections 10 \
--concurrency 50 \
localhost:8124
Monitor proxy metrics during load tests to establish baselines.
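A simple loop can capture a baseline while the test runs; this sketch samples the connection gauge every five seconds:

# Print a timestamped sample of the active-connection gauge every 5 seconds
while true; do
  echo "$(date +%T) $(curl -s http://localhost:8080/prometheus | grep '^axon_proxy_connections')"
  sleep 5
done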
Security monitoring
Monitor security-related events:
- Failed TLS handshakes (check logs)
- Certificate expiration (external monitoring)
- Unusual connection patterns
- Authentication failures (if using Axon Server access control)
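Certificate expiration can be checked from a cron job or external monitor with openssl. A sketch; the host and TLS port are assumptions based on the examples above:

# Exits with status 1 if the certificate presented by the proxy expires within 30 days
echo | openssl s_client -connect proxy.local:8124 2>/dev/null | \
  openssl x509 -noout -checkend $((30*24*3600))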