Backups

Before backing up Axon Server, read this page thoroughly and make sure you are familiar with the backup procedure. Practice both the backup and the restore procedure before disaster strikes, and verify your backups by test-restoring them. Axoniq accepts no liability if a backup or restore fails because the backup procedure was not followed.

This document describes how to back up and restore Axon Server. Axon Server ensures durability by replicating its data across cluster nodes.

To increase data durability, Axon Server nodes can be placed in different availability zones. Designated backup nodes enable faster disaster recovery and provide an up-to-date copy of your data.

Regardless of clustering and backup nodes, offline backups remain a crucial part of your setup. Steps to create reliable backups are outlined below.

Backup

To support the creation of consistent backups, Axon Server provides an HTTP API. The API documentation is accessible at http://[server]:8024/swagger-ui/index.html .

You need to back up the ConfigDB and the event store separately as described below. As the ConfigDB refers to data stored in the append-only event store, it is paramount that the ConfigDB is backed up first.

Config database

The ConfigDB contains important runtime configuration for your Axon Server deployment, such as users and roles. A POST call to the endpoint http://[server]:8024/v1/backup/createControlDbBackup creates a backup .zip file on disk and returns its path. After calling the endpoint, copy or move that file to your backup location.
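
The call and the follow-up copy can be scripted. The following is a minimal sketch, not an official tool. It assumes that the script runs on the Axon Server host, that the HTTP port is 8024 as in the other URLs on this page, and that the response body is the plain file path. The function names are hypothetical, and any authentication your deployment requires is omitted.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not part of Axon Server.

# store_config_backup ZIP_PATH TARGET_DIR
# Copies the ConfigDB backup file into the backup location.
store_config_backup() {
  local zip_path="$1" target_dir="$2"
  [ -f "$zip_path" ] || { echo "backup file not found: $zip_path" >&2; return 1; }
  mkdir -p "$target_dir" && cp -- "$zip_path" "$target_dir/"
}

# backup_config_db SERVER TARGET_DIR
# Assumption: the response body is the plain path of the created .zip file.
backup_config_db() {
  local server="$1" target_dir="$2" zip_path
  # The POST call creates the .zip on the server's disk and returns its path.
  zip_path="$(curl --silent --show-error --fail -X POST \
    "http://${server}:8024/v1/backup/createControlDbBackup")" || return 1
  store_config_backup "$zip_path" "$target_dir"
}
```

A call such as `backup_config_db 127.0.0.1 /mnt/backups/configdb` would then trigger the backup and copy the file. Check the actual response format in the Swagger UI before relying on this.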

Event store

The event store contains your events, which make up the bulk of your backup size. Axon Server uses several indices, which are also included in this backup.

Backup procedures differ between DCB contexts and non-DCB contexts.

DCB Event Store

Backups for DCB contexts are currently only available in the push variety, meaning Axon Server "pushes" data to a location. We advise having Axon Server push the backup to a mounted backup volume.

Having an external tool "pull" data out of a live Axon Server node is currently not supported. You can, of course, pull data from the backup location after a backup has completed.

Storage considerations

In the current backup implementation, storage requirements for the backup can be estimated as follows:

  currentSpaceInUseByRocksDBColumnFamilies * 2 + currentSpaceInUseByEvents + space used by update trackers (label and snapshots)
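
As a worked example of this estimate, the sketch below applies the formula to sizes given in bytes. The function name and the example sizes are made up for illustration. How you measure the three inputs depends on your installation.

```shell
# estimate_backup_bytes ROCKSDB_BYTES EVENT_BYTES TRACKER_BYTES
# Applies: rocksdb * 2 + events + update trackers (hypothetical helper name).
estimate_backup_bytes() {
  local rocksdb_bytes="$1" event_bytes="$2" tracker_bytes="$3"
  echo $(( rocksdb_bytes * 2 + event_bytes + tracker_bytes ))
}

# Example with made-up sizes: 4 GiB of RocksDB column families,
# 50 GiB of events, and 1 GiB of update trackers.
gib=$(( 1024 * 1024 * 1024 ))
estimate_backup_bytes $(( 4 * gib )) $(( 50 * gib )) $(( 1 * gib ))
# prints 63350767616 (59 GiB)
```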

Procedure for backup

  1. Call the /v2/backup endpoint with the context and the backup location as arguments:

curl -X 'POST' \
  'http://127.0.0.1:8024/v2/backup' \
  -H 'Content-Type: application/json' \
  -d '{
    "context": "billing",
    "location": "/mnt/backups"
  }'

The response contains an ID that you can use to query the progress of that backup, for example 1a39b69c-0ca4-42cd-b0ed-22ee3727402a.

  2. Use the received ID to check the status of your backup.

curl -X 'GET' \
  'http://127.0.0.1:8024/v2/backup/1a39b69c-0ca4-42cd-b0ed-22ee3727402a' \
  -H 'accept: */*'

Once this endpoint returns "backupState": "FINISHED", your backup has completed.
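
The status check can be repeated until the backup has finished. The sketch below polls the status endpoint at a fixed interval. The function names, the interval, and the attempt limit are assumptions for illustration. Only the "backupState": "FINISHED" value is taken from this page, so any other state is treated as "not finished yet" until the attempt limit is reached.

```shell
# fetch_backup_status BASE_URL BACKUP_ID
# Prints the JSON status document for the given backup ID.
fetch_backup_status() {
  local base_url="$1" backup_id="$2"
  curl --silent --show-error --fail -X GET -H 'accept: */*' \
    "${base_url}/v2/backup/${backup_id}"
}

# wait_for_backup BASE_URL BACKUP_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Returns 0 once the status reports FINISHED, 1 on error or timeout.
wait_for_backup() {
  local base_url="$1" backup_id="$2" interval="${3:-10}" max_attempts="${4:-360}"
  local attempt=0 status
  while [ "$attempt" -lt "$max_attempts" ]; do
    status="$(fetch_backup_status "$base_url" "$backup_id")" || return 1
    # Match the field and value quoted above, tolerating whitespace.
    if printf '%s' "$status" |
      grep -Eq '"backupState"[[:space:]]*:[[:space:]]*"FINISHED"'; then
      return 0
    fi
    attempt=$(( attempt + 1 ))
    sleep "$interval"
  done
  echo "backup ${backup_id} did not finish after ${max_attempts} checks" >&2
  return 1
}
```

For example, `wait_for_backup http://127.0.0.1:8024 1a39b69c-0ca4-42cd-b0ed-22ee3727402a` checks every 10 seconds and gives up after 360 checks.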

Non-DCB Event Store

Axon Server exposes an HTTP endpoint, /v1/backup/eventstore, that you can use to gather the files for a backup. This endpoint returns a JSON object with a list of paths.

It also returns a number, lastClosedSegment, that you can pass to subsequent backup calls as lastClosedSegmentBackedUp to receive only the files that have changed. You must copy (with overwrite) all files returned by this endpoint, even if they already exist in the target location.

The API documentation is accessible at http://[server]:8024/swagger-ui/index.html
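
The copy-with-overwrite requirement can be sketched as a small helper that reads one source path per line and copies each file into the backup location. The function name is hypothetical. The helper assumes absolute source paths and recreates them below the target directory so that files from different folders cannot collide. Extracting the paths from the JSON response is left out, because the field names are not given on this page.

```shell
# copy_backup_files TARGET_DIR
# Reads one absolute source path per line from stdin (hypothetical helper).
copy_backup_files() {
  local target_dir="$1" path dest
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    # Recreate the source path below the target directory.
    dest="${target_dir}${path}"
    mkdir -p "$(dirname "$dest")" || return 1
    # -f overwrites files that already exist in the target.
    cp -f -- "$path" "$dest" || return 1
  done
}
```

Pipe the list of paths from the endpoint response into the function, for example after extracting them with a JSON tool of your choice.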

Tips for reliable backups

To ensure integrity of your data, please adhere to the following guidelines:

  • Never modify files in the backup

  • Do not create a backup with the target path set to the live storage location

  • Use the correct backup endpoint for DCB and non-DCB contexts

  • If you restore a backup that is not the most recent one, all newer backups must be discarded

Restore

We distinguish between two cases: restoring without any preexisting data, for example after a disaster has destroyed the previous cluster entirely, and restoring one or more contexts in a running cluster to a previous state.

Restore with no pre-existing data on disk

ConfigDB

If you backed up the ConfigDB, you can restore it so that your restored cluster has the same configuration as when the backup was taken. To do so, make sure that each node is restored with the ConfigDB that was backed up for that node. In particular, the hostname of the node after the restore must be exactly the same as before. If that cannot be achieved, refer to the recovery procedure.

Event store

The event store can be restored to all nodes from one copy in the backup location.

  1. Copy the events folder from the backup to the location defined by axoniq.axonserver.event.storage

  2. Copy the snapshots folder from the backup to the location defined by axoniq.axonserver.snapshot.storage

  3. Ensure the license file, the properties file, and (optionally) a cluster template are present when starting Axon Server.

  4. Start Axon Server

  5. Create a replication group and contexts whose names match the context names from your backup, if they are not already covered by your cluster template.
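
Steps 1 and 2 can be sketched as a helper that you run on each node before starting Axon Server. The function name is hypothetical. The sketch assumes that the contents of the backup's events and snapshots folders belong directly in the configured storage locations, so verify the layout of your own backup first.

```shell
# restore_event_store BACKUP_DIR EVENT_STORAGE SNAPSHOT_STORAGE
# EVENT_STORAGE and SNAPSHOT_STORAGE are the values of
# axoniq.axonserver.event.storage and axoniq.axonserver.snapshot.storage.
restore_event_store() {
  local backup_dir="$1" event_storage="$2" snapshot_storage="$3"
  [ -d "$backup_dir/events" ] && [ -d "$backup_dir/snapshots" ] || {
    echo "expected events and snapshots folders in $backup_dir" >&2
    return 1
  }
  mkdir -p "$event_storage" "$snapshot_storage" || return 1
  # The trailing "/." copies the folder contents, including hidden files.
  cp -R "$backup_dir/events/." "$event_storage/" || return 1
  cp -R "$backup_dir/snapshots/." "$snapshot_storage/" || return 1
}
```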

Restore in a running cluster

In this case, we assume you want to restore one or more contexts into a running Axon Server cluster.

This is for emergency use only, as a restore will undo everything within the system that has happened since the backup. It will not undo all side effects your system might have caused. If only a few incorrect events were published, compensating actions or, in the worst case, event transformation might be less intrusive countermeasures.

As the cluster is still running, no restore of the ConfigDB is necessary.

Steps:

  1. Delete the replication groups of the affected contexts, choosing the option to preserve data

  2. Delete the context folders in the locations specified by axoniq.axonserver.event.storage and axoniq.axonserver.snapshot.storage

  3. Copy the backup data (events and snapshots)

    1. Copy the events folder from the backup to the location defined by axoniq.axonserver.event.storage

    2. Copy the snapshots folder from the backup to the location defined by axoniq.axonserver.snapshot.storage

  4. Create new replication groups

  5. Create contexts with their previous names
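
The file-level part of this procedure (steps 2 and 3) can be sketched for a single context as follows. The function name is hypothetical, and the sketch assumes that each context has a subfolder named after the context, both in the storage locations and in the backup's events and snapshots folders. Steps 1, 4, and 5 are carried out through Axon Server itself and are not covered here.

```shell
# restore_context BACKUP_DIR EVENT_STORAGE SNAPSHOT_STORAGE CONTEXT
# Assumption: data for a context lives in a subfolder named after it.
restore_context() {
  local backup_dir="$1" event_storage="$2" snapshot_storage="$3" context="$4"
  [ -n "$context" ] || { echo "context name required" >&2; return 1; }
  [ -d "$backup_dir/events/$context" ] && [ -d "$backup_dir/snapshots/$context" ] || {
    echo "no backup data for context $context in $backup_dir" >&2
    return 1
  }
  # Step 2: delete the current context folders (":?" aborts on empty paths).
  rm -rf -- "${event_storage:?}/${context}" "${snapshot_storage:?}/${context}"
  # Step 3: copy the events and snapshots of that context from the backup.
  mkdir -p "$event_storage" "$snapshot_storage" || return 1
  cp -R "$backup_dir/events/$context" "$event_storage/" || return 1
  cp -R "$backup_dir/snapshots/$context" "$snapshot_storage/" || return 1
}
```

Because the backup is checked before anything is deleted, a missing or misnamed backup folder leaves the live data untouched.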

For detailed examples and more sophisticated restore procedures, see advanced backups.