ArangoDB cluster

There are several ways to start an ArangoDB cluster. In this section we will focus on our recommended way to start ArangoDB: the ArangoDB Starter.

Datacenter to datacenter replication requires the rocksdb storage engine. The example setup described in this section has rocksdb enabled. If you deploy with a different strategy, remember to set the storage engine to rocksdb.

For other possibilities to deploy an ArangoDB cluster see Cluster Deployment.

The Starter simplifies things for the operator and will coordinate a distributed cluster startup across several machines and assign cluster roles automatically.

Once Starters have been launched on several machines and enough of them have joined, they start Agents, Coordinators and DBservers on these machines.

While running, the Starter supervises its child tasks (namely Coordinators, DBservers and Agents) and restarts them in case of failure.

To start the cluster using a systemd unit file, use the following:

[Unit]
Description=Run the ArangoDB Starter
After=network.target

[Service]
Restart=on-failure
EnvironmentFile=/etc/arangodb.env
EnvironmentFile=/etc/arangodb.env.local
Environment=DATADIR=/var/lib/arangodb/cluster
ExecStartPre=/usr/bin/sh -c "mkdir -p ${DATADIR}"
ExecStart=/usr/bin/arangodb \
    --starter.address=${PRIVATEIP} \
    --starter.data-dir=${DATADIR} \
    --starter.join=${STARTERENDPOINTS} \
    --server.storage-engine=rocksdb \
    --auth.jwt-secret=${CLUSTERSECRETPATH}
TimeoutStopSec=60
LimitNOFILE=1048576

[Install]
WantedBy=multi-user.target

Note that the unit file sets the storage engine to rocksdb through the --server.storage-engine option.
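The unit file above reads PRIVATEIP, STARTERENDPOINTS and CLUSTERSECRETPATH from its environment files, but this section does not show what those files contain. As a rough sketch, the following Python helper renders such a file. The function name render_env, the example addresses, and the comma-separated host:port form of STARTERENDPOINTS are assumptions made for illustration; check them against your Starter version before relying on them.

```python
def render_env(private_ip, starter_ips, secret_path, starter_port=8528):
    """Render KEY=value lines for a systemd EnvironmentFile.

    Assumption: STARTERENDPOINTS is a comma-separated list of host:port
    pairs, one per Starter, using the Starter port named in this section.
    """
    endpoints = ",".join(f"{ip}:{starter_port}" for ip in starter_ips)
    lines = [
        f"PRIVATEIP={private_ip}",
        f"STARTERENDPOINTS={endpoints}",
        f"CLUSTERSECRETPATH={secret_path}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical addresses and secret path, for illustration only.
    print(render_env("10.0.0.1",
                     ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
                     "/etc/arangodb.secret"), end="")
```

Only PRIVATEIP differs from machine to machine in this sketch; the other two values would be the same everywhere.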

Cluster authentication

The communication between the cluster nodes uses a token (JWT) to authenticate. The corresponding secret must be shared between all cluster nodes.

Sharing secrets is a delicate matter. The unit file above assumes that the operator has placed the secret in the file that ${CLUSTERSECRETPATH} points to.

We recommend using a dedicated system for managing secrets, such as HashiCorp’s Vault.
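If no secrets manager is available yet, a secret file can be created by hand. The following Python sketch is one possible way to do that. The helper name write_secret_file, the choice of a random hex string as the file content, and the example path are assumptions for illustration and are not prescribed by this section.

```python
import os
import secrets


def write_secret_file(path, num_bytes=32):
    """Create `path` containing a random hex token, accessible to the owner only."""
    token = secrets.token_hex(num_bytes)  # 2 hex characters per random byte
    # O_EXCL makes the call fail rather than overwrite an existing secret.
    # Mode 0o600 grants read/write to the owner and nothing to anyone else.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    return path


if __name__ == "__main__":
    # Hypothetical location; it would be the value of CLUSTERSECRETPATH.
    write_secret_file("/etc/arangodb.secret")
```

Because every node needs an identical secret, the resulting file would then have to be copied to all machines over a secure channel rather than generated separately on each one.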

Required ports

As soon as enough machines have joined, the Starter will begin starting Agents, Coordinators and DBservers.

Each of these tasks needs a port to communicate. Please make sure that the following ports are available on all machines:

  • 8529 for Coordinators
  • 8530 for DBservers
  • 8531 for Agents

The Starter itself will use port 8528.
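Before starting the Starter, it can help to confirm that nothing else on the machine is already bound to these ports. The following Python sketch tries to bind each port locally and reports the ones that are taken. The helper names port_is_free and busy_ports are hypothetical, and the check covers local availability only; it says nothing about firewalls between machines.

```python
import socket

# Ports listed in this section; the labels are only used for the report.
REQUIRED_PORTS = {
    8528: "Starter",
    8529: "Coordinator",
    8530: "DBserver",
    8531: "Agent",
}


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def busy_ports(ports=REQUIRED_PORTS, host="0.0.0.0"):
    """Return the sorted list of ports that cannot be bound right now."""
    return [port for port in sorted(ports) if not port_is_free(port, host)]


if __name__ == "__main__":
    for port in busy_ports():
        print(f"port {port} ({REQUIRED_PORTS[port]}) is already in use")
```

The script would need to run on every machine before the Starter is launched, since a running cluster occupies these ports itself.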

Because the Agents are critical to the availability of both the ArangoDB and the ArangoSync cluster, we recommend running them on dedicated machines. They run a real-time system for the elections, and poor performance can negatively affect the availability of the whole cluster.

DBservers are also important and you do not want to lose them. However, depending on your replication factor, the system can tolerate the loss of some of them, and poor performance slows things down without stopping the cluster from working.

Coordinators can be deployed on other machines, since they do not hold persistent state. They might hold some in-memory state about running transactions or queries, but losing a Coordinator does not lose any persisted data. Furthermore, new Coordinators can be added to a cluster without much effort.