Replication allows you to replicate data onto another machine. It forms the base of all disaster recovery and failover features ArangoDB offers.
ArangoDB offers synchronous and asynchronous replication.
Synchronous replication is used between the DB-Servers of an ArangoDB Cluster.
Asynchronous replication is used:
- between the Leader and the Follower of an ArangoDB Active Failover setup
- between multiple ArangoDB Data Centers (inside the same Data Center replication is synchronous)
Synchronous replication only works within an ArangoDB Cluster and is typically used for mission critical data which must be accessible at all times. Synchronous replication generally stores a copy of a shard’s data on another DB-Server and keeps it in sync. Essentially, when storing data after enabling synchronous replication, the Cluster waits for all replicas to write all the data before green-lighting the write operation to the client. This naturally increases the latency a bit, since one more network hop is needed for each write. However, it enables the cluster to immediately fail over to a replica whenever an outage is detected, without losing any committed data, and mostly without even signaling an error condition to the client.
Synchronous replication is organized such that every shard has a
r - 1 followers, where
r denotes the replication factor.
The replication factor is the total number of copies that are kept, that is, the
leader and follower count combined. For example, with a replication factor of
3, there is one leader and
3 - 1 = 2 followers. You can control the
number of followers using the
replicationFactor parameter whenever you
create a collection, by setting a
replicationFactor one higher than the
desired number of followers. You can also adjust the value later.
You cannot set a
replicationFactor higher than the number of available
DB-Servers by default. You can bypass the check when creating a collection by
enforceReplicationFactor option to
false. You cannot bypass it
when adjusting the replication factor later. Note that the replication factor
is not decreased but remains the same during a DB-Server node outage.
In addition to the replication factor, there is a writeConcern that
specifies the minimum number of in-sync followers required for write operations.
If you specify the
writeConcern parameter with a value greater than
collection’s leader shards are locked down for writing as soon as too few
followers are available.
When using asynchronous replication, Followers connect to a Leader and apply all the events from the Leader log in the same order locally. As a result, the Followers end up with the same state of data as the Leader.
Followers are only eventually consistent with the Leader.
Transactions are honored in replication, i.e. transactional write operations become visible on Followers atomically.
All write operations are logged to the Leader’s write-ahead log. Therefore, asynchronous replication in ArangoDB cannot be used for write-scaling. The main purposes of this type of replication are to provide read-scalability and hot standby servers for Active Failover deployments.
It is possible to connect multiple Follower to the same Leader. Followers should be used as read-only instances, and no user-initiated write operations should be carried out on them. Otherwise, data conflicts may occur that cannot be solved automatically, and this makes the replication stop.
In an asynchronous replication scenario, Followers pull changes from the Leader. Followers need to know to which Leader they should connect to, but a Leader is not aware of the Followers that replicate from it. When the network connection between the Leader and a Follower goes down, write operations on the Leader can continue normally. When the network is up again, Followers can reconnect to the Leader and transfer the remaining changes. This happens automatically, provided Followers are configured appropriately.
As described above, write operations are applied first in the Leader, and then applied in the Followers.
For example, let’s assume a write operation is executed in the Leader at point in time t0. To make a Follower apply the same operation, it must first fetch the write operation’s data from Leader’s write-ahead log, then parse it and apply it locally. This happens at some point in time after t0, let’s say t1.
The difference between t1 and t0 is called the replication lag, and it is unavoidable in asynchronous replication. The amount of replication lag depends on many factors, a few of which are:
- the network capacity between the Followers and the Leader
- the load of the Leader and the Followers
- the frequency in which Followers poll the Leader for updates
Between t0 and t1, the state of data on the Leader is newer than the state of data on the Followers. At point in time t1, the state of data on the Leader and Followers is consistent again (provided no new data modifications happened on the Leader in between). Thus, the replication leads to an eventually consistent state of data.
As the Leader servers are logging any write operation in the write-ahead-log anyway, replication doesn’t cause any extra overhead on the Leader. However, it causes some overhead for the Leader to serve incoming read requests of the Followers. However, returning the requested data is a trivial task for the Leader and should not result in a notable performance degradation in production.