Introduction

“Better to have, and not need, than to need, and not have.”
Franz Kafka

Franz Kafka’s talents wouldn’t have been wasted as a DBA. Well, reasonable people might disagree.

With this article, we are shouting out a new enterprise feature of ArangoDB: consistent online “hot backups”, for single servers as well as entire clusters.

If you do not care for an abstract definition and would rather directly see a working example, simply scroll down to Section “A full cycle” below.

Snapshots of arbitrarily sized, complex raw datasets, be they file systems, databases, or anything else, are extremely useful for ultra-fast disaster recovery. With ArangoDB 3.5.1, we are introducing the hot backup feature for the RocksDB storage engine, which allows one to do exactly that with ArangoDB.

Using the new feature, one can take near-instantaneous snapshots of the entire operation, whether a single server or a cluster deployment. These hot backups come at very low resource cost and require only a single API call, so creating and keeping many local backup sets is easily accomplished.

This is also the major difference from the existing tool arangodump, which can produce a dump of a subset of the collections. However, this comes at a cost: arangodump has to create a complete physical copy of the data and has to go through the usual transport mechanisms to do so. It is therefore much slower, takes up much more space than a hot backup, and does not produce a consistent picture across collections. arangodump remains available and has its use cases, but the realm of near-instantaneous consistent snapshots is now covered by hot backups.

Intuitive labeling and an equally efficient mechanism for restoring any past compatible snapshot make hot backups simple to use and easy to integrate into the automated administration and maintenance of ArangoDB instances.

The usefulness of local snapshots is evident, but it would be somewhat limiting if DBAs had to manage transfers to archives or to other ArangoDB instances themselves, say for switching hardware or for riskless testing of new versions of database applications. That’s why hot backup comes with the vast feature set of the rclone synchronization program for data transfer. rclone supports virtually every industry standard for secure file transfer, from locally or on-site mounted file systems to cloud protocols like S3, WebDAV, and GCS, you name it.

[Figure: uploading and downloading ArangoDB 3.5.1 hot backups between a cluster and remote storage]

How is it done?

RocksDB offers a mechanism to create “checkpoints”, which we have come to rely on for building this feature. RocksDB checkpoints provide consistent read-only views over the entire state of the key-value store. The word view is actually quite accurate, as it sums up the idea that all checkpoints work on the same dataset, just at different points in time.

This has two desirable consequences. Firstly, our hot backups come at a near-zero cost in resources, CPU as well as RAM. Secondly, they can be obtained instantaneously. We have to acquire a global transaction lock, however, to make sure that we isolate a consistent state for the duration of the snapshot.

The next ingredient is file system hard links, which allow the same data file to be held in multiple directories as if each were a real copy, and which are released individually when no longer needed. This is illustrated in the following image, where a file name in the database directory and another file name in one of the backups both point to the very same piece of data on disk (identified by an inode).

[Figure: two directory entries, one in the database directory and one in a backup, pointing to the same inode on disk]

This saves a lot of disk space, and the actual piece of data (the inode) is only removed when both directory entries have been deleted. Using this file system mechanism, one can minimize the storage cost of hot backups.
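You can observe this behavior on any POSIX file system with a few shell commands; the file names here are made up and merely mimic what a backup directory does with the database files:

```shell
# Create a "database file", then hard-link it into a "backup" directory.
mkdir -p db backup
echo "some rocksdb data" > db/000007.sst
ln db/000007.sst backup/000007.sst

# Both directory entries show the same inode number: no data was copied.
ls -i db/000007.sst backup/000007.sst

# Removing one name does not remove the data; the inode survives until
# the last directory entry referencing it is gone.
rm db/000007.sst
cat backup/000007.sst    # still prints: some rocksdb data
```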

The entire process of creating a snapshot on a single ArangoDB instance then goes as follows: stop all writes to the instance, flush the write-ahead log, create the point-in-time snapshot, and create a new, properly labeled directory with hard links to all database files that are necessary for a consistent data layer.

The restore procedure reverses the above process. Again, one stops I/O operations on the entire instance, switches over to the directory of a former snapshot, and restarts the internal database process.

The cluster

Now take the whole procedure to distributed ArangoDB deployments. Regardless of how you run your ArangoDB cluster, be it on Kubernetes, using the ArangoDB Starter, or in a manual fashion, you will have the whole shebang at your disposal.

Of course, the mere introduction of distribution adds complexity, both to taking snapshots and to restoring them, but, and this is a big but, the interaction remains transparent to the user. The API does not change a bit.

I would be lying, though, if I claimed that there is no perceptible difference at all, and this has to do with the global transaction lock needed to isolate a consistent state. In order to make any consistency guarantees, the lock now needs to engulf the entire cluster, which increases the total time needed to obtain it.

The locking mechanism, however, would not be a good one if it took part of the cluster hostage for a long time while other servers were still engaged in potentially long-running transactions.

So we iteratively try to obtain the lock with growing lock times, releasing the partial lock every time we time out. If the global lock cannot be obtained within a configurable overall timeout, which defaults to 2 minutes, the snapshot eventually fails. However, one can still enforce that a possibly inconsistent snapshot be taken, as even that might be worlds away from total data disaster.

One would experience the repeated lock attempts as growing pauses in operation, interleaved with normal periods in between.
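The loop can be sketched as follows; try_lock here is a stand-in function defined purely for illustration (ArangoDB’s real lock is an internal cluster operation, and the actual growth schedule may differ):

```shell
attempts=0
try_lock() {                  # stand-in: pretend the cluster quiesces on attempt 3
  attempts=$((attempts + 1))
  [ "$attempts" -ge 3 ]
}

deadline=$((SECONDS + 120))   # configurable overall timeout, default 2 minutes
wait=1                        # per-attempt lock timeout, grows after every failure
until try_lock "$wait"; do    # on each timeout, the partial lock is released again
  wait=$((wait * 2))
  if [ "$SECONDS" -ge "$deadline" ]; then
    echo "hot backup failed: could not obtain the global lock" >&2
    exit 1
  fi
done
echo "global lock obtained after $attempts attempts"
```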

Limitations

So these hot backups are truly great for ultra-fast disaster recovery; however, as always in life, there is no free lunch. We need to discuss limitations and caveats.

Hot backups, or snapshots, are exactly that: complete time warps. One cannot restore a single database, let alone a single collection, user, etc. It is all or nothing. This can also be considered a great thing, by the way: one knows what one gets, namely the state as it was properly working 20 minutes or 1 month ago. Still, one needs to appreciate that restoring individual items would necessitate creating backups of databases, collections, users, etc. independently of a snapshot.

As obvious as it may already be, I cannot stress enough that hot backups do not replace proper conventional backups; the two complement each other.

Also, and again this is in the nature of a snapshot, one may only restore a snapshot to an identical topology. That means that a snapshot of a 3-node cluster can only be restored to a 3-node cluster, even if it is the same cluster that has changed a bit down the road.

Also, as the raw data files are tied to the storage layer above them, one obviously cannot use snapshots whose file formats are not yet, or no longer, compatible with the running version of ArangoDB that one wants to restore to. This specifically implies that snapshots created with an ArangoDB 3.5 version might not be usable with a later minor or major version (patch releases within 3.5 will be fine).

All of the above caveats need to be considered carefully when one integrates hot backups into operations and administration processes.

A full cycle

In this section, we will demonstrate the new feature. Assume you have a cluster running at the endpoint tcp://localhost:8529. Then, to take a hot backup, you can simply run:
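A minimal invocation could look like this (a sketch; arangobackup ships with the ArangoDB 3.5.1 distribution, and your setup may need additional authentication options):

```
arangobackup create --server.endpoint tcp://localhost:8529
```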

which will answer (after a few seconds):
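I am paraphrasing the response here; the identifier and the numbers are made up, and the exact log wording differs by version:

```
Backup succeeded. Generated identifier '2019-09-20T07.58.32Z_some-uuid'
Total size: 364930 bytes, number of files: 29, number of DBServers: 3
```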

As you can see, the total number of bytes the backup files take up on disk, the number of data files, and the number of DBServers at the time the backup was taken are reported. Furthermore, the backup has been given its own unique identifier, which by default consists of a timestamp and a UUID.

What happens behind the scenes is that each of my three DBServers keeps its share of the hot backup on its own disk. Note that it shares the actual files with the normal database directory, so no new disk space is needed at this stage. We can display the backup with the list command of arangobackup:
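For example:

```
arangobackup list --server.endpoint tcp://localhost:8529
```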

which yields:
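With made-up values, each backup appears as an entry along these lines (the field names are close to, but may not exactly match, your version’s output):

```
"2019-09-20T07.58.32Z_some-uuid": {
  "version": "3.5.1",
  "datetime": "2019-09-20T07:58:32Z",
  "sizeInBytes": 364930,
  "nrFiles": 29,
  "nrDBServers": 3,
  "potentiallyInconsistent": false
}
```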

As you can see, the above meta information about the backup is shown, too; we will talk about the potentiallyInconsistent flag further down.

However, this is not actually a good backup yet, since it resides on the same machine, or even the same disk, and even shares files with the database. So we need to upload the data to some safe place; this could, for example, be an S3 bucket in the cloud. Here is how you could achieve this:
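A sketch of the upload call; the identifier is a made-up example, rclone-S3.json is a hypothetical name for the configuration file discussed next, and buck150969/backups is the bucket and directory used in this walkthrough:

```
arangobackup upload --server.endpoint tcp://localhost:8529 \
  --identifier "2019-09-20T07.58.32Z_some-uuid" \
  --rclone-config-file rclone-S3.json \
  --remote-path "S3://buck150969/backups"
```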

You specify a path for the target repository (just a directory in an S3 bucket). Here, buck150969 is the name of my S3 bucket and backups is a directory name in there. You supply the credentials and the rest of the configuration in a JSON file. Here, we have used a file like this:
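Something along these lines, with placeholder credentials; the remote name S3 matches the prefix of the remote path, and rclone’s S3 backend documentation lists all available options:

```json
{
  "S3": {
    "type": "s3",
    "provider": "aws",
    "env_auth": "false",
    "access_key_id": "XXXXXXXXXXXXXXXXXXXX",
    "secret_access_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "region": "us-east-1",
    "acl": "private"
  }
}
```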

(credentials invalidated, of course!)

This is an asynchronous operation, since it can take a while to upload the files. Therefore, this command only initiates the transfer and the actual work will be done in the background:
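The gist of the response is a handle for the background job; with a made-up status id it reads roughly:

```
Backup initiated, use
    arangobackup upload --status-id=114
to query the progress.
```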

However, it does tell you how to track the progress of the upload. You can use this command to get this information:
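Assuming the made-up status id from this example:

```
arangobackup upload --server.endpoint tcp://localhost:8529 --status-id=114
```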

which gives you a nice overview of the state of the transfer. Here, it was so fast that I only got the completed state, but even for larger backups the upload eventually completes. To demonstrate that, and the download operation, we first delete the local copy like this:
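A sketch, with the made-up identifier from this walkthrough:

```
arangobackup delete --server.endpoint tcp://localhost:8529 \
  --identifier "2019-09-20T07.58.32Z_some-uuid"
```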

If you try this yourself, verify with the arangobackup list command that the backup is indeed gone.

Let’s now download the backup again. The command is very similar to the upload command:
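A sketch, reusing the made-up identifier and the hypothetical rclone configuration file from this walkthrough:

```
arangobackup download --server.endpoint tcp://localhost:8529 \
  --identifier "2019-09-20T07.58.32Z_some-uuid" \
  --rclone-config-file rclone-S3.json \
  --remote-path "S3://buck150969/backups"
```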

Again, you only get information about how to query the progress of the transfer. Once the status query reports completion, the backup is there again, as arangobackup list will confirm.

And now we can of course have our big moment, the restore. If you try this at home, simply modify your database cluster by deleting or adding stuff and then do a restore:
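A sketch with the made-up identifier from this walkthrough; note that the servers restart their internal database processes during the restore:

```
arangobackup restore --server.endpoint tcp://localhost:8529 \
  --identifier "2019-09-20T07.58.32Z_some-uuid"
```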

and again, after just a few seconds, the tool indicates that the restore is complete. If you now inspect the data in your database, you will see the state as it was when you took the hot backup. It remains to show you one more thing, which is crucial to achieving a consistent snapshot across your cluster. We need to get a global transaction lock, that is, we have to briefly stop all writing transactions on all DBServers at the same time. That means that if a longish write operation is ongoing, we cannot do this on short notice. Let me demonstrate.

First, execute a long-running write transaction in arangosh:
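For example, with a hypothetical collection c (the AQL SLEEP function pauses for the given number of seconds while each document is fabricated):

```
arangosh --server.endpoint tcp://localhost:8529 --javascript.execute-string '
  db._create("c");
  db._query("FOR i IN 1..25 INSERT { i: i, s: SLEEP(1) } INTO c");
'
```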

This will take some 25 seconds, since the SLEEP function used in fabricating each document sleeps for one second. The write transaction for this query will stay open for just as long.

If you try to create a hot backup during this time (I intentionally lower the timeout of how long to try):
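Assuming the client-side flag is called --max-wait-for-lock (check arangobackup --help for your version), a 5-second limit looks like this:

```
arangobackup create --server.endpoint tcp://localhost:8529 --max-wait-for-lock 5
```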

you will get a disappointing result after 5 seconds (provided the above query is still running): within the upper limit of 5 seconds it was not possible to halt all write transactions, since one of them runs for 25 seconds, and so the backup attempt fails.

One way to solve this problem is to stop writes on the client side, or to switch the cluster into read-only mode. This ensures that you can take a consistent hot backup across your cluster.

If you cannot interrupt your writes at all but you do not really care about the consistency, you can use this switch:
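Assuming the switch is called --allow-inconsistent (it has also gone by --force in some builds; again, arangobackup --help knows for sure):

```
arangobackup create --server.endpoint tcp://localhost:8529 \
  --max-wait-for-lock 5 --allow-inconsistent true
```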

and it will, even if the above query is still running, create a hot backup, but warn you that the result may not be consistent.

Note that this is also reflected in the backup list: the entry for such a backup has its potentiallyInconsistent attribute set to true.

This is where the above-mentioned potentiallyInconsistent flag comes into play. If you have allowed inconsistent backups, this flag tells you that the backup could indeed be inconsistent across multiple DBServers.

Conclusion

We believe that hot backup is an important and very valuable new feature, so give it a try!

Life is still an optimization problem, but the problems have become much more benign 🙂