Using arangodump and arangorestore with a multi-database installation

When you want to manage full backups, migrate your data between different instances, or make breaking changes to how your cluster data is structured, you will likely find yourself using our arangodump and arangorestore tools. Typically these tools dump or restore a single database at a time, and default to the _system database.

To dump or restore a different database, each tool accepts the --server.database option. For instance, if you have a database foo, you can dump it as follows.
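A minimal sketch of such a call, wrapped in a small shell function so the database name and target directory can vary. The function name dump_database and the directory name dump-foo are illustrative assumptions; --server.database and --output-directory are the arangodump options being passed.

```shell
# Dump a single named database instead of the default _system.
# Usage: dump_database <database> <output-directory>
dump_database() {
  arangodump \
    --server.database "$1" \
    --output-directory "$2"
}

# Example: dump_database foo dump-foo
```

Connection options such as --server.endpoint would need to be added if the server is not reachable at the default address.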

Now, what if you have several databases? In releases prior to ArangoDB 3.5, you would have to run the command once for each database, specifying the name of each database. You could script this, fetching the list of database names and dumping each database to its own directory; similarly you could scan for each dump directory and restore each individually. We decided this was too much work, so we added a simple flag to automate the behavior for you.
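For comparison, the manual approach described above might look roughly like the following sketch. It assumes you already have the list of database names (for example from db._databases() in arangosh or the GET /_api/database HTTP endpoint) and pass them as arguments; the function names and directory layout are hypothetical choices for illustration.

```shell
# Dump each named database into its own subdirectory of a base directory.
# Usage: dump_each_database <base-directory> <database>...
dump_each_database() {
  base_dir=$1
  shift
  for db in "$@"; do
    arangodump \
      --server.database "$db" \
      --output-directory "$base_dir/$db" || return 1
  done
}

# Restore every subdirectory of a base directory into the database
# named after that subdirectory.
# Usage: restore_each_database <base-directory>
restore_each_database() {
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue   # skip the unexpanded pattern if nothing matches
    dir=${dir%/}                # drop the trailing slash left by the glob
    db=$(basename "$dir")
    arangorestore \
      --server.database "$db" \
      --input-directory "$dir" || return 1
  done
}

# Example: dump_each_database backups _system foo bar
#          restore_each_database backups
```

When restoring into a server where these databases do not exist yet, arangorestore's --create-database option would probably be needed as well.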

Both utilities now take the --all-databases option. In order to dump all databases serially and automatically, use a command like the following (adding your own options like --server.endpoint if necessary).
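A minimal sketch of the dump side, assuming an output directory named dump. The wrapper function dump_all_databases is an illustrative helper rather than part of the tool, and any further arguments are passed through to arangodump unchanged.

```shell
# Dump every database on the server, one after another, into
# subdirectories of the given output directory.
# Usage: dump_all_databases <output-directory> [extra arangodump options...]
dump_all_databases() {
  out_dir=$1
  shift
  arangodump \
    --all-databases true \
    --output-directory "$out_dir" \
    "$@"
}

# Example (endpoint value is only a placeholder):
#   dump_all_databases dump --server.endpoint tcp://127.0.0.1:8529
```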

This will dump each database into its own subfolder in the dump directory, so that it can automatically be restored with the following command (again with any necessary additional options like --server.endpoint).
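The matching restore might look like this sketch, again assuming the dump lives in a directory named dump. The wrapper function restore_all_databases is hypothetical, and extra arguments are forwarded to arangorestore as given.

```shell
# Restore every database found in the subdirectories of a dump directory
# that was written with --all-databases.
# Usage: restore_all_databases <input-directory> [extra arangorestore options...]
restore_all_databases() {
  in_dir=$1
  shift
  arangorestore \
    --all-databases true \
    --input-directory "$in_dir" \
    "$@"
}

# Example (endpoint value is only a placeholder):
#   restore_all_databases dump --server.endpoint tcp://127.0.0.1:8529
```

If the target server is missing some of the dumped databases, passing --create-database true as an extra option is likely required so that arangorestore creates them first.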

Limitations

This approach will not guarantee a consistent snapshot in time across databases. It is functionally equivalent to retrieving the list of databases yourself and executing the dump or restore command for each database in sequence.

Accordingly, this will not result in any speedup, as it operates on the list serially, not in parallel.

The --threads option may parallelize operations across collections within a single database at a time. This option is compatible with the --all-databases flag.
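As a rough illustration of combining the two options, the sketch below dumps all databases in sequence while using several threads within each one. The helper name and the thread count in the example are arbitrary assumptions; a suitable value depends on your hardware and workload.

```shell
# Dump all databases serially, using several threads per database.
# Usage: dump_all_databases_threaded <output-directory> <thread-count>
dump_all_databases_threaded() {
  arangodump \
    --all-databases true \
    --threads "$2" \
    --output-directory "$1"
}

# Example: dump_all_databases_threaded dump 4
```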

In the case of a cluster, we have a script, under the Fast Cluster Restore section of our arangorestore documentation, which can distribute the restoration of each collection within a database across multiple coordinators. Using the --all-databases flag will break this script.
