At ArangoDB we’ve got many requests for running our database on Kubernetes. This makes complete sense since Kubernetes is a highly popular system for deploying, scaling and managing containerized applications.

Running any stateful application on Kubernetes is a bit more involved than running a stateless one, because of the storage requirements and potentially other requirements such as static network addresses. Running a database on Kubernetes combines all the challenges of running a stateful application with a quest for optimal performance.

This article explains what is needed to run ArangoDB on Kubernetes and what we’re doing to make it a lot easier.

Throughout the article, we’ll concentrate on running ArangoDB in a cluster arrangement, because that is the most complex arrangement you can have.

Please note that the ArangoDB Kubernetes integration is in an early pre-release stage and is not intended for production use. However, we’ve reached the point where it is interesting to test and “fun to play with”. We highly appreciate your feedback on this 0.0.1 release to get to a production-ready v1.0 faster. Please let us know via GitHub.

ArangoDB cluster introduction

Before we dive into Kubernetes, we’ll first give a short introduction into the structure of an ArangoDB cluster. Such a cluster contains multiple ArangoDB servers running in different roles.

There are typically 3 servers running in the agent role. Together they form the agency which provides a highly available distributed key-value store.

Next are the database servers. This is a set of 2 or more servers running in the dbserver role. They are responsible for storing data.

Finally, there are the coordinators. These are 2 or more servers running in the coordinator role. They are responsible for handling all client requests and coordinating work between the database servers.

In principle, any coordinator can handle work for any client request, but there are exceptions. When executing queries, the coordinator leaves some in-memory state when not all results can be sent back to the client in that request. The client must make additional requests to fetch the remaining data. These requests must be made to the coordinator on which the query was started.

A typical cluster contains 3 agents, 3 dbservers and 3 coordinators.

Every server has a unique identifier and must be reachable on a static network address (DNS name or IP), and all but the coordinators need persistent storage.

Running an ArangoDB cluster in Kubernetes

A first-order approach for running an ArangoDB cluster inside Kubernetes is to use StatefulSets (each with 1 replica) for all individual ArangoDB servers. The reason for using StatefulSets is that they provide enough stability to fulfil the persistent storage requirements for agents and dbservers.

For internal communication in the cluster, we use a headless Service that provides a well-known DNS name for every server (Pod). This DNS name is needed for command line options such as --server.endpoint and --agency.endpoint.
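As a sketch, such a headless Service could look like the following. The resource name and labels here are illustrative assumptions, not the manifests the operator actually generates:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-cluster-int
spec:
  clusterIP: None          # headless: each Pod gets its own stable DNS record
  selector:
    app: arangodb          # hypothetical labels matching all server Pods
  ports:
    - name: server
      port: 8529
```

With this in place, a Pod is reachable at a stable name of the form `<pod-name>.my-cluster-int.<namespace>.svc.cluster.local`, which is what you would pass to options like --server.endpoint.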

For storage, we have a lot of options. Kubernetes provides a very long list of Volume types, often tailored to a specific cloud provider (e.g. AWS or Google). Fortunately, Kubernetes also provides an abstraction in the form of a Persistent Volume Claim. Such a claim is bound to an actual volume when needed, but at deployment time, you do not have to know exactly which type of Volume you have available. All you care about is the size and perhaps some performance aspects of the Volume.
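A minimal Persistent Volume Claim sketch, with an illustrative name and size, looks like this:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dbserver-1-data    # hypothetical claim for one dbserver
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 8Gi         # only size is requested; the Volume type is decided at bind time
```

Note that the claim says nothing about where the storage comes from; Kubernetes binds it to a matching Persistent Volume when a Pod that uses it is scheduled.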

When we combine all this, we end up with an ArangoDB cluster in Kubernetes that is built from:

  • 6 Persistent Volume Claims (3 for agents, 3 for dbservers)
  • 9 StatefulSets (3 for agents, 3 for dbservers & 3 for coordinators)
  • 2 Services (1 for internal communication, selecting all servers and 1 for client communication, selecting only the coordinators)

Are you sure this is right?

The first order approach for running an ArangoDB cluster in Kubernetes yields one loud question. Are you serious?

There is a big list of resources that you have to create, and all of them require a lot of details to be specified. There is a long list of things that can go wrong and we’ve not even talked about authentication or encryption.

So what alternatives exist that would make this a bit easier and less error-prone?

The first alternative would be to use Helm. Helm is a template-based approach for installing larger applications on Kubernetes. It hides a lot of the complexity of all the resources in templates, leaving you with only a handful of options such as the database version, whether to turn on authentication, etc.

This approach is great for deployment and is flexible enough to deal with reasonable variations such as with/without authentication or with/without encryption.

Where this approach falls short is in the upgrade scenario. In a patch level version upgrade of ArangoDB (e.g. 3.3.3 -> 3.3.4) we will never make major changes that require specific upgrade procedures. However, when we upgrade the minor version number, this may happen. If this happens, a special (typically manual) procedure is needed to upgrade the database.

We know that we cannot rule out such changes in the future, so we want our Kubernetes approach to be able to handle that in a way that is as automatic as possible. In order to do so, we need more control than what is possible with a template based approach such as helm.

This leaves one more alternative: making use of Kubernetes Custom Resource Definitions. Such a definition adds a new type of resource to Kubernetes, together with a controller to handle it. This is a very good fit for our needs, since it allows us to create a type of resource that is powerful yet very easy to get started with, and to provide a controller that enables us to handle complex upgrade scenarios in the future.

Introducing the ArangoDB Operator

The approach we’ve chosen for our integration of ArangoDB in Kubernetes is that of an operator. An operator is a combination of one or more Custom Resource Definitions with their controllers.

You will find the code of the ArangoDB Operator here: https://github.com/arangodb/kube-arangodb

The first custom resource that our operator provides is an ArangoDeployment. Running a standard cluster (which took at least 17 resources before) now looks like this:
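The snippet below is a sketch based on an early version of the kube-arangodb repository; the exact apiVersion and field spellings may differ in the release you use:

```yaml
apiVersion: "database.arangodb.com/v1alpha"
kind: "ArangoDeployment"
metadata:
  name: "my-cluster"
spec:
  mode: cluster            # deploy a full cluster: agents, dbservers, coordinators
```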

This resource can be deployed into your Kubernetes cluster using the standard kubectl tool, with a command such as:
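Assuming the resource above is saved in a file called my-cluster.yaml (the filename is illustrative), the command is along these lines:

```shell
# Create the ArangoDeployment resource in the current namespace
kubectl create -f my-cluster.yaml
```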

The ArangoDB Operator will take this resource and create all the underlying resources for you. More importantly, it will also monitor these resources and take action when needed.

If you want to customize your cluster, you can. For example, to create a cluster with 3 agents (that is the default), 5 dbservers and 2 coordinators, use a resource like this:
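A sketch of such a customized resource, again with field names taken from an early version of the operator and subject to change:

```yaml
apiVersion: "database.arangodb.com/v1alpha"
kind: "ArangoDeployment"
metadata:
  name: "my-cluster"
spec:
  mode: cluster
  agents:
    count: 3               # the default, shown here for completeness
  dbservers:
    count: 5
  coordinators:
    count: 2
```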

Scaling your ArangoDB cluster is as easy as running kubectl edit arango/my-cluster and changing the value of the appropriate count field.

Note that this even works with scaling down a cluster, since the operator will carefully make the cluster shuffle its data around to allow shutting down dbserver instances in a controlled fashion!

There are lots of other settings you can customize. E.g. you can choose to deploy a single server ArangoDB instance in Kubernetes using a resource like this:
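A single server deployment sketch could look like this (apiVersion and mode spelling as in the early operator versions, so treat them as assumptions):

```yaml
apiVersion: "database.arangodb.com/v1alpha"
kind: "ArangoDeployment"
metadata:
  name: "my-single-server"
spec:
  mode: single             # one ArangoDB server instead of a full cluster
```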

It is possible to run multiple ArangoDB clusters in the same Kubernetes cluster, even in the same namespace.

What about storage?

The ArangoDB Operator creates Persistent Volume Claims for all servers that need persistent storage. Unfortunately, this is only half the story.

A database like ArangoDB is very dependent on fast local (SSD) storage to achieve optimal performance. Many cloud providers that offer Kubernetes clusters will provision Persistent Volumes that are actually network volumes. Even though their performance is not bad, the added network latency will severely impact the overall performance of your ArangoDB cluster.

Even worse is the situation for those who want to run their Kubernetes cluster on their own bare-metal machines. So far there is no good & easy solution for using the local (SSD) drives in these machines, other than manually provisioning Persistent Volumes on them.

Scheduling servers onto machines is hard to get right, because in an ArangoDB cluster agents (and dbservers) should not be scheduled on the same machine (for high availability), while at the same time the volumes that they use are already bound to a specific machine.

Fortunately, Kubernetes 1.9 comes with an alpha feature called Volume Scheduling. This configures the Kubernetes scheduler to take all affinity settings into account when scheduling Pods onto machines, which greatly reduces the risk of making the wrong scheduling decision. This is essential for local volumes such as Persistent Volumes backed by locally attached SSD drives.
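In manifest terms, this feature is driven by a StorageClass whose binding is delayed until a Pod is scheduled. A sketch, with an illustrative class name:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
provisioner: kubernetes.io/no-provisioner  # local volumes are provisioned out of band
volumeBindingMode: WaitForFirstConsumer    # delay binding until the Pod is scheduled
```

With WaitForFirstConsumer, the scheduler picks a node for the Pod first (honoring anti-affinity between agents and dbservers) and only then binds a volume on that node.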

To make it easier to use locally attached (SSD) drives (in a bare-metal setting or when using a cloud provider that offers locally attached SSDs), the ArangoDB Operator also implements a second custom resource. This resource is called ArangoLocalStorage.

With an ArangoLocalStorage you can easily prepare your Kubernetes cluster to provide Persistent Volumes, on demand, on all (or specific) nodes as a sub-directory of a configured directory.

For example, the following resource results in the automatic, on demand, provisioning of Persistent Volumes on all nodes of the cluster under the /var/lib/arango-storage directory.
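The resource would be along these lines, based on an early version of the kube-arangodb repository (apiVersion and field names may have changed since):

```yaml
apiVersion: "storage.arangodb.com/v1alpha"
kind: "ArangoLocalStorage"
metadata:
  name: "arangodb-local-storage"
spec:
  storageClass:
    name: local-ssd        # StorageClass under which the volumes are offered
  localPath:
    - /var/lib/arango-storage
```

Each provisioned Persistent Volume then lives in its own sub-directory of /var/lib/arango-storage on the node it was created on.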

With the ArangoDB Operator, an ArangoLocalStorage resource and an ArangoDeployment resource, it is only a 5-minute job to set up a Kubernetes cluster on bare-metal machines and get your ArangoDB cluster up and running.

Current status

The ArangoDB Operator that implements the ArangoDeployment and ArangoLocalStorage resources and their controllers is currently under development.

Although we’ve not set a release date yet, we’ve reached the point where you can “start to play with it”, provided you do not depend on it for production use and you understand that things can and will change or break.

If you’re interested in trying the ArangoDB Operator on your own Kubernetes cluster, have a look at our GitHub repo for more info. As this project is still in the development phase, we’re very interested in your input.

Follow our development on GitHub, send us your feedback, and chat with the main devs, Max and Ewout, in the #kubernetes channel of our Community Slack.