Additional Features of ArangoDB Enterprise
The ArangoDB 3 development phase is focused on solid scalability across all supported data models. Version 3.0 introduced a completely overhauled cluster architecture, including our agency, which ensures high availability in a cluster environment with no single point of failure. Together with our binary storage format VelocyPack, this created a basis for upcoming innovations. The first big innovation is now implemented in ArangoDB 3.1 Enterprise Edition, and we call it SmartGraphs (download ArangoDB Enterprise Edition). Besides SmartGraphs, the Enterprise Edition comes with two other useful features. All three at a glance:
- SmartGraphs: Scale with graphs into a cluster and stay performant. This unique feature enables you to explore entirely new spheres in graph usage and provides nearly the same graph traversal performance as a single-instance setup
- Enhanced Encryption: Choose your level of SSL encryption
- Auditing: Keep a detailed log of everything that was written or read in ArangoDB
With ArangoDB 3.0 you could already scale graphs horizontally and achieve the same performance that other distributed graph solutions provide today. The new SmartGraphs feature is our answer to the problem of network hops, which occur when a graph is sharded across a cluster. Several of our internal tests show 40 to 120 times the traversal performance compared to a setup without SmartGraphs.
In projects in areas such as IoT, finance, communication, healthcare or genomics you will see really large graphs. Within those graph datasets, we recognized a natural distribution: highly interconnected communities with only rare edges between them. With the generally strong trend towards graph data models, we will also see rapidly growing graph datasets, which raises the question of how to handle billions or even trillions of connected data points. Such amounts of data do not fit onto one machine, and sometimes, for many reasons, they shouldn’t be kept in a single instance.
SmartGraphs - The problem of scaling with graphs
The community edition of ArangoDB is already capable of handling large datasets on a single instance (vertical scalability) and can even scale out to a cluster with all three data models (horizontal scalability). But when you shard a graph across a cluster, you might see performance that is not suitable for your use case. Why is this?
What you can see below in figure 1 is the situation without ArangoDB SmartGraphs. The graph is sharded across three instances (DB server 1, 2 and 3). In order to traverse this graph, the traversal has to hop between those three machines, and each hop adds network latency. In this example, the traversal needs 6 network hops.
A network hop adds network latency, which is pretty expensive compared to in-memory computation. With several of those network hops, the measured performance may no longer be suitable for a given use case.
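To make the idea of a network hop concrete, here is a minimal Python sketch that counts how many steps of a traversal cross from one DB server to another. The vertex names, the `placement` mapping and the helper `count_network_hops` are hypothetical and chosen for illustration only; they are not taken from the figure and do not reflect how ArangoDB places data internally.

```python
# Hypothetical placement of vertices on DB servers (illustration only).
placement = {"a": 1, "b": 2, "c": 1, "d": 3, "e": 2}


def count_network_hops(path, placement):
    """Count the traversal steps whose two endpoints live on different servers."""
    return sum(
        1
        for src, dst in zip(path, path[1:])
        if placement[src] != placement[dst]
    )


# A traversal that visits the vertices in this order.
path = ["a", "b", "c", "d", "e"]
print(count_network_hops(path, placement))
```

Every step in this path changes servers, so each of the four steps pays network latency instead of staying in memory on one machine.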
The new SmartGraphs feature of the ArangoDB Enterprise Edition solves the problem of network hops by using the smartness of your application layer. A graph itself is kind of dumb. It doesn’t know anything about itself. But your application is smart. During the last year, we recognized a pattern that seems to be valid in many graph use cases: in many datasets, there are highly interconnected communities but not many connections between those communities. You can think of those communities as e.g. your customers, regions or any other logic by which you can shard your graph. This knowledge and some secret sauce are used to shard the graph in a very efficient way and reduce network hops to a minimum.
By using ArangoDB SmartGraphs for the case mentioned above you’d get the following result as shown in figure 2 below:
What happens now is that ArangoDB SmartGraphs uses the smartness of your application layer to shard your graph data by a certain logic, which might be a customer ID, a region or any other logic that fits your main queries.
With this smartness of your application layer you can shard the highly connected communities within your graph to specified instances.
Figure 3 shows the sharding situation with ArangoDB SmartGraphs. The example traversal now needs only one network hop, and each highly connected community within your graph lives on a single instance. First internal tests show a 40-120x performance gain for traversals.
This new feature will enable Enterprise Edition users to work on completely new use cases or optimize current graph-based applications. If you now raise your eyebrow, let us show you how you can set up such a high-performance cluster with two clicks.
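The effect of sharding by community can be sketched in a few lines of Python. This is a toy model under stated assumptions, not ArangoDB's actual sharding algorithm: the vertex names, the `region` values, the round-robin placement (a stand-in for any distribution that ignores graph structure) and the helper `cross_server_edges` are all hypothetical.

```python
# Toy graph: two tightly connected communities ("eu", "us") and one edge between them.
vertices = {
    "alice": "eu", "bob": "eu", "carol": "eu",
    "dave": "us", "erin": "us", "frank": "us",
}
edges = [
    ("alice", "bob"), ("bob", "carol"), ("carol", "alice"),
    ("dave", "erin"), ("erin", "frank"), ("frank", "dave"),
    ("carol", "dave"),  # the one rare edge between the communities
]


def cross_server_edges(edges, placement):
    """Count edges whose endpoints are stored on different servers."""
    return sum(1 for src, dst in edges if placement[src] != placement[dst])


# Structure-agnostic placement: spread vertices round-robin over 3 servers.
by_key = {name: index % 3 for index, name in enumerate(vertices)}

# Community-aware placement: every vertex of a region goes to the same server.
region_to_server = {"eu": 0, "us": 1}  # assumed mapping, chosen by the application
by_community = {name: region_to_server[region] for name, region in vertices.items()}

print(cross_server_edges(edges, by_key))        # 7
print(cross_server_edges(edges, by_community))  # 1
```

In this toy example, the round-robin placement cuts all seven edges, while grouping by region leaves only the single inter-community edge crossing servers. The numbers describe this example only and say nothing about real-world speedups.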
Auditing is an important tool for both compliance and forensic analysis of data breaches. ArangoDB audit records provide an irrefutable record of the actions taken, whether they are generated by a database, directory, or operating system. The information stored in the ArangoDB audit log includes all important database actions:
- Creation or deletion of databases
- Creation or deletion of collections
- Creation or deletion of indices
- Read access on documents
- Altering of queries
Let us show you the power of ArangoDB Enterprise Edition and how we can contribute to your project with our 20+ years of database experience. Request a demo or an introduction call via the form.