
ArangoDB 3.0 Alpha Release: Getting Closer to the Future

There is a German saying: “If it takes long enough, it will be all right in the end.” However, since just “all right” isn’t our quality standard, this first alpha of 3.0 took us a bit longer to finish than planned. We’d like to invite you to give this fully tested alpha a serious spin, test the new functionality, and share your thoughts and feedback with us on Slack in our “feedback30” channel.

In this short release note you’ll find: 1) a quick overview of the most important changes; 2) instructions on how to get the new version; and 3) how to get your (test) data from a 2.x version into the 3.0 alpha, which implements our new binary storage format VelocyPack.

For those who haven’t read about VelocyPack yet: we successfully performed a kind of open-heart surgery on the ArangoDB core by implementing our very own binary storage format, VelocyPack. VelocyPack is even more compact than MessagePack and promises to further improve ArangoDB’s query response times and memory usage. Before VelocyPack we used two formats internally (ShapeJSON and tri_json_t), which led to a lot of duplicate code. We tested existing formats but didn’t find one that met our needs. Now that we only have one storage format, we can simplify and, hopefully, speed up our development cycle significantly.

We think this alpha version is definitely worth testing. 3.0 alpha consists of 173,000 lines of code, not to mention the 290,000 lines of test code. All our tests are green and even our performance tests are running smoothly (benchmark results are looking good; we’ll post a new performance benchmark blog post with the 3.0 release).

What’s new in 3.0 alpha:

Under the hood

  • We performed open-heart surgery on the ArangoDB core and implemented our own binary storage format, VelocyPack (VPack), for storing documents, query results and temporarily computed values.
  • The _from and _to attributes of edges are now updatable and usable in indexes.
  • ArangoDB 3.0 reduces memory allocations for queries and significantly speeds up query response times.
  • Unified API for CRUD operations.
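The updatable _from and _to attributes can be exercised directly from AQL. This is a minimal sketch; the edge collection “knows”, the document key and the vertex “persons/bob” are hypothetical names invented for the example:

```aql
// Repoint an existing edge at a different target vertex.
// "knows" and "persons/bob" are made-up names for illustration.
FOR e IN knows
  FILTER e._key == "alice-to-carol"
  UPDATE e WITH { _to: "persons/bob" } IN knows
```

In 2.x, changing an edge’s endpoints required removing and re-inserting the edge; in 3.0 it is a plain update.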

Cluster Management

  • Master/master setup: Clients can talk to any node; there is no dedicated master.
  • Shared-nothing architecture: No single point of failure.
  • Synchronous replication: We have implemented synchronous replication with automatic failover. The number of replicas for each shard can be configured per collection. This considerably improves reliability in a distributed setup.
  • Self-organizing cluster state management: We developed our own cluster management agency, which organizes itself. Cluster setup and maintenance got much easier with 3.0.
  • Preparations for the Jepsen test: We got rid of etcd, which turned out to be too slow for our goals, and implemented Raft with transactions and callbacks instead. With these improvements we can take on the challenge of the Jepsen test later this year.

AQL

AQL now uses VelocyPack internally for storing intermediate values. For many value types it can now get by without extra memory allocations and with fewer internal conversions. Values can be passed into internal AQL functions without copying them. This leads to reduced execution times for queries that use C++-based AQL functions.

Foxx Updates

  • Legacy Mode for 2.8 services
  • Dedicated Foxx Documentation with how-tos and examples
  • No more global variables and magical comments
  • Repository and Model have been removed: Instead of repositories use ArangoDB collections directly
  • Controllers have been replaced with nestable routers
  • Routers can be named and reversed: No more memorizing URLs
  • Simpler express-like middleware

Complete overhaul and simplification of graph functions

We have learned a lot over the past two years. Now we’re seizing the opportunity of this new major release to simplify ArangoDB. Our philosophy is that every minor release is backward compatible; that’s why we use major releases for in-depth changes when needed.

Throughout the 2.x releases you have seen many updates and improvements to our graph capabilities. During this period we wanted to stay as backwards compatible as possible. As a result, the number of graph functions drastically increased and using them became more and more complicated. We have now streamlined the graph functions and created a more flexible and faster API that is feature-rich yet easy to learn. The graph functions are now implemented in C++ and allow better index utilization.

With the release of 3.0 we want to take the chance to unify and simplify the overall graph API in AQL. We took the radical decision to integrate all graph features natively into the language and get rid of all the functions we were using until now. Making these features native enables us to optimize all of them, for instance by using more efficient indexes, improving the overall query execution order, filtering early, and so on. Furthermore, all graph features are now more flexible: you can apply any AQL-based filter, not only fixed examples.
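To sketch what this native integration looks like, consider a traversal written directly in AQL. The graph name “social”, the start vertex “persons/alice” and the age filter are invented for this example:

```aql
// 2.x needed a specialized function, e.g.:
//   RETURN GRAPH_NEIGHBORS("social", "persons/alice", { direction: "outbound" })
//
// In 3.0 the traversal is part of the language itself:
FOR v, e, p IN 1..2 OUTBOUND "persons/alice" GRAPH "social"
  FILTER v.age > 30        // any AQL filter, not only fixed examples
  RETURN v.name
```

Because the traversal is a native language construct, the optimizer can reason about filters and indexes across the whole query instead of treating the graph function as a black box.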

In summary, here is what you gain:

  • Better performance
  • Fewer context switches
  • More automatic optimization
  • Better-suited indexes
  • Detailed explain output for graph features
  • More flexible filters
  • Easier to use
  • No guessing which specialized function performs best in your case
  • No confusion between GRAPH_-prefixed and edge collection functions

We also simplified a lot of minor details. For example, we removed the double negation in the startup parameters: instead of --server.disable-authentication false you now write --server.authentication true. Much easier to read, isn’t it? We will document all these improvements and changes in the next few days. The Docker container is ready to run, and all these changes have been incorporated there.

We will also make some further simplifications to the user manager and cluster start. Stay tuned.

How to get ArangoDB 3.0 alpha

Please note that this alpha is NOT for production use! It’s suitable for testing only!

If you want to try out the new 3.0 alpha, you can start a Docker container. This will not disturb any existing ArangoDB 2.x installation. Just run:

docker run -it -e ARANGO_NO_AUTH=1 -p 9000:8529 arangodb/arangodb-preview

You can now point your browser to port 9000 and play with ArangoDB 3.0. The login is “root” with an empty password. If you are using Docker Machine or something similar, you need to use the IP address of the DOCKER_HOST.

If you want to use the ArangoDB shell, you can start a second instance of the Docker container:

docker run -it arangodb/arangodb-preview /usr/bin/arangosh --server.endpoint tcp://192.168.99.100:9000

Replace 192.168.99.100 with the IP address of your DOCKER_HOST.

Getting Test data into v3.0 alpha

Because the storage engine and the datafile format have changed considerably, you have to dump the data from your current ArangoDB version and restore it into 3.0.

Assume you have a running 2.8 instance. You can export your data using:

arangodump --output-directory example

This will create a dump in the directory “example”. To import the data into ArangoDB 3.0, use:

tar cf - example | docker run -i arangodb/arangodb-preview /bin/bash -c '/bin/tar xf - ; /usr/bin/arangorestore --server.endpoint tcp://192.168.99.100:9000 example'

Again, replace 192.168.99.100 with the IP address of your DOCKER_HOST.

Now you’re good to go! We’re excited about your feedback and first impressions to help us put the final touch on 3.0!

If you have any feedback, we would be happy to hear your thoughts on Slack in our “feedback30” channel.

Julie Ferrario
