Running ArangoDB Cluster on Debian and the Merii Hummingbird A80 Optimus – Part 1

To run performance tests on ArangoDB clusters, we wanted a set of decent, non-virtualized hardware with a fast Ethernet connection, enough RAM (since that’s what ArangoDB needs) and a multicore CPU. Since you need a bunch of them, cheap ARM development boards come to mind. The original Raspberry Pi (we have those) is out of the game because V8 no longer supports it. The now available Pi 2 doesn’t cut it, since its Ethernet NIC is connected via USB (as on the original Pi). The Odroid boards offer only one of the two: fast Ethernet or enough RAM. The Cubieboard 4 wasn’t available yet, but its Allwinner A80 SoC seemed a good choice. Then we came across the Merii Optimus board, which seems to be almost the same as the PCDuino (now renamed to Arches) with the A80. While we got a bunch of them for a decent price over at Pollin, the upstream support wasn’t that good.

However, with some help from the SunXi-Linux Project we started flashing OS images to replace the preloaded Android image with the Merii Linux image. Since the userland of the Merii image is pretty sparse, we wanted something more usable. There is already a how-to on running Ubuntu, but it requires a Windows host. We prefer a Linux host and want to run Debian. Since the new Pi 2 is also able to run regular Debian on ARMv7, we picked the root fs from sjoerd.

String Comparison Performance


We’ve been using Callgrind with its powerful frontend KCachegrind for quite some time to analyse where the hot spots inside ArangoDB are. One thing that always accounted for a huge chunk of the resource usage was string comparison. Yes, string comparison isn’t as cheap as one may think, but it has been even a bit more expensive than one would expect. And since much of the business of a database is string comparison, it’s used a lot.

ArangoDB and V8 use the ICU library for this purpose (there are no alternatives on the market), so we rely heavily on its performance. One line in the ICU change log, ‘Performance: string comparisons significantly faster’, made us sit up and take notice.

So the objective was crystal clear: take advantage of these performance improvements. As we use the ICU bundled with V8, we first had to make sure the new version would work smoothly with V8 ;-). After rolling out the upgrade, we wanted to check that everything still worked fine under valgrind etc., and get some figures on how large the actual improvement is.

Comparison: Lockless programming with atomics in C++ 11 vs. mutex and RW-locks


ArangoDB is multithreaded and able to use several CPU cores at once. Because of that, data structures shared between these threads have to be protected from concurrent access. ArangoDB currently uses mutexes, spinlocks and RW-locks for that. With the ongoing development of MVCC, the number of situations where protected access is needed grows significantly. If locking is done too often, scalability is effectively limited to one core. So this test was done to estimate the costs and evaluate an alternative: so-called lockless programming with atomics.
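
To give a feel for what such a comparison can look like, here is a minimal C++11 sketch that increments a shared counter from several threads, once guarded by a std::mutex and once using a std::atomic. It is not the benchmark used for the post and not ArangoDB code; the names timeThreads, countWithMutex, countWithAtomic and CounterResult are made up for illustration, and a plain counter is assumed as the simplest possible shared data structure.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: run `work` on `numThreads` threads and
// return the elapsed wall-clock time in milliseconds.
template <typename Work>
double timeThreads(unsigned numThreads, Work work) {
  assert(numThreads > 0);
  auto start = std::chrono::steady_clock::now();
  std::vector<std::thread> threads;
  for (unsigned i = 0; i < numThreads; ++i) {
    threads.emplace_back(work);
  }
  for (auto& t : threads) {
    t.join();
  }
  std::chrono::duration<double, std::milli> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count();
}

// Final counter value plus the time it took to reach it.
struct CounterResult {
  std::uint64_t value;
  double millis;
};

// Variant 1: every increment takes and releases a mutex.
CounterResult countWithMutex(unsigned numThreads, std::uint64_t iterations) {
  std::uint64_t counter = 0;
  std::mutex guard;
  double ms = timeThreads(numThreads, [&counter, &guard, iterations] {
    for (std::uint64_t i = 0; i < iterations; ++i) {
      std::lock_guard<std::mutex> lock(guard);
      ++counter;
    }
  });
  return {counter, ms};
}

// Variant 2: lockless, every increment is one atomic read-modify-write.
// Relaxed ordering is enough here because only the counter itself is shared.
CounterResult countWithAtomic(unsigned numThreads, std::uint64_t iterations) {
  std::atomic<std::uint64_t> counter{0};
  double ms = timeThreads(numThreads, [&counter, iterations] {
    for (std::uint64_t i = 0; i < iterations; ++i) {
      counter.fetch_add(1, std::memory_order_relaxed);
    }
  });
  return {counter.load(), ms};
}
```

Calling countWithMutex(4, 1000000) and countWithAtomic(4, 1000000) from a main function and printing the millis fields gives a rough first impression; both must end with the same counter value. On Linux the program typically has to be built with -pthread. The timings depend heavily on the CPU, the compiler and the amount of contention, so they say little about more complex data structures.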

