An Introduction to Geo Indexes and their performance characteristics: Part II


Geo Index Implementation

This section covers the MMFiles-based geo index. The algorithm is optimized for in-memory access and good CPU cache utilization. The main goal for our geo queries is to reject as many distant candidate points as quickly as possible.

An approach that relies purely on geostrings has a limitation when querying for points near a target (see blog post Part I): points that are close together on the surface can end up with entirely different geostring prefixes, so they cannot be scanned without seeks. We implemented a type of metric tree to optimize for nearest-neighbor queries.
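The prefix problem can be illustrated with a minimal sketch. It assumes a plain quadtree cell string over the unit square rather than the Hilbert geostrings used by the index; `quadKey` and `commonPrefixLength` are hypothetical helpers written for this illustration only.

```typescript
// Hypothetical helper: encode a point in the unit square as a quadtree cell string.
// Each character names the quadrant (0..3) chosen at one subdivision level.
function quadKey(x: number, y: number, levels: number): string {
  let minX = 0, maxX = 1, minY = 0, maxY = 1;
  let key = "";
  for (let i = 0; i < levels; i++) {
    const midX = (minX + maxX) / 2;
    const midY = (minY + maxY) / 2;
    const right = x >= midX;
    const upper = y >= midY;
    key += String((upper ? 2 : 0) + (right ? 1 : 0));
    // Shrink the bounding box to the chosen quadrant.
    if (right) { minX = midX; } else { maxX = midX; }
    if (upper) { minY = midY; } else { maxY = midY; }
  }
  return key;
}

// Number of leading characters two keys have in common.
function commonPrefixLength(a: string, b: string): number {
  let n = 0;
  while (n < a.length && n < b.length && a[n] === b[n]) {
    n++;
  }
  return n;
}

// Two points 0.002 apart that straddle the first split line at x = 0.5,
// and a third point that lies on the same side as the first.
const west = quadKey(0.499, 0.3, 8);
const east = quadKey(0.501, 0.3, 8);
const neighbor = quadKey(0.497, 0.3, 8);

console.log(commonPrefixLength(west, east));     // 0: no shared prefix at all
console.log(commonPrefixLength(west, neighbor)); // 8: identical cell
```

Although `west` and `east` are very close, a prefix scan starting from one would never reach the other, which is why a prefix scan alone is not enough for nearest-neighbor queries.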

To achieve consistently fast queries, the Hilbert geostrings are combined with a binary search tree; the current implementation uses an AVL tree. Read more

ArangoJS 6.0.0 released: Load Balancing, Automated Failover and completely written in TypeScript


Version 6.0.0 of the JavaScript driver arangojs is now available (find it on GitHub).

This is a major release that introduces a small number of breaking changes, so make sure to check out the arangojs changelog before upgrading. The most significant additions in this release are support for load balancing and automated failover, as well as improved browser and TypeScript support.

In order to use the new load balancing and failover features you can pass an array of URLs instead of a single URL when creating a new arangojs instance. The driver will automatically cycle between the URLs when the active server becomes unreachable. To cycle between servers for every request you can additionally provide the loadBalancingStrategy option:
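A minimal sketch of the two behaviours described above, assuming a config object with the `url` array and `loadBalancingStrategy` option mentioned in the text. `HostPicker` and the placeholder URLs are hypothetical and only model the selection logic; this is not the arangojs implementation, and the strategy values `"NONE"` and `"ROUND_ROBIN"` are assumptions of this sketch.

```typescript
// Assumed strategy names for this sketch.
type Strategy = "NONE" | "ROUND_ROBIN";

interface DriverConfig {
  url: string[];
  loadBalancingStrategy: Strategy;
}

// Hypothetical model of host selection, not the arangojs implementation.
class HostPicker {
  private readonly config: DriverConfig;
  private active = 0;

  constructor(config: DriverConfig) {
    this.config = config;
  }

  // Returns the URL to use for the next request.
  next(): string {
    const url = this.config.url[this.active];
    if (this.config.loadBalancingStrategy === "ROUND_ROBIN") {
      this.advance(); // cycle on every request
    }
    return url;
  }

  // Failover: move on only if the unreachable server is still the active one.
  markUnreachable(url: string): void {
    if (this.config.url[this.active] === url) {
      this.advance();
    }
  }

  private advance(): void {
    this.active = (this.active + 1) % this.config.url.length;
  }
}

// Placeholder URLs, for illustration only.
const failover = new HostPicker({
  url: ["http://server-a:8529", "http://server-b:8529"],
  loadBalancingStrategy: "NONE",
});
console.log(failover.next()); // http://server-a:8529
failover.markUnreachable("http://server-a:8529");
console.log(failover.next()); // http://server-b:8529
```

With `"ROUND_ROBIN"` the same picker would hand out a different URL on every call to `next()`, wrapping around at the end of the list.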

When using ArangoDB in a cluster it’s also possible to auto-populate the URL list using the new acquireHostList method:

When using ArangoDB in standalone mode, leader/follower failover is handled automatically for each request. Read more

ArangoDB Fulltext Index


The ArangoDB Fulltext index allows you to search for text in arbitrary strings. It is a great way to implement things like autocompletion, product searches or many other use-cases which need some form of fulltext search.
The fulltext index is a good fit if your use case can be broken down to:

  • Full matches of words
  • Prefix matches of words
  • Results that do not need a “ranking” of the matching documents
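The three criteria above can be made concrete with a small in-memory sketch. `WordIndex` is a hypothetical toy structure, not how ArangoDB stores or queries its fulltext index; it splits text on anything other than ASCII letters and digits, which is a simplification.

```typescript
// Hypothetical in-memory word index, for illustration only.
class WordIndex {
  private readonly postings = new Map<string, Set<string>>();

  // Tokenize the text into lowercase words and record the document under each word.
  add(docId: string, text: string): void {
    const words = text.toLowerCase().split(/[^a-z0-9]+/);
    words.forEach((word) => {
      if (word.length === 0) {
        return;
      }
      let ids = this.postings.get(word);
      if (ids === undefined) {
        ids = new Set<string>();
        this.postings.set(word, ids);
      }
      ids.add(docId);
    });
  }

  // Documents that contain exactly this word.
  complete(word: string): string[] {
    const ids = this.postings.get(word.toLowerCase());
    return ids === undefined ? [] : Array.from(ids).sort();
  }

  // Documents that contain any word starting with the prefix.
  prefix(start: string): string[] {
    const wanted = start.toLowerCase();
    const result = new Set<string>();
    this.postings.forEach((ids, word) => {
      if (word.indexOf(wanted) === 0) {
        ids.forEach((id) => result.add(id));
      }
    });
    return Array.from(result).sort();
  }
}

const products = new WordIndex();
products.add("p1", "Red running shoes");
products.add("p2", "Running shorts, red");
products.add("p3", "Blue shirt");

console.log(products.complete("red")); // [ 'p1', 'p2' ]
console.log(products.prefix("sh"));    // [ 'p1', 'p2', 'p3' ]
console.log(products.complete("run")); // []
```

The results are sorted by document id only to keep the output deterministic. There is no relevance score, which matches the third criterion: every matching document counts equally.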

Read more

An Introduction to Geo Indexes and their performance characteristics: Part I


Starting with the mass-market availability of smartphones and continuing with IoT devices and self-driving cars, ever more data is generated with geo information attached to it. Analyzing this data in real time requires clever indexing data structures. Geo data in ArangoDB consists of two or more dimensions representing (x, y) coordinates on the Earth's surface. Searching on a single number is essentially a solved problem, but searching multi-dimensional data efficiently is more difficult because standard indexing techniques cannot be used.

A variety of indexing techniques exist. In this blog post, Part I, I will introduce the background knowledge required to understand the ArangoDB geo index data structure. I will start by introducing quadtrees and then extend this concept to geohashes and space-filling curves like the Hilbert curve. Next week, I will publish Part II, including details about the ArangoDB geo index implementation and performance benchmarking.
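As a taste of the space-filling curve idea, here is a minimal sketch of the textbook conversion from a grid cell to its position along a Hilbert curve. It assumes an n × n integer grid where n is a power of two; `hilbertIndex` is a hypothetical name, and this is not taken from the ArangoDB source.

```typescript
// Map cell (x, y) of an n x n grid (n a power of two) to its distance
// along the Hilbert curve. Textbook algorithm, shown for illustration.
function hilbertIndex(n: number, x: number, y: number): number {
  let d = 0;
  for (let s = n >> 1; s > 0; s >>= 1) {
    const rx = (x & s) > 0 ? 1 : 0;
    const ry = (y & s) > 0 ? 1 : 0;
    // Each quadrant at this level contributes a block of s * s cells.
    d += s * s * ((3 * rx) ^ ry);
    // Rotate or flip the quadrant so the sub-curve is oriented correctly.
    if (ry === 0) {
      if (rx === 1) {
        x = n - 1 - x;
        y = n - 1 - y;
      }
      const t = x;
      x = y;
      y = t;
    }
  }
  return d;
}

// On a 2 x 2 grid the curve visits the cells in a "U" shape.
const order = [[0, 0], [0, 1], [1, 1], [1, 0]].map(([x, y]) => hilbertIndex(2, x, y));
console.log(order); // [ 0, 1, 2, 3 ]
```

The useful property is that cells with consecutive curve positions are always direct neighbors on the grid, so a one-dimensional sort order preserves much of the two-dimensional locality.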
Read more

ArangoDB 3.3 GA
DC2DC Replication, Encrypted backup, Server-Level Replication and more


Just in time for the holidays we have a nice present for you all: ArangoDB 3.3. This release focuses on replication, resilience and stability, and thus on general production readiness for use cases both small and very large. There are improvements for the Community Edition as well as for the Enterprise Edition. We sincerely hope to have found the right balance between them.

In the Community Edition there are:

  • Easier server-level replication
  • A resilient active/passive mode for single server instances with automatic failover
  • RocksDB throttling for increased guaranteed write performance
  • Faster collection and shard creation in the cluster
  • Lots of bug fixes (most of them have been backported to 3.2)

In the Enterprise Edition there are:

  • Datacenter to datacenter replication for clusters
  • Encrypted backup and restore

Read more

Spring is coming! – ArangoDB meets Spring Data


This year we got a lot of requests from our customers to provide Spring Data support for ArangoDB. So we listened and teamed up with one of our bigger customers from the financial sector to develop a Spring Data implementation for ArangoDB. We have also made an extensive demo on how to use Spring Data ArangoDB with an example data set of Game of Thrones characters and locations. So, Spring is not only coming… it is already there!

Read more

Introducing the new ArangoDB Java driver with load balancing and advanced fallback


The newest release 4.3.2 of the official ArangoDB Java driver comes with load balancing for cluster setups and advanced fallback mechanics.

Load balancing strategies

Round robin

The Java driver provides two different strategies for load balancing. The first and most common is round robin. As the name suggests, the driver iterates through a list of known coordinators in the cluster, so each database operation uses a different coordinator than the one before.

Most database operations can be handled by this simple logic, but AQL queries need something smarter. When the result of an AQL query is not fully returned in a single response, we have to stick to a specific coordinator. In this case, ArangoDB creates state in the form of a local cursor on the coordinator that the initial query was sent to. Every following request for more batches of the result has to go to the same coordinator. Unlike most simple standalone load balancers, the driver is able to take care of that.
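The cursor stickiness described above can be sketched as follows. This is a simplified model written in TypeScript for illustration; `CoordinatorBalancer` and its methods are hypothetical names and do not reflect the Java driver's API or internals.

```typescript
// Hypothetical model of round robin with cursor affinity.
class CoordinatorBalancer {
  private readonly coordinators: string[];
  private nextIndex = 0;
  // Remembers which coordinator holds the server-side cursor.
  private readonly cursorHome = new Map<string, string>();

  constructor(coordinators: string[]) {
    this.coordinators = coordinators.slice();
  }

  // Plain operations rotate through the coordinators.
  pick(): string {
    const host = this.coordinators[this.nextIndex];
    this.nextIndex = (this.nextIndex + 1) % this.coordinators.length;
    return host;
  }

  // The initial query is balanced like any other operation,
  // but the coordinator that created the cursor is recorded.
  openCursor(cursorId: string): string {
    const host = this.pick();
    this.cursorHome.set(cursorId, host);
    return host;
  }

  // Follow-up batches go to the coordinator that holds the cursor.
  nextBatch(cursorId: string): string {
    const host = this.cursorHome.get(cursorId);
    if (host === undefined) {
      throw new Error("unknown cursor " + cursorId);
    }
    return host;
  }

  // Once the result is exhausted, the affinity can be dropped.
  closeCursor(cursorId: string): void {
    this.cursorHome.delete(cursorId);
  }
}

const balancer = new CoordinatorBalancer(["coord-1", "coord-2", "coord-3"]);
console.log(balancer.pick());               // coord-1
console.log(balancer.openCursor("query-1")); // coord-2
console.log(balancer.pick());               // coord-3
console.log(balancer.nextBatch("query-1")); // coord-2
```

A standalone load balancer that only sees individual HTTP requests has no such cursor-to-coordinator map, which is why this logic is easier to get right inside the driver.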
Read more

ArangoDB Named Best Free Graph Database by G2 Crowd Users


ArangoDB named by G2 Crowd users as the most popular graph database used today.

ArangoDB has been identified as the highest-rated graph database, based on its high customer satisfaction and likelihood-to-recommend ratings from real G2 Crowd users.

ArangoDB received a near perfect 4.9 out of 5 star average for user satisfaction for its free platform across its 24 user reviews. ArangoDB users point to the database’s query language, availability and storage as the three most liked features of the product. Read more
