The upcoming 2.8 version of ArangoDB will provide several improvements in the area of index usage and query optimization.
First of all, hash and skiplist indexes can now index individual array values. A dedicated post on this will follow shortly. Second, the query optimizer can make use of multiple indexes per collection for queries with OR-combined filter conditions. This again is a subject for another post. Third, there have been some speed improvements due to changes in the general index handling code. This is what this post is about.
In order to assess the speedups in 2.8, I have run some already existing performance tests that I initially ran when comparing ArangoDB 2.5 with 2.6. The test cases and methodology are detailed in this earlier blog post.
To measure the index-related performance improvements, I simply re-ran the index-related tests in 2.7 and in 2.8 / devel. I did not bother re-running all tests from the original blog article because only some of them are index-related. In particular, I only ran these tests again:
- join-key: for each document in the collection, perform a join on the _key attribute on the collection itself (i.e. FOR c1 IN @@c FOR c2 IN @@c FILTER c1._key == c2._key RETURN c1)
- join-id: ditto, but perform the join using the _id attribute
- join-hash-number and join-hash-string: ditto, but join using a hash index on a numeric or string attribute
- join-skiplist-number and join-skiplist-string: ditto, but join using a skiplist index on a numeric or string attribute
- lookup-key, lookup-hash-number, lookup-hash-string, lookup-skiplist-number, lookup-skiplist-string: compile an IN-list of 10,000 lookup values and search these 10,000 documents in the collection using either the primary index (_key attribute), a hash index or a skiplist index. The latter two are tested on numeric and string attributes.
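To make the shape of these tests more concrete, here is a minimal sketch of how the join and lookup queries listed above could be assembled. It only builds AQL query strings and bind parameters and does not talk to a server. The helper names, the lookup query text, the key format and the collection name are assumptions for illustration, not the code used for the actual measurements.

```python
# Sketch only: helper names, key format and collection name are hypothetical.

def join_query(attribute):
    """Build a self-join of a collection on the given attribute (e.g. "_key")."""
    return (
        "FOR c1 IN @@c FOR c2 IN @@c "
        f"FILTER c1.{attribute} == c2.{attribute} RETURN c1"
    )


def lookup_query(attribute, values):
    """Build an IN-list lookup; the values are passed as a bind parameter."""
    query = f"FOR c IN @@c FILTER c.{attribute} IN @values RETURN c"
    return query, {"values": list(values)}


# Hypothetical key format; the post does not say what the lookup values look like.
keys = [f"test{i}" for i in range(10000)]

query, bind_vars = lookup_query("_key", keys)
# The collection bind parameter @@c is supplied under the key "@c".
bind_vars["@c"] = "example"  # hypothetical collection name

print(join_query("_key"))
print(query, len(bind_vars["values"]))
```

Passing the 10,000 values as a single bind parameter keeps the query string short and constant, so the same query text can be reused for the primary, hash and skiplist variants by changing only the attribute name.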
The test queries were run 5 times each on collections containing 10,000, 100,000 and 1,000,000 documents. More