Performance Archives - Page 3 of 6 - ArangoDB

On Getting Unique Values (Performance Comparison)

Categories: Performance

While paging through the issues in the ArangoDB issue tracker I came across issue #987, titled “Trying to get distinct document attribute values from a large collection fails”.

The issue was opened around 10 months ago when ArangoDB 2.2 was around. We improved AQL performance somewhat since then, so I was eager to see how the query would perform in ArangoDB 2.6, especially when comparing it to 2.2.

For reproduction I quickly put together some example data to run the query on:


IN-list Improvements

Categories: Performance

The latest devel branch brings another performance improvement: the handling of large IN-lists, which is now much faster than in previous releases. Large IN-lists are normally used when comparing attribute or index values against a big array of lookup values or keys provided by the application.

Read on how this improvement reduces query execution time.
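
To illustrate why the lookup strategy matters for large IN-lists, here is a minimal Python sketch. It is not taken from ArangoDB's implementation (the excerpt above does not say how the improvement was achieved); it only contrasts a linear scan of the lookup list with a binary search over a pre-sorted copy. The function names and the sample data are hypothetical.

```python
import bisect


def filter_linear(docs, attribute, lookup_values):
    """Naive approach: scan the whole lookup list for every document."""
    # `in` on a list is a linear scan, so this is O(len(docs) * len(lookup_values)).
    return [d for d in docs if d[attribute] in lookup_values]


def filter_sorted(docs, attribute, lookup_values):
    """Sort the lookup values once, then binary-search them per document."""
    sorted_values = sorted(set(lookup_values))

    def contains(value):
        i = bisect.bisect_left(sorted_values, value)
        return i < len(sorted_values) and sorted_values[i] == value

    # One sort up front, then O(log n) per document instead of O(n).
    return [d for d in docs if contains(d[attribute])]


# Hypothetical sample data: 1000 documents, lookup list of multiples of 3.
docs = [{"_key": str(i), "value": i} for i in range(1000)]
lookup = list(range(0, 2000, 3))

linear_result = filter_linear(docs, "value", lookup)
sorted_result = filter_sorted(docs, "value", lookup)
```

Both functions return the same documents; only the cost of each membership test differs, and that difference grows with the size of the lookup list.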

String Comparison Performance

Categories: Performance

We’ve been using Callgrind with its powerful frontend KCachegrind for quite some time to analyse where the hot spots are inside ArangoDB. One thing that always accounted for a huge chunk of the resource usage was string comparison. Yes, string comparison isn’t as cheap as one may think, but it’s been even a bit more expensive than one would expect. And since much of the business of a database is string comparison, it’s used a lot.

ArangoDB and V8 use the ICU library for these purposes (with no alternatives on the market), so we rely heavily on the performance of the ICU library. One line in the ICU change-log, ‘Performance: string comparisons significantly faster’, made us sit up and take notice.

So it was a crystal-clear objective to take advantage of these performance improvements. As we use the ICU bundled with V8, we first had to make sure it would work smoothly with V8 ;-). After rolling out the upgrade, we wanted to know whether everything was working fine with valgrind etc., and to get some figures on how big the actual improvement is.

Return Value Optimization for AQL

Categories: Performance

While searching for further AQL query optimizations last week, we found that intermediate AQL query results were copied one time too often in some cases.

More precisely, the data that a query’s ReturnNode returns to the caller was copied into the ReturnNode’s own register. Since ReturnNodes never modify their input data, this called for what compilers know as return-value optimization.

ArangoDB 2.6 will optimize away these copies in many cases, and my blog post Return Value Optimization for AQL shows the performance benefits of 10-25% that can be expected from the optimization.

COLLECTing With a Hash Table

Categories: Performance, Query Language

ArangoDB 2.6 will feature an alternative hash implementation of the AQL COLLECT operation. The new implementation can speed up some AQL queries that cannot exploit indexes on the COLLECT group criteria.

This blog post provides a preview of the feature and shows some nice performance improvements. It also explains the COLLECT-related optimizer parts and how the optimizer will decide whether to use the new or the traditional implementation.
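
The difference between the two grouping strategies can be sketched in a few lines of Python. This is a conceptual illustration, not ArangoDB's code: it assumes the traditional implementation relies on sorted input (which the excerpt above does not state explicitly), and the function names and sample documents are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def collect_sorted(docs, key):
    """Sort-based grouping: sort by the group key, then count adjacent runs."""
    ordered = sorted(docs, key=itemgetter(key))  # O(n log n) unless already sorted
    return [(value, len(list(group)))
            for value, group in groupby(ordered, key=itemgetter(key))]


def collect_hash(docs, key):
    """Hash-based grouping: one pass over unsorted input using a hash table."""
    counts = {}
    for doc in docs:
        counts[doc[key]] = counts.get(doc[key], 0) + 1
    # Groups come out in first-seen order, not sorted by key.
    return list(counts.items())


# Hypothetical sample documents.
docs = [{"city": c} for c in
        ["Cologne", "Berlin", "Cologne", "Paris", "Berlin", "Cologne"]]

sorted_groups = collect_sorted(docs, "city")
hashed_groups = collect_hash(docs, "city")
```

The sort-based variant is cheap when the input already arrives in key order (for example from a sorted index), while the hash-based variant avoids the sort altogether but does not produce ordered output on its own. That trade-off is the kind of decision the optimizer has to make.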


Updating Documents with an Arangoimp Import

Categories: Documentation, Performance

Inspired by the feature request in GitHub issue #1298, we added update and replace support for ArangoDB’s import facilities.

This extends ArangoDB’s HTTP REST API for importing documents plus the arangoimp binary so they can not only insert new documents but also update existing ones.

Inserts and updates can also be mixed in a single import run. What exactly will happen is configurable by setting arangoimp’s new command-line option --on-duplicate.

By default, an error will be reported if a document already exists. This behavior can be changed by setting --on-duplicate to a value of update, replace or ignore. Here is an example result of an import with duplicated keys:

So, if you want to aggregate data from several data files, you can try the new import command-line option --on-duplicate.

In a blog post, Jan provides a few usage examples.
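
The four --on-duplicate behaviours described above can be modelled with a small Python sketch that uses a plain dictionary as the collection. This is a simulation, not arangoimp itself: the function name, the statistics fields, and the sample documents are hypothetical, and it assumes that update merges the imported attributes into the existing document while replace swaps the whole document.

```python
def import_documents(collection, documents, on_duplicate="error"):
    """Simulate an import into `collection`, a dict keyed by document _key."""
    stats = {"created": 0, "updated": 0, "ignored": 0, "errors": 0}
    for doc in documents:
        key = doc["_key"]
        if key not in collection:
            collection[key] = dict(doc)      # new document: plain insert
            stats["created"] += 1
        elif on_duplicate == "update":
            collection[key].update(doc)      # assumed: merge, keep other attributes
            stats["updated"] += 1
        elif on_duplicate == "replace":
            collection[key] = dict(doc)      # assumed: overwrite the whole document
            stats["updated"] += 1
        elif on_duplicate == "ignore":
            stats["ignored"] += 1            # keep the existing document untouched
        else:
            stats["errors"] += 1             # default: report duplicate as an error
    return stats


# Hypothetical data: one existing document, an import with one duplicate key.
collection = {"a": {"_key": "a", "x": 1, "y": 2}}
incoming = [{"_key": "a", "x": 10}, {"_key": "b", "x": 5}]

result = import_documents(collection, incoming, on_duplicate="update")
```

Running the same import with on_duplicate="replace" would drop the y attribute of document a, and with the default setting the duplicate would be counted as an error while document b is still created.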

More Efficient Data Exports with new Export API

Categories: API, Performance

ArangoDB 2.6 provides a specialized export API for exporting all documents from a collection and shipping them to a client application. It is rather limited but faster than the general-purpose AQL cursor API and can store its snapshots using less memory.


A side effect of the speedup is that the first results will arrive much earlier in the client application. This will help in reducing client connection timeouts in case clients are enforcing them on temporarily non-responding connections.

New Cursor API leads to significant performance improvements

Categories: API, Performance

This week we pushed some modifications for ArangoDB’s cursor API into the devel branch. The change will result in less copying of AQL query results between the AQL and the HTTP layers. As a positive side effect, this will reduce the amount of garbage collection the built-in V8 has to do.

These modifications should improve the cursor API performance significantly for many cases, while at the same time keeping its REST API stable. Client programs do not need to be adjusted to reap the benefits. In a blog post, Jan shows some first unscientific performance tests comparing the old cursor API with its new, improved implementation.