Ready for some hot new stuff? 🙂
Today we can share Release Candidate 3 of ArangoDB 3.5 with you, and we have a world premiere up our sleeves. Get RC3 of ArangoDB 3.5.
With this RC we want to highlight a brand new feature which we called SmartJoins (well, no… this name was actually not born in the marketing department).
SmartJoins allow users to shard multiple collections in a, well, smarter way. Join operations among huge distributed collections can now run much more efficiently, and performance gets very close to that of a single instance. More on SmartJoins and a hands-on tutorial in a bit.
The other new feature we want to highlight is index hints. Index hints allow you to give the query optimizer a hint to use a certain index or even define which index should be used by the optimizer to speed up a given query.
Please note, as always, that Release Candidates are for testing purposes only, so please back up your system before upgrading. In addition, if you have created a view of type arangosearch, please delete the view before upgrading to RC3 and recreate it once the upgrade process is finished.
Less talk, more details…
SmartJoins – Efficient Joins for Sharded Collections
If you are already familiar with SmartGraphs and SatelliteCollections, you might know that team ArangoDB really likes the idea of having the same flexibility users know from non-sharded deployments also in cluster settings. We believe that SmartJoins are another huge step to complete this picture.
Let’s assume you have two huge collections and want or need to shard both of them. Running joins can get very expensive in this scenario, as the data needed for your query may not reside on the same machine. Network latency between DBservers during query processing can get pretty expensive compared to an in-memory lookup on the same machine.
In the schema below, you can see distributed collections sharded with the SmartJoin approach. All data needed for your query resides on the same machine and you can get close to the performance of a single instance, even if you sharded your data.
Collections Sharded with SmartJoin Approach
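To make the co-location idea a bit more concrete, here is a toy model in Python. It is not ArangoDB code: the placement function, the shard count, and the customers/orders collections are all assumptions made up for illustration. It only shows the principle that when both collections are placed by the same join value, every join partner sits on the same shard.

```python
import zlib

NUM_SHARDS = 4  # assumption: both collections use the same number of shards


def shard_for(value: str) -> int:
    """Toy placement function (CRC32 modulo shard count); not ArangoDB's real hashing."""
    return zlib.crc32(value.encode("utf-8")) % NUM_SHARDS


# Hypothetical data: customers are placed by their own key, orders by the
# customer key they reference, so join partners land on the same shard.
customers = [{"_key": f"c{i}"} for i in range(20)]
orders = [{"_key": f"o{i}", "customer": f"c{i % 20}"} for i in range(100)]

customer_shards = {s: {} for s in range(NUM_SHARDS)}
for c in customers:
    customer_shards[shard_for(c["_key"])][c["_key"]] = c

order_shards = {s: [] for s in range(NUM_SHARDS)}
for o in orders:
    order_shards[shard_for(o["customer"])].append(o)


def local_join():
    """Join orders to customers shard by shard, never looking at another shard."""
    pairs = []
    for s in range(NUM_SHARDS):
        for o in order_shards[s]:
            c = customer_shards[s].get(o["customer"])
            if c is not None:
                pairs.append((c["_key"], o["_key"]))
    return pairs


joined = local_join()
print(len(joined), "of", len(orders), "orders joined without leaving their shard")
```

In this model no order ever has to look up its customer on another shard, which is the property that lets a distributed join behave much like a local one.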
We are a bit nervous but also excited to hear about real world experiences with this new feature. If you’d like to give it a spin and learn how to apply it to your use case, please check out the SmartJoin tutorial. Would be great to get your feedback and ideas where to apply SmartJoins. Let us know via firstname.lastname@example.org.
Index Hints & Named Indexes
Index hints and named indexes both aim to simplify index usage in ArangoDB.
Previously, if more than one index was available for a query, the optimizer decided automatically which index to use, with no way to influence that choice manually. Although it makes an educated decision based on cost estimations, the optimizer might not make the optimal decision for your scenario.
With index hints, you can now influence the usage of indexes in two ways. First, you can provide a hint to the optimizer for an index you think could work better in a given query. Second, you can enforce the usage of a certain index and let the query fail if that index cannot be used (e.g. to avoid full collection scans).
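As a rough sketch of what the two modes look like in a query, the Python helper below only builds AQL text using the indexHint and forceIndexHint options on FOR. The collection name products, the index name byValue, and the filtered attribute value are hypothetical, and sending the query to a server is left out.

```python
import json


def hinted_query(collection: str, index_name: str, force: bool = False) -> str:
    """Build an AQL query string that hints (or forces) a named index.

    The collection, index name, and filter attribute are illustrative only.
    """
    options = {"indexHint": index_name}
    if force:
        # With forceIndexHint the query fails if the hinted index cannot be used.
        options["forceIndexHint"] = True
    # AQL object literals accept quoted attribute names, so JSON output works here.
    return (
        f"FOR doc IN {collection} OPTIONS {json.dumps(options)} "
        "FILTER doc.value == @value RETURN doc"
    )


soft_hint = hinted_query("products", "byValue")
hard_hint = hinted_query("products", "byValue", force=True)
print(soft_hint)
print(hard_hint)
```

The first query leaves the final choice to the optimizer, while the second one trades flexibility for a guarantee that the query never silently falls back to a full collection scan.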
Named indexes allow users to specify a name for their indexes. Prior to ArangoDB 3.5, an index was identified only by a pretty long number, which was hard to remember. With 3.5 you can now choose a name for any index you create manually.
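For illustration, an index definition with a name could be assembled like this. It is a minimal sketch that only builds the definition object of the kind passed to ensureIndex() in arangosh or sent to the index HTTP API; the helper function, the index name byValue, and the field value are hypothetical.

```python
import json


def named_index_definition(name: str, fields, index_type: str = "hash") -> dict:
    """Return an index definition that carries a user-chosen name.

    The helper itself is hypothetical; 'name' is the attribute added in 3.5.
    """
    return {"type": index_type, "fields": list(fields), "name": name}


definition = named_index_definition("byValue", ["value"])
print(json.dumps(definition))
```

A name chosen this way is also what an index hint refers to, so a short, descriptive name pays off twice.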
You can dive a bit deeper into both features in the index hints tutorial.
Take the new release candidate for a test-drive and get RC3 here.
We hope these three new features are useful for you and your projects. If you have any feedback on the RC, please let us know via GitHub and specify the RC version you are using.
Happy testing and thanks a lot in advance for any feedback!
Known issues of RC3:
- For everybody who has created a view of type arangosearch and wants to upgrade ArangoDB from 3.4.x to RC3 of ArangoDB 3.5, we recommend deleting the view in 3.4.x, then upgrading to RC3 and creating the view anew.
Other known issues can be found in the devel docs.