SatelliteGraphs feature in the upcoming ArangoDB 3.7

SatelliteGraphs in ArangoDB 3.7

Open In Colab

ArangoDB is a distributed database that can query large datasets spread across multiple nodes. Scale comes at a price, though: in this case, network traffic and coordination.

When executing queries involving graph traversals, shortest path, or k-shortest paths computations in an ArangoDB cluster, data has to be exchanged between different servers. In particular, graph traversals are usually executed on a Coordinator, because they need global information. This can result in a lot of network traffic and slow query execution.

SatelliteGraphs are the natural extension of SatelliteCollections to graphs. SatelliteCollections improve join operations by replicating a small collection to all nodes; SatelliteGraphs apply the same idea to all collections of a graph.

ArangoDB, being a multi-model database, is often used where large amounts of data live in collections sharded across multiple database nodes for scalability and performance.

Consider, for example, the massive amount of sensor data generated by IoT use cases. The corresponding metadata describing the individual sensors (location, type, accuracy, …) is stored in a graph, which allows simple graph queries to retrieve a particular subset of sensors. A simplified version of this use case is shown in the following Jupyter notebook. You can see the output in this article or click the Open in Colab button to get access to a temporary ArangoDB Oasis database and run it for yourself.

The first few code blocks contain some of the setup:

  1. Install and import necessary packages
  2. Set up a function that provides us with a temporary Oasis database
  3. Set up a simple cleanup function
In [0]:
!git clone -b oasisConnector --single-branch
!rsync -av ArangoNotebooks/ ./ --exclude=.git
!pip3 install pyarango
!pip3 install "python-arango>=5.0"
In [0]:
import json
import requests
import sys
import pprint
import oasis

from pyArango.connection import *
from pyArango.collection import Collection, Edges, Field
from pyArango.graph import Graph, EdgeDefinition
from pyArango.collection import BulkOperation as BulkOperation
In [0]:
def cleanupCollections(db):

Now, connect to the temporary Oasis database and clean up the collections.

In [18]:
pp = pprint.PrettyPrinter()

# Retrieve tmp credentials from ArangoDB Tutorial Service
login = oasis.getTempCredentials(tutorialName='satelliteGraphs37', tempURL='')

## Connect to the temp database
conn = oasis.connect(login)
db = conn[login["dbName"]] 

# Cleanup (just in case the example is rerun)
cleanupCollections(db)
Requesting new temp credentials.
Temp database ready to use.
{'dbName': 'TUTu0n0wpjwuopfgxrnwiq3',
 'hostname': '',
 'password': 'TUT2wpdufx1ug5ea1wn47l4th',
 'port': 8529,
 'username': 'TUTp29lfmdha1hdeq2a4qw64h'}

For this example, we generate 100 placeholder sensor documents and save them to the Sensordata collection, which stands in for the large, sharded IoT data.

In [19]:
# Define a large (i.e., in reality sharded) collection
collection = db.createCollection(name="Sensordata")
docs = []
for i in range(100):
    doc = collection.createDocument()
    doc["id"] = i
    doc["data"] = "Large amount of data"
    docs.append(doc)  # collect the documents for a single bulk insert

# Returns number of inserted documents
collection.bulkSave(docs)

Setting up a SatelliteGraph requires the same type of graph definition as a regular graph, but we call the createSatelliteGraph function instead.
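For readers who talk to the server without pyArango: in the HTTP API, a SatelliteGraph is, as far as we can tell from the 3.7 documentation, a named graph created with the `replicationFactor` option set to the string `"satellite"`. The sketch below only builds that request body for `POST /_api/gharial` and does not send it; the helper name `satellite_graph_payload` is hypothetical and not part of any library.

```python
import json


def satellite_graph_payload(name, edge_definitions):
    """Build a request body for creating a SatelliteGraph (hypothetical helper).

    edge_definitions is a list of (edge_collection, from_collections,
    to_collections) tuples.
    """
    return {
        "name": name,
        "edgeDefinitions": [
            {"collection": coll, "from": list(frm), "to": list(to)}
            for coll, frm, to in edge_definitions
        ],
        # Assumption: "satellite" asks the cluster to keep a replica of every
        # collection of this graph on each DB-Server.
        "options": {"replicationFactor": "satellite"},
    }


payload = satellite_graph_payload(
    "MySatelliteGraph",
    [("SensorLocation", ["Location"], ["Sensor"])],
)
print(json.dumps(payload, indent=2))
```

The edge definition mirrors the one used in this notebook: edges in SensorLocation point from Location vertices to Sensor vertices.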

Once the graph has been created, we can add our vertices and edges to it.

In [20]:
# Vertex collection holding the sensor locations
class Location(Collection):
    _fields = {
        "location": Field()
    }

# Vertex collection holding the sensors
class Sensor(Collection):
    _fields = {
        "id": Field()
    }

# Edge collection connecting locations to sensors
class SensorLocation(Edges):
    _fields = {
        "lifetime": Field()
    }

class MySatelliteGraph(Graph):
    _edgeDefinitions = [EdgeDefinition("SensorLocation", fromCollections=["Location"], toCollections=["Sensor"])]
    _orphanedCollections = []

theSatelliteGraph = db.createSatelliteGraph("MySatelliteGraph")
print("Our first SatelliteGraph: " + str(theSatelliteGraph))

# Add data to MySatelliteGraph
s1 = theSatelliteGraph.createVertex('Sensor', {"id": 1})
s2 = theSatelliteGraph.createVertex('Sensor', {"id": 2})
l1 = theSatelliteGraph.createVertex('Location', {"location": "CA"})
l2 = theSatelliteGraph.createVertex('Location', {"location": "WA"})

# Connect each location to its sensor
theSatelliteGraph.link('SensorLocation', l1, s1, {"lifetime": "eternal"})
theSatelliteGraph.link('SensorLocation', l2, s2, {"lifetime": "eternal"})
Our first SatelliteGraph: ArangoGraph: MySatelliteGraph
ArangoEdge '_id: SensorLocation/14020089, _key: 14020089, _rev: _apjeYl6--_, _to: Sensor/18020027, _from: Location/16020130': <store: {'lifetime': 'eternal'}>

Without SatelliteGraphs, the query below would involve a lot of network traffic, because it would need to fetch all the data before it could execute the graph traversal.

But as the graph-based metadata is small, we can define it as a SatelliteGraph, which is synchronously replicated to all DB-Servers that are part of a cluster. DB-Servers can then execute graph traversals, shortest path, and k-shortest paths computations locally. Having all collections defined in the graph stored locally greatly improves performance for such queries, while still maintaining the benefits of a distributed environment.
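To make the effect of replication concrete, here is a deliberately simplified toy model. It is not how ArangoDB schedules queries; it only counts how many servers would have to reach across the network for the small metadata when that metadata lives on a single server versus on every server. The function name `remote_lookups` and the placement labels are made up for this illustration.

```python
def remote_lookups(servers, placement):
    """Count servers that must fetch the small metadata over the network.

    servers:   number of DB-Servers, each holding one shard of the large data
    placement: "single" if the metadata lives on server 0 only,
               "satellite" if every server holds a full replica
    """
    if placement not in ("single", "satellite"):
        raise ValueError("placement must be 'single' or 'satellite'")

    remote = 0
    for server in range(servers):
        if placement == "satellite":
            continue  # a local replica answers the lookup
        if server != 0:
            remote += 1  # this shard has to ask server 0 for the metadata
    return remote


print(remote_lookups(3, "single"))     # 2
print(remote_lookups(3, "satellite"))  # 0
```

In this model the cost of the non-replicated layout grows with the number of servers, while the replicated layout stays at zero remote lookups. The price is that every write to the small collection has to be copied to all servers.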

In [21]:
# Join between the SatelliteGraph and "sharded" collection
print("Joining SatelliteGraph and 'sharded' collection")
aql = """
FOR loc IN Location
    FILTER loc.location == "CA"
    FOR sensor IN 1..1 OUTBOUND loc._id GRAPH "MySatelliteGraph"
      // Join with large collection
      FOR sensordata IN Sensordata
        FILTER sensordata.id == sensor.id
        RETURN {
          "sensor": sensor.id,
          "data": sensordata.data
        }
"""

queryResult = db.AQLQuery(aql, rawResults=True, batchSize=1)
document = queryResult[0]
print(document)

# Next Steps
print("If you are running this notebook in Google Colab, use these credentials to access the ArangoDB Web UI at:")
print("Username: " + login["username"])
print("Password: " + login["password"])
Joining SatelliteGraph and 'sharded' collection
{'sensor': 1, 'data': 'Large amount of data'}

If you are running this notebook in Google Colab, use these credentials to access the ArangoDB Web UI at:
Username: TUTp29lfmdha1hdeq2a4qw64h
Password: TUT2wpdufx1ug5ea1wn47l4th

If you would like to dive deeper into this example, feel free to use the Explain feature from the ArangoDB Web UI.

If you have been running the Colab up to this point, simply use the credentials that were generated for you above.

Otherwise, if you have not run the notebook in Colab, click the Open in Colab button at the top of the page.
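When reading an explain result, one thing to look for is how many nodes in the execution plan involve communication between servers. The sketch below assumes the plan is available as a dictionary with a `nodes` list whose entries carry a `type` field, which is the shape the explain API returns to the best of our knowledge. The set of "cluster" node types is our assumption, the `sample_plan` is hand-written for illustration and is not real output from this notebook, and the helper name `cluster_nodes` is hypothetical.

```python
from collections import Counter

# Assumption: these node types indicate data moving between servers.
CLUSTER_NODE_TYPES = {"RemoteNode", "ScatterNode", "GatherNode", "DistributeNode"}


def cluster_nodes(plan):
    """Return a {node type: count} dict for the cluster-related plan nodes."""
    counts = Counter(node["type"] for node in plan["nodes"])
    return {t: n for t, n in counts.items() if t in CLUSTER_NODE_TYPES}


# Hand-written example plan, NOT real explain output.
sample_plan = {
    "nodes": [
        {"type": "SingletonNode"},
        {"type": "EnumerateCollectionNode"},
        {"type": "TraversalNode"},
        {"type": "RemoteNode"},
        {"type": "GatherNode"},
        {"type": "ReturnNode"},
    ]
}

print(cluster_nodes(sample_plan))
```

Comparing these counts for the same query against a regular graph and against a SatelliteGraph is one way to check whether the traversal is actually executed locally on the DB-Servers.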

Please keep in mind that this database is temporary and will be automatically deleted. If you would like to have a permanent deployment with ArangoDB Oasis, sign up for free!

If you would like to continue exploring ArangoDB and all of the new features of 3.7, you can download the beta here.