GeoJSON tutorial

Motivation

Starting with the mass-market availability of smartphones and continuing with IoT devices and self-driving cars, ever more data is generated with geo information attached to it. Analyzing this data in real time requires clever indexing data structures.

Up to version 3.3, ArangoDB supported a basic subset of geospatial functions, which were able to use an existing geo index:

  • NEAR(): Return documents which are near a given coordinate pair.
  • WITHIN(): Return documents which are located inside a specified circle.
  • WITHIN_RECTANGLE(): Return documents which are located inside a specified rectangle.

Calculating e.g. the distance between two coordinate pairs or checking whether a coordinate pair is located inside a polygon was possible, but those functions could not benefit from the geo index optimizations. Those operations need to be as fast as possible to prevent them from becoming a show stopper.

Of course, speed is not everything, so we also want to provide a broader set of geo functionality by integrating full GeoJSON support including Polygons, Multi-Polygons and other geometry primitives.

With these functionalities, one can do more complex queries and build e.g. location-aware recommendation engines by combining the graph data model with geo-location aspects or use multiple data models. For instance, in the age of self-driving cars, one can find the nearest available maintenance team (geo query) with the right permission (graph model) to repair a given problem (sent automatically to the DB as e.g. a JSON document or key/value pair).

Full GeoJSON Support in ArangoDB 3.4

ArangoDB 3.4 includes the new GeoJSON functionality based on Google’s S2 geospatial index. It replaces the old geospatial index implementation, while the old geospatial functions and index definitions remain supported. We support indexing on a subset of the GeoJSON standard, as well as simple latitude-longitude pairs (non-GeoJSON mode). This article also introduces the new geo functions and geo constructors. In addition, the AQL Editor has been improved to render GeoJSON-based data directly with OpenStreetMap.

Improved AQL Editor

While improving ArangoDB’s backend and working with several geo use cases during development, we noticed that something was missing. Even though everything was working as expected and our tests were green, the geo feature did not feel complete. Restaurants, districts, streets etc. are all locations on our earth’s surface. So you want to see them! That is why we also implemented a new layer inside the WebUI’s AQL Editor which detects whether your queries return GeoJSON objects. And if that is the case, we’ll show them!

GeoJSON

GeoJSON is an open standard format, based on JSON, for geographical features or data structures including their properties and spatial extents. It started as a community project by developers, but today it is fully specified and published by the IETF (RFC 7946). GeoJSON uses a geographic coordinate reference system (WGS84) and units of decimal degrees for representing positions on earth with their longitude and latitude values.

ArangoDB currently supports Point, MultiPoint, LineString, MultiLineString, Polygon and MultiPolygon as GeoJSON types.
A GeoJSON representation of a Point:

{ "coordinates": [10.0, 20.0], "type": "Point" }

Here, 10.0 is the longitude value and 20.0 is the latitude value. Note that GeoJSON positions always list longitude first.
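
Because the longitude-first order is easy to get wrong, a small client-side check can catch swapped values before documents are stored. The following is only a minimal sketch in plain JavaScript; isValidGeoJsonPoint and toGeoJsonPoint are hypothetical helper names and not part of ArangoDB, and the { lat, lng } input shape is an assumption for illustration.

```javascript
// Hypothetical helper, not part of ArangoDB: checks that a value looks like
// a GeoJSON Point with coordinates in [longitude, latitude] order.
function isValidGeoJsonPoint(value) {
  if (value === null || typeof value !== "object") return false;
  if (value.type !== "Point" || !Array.isArray(value.coordinates)) return false;
  const [lon, lat] = value.coordinates;
  return (
    Number.isFinite(lon) && Number.isFinite(lat) &&
    lon >= -180 && lon <= 180 &&
    lat >= -90 && lat <= 90
  );
}

// Hypothetical helper: converts a { lat, lng } object (an assumed input
// shape) into a GeoJSON Point, putting longitude first.
function toGeoJsonPoint({ lat, lng }) {
  return { type: "Point", coordinates: [lng, lat] };
}

console.log(isValidGeoJsonPoint({ type: "Point", coordinates: [10.0, 20.0] }));  // true
console.log(isValidGeoJsonPoint({ type: "Point", coordinates: [10.0, 120.0] })); // false, latitude out of range
```

A range check like this cannot detect every swapped pair (both values may be valid in either position), so it is a safety net rather than a guarantee.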

GeoJSON Supported Index

To create a geospatial index that supports GeoJSON on a collection named restaurants, use this command in arangosh:

db.restaurants.ensureIndex({
  type: "geo",
  fields: [ "location" ],
  geoJson: true
});

The newly created index expects a valid GeoJSON object in the location property of each document. A valid document would look like this:

{
  "location": {
    "coordinates": [-73.97632519999999, 40.6748163 ],
    "type": "Point"
  },
  "name": "Crab Spot Restaurant"
}

Invalid documents will be ignored by the index. You can also create indices via the Web UI.

Added AQL Features

This chapter introduces the new GeoJSON constructors (helpers) and geospatial functions, which can be optimized by the geo index.

Constructors

To keep things simple, ArangoDB offers some AQL constructor functions to create those types easily:

GEO_POINT(10.0, 20.0)

returns the GeoJSON Point representation shown above, whereas the GEO_POLYGON function

RETURN GEO_POLYGON([
  [6.591796875, 47.54687159892238],
  [14.5458984375, 47.635783590864854],
  [14.1943359375, 54.62297813269033],
  [6.50390625, 54.54657953840501],
  [6.591796875, 47.54687159892238]
])

will create the following GeoJSON output (representing a simplified shape around Germany):

{
  "coordinates": [
    [
      [6.591796875, 47.54687159892238],
      [14.5458984375, 47.635783590864854],
      [14.1943359375, 54.62297813269033],
      [6.50390625, 54.54657953840501],
      [6.591796875, 47.54687159892238]
    ]
  ],
  "type": "Polygon"
}
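
Note that the first and last position of the ring are identical: RFC 7946 requires a polygon ring to be closed and to contain at least four positions. If you assemble polygons in application code before sending them to the database, a helper along the following lines can enforce this. This is a sketch in plain JavaScript; makePolygon is a hypothetical name and does not describe how GEO_POLYGON() works internally.

```javascript
// Hypothetical helper: builds a GeoJSON Polygon from one outer ring of
// [longitude, latitude] positions, closing the ring if it is still open.
function makePolygon(ring) {
  if (!Array.isArray(ring) || ring.length < 3) {
    throw new Error("a ring needs at least three positions");
  }
  const first = ring[0];
  const last = ring[ring.length - 1];
  const closed = first[0] === last[0] && first[1] === last[1];
  // Append a copy of the first position when the ring is not closed yet.
  const outer = closed ? ring.slice() : [...ring, first.slice()];
  if (outer.length < 4) {
    throw new Error("a closed ring needs at least four positions");
  }
  return { type: "Polygon", coordinates: [outer] };
}

// The same simplified shape around Germany, given as an open ring.
const germany = makePolygon([
  [6.591796875, 47.54687159892238],
  [14.5458984375, 47.635783590864854],
  [14.1943359375, 54.62297813269033],
  [6.50390625, 54.54657953840501]
]);
console.log(germany.coordinates[0].length); // 5, the first position was appended
```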

Available constructors are:

  • GEO_POINT()
  • GEO_LINESTRING()
  • GEO_MULTILINESTRING()
  • GEO_POLYGON()
  • GEO_MULTIPOLYGON()

To get all usage information, please refer to the geo function documentation.

Functions

With ArangoDB 3.4+ we’ve included new geo functions, which are much more powerful than those in our previous releases. This chapter introduces each function with a small example snippet:

GEO_DISTANCE()

Return all restaurants within a maximum distance of 30 km from the Statue of Liberty. Distances are given in meters.

LET statueOfLiberty = GEO_POINT(-74.044500, 40.689306)
FOR restaurant IN restaurants
  FILTER GEO_DISTANCE(statueOfLiberty, restaurant.location) <= 30000
  RETURN restaurant.location

Return all restaurants with a distance between 25km and 30km starting from the Statue of Liberty.

LET statueOfLiberty = GEO_POINT(-74.044500, 40.689306)
FOR restaurant IN restaurants
  FILTER GEO_DISTANCE(statueOfLiberty, restaurant.location) <= 30000
  FILTER GEO_DISTANCE(statueOfLiberty, restaurant.location) >= 25000
  RETURN restaurant.location

Return the 100 nearest restaurants, sorted by their distance from the Statue of Liberty.

LET statueOfLiberty = GEO_POINT(-74.044500, 40.689306)
FOR restaurant IN restaurants
  SORT GEO_DISTANCE(statueOfLiberty, restaurant.location) ASC
  LIMIT 100
  RETURN restaurant.location
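
To get a feeling for the numbers involved, the distance between two points can be approximated outside the database with the haversine formula. The sketch below is plain JavaScript and assumes a spherical Earth with a mean radius of 6371 km; haversineDistance is a hypothetical helper, and its results may differ slightly from what GEO_DISTANCE() returns.

```javascript
// Mean Earth radius in meters; an assumed constant for this sketch.
const EARTH_RADIUS_M = 6371000;

const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance in meters between two GeoJSON Points,
// using the haversine formula on a spherical Earth.
function haversineDistance(a, b) {
  const [lon1, lat1] = a.coordinates;
  const [lon2, lat2] = b.coordinates;
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  // Clamp to 1 to guard against rounding errors for near-antipodal points.
  return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1, Math.sqrt(h)));
}

const statueOfLiberty = { type: "Point", coordinates: [-74.0445, 40.689306] };
const crabSpot = { type: "Point", coordinates: [-73.97632519999999, 40.6748163] };

const meters = haversineDistance(statueOfLiberty, crabSpot);
console.log(meters <= 30000); // true, the restaurant is well within 30 km
```

Computing this for every document in application code would require a full scan, which is exactly what the geo index avoids.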

GEO_CONTAINS()

First get the document representing the district “Chinatown”. Then iterate through all available restaurants and only return those which are located inside “Chinatown”.

FOR n IN neighborhoods
  FILTER n.name == "Chinatown"
  LET chinatown = n
  FOR restaurant IN restaurants
    FILTER GEO_CONTAINS(chinatown.geometry, restaurant.location)
    RETURN restaurant.location
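
Conceptually, a containment check answers the question "is this point inside that ring?". A classic way to decide this on a flat plane is ray casting, sketched below in plain JavaScript. ringContains and polygonContainsPoint are hypothetical helpers; the sketch treats coordinates as planar, which is only an approximation for small areas, and it is not how ArangoDB implements GEO_CONTAINS().

```javascript
// Ray-casting test on a flat plane: counts how often a horizontal ray
// starting at the point crosses the ring's edges. Odd count means inside.
function ringContains(ring, [x, y]) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Hypothetical helper: a point is inside a Polygon if it lies within the
// outer ring and within none of the holes.
function polygonContainsPoint(polygon, point) {
  const [outer, ...holes] = polygon.coordinates;
  return (
    ringContains(outer, point.coordinates) &&
    !holes.some((hole) => ringContains(hole, point.coordinates))
  );
}

// A made-up square area for illustration.
const area = {
  type: "Polygon",
  coordinates: [[[0, 0], [4, 0], [4, 4], [0, 4], [0, 0]]]
};
console.log(polygonContainsPoint(area, { type: "Point", coordinates: [2, 2] })); // true
console.log(polygonContainsPoint(area, { type: "Point", coordinates: [5, 2] })); // false
```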

GEO_INTERSECTS()

Define an area in NYC that potentially covers several neighborhoods. Then return all neighborhoods that intersect the defined area (Polygon).

LET someAreaInNYC = GEO_POLYGON([
  [-74.02587890625, 40.70536767492135],
  [-73.97335052490234, 40.71135347314246],
  [-73.90434265136719, 40.797957124643666],
  [-73.98193359375, 40.814328907637126],
  [-74.02587890625, 40.70536767492135]
])
FOR n IN neighborhoods
  FILTER GEO_INTERSECTS(someAreaInNYC, n.geometry)
  RETURN n.geometry
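
An exact polygon intersection test is fairly involved, but a cheap necessary condition is easy to state: two shapes can only intersect if their bounding boxes overlap. The sketch below shows this coarse pre-check in plain JavaScript. boundingBox and boxesOverlap are hypothetical helpers, the second polygon is made up for illustration, shapes crossing the antimeridian are ignored, and the check says nothing about how GEO_INTERSECTS() is implemented.

```javascript
// Axis-aligned bounding box of a GeoJSON Polygon's outer ring.
function boundingBox(polygon) {
  const outer = polygon.coordinates[0];
  const lons = outer.map(([lon]) => lon);
  const lats = outer.map(([, lat]) => lat);
  return {
    minLon: Math.min(...lons), maxLon: Math.max(...lons),
    minLat: Math.min(...lats), maxLat: Math.max(...lats)
  };
}

// Two boxes overlap unless one lies entirely to one side of the other.
function boxesOverlap(a, b) {
  return a.minLon <= b.maxLon && b.minLon <= a.maxLon &&
         a.minLat <= b.maxLat && b.minLat <= a.maxLat;
}

const someAreaInNYC = {
  type: "Polygon",
  coordinates: [[
    [-74.02587890625, 40.70536767492135],
    [-73.97335052490234, 40.71135347314246],
    [-73.90434265136719, 40.797957124643666],
    [-73.98193359375, 40.814328907637126],
    [-74.02587890625, 40.70536767492135]
  ]]
};

// A made-up polygon far away from NYC (roughly around Berlin).
const farAway = {
  type: "Polygon",
  coordinates: [[[13.0, 52.0], [14.0, 52.0], [14.0, 53.0], [13.0, 52.0]]]
};

console.log(boxesOverlap(boundingBox(someAreaInNYC), boundingBox(someAreaInNYC))); // true
console.log(boxesOverlap(boundingBox(someAreaInNYC), boundingBox(farAway)));       // false
```

Overlapping boxes do not prove that the shapes intersect, so a positive result still needs an exact test.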

GEO_EQUALS()

This function checks whether two GeoJSON objects are equal.

LET A = GEO_POINT(1.0, 2.0)
LET B = GEO_POINT(3.0, 4.0)
RETURN {
  "AA": GEO_EQUALS(A, A), // true
  "AB": GEO_EQUALS(A, B) // false
}

Try it out

This blog post uses a dataset of restaurants and neighborhoods in New York City. To run all examples, create two collections called neighborhoods and restaurants and import the dataset with arangoimport or directly via ArangoDB’s WebUI.
The dataset can be found here:

It contains two JSON files. The neighborhood dataset consists of GeoJSON Polygons representing districts in NYC (~200 entries). The restaurant dataset contains restaurants located in NYC which are stored as GeoJSON Points (~20k entries).

Import with arangoimport:

./arangoimport restaurants.json --collection restaurants 
./arangoimport neighborhoods.json --collection neighborhoods

After the successful import, create the indices as explained in the chapter GeoJSON Supported Index. Do not forget to create the index for the neighborhoods collection as well:

db.restaurants.ensureIndex({ type: "geo", fields: [ "location" ], geoJson:true })
db.neighborhoods.ensureIndex({ type: "geo", fields: [ "geometry" ], geoJson:true })

Feel free to use the given dataset and queries to play around.

Updated Geo Foxx Service

If you would like to take a look into Foxx and a basic geo example, feel free to download and examine our updated Geo Foxx Service here: