Community Edition Features

The open-source version of ArangoDB is available under the permissive Apache 2.0 license and offers an extensive feature set, including cluster support, for free.

The Community Edition features are outlined below. For additional information, see arangodb.com/community-server/.

General

  • Graph Database: Native support for storing and querying graphs composed of vertices and edges. You can model complex domains because edges are documents without any restrictions on complexity.

  • Document Database: A modern document database system that allows you to model data intuitively and evolve the data model easily. Documents can be organized in collections, and collections in databases for multi-tenancy.

  • Data Format: JSON, internally stored in a binary format invented by ArangoDB called VelocyPack.

  • Schema-free: Flexible data modeling without having to define a schema upfront. Model your data as a combination of key-value pairs, documents, or graphs, a good fit for social relations. Optional document validation using JSON Schema (draft-4, without remote schema support).

  • Data Storage: RocksDB storage engine to persist data and indexes on disk, with a hot set in memory. It uses journaling (write-ahead logging) and can take advantage of modern storage hardware, like SSDs and large caches.

  • Computed Values: Persistent document attributes that are generated when documents are created or modified, using an AQL expression.

  • Multi-Platform: Available for Linux, macOS, and Windows, for the x86-64 architecture (with the SSE 4.2 and AVX instruction set extensions), as well as for 64-bit ARM chips on macOS (Apple silicon, like M1) and Linux (ARMv8+ with Neon SIMD support).
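
To make the graph and document model above more concrete, here is a minimal sketch in plain Python (no database involved). It relies on ArangoDB's `_id`, `_from`, and `_to` system attributes; the collection names, keys, other attributes, and the `outbound_neighbors` helper are made up for illustration.

```python
# Illustrative data only: collection names, keys, and attributes are made up.
persons = [
    {"_id": "persons/alice", "name": "Alice"},
    {"_id": "persons/bob", "name": "Bob"},
    {"_id": "persons/carol", "name": "Carol"},
]

# Edges are documents too: besides _from and _to, they can carry
# arbitrary attributes, including nested values such as arrays.
knows = [
    {"_from": "persons/alice", "_to": "persons/bob", "since": 2015, "tags": ["work"]},
    {"_from": "persons/bob", "_to": "persons/carol", "since": 2020, "tags": ["family"]},
]


def outbound_neighbors(vertex_id, edges):
    """Return the _to ids of all edges that start at vertex_id (hypothetical helper)."""
    return [edge["_to"] for edge in edges if edge["_from"] == vertex_id]


neighbors = outbound_neighbors("persons/alice", knows)
print(neighbors)
```

In a real deployment, the lookup would be an AQL traversal rather than a Python loop; the sketch only shows how vertices and edges are represented as documents.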

Scalability & High Availability

  • Hash-based Sharding: Spread bigger datasets across multiple servers using consistent hashing on the default or custom shard keys.

  • Synchronous Replication: Data changes are propagated to other cluster nodes immediately as part of an operation, and only considered successful when the configured number of writes is reached. Synchronous replication works on a per-shard basis. For each collection, you can configure how many copies of each shard are kept in the cluster.

  • Active Failover: Run a single server with asynchronous replication to one or more passive single servers for automatic failover.

  • Automatic Failover Cluster: If a node goes down, another node takes over to avoid any downtime.

  • Load-Balancer Support: Round-robin load-balancer support for cloud environments.

  • High-performance Request Handling: Low-latency request handling using a boost-ASIO server infrastructure.
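
The idea behind hash-based sharding can be sketched as follows. This is a deliberately simplified illustration that maps a shard key value to a shard with a hash and a modulo. It is not ArangoDB's actual hashing scheme and not a full consistent-hashing ring; `shard_for` and the sample keys are hypothetical.

```python
import hashlib


def shard_for(shard_key_value: str, number_of_shards: int) -> int:
    """Map a shard key value to a shard index (simplified: hash, then modulo)."""
    digest = hashlib.sha256(shard_key_value.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % number_of_shards


# The same key always lands on the same shard, so reads can be routed
# without consulting every server.
placement = {key: shard_for(key, 3) for key in ["alice", "bob", "carol", "dave"]}
print(placement)
```

A plain modulo reshuffles most keys when the number of shards changes, which is one reason real systems prefer consistent hashing or a fixed shard count per collection.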

Querying

  • Declarative Query Language for All Data Models: Powerful query language (AQL) to retrieve and modify data. Graph traversals, full-text searches, geo-spatial queries, and aggregations can be composed in a single query. Support for sliding window queries to aggregate adjacent documents, value ranges and time intervals. Cluster-distributed aggregation queries.

  • Query Optimizer: Cost-based query optimizer that takes index selectivity estimates into account.

  • Query Profiling: Show detailed runtime information for AQL queries.

  • Upsert Operations: Support for insert-or-update (upsert), insert-or-replace (repsert), and insert-or-ignore requests that result in one or the other operation depending on whether the target document already exists.

  • Graph Relations: Edges can connect vertex and even edge documents to express complex m:n relations with any depth, creating graphs and hyper-graphs.

  • Relational Joins: Joins similar to those in relational database systems can be leveraged to match up documents from different collections, allowing normalized data models.

  • Advanced Path-Finding with Multiple Algorithms: Graphs can be traversed with AQL to retrieve direct and indirect neighbor nodes using a fixed or variable depth. The traversal order can be depth-first, breadth-first, or in order of increasing edge weights (“Weighted Traversals”). Stop conditions for pruning paths are supported. Traversal algorithms to get a shortest path, all shortest paths, paths in order of increasing length (“k Shortest Paths”), and to enumerate all paths between two vertices (“k Paths”) are available, too.

  • Pregel: Iterative graph processing for single servers with pre-built algorithms like PageRank, Connected Components, and Label Propagation. Cluster support requires the Enterprise Edition.

  • ArangoSearch for Text Search and Ranking: A built-in search engine for full-text, complex data structures, and more. Exact value matching, range queries, prefix matching, case-insensitive and accent-insensitive search. Token, phrase, wildcard, and fuzzy search support for full-text. Result ranking using Okapi BM25 and TF-IDF. Geo-spatial search that can be combined with full-text search. Flexible data field pre-processing with custom queries and the ability to chain built-in and custom Analyzers. Language-agnostic tokenization of text.

  • GeoJSON Support: Geographic data encoded in the popular GeoJSON format can be stored and used for geo-spatial queries.
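
The three upsert variants listed above differ only in what happens when the target document already exists. The following sketch models a collection as a Python dictionary to show the difference. The function names and data are hypothetical and this is not how ArangoDB implements these operations.

```python
def upsert(collection, key, insert_doc, update_doc):
    """Insert-or-update: merge update_doc into an existing document."""
    if key in collection:
        collection[key] = {**collection[key], **update_doc}
    else:
        collection[key] = dict(insert_doc)


def repsert(collection, key, insert_doc, replace_doc):
    """Insert-or-replace: an existing document is replaced as a whole."""
    if key in collection:
        collection[key] = dict(replace_doc)
    else:
        collection[key] = dict(insert_doc)


def insert_or_ignore(collection, key, insert_doc):
    """Insert-or-ignore: an existing document is left untouched."""
    collection.setdefault(key, dict(insert_doc))


users = {"u1": {"name": "Alice", "logins": 1}}
upsert(users, "u1", {"name": "Alice", "logins": 1}, {"logins": 2})  # updates u1
repsert(users, "u2", {"name": "Bob"}, {"name": "Robert"})           # inserts u2
insert_or_ignore(users, "u1", {"name": "Someone else"})             # no change
print(users)
```

Note that the update variant keeps attributes that the update document does not mention, whereas the replace variant drops them.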

Transactions

  • AQL Queries: AQL queries are executed transactionally (with exceptions), either committing or rolling back data modifications automatically.

  • Stream Transactions: Transactions with individual begin and commit / abort commands that can span multiple AQL queries and API calls of supported APIs.

  • JavaScript Transactions: Single-request transactions written in JavaScript that leverage ArangoDB’s JavaScript API.

  • Multi-Document Transactions: Transactions are not limited to single documents, but can involve many documents of a collection.

  • Multi-Collection Transactions: A single transaction can modify the documents of multiple collections. Deadlocks are detected automatically on single servers.

  • ACID Transactions: Using single servers, multi-document / multi-collection queries are guaranteed to be fully ACID (atomic, consistent, isolated, durable). Using cluster deployments, single-document operations are fully ACID, too. Multi-document queries in a cluster are not ACID, except for collections with a single shard. Multi-collection queries require the OneShard feature of the Enterprise Edition to be ACID.
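
The begin / commit / abort life cycle of a transaction that spans several operations can be illustrated with a toy in-memory model. This sketch only demonstrates the all-or-nothing behavior. `SketchTransaction` is a hypothetical class, not an ArangoDB API, and it ignores isolation and durability entirely.

```python
class SketchTransaction:
    """Toy transaction: stages writes and applies them all on commit."""

    def __init__(self, store):
        self._store = store
        self._staged = {}
        self._open = True

    def put(self, key, document):
        if not self._open:
            raise RuntimeError("transaction is already finished")
        self._staged[key] = document

    def commit(self):
        if not self._open:
            raise RuntimeError("transaction is already finished")
        self._store.update(self._staged)  # apply all staged writes together
        self._open = False

    def abort(self):
        self._staged.clear()  # discard all staged writes
        self._open = False


store = {"a": {"balance": 100}, "b": {"balance": 0}}

trx = SketchTransaction(store)
trx.put("a", {"balance": 50})
trx.put("b", {"balance": 50})
trx.abort()  # nothing reaches the store
store_after_abort = dict(store)

trx = SketchTransaction(store)
trx.put("a", {"balance": 50})
trx.put("b", {"balance": 50})
trx.commit()  # both writes become visible together
print(store_after_abort, store)
```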

Performance

  • Persistent Indexes: Indexes are stored on disk to enable fast server restarts. You can create secondary indexes over one or multiple fields, optionally with a uniqueness constraint. A “sparse” option to only index non-null values is also available. The elements of an array can be indexed individually.

  • Inverted Indexes: An eventually consistent index type that can accelerate a broad range of queries from simple to complex, including full-text search.

  • Vertex-centric Indexes: Secondary indexes for more efficient graph traversals with filter conditions.

  • Time-to-Live (TTL) Indexes: Time-based removal of expired documents.

  • Geo-spatial Indexes: Accelerated geo-spatial queries for coordinates and GeoJSON objects, based on the S2 library. Support for composable, distance-based geo-queries (“geo cursors”).

  • Background Indexing: Indexes can be created in the background to not block queries in the meantime.

  • Extensive Query Optimization: Late document materialization to only fetch the relevant documents from SORT/LIMIT queries. Early pruning of non-matching documents in full collection scans. Inlining of certain subqueries to improve execution time.
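
As an illustration of the time-based removal that TTL indexes provide, the sketch below drops documents whose timestamp attribute plus an expiry period lies in the past. It is a plain-Python approximation under stated assumptions: timestamps are numeric Unix times, documents without the attribute never expire, and the helper name, parameter names, and data are hypothetical.

```python
def remove_expired(documents, field, expire_after, now):
    """Return the documents that have not yet expired.

    A document expires once doc[field] + expire_after <= now.
    Documents without a numeric timestamp in `field` are kept (assumption).
    """
    kept = []
    for doc in documents:
        timestamp = doc.get(field)
        if isinstance(timestamp, (int, float)) and timestamp + expire_after <= now:
            continue  # expired: drop it
        kept.append(doc)
    return kept


sessions = [
    {"_key": "s1", "createdAt": 1000},
    {"_key": "s2", "createdAt": 1900},
    {"_key": "s3"},  # no timestamp attribute
]
remaining = remove_expired(sessions, "createdAt", expire_after=600, now=2000)
print([doc["_key"] for doc in remaining])
```

In the database, this cleanup runs in the background on the server, so applications do not need to schedule it themselves.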

Extensibility

Security

  • Authentication: Built-in user management with password- and token-based authentication.

  • Role-based Access Control: ArangoDB supports all basic security requirements. By using ArangoDB’s Foxx microservice framework, users can achieve very high security standards fitting individual needs.

  • TLS Encryption: Internal and external communication over encrypted network connections with TLS (formerly SSL). TLS key and certificate rotation is supported.

Administration

  • Web-based User Interface: Graphical UI for your browser to work with ArangoDB. It allows you to view, create, and modify databases, collections, documents, graphs, etc. You can also run, explain, and profile AQL queries. Includes a graph viewer with WebGL support.

  • Cluster-friendly User Interface: View the status of your cluster and its individual nodes, and move and rebalance shards via the web interface.

  • Backup and Restore Tools: Multi-threaded dumping and restoring of collection settings and data in JSON format. Data masking capabilities for attributes containing sensitive data / PII when creating backups.

  • Import and Export Tools: CLI utilities to load and export data in multiple text-based formats. You can import from JSON, JSONL, CSV, and TSV files, and export to JSON, JSONL, CSV, TSV, XML, and XGMML files.

  • Metrics: Monitor the health and performance of ArangoDB servers using the metrics exported in the Prometheus format.
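
To show how two of the text-based formats mentioned above relate, here is a small sketch that converts JSONL (one JSON document per line) to CSV using only the Python standard library. It does not use or imitate ArangoDB's own import and export tools; `jsonl_to_csv` and the sample data are hypothetical.

```python
import csv
import io
import json


def jsonl_to_csv(jsonl_text, fields):
    """Convert JSONL text to CSV text with the given column order."""
    output = io.StringIO()
    writer = csv.DictWriter(
        output, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for line in jsonl_text.splitlines():
        if line.strip():  # skip blank lines
            writer.writerow(json.loads(line))
    return output.getvalue()


jsonl = '{"_key": "1", "name": "Alice"}\n{"_key": "2", "name": "Bob"}\n'
csv_text = jsonl_to_csv(jsonl, ["_key", "name"])
print(csv_text)
```

Because each JSONL line is a complete document, such files can be processed line by line without loading the whole dataset into memory.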