ArangoDB 2.8 Beta 1: Test the Latest Features

The first beta release of ArangoDB 2.8 is now available for download. It adds Array Indexes and Graph Traversals in AQL. Please try the new version, report bugs on GitHub and send us your feedback.

Check out the latest blog posts to get some more background about performance improvements and added features.

AQL in ArangoDB 2.8 helps you to traverse a graph

The query language AQL adds the keywords GRAPH, OUTBOUND, INBOUND and ANY for use in graph traversals. The keyword ALL is reserved for future use.

Syntax for managed graphs:

FOR vertex[, edge[, path]] IN MIN[..MAX] OUTBOUND|INBOUND|ANY startVertex GRAPH graphName

Syntax for sets of edge collections:

FOR vertex[, edge[, path]] IN MIN[..MAX] OUTBOUND|INBOUND|ANY startVertex edgeCollection1, .., edgeCollectionN

Using plain AQL in ArangoDB 2.8, you can create a Christmas shopping list of gifts for your friends: products related to the ones they already own, limited to 5 ideas per friend and ordered by price.

FOR friend IN OUTBOUND @me isFriendOf
  LET toBuy = (
    FOR bought IN OUTBOUND friend hasBought
      FOR combinedProduct IN OUTBOUND bought combinedProducts
        SORT combinedProduct.price
        LIMIT 5
        RETURN combinedProduct
  )
  RETURN { friend, toBuy }

With Dave as the bind parameter for @me, the query returns the following shopping list:

[
  {
    "friend": {
      "name": "Julia",
      "_id": "users/Julia",
      "_rev": "1868379126",
      "_key": "Julia"
    },
    "toBuy": [
      {
        "price": 12,
        "name": "SanDisk Extreme SDHC UHS-I/U3 16GB Memory Card",
        "_id": "products/SanDisk16",
        "_rev": "2012820470",
        "_key": "SanDisk16"
      },
      {
        "price": 21,
        "name": "Lightweight Tripod 60-Inch with Bag",
        "_id": "products/Tripod",
        "_rev": "2003514358",
        "_key": "Tripod"
      },
      {
        "price": 99,
        "name": "Apple Pencil",
        "_id": "products/ApplePencil",
        "_rev": "2019177462",
        "_key": "ApplePencil"
      },
      {
        "price": 169,
        "name": "Smart Keyboard",
        "_id": "products/SmartKeyboard",
        "_rev": "2020160502",
        "_key": "SmartKeyboard"
      }
    ]
  },
  {
    "friend": {
      "name": "Debby",
      "city": "Dallas",
      "_id": "users/Debby",
      "_rev": "1928803318",
      "_key": "Debby"
    },
    "toBuy": [
      {
        "price": 12,
        "name": "Lixada Bag for Self Balancing Scooter",
        "_id": "products/LixadaScooterBag",
        "_rev": "2018194422",
        "_key": "LixadaScooterBag"
      }
    ]
  }
]
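To make the shape of that result easier to follow, here is a minimal sketch of what the query computes, written as plain JavaScript over in-memory edge lists. It is not how ArangoDB executes the traversal. The helper names outbound and shoppingList, the products/Camera and products/Scooter purchases, and the edge data are all made up for illustration; only the collection names and the three recommended products come from the example above.

```javascript
// Hypothetical in-memory stand-ins for the edge collections
// isFriendOf, hasBought and combinedProducts. The edges are invented.
const edges = {
  isFriendOf: [
    { _from: "users/Dave", _to: "users/Julia" },
    { _from: "users/Dave", _to: "users/Debby" },
  ],
  hasBought: [
    { _from: "users/Julia", _to: "products/Camera" },
    { _from: "users/Debby", _to: "products/Scooter" },
  ],
  combinedProducts: [
    { _from: "products/Camera", _to: "products/Tripod" },
    { _from: "products/Camera", _to: "products/SanDisk16" },
    { _from: "products/Scooter", _to: "products/LixadaScooterBag" },
  ],
};

// Prices of the recommended products, as in the result above.
const prices = {
  "products/Tripod": 21,
  "products/SanDisk16": 12,
  "products/LixadaScooterBag": 12,
};

// One OUTBOUND step: follow every edge that starts at `start`.
function outbound(start, edgeCollection) {
  return edges[edgeCollection]
    .filter((edge) => edge._from === start)
    .map((edge) => edge._to);
}

// Mirrors the query: friends of `me`, then per friend the products
// combined with what they bought, sorted by price, at most 5.
function shoppingList(me) {
  return outbound(me, "isFriendOf").map((friend) => {
    const toBuy = outbound(friend, "hasBought")
      .flatMap((bought) => outbound(bought, "combinedProducts"))
      .map((id) => ({ _id: id, price: prices[id] }))
      .sort((a, b) => a.price - b.price)
      .slice(0, 5);
    return { friend, toBuy };
  });
}

console.log(JSON.stringify(shoppingList("users/Dave"), null, 2));
```

Note that the sort and the limit apply to all recommendations for one friend together, not to each bought product separately, which is why each friend gets at most 5 ideas in total.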

Usage of these new keywords as collection names, variable names or attribute names in AQL queries will not be possible without quoting. For example, the following AQL query will still work as it uses a quoted collection name and a quoted attribute name:

FOR doc IN `OUTBOUND`
  RETURN doc.`any`

Please have a look at the documentation for further details.
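If you build AQL query strings in application code, a small helper can take care of the quoting. The following is only a sketch in plain JavaScript: quoteIfReserved is a hypothetical name, not part of any ArangoDB API, and it only knows about the five keywords mentioned in this post. It also assumes the name itself contains no backtick.

```javascript
// Keywords added or reserved in 2.8, as listed in this post.
// AQL keywords are case-insensitive, so names are compared in upper case.
const NEW_KEYWORDS = new Set(["GRAPH", "OUTBOUND", "INBOUND", "ANY", "ALL"]);

// Hypothetical helper: wrap a name in backticks when it collides
// with one of the new keywords, otherwise return it unchanged.
function quoteIfReserved(name) {
  return NEW_KEYWORDS.has(name.toUpperCase()) ? "`" + name + "`" : name;
}

const query =
  "FOR doc IN " + quoteIfReserved("OUTBOUND") +
  " RETURN doc." + quoteIfReserved("any");

console.log(query);
```

The resulting string is the quoted query shown above.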

Array Indexes

Hash indexes and skiplist indexes can now optionally be defined for array values so they index individual array members. To define an index for array values, the attribute name is extended with the expansion operator [*] in the index definition:

arangosh> db.colName.ensureHashIndex("tags[*]");

When given the following document

{ tags: [ "AQL", "ArangoDB", "Index" ] }

the index will now contain the individual values “AQL”, “ArangoDB” and “Index”.

Now the index can be used for finding all documents having “ArangoDB” somewhere in their tags array using the following AQL query:

FOR doc IN colName
  FILTER "ArangoDB" IN doc.tags[*]
  RETURN doc
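Conceptually, an array index stores one entry per array member rather than one entry per document. The sketch below illustrates that idea in plain JavaScript with a Map. It is not ArangoDB's actual index implementation: buildArrayIndex and the sample documents are hypothetical, and indexing a repeated value only once per document is a choice made for this sketch.

```javascript
// Sketch of an index on tags[*]: every array member becomes its own
// index entry pointing back to the keys of the documents containing it.
function buildArrayIndex(docs, attribute) {
  const index = new Map();
  for (const doc of docs) {
    // Documents without an array in this attribute add no entries here.
    const values = Array.isArray(doc[attribute]) ? doc[attribute] : [];
    // Sketch choice: a value repeated within one document is indexed once.
    for (const value of new Set(values)) {
      if (!index.has(value)) {
        index.set(value, []);
      }
      index.get(value).push(doc._key);
    }
  }
  return index;
}

// Made-up sample documents.
const docs = [
  { _key: "1", tags: ["AQL", "ArangoDB", "Index"] },
  { _key: "2", tags: ["Graph", "ArangoDB"] },
  { _key: "3", tags: ["Foxx"] },
];

const index = buildArrayIndex(docs, "tags");

// Single lookup instead of scanning every tags array:
// yields the keys of documents 1 and 2.
console.log(index.get("ArangoDB"));
```

This is why the FILTER above can be answered by a lookup of "ArangoDB" in the index instead of inspecting the tags array of every document.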

Changelog for 2.8 Beta 1:

  • added AQL function IS_DATESTRING(value)

    Returns true if value is a string that can be used in a date function. This includes partial dates such as 2015 or 2015-10 and strings containing invalid dates such as 2015-02-31. The function will return false for all non-string values, even if some of them may be usable in date functions.

  • added AQL keywords GRAPH, OUTBOUND, INBOUND and ANY for use in graph traversals, reserved AQL keyword ALL for future use

    Usage of these keywords as collection names, variable names or attribute names in AQL queries will not be possible without quoting. For example, the following AQL query will still work as it uses a quoted collection name and a quoted attribute name:

    FOR doc IN `OUTBOUND` RETURN doc.`any`

  • issue #1593: added AQL POW function for exponentiation
  • added cluster execution site info in explain output for AQL queries
  • replication improvements:
    • added autoResync configuration parameter for continuous replication.

    When set to true, a replication slave will automatically trigger a full data re-synchronization with the master when the master cannot provide the log data the slave had asked for. Note that autoResync will only work when the option requireFromPresent is also set to true for the continuous replication, or when the continuous syncer is started and detects that no start tick is present.

    Automatic re-synchronization may transfer a lot of data from the master to the slave and may be expensive. It is therefore turned off by default. When turned off, the slave will never perform an automatic re-synchronization with the master.

    • added idleMinWaitTime and idleMaxWaitTime configuration parameters for continuous replication.

    These parameters can be used to control the minimum and maximum wait time the slave will (intentionally) idle and not poll for master log changes in case the master had sent the full logs already. The idleMaxWaitTime value will only be used when adaptivePolling is set to true. When adaptivePolling is disabled, only idleMinWaitTime will be used as a constant time span in which the slave will not poll the master for further changes. The default values are 0.5 seconds for idleMinWaitTime and 2.5 seconds for idleMaxWaitTime, which correspond to the hard-coded values used in previous versions of ArangoDB.

    • added initialSyncMaxWaitTime configuration parameter for initial and continuous replication

    This option controls the maximum wait time (in seconds) that the initial synchronization will wait for a response from the master when fetching initial collection data. If no response is received within this time period, the initial synchronization will give up and fail. This option is also relevant for continuous replication in case autoResync is set to true, as then the continuous replication may trigger a full data re-synchronization in case the master cannot provide the log data the slave had asked for.

    • HTTP requests sent from the slave to the master during initial synchronization will now be retried if they fail with connection problems.
    • the initial synchronization now logs its progress so it can be queried using the regular replication status check APIs.
    • added async attribute for sync and syncCollection operations called from the ArangoShell. Setting this attribute to true will make the synchronization job on the server go into the background, so that the shell does not block. The status of the started asynchronous synchronization job can be queried from the ArangoShell like this:

      /* starts initial synchronization */
      var replication = require("org/arangodb/replication");
      var id = replication.sync({
        endpoint: "tcp://master.domain.org:8529",
        username: "myuser",
        password: "mypasswd",
        async: true
      });

      /* now query the id of the returned async job and print the status */
      print(replication.getSyncResult(id));

    The result of getSyncResult() will be false while the server-side job has not completed, and a value other than false once it has completed. When it has completed, all job result details will be returned by the call to getSyncResult().

  • fixed non-deterministic query results in some cluster queries
  • fixed issue #1589
  • return HTTP status code 410 (gone) instead of HTTP 408 (request timeout) for server-side operations that are canceled / killed. Sending 410 instead of 408 prevents clients from re-starting the same (canceled) operation. Google Chrome, for example, sends the HTTP request again when it receives an HTTP 408 response, which is exactly the opposite of the desired behavior when an operation is canceled / killed by the user.
  • web interface: queries in AQL editor now cancelable
  • web interface: dashboard – added replication information
  • web interface: AQL editor now supports bind parameters
  • added startup option --server.hide-product-header to make the server not send the HTTP response header "Server: ArangoDB" in its HTTP responses. By default, the option is turned off so the header is still sent as usual.
  • added new AQL function UNSET_RECURSIVE to recursively unset attributes from objects/documents
  • switched command-line editor in ArangoShell and arangod to linenoise-ng
  • added automatic deadlock detection for transactions

    In case a deadlock is detected, a multi-collection operation may be rolled back automatically and fail with error 29 (deadlock detected). Client code for operations containing more than one collection should be aware of this potential error and handle it accordingly, either by giving up or retrying the transaction.

  • Added C++ implementations for the AQL arithmetic operations and the following AQL functions:
    • ABS
    • APPEND
    • COLLECTIONS
    • CURRENT_DATABASE
    • DOCUMENT
    • EDGES
    • FIRST
    • FIRST_DOCUMENT
    • FIRST_LIST
    • FLATTEN
    • FLOOR
    • FULLTEXT
    • LAST
    • MEDIAN
    • MERGE_RECURSIVE
    • MINUS
    • NEAR
    • NOT_NULL
    • NTH
    • PARSE_IDENTIFIER
    • PERCENTILE
    • POP
    • POSITION
    • PUSH
    • RAND
    • RANGE
    • REMOVE_NTH
    • REMOVE_VALUE
    • REMOVE_VALUES
    • ROUND
    • SHIFT
    • SQRT
    • STDDEV_POPULATION
    • STDDEV_SAMPLE
    • UNSHIFT
    • VARIANCE_POPULATION
    • VARIANCE_SAMPLE
    • WITHIN
    • ZIP
  • improved performance of skipping over many documents in an AQL query when no indexes and no filters are used, e.g.

    FOR doc IN collection LIMIT 1000000, 10 RETURN doc

  • Added array indexes

    Hash indexes and skiplist indexes can now optionally be defined for array values so they index individual array members.

    To define an index for array values, the attribute name is extended with the expansion operator [*] in the index definition:

    arangosh> db.colName.ensureHashIndex("tags[*]");

    When given the following document

    { tags: [ "AQL", "ArangoDB", "Index" ] }

    the index will now contain the individual values "AQL", "ArangoDB" and "Index".

    Now the index can be used for finding all documents having "ArangoDB" somewhere in their tags array using the following AQL query:

    FOR doc IN colName FILTER "ArangoDB" IN doc.tags[*] RETURN doc

  • rewrote AQL query optimizer rule use-index-range and renamed it to use-indexes. The name change affects rule names in the optimizer’s output.
  • rewrote AQL execution node IndexRangeNode and renamed it to IndexNode. The name change affects node names in the optimizer’s explain output.
  • added convenience function db._explain(query) for human-readable explanation of AQL queries
  • module resolution as used by require now behaves more like in node.js
  • the org/arangodb/request module now returns response bodies for error responses by default. The old behaviour of not returning bodies for error responses can be re-enabled by explicitly setting the option returnBodyOnError to false (#1437)
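The changelog entry on automatic deadlock detection asks client code to either give up or retry when a multi-collection operation fails with error 29. One way to do that is a small retry wrapper, sketched here in plain JavaScript. The helper name withDeadlockRetry is hypothetical, and the sketch assumes the thrown error exposes the error number in an errorNum property; check what your driver or shell actually provides before relying on it.

```javascript
// Error number named in the changelog entry on deadlock detection.
const ERROR_DEADLOCK = 29;

// Hypothetical helper: run `operation` and retry it when it fails with
// a deadlock error. Assumes the error carries its code in `errorNum`.
function withDeadlockRetry(operation, maxAttempts = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      return operation();
    } catch (err) {
      const isDeadlock = Boolean(err) && err.errorNum === ERROR_DEADLOCK;
      // Give up on any other error, or once the attempts are used up.
      if (!isDeadlock || attempt >= maxAttempts) {
        throw err;
      }
    }
  }
}

// Usage with a stub operation that is rolled back twice before succeeding.
let calls = 0;
const result = withDeadlockRetry(() => {
  calls++;
  if (calls < 3) {
    throw Object.assign(new Error("deadlock detected"), { errorNum: 29 });
  }
  return "committed";
});

console.log(result, calls);
```

In real code the stub would be replaced by the multi-collection transaction, and a short pause between attempts may be worth adding so that the competing operation has time to finish.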
Ingo Friepoertner

Ingo is dealing with all the good ideas from the ArangoDB community, customers and industry experts to improve the value provided by the company’s native multi-model approach. In former positions he worked as a product owner and tech consultant, building custom software solutions for large companies in various industries. Ingo holds a diploma in business informatics from FHDW University of Applied Sciences.
