HTTP Interface for Administration and Monitoring

This is an introduction to ArangoDB’s HTTP interface for administration and monitoring of the server.

Logs

Read global logs from the server

returns the server logs

GET /_admin/log

Query Parameters

  • upto (optional): Returns all log entries up to log level upto. Note that upto must be one of:
    • fatal or 0
    • error or 1
    • warning or 2
    • info or 3
    • debug or 4

    The default value is info.

  • level (optional): Returns all log entries of log level level. Note that the query parameters upto and level are mutually exclusive.

  • start (optional): Returns all log entries such that their log entry identifier (lid value) is greater than or equal to start.

  • size (optional): Restricts the result to at most size log entries.

  • offset (optional): Starts to return log entries skipping the first offset log entries. offset and size can be used for pagination.

  • search (optional): Only return the log entries containing the text specified in search.

  • sort (optional): Sort the log entries either ascending (if sort is asc) or descending (if sort is desc) according to their lid values. Note that the lid imposes a chronological order. The default value is asc.
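The query parameters above can be assembled programmatically. The following is a minimal sketch, not an official client: the helper name build_log_url is hypothetical, the base URL is taken from the curl examples in this document, and it assumes that level accepts the same values as upto.

```python
from urllib.parse import urlencode

# Assumption: host and port as used in the curl examples of this document.
BASE_URL = "http://localhost:8529"

# Symbolic and numeric forms listed for `upto` (assumed to apply to `level` too).
LOG_LEVELS = {"fatal": 0, "error": 1, "warning": 2, "info": 3, "debug": 4}


def _check_level(name, value):
    """Reject values that are neither a known level name nor its number."""
    if value is None:
        return
    if value not in LOG_LEVELS and value not in LOG_LEVELS.values():
        raise ValueError(f"invalid {name}: {value!r}")


def build_log_url(upto=None, level=None, start=None, size=None,
                  offset=None, search=None, sort=None):
    """Return the URL for GET /_admin/log (hypothetical helper)."""
    if upto is not None and level is not None:
        raise ValueError("upto and level are mutually exclusive")
    _check_level("upto", upto)
    _check_level("level", level)
    if sort is not None and sort not in ("asc", "desc"):
        raise ValueError("sort must be 'asc' or 'desc'")
    params = {
        "upto": upto, "level": level, "start": start, "size": size,
        "offset": offset, "search": search, "sort": sort,
    }
    # Omit parameters that were not supplied so the server defaults apply.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/_admin/log" + (f"?{query}" if query else "")
```

For pagination, size and offset can be combined, for example build_log_url(upto="warning", size=10, offset=20, sort="desc") for the third page of ten entries.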

Returns fatal, error, warning or info log messages from the server’s global log. The result is a JSON object with the following attributes:

HTTP 200

  • lid: a list of log entry identifiers. Each log message is uniquely identified by its lid and the identifiers are in ascending order.

  • level: a list of the log levels for all log entries.

  • timestamp: a list of the timestamps as seconds since 1970-01-01 for all log entries.

  • text: a list of the texts of all log entries.

  • topic: a list of the topics of all log entries.

  • totalAmount: the total amount of log entries before pagination.

  • 400: is returned if invalid values are specified for upto or level.

  • 500: is returned if the server cannot generate the result due to an out-of-memory error.
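Because the result holds one list per attribute rather than one object per log entry, clients usually need to pivot it. The sketch below shows one way to do that; the function name log_entries is hypothetical and the sample values are made up for illustration, not taken from a real server.

```python
def log_entries(result):
    """Pivot the parallel lists of a /_admin/log result into per-entry dicts."""
    # Only pivot the list attributes that are actually present in the result.
    keys = [k for k in ("lid", "level", "timestamp", "text", "topic")
            if k in result]
    columns = [result[k] for k in keys]
    return [dict(zip(keys, row)) for row in zip(*columns)]


# Illustrative sample only; real values depend on the server.
sample_result = {
    "lid": [101, 102],
    "level": [3, 2],
    "timestamp": [1700000000, 1700000005],
    "text": ["server started", "low disk space"],
    "topic": ["general", "general"],
    "totalAmount": 2,
}

for entry in log_entries(sample_result):
    print(entry["lid"], entry["topic"], entry["text"])
```

The entry at position i in the returned list combines the values at position i of every input list.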

Return the current server log level

returns the current log level settings

GET /_admin/log/level

Returns the server’s current log level settings. The result is a JSON object with the log topics being the object keys, and the log levels being the object values.

Return codes

  • 200: is returned if the request is valid

  • 500: is returned if the server cannot generate the result due to an out-of-memory error.

Modify and return the current server log level

modifies the current log level settings

PUT /_admin/log/level

A JSON object with these properties is required:

  • agency: One of the possible log levels.

  • agencycomm: One of the possible log levels.

  • authentication: One of the possible log levels.

  • authorization: One of the possible log levels.

  • cache: One of the possible log levels.

  • cluster: One of the possible log levels.

  • collector: One of the possible log levels.

  • communication: One of the possible log levels.

  • compactor: One of the possible log levels.

  • config: One of the possible log levels.

  • datafiles: One of the possible log levels.

  • development: One of the possible log levels.

  • engines: One of the possible log levels.

  • general: One of the possible log levels.

  • graphs: One of the possible log levels.

  • heartbeat: One of the possible log levels.

  • memory: One of the possible log levels.

  • mmap: One of the possible log levels.

  • performance: One of the possible log levels.

  • pregel: One of the possible log levels.

  • queries: One of the possible log levels.

  • replication: One of the possible log levels.

  • requests: One of the possible log levels.

  • rocksdb: One of the possible log levels.

  • ssl: One of the possible log levels.

  • startup: One of the possible log levels.

  • supervision: One of the possible log levels.

  • syscall: One of the possible log levels.

  • threads: One of the possible log levels.

  • trx: One of the possible log levels.

  • v8: One of the possible log levels.

  • views: One of the possible log levels.

  • ldap: One of the possible log levels.

  • audit-authentication: One of the possible log levels.

  • audit-authorization: One of the possible log levels.

  • audit-database: One of the possible log levels.

  • audit-collection: One of the possible log levels.

  • audit-view: One of the possible log levels.

  • audit-document: One of the possible log levels.

  • audit-service: One of the possible log levels.

Modifies and returns the server’s current log level settings. The request body must be a JSON object with the log topics being the object keys and the log levels being the object values.

The result is a JSON object with the adjusted log topics being the object keys, and the adjusted log levels being the object values.

The log level of all facilities can be set at once by specifying only the log level as a string, without JSON.

Possible log levels are:

  • FATAL - There will be no way out of this. ArangoDB will go down after this message.
  • ERROR - This is an error. You should investigate and fix it. It may harm your production.
  • WARNING - This may be serious application-wise, but we don’t know.
  • INFO - Something has happened, take notice, but no drama attached.
  • DEBUG - Outputs debug messages.
  • TRACE - Outputs trace messages. Prepare for your log to be flooded; don’t use in production.

Return codes

  • 200: is returned if the request is valid

  • 400: is returned when the request body contains invalid JSON.

  • 405: is returned when an invalid HTTP method is used.

  • 500: is returned if the server cannot generate the result due to an out-of-memory error.
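A request for this endpoint can be prepared as in the following sketch. It only builds the request object and does not send it. The helper name log_level_request is hypothetical, the base URL comes from the curl examples in this document, and normalizing level names to upper case is an assumption made here for convenience.

```python
import json
from urllib.request import Request

# Assumption: host and port as used in the curl examples of this document.
BASE_URL = "http://localhost:8529"

# Log levels as listed in this section.
LEVELS = {"FATAL", "ERROR", "WARNING", "INFO", "DEBUG", "TRACE"}


def log_level_request(topic_levels):
    """Build a PUT /_admin/log/level request from a topic -> level mapping."""
    body = {}
    for topic, level in topic_levels.items():
        normalized = level.upper()
        if normalized not in LEVELS:
            raise ValueError(f"unknown log level for {topic}: {level!r}")
        body[topic] = normalized
    return Request(
        f"{BASE_URL}/_admin/log/level",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = log_level_request({"queries": "debug", "replication": "WARNING"})
print(request.get_method(), request.full_url)
```

The request could then be sent with urllib.request.urlopen(request); authentication is not covered by this sketch.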

Statistics

Read the statistics

return the statistics information

GET /_admin/statistics

Returns the statistics information. The returned object contains the statistics figures grouped together according to the description returned by _admin/statistics-description. For instance, to access a figure userTime from the group system, you first select the sub-object describing the group stored in system and in that sub-object the value for userTime is stored in the attribute of the same name.

In case of a distribution, the returned object contains the total count in count and the distribution list in counts. The sum (or total) of the individual values is returned in sum.
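The lookup described above, and one simple use of a distribution, can be sketched as follows. The helper names and the sample values are hypothetical, and treating requestTime in the client group as a distribution is an assumption made for illustration. The mean is derived as sum divided by count.

```python
def figure(stats, group, name):
    """Select a figure by first picking the group sub-object, then the attribute."""
    return stats[group][name]


def distribution_mean(distribution):
    """Average of a distribution figure, or None if nothing was counted yet."""
    count = distribution["count"]
    return distribution["sum"] / count if count else None


# Illustrative sample only; real values depend on the server.
sample_stats = {
    "system": {"userTime": 12.5},
    "client": {"requestTime": {"sum": 0.9, "count": 3, "counts": [2, 1, 0]}},
}

print(figure(sample_stats, "system", "userTime"))
print(distribution_mean(figure(sample_stats, "client", "requestTime")))
```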

The transaction statistics show the local started, committed and aborted transactions as well as intermediate commits done for the server queried. The intermediate commit count will only take non-zero values for the RocksDB storage engine. Coordinators do almost no local transactions themselves in their local databases, therefore cluster transactions (transactions started on a coordinator that require DBServers to finish before the transaction is committed cluster-wide) are just added to their local statistics. This means that the statistics you would see for a single server are roughly what you can expect in a cluster setup with a single coordinator when querying that coordinator. The one difference is that cluster transactions have no notion of intermediate commits and will not increase that value.

HTTP 200 Statistics were returned successfully.

  • error: boolean flag to indicate whether an error occurred (false in this case)

  • code: the HTTP status code - 200 in this case

  • time: the current server timestamp

  • errorMessage: a descriptive error message

  • enabled: true if the server has the statistics module enabled. If not, don’t expect any values.

  • system: metrics gathered from the system about this process; may depend on the host OS

    • minorPageFaults: the number of minor page faults

    • majorPageFaults: the number of major page faults

    • userTime: the user CPU time used by the server process

    • systemTime: the system CPU time used by the server process

    • numberOfThreads: the number of threads in the server

    • residentSize: RSS of process

    • residentSizePercent: RSS of process in %

    • virtualSize: VSS of the process

  • client: information about the connected clients and their resource usage

    • sum: summarized value of all counts

    • count: number of values summarized

    • counts: array containing the values

    • connectionTime: total connection times

    • totalTime: the system time

    • requestTime: the request times

    • queueTime: the time requests were queued waiting for processing

    • ioTime: IO Time

    • bytesSent: number of bytes sent to the clients

    • bytesReceived: number of bytes received from the clients

    • httpConnections: the number of open http connections

  • http: the numbers of requests by Verb

    • requestsTotal: total number of http requests

    • requestsAsync: total number of asynchronous http requests

    • requestsGet: number of requests using the GET verb

    • requestsHead: number of requests using the HEAD verb

    • requestsPost: number of requests using the POST verb

    • requestsPut: number of requests using the PUT verb

    • requestsPatch: number of requests using the PATCH verb

    • requestsDelete: number of requests using the DELETE verb

    • requestsOptions: number of requests using the OPTIONS verb

    • requestsOther: number of requests using none of the verbs identified above

  • server: statistics of the server

    • uptime: time the server is up and running

    • physicalMemory: available physical memory on the server

    • transactions: Statistics about transactions

      • started: the number of started transactions

      • committed: the number of committed transactions

      • aborted: the number of aborted transactions

      • intermediateCommits: the number of intermediate commits done

    • v8Context: Statistics about the V8 JavaScript contexts

      • available: the number of currently spawned V8 contexts

      • busy: the number of currently active V8 contexts

      • dirty: the number of contexts that were previously used and should now be garbage collected before being re-used

      • free: the number of V8 contexts that are free to use

      • max: the total number of V8 contexts we may spawn as configured by --javascript.v8-contexts

      • memory: a list of V8 memory / garbage collection watermarks, refreshed on every garbage collection run; preserves the min/max memory used at that time for 10 seconds

        • contextId: ID of the context this set of memory statistics is from

        • tMax: the timestamp where the 10 seconds interval started

        • countOfTimes: how many times the garbage collection was run in these 10 seconds

        • heapMax: high watermark of all garbage collection runs in these 10 seconds

        • heapMin: low watermark of all garbage collection runs in these 10 seconds

    • threads: Statistics about the server worker threads (excluding V8-specific or jemalloc-specific threads and system threads)

      • scheduler-threads: the number of spawned worker threads

      • in-progress: the number of currently busy worker threads

      • queued: the number of jobs queued up waiting for worker threads to become available

Examples

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_admin/statistics

HTTP/1.1 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff

Statistics description

fetch descriptive info of statistics

GET /_admin/statistics-description

Returns a description of the statistics returned by /_admin/statistics. The returned object contains an array of statistics groups in the attribute groups and an array of statistics figures in the attribute figures.

A statistics group is described by

  • group: The identifier of the group.
  • name: The name of the group.
  • description: A description of the group.

A statistics figure is described by

  • group: The identifier of the group to which this figure belongs.
  • identifier: The identifier of the figure. It is unique within the group.
  • name: The name of the figure.
  • description: A description of the figure.
  • type: Either current, accumulated, or distribution.
  • cuts: The distribution vector.
  • units: Units in which the figure is measured.

HTTP 200 Description was returned successfully.

  • groups: A statistics group

    • group: The identifier of the group.

    • name: The name of the group.

    • description: A description of the group.

  • figures: A statistics figure

    • group: The identifier of the group to which this figure belongs.

    • identifier: The identifier of the figure. It is unique within the group.

    • name: The name of the figure.

    • description: A description of the figure.

    • type: Either current, accumulated, or distribution.

    • cuts: The distribution vector.

    • units: Units in which the figure is measured.

  • code: the HTTP status code

  • error: the error, false in this case
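Since each figure refers to its group by identifier, a client can join the two arrays to get an overview per group. The sketch below is one way to do this; the function name figures_by_group is hypothetical and the sample description is made up for illustration.

```python
from collections import defaultdict


def figures_by_group(description):
    """Map each group identifier to its name and the identifiers of its figures."""
    names = {g["group"]: g["name"] for g in description["groups"]}
    grouped = defaultdict(list)
    for fig in description["figures"]:
        grouped[fig["group"]].append(fig["identifier"])
    return {
        group_id: {"name": names.get(group_id), "figures": identifiers}
        for group_id, identifiers in grouped.items()
    }


# Illustrative sample only; names, descriptions and units are hypothetical.
sample_description = {
    "groups": [
        {"group": "system", "name": "Process Statistics",
         "description": "Statistics about the server process."},
    ],
    "figures": [
        {"group": "system", "identifier": "userTime", "name": "User Time",
         "description": "User CPU time.", "type": "accumulated",
         "units": "seconds"},
        {"group": "system", "identifier": "residentSize",
         "name": "Resident Set Size", "description": "RSS of the process.",
         "type": "current", "units": "bytes"},
    ],
}

print(figures_by_group(sample_description))
```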

Examples

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_admin/statistics-description

HTTP/1.1 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff

Cluster

Return whether or not a server is in read-only mode

Return the mode of this server (read-only or default)

GET /_admin/server/mode

Return mode information about a server. The JSON response will contain a field mode with the value readonly or default. In a read-only server all write operations will fail with an error code of 1004 (ERROR_READ_ONLY). Creating or dropping databases and collections will also fail with error code 11 (ERROR_FORBIDDEN).

This is a public API so it does not require authentication.

Return codes

  • 200: This API will return HTTP 200 if everything is ok

Update whether or not a server is in read-only mode

Update the mode of this server (read-only or default)

PUT /_admin/server/mode

A JSON object with these properties is required:

  • mode: The mode of the server readonly or default.

Update mode information about a server. The JSON response will contain a field mode with the value readonly or default. In a read-only server all write operations will fail with an error code of 1004 (ERROR_READ_ONLY). Creating or dropping databases and collections will also fail with error code 11 (ERROR_FORBIDDEN).

This API requires authentication and administrative server rights.

Return codes

  • 200: This API will return HTTP 200 if everything is ok

  • 401: if the request was not authenticated as a user with sufficient rights
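The request body for this endpoint is small enough to build by hand. The following sketch prepares the request without sending it; the helper name server_mode_request is hypothetical and the default base URL is taken from the curl examples in this document. Credentials are deliberately left out.

```python
import json
from urllib.request import Request

# The two modes named in this section.
VALID_MODES = ("readonly", "default")


def server_mode_request(mode, base_url="http://localhost:8529"):
    """Build a PUT /_admin/server/mode request for the given mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return Request(
        f"{base_url}/_admin/server/mode",
        data=json.dumps({"mode": mode}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = server_mode_request("readonly")
print(request.get_method(), request.full_url, request.data)
```

A caller would add an authentication header before sending, since a 401 is returned for requests without sufficient rights.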

Return id of a server in a cluster

Get to know the internal id of the server

GET /_admin/server/id

Returns the id of a server in a cluster. The request will fail if the server is not running in cluster mode.

Return codes

  • 200: Is returned when the server is running in cluster mode.

  • 500: Is returned when the server is not running in cluster mode.

Return the role of a server in a cluster

GET /_admin/server/role

Returns the role of a server in a cluster. The role is returned in the role attribute of the result. Possible return values for role are:

  • SINGLE: the server is a standalone server without clustering
  • COORDINATOR: the server is a Coordinator in a cluster
  • PRIMARY: the server is a DBServer in a cluster
  • SECONDARY: this role is not used anymore
  • AGENT: the server is an Agency node in a cluster
  • UNDEFINED: in a cluster, UNDEFINED is returned if the server role cannot be determined.

HTTP 200 Is returned in all cases.

  • error: always false

  • code: the HTTP status code, always 200

  • errorNum: the server error number

  • role: one of [ SINGLE, COORDINATOR, PRIMARY, SECONDARY, AGENT, UNDEFINED]
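A client may want to map the role attribute to a fixed set of values instead of comparing strings. This is a minimal sketch under that assumption; the ServerRole enum and both helper functions are hypothetical names, and only the role values listed above are used.

```python
from enum import Enum


class ServerRole(Enum):
    """Role values documented for GET /_admin/server/role."""
    SINGLE = "SINGLE"
    COORDINATOR = "COORDINATOR"
    PRIMARY = "PRIMARY"
    SECONDARY = "SECONDARY"
    AGENT = "AGENT"
    UNDEFINED = "UNDEFINED"


def parse_role(response):
    """Read the role attribute of a decoded JSON response."""
    return ServerRole(response["role"])


def is_cluster_member(role):
    """True for the roles that describe an active part of a cluster."""
    return role in (ServerRole.COORDINATOR, ServerRole.PRIMARY, ServerRole.AGENT)


# Illustrative response body.
sample_response = {"error": False, "code": 200, "role": "PRIMARY"}
print(parse_role(sample_response), is_cluster_member(parse_role(sample_response)))
```

An unknown role string raises ValueError, which makes unexpected server responses visible early.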

Return whether or not a server is available

GET /_admin/server/availability

Return availability information about a server.

This is a public API so it does not require authentication. It is meant to be used only in the context of server monitoring.

Return codes

  • 200: This API will return HTTP 200 in case the server is up and running and usable for arbitrary operations, is not set to read-only mode and is currently not a follower in case of an active failover setup.

  • 503: HTTP 503 will be returned in case the server is starting up or shutting down, is set to read-only mode or is currently a follower in an active failover setup.
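For monitoring, only the status code matters. The sketch below separates the interpretation of the two documented codes from the network call. Both function names are hypothetical, the default base URL is taken from the curl examples in this document, and treating any other status code as an error is a choice made here.

```python
import urllib.error
import urllib.request


def availability_from_status(status):
    """Translate the documented status codes into a boolean."""
    if status == 200:
        return True
    if status == 503:
        return False
    raise ValueError(f"unexpected status code: {status}")


def check_availability(base_url="http://localhost:8529", timeout=2.0):
    """Query GET /_admin/server/availability and interpret the result."""
    url = f"{base_url}/_admin/server/availability"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return availability_from_status(response.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for non-2xx responses such as 503.
        return availability_from_status(err.code)
```

A server that cannot be reached at all raises urllib.error.URLError from check_availability, so callers can tell an unreachable host apart from a server reporting 503.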

Queries statistics of DBserver

Allows querying the statistics of a DBserver in the cluster.

GET /_admin/clusterStatistics

Query Parameters

  • DBserver (required): Queries the statistics of the given DBserver

Return codes

  • 200:

  • 400: ID of a DBserver

  • 403:

Queries the health of cluster for monitoring

Returns the health of the cluster as assessed by the supervision (agency)

GET /_admin/cluster/health

Queries the health of the cluster for monitoring purposes. The response is a JSON object, containing the standard code, error, errorNum, and errorMessage fields as appropriate. The endpoint-specific fields are as follows:

  • ClusterId: A UUID string identifying the cluster
  • Health: An object containing a descriptive sub-object for each node in the cluster.
    • <nodeID>: Each entry in Health will be keyed by the node ID and contain the following attributes:
      • Endpoint: A string representing the network endpoint of the server.
      • Role: The role the server plays. Possible values are "AGENT", "COORDINATOR", and "DBSERVER".
      • CanBeDeleted: Boolean representing whether the node can safely be removed from the cluster.
      • Version: Version String of ArangoDB used by that node.
      • Engine: Storage Engine used by that node.
      • Status: A string indicating the health of the node as assessed by the supervision (agency). This should be considered the primary source of truth for coordinator and DBServer node health. If the node is responding normally to requests, it is "GOOD". If it has missed one heartbeat, it is "BAD". If it has been declared failed by the supervision, which occurs after missing heartbeats for about 15 seconds, it will be marked "FAILED".

      Additionally it will also have the following attributes for:

      Coordinators and DBServers

      • SyncStatus: The last sync status reported by the node. This value is primarily used to determine the value of Status. Possible values include "UNKNOWN", "UNDEFINED", "STARTUP", "STOPPING", "STOPPED", "SERVING", "SHUTDOWN".
      • LastAckedTime: ISO 8601 timestamp specifying the last heartbeat received.
      • ShortName: A string representing the shortname of the server, e.g. "Coordinator0001".
      • Timestamp: ISO 8601 timestamp specifying the last heartbeat received. (deprecated)
      • Host: An optional string, specifying the host machine if known.

      Coordinators only

      • AdvertisedEndpoint: A string representing the advertised endpoint, if set. (e.g. external IP address or load balancer, optional)

      Agents

      • Leader: ID of the agent this node regards as leader.
      • Leading: Whether this agent is the leader (true) or not (false).
      • LastAckedTime: Time since last acked in seconds.

Return codes

  • 200:
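A monitoring script typically reduces the Health object to the nodes that need attention. The following sketch filters on the Status attribute described above; the function name unhealthy_nodes is hypothetical, and the node IDs and cluster ID in the sample are placeholders, not real identifiers.

```python
def unhealthy_nodes(health_response):
    """Return a node ID -> Status mapping for every node that is not "GOOD"."""
    return {
        node_id: info.get("Status")
        for node_id, info in health_response["Health"].items()
        if info.get("Status") != "GOOD"
    }


# Illustrative sample only; IDs are placeholders.
sample_health = {
    "ClusterId": "00000000-0000-0000-0000-000000000000",
    "Health": {
        "PRMR-a": {"Role": "DBSERVER", "Status": "GOOD"},
        "CRDN-b": {"Role": "COORDINATOR", "Status": "BAD"},
        "AGNT-c": {"Role": "AGENT", "Status": "GOOD"},
        "PRMR-d": {"Role": "DBSERVER", "Status": "FAILED"},
    },
}

print(unhealthy_nodes(sample_health))
```

Nodes without a Status attribute are reported with the value None rather than being silently treated as healthy.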

Other

Reloads the routing information

Reload the routing table.

POST /_admin/routing/reload

Reloads the routing information from the collection routing.

Return codes

  • 200: Routing information was reloaded successfully.