Replication Logger Commands

Previous versions of ArangoDB allowed starting, stopping and configuring the replication logger. These commands are superfluous as of ArangoDB 2.2 because all data-modification operations are written to the server’s write-ahead log and are no longer handled by a separate logger.

The only useful operations remaining since ArangoDB 2.2 are to query the current state of the logger and to fetch the latest changes written by the logger. The operations will return the state and data from the write-ahead log.

Return replication logger state

Returns the state of the replication logger

GET /_api/replication/logger-state

Returns the current state of the server’s replication logger. The state will include information about whether the logger is running and about the last logged tick value. This tick value is important for incremental fetching of data.

The body of the response contains a JSON object with the following attributes:

  • state: the current logger state as a JSON object with the following sub-attributes:

    • running: whether or not the logger is running

    • lastLogTick: the value of the latest tick the logger has logged. This value can be used for incremental fetching of log data.

    • totalEvents: total number of events logged since the server was started. The value is not reset between multiple stops and re-starts of the logger.

    • time: the current date and time on the logger server

  • server: a JSON object with the following sub-attributes:

    • version: the logger server’s version

    • serverId: the logger server’s id

  • clients: returns the last fetch status by replication clients connected to the logger. Each client is returned as a JSON object with the following attributes:

    • serverId: server id of client

    • lastServedTick: last tick value served to this client via the logger-follow API

    • time: date and time when this client last called the logger-follow API
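The attributes listed above can be read with a few lines of client code. The following is a minimal sketch, not an official client: the function name `summarize_logger_state` and every value in the sample body are made up for illustration, and tick values are assumed to be convertible with `int()`.

```python
import json


def summarize_logger_state(body: str) -> dict:
    """Extract the fields a replication client typically needs (illustrative helper)."""
    doc = json.loads(body)
    state = doc["state"]
    return {
        "running": bool(state["running"]),
        # Convert the tick to an integer so it can be compared numerically.
        "last_log_tick": int(state["lastLogTick"]),
        "server_id": doc["server"]["serverId"],
        # Map each client's server id to the last tick served to it.
        "clients": {
            c["serverId"]: int(c["lastServedTick"]) for c in doc.get("clients", [])
        },
    }


# Hypothetical response body; all values are invented for this example.
sample_body = json.dumps({
    "state": {
        "running": True,
        "lastLogTick": "105730",
        "totalEvents": 100,
        "time": "2019-01-01T00:00:00Z",
    },
    "server": {"version": "3.4.0", "serverId": "1234"},
    "clients": [
        {"serverId": "5678", "lastServedTick": "105709", "time": "2019-01-01T00:00:00Z"}
    ],
})

summary = summarize_logger_state(sample_body)
```

A client could store `last_log_tick` and later pass it as the from value when fetching changes incrementally.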

Return codes

  • 200: is returned if the logger state could be determined successfully.

  • 405: is returned when an invalid HTTP method is used.

  • 500: is returned if the logger state could not be determined.

Examples

Returns the state of the replication logger.

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/replication/logger-state

HTTP/1.1 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff

To query the latest changes logged by the replication logger, the HTTP interface also provides the logger-follow method.

This method should be used by replication clients to incrementally fetch updates from an ArangoDB database.

Returns log entries

Fetch log lines from the server

GET /_api/replication/logger-follow

This route should no longer be used. It is deprecated as of version 3.4.0.

Query Parameters

  • from (optional): Exclusive lower bound tick value for results.

  • to (optional): Inclusive upper bound tick value for results.

  • chunkSize (optional): Approximate maximum size of the returned result.

  • includeSystem (optional): Include system collections in the result. The default value is true.

Returns data from the server’s replication log. This method can be called by replication clients after an initial synchronization of data. The method will return all “recent” log entries from the logger server, and the clients can replay and apply these entries locally so they get to the same data state as the logger server.

Clients can call this method repeatedly to incrementally fetch all changes from the logger server. In this case, they should provide the from value so that only the log events since their last fetch are returned.

When the from query parameter is not used, the logger server will return log entries starting at the beginning of its replication log. When the from parameter is used, the logger server will only return log entries which have higher tick values than the specified from value (note: the log entry with a tick value equal to from will be excluded). Use the from value when incrementally fetching log data.

The to query parameter can be used to optionally restrict the upper bound of the result to a certain tick value. If used, the result will contain only log events with tick values up to (including) to. In incremental fetching, there is no need to use the to parameter. It only makes sense in special situations, when only parts of the change log are required.
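The bounds described above (from is exclusive, to is inclusive) can be expressed as a small predicate. This is only a sketch of the documented semantics, not server code; the helper name `in_requested_range` is hypothetical, and the tick values are taken from the examples on this page.

```python
def in_requested_range(tick, from_tick=None, to_tick=None):
    """Return True if a log event's tick falls inside the requested range."""
    tick = int(tick)
    # from is an exclusive lower bound: the event equal to from is skipped.
    if from_tick is not None and tick <= int(from_tick):
        return False
    # to is an inclusive upper bound: the event equal to to is kept.
    if to_tick is not None and tick > int(to_tick):
        return False
    return True


ticks = [105688, 105692, 105709, 105730]
selected = [t for t in ticks if in_requested_range(t, from_tick=105692, to_tick=105709)]
```

With from=105692 and to=105709, only the event with tick 105709 is selected.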

The chunkSize query parameter can be used to control the size of the result. It must be specified in bytes. The chunkSize value is only honored approximately: the server consults it only after a log entry has been written into the result, because a chunkSize value that is too low could otherwise prevent the server from returning even a single log entry. If the result size is then bigger than chunkSize, the server responds with the log entries that are already in the result. If the result size is still smaller than chunkSize, the server tries to add more data if there is more data left to return.

If chunkSize is not specified, some server-side default value will be used.
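The "write first, check afterwards" rule for chunkSize can be simulated as follows. This is a sketch of the behavior as described here, not the server's implementation; the function name `take_chunk` and the way the size is counted (UTF-8 bytes plus one line break per entry) are assumptions.

```python
def take_chunk(entries, chunk_size):
    """Return the entries one response would carry under the described rule."""
    result = []
    size = 0
    for entry in entries:
        # An entry is always written into the result first ...
        result.append(entry)
        # ... counting its bytes plus one line break (assumed size accounting).
        size += len(entry.encode("utf-8")) + 1
        # ... and only then is chunkSize consulted.
        if size > chunk_size:
            break
    return result


entries = ["a" * 10] * 5                 # five dummy entries of 10 bytes each
tiny = take_chunk(entries, 1)            # too-low chunkSize still yields one entry
medium = take_chunk(entries, 25)         # stops once the result exceeds 25 bytes
large = take_chunk(entries, 1000)        # everything fits
```

Because the check happens after the write, even `chunk_size=1` returns one entry, and the result may overshoot chunkSize by up to one entry.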

The Content-Type of the result is application/x-arango-dump. This is an easy-to-process format, with all log events going onto separate lines in the response body. Each log event itself is a JSON object, with at least the following attributes:

  • tick: the log event tick value

  • type: the log event type

Individual log events will also have additional attributes, depending on the event type. A few common attributes which are used for multiple event types are:

  • cid: id of the collection the event was for

  • tid: id of the transaction the event was contained in

  • key: document key

  • rev: document revision id

  • data: the original document data

A more detailed description of the individual replication event types and their data structures can be found in Operation Types.
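Since each log event occupies its own line as a JSON object, a response body can be parsed line by line. The sketch below assumes exactly that layout; the helper name `parse_dump` is hypothetical, and the sample events (including their type numbers and keys) are placeholder values, not real server output.

```python
import json


def parse_dump(body: str) -> list:
    """Parse a line-per-event response body into a list of dicts."""
    events = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing line break
        event = json.loads(line)
        # Every log event is documented to carry at least tick and type.
        if "tick" not in event or "type" not in event:
            raise ValueError("log event lacks tick or type")
        events.append(event)
    return events


# Placeholder body: two invented events, one JSON object per line.
sample_body = (
    '{"tick":"105710","type":1,"cid":"100","key":"a","rev":"r1","data":{"x":1}}\n'
    '{"tick":"105715","type":2,"cid":"100","key":"b","rev":"r2","data":{"x":2}}\n'
)

events = parse_dump(sample_body)
last_tick = max(int(e["tick"]) for e in events)
```

Processing line by line also means a client could apply events as they are read instead of holding the whole response in memory.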

The response will also contain the following HTTP headers:

  • x-arango-replication-active: whether or not the logger is active. Clients can use this flag as an indication for their polling frequency. If the logger is not active and there are no more replication events available, it might be sensible for a client to abort, or to go to sleep for a long time and try again later to check whether the logger has been activated.

  • x-arango-replication-lastincluded: the tick value of the last log event included in the result. In incremental log fetching, this value can be used as the from value for the following request. Note that if the result is empty, the value will be 0. Clients should not use 0 as the from value in the next request, because the server would then return the log events from the start of the log again.

  • x-arango-replication-lasttick: the last tick value the logger server has logged (not necessarily included in the result). By comparing the last tick and last included tick values, clients have an approximate indication of how many events there are still left to fetch.

  • x-arango-replication-checkmore: whether or not there already exists more log data which the client could fetch immediately. If there is more log data available, the client could call logger-follow again with an adjusted from value to fetch remaining log entries until there are no more.

    If there isn’t any more log data to fetch, the client might decide to go to sleep for a while before calling the logger again.
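The headers above are enough for a client to decide what to do after each response. The following sketch shows one possible policy; the function name `plan_next_request` and the action labels are invented for illustration, and header values are assumed to be the strings "true" and "false" as in the examples on this page.

```python
def plan_next_request(headers, current_from):
    """Derive the next from value and a polling action from response headers."""
    h = {k.lower(): v for k, v in headers.items()}
    last_included = int(h.get("x-arango-replication-lastincluded", "0"))
    check_more = h.get("x-arango-replication-checkmore", "false") == "true"
    active = h.get("x-arango-replication-active", "false") == "true"

    # lastincluded is 0 for an empty result; keep the previous from value then,
    # otherwise the next request would start at the beginning of the log.
    next_from = last_included if last_included != 0 else current_from

    if check_more:
        action = "fetch_now"      # more data is available immediately
    elif active:
        action = "sleep_short"    # logger is active, poll again soon
    else:
        action = "sleep_long"     # logger inactive, back off or abort
    return next_from, action


# Header values as shown in the examples of this section.
empty_result = plan_next_request(
    {
        "x-arango-replication-active": "true",
        "x-arango-replication-checkmore": "false",
        "x-arango-replication-lastincluded": "0",
    },
    current_from=105709,
)
more_available = plan_next_request(
    {
        "x-arango-replication-active": "true",
        "x-arango-replication-checkmore": "true",
        "x-arango-replication-lastincluded": "105692",
    },
    current_from=105688,
)
```

How long the short and long sleeps should be is left to the client; the source gives no concrete intervals.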

Note: this method is not supported on a coordinator in a cluster.

Return codes

  • 200: is returned if the request was executed successfully, and there are log events available for the requested range. The response body will not be empty in this case.

  • 204: is returned if the request was executed successfully, but there are no log events available for the requested range. The response body will be empty in this case.

  • 400: is returned if either the from or to values are invalid.

  • 405: is returned when an invalid HTTP method is used.

  • 500: is returned if an error occurred while assembling the response.

  • 501: is returned when this operation is called on a coordinator in a cluster.

Examples

No log events available

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/replication/logger-follow?from=105709

HTTP/1.1 No Content
content-type: application/x-arango-dump; charset=utf-8
x-arango-replication-active: true
x-arango-replication-checkmore: false
x-arango-replication-frompresent: true
x-arango-replication-lastincluded: 0
x-arango-replication-lastscanned: 105709
x-arango-replication-lasttick: 105709
x-content-type-options: nosniff

A few log events (One JSON document per line)

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/replication/logger-follow?from=105709

HTTP/1.1 OK
content-type: application/x-arango-dump; charset=utf-8
x-arango-replication-active: true
x-arango-replication-checkmore: false
x-arango-replication-frompresent: true
x-arango-replication-lastincluded: 105730
x-arango-replication-lastscanned: 105730
x-arango-replication-lasttick: 105730
x-content-type-options: nosniff

More events than would fit into the response

shell> curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/replication/logger-follow?from=105688&chunkSize=400'

HTTP/1.1 OK
content-type: application/x-arango-dump; charset=utf-8
x-arango-replication-active: true
x-arango-replication-checkmore: true
x-arango-replication-frompresent: true
x-arango-replication-lastincluded: 105692
x-arango-replication-lastscanned: 105692
x-arango-replication-lasttick: 105709
x-content-type-options: nosniff

To check what range of changes is available (identified by tick values), the HTTP interface provides the methods logger-first-tick and logger-tick-ranges. Replication clients can use these methods to determine whether certain data (identified by a tick value) is still available on the master.

Returns the first available tick value

Return the first available tick value from the server

GET /_api/replication/logger-first-tick

Returns the first available tick value that can be served from the server’s replication log. This method can be called by replication clients to determine whether certain data (identified by a tick value) is still available for replication.

The result is a JSON object containing the attribute firstTick. This attribute contains the minimum tick value available in the server’s replication log.
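A client might use firstTick to decide whether it can resume incremental fetching or needs a fresh initial synchronization. The check below is a deliberately conservative sketch: the helper name `can_resume_from` is hypothetical, and the rule "the last applied tick must not be older than firstTick" is an assumption rather than documented server behavior.

```python
import json


def can_resume_from(first_tick_body: str, last_applied_tick: int) -> bool:
    """Conservatively decide whether incremental fetching can continue."""
    first_tick = int(json.loads(first_tick_body)["firstTick"])
    # Assumption: if the client's last applied tick is older than the first
    # tick still available, some events in between may already be gone.
    return last_applied_tick >= first_tick


body = '{"firstTick": "5"}'
resume_ok = can_resume_from(body, 105709)
resume_stale = can_resume_from(body, 3)
```

When the check fails, the safe fallback is a full resynchronization before calling the incremental API again.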

Note: this method is not supported on a coordinator in a cluster.

Return codes

  • 200: is returned if the request was executed successfully.

  • 405: is returned when an invalid HTTP method is used.

  • 500: is returned if an error occurred while assembling the response.

  • 501: is returned when this operation is called on a coordinator in a cluster.

Examples

Returning the first available tick

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/replication/logger-first-tick

HTTP/1.1 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff

{ 
  "firstTick" : "5" 
}

Return the tick ranges available in the WAL logfiles

Returns the tick value ranges available in the logfiles

GET /_api/replication/logger-tick-ranges

Returns the currently available ranges of tick values for all currently available WAL logfiles. The tick values can be used to determine if certain data (identified by tick value) are still available for replication.

The body of the response contains a JSON array. Each array member is an object that describes a single logfile. Each object has the following attributes:

  • datafile: name of the logfile

  • status: status of the datafile, in textual form (e.g. “sealed”, “open”)

  • tickMin: minimum tick value contained in logfile

  • tickMax: maximum tick value contained in logfile
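Given the array described above, a client can look up which logfile, if any, covers a particular tick. This is a minimal sketch: the helper name `find_logfile` is hypothetical, and the sample entries reuse tick values from the example response on this page with file names shortened to their base names.

```python
def find_logfile(ranges, tick):
    """Return the datafile whose [tickMin, tickMax] range contains tick, else None."""
    for entry in ranges:
        # Tick values arrive as strings, so convert before comparing.
        if int(entry["tickMin"]) <= tick <= int(entry["tickMax"]):
            return entry["datafile"]
    return None


ranges = [
    {"datafile": "logfile-2.db", "status": "collected", "tickMin": "5", "tickMax": "103821"},
    {"datafile": "logfile-16.db", "status": "collected", "tickMin": "103829", "tickMax": "103968"},
    {"datafile": "logfile-103971.db", "status": "open", "tickMin": "105645", "tickMax": "105730"},
]

covered = find_logfile(ranges, 105700)
in_gap = find_logfile(ranges, 103825)
```

A `None` result means the tick falls into a gap between logfiles or outside all ranges; ticks are not contiguous, so a gap by itself does not imply missing data.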

Return codes

  • 200: is returned if the tick ranges could be determined successfully.

  • 405: is returned when an invalid HTTP method is used.

  • 500: is returned if the tick ranges could not be determined.

  • 501: is returned when this operation is called on a coordinator in a cluster.

Examples

Returns the available tick ranges.

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/replication/logger-tick-ranges

HTTP/1.1 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff

[ 
  { 
    "datafile" : "/tmp/arangosh_gMkkDj/tmp-59-3475165828/data/journals/logfile-2.db", 
    "status" : "collected", 
    "tickMin" : "5", 
    "tickMax" : "103821" 
  }, 
  { 
    "datafile" : "/tmp/arangosh_gMkkDj/tmp-59-3475165828/data/journals/logfile-16.db", 
    "status" : "collected", 
    "tickMin" : "103829", 
    "tickMax" : "103968" 
  }, 
  { 
    "datafile" : "/tmp/arangosh_gMkkDj/tmp-59-3475165828/data/journals/logfile-43.db", 
    "status" : "collected", 
    "tickMin" : "103977", 
    "tickMax" : "105614" 
  }, 
  { 
    "datafile" : "/tmp/arangosh_gMkkDj/tmp-59-3475165828/data/journals/logfile-103824.db", 
    "status" : "collected", 
    "tickMin" : "105622", 
    "tickMax" : "105640" 
  }, 
  { 
    "datafile" : "/tmp/arangosh_gMkkDj/tmp-59-3475165828/data/journals/logfile-103971.db", 
    "status" : "open", 
    "tickMin" : "105645", 
    "tickMax" : "105730" 
  } 
]