An introduction to TTL indexes

ArangoDB 3.5 comes with an additional type of index, named “TTL index”. “TTL” means “time-to-live”, as this type of index can be used to automatically expire documents in a collection.

Example use cases for TTL indexes include expiring sessions after a configurable time of inactivity, and automatically purging “old” measurements, statistics or logs.

How it works

By default, documents stored in a collection remain there until they are deleted explicitly by a removal operation.
This is the desired behavior for most scenarios. However, there are use cases in which it is preferable to keep only the most recent documents around and to discard older documents without any manual cleanup.

This is where TTL indexes come to the rescue. They make it possible to remove “old” documents automatically after a certain point in time or after a certain period of time.

A TTL index is a regular index on a collection, and it indexes a date/time point for each document. When this point in time is reached, the document is considered expired and will soon be removed by a background thread.

Expiring documents at a certain date/time

The easiest way to use a TTL index is to give every document an expiration date/time value, and let the index remove the documents once that point in time has been reached.

Here’s a quick example for a collection named “sessions”. Let’s first create the collection and a TTL index for it:
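A minimal arangosh sketch of this step, using `db._create()` and `ensureIndex()` with an index of type “ttl” on the “expireAt” attribute. The variable name `sessionTtlIndex` and the `typeof db` guard are assumptions added for illustration, so that the snippet also loads outside arangosh, where no `db` object exists.

```javascript
// Definition of a TTL index on the "expireAt" attribute.
// expireAfter: 0 means "expire exactly at the stored date/time".
const sessionTtlIndex = { type: "ttl", fields: ["expireAt"], expireAfter: 0 };

// `db` is only defined inside arangosh, so guard the server calls.
if (typeof db !== "undefined") {
  db._create("sessions");
  db.sessions.ensureIndex(sessionTtlIndex);
}
```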

We can now store documents in the “sessions” collection, and provide the “expireAt” attribute for each document we store in it:
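As a sketch, inserting two session documents could look like this. The keys, user names and dates are made-up example values, and the `typeof db` guard is only there so the snippet also loads outside arangosh.

```javascript
// Hypothetical session documents; each carries its own expiration point.
const exampleSessions = [
  { _key: "session-alice", user: "alice", expireAt: "2019-06-22T10:42:54Z" },
  { _key: "session-bob", user: "bob", expireAt: "2019-06-23T08:00:00Z" }
];

// `db` is only defined inside arangosh, so guard the server calls.
if (typeof db !== "undefined") {
  exampleSessions.forEach((doc) => db.sessions.insert(doc));
}
```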

Because we set the index’s “expireAfter” attribute to “0”, documents are considered expired exactly at the date/time given in their individual “expireAt” attribute. No further action is required to remove the documents at that point in time. This is all handled by a database background thread.

In this setup, it is necessary to update the “expireAt” attribute with a prolonged expiration date/time whenever a document is updated. Doing this will keep documents alive when they are updated, and expire them eventually when no further updates happen.
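One way to make sure the expiration is never forgotten is to route every update through a small helper that merges the new “expireAt” value into the change. The following is a sketch under assumptions: the helper names `prolongedExpiry` and `touchSession`, and the idea of a fixed session lifetime in seconds, are illustrative and not part of ArangoDB.

```javascript
// Build the patch that moves the expiration `ttlSeconds` past `now`.
function prolongedExpiry(ttlSeconds, now = new Date()) {
  return { expireAt: new Date(now.getTime() + ttlSeconds * 1000).toISOString() };
}

// Apply a regular change and the new expiration in a single update.
function touchSession(key, changes, ttlSeconds) {
  const patch = Object.assign({}, changes, prolongedExpiry(ttlSeconds));
  // `db` is only defined inside arangosh, so guard the server call.
  if (typeof db !== "undefined") {
    db.sessions.update(key, patch);
  }
  return patch;
}
```

For example, `touchSession("session-alice", { cartItems: 3 }, 1800)` would store the change and keep the session alive for another 30 minutes.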

Expiring documents after a certain period of time

In the above usage example the client application is responsible for calculating a reasonable expiration date/time.
Calculation of the actual expiration date/time can alternatively be performed by the database based on any document’s date/time attribute value plus a fixed TTL value.

This is what the “expireAfter” attribute of the index is for. Its value is specified in seconds.
For example, to expire documents automatically one day (86400 seconds) after they were last modified, use the following setup:
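A sketch of such a setup in arangosh. The collection name “events” and the variable names are assumptions chosen for illustration; the index itself is of type “ttl” on “lastModifiedAt” with an “expireAfter” value of one day.

```javascript
const ONE_DAY_IN_SECONDS = 24 * 60 * 60; // 86400

// TTL index: documents expire one day after their "lastModifiedAt" value.
const lastModifiedTtlIndex = {
  type: "ttl",
  fields: ["lastModifiedAt"],
  expireAfter: ONE_DAY_IN_SECONDS
};

// `db` is only defined inside arangosh, so guard the server calls.
if (typeof db !== "undefined") {
  const events = db._create("events"); // hypothetical collection name
  events.ensureIndex(lastModifiedTtlIndex);
}
```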

All the client application now needs to do is to fill the “lastModifiedAt” attribute of each document with the current date/time:
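A minimal sketch of stamping a document on the client side. The helper name `withLastModified`, the example document and the collection name “events” are assumptions; the collection is expected to exist already and to carry a TTL index on “lastModifiedAt”.

```javascript
// Return a copy of `doc` stamped with the given (default: current) date/time.
function withLastModified(doc, now = new Date()) {
  return Object.assign({}, doc, { lastModifiedAt: now.toISOString() });
}

const stampedEvent = withLastModified({ _key: "event-1", value: 42 });

// `db` is only defined inside arangosh, so guard the server call.
if (typeof db !== "undefined") {
  db._collection("events").insert(stampedEvent);
}
```

With an “expireAfter” value of 86400, a document stamped this way becomes eligible for removal one day after the stored date/time.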

Still, whenever the client application updates a document it should also bump the date/time value in the document’s “lastModifiedAt” attribute to keep documents from being removed prematurely.

Format of date/time values

Date/time values that define the documents’ expiration values can be given as UTC date/time strings as seen in the above examples, or as numeric timestamp values.

The timestamp value should be a Unix timestamp, that is, the number of seconds elapsed since January 1st, 1970 (UTC). For example, the timestamp equivalent of the date/time string “2019-06-22T10:42:54Z” would be 1561200174.

That means we could have inserted the last document as follows, and the outcome would have been identical:
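A sketch of the numeric variant, which also shows how the timestamp can be derived from the date/time string. The helper name `toUnixSeconds`, the document key and the collection name “events” are assumptions for illustration.

```javascript
// Convert an ISO 8601 date/time string to a Unix timestamp in seconds.
function toUnixSeconds(isoString) {
  return Math.floor(Date.parse(isoString) / 1000);
}

const numericStampedEvent = {
  _key: "event-2",
  lastModifiedAt: toUnixSeconds("2019-06-22T10:42:54Z") // 1561200174
};

// `db` is only defined inside arangosh, so guard the server call.
if (typeof db !== "undefined") {
  db._collection("events").insert(numericStampedEvent);
}
```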

If a document’s TTL index attribute contains neither a proper date/time string nor a numeric timestamp value, the document will not be stored in the TTL index and thus will not be removed automatically. This way, certain documents can be kept from being removed if desired.

Fine-tuning removal operations

The frequency for invoking the background removal thread can be configured using the server’s --ttl.frequency startup option. This frequency is specified in milliseconds.

To avoid “random” load spikes caused by the background thread suddenly kicking in and removing a lot of documents at once, the number of documents removed per thread invocation can be capped.
The total maximum number of documents to be removed per thread invocation is controlled by the server startup option --ttl.max-total-removes. The maximum number of documents removed from a single collection per invocation is controlled by the startup option --ttl.max-collection-removes.
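The interplay of the two caps can be illustrated with a toy model. This is not the server’s actual implementation: the function name `planRemovals`, the order in which collections are visited and all numbers are assumptions chosen only to show how a per-collection cap and a total cap limit a single invocation.

```javascript
// Toy model of one invocation: how many expired documents get removed
// from each collection, given a total cap and a per-collection cap.
function planRemovals(expiredByCollection, maxTotalRemoves, maxCollectionRemoves) {
  let budget = maxTotalRemoves;
  const plan = {};
  for (const [name, expired] of Object.entries(expiredByCollection)) {
    const removed = Math.min(expired, maxCollectionRemoves, budget);
    plan[name] = removed;
    budget -= removed;
  }
  return plan;
}

// Hypothetical backlog: 5000 expired sessions and 300 expired log entries.
const examplePlan = planRemovals({ sessions: 5000, logs: 300 }, 1200, 1000);
```

In this model the first invocation removes 1000 documents from “sessions” (per-collection cap) and 200 from “logs” (remaining total budget); the rest is left for later invocations.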

Limitations

The actual removal of expired documents will not necessarily happen immediately when they have reached their expiration time.

Expired documents will eventually be removed by a background thread that is periodically going through all TTL indexes and removing the expired documents.

There is no guarantee when exactly the removal of expired documents will be carried out, so queries may still find and return documents that have already expired. These will eventually be removed when the background thread kicks in and has spare capacity to remove them. It is guaranteed, however, that only documents which are past their expiration time will actually be removed, so documents will never be removed prematurely.
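Since queries may still return documents that are past their expiration time, an application that needs strict expiry semantics can filter such documents out itself. The following is a client-side sketch under assumptions: the helper names are made up, and JavaScript’s `Date.parse` may accept or reject other string formats than ArangoDB does, so it only approximates the server’s notion of a valid date/time string.

```javascript
// Expiration point in milliseconds, or null if the value is not a usable date/time.
// Numeric values are treated as Unix timestamps in seconds.
function expiryMillis(value, expireAfterSeconds) {
  const baseMs = typeof value === "number" ? value * 1000 : Date.parse(value);
  return Number.isNaN(baseMs) ? null : baseMs + expireAfterSeconds * 1000;
}

// True if the document has not yet reached its expiration point.
// Documents without a usable date/time value never expire.
function isStillValid(doc, field, expireAfterSeconds, nowMs = Date.now()) {
  const expiresAt = expiryMillis(doc[field], expireAfterSeconds);
  return expiresAt === null || nowMs < expiresAt;
}
```

A query result could then be post-processed with something like `results.filter((doc) => isStillValid(doc, "expireAt", 0))`.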

Please note that there is one background thread per ArangoDB database server instance for performing the removal of expired documents of all collections in all databases. If the number of databases and collections with TTL indexes is high and there are many documents to remove, the background thread may at least temporarily lag behind with the removal operations. It should eventually catch up as long as the number of to-be-removed documents per invocation does not exceed the background thread’s configured threshold values.

Please also note that TTL indexes are designed exactly for the purpose of removing expired documents from collections.

For higher efficiency, they always store expiration date/time values as numeric timestamps, even if the attribute value is a date/time string in the document. Because of this transformation, TTL indexes will likely not be used for filtering and/or sorting in any of your AQL queries!
