HTTP Interface for Analyzers

The analyzer REST API is available at the /_api/analyzer endpoint and is called via HTTP requests.

Analyzer Operations

Create an analyzer with the supplied definition

creates a new analyzer based on the provided definition

POST /_api/analyzer

A JSON object with these properties is required:

  • name: The analyzer name.

  • type: The analyzer type.

  • properties: The properties used to configure the specified type. Value may be a string, an object or null. The default value is null.

  • features: The set of features to set on the analyzer generated fields. The default value is an empty array.

Creates a new analyzer based on the provided configuration.

Return codes

  • 200: An analyzer with a matching name and definition already exists.

  • 201: A new analyzer definition was successfully created.

  • 400: One or more of the required parameters is missing or one or more of the parameters is not valid.

  • 403: The user does not have permission to create an analyzer with this configuration.
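
The request described above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper name build_create_request and the CREATE_STATUS table are hypothetical, the base URL is assumed to match the examples on this page, and the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8529"  # assumption: local server, as in the examples


def build_create_request(name, analyzer_type, properties=None, features=None):
    """Build (but do not send) a POST /_api/analyzer request."""
    body = {
        "name": name,
        "type": analyzer_type,
        "properties": properties,    # string, object or null (documented default: null)
        "features": features or [],  # documented default: empty array
    }
    return urllib.request.Request(
        BASE_URL + "/_api/analyzer",
        data=json.dumps(body).encode("utf-8"),
        headers={"accept": "application/json", "content-type": "application/json"},
        method="POST",
    )


# Documented return codes for this endpoint, paraphrased.
CREATE_STATUS = {
    200: "an analyzer with a matching name and definition already exists",
    201: "a new analyzer definition was created",
    400: "a required parameter is missing or a parameter is not valid",
    403: "the user may not create an analyzer with this configuration",
}

req = build_create_request("testAnalyzer", "identity")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To send the request, pass req to urllib.request.urlopen. Note that urlopen raises urllib.error.HTTPError for the 400 and 403 responses, so the status code has to be read from the exception in those cases.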

Examples

shell> curl -X POST --header 'accept: application/json' --data-binary @- --dump - http://localhost:8529/_api/analyzer <<EOF
{ 
  "name" : "testAnalyzer", 
  "type" : "identity" 
}
EOF

HTTP/1.1 201 Created
content-type: application/json; charset=utf-8
x-content-type-options: nosniff


Return the analyzer definition

returns an analyzer definition

GET /_api/analyzer/{analyzer-name}

Path Parameters

  • analyzer-name (required): The name of the analyzer to retrieve.

Retrieves the full definition for the specified analyzer name. The resulting object contains the following attributes:

  • name: the analyzer name
  • type: the analyzer type
  • properties: the properties used to configure the specified type
  • features: the set of features to set on the analyzer generated fields

Return codes

  • 200: The analyzer definition was retrieved successfully.

  • 404: Such an analyzer configuration does not exist.
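
As a rough Python sketch of this lookup, the snippet below builds the GET request and checks that a definition carries the four documented attributes. The helper names build_get_request and missing_attributes are hypothetical, the base URL is assumed from the examples on this page, and the sample definition is written by hand rather than copied from a server response.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8529"  # assumption: local server, as in the examples
EXPECTED_ATTRIBUTES = ("name", "type", "properties", "features")


def build_get_request(analyzer_name):
    """Build (but do not send) GET /_api/analyzer/{analyzer-name}."""
    # Percent-encode the name so it stays a single path segment.
    path = "/_api/analyzer/" + urllib.parse.quote(analyzer_name, safe="")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"accept": "application/json"},
        method="GET",
    )


def missing_attributes(definition):
    """Return the documented attributes that are absent from a definition."""
    return [key for key in EXPECTED_ATTRIBUTES if key not in definition]


req = build_get_request("testAnalyzer")
print(req.get_method(), req.full_url)

# Hand-written sample for illustration only, not server output.
sample = {"name": "testAnalyzer", "type": "identity", "properties": None, "features": []}
print(missing_attributes(sample))
```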

Examples

Retrieve an analyzer definition:

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/analyzer/testAnalyzer

HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff


List all analyzers

returns a listing of available analyzer definitions

GET /_api/analyzer

Retrieves an array of all analyzer definitions. The resulting array contains objects with the following attributes:

  • name: the analyzer name
  • type: the analyzer type
  • properties: the properties used to configure the specified type
  • features: the set of features to set on the analyzer generated fields

Return codes

  • 200: The analyzer definitions were retrieved successfully.

Examples

Retrieve all analyzer definitions:

shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/analyzer

HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff


Remove an analyzer

removes an analyzer configuration

DELETE /_api/analyzer/{analyzer-name}

Path Parameters

  • analyzer-name (required): The name of the analyzer to remove.

Query Parameters

  • force (optional): Whether to remove the analyzer configuration even if it is in use. The default value is false.

Removes an analyzer configuration identified by analyzer-name.

If the analyzer definition was successfully dropped, an object is returned with the following attributes:

  • error: false
  • name: The name of the removed analyzer

Return codes

  • 200: The analyzer configuration was removed successfully.

  • 400: The analyzer-name was not supplied or another request parameter was not valid.

  • 403: The user does not have permission to remove this analyzer configuration.

  • 404: Such an analyzer configuration does not exist.

  • 409: The specified analyzer configuration is still in use and force was omitted or set to false.
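
The interplay between the force parameter and the 409 return code can be sketched in Python as follows. The helper names build_delete_request and needs_force are hypothetical, the base URL is assumed from the examples on this page, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8529"  # assumption: local server, as in the examples


def build_delete_request(analyzer_name, force=False):
    """Build (but do not send) DELETE /_api/analyzer/{analyzer-name}."""
    path = "/_api/analyzer/" + urllib.parse.quote(analyzer_name, safe="")
    # Omitting the query string is equivalent to force=false (the default).
    query = "?" + urllib.parse.urlencode({"force": "true"}) if force else ""
    return urllib.request.Request(
        BASE_URL + path + query,
        headers={"accept": "application/json"},
        method="DELETE",
    )


def needs_force(status):
    """409 means the analyzer is still in use and force was not set."""
    return status == 409


plain = build_delete_request("testAnalyzer")
forced = build_delete_request("testAnalyzer", force=True)
print(plain.get_method(), plain.full_url)
print(forced.get_method(), forced.full_url)
```

A caller could send the plain request first and fall back to the forced one only when needs_force returns True for the response status, so that in-use analyzers are never dropped by accident.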

Examples

Removing without force:

shell> curl -X DELETE --header 'accept: application/json' --dump - http://localhost:8529/_api/analyzer/testAnalyzer

HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff


Removing with force:

shell> curl -X POST --header 'accept: application/json' --data-binary @- --dump - http://localhost:8529/_api/collection <<EOF
{ 
  "name" : "testCollection" 
}
EOF

shell> curl -X POST --header 'accept: application/json' --data-binary @- --dump - http://localhost:8529/_api/view <<EOF
{ 
  "name" : "testView", 
  "type" : "arangosearch", 
  "links" : { 
    "testCollection" : { 
      "analyzers" : [ 
        "testAnalyzer" 
      ] 
    } 
  } 
}
EOF

shell> curl -X DELETE --header 'accept: application/json' --dump - http://localhost:8529/_api/analyzer/testAnalyzer?force=false

shell> curl -X DELETE --header 'accept: application/json' --dump - http://localhost:8529/_api/analyzer/testAnalyzer?force=true

HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
x-content-type-options: nosniff


Analyzer Types

The currently implemented Analyzer types are:

  • identity
  • delimiter
  • stem
  • norm
  • ngram
  • text

Identity

An analyzer applying the identity transformation, i.e. returning the input unmodified.

The value of the properties attribute is ignored.

Delimiter

An analyzer capable of breaking up delimited text into tokens as per RFC 4180 (without starting new records on newlines).

The properties allowed for this analyzer are either:

  • a string encoded delimiter to use
  • an object with the attribute delimiter containing the string encoded delimiter to use
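
The two accepted forms of properties can be illustrated with a short Python sketch. The function names are hypothetical, and the splitting step uses Python's csv module only to approximate RFC 4180 field handling; it is not the analyzer's implementation and, unlike the analyzer, is limited to single-character delimiters.

```python
import csv


def delimiter_from_properties(properties):
    """Accept either documented form and return the delimiter string."""
    if isinstance(properties, str):
        return properties
    if isinstance(properties, dict) and isinstance(properties.get("delimiter"), str):
        return properties["delimiter"]
    raise ValueError("properties must be a string or an object with 'delimiter'")


def split_delimited(text, properties):
    """Approximate RFC 4180 field splitting for a single-character delimiter."""
    delimiter = delimiter_from_properties(properties)
    # The one-element list makes csv.reader treat the text as a single record.
    return next(csv.reader([text], delimiter=delimiter))


print(split_delimited('a,"b,c",d', ","))             # string form
print(split_delimited("x|y|z", {"delimiter": "|"}))  # object form
```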

Stem

An analyzer capable of stemming the text, treated as a single token, for supported languages.

The properties allowed for this analyzer are an object with the following attributes:

  • locale: string (required) format: (language[_COUNTRY][.encoding][@variant])
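
The locale format can be made concrete with a small Python parser. This is a sketch under assumptions: the function name parse_locale is hypothetical, and the character classes in the regular expression are guesses, since the text above only states the overall shape language[_COUNTRY][.encoding][@variant].

```python
import re

# Assumed character classes; only the overall shape comes from the documentation.
LOCALE_PATTERN = re.compile(
    r"(?P<language>[A-Za-z]+)"
    r"(?:_(?P<country>[A-Za-z]+))?"
    r"(?:\.(?P<encoding>[^@]+))?"
    r"(?:@(?P<variant>.+))?"
)


def parse_locale(locale):
    """Split a locale string into its parts; absent parts are None."""
    match = LOCALE_PATTERN.fullmatch(locale)
    if match is None:
        raise ValueError("not a valid locale string: %r" % locale)
    return match.groupdict()


print(parse_locale("de"))
print(parse_locale("en_US.UTF-8@euro"))

# A stem analyzer definition using the simplest form (the name is hypothetical).
definition = {"name": "stem_en", "type": "stem", "properties": {"locale": "en"}}
print(definition)
```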

Norm

An analyzer capable of normalizing the text, treated as a single token, i.e. case conversion and accent removal.

The properties allowed for this analyzer are an object with the following attributes:

  • locale: string (required) format: (language[_COUNTRY][.encoding][@variant])
  • case: string enum (optional) one of: lower, none, upper, default: lower
  • accent: boolean (optional) preserve accent, default: true
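
The effect of the case and accent options can be approximated in plain Python. This sketch is not the analyzer's implementation: the function name normalize_token is hypothetical, the locale is ignored, and accent removal is emulated by dropping Unicode combining marks, which may differ from the server's behavior for some scripts.

```python
import unicodedata


def normalize_token(text, case="lower", accent=True):
    """Approximate the norm analyzer: case conversion, optional accent removal."""
    if case == "lower":
        text = text.lower()
    elif case == "upper":
        text = text.upper()
    elif case != "none":
        raise ValueError("case must be one of: lower, none, upper")
    if not accent:
        # Decompose characters, then drop the combining accent marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return text


print(normalize_token("Crème Brûlée"))                # defaults: lower, accents kept
print(normalize_token("Crème Brûlée", accent=False))  # accents removed
print(normalize_token("Crème Brûlée", case="none"))   # unchanged
```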

N-gram

An analyzer capable of producing n-grams from a specified input in a range of [min;max] (inclusive). Can optionally preserve the original input.

The properties allowed for this analyzer are an object with the following attributes:

  • max: unsigned integer (required) maximum n-gram length
  • min: unsigned integer (required) minimum n-gram length
  • preserveOriginal: boolean (required) output the original value as well

Example

With min = 4 and max = 5, the analyzer will produce the following n-grams for the input foobar:

  • foob
  • ooba
  • obar
  • fooba
  • oobar

With preserveOriginal enabled, it will additionally include foobar itself.
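
The example above can be reproduced with a few lines of Python. This is a sketch of the n-gram idea rather than the analyzer's implementation: the function name ngrams is hypothetical, it works on Python characters (the server may treat non-ASCII input differently), the output order simply follows the listing above, and skipping the original when it is already among the n-grams is an assumption of this sketch.

```python
def ngrams(text, min_len, max_len, preserve_original=False):
    """Return all n-grams of text with lengths in [min_len, max_len]."""
    if min_len < 1 or max_len < min_len:
        # Assumption of this sketch: 1 <= min_len <= max_len.
        raise ValueError("expected 1 <= min_len <= max_len")
    result = []
    for n in range(min_len, max_len + 1):
        # Slide a window of width n over the input.
        for start in range(len(text) - n + 1):
            result.append(text[start:start + n])
    if preserve_original and text not in result:
        result.append(text)
    return result


print(ngrams("foobar", 4, 5))
print(ngrams("foobar", 4, 5, preserve_original=True))
```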

Text

An analyzer capable of breaking up strings into individual words while also optionally filtering out stop-words, applying case conversion and extracting word stems.

The properties allowed for this analyzer are an object with the following attributes:

  • locale: string (required) format: (language[_COUNTRY][.encoding][@variant])
  • case: string enum (optional) one of: lower, none, upper, default: lower
  • stopwords: array of strings (optional) words to omit from result, default: load words from stopwordsPath
  • stopwordsPath: string (optional) path with the language sub-directory containing files with words to omit, default: if no stopwords are provided, the value of the environment variable IRESEARCH_TEXT_STOPWORD_PATH or, if that is undefined, the current working directory
  • accent: boolean (optional) preserve accent in returned words, default: false
  • stemming: boolean (optional) apply stemming on returned words, default: true
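
How the optional attributes and their defaults combine can be sketched in Python. The helper name text_properties and the analyzer name text_en_nostem are hypothetical; the defaults and allowed values are taken from the list above, and stopwords and stopwordsPath are left out unless the caller supplies them.

```python
import json

TEXT_DEFAULTS = {"case": "lower", "accent": False, "stemming": True}
ALLOWED_CASE = {"lower", "none", "upper"}
ALLOWED_KEYS = {"locale", "case", "stopwords", "stopwordsPath", "accent", "stemming"}


def text_properties(locale, **overrides):
    """Build a properties object for the text analyzer with defaults filled in."""
    props = {"locale": locale, **TEXT_DEFAULTS, **overrides}
    unknown = set(props) - ALLOWED_KEYS
    if unknown:
        raise ValueError("unknown attributes: %s" % sorted(unknown))
    if props["case"] not in ALLOWED_CASE:
        raise ValueError("case must be one of: lower, none, upper")
    return props


# English text analyzer with an explicit stop-word list and stemming disabled.
definition = {
    "name": "text_en_nostem",  # hypothetical analyzer name
    "type": "text",
    "properties": text_properties("en", stopwords=["the", "a"], stemming=False),
}
print(json.dumps(definition, indent=2))
```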