Back Office API (2.7.0)

Back Office

This API acts as a create/read/update/delete interface for anything related to documents.

Format

All request bodies in this API are JSON-encoded, and the Content-Type header should be set to application/json.

Auth

Every request must include the authorizationToken header, set to the token provided to you.
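A minimal sketch of the two headers described in the Format and Auth sections. The function name build_headers is hypothetical; only the header names and the content type come from this document.

```python
def build_headers(token):
    """Return the headers that the Format and Auth sections ask for."""
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        "authorizationToken": token,         # header name from the Auth section
        "Content-Type": "application/json",  # all request bodies are JSON
    }
```

The returned dictionary can be passed to whichever HTTP client you use.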

Document

We consider a document to be a cohesive text, for example a complete news article. It consists of a unique id, either a text snippet or a file, optional properties and optional tags. The text snippet is ideally a short, meaningful representation of the larger document, reduced to a single paragraph. Instead of a text snippet, you can upload a file; the system then extracts the text from the file and uses it as the content of the document.

Id

The document id is a unique identifier for a single document.

Snippet

A snippet is a reduced representation of a larger text. If the document is a news article, for example, its text is the article in plain-text form and the snippet is a condensed version of it. For our system to work correctly, the snippet should be only one or two paragraphs long and should clearly summarise the larger text. For example, take a news article about the effects of inflation. A good snippet could be: "Inflation worries as prices keep rising. People having budgetary difficulties as a result. Government pressured to take action." A bad snippet would be simply the very first paragraph of the document. That sometimes works, but it can also yield: "December 20th 2020. Article written by Jane Doe. On this bright and sunny day, people might forget about their worries sometimes."

Properties

Properties are optional data attached to a document. They are usually needed to display the document to the user when it is returned as a personalized document. For example, if you want to integrate a carousel view that lists 10 personalized documents in a "for you" section, you might display each document as an image and a title, with a URL that the user is taken to when they select it. For this you would need three document properties: image, link and title.

Tags

Tags are optional data for documents, which are used to improve the scoring in document searches. Each document can have multiple tags. For example, tags can be categories which the documents can be assigned to.

Documents

Ingest documents

Upsert documents to the system, which creates a representation of the document that will be used to match it against the preferences of a user.

Important note: The maximum size of a request is 10 MB. If your documents are large, a request may therefore reach this limit before it reaches the maximum batch size of 100 documents.

Important note: If a document id appears multiple times, only the last document with that id is retained.
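The two notes above can be handled on the client before sending anything. The following is a sketch under stated assumptions: the 10 MB limit is read as 10,000,000 bytes of serialized body, the body is wrapped in a "documents" key as in the request sample, and the helper names are hypothetical.

```python
import json

MAX_BATCH_ITEMS = 100           # schema limit: 1 to 100 documents per request
MAX_REQUEST_BYTES = 10_000_000  # assumption: the 10 MB limit read as 10^7 bytes


def dedupe_keep_last(documents):
    """Keep only the last document for each id, mirroring the rule above."""
    by_id = {}
    for doc in documents:
        by_id.pop(doc["id"], None)  # forget the earlier copy, if any
        by_id[doc["id"]] = doc      # re-insert so order follows the last occurrence
    return list(by_id.values())


def encode_body(batch):
    """Serialize a batch the way it would be sent, so sizes are measured on real bytes."""
    return json.dumps({"documents": batch}).encode("utf-8")


def make_batches(documents):
    """Yield lists of documents that respect both the item limit and the byte limit."""
    batch = []
    for doc in dedupe_keep_last(documents):
        if len(encode_body([doc])) > MAX_REQUEST_BYTES:
            raise ValueError(f"document {doc['id']!r} alone exceeds the request limit")
        full = len(batch) == MAX_BATCH_ITEMS
        if full or len(encode_body(batch + [doc])) > MAX_REQUEST_BYTES:
            yield batch
            batch = []
        batch.append(doc)
    if batch:
        yield batch
```

Re-serializing the growing batch for every document is simple but not efficient; for large ingestion jobs a running byte count would be the better choice.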

Authorizations:
ApiKeyAuth
Request Body schema: application/json
documents
required
Array of objects (IngestedDocument) [ 1 .. 100 ] items
id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

snippet
string [ 1 .. 2048 ] characters ^[^\x00]+$

Text that will be used to match the document against the user interests. Enclosing whitespace will be trimmed. The length constraints are in bytes, not characters. If summarize is enabled, the length constraint applies to the summarized snippet instead of the original.

Exactly one of snippet and file is required; they are mutually exclusive.

file
string <byte> [ 1 .. 10000000 ] characters

A base64-encoded file. The file must be in one of the supported formats (pdf, doc, etc.). The text content is extracted automatically and, depending on its length, multiple snippets are created for the given document id. The length constraints are in bytes, not characters. This option can only be used with split set to true, and it does not work with summarization.

Exactly one of file and snippet is required; they are mutually exclusive.

Important note: Uploading a file is not enabled by default; please contact us if you need it. If you use this option while it is disabled, a bad request error is returned.

properties
object (DocumentProperties)

Mostly arbitrary properties that can be attached to a document, up to 2.5KB in size. A key must be a valid DocumentPropertyId.

publication_date
string <date-time> (PublicationDate) [ 10 .. 40 ] characters
Deprecated

Deprecated. Document property dates can have any name.

additional property
null or boolean or number or string or Array of strings or string (DocumentProperty)
tags
Array of strings (DocumentTag) [ 0 .. 10 ] items [ items [ 1 .. 256 ] characters ^[^\x00]+$ ]

A tag of a document can be any non-empty, UTF-8-encoded string which doesn't contain a zero byte. Enclosing whitespace will be trimmed. The length constraints are in bytes, not characters.

is_candidate
boolean

Indicates if the document is considered for recommendations. Always overwrites any existing is_candidate value from a previous ingestion.

Setting both is_candidate and default_is_candidate is invalid. Setting neither will default to is_candidate = true.

default_is_candidate
boolean

Behaves like is_candidate but will not overwrite any existing is_candidate value already stored in the database for this document.

Setting both is_candidate and default_is_candidate is invalid. Setting neither will default to is_candidate = true.

summarize
boolean
Default: false

Summarize the document before further processing.

This is incompatible with split.

split
boolean
Default: false

Split the input document into multiple parts before further processing.

This is incompatible with summarize.
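The field rules above interact in several ways (snippet versus file, is_candidate versus default_is_candidate, summarize versus split). The following client-side check is a sketch of those rules as they are worded here, not the server's implementation. The function names are hypothetical, and resolve_is_candidate reflects one reading of how a stored value is affected.

```python
import re

# Pattern copied from the id schema; fullmatch() replaces the ^...$ anchors.
ID_PATTERN = re.compile(r"[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*")


def byte_len(text):
    return len(text.encode("utf-8"))  # length constraints are in bytes


def check_ingested_document(doc):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    doc_id = doc.get("id")
    if not (isinstance(doc_id, str) and ID_PATTERN.fullmatch(doc_id) and byte_len(doc_id) <= 256):
        problems.append("id is missing or malformed")

    has_snippet, has_file = "snippet" in doc, "file" in doc
    if has_snippet == has_file:
        problems.append("exactly one of snippet and file is required")

    summarize, split = doc.get("summarize", False), doc.get("split", False)
    if summarize and split:
        problems.append("summarize and split are incompatible")
    if has_file and (not split or summarize):
        problems.append("file requires split = true and no summarization")

    if has_snippet:
        snippet = doc["snippet"].strip()  # enclosing whitespace is trimmed
        if not snippet or "\x00" in snippet:
            problems.append("snippet is empty or contains a zero byte")
        elif not summarize and byte_len(snippet) > 2048:
            # With summarize enabled the limit applies after summarization,
            # which cannot be checked on the client.
            problems.append("snippet exceeds 2048 bytes")

    if "is_candidate" in doc and "default_is_candidate" in doc:
        problems.append("is_candidate and default_is_candidate cannot both be set")

    tags = [tag.strip() for tag in doc.get("tags", [])]
    if len(tags) > 10 or any(not t or "\x00" in t or byte_len(t) > 256 for t in tags):
        problems.append("tags: at most 10, each 1 to 256 bytes without a zero byte")
    return problems


def resolve_is_candidate(stored, is_candidate=None, default_is_candidate=None):
    """Model the resulting candidate flag; stored is None for a new document."""
    if is_candidate is not None and default_is_candidate is not None:
        raise ValueError("setting both flags is invalid")
    if is_candidate is not None:
        return is_candidate  # always overwrites the stored value
    if default_is_candidate is not None:
        return stored if stored is not None else default_is_candidate
    return True  # neither set: defaults to is_candidate = true
```

For example, check_ingested_document({"id": "doc-1", "snippet": "Inflation worries as prices keep rising."}) returns an empty list.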

Responses

Request samples

Content type
application/json
{
  "documents": [ ]
}

Response samples

Content type
application/json
{ }

Delete documents

Delete all listed documents.

Authorizations:
ApiKeyAuth
Request Body schema: application/json
documents
required
Array of strings (Id) [ 1 .. 1000 ] items [ items [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$ ]

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.
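Because a delete request accepts at most 1000 ids, longer lists have to be split across several requests. The sketch below validates ids against the pattern above and builds one request body per chunk; the helper names are hypothetical.

```python
import re

# Pattern copied from the id schema; fullmatch() replaces the ^...$ anchors.
ID_PATTERN = re.compile(r"[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*")
MAX_IDS_PER_REQUEST = 1000  # schema limit: 1 to 1000 items


def is_valid_id(value):
    """Check the pattern and the 1 to 256 byte length constraint."""
    return (
        isinstance(value, str)
        and 1 <= len(value.encode("utf-8")) <= 256
        and ID_PATTERN.fullmatch(value) is not None
    )


def delete_bodies(ids):
    """Return one request body per chunk of at most 1000 distinct ids."""
    invalid = [i for i in ids if not is_valid_id(i)]
    if invalid:
        raise ValueError(f"invalid document ids: {invalid}")
    unique = list(dict.fromkeys(ids))  # drop repeats, keep first-seen order
    return [
        {"documents": unique[start:start + MAX_IDS_PER_REQUEST]}
        for start in range(0, len(unique), MAX_IDS_PER_REQUEST)
    ]
```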

Responses

Request samples

Content type
application/json
{
  "documents": [ ]
}

Response samples

Content type
application/json
{
  "request_id": "string",
  "kind": "string",
  "details": { }
}

Delete document

Delete the listed document.

Authorizations:
ApiKeyAuth
path Parameters
document_id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

Responses

Response samples

Content type
application/json
{
  "request_id": "string",
  "kind": "string",
  "details": { }
}

Document candidates

Get document candidates

Get the documents considered for recommendations.

Authorizations:
ApiKeyAuth

Responses

Response samples

Content type
application/json
{
  "documents": [ ]
}

Set document candidates

Set the documents considered for recommendations.

Authorizations:
ApiKeyAuth
Request Body schema: application/json
documents
required
Array of objects (DocumentCandidate) >= 0 items
id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.
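Unlike the delete endpoint, which takes a plain array of id strings, this request body takes an array of objects that each carry an id. A small sketch of that wrapping follows; the function name is hypothetical.

```python
def set_candidates_body(ids):
    """Wrap plain ids in the object form that the schema above describes."""
    unique = list(dict.fromkeys(ids))  # drop repeats, keep first-seen order
    return {"documents": [{"id": doc_id} for doc_id in unique]}
```

An empty list is allowed by the schema (>= 0 items), so set_candidates_body([]) produces a body with an empty documents array.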

Responses

Request samples

Content type
application/json
{
  "documents": [ ]
}

Response samples

Content type
application/json
{
  "request_id": "string",
  "kind": "string",
  "details": { }
}

Document properties

Get document properties

Get all the properties of the document.

Authorizations:
ApiKeyAuth
path Parameters
document_id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

Responses

Response samples

Content type
application/json
{
  "properties": { }
}

Set document properties

Set or replace all the properties of the document.

Authorizations:
ApiKeyAuth
path Parameters
document_id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

Request Body schema: application/json
properties
required
object (DocumentProperties)

Mostly arbitrary properties that can be attached to a document, up to 2.5KB in size. A key must be a valid DocumentPropertyId.

publication_date
string <date-time> (PublicationDate) [ 10 .. 40 ] characters
Deprecated

Deprecated. Document property dates can have any name.

additional property
null or boolean or number or string or Array of strings or string (DocumentProperty)
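The allowed property values (null, boolean, number, string, array of strings) and the size limit can be checked before sending. This is a sketch under stated assumptions: 2.5KB is read as 2560 bytes of serialized JSON, which this document does not confirm, and the helper names are hypothetical.

```python
import json
import math

MAX_PROPERTIES_BYTES = 2560  # assumption: 2.5KB read as 2.5 * 1024 bytes of JSON


def is_valid_property_value(value):
    """Accept null, boolean, number, string, or an array of strings."""
    if value is None or isinstance(value, (bool, str, int)):
        return True
    if isinstance(value, float):
        return math.isfinite(value)  # NaN and infinity are not valid JSON numbers
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


def properties_body(properties):
    """Build the request body after checking value types and total size."""
    bad = [key for key, value in properties.items() if not is_valid_property_value(value)]
    if bad:
        raise ValueError(f"unsupported values for properties: {bad}")
    size = len(json.dumps(properties).encode("utf-8"))
    if size > MAX_PROPERTIES_BYTES:
        raise ValueError(f"properties take {size} bytes, above the assumed limit")
    return {"properties": properties}
```

For the carousel example from the Properties section, properties_body({"title": "...", "image": "...", "link": "..."}) would wrap the three values in the "properties" key.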

Responses

Request samples

Content type
application/json
{
  "properties": { }
}

Response samples

Content type
application/json
{
  "request_id": "string",
  "kind": "string",
  "details": { }
}

Delete document properties

Delete all the properties of the document.

Authorizations:
ApiKeyAuth
path Parameters
document_id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

Responses

Response samples

Content type
application/json
{
  "request_id": "string",
  "kind": "string",
  "details": { }
}

Document property

Get document property

Get the property of the document.

Authorizations:
ApiKeyAuth
path Parameters
document_id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

property_id
required
string (IdNoDot) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@][a-zA-Z0-9\-:@_]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.
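Property ids follow the same rules as document ids except that dots are not allowed. A short sketch of that check, using the IdNoDot pattern above (the function name is hypothetical):

```python
import re

# IdNoDot pattern copied from the schema; fullmatch() replaces the ^...$ anchors.
PROPERTY_ID_PATTERN = re.compile(r"[a-zA-Z0-9\-:@][a-zA-Z0-9\-:@_]*")


def is_valid_property_id(value):
    """Check the pattern and the 1 to 256 byte length constraint."""
    return (
        isinstance(value, str)
        and 1 <= len(value.encode("utf-8")) <= 256
        and PROPERTY_ID_PATTERN.fullmatch(value) is not None
    )
```

Under this pattern "publication_date" is accepted, while "meta.title" (contains a dot) and "_title" (leading underscore) are rejected.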

Responses

Response samples

Content type
application/json
{
  "property": "Any valid json value"
}

Set document property

Set or replace the property of the document.

Authorizations:
ApiKeyAuth
path Parameters
document_id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

property_id
required
string (IdNoDot) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@][a-zA-Z0-9\-:@_]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

Request Body schema: application/json
property
required
null or boolean or number or string or Array of strings or string (DocumentProperty)

Responses

Request samples

Content type
application/json
{
  "property": { }
}

Response samples

Content type
application/json
{
  "request_id": "string",
  "kind": "string",
  "details": { }
}

Delete document property

Delete the property of the document.

Authorizations:
ApiKeyAuth
path Parameters
document_id
required
string (Id) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@.][a-zA-Z0-9\-:@._]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs, dots or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

property_id
required
string (IdNoDot) [ 1 .. 256 ] characters ^[a-zA-Z0-9\-:@][a-zA-Z0-9\-:@_]*$
Example: id1

An id can be any non-empty string that consists of Arabic digits, Latin letters, hyphens, colons, @ signs or underscores; an underscore cannot be the first character. The length constraints are in bytes, not characters.

Responses

Response samples

Content type
application/json
{
  "request_id": "string",
  "kind": "string",
  "details": { }
}

Document property indexing

Get indexed properties

Get the schema of all indexed properties.

Authorizations:
ApiKeyAuth

Responses

Response samples

Content type
application/json
{
  "properties": { }
}

Add indexed properties

Add additional indexed properties to the schema.

The schema can have at most 11 properties in total, including the automatically created publication_date property.

If you plan to create multiple indexed properties, it is strongly recommended to do so with one request.

For now it is not possible to modify or delete indexed properties through the API.

To use a property in query filters, it must first be added to the list of indexed properties using this endpoint. This needs to be done only once per property.

Newly ingested documents are checked for compatibility with the indexed property schema: if a document has a property that is in the schema, its value must be compatible (same type; for a date, a string in RFC 3339 date-time format).

Due to technical limitations, existing documents are not checked for compatibility with the indexed properties added by this request. Incompatible documents are instead treated, with respect to filtering and indexing, as if they did not have that property. Existing documents with matching properties are added to the index in a background job. Functionality to check the completion of that job is not yet implemented.
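Since existing documents are not checked by the server, it can help to find incompatible properties yourself before adding an index. The sketch below rests on assumptions: "keyword" is taken to mean a JSON string and "keyword[]" an array of strings, the RFC 3339 check is only an approximation, and null values are flagged because their treatment is not stated here. All function names are hypothetical.

```python
import re
from datetime import datetime

# Approximate RFC 3339 date-time shape: date, "T", time, optional fraction, offset.
_RFC3339 = re.compile(
    r"([0-9]{4}-[0-9]{2}-[0-9]{2})[Tt]([0-9]{2}:[0-9]{2}:[0-9]{2})"
    r"(\.[0-9]+)?([Zz]|[+-][0-9]{2}:[0-9]{2})"
)


def looks_like_rfc3339(value):
    """Check the shape with a regex, then let strptime reject impossible dates."""
    match = _RFC3339.fullmatch(value)
    if match is None:
        return False
    try:
        datetime.strptime(f"{match.group(1)} {match.group(2)}", "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return False
    return True


def is_compatible(value, indexed_type):
    """Return True if value fits the indexed type (mapping is an assumption)."""
    if indexed_type == "boolean":
        return isinstance(value, bool)
    if indexed_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if indexed_type == "keyword":
        return isinstance(value, str)
    if indexed_type == "keyword[]":
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    if indexed_type == "date":
        return isinstance(value, str) and looks_like_rfc3339(value)
    raise ValueError(f"unknown indexed type: {indexed_type!r}")


def incompatible_properties(properties, schema):
    """List the ids of properties that are present but do not fit the schema."""
    return [
        prop_id
        for prop_id, definition in schema.items()
        if prop_id in properties
        and not is_compatible(properties[prop_id], definition["type"])
    ]
```

Properties that are absent from a document are not reported, since only properties that appear in the schema need to be compatible.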

Authorizations:
ApiKeyAuth
Request Body schema: application/json
properties
required
object (IndexedPropertiesSchema)

A mapping of document property ids to indexed property definitions.

Be aware that each key of the object must be a valid DocumentPropertyId.

additional property
object (IndexedPropertyDefinition)
type
required
string (IndexedPropertyType)
Enum: "boolean" "number" "keyword" "keyword[]" "date"
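A sketch of building the request body while respecting the limits described for this endpoint. It assumes the current schema has been fetched beforehand (for example via "Get indexed properties") and is passed in as a mapping of property id to definition; treating an already indexed id as an error is also an assumption, based on the statement that indexed properties cannot be modified. The function name is hypothetical.

```python
ALLOWED_TYPES = {"boolean", "number", "keyword", "keyword[]", "date"}
MAX_INDEXED_PROPERTIES = 11  # includes the automatically created publication_date


def add_indexed_properties_body(existing, new):
    """Build the body for adding indexed properties.

    existing: current schema, mapping property id -> {"type": ...}
    new:      properties to add, mapping property id -> type name
    """
    unknown = sorted(t for t in set(new.values()) if t not in ALLOWED_TYPES)
    if unknown:
        raise ValueError(f"unknown indexed types: {unknown}")
    clashes = sorted(set(existing) & set(new))
    if clashes:
        # Assumption: indexed properties cannot be modified, so do not resend them.
        raise ValueError(f"already indexed: {clashes}")
    if len(existing) + len(new) > MAX_INDEXED_PROPERTIES:
        raise ValueError("the schema can have at most 11 indexed properties")
    return {
        "properties": {prop_id: {"type": type_name} for prop_id, type_name in new.items()}
    }
```

Passing all planned properties in one call matches the recommendation to create multiple indexed properties with a single request.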

Responses

Request samples

Content type
application/json
{
  "properties": { }
}

Response samples

Content type
application/json
{
  "properties": { }
}