Documents

Translate documents

Translate the text and/or title of documents into the desired language.

POST

https://api.textreveal.com/api/2.0/documents/translate/batch

This route allows users to translate all documents except the text of premium ones. For premium news, users can still translate the title.

See the Language Support page for more information.

Request Body

documents* object[]
List of documents to translate
fields (string (enum))[]
Fields to translate
Default: ["title", "text"]
Values: "title", "text"
language string
The language in which you want the documents to be translated. See Language Support page for more information.
Default: "english"
Example: "italian"

Request

{
  "documents": [
    {
      "extracted": "2022-12-30T22:59:57.502Z",
      "id": "c34ac671a1b0b80078f9acd7e80217e28e8c554e14e1de707fb4370e52299add"
    },
    {
      "extracted": "2022-12-28T13:15:06.644Z",
      "id": "0d285bcb024438d022c91d75556c4159786e72b7f3b4b3a22562ca9d1dbabb4a"
    }
  ],
  "fields": [
    "title",
    "text"
  ],
  "language": "french"
}
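
As a minimal sketch, the request above could be assembled in Python with only the standard library. The helper name `build_translate_request` and the placeholder token are illustrative assumptions, not part of the API; the Bearer authorization header follows the example.py listing on this page. The request is only built here, not sent.

```python
import json
import urllib.request

TRANSLATE_URL = "https://api.textreveal.com/api/2.0/documents/translate/batch"


def build_translate_request(token, documents, fields=("title", "text"), language="english"):
    """Build (but do not send) a POST request for the translate route.

    `documents` is a list of {"id": ..., "extracted": ...} dictionaries.
    The function name and signature are hypothetical helpers for this sketch.
    """
    body = {
        "documents": documents,
        "fields": list(fields),
        "language": language,
    }
    return urllib.request.Request(
        TRANSLATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_translate_request(
    token="...",  # Placeholder: replace with a real token
    documents=[
        {
            "id": "c34ac671a1b0b80078f9acd7e80217e28e8c554e14e1de707fb4370e52299add",
            "extracted": "2022-12-30T22:59:57.502Z",
        }
    ],
    fields=["title"],
    language="french",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it is then a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON body of the response.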

Response - 200

The list of requested documents with their translated fields

error object | object
extracted* date-time
The document's extraction date.
Example: "2022-11-25T14:31:10.834Z"
id* string
The document identifier.
Example: "c34ac671a1b0b80078f9acd7e80217e28e8c554e14e1de707fb4370e52299add"
partial boolean
Whether the document has been partially translated or not.
Example: true
text string
The document's translated text. Undefined if 'text' is not in the requested fields.
Example: "A translated text"
title string
The document's translated title. Undefined if 'title' is not in the requested fields.
Example: "A translated title"

Response

[
  {
    "extracted": "2022-12-30T22:59:57.502Z",
    "id": "c34ac671a1b0b80078f9acd7e80217e28e8c554e14e1de707fb4370e52299add",
    "text": "the document's text translated",
    "title": "the document's title translated"
  },
  {
    "extracted": "2022-12-28T13:15:06.644Z",
    "id": "0d285bcb024438d022c91d75556c4159786e72b7f3b4b3a22562ca9d1dbabb4a",
    "partial": true,
    "text": "the document's text translated",
    "title": "the document's title translated"
  }
]

Examples

With the `POST /analyze/download` endpoint

In this example, we'll use the POST /analyze/download endpoint to retrieve the top 500 documents of our analysis. Then we'll use the GET /documents endpoint to retrieve a summary for each of them.

example.py

import json
import requests
 
# Documentation on how to get a token: https://docs.textreveal.com/guide/authentication
token = "..."
host = "https://api.textreveal.com"
 
instance_id = 'INSTANCE_ID' # Replace with your instance id
headers = {
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {token}'
}
 
# Download the top 500 documents
payload = json.dumps({
    'instance': instance_id,
    'limit': 500, # The documents route is limited to 500 documents
    'sort': {
        'field': 'document_positive',
        'order': 'DESC'
    },
    'fields': ['title', 'extract_date'] # id is automatically added
})
 
endpoint = f'{host}/api/2.0/analyze/download'
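response = requests.post(endpoint, headers=headers, data=payload)
response.raise_for_status()  # Fail early on HTTP errors
# The response is parsed line by line: one JSON document per line
lines = [json.loads(line) for line in response.text.splitlines() if line.strip()]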
 
# Now we can use the id and extract_date to generate a summary
payload = json.dumps(
    {
        "fields": ["summary"],
        "documents": list(
            map(lambda x: {"id": x["id"], "extracted": x["extract_date"]}, lines)
        ),
    }
)
 
endpoint = f"{host}/api/2.0/documents"
response = requests.post(endpoint, headers=headers, data=payload)
response.raise_for_status()  # Fail early on HTTP errors
documents_with_summaries = response.json()
 
# Now join the documents with their summaries
documents = {}
for line in lines:
    documents[line["id"]] = line
for document in documents_with_summaries:
    documents[document["id"]]["summary"] = document.get("summary")
 
# We now have a dictionary with the documents and their summaries
print(json.dumps(documents, indent=2))
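
The same pattern can feed the translation route instead of the summary request. The sketch below, under the assumption that each downloaded line carries `id` and `extract_date` as in the example above, maps the lines to the `documents` list expected by the request body and merges translated fields back by `id`. The helper names, the `translated_` key prefix, and the sample data are hypothetical.

```python
def to_translate_payload(lines, fields=("title",), language="english"):
    """Map downloaded lines to the translate request body."""
    return {
        "documents": [
            {"id": line["id"], "extracted": line["extract_date"]} for line in lines
        ],
        "fields": list(fields),
        "language": language,
    }


def merge_translations(lines, translated):
    """Attach translated fields to the downloaded lines, keyed by document id."""
    documents = {line["id"]: dict(line) for line in lines}
    for item in translated:
        target = documents.get(item["id"])
        if target is None:
            continue  # Ignore documents we did not ask for
        for field in ("title", "text"):
            if field in item:
                target[f"translated_{field}"] = item[field]
    return documents


# Made-up sample data standing in for real API responses
lines = [{"id": "doc-1", "extract_date": "2022-12-30T22:59:57.502Z", "title": "Un titre"}]
payload = to_translate_payload(lines)
translated = [{"id": "doc-1", "extracted": "2022-12-30T22:59:57.502Z", "title": "A title"}]
merged = merge_translations(lines, translated)
print(merged["doc-1"]["translated_title"])
```

Storing the translation under a separate key keeps the original title available next to the translated one.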

Error handling

We raise an error in the following cases:

The user’s company is out of quota:
- Each company can translate up to 10,000 documents. This quota is defined in the company license.
- Each document counts as 1 translation.
- In this case, we return a 403 error with all the relevant information in the response headers.
A document is not found or its language is not supported:
- This document is still counted in the quota.
- Other documents are not impacted.
An error occurred when translating a field:
- Other fields are not impacted.
- An error containing helpful information is added to the document in the response.

example response with errors

[
  {
    "id": "97563b96-eeb7-492f-b12c-fbfa03927d6d",
    "extracted": "2023-01-13T14:51:57.740Z",
    "error": {
      "message": "Document 97563b96-eeb7-492f-b12c-fbfa03927d6d not found on date 2023-01-13T14:51:57.740Z",
      "statusCode": 404
    }
  },
  {
    "id": "6ab99392-b8cb-4533-9c28-01718a2360fe",
    "extracted": "2023-06-27T13:49:14.740Z",
    "text": "the document's text translated",
    "error": {
      "field": {
        "title": {
          "message": "Failed to translate title",
          "statusCode": 500
        }
      }
    }
  }
]
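
On the client side, a response like this one can be sorted into fully translated documents, documents with field-level errors, and documents that failed entirely. The following is a minimal sketch that assumes the `error` object only takes the two shapes shown in the example; the function name `split_by_error` is hypothetical.

```python
def split_by_error(documents):
    """Sort response items into three groups based on their `error` key."""
    ok, field_errors, failed = [], [], []
    for doc in documents:
        error = doc.get("error")
        if error is None:
            ok.append(doc)
        elif "field" in error:
            # Some fields failed; the other fields may still be usable
            field_errors.append((doc, error["field"]))
        else:
            # Document-level error, e.g. document not found (404)
            failed.append((doc["id"], error.get("statusCode"), error.get("message")))
    return ok, field_errors, failed


# The example response with errors shown above
response = [
    {
        "id": "97563b96-eeb7-492f-b12c-fbfa03927d6d",
        "extracted": "2023-01-13T14:51:57.740Z",
        "error": {
            "message": "Document 97563b96-eeb7-492f-b12c-fbfa03927d6d not found on date 2023-01-13T14:51:57.740Z",
            "statusCode": 404,
        },
    },
    {
        "id": "6ab99392-b8cb-4533-9c28-01718a2360fe",
        "extracted": "2023-06-27T13:49:14.740Z",
        "text": "the document's text translated",
        "error": {
            "field": {"title": {"message": "Failed to translate title", "statusCode": 500}}
        },
    },
]

ok, field_errors, failed = split_by_error(response)
for doc_id, status, message in failed:
    print(f"{doc_id}: {status} {message}")
for doc, fields in field_errors:
    print(f"{doc['id']}: failed fields {sorted(fields)}")
```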

Partial translation

When a document's text or title reaches a length of 10,000 bytes, only the first 10,000 bytes are translated.

The effective limit may be lower, depending on the content of the document, so that no sentence is cut in the middle.
If a single sentence reaches the 10,000-byte limit, it is ignored.
If no sentence is shorter than 10,000 bytes, an error is raised.

When a partial translation is done, the returned document contains "partial": true.
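
A client can check for this flag after the call, and can estimate beforehand whether a field is long enough to be truncated. The sketch below is only illustrative: the helper names are hypothetical, and measuring the length in UTF-8 is an assumption, since the encoding used for the byte count is not stated here.

```python
BYTE_LIMIT = 10_000  # Limit stated above


def may_be_truncated(value, limit=BYTE_LIMIT):
    """Return True if `value` reaches the byte limit (UTF-8 is an assumption)."""
    return len(value.encode("utf-8")) >= limit


def partial_document_ids(translated):
    """Return the ids of documents flagged with "partial": true."""
    return [doc["id"] for doc in translated if doc.get("partial") is True]


# Made-up sample data standing in for a real response
translated = [
    {"id": "a", "text": "short text"},
    {"id": "b", "partial": True, "text": "first part of a long text"},
]
print(partial_document_ids(translated))
print(may_be_truncated("é" * 5_000))  # 5,000 characters, 10,000 bytes in UTF-8
```

Because the limit is expressed in bytes, texts with many non-ASCII characters reach it with fewer characters than plain ASCII texts.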