Download analysis response

For large downloads, we recommend using the async download endpoint.

You can download the analysis result in either JSONL or CSV format.

Notes

  • Instances must be in the completed state. You can check the status using the /analyze/status route.
  • Some fields (id, title, sentences, url, summary and thread) are only available if the total number of documents (after applying the limit) is below 2000.
  • When reusing the same payload, the API returns the cached result of the previous call.
  • When requesting fewer than 2000 documents, the id field is always returned.
  • The response will be provided as a file in either JSONL or CSV format.
content-disposition: attachment; filename="INSTANCE_ID-SORT_FIELD-SORT_ORDER.jsonl"
content-type: application/jsonl
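
If you save the downloaded file on the client side, the file name can be read from the content-disposition header shown above. The following is a minimal sketch using only the Python standard library; the helper name `filename_from_content_disposition` is hypothetical and not part of the API.

```python
from email.message import Message
from typing import Optional


def filename_from_content_disposition(header_value: str) -> Optional[str]:
    """Return the attachment file name from a Content-Disposition header value."""
    # email.message.Message knows how to parse header parameters such as filename="..."
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_filename()


header = 'attachment; filename="INSTANCE_ID-SORT_FIELD-SORT_ORDER.jsonl"'
print(filename_from_content_disposition(header))
```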

Understanding the limit parameter

Limit as a number

  • If you pass a number directly to the limit parameter (e.g., "limit": 100), you will receive at most that many documents in total, sorted by the rules specified in the sort field (for example, descending order on a specific score).

Limit by object (e.g., entity)

  • You can also pass a dictionary with this structure:
"limit": {
  "by": "entity",
  "value": 3
}
  • In this scenario, the system returns the top 3 documents per entity, based on the specified sort.
  • Example: If your analysis contains 10 distinct entities, and you request 3 documents per entity, you could receive up to 30 documents in total. We refer to this final total (30 in this example) as the computed limit.
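
The arithmetic behind the computed limit can be sketched as follows. This is an illustration only: the function name `computed_limit` and the `distinct_groups` argument (the number of distinct entities in your analysis, which you need to know or estimate yourself) are assumptions, not part of the API.

```python
from typing import Union


def computed_limit(limit: Union[int, dict], distinct_groups: int) -> int:
    """Upper bound on the number of documents a download can return.

    limit: either a plain number or a {"by": ..., "value": ...} dictionary.
    distinct_groups: number of distinct values of the "by" key (e.g. entities).
    """
    if isinstance(limit, int):
        # A plain number caps the total directly.
        return limit
    # A per-group limit is multiplied by the number of groups.
    return limit["value"] * distinct_groups


print(computed_limit(100, 10))                            # 100
print(computed_limit({"by": "entity", "value": 3}, 10))   # 30
```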

Access to id, title, sentences, url and thread fields

  • These fields are only available if the computed limit is below 2000.
  • Example: If you request 3 documents per entity ("limit": {"by": "entity", "value": 3}) but have 1000 entities, the computed limit is 3000, which exceeds 2000. Therefore, these fields (id, title, sentences, url and thread) will not be included.
  • If you need these fields for more than 2000 documents, you can make multiple calls (e.g., split queries, apply filters by date or entity) or adjust your request so that each call stays under the 2000-document threshold.

The id field is returned by default only when the computed limit is 2000 or less. If you do not explicitly request the id field, it is not returned for requests with a computed limit greater than 2000.
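
Before sending a request, you may want to check whether the fields you ask for would be dropped. The sketch below is a client-side check, not an API feature. It assumes the stricter reading of the threshold (the computed limit must be strictly below 2000), and the names `RESTRICTED_FIELDS`, `unavailable_fields` and `max_value_per_group` are hypothetical.

```python
# Fields the guide lists as restricted to small downloads.
RESTRICTED_FIELDS = {"id", "title", "sentences", "url", "summary", "thread"}
THRESHOLD = 2000  # assumption: the computed limit must be strictly below this value


def unavailable_fields(requested: list, computed_limit: int) -> list:
    """Return the requested fields that would not be included in the response."""
    if computed_limit < THRESHOLD:
        return []
    return [field for field in requested if field in RESTRICTED_FIELDS]


def max_value_per_group(distinct_groups: int) -> int:
    """Largest per-group "value" that keeps the computed limit below the threshold."""
    return (THRESHOLD - 1) // distinct_groups


print(unavailable_fields(["title", "entities"], 3000))  # ['title']
print(max_value_per_group(1000))                        # 1
```

With 1000 entities, a per-entity value of 1 keeps the computed limit at 1000, so the restricted fields remain available.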

Request Examples

Getting the top 2 positive documents per entity

POST /analyse/download?json_lines=true
{
  "instance": "INSTANCE_ID",
  "limit": {
    "by": "entity",
    "value": 2
  },
  "sort": {
    "field": "document_positive",
    "order": "DESC"
  },
  "fields": ["title", "entities", "mentions"]
}

Response

{"id":"73d80260aa6a12004bc9d98b705f7801140dc2174ce62eb245e8399fde0e9c61","title":"Rankie HDMI Cable, High-Speed HDTV Cable, Supports Ethernetm, 3D, 4K and Audio Return, 1.8 m, Black","entities":{"Q312":[{"apple tv":1}]},"mentions":{"Q312":{"apple tv":1}}}
{"id":"6ed6bf33bd3e7b0b8b7906e9cd85ba422f7c0465652dde873e31306b9448b616","title":"Syncwire HDMI Cable 2M HDMI Lead - Ultra High Speed 18Gbps HDMI 2.0 Cable 4K@60Hz Support Fire TV, Apple TV, Ethernet, Audio Return, Video UHD 2160p, HD 1080p, 3D, Xbox PlayStation PS3 PS4 PC -Black","entities":{"Q312":[{"apple tv":1}]},"mentions":{"Q312":{"apple tv":1}}}
{"id":"dd5f25b58ea1058564bc71af8002961c12fe23f650a9eb59ac6283de6d2264e1","title":"Using the hot exhuast flow from a boeing 777 to stay warm in this -28 degree Toronto weather ❄️","entities":{},"mentions":{"Q66":{"boeing":3}}}
{"id":"dfdca0216ec6a2840c7a4efb950a1912e3f45b12dddcfcdad238b5297bbb7591","title":"Space Launch Services Market 2019 Global Key Players, Size, Trends, Opportunities, Growth- Analysis to 2025","entities":{"Q66":[{"boeing":2}]},"mentions":{"Q66":{"boeing":2}}}
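
The request above could be assembled in Python roughly as follows. This sketch only builds the request object and does not send it. `BASE_URL` is a placeholder, the `Content-Type: application/json` header is an assumption, and authentication is omitted because it is not covered in this section.

```python
import json
from typing import Optional, Union
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical; replace with your API host


def build_download_request(
    instance_id: str,
    limit: Union[int, dict],
    fields: list,
    sort_field: Optional[str] = None,
    sort_order: str = "DESC",
) -> Request:
    """Build (but do not send) a POST request for the download route."""
    payload = {"instance": instance_id, "limit": limit, "fields": fields}
    if sort_field is not None:
        payload["sort"] = {"field": sort_field, "order": sort_order}
    return Request(
        url=f"{BASE_URL}/analyse/download?json_lines=true",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


req = build_download_request(
    "INSTANCE_ID",
    {"by": "entity", "value": 2},
    ["title", "entities", "mentions"],
    sort_field="document_positive",
)
print(req.get_method(), req.full_url)
```

Once authentication headers are added, the request can be sent with `urllib.request.urlopen(req)`.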

Retrieving the title of the 3 most positive documents in JSONL format

POST /analyse/download?json_lines=true
{
  "instance": "INSTANCE_ID",
  "limit": 3,
  "sort": {
    "field": "document_positive",
    "order": "DESC"
  },
  "fields": ["title"]
}

Response

{"id":"dfa4469c5d18ef68e7e25fcfe964d5295914f48fff7a53e0d77ab1ecbdc0c87f","title":"Pinned to Healthy Smoothie Recipes on Pinterest"}
{"id":"74fd5cb9908806bd2da3755526faaf1d3305764047999d758c1984d91c75bca1","title":"Tys Actual Birthday ~ May 11th Child, Guy and Gift"}
{"id":"23808b442fd6196e7d98b91397146dd725066da677a73147cc0d71f2bcc135c2","title":"One-Michelin-Star Chef Massimiliano Celeste Delighted Diners @ JW Marriott Phuket’s Cucina Italian Kitchen"}
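
A JSONL response such as the one above contains one JSON object per line, so it can be parsed line by line. A minimal sketch, where `parse_jsonl` is a hypothetical helper name and the sample body reuses two lines from the response above:

```python
import json


def parse_jsonl(body: str) -> list:
    """Parse a JSONL payload into a list of dictionaries, skipping blank lines."""
    # Split on "\n" only: other Unicode line separators may legally appear inside JSON strings.
    return [json.loads(line) for line in body.split("\n") if line.strip()]


sample = (
    '{"id":"dfa4469c5d18ef68e7e25fcfe964d5295914f48fff7a53e0d77ab1ecbdc0c87f",'
    '"title":"Pinned to Healthy Smoothie Recipes on Pinterest"}\n'
    '{"id":"74fd5cb9908806bd2da3755526faaf1d3305764047999d758c1984d91c75bca1",'
    '"title":"Tys Actual Birthday ~ May 11th Child, Guy and Gift"}\n'
)

for document in parse_jsonl(sample):
    print(document["id"][:8], document["title"])
```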

Getting the text of a document

To prepare for any potential analysis, the document text is split into sentences, so the line breaks of the original text are not shown in the download response.

The text of premium_news documents cannot be retrieved. Each sentence is replaced by this placeholder: "The download of licensed text is not allowed."

For instance:

POST /analyse/download?json_lines=true
{
  "instance": "INSTANCE_ID",
  "limit": 3,
  "sort": {
    "field": "document_positive",
    "order": "DESC"
  },
  "fields": ["sentences"]
}

Response

{
  "sentences": [
    {
      "sentence_id": 0,
      "type": 0,
      "results": ...,
      "text": "sentence1"
    },
    {
      ...
      "text": "sentence2"
    }
  ]
}
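
To work with the sentences as a single string, you can join their text fields on the client side. The sketch below assumes each sentence object has a "text" key, as in the response above, and that the licensed-text placeholder is matched exactly. The names `document_text` and `is_licensed` are hypothetical. Joining with a space does not restore the original line breaks.

```python
LICENSED_PLACEHOLDER = "The download of licensed text is not allowed."


def document_text(document: dict) -> str:
    """Join the sentence texts of a downloaded document with single spaces."""
    return " ".join(sentence["text"] for sentence in document.get("sentences", []))


def is_licensed(document: dict) -> bool:
    """True if any sentence was replaced by the licensed-text placeholder."""
    return any(
        sentence["text"] == LICENSED_PLACEHOLDER
        for sentence in document.get("sentences", [])
    )


doc = {
    "sentences": [
        {"sentence_id": 0, "type": 0, "text": "sentence1"},
        {"sentence_id": 1, "type": 0, "text": "sentence2"},
    ]
}
print(document_text(doc))  # sentence1 sentence2
print(is_licensed(doc))    # False
```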

To get the full text with line breaks, use the POST /documents route (see its documentation).

Generating a summary of each document

Generating summaries takes time: around 3 minutes for 500 documents.

POST /analyse/download?json_lines=true
{
  "instance": "INSTANCE_ID",
  "limit": 1,
  "fields": ["summary"]
}

Response

The summary size can vary, but our system will try to generate a summary of about 100 words for each document.

{
  "id": "59ccf66b67a2abdc86aa59d21e0ff3b5a10666680f065c48bcdd55225fdefaae",
  "summary": "Apple has issued a severe warning to iPhone users regarding a significant privacy issue, potentially affecting millions of users. The tech giant was alerted to this bug a week prior but took minimal action. This incident has exposed the privacy of many iPhone users and is considered one of the most serious problems in Apple's history."
}
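
When choosing a client timeout for a summary request, the figure of around 3 minutes for 500 documents gives a rough starting point. The sketch below assumes the duration scales linearly with the number of documents, which this guide does not state, so treat the result as an estimate only. The function name is hypothetical.

```python
REFERENCE_SECONDS = 3 * 60   # around 3 minutes, as stated in this guide
REFERENCE_DOCUMENTS = 500    # for 500 documents


def estimate_summary_seconds(document_count: int) -> float:
    """Rough estimate of summary generation time, assuming linear scaling."""
    return document_count * REFERENCE_SECONDS / REFERENCE_DOCUMENTS


print(estimate_summary_seconds(500))   # 180.0
print(estimate_summary_seconds(1000))  # 360.0
```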