Guide
Analyze

Data Dictionary

This page describes the input and output data available on the TextReveal® analyze resource. It presents the data columns, their descriptions, and their coverage.

POST /analyze/dataset

This route allows a client to launch a query to analyze data relevant to a list of entities.

Parameters

Body parameters

nametypedescriptionscoperequired
entitieslist[dict]List of entities to be requestedglobalYes
entity_of_intereststringUnique id for the entity of interestentityYes
keywordslistList of keywords to search.
All keywords with a length strictly lower than 3 characters are filtered out except for Japanese, Chinese and Korean languages.
entityYes
conceptsdict[str, list[str]]List of concepts or risks that are to be analyzed. Each individual concept is defined by its own list of keywords.
Punctuation is not handled in the concept labels. Each concept label must be unique (case insensitive)
globalNo
concepts_filterdict[str, list[str]]Same as concepts but filters out documents that do not contain the concepts.
Note: You can use either concepts or concepts_filter, not both
globalNo
sentiments_filterdict[str, dict[str, int]]Partial object containing min/max values for each sentiment. The final analysis will contain only documents that match these filters.
Allowed keys:
  • positive, negative, neutral, polarity
Note: This can be compared to a filter on:
  • document_{sentiment}.mean key for positive, negative and neutral
  • document_{sentiment} key for polarity
globalNo
sites_excludeslistList of websites to exclude from search.
N.B: Use the base domain of the websites.
globalNo
min_matchintThe message must contain at least min_match keywords.
When used, each entity must have at least min_match keywords.
Example:
  • keywords: ["apple", "iphone", "macbook"]
  • min_match: 2
  • Behavior = TextReveal® API will keep a document only if at least 2 elements from the keywords list appear in it.
globalNo
min_repeatintThe message must contain at least min_repeat occurrence of a keyword.
Example:
  • keywords: ["apple", "iphone"]
  • min_repeat: 2
  • Behavior = TextReveal® API will keep a document only if it contains at least 2 occurrences of either apple or iphone.
globalNo
start_datedateformat: (YYYY-MM-DD)globalYes
end_datedateformat: (YYYY-MM-DD)globalYes
site_typelistType of sites to search (field thread.site_type)
Available options are :
  • news
  • blogs
  • discussions
  • licensed_news
  • premium_news
Default value is : blogs, news and discussions
globalNo
languageslist[str]List of languages to search, see Language Support page for more information.
Note:
  • We do not recommend using multiple values.
Default value is : ['english']
globalNo
countrieslist[str]List of countries to search (field thread.country).
N.B: Use alpha-2 format.
globalNo
siteslistList of websites to search.
N.B: Use the base domain of the websites.
globalNo
co_mentionslistList of keywords to search with the keywords list. Works like a boolean AND.
Example:
  • keywords : ["TotalEnergy"]
  • co_mentions: ["gas", "oil price"]
  • Behavior = TextReveal® API will look for documents relevant to at least one of the co_mentions. For the above example, below are the different cases of relevancy:
    • TotalEnergy and gas
    • TotalEnergy and oil price
    • TotalEnergy and oil price and gas
N.B: Search of co_mentions is operated in full-text and is case insensitive.
globalNo
keywords_excludelistList of keywords to exclude from the search. Works like a boolean AND NOT.
Example:
  • keywords: ["apple", "iphone"]
  • keywords_exclude: ["Steve Jobs", "Tim Cook"]
  • Behavior: TextReveal® API will look for documents relevant to Apple the company or the iPhone but NOT containing either Steve Jobs or Tim Cook.
N.B: Search of keywords_exclude is operated in full-text and is case insensitive.
globalNo
qscorefloatQuality threshold to filter out unreadable data. The default value is 50. No filtering is applied if the quality-score worker is not provided.globalNo
neg_keywordslistList of keywords not used for search but for named entity resolution or annotation tasks.
Detailed explanation:
  • Using generic or high-cardinality keywords can bring a huge volume of data to process or reduce the quality of the extracted data.
  • The neg_keywords parameter lets you add such keywords so that they are used to annotate sentences containing them within documents already matched by less generic keywords.
  • Example:
    • keywords: ['Microsoft'] (Used for search in the datalake)
    • neg_keywords: ['MSFT'] (Not used for search in the datalake)
    • Behavior = TextReveal® API will look for documents containing only Microsoft, then annotate every sentence mentioning MSFT or Microsoft
entityNo
workerslistWorkflow steps definitionglobalYes
contextstringContext description of the entityentityYes
similarity_thresholdfloatSimilarity score threshold for recognized or matched entities. Filters out documents containing entities with a similarity score lower than the threshold.globalNo
search_inlist[string]Defines whether document extraction searches for entity keywords in the title, in the text, or both.
Example:
  • search_in: ["title", "text"]

Note:
  • This parameter is only applied on the keywords of entity, not on keywords_exclude, co_mentions, neg_keywords, min_repeat, min_match.
  • Not available with ner-linking worker.
globalNo
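As a concrete illustration, the parameters above can be assembled into a request body like the following sketch. Every concrete value (entity id, keywords, concept label, worker name) is a hypothetical placeholder, not a value prescribed by the API:

```python
# Sketch of a body for POST /analyze/dataset. Every concrete value below
# (entity id, keywords, concept label, worker name) is a placeholder.
payload = {
    "entities": [
        {
            "entity_of_interest": "apple-inc",           # unique id (entity scope)
            "keywords": ["apple", "iphone", "macbook"],  # keywords shorter than 3 chars are dropped
            "context": "Apple Inc., the consumer electronics company",
        }
    ],
    "concepts": {
        # each concept label must be unique (case insensitive)
        "supply chain": ["factory", "shipment", "logistics"],
    },
    "min_match": 2,              # keep documents containing at least 2 keywords
    "start_date": "2024-01-01",  # YYYY-MM-DD
    "end_date": "2024-01-31",
    "languages": ["english"],    # multiple values are not recommended
    "workers": ["sentiment"],    # workflow steps definition (placeholder worker name)
}
```

Sending such a payload with any HTTP client returns an instance_id, which is then used with the status and download routes.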

Response

nametypedescription
instance_idstringThe unique identifier of the analysis

POST /analyze/tql

This route allows a client to launch an analysis for data relevant to a list of entities using the TextReveal Query Language (TQL).

The TextReveal Query Language (TQL) is a simple text-based query language for filtering data. It is composed of a field to which a value is applied: site_type: "news". Filters can be combined into a boolean expression with the AND, OR and NOT operators.
Example: (text:"Apple TV" OR title:"Steve Jobs") AND NOT text:"apple tree"

Unlike the dataset route, the TQL route requests all types of sites. The news site type groups news, premium_news and licensed_news. Moreover, the workers are implicitly enabled if the associated parameter is used. For example, the quality score worker will be enabled if the qscore parameter is used.
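Because a TQL query is plain text, it can be assembled with ordinary string formatting. The helper below is a minimal sketch (tql_term is a hypothetical helper, not part of the API) that reproduces the example query above:

```python
# Minimal sketch: building a TQL boolean expression by string formatting.
# tql_term is a hypothetical helper, not an API function.
def tql_term(field: str, value: str) -> str:
    """Render a single field:"value" filter."""
    return f'{field}:"{value}"'

query = (
    f'({tql_term("text", "Apple TV")} OR {tql_term("title", "Steve Jobs")})'
    f' AND NOT {tql_term("text", "apple tree")}'
)
print(query)  # -> (text:"Apple TV" OR title:"Steve Jobs") AND NOT text:"apple tree"
```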

Parameters

Body parameters

nametypedescriptionscoperequired
entitieslist[dict]List of entities to be requestedglobalYes
entity_of_intereststringUnique id for the entity of interestentityYes
contextstringContext description of the entity. The context is mandatory if you use the similarity_threshold parameter.entityNo
querystringA TQL query to define the entity of interest. The TQL query will be used for the data extraction.
Accepted fields are :
  • country
  • ner
  • site
  • site_type
  • text
  • title

Specific values for site_type:
  • news
  • blogs
  • discussions

Example: ((title:"1&1" AND text:"1&1 DRILLISCH") OR (title:"DRILLISCH" AND text:"1&1 DRILLISCH") AND (ner:"1&1 DRILLISCH"))
entityYes
annotate_keywordslistList of keywords not used for search but for named entity resolution or annotation tasks.entityYes
conceptsdict[str, list[str]]List of concepts or risks that are to be analyzed. Each individual concept is defined by its own list of keywords.
Punctuation is not handled in the concept labels. Each concept label must be unique (case insensitive)
globalNo
concepts_filterdict[str, list[str]]Same as concepts but filters out documents that do not contain the concepts.
Note: You can use either concepts or concepts_filter, not both
globalNo
sentiments_filterdict[str, dict[str, int]]Partial object containing min/max values for each sentiment. The final analysis will contain only documents that match these filters.
Allowed keys:
  • positive, negative, neutral, polarity
Note: This can be compared to a filter on:
  • document_{sentiment}.mean key for positive, negative and neutral
  • document_{sentiment} key for polarity
globalNo
min_matchintThe message must contain at least min_match annotate keywords.
When used, each entity must have at least min_match keywords.
Example:
  • annotate_keywords: ["apple", "iphone", "macbook"]
  • min_match: 2
  • Behavior = TextReveal® API will keep a document only if at least 2 elements from the keywords list appear in it.
globalNo
min_repeatintThe message must contain at least min_repeat occurrence of an annotate keyword.
Example:
  • annotate_keywords: ["apple", "iphone"]
  • min_repeat: 2
  • Behavior = TextReveal® API will keep a document only if it contains at least 2 occurrences of either apple or iphone.
globalNo
start_datedateformat: (YYYY-MM-DD)globalYes
end_datedateformat: (YYYY-MM-DD)globalYes
languagestringLanguage to search, see Language Support page for more information.
Default value is : english
globalNo
qscorefloatQuality threshold to filter out unreadable data. No filtering is applied if the qscore parameter is not provided.globalNo
similarity_thresholdfloatSimilarity score threshold for recognized or matched entities. Filters out documents containing entities with a similarity score lower than the threshold.globalNo

Response

nametypedescription
instance_idstringThe unique identifier of the analysis

POST /analyze/download

This route allows a client to preview the results of a previously run analysis.

Parameters

Body parameters

nametypedescriptionrequired
instancestringId of the instance you want to retrieve the textual dataYes
limitnumber|dictWhen limit is a number (e.g., 100):
  • Specifies the total maximum number of documents to retrieve, sorted as defined in the sort parameter.
When limit is a dictionary (e.g., {"by": "entity", "value": 3}):
  • Indicates a per-object limit. For each distinct object (in the example, each "entity"), the system will return up to {value} documents. The actual total can therefore exceed {value}, depending on how many entities (or other objects) are present.
  • Example: If there are 10 distinct entities and limit:{"by": "entity", "value": 3}, you may receive up to 30 total documents (3 per entity).
Important: Certain fields (like id, title, sentences, url or thread) are only returned if the total number of documents (the “computed limit”) is ≤ 2000.
Yes
datestringFilter the documents on a given date. Use %Y-%m-%d format. The date must be included in the date range of the analysis.No
entitystringFilter the documents on a given entity. The entity must be an entity of interest of the analysis.No*
conceptstringFilter the documents on a given concept. The concept must be present in the analysis.No
sortdictSort the documents in ascending or descending order given a fieldno
sort>fieldstringThe field to sort the documents. Available fields are:
  • document_negative
  • document_neutral
  • document_positive
  • document_polarity
  • document_entity_polarity
  • document_entity_positive
  • document_entity_neutral
  • document_entity_negative
yes
sort>orderstringThe order of the sorting. Available values are
  • ASC
  • DESC
yes
fieldslist[str]Collect only the fields you need. By default, all fields except summary are returned.
id field is always returned.

Available keys for the fields parameter are:
  • concepts
  • document_entity_negative
  • document_entity_neutral
  • document_entity_polarity
  • document_entity_positive
  • document_negative
  • document_neutral
  • document_polarity
  • document_positive
  • entities
  • extract_date
  • id
  • language
  • mentions
  • qscore
  • sentences
  • thread
  • title
  • url
  • summary
No
  • If you use the sort parameter, the date parameter can become mandatory if your analysis has generated a large volume of results (2,500,000 documents).
  • When using the sort parameter with a field that has aggregation functions (e.g., min, max, median, mean), the mean value is used.
  • When using one of the entity match field (document_entity_polarity, document_entity_positive, document_entity_neutral, document_entity_negative) in the sort parameter, the entity parameter is mandatory.
  • premium_news text cannot be retrieved. Each sentence is replaced by the placeholder "The download of licensed text is not allowed."
  • The summary field is an experimental feature; we recommend using the /documents route with the document id, as seen in the example page here
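Putting the notes above together, a preview request with a per-entity limit and a sort might look like the following sketch; the instance id and field selection are illustrative placeholders:

```python
# Sketch of a body for POST /analyze/download. The instance id and field
# selection are illustrative only.
body = {
    "instance": "00000000-0000-0000-0000-000000000000",  # placeholder instance id
    "limit": {"by": "entity", "value": 3},               # up to 3 documents per entity
    "sort": {"field": "document_polarity", "order": "DESC"},
    "fields": ["title", "url", "document_polarity"],     # id is always returned
}
```

Note that sorting on one of the document_entity_* fields would additionally make the entity parameter mandatory.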

Response

nametypeparentdescription
extract_datedatetimeCorresponds to the date of extraction of the article. (YYYY-MM-DD)
languagestringLanguage of the article
threaddictParent key for country, site, site_type and title
countrystringthread2 letter ISO country code
sitestringthreadSite of the article
site_typestringthreadSite type of the article
Available options are :
  • news
  • blogs
  • discussions
  • licensed_news
  • premium_news
titlestringthreadTitle of the thread mapped from sentences of type 2 (If no sentences, the title will be an empty string)
urlstringUrl of the article
idstringid of the article
titlestringTitle of the document mapped from sentences of type 1 (If no sentences, the title will be an empty string)
sentenceslist[dict]List of sentences with their match and indicators when available
textstringsentencesText of the sentence
entitieslist[str]sentences(Deprecated*) List containing the labels of the matched entities in the sentence
sentence_idintsentencesId of the sentence
typeintsentencesType of the sentence:
  • 0 - text
  • 1 - title
  • 2 - thread.title
matcheslist[dict]sentencesList of matched keywords or entities
resultsdictsentencesList of indicators:
  • sentiments (optional)
  • (Deprecated*) emotions (optional)
negativefloatresultsNegative sentiment probability
positivefloatresultsPositive sentiment probability
neutralfloatresultsNeutral sentiment probability
polarityfloatresultsThe aggregate of the positive and negative sentiment scores at the sentence level.
  • Formula: (positive_sentiment - negative_sentiment) / (positive_sentiment + negative_sentiment).
  • Range: -1 to 1
polarity_expfloatresultsThe aggregated score, at the sentence level, of the difference between positive and negative sentiment scores, passed through a sigmoid in order to smooth outliers.
  • Formula: 1 / (1+e^(-(positive_sentiment-negative_sentiment)))
  • Range: 0 to 1
document_entity_polaritydictEvaluate the sentiment level towards the entity of interest in all sentences mentioning the entity in a given document.
Formula:
  • Select all sentences where there is an entity match in the document
  • Average the positive and negative scores of the selected sentences
  • Compute the formula (avg_positive_sentiment - avg_negative_sentiment) / (avg_positive_sentiment + avg_negative_sentiment)
  • Range: -1 to 1
document_entity_positivedictEvaluate the level of positive sentiment towards an entity of interest in all sentences mentioning the entity in a given document.
Formula:
  • For each entity of interest matched in the document:
    • Select all the sentences where there is an entity match in the document
    • Aggregate the sentences' positive score with each aggregation function (min, max, mean and median)
  • Range: 0 to 1
document_entity_neutraldictEvaluate the level of neutral sentiment towards an entity of interest in all sentences mentioning the entity in a given document.
Formula:
  • For each entity of interest matched in the document:
    • Select all the sentences where there is an entity match in the document
    • Aggregate the sentences' neutral score with each aggregation function (min, max, mean and median)
  • Range: 0 to 1
document_entity_negativedictEvaluate the level of negative sentiment towards an entity of interest in all sentences mentioning the entity in a given document.
Formula:
  • For each entity of interest matched in the document:
    • Select all the sentences where there is an entity match in the document
    • Aggregate the sentences' negative score with each aggregation function (min, max, mean and median)
  • Range: 0 to 1
document_{sentiment}dictEvaluate the desired sentiment (1) of a document
Formula:
  • Select all sentences in the document
  • Aggregate the sentences {sentiment} score with each aggregation function (min, max, mean and median)
document_polarityfloatEvaluate the sentiment level in all sentences in a given document.
Formula:
  • Select all sentences in the document
  • Average the positive and negative scores of the selected sentences
  • Compute the formula (avg_positive_sentiment - avg_negative_sentiment) / (avg_positive_sentiment + avg_negative_sentiment)
  • Range: -1 to 1
nb_sentencesintNumber of sentences composing the article
textstringmatchesMention of the keyword or entity in the sentence
entitydictmatchesIdentifier of the entity
countdictmatchesPrevalence of keywords for the matched concept.
similarityfloatmatchesCosine similarity score between the sentence and the context of the entity. Ranges between [0,1]
qscorefloatReadability score of the document. This score is calculated from KPIs such as the average sentence length within the document or the ratio of non-alphanumeric characters within the document.
conceptsdictSum of occurrences of the keywords related to a given concept in each sentence.
Available with the concept worker
mentionsdictSum of occurrences of the keywords related to a given mention in each sentence.
Available with the raw-matcher worker
entitiesdictSum of occurrences of the keywords related to a given entity in each sentence.
Available with the ner-linking worker
summarystringSummary in English of the document's text

Summaries may exceptionally be empty for some texts that the model is not able to handle.

  1. Available sentiment classes: positive, neutral, negative

*Deprecated: The field is deprecated and will be removed in future releases. Please consider updating your code as soon as possible.
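The polarity formulas above can be checked numerically. The following is a small sketch re-implementing them outside the API, using made-up sentence scores:

```python
import math

def polarity(positive: float, negative: float) -> float:
    """Sentence-level polarity: (pos - neg) / (pos + neg), range [-1, 1]."""
    return (positive - negative) / (positive + negative)

def polarity_exp(positive: float, negative: float) -> float:
    """Sigmoid-smoothed polarity: 1 / (1 + e^-(pos - neg)), range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(positive - negative)))

def document_polarity(sentences: list) -> float:
    """Average positive/negative scores over all sentences, then apply polarity()."""
    avg_pos = sum(s["positive"] for s in sentences) / len(sentences)
    avg_neg = sum(s["negative"] for s in sentences) / len(sentences)
    return polarity(avg_pos, avg_neg)

sentences = [
    {"positive": 0.8, "negative": 0.1},
    {"positive": 0.2, "negative": 0.6},
]
# avg_pos = 0.5, avg_neg = 0.35 -> (0.5 - 0.35) / (0.5 + 0.35)
print(round(document_polarity(sentences), 4))  # -> 0.1765
```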

POST /analyze/status

This route allows a client to get the status of a previously run analysis.

Parameters

Body parameters

nametypedescriptionrequired
instancestringThe identifier of an analysis. This identifier has to be used to retrieve results.Yes

Response

nametypedescription
countnumber(Deprecated*) Total number of articles.
filterednumber(Deprecated*) Number of filtered texts.
globalSpeednumber(Deprecated*) Analysis global speed.
handlednumberNumber of documents in the analysis result set.
lastErrorMessagestring(Deprecated*) If analysis fails, the last error message that has been raised.
startedAtdateThe time when the analysis started.
statusstringThe current status of the analysis. One of :
  • pending
  • starting
  • running
  • failed
  • stopped
  • completed
updatedAtdateThe last time the analysis was updated.

Pending: Your analysis is queued. The limit for concurrent analyses is reached and your analysis will start as soon as another already-running analysis finishes. See the limitation page for more information.

Starting: Your analysis is starting. Necessary resources are being gathered in order to run it.

*Deprecated: The field is deprecated and will be removed in future releases. Please consider updating your code as soon as possible.
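Since an analysis moves through the statuses listed above, a client typically polls /analyze/status until a terminal state is reached. The sketch below uses an injected get_status callable in place of a real HTTP call:

```python
import time

TERMINAL_STATUSES = {"failed", "stopped", "completed"}

def wait_for_analysis(get_status, delay: float = 0.0, max_polls: int = 100) -> str:
    """Poll until the analysis reaches a terminal status.

    get_status stands in for a POST /analyze/status call and must return one
    of: pending, starting, running, failed, stopped, completed.
    """
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(delay)
    raise TimeoutError("analysis did not finish within max_polls")

# Simulated status sequence instead of real HTTP calls:
states = iter(["pending", "starting", "running", "completed"])
print(wait_for_analysis(lambda: next(states)))  # -> completed
```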

POST /analyze/{id}/timeseries

This route allows a client to run an HTTP request in order to start the computation of a timeseries.

Parameters

Path parameters

nametypedescription
idstringThe analysis id. This id is returned by the analyze/dataset route. The analysis must be completed.

Body parameters

nametypedescriptionrequired
operandslist[string]The operators that will be used for aggregation.
Must be a list composed of one or more of:
  • min lowest value observed for the class on the defined period
  • max highest value observed for the class on the defined period
  • median middle value observed for the class on the defined period
  • mean average value observed for the class on the defined period
No
output_formatstringThe output format of the final result.
Must be one of:
  • json
  • csv
Currently, when using json as the output_format, concept names are returned in lowercase, while the csv format keeps their original case
No
pivotslist[string]The pivots that will be used for aggregation (Additional to the date and entity).
Must be a list composed of one or more of:
  • extract_day
  • language
  • entity
  • site
  • site_type
  • country
No
time_granularitystringAggregation granularity period.
Must be one of:
  • day
  • hour
  • minute
No
volume_onlybooleanAggregation mode. Set to true to display only volumesNo

Note that the output format is chosen when launching a timeseries, not when downloading it. This means you need to run a new timeseries in order to change the output format.
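A launch request built from the allowed options above might look like this sketch:

```python
# Sketch of a body for POST /analyze/{id}/timeseries. All values are picked
# from the options documented above.
body = {
    "operands": ["mean", "max"],        # aggregation operators
    "output_format": "csv",             # csv keeps concept-name case; json lowercases it
    "pivots": ["site_type", "country"], # added on top of date and entity
    "time_granularity": "day",
    "volume_only": False,
}
```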

Response

nametypedescription
hashintegerThe hash of the launched timeseries

The table above only shows the successful HTTP API response (status code = 200). You can expect multiple responses and status codes. Please see here for more information.

GET /analyze/{id}/timeseries/{hash}/status

This route allows a client to retrieve the timeseries status of a given instance using its hash.

Parameters

Path parameters

nametypedescription
idstringThe analysis id. The analysis must be completed.
(format: uuid)
hashstringThe timeseries hash.

Response

nametypedescription
statusstringThe status of the timeseries. One of :
  • running
  • failed
  • stopped
  • completed

GET /analyze/{id}/timeseries/{hash}/download

This route allows a client to download the timeseries results of a given instance using its hash.

Parameters

Path parameters

nametypedescription
idstringThe analysis id. The analysis must be completed.
(format: uuid)
hashstringThe timeseries hash.

Response

nametypedescription
{concept_label}_scorefloatThe percentage of documents containing at least one keyword related to the concept.
  • Formula: Volume of documents co-mentioning the company and the concept / Volume of documents mentioning the company
entitystringThe detected entity
extract_daystringExtract day of the article
Format: date YYYY-MM-dd
extract_hourintegerExtract hour of the article
extract_minuteintegerExtract minute of the article
languagestringThe language of the article
{operator}_{sentiment_class}float{operator} (1) aggregation sentiment score (4) based on the {sentiment_class} (2) score of all the sentences of all the documents matching the entity of interest for the selected aggregation period
{operator}_{emotion_class}float(Deprecated*) {operator} (1) aggregation emotion score (4) based on the {emotion_class} (3) score of all the sentences of all the documents matching the entity of interest for the selected aggregation period
entity_{operator}_{sentiment_class}float{operator} (1) aggregation sentiment score (4) based on the {sentiment_class} (2) score of the sentences matching the entity of interest for the selected aggregation period
entity_{operator}_{emotion_class}float(Deprecated*) {operator} (1) aggregation emotion score (4) based on the {emotion_class} (3) score of the sentences matching the entity of interest for the selected aggregation period
volume_documentintegerThe volume of documents where the entity of interest is matched for the aggregation period
volume_sentenceintegerThe volume of all sentences of all documents where the entity of interest is matched for the aggregation period
entity_volume_sentenceintegerThe volume of sentences where the entity of interest is matched for the aggregation period.
volume_document_{concept_label}integerThe volume of documents where the entity of interest AND the specified concept are matched for the aggregation period
volume_sentence_{concept_label}integerThe volume of all sentences of all documents where the entity of interest AND the specified concept are matched for the aggregation period
{concept_label}_sentiment_polarityintegerAverage sentiment polarity of documents that match both the specified concept and the entity
concepts_keywords_countdict[str, dict[str, int]]Represents the count of keywords matched per concept in the document for the aggregation period.
More info on the timeseries indicators page
  1. Available operators: min, max, median, mean
     • min: lowest value observed for the class on the defined period
     • max: highest value observed for the class on the defined period
     • median: middle value observed for the class on the defined period
     • mean: average value observed for the class on the defined period
  2. Available sentiment classes: positive, neutral, negative
  3. Available emotion classes: anger, anticipation, fear, joy, sadness, surprise, trust
  4. Sentiment and emotion scores are displayed using scientific notation, meaning an exponent can appear at the end of the number

Spaces surrounding the concept label are removed in the result
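The {concept_label}_score formula above reduces to a simple ratio of document volumes. A sketch with made-up volumes:

```python
def concept_score(volume_with_concept: int, volume_total: int) -> float:
    """Volume of documents co-mentioning the company and the concept,
    divided by the volume of documents mentioning the company."""
    if volume_total == 0:
        return 0.0
    return volume_with_concept / volume_total

# 30 of the 120 documents mentioning the entity also mention the concept:
print(concept_score(30, 120))  # -> 0.25
```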

GET /analyze/{id}

This route allows a client to retrieve the payload of a previously run instance using its id.

Parameters

Path parameters

nametypedescription
idstringThe analysis id.
(format: uuid)

Response

nametypedescriptionscope
entitieslist[dict]List of entities to be requestedglobal
entity_of_intereststringUnique id for the entity of interestentity
keywordslistList of keywords to searchentity
sites_excludeslistList of websites to exclude from searchglobal
min_matchintThe message must contain at least min_match keywords.
Example:
  • keywords: ["apple", "iphone", "macbook"]
  • min_match: 2
  • Behavior = TextReveal® API will keep a document only if at least 2 elements from the keywords list appear in it.
global
min_repeatintThe message must contain at least min_repeat occurrence of a keyword.
Example:
  • keywords: ["apple", "iphone"]
  • min_repeat: 2
  • Behavior = TextReveal® API will keep a document only if it contains at least 2 occurrences of either apple or iphone.
global
start_datedateformat: (YYYY-MM-DD)global
end_datedateformat: (YYYY-MM-DD)global
site_typelistType of sites to search (field thread.site_type)
Available options are :
  • news
  • blogs
  • discussions
  • licensed_news
  • premium_news
global
languageslistList of languages to searchglobal
countrieslistList of countries to search (field thread.country)global
siteslistList of websites to searchglobal
co_mentionslistList of keywords to search with the keywords list. Works like a boolean AND.
Example:
  • keywords : ["TotalEnergy"]
  • co_mentions: ["gas", "oil price"]
  • Behavior = TextReveal® API will look for documents relevant to at least one of the co_mentions. For the above example, below are the different cases of relevancy:
    • TotalEnergy and gas
    • TotalEnergy and oil price
    • TotalEnergy and oil price and gas
N.B: Search of co_mentions is operated in full-text and is case insensitive.
global
keywords_excludelistList of keywords to exclude from the search. Works like a boolean AND NOT.
Example:
  • keywords: ["apple", "iphone"]
  • keywords_exclude: ["Steve Jobs", "Tim Cook"]
  • Behavior: TextReveal® API will look for documents relevant to Apple the company or the iPhone but NOT containing either Steve Jobs or Tim Cook.
N.B: Search of keywords_exclude is operated in full-text and is case insensitive.
global
qscorefloatQuality threshold used to filter out unreadable data.global
neg_keywordslistList of keywords not used for search but for named entity resolution or annotation tasks.
Detailed explanation:
  • Using generic or high-cardinality keywords can bring a huge volume of data to process or reduce the quality of the extracted data.
  • The neg_keywords parameter lets you add such keywords so that they are used to annotate sentences containing them within documents already matched by less generic keywords.
  • Example:
    • keywords: ['Microsoft'] (Used for search in the datalake)
    • neg_keywords: ['MSFT'] (Not used for search in the datalake)
    • Behavior = TextReveal® API will look for documents containing only Microsoft, then annotate every sentence mentioning MSFT or Microsoft
entity
workerslistWorkflow steps definitionglobal
contextstringContext description of the entityentity
precomputebooleanWhether to query offline data:
  • True
  • False
global
similarity_thresholdfloatSimilarity score threshold for recognized or matched entities. Filters out documents containing entities with a similarity score lower than the threshold.

POST /analyze/{id}/stop

This route allows a client to stop a previously run instance using its id.

Parameters

Path parameters

nametypedescription
idstringThe analysis id.
(format: uuid)

POST /analyze/{id}/download

Prepare the download of your instance. The result will be available in the analyze/{id}/download/{hash} route once the process is completed.

Parameters

Body parameters

nametypedescriptionrequired
iduuidThe instance idyes
limitnumber | dictThe number of documents to download, or a dictionary specifying a limit per resourceno
limit>bystringThe resource to limit. Possible values: entityyes
limit>valuenumberThe limit valueyes
fieldsList[string]Collect only the fields you need. By default, all fields except summary are returned.
Available keys for the fields parameter are:
  • concepts
  • document_entity_negative
  • document_entity_neutral
  • document_entity_polarity
  • document_entity_positive
  • document_negative
  • document_neutral
  • document_polarity
  • document_positive
  • entities
  • extract_date
  • id
  • language
  • mentions
  • qscore
  • sentences
  • thread
  • title
  • url
  • summary
no
datedaterangeExtract only the documents published between the two dates.no
date>startdateThe start date of the date range. The format is YYYY-MM-DD.no
date>enddateThe end date of the date range. The format is YYYY-MM-DD.no
conceptsList[string]Extract only the documents that contain the concepts. Each concept must be present in the analysis.no
entitiesList[string]Extract only the documents that contain the entities. Each entity must be present in the analysis.no
sortdictSort the documents in ascending or descending order given a fieldno
sort>fieldstringThe field to sort the documents. Available fields are:
  • document_negative
  • document_neutral
  • document_positive
  • document_polarity
  • document_entity_polarity
  • document_entity_positive
  • document_entity_neutral
  • document_entity_negative
yes
sort>orderstringThe order of the sorting. Available values are
  • ASC
  • DESC
yes
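A complete download-preparation body combining these parameters might look like the following sketch; the instance id, dates, and concept label are placeholders:

```python
# Sketch of a body for POST /analyze/{id}/download. All concrete values are
# illustrative; any concepts and entities listed must exist in the analysis.
body = {
    "id": "00000000-0000-0000-0000-000000000000",  # placeholder instance id
    "limit": {"by": "entity", "value": 500},
    "fields": ["title", "url", "sentences", "document_polarity"],
    "date": {"start": "2024-01-01", "end": "2024-01-31"},  # YYYY-MM-DD
    "concepts": ["supply chain"],
    "sort": {"field": "document_polarity", "order": "ASC"},
}
```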

Path parameters

nametypedescription
idstringThe analysis id. The analysis must be completed.
(format: uuid)

Response

nametypedescription
hashintegerThe hash of the download

GET /analyze/{id}/download/{hash}/status

This route allows a client to retrieve the download status of a given instance using its hash.

Parameters

Path parameters

nametypedescription
idstringThe analysis id. The analysis must be completed.
(format: uuid)
hashstringThe download hash.

Response

nametypedescription
statusstringThe status of the download. One of :
  • running
  • failed
  • stopped
  • completed

GET /analyze/{id}/download/{hash}

This route allows a client to download the results of a given instance using its hash.

Parameters

Path parameters

nametypedescription
idstringThe analysis id. The analysis must be completed.
(format: uuid)
hashstringThe download hash.

Response

Array of URLs that you can use to retrieve the result of the download. Example:

[
  "https://files.textreveal.com/download/company=e8c8d3ba-4ca0-45d1-b4ba-c1b1f2364a12/instance=fabd78aa-5241-4842-8108-fd52ef805cde/download=03d8c58a31/output-0.parquet.gz",
  "https://files.textreveal.com/download/company=e8c8d3ba-4ca0-45d1-b4ba-c1b1f2364a12/instance=fabd78aa-5241-4842-8108-fd52ef805cde/download=03d8c58a31/output-1.parquet.gz"
]

POST /analyze/timeserie (deprecated)

Deprecated: The route is deprecated and will be removed in future releases. Please consider updating your code as soon as possible.

Parameters

Body parameters

Response

nametypedescription
extract_daydateDay of extraction of the article. (YYYY-MM-DD)
Date : UTC+0
extract_datedatetimeDate of extraction of the article
extract_hourtimeExtract hour of the article.
Available when time series are aggregated by hour
extract_minutetimeExtract minute of the article.
Available when time series are aggregated by minute
countrystringCountry of the site, determined automatically by the site language, IP and TLD
entitystringEntity detected for the record
idstringidentifier of the document
site_typestringType of data source for document
  • news
  • blogs
  • discussions
  • licensed_news
  • premium_news
languagestringlanguage of the document
sitestringWebsite of the document
urlstringUrl of the document
volume_sentenceintNumber of sentences
volume_documentintNumber of documents
mean_<indicator>floatMean score calculated for the record
max_<indicator>floatMax score calculated for the record
min_<indicator>floatMin score calculated for the record
median_<indicator>floatMedian score calculated for the record