Workers

NER Linking Worker

The NER Linking worker filters out all documents that do not mention relevant entities. It combines named entity recognition (NER) with the keywords provided in the request payload to annotate and link the entities present in each document.

The ner-linking worker is restricted to certain languages; see the Language Support page for more information.

Definition

The NER Linking worker is a two-stage process:

Named entity recognition (NER), which operates at the sentence level
Entity linking, which uses the keywords specified in the payload

First stage: Named Entity Recognition at the sentence level

TextReveal® NER is hybrid. It uses a combination of rule-based and machine learning approaches.

The NER supports the following classes:

Label	Type
`ORG`	Organization
`LOC`	Location
`PERSON`	Person
`PRODUCT`	Product
`GPE`	Geopolitical Entity

Within each sentence, the NER module annotates the presumed mentions of an organization, location, person, product or geopolitical entity.

This process takes place only once, before you run your analysis.

Examples

NER output for the sentences "Tim Cook eats an apple in front of Apple." and "Bill Gates is delivering a speech on Windows.":

[
    {
      "text": "Tim Cook eats an apple in front of Apple",
      "entities": {
        "PERSON": [
          "Tim Cook"
        ],
        "ORG": [
          "Apple"
        ]
      }
    },
    {
      "text": "Bill Gates is delivering a speech on Windows",
      "entities": {
        "PERSON": [
          "Bill Gates"
        ],
        "PRODUCT": [
          "Windows"
        ]
      }
    }
]

Sentence annotated with matched entities
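As a minimal sketch of how this output could be consumed, the Python snippet below flattens the per-sentence annotations into (sentence index, class, mention) triples. The helper name `flatten_entities` is hypothetical, and the input is assumed to have exactly the shape shown in the example above.

```python
import json

# NER output copied from the example above.
ner_output = json.loads("""
[
  {"text": "Tim Cook eats an apple in front of Apple",
   "entities": {"PERSON": ["Tim Cook"], "ORG": ["Apple"]}},
  {"text": "Bill Gates is delivering a speech on Windows",
   "entities": {"PERSON": ["Bill Gates"], "PRODUCT": ["Windows"]}}
]
""")


def flatten_entities(sentences):
    """Return (sentence_index, ner_class, mention) triples (hypothetical helper)."""
    triples = []
    for index, sentence in enumerate(sentences):
        # A sentence without annotations is assumed to carry no "entities" key
        # or an empty mapping; both cases yield no triples.
        for ner_class, mentions in sentence.get("entities", {}).items():
            for mention in mentions:
                triples.append((index, ner_class, mention))
    return triples


mentions = flatten_entities(ner_output)
for row in mentions:
    print(row)
```

Note that the lowercase "apple" (the fruit) in the first sentence is not annotated; only the capitalized "Apple" is reported as an `ORG`.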

Example of a result from an NER system (figure: colour-coded recognised entities).

Second stage: Entity Linking Function

The Entity Linking Function filters out documents that do not mention any entity of interest and keeps only the ones that are relevant to the analysis you ran.

To do so, the Entity Linking Function uses the NER result from the previous stage and the keywords you specified in the payload. For each sentence of a document annotated in the previous stage, it checks whether an annotated mention matches one of the keywords of an entity of interest. If one does, the document is kept.

This function identifies and links sentences to an entity ID, preserving the entity type. Identification is performed using lowercase matching. By default, only the first occurrence of the same entity of interest is retrieved. When the sentence does not contain any entity of interest, an empty array is returned.
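The matching described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the worker's actual implementation: the function `link_sentence` and the `keywords_by_entity` mapping (entity ID to keywords) are hypothetical, the real payload format may differ, and "first occurrence" is interpreted here as keeping each matched label once per sentence.

```python
def link_sentence(sentence, keywords_by_entity):
    """Link one NER-annotated sentence to entities of interest (illustrative)."""
    # Lowercase lookup table: keyword -> entity ID (hypothetical structure).
    lookup = {
        keyword.lower(): entity_id
        for entity_id, keywords in keywords_by_entity.items()
        for keyword in keywords
    }
    matches = []
    seen = set()
    for entity_type, mentions in sentence["entities"].items():
        for mention in mentions:
            key = mention.lower()  # matching is done on lowercased text
            entity_id = lookup.get(key)
            # Skip mentions that match no keyword, and repeated occurrences.
            if entity_id is None or (entity_id, key) in seen:
                continue
            seen.add((entity_id, key))
            matches.append({
                "entity_id": entity_id,
                "entity_type": entity_type,  # the NER class is preserved
                "label": mention,
            })
    return matches


# Hypothetical keywords for two entities of interest.
keywords = {
    "apple": ["Apple", "Tim Cook"],
    "microsoft": ["Windows", "Bill Gates"],
}
sentence = {
    "text": "Tim Cook eats an apple in front of Apple",
    "entities": {"PERSON": ["Tim Cook"], "ORG": ["Apple"]},
}
matches = link_sentence(sentence, keywords)
unrelated = link_sentence(
    {"text": "Paris is sunny", "entities": {"GPE": ["Paris"]}}, keywords
)
print(matches)
print(unrelated)
```

In this sketch, a sentence with no entity of interest yields an empty list, and a document would be kept as soon as at least one of its sentences yields a non-empty list.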

Example

Entity Linking output based on the previous NER example:

[
  {
    "entities": {
      "apple": [
        {
          "Apple": 1,
          "Tim Cook": 1
        }
      ],
      "microsoft": [
        {
          "Windows": 1,
          "Bill Gates": 1
        }
      ]
    },
    "extract_date": "2019-02-01 09:27:39.016",
    "id": "H36itWwBVJ4dixto1Mq89",
    "language": "english",
    "sentences": [
      {
        "entities": [
          "apple"
        ],
        "matches": [
          {
            "class": "entity",
            "count": null,
            "entity_id": "apple",
            "entity_type": "ORG",
            "label": "Apple"
          },
          {
            "class": "entity",
            "count": null,
            "entity_id": "apple",
            "entity_type": "PERSON",
            "label": "Tim Cook"
          }
        ],
        "sentence_id": 0,
        "text": "Tim Cook eats an apple in front of Apple",
        "type": 0
      },
      {
        "entities": [
          "microsoft"
        ],
        "matches": [
          {
            "class": "entity",
            "count": null,
            "entity_id": "microsoft",
            "entity_type": "PRODUCT",
            "label": "Windows"
          },
          {
            "class": "entity",
            "count": null,
            "entity_id": "microsoft",
            "entity_type": "PERSON",
            "label": "Bill Gates"
          }
        ],
        "sentence_id": 1,
        "text": "Bill Gates is delivering a speech on Windows",
        "type": 0
      }
    ]
  }
]
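When post-processing this output, one option is to aggregate the sentence-level `matches` per entity ID. The Python sketch below does so on a trimmed copy of the example document. The helper `count_labels` is hypothetical, and the idea that the resulting counts correspond to the document-level `entities` summary is an assumption drawn only from this single example.

```python
# Trimmed copy of the example document; only the fields used here are kept.
document = {
    "id": "H36itWwBVJ4dixto1Mq89",
    "sentences": [
        {"sentence_id": 0, "matches": [
            {"entity_id": "apple", "entity_type": "ORG", "label": "Apple"},
            {"entity_id": "apple", "entity_type": "PERSON", "label": "Tim Cook"},
        ]},
        {"sentence_id": 1, "matches": [
            {"entity_id": "microsoft", "entity_type": "PRODUCT", "label": "Windows"},
            {"entity_id": "microsoft", "entity_type": "PERSON", "label": "Bill Gates"},
        ]},
    ],
}


def count_labels(doc):
    """Count matched labels per entity ID across all sentences (hypothetical helper)."""
    counts = {}
    for sentence in doc["sentences"]:
        for match in sentence["matches"]:
            labels = counts.setdefault(match["entity_id"], {})
            labels[match["label"]] = labels.get(match["label"], 0) + 1
    return counts


label_counts = count_labels(document)
print(label_counts)
```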

Covered languages

The ner-linking worker is restricted to certain languages; see the Language Support page for more information.