Document
Documents are web pages, blog posts, and any other content found on the internet or in online newspaper editions.
Every document is unique and is assigned a datamarts_document_id that can be used to reference it.
At the document level, we have the document's metadata (title, url, extract_date, text, ...) and its sentiment (positive, neutral, negative, polarity).
We also want entity-specific information about documents: if a document is about a specific entity, we want to know that, along with what makes the document relevant to the entity (entity_keywords).
This gives us an entity-document level view of the documents, to which we add ESG and SDG information.
This process creates two new levels, esg-entity-document and sdg-entity-document. Each record at these levels gets a new id and the relevant information (score, categories, ...).
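The levels described above can be pictured as records that point back to the same document. The following is a minimal sketch, not the actual schema: the class names, entity_id, esg_entity_document_id, and all field types are assumptions made for illustration. Only datamarts_document_id, title, url, extract_date, text, polarity, entity_keywords, score, and categories come from the description above. The sdg-entity-document level would presumably follow the same shape as the ESG one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Document:
    """Document level: one record per unique document."""
    datamarts_document_id: str
    title: str
    url: str
    extract_date: str
    text: Optional[str]  # assumed optional, since text is not always accessible
    polarity: float      # assumed to be a single number


@dataclass
class EntityDocument:
    """Entity-document level: one record per (entity, document) pair."""
    datamarts_document_id: str
    entity_id: str              # hypothetical identifier for the entity
    entity_keywords: List[str]  # what makes the document relevant to the entity


@dataclass
class EsgEntityDocument:
    """ESG-entity-document level: its own id plus score and categories."""
    esg_entity_document_id: str  # hypothetical name for the new id
    datamarts_document_id: str
    entity_id: str
    score: float
    categories: List[str]


def esg_rows_for(doc: Document, rows: List[EsgEntityDocument]) -> List[EsgEntityDocument]:
    """Return the ESG records that reference the given document."""
    return [r for r in rows if r.datamarts_document_id == doc.datamarts_document_id]
```

Keeping datamarts_document_id on every level is what lets a record at the ESG or SDG level be traced back to the document's metadata and sentiment.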
Licenses
Some documents come from paid sources. You can still get their title and information about your entities, categories, score, summary, etc., but you cannot access their text or get a translation of it.
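As an illustration of the licensing rule, the sketch below withholds the restricted fields from a record that comes from a paid source. It is a hypothetical helper, not part of any real API: the function name apply_license, the is_paid_source flag, and the "translation" key are assumptions. Only the rule itself (text and translation are unavailable, everything else stays) comes from the description above.

```python
from typing import Any, Dict

# Fields that are withheld for paid sources; the key names are assumed.
RESTRICTED_FIELDS = ("text", "translation")


def apply_license(record: Dict[str, Any], is_paid_source: bool) -> Dict[str, Any]:
    """Return a copy of the record, dropping restricted fields for paid sources."""
    if not is_paid_source:
        return dict(record)
    return {key: value for key, value in record.items() if key not in RESTRICTED_FIELDS}


record = {
    "title": "Example headline",
    "score": 0.8,
    "summary": "Short summary.",
    "text": "Full article body.",
    "translation": "Translated body.",
}
licensed_view = apply_license(record, is_paid_source=True)
# licensed_view keeps title, score, and summary; text and translation are gone.
```

Returning a copy rather than mutating the input keeps the original record intact for callers who are entitled to the full content.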