
fel: A Fine-grained Entity Linking vocabulary

Release 04-10-2019

Revision:
0.0.1
Authors:
Henry Rosales-Méndez
Aidan Hogan
Barbara Poblete
Download serialization:
RDF/XML N-Triples TTL
License:
https://creativecommons.org/licenses/by/4.0/


Ontology Specification Draft

Abstract

Some decades have passed since the concept of "named entity" was used for the first time. Since then, new lines of research have emerged in this environment, such as works on the Entity Linking (EL) task, which links the (named) entity mentions in a text collection with their corresponding knowledge-base entries. However, within this task, there is often a lack of consensus in the literature on the definition of the concept of "entity".

This vocabulary aims to model fine-grained categories for EL in order to tackle this problem. Each category divides the universe of mentions into subclasses based on current entity definitions. The vocabulary is designed to extend the NIF format.


Introduction

Entity Linking (EL) is a task in Information Extraction that links the entity mentions in a text collection with their corresponding knowledge-base (KB) entries. With EL, we can take advantage of the large amount of information that publicly available KBs (e.g., Wikipedia, DBpedia, Wikidata) hold about real-world entities and their relationships, and use this semantic information to achieve a better understanding of text corpora. While many challenges for EL are well known, a more fundamental issue is often overlooked by the community: the question of what an "entity" is. Though several definitions have emerged about what an entity should be [Grishman et al., 1996][Eckhardt et al., 2014][Uren et al., 2006][Perera et al., 2016], there is, as of yet, no clear consensus [Borrega et al., 2007][Ling et al., 2015]. This vocabulary proposes terms representing fine-grained categories of mentions and links in order to make different design choices for the EL task explicit.

Namespace declarations

All examples in this document are written in the Turtle RDF syntax. Throughout the document, the following namespaces are used:

Prefix    Namespace                                      Description
owl       http://www.w3.org/2002/07/owl#                 The OWL 2 Schema vocabulary (OWL 2)
xsd       http://www.w3.org/2001/XMLSchema#              XML Schema
rdfs      http://www.w3.org/2000/01/rdf-schema#          The RDF Schema vocabulary (RDFS)
dc        http://purl.org/dc/terms/                      DCMI Metadata Terms
vann      http://purl.org/vocab/vann/                    A vocabulary for annotating vocabulary descriptions
lexinfo   http://www.lexinfo.net/ontology/2.0/lexinfo#   Version 2.0 of LexInfo Ontology, based on Lemon
doap      http://usefulinc.com/ns/doap#                  Description of a Project vocabulary
void      http://rdfs.org/ns/void#                       Vocabulary of Interlinked Datasets
gold      http://purl.org/linguistics/gold               General Ontology for Linguistic Description
skos      http://www.w3.org/2004/02/skos/core#           SKOS Simple Knowledge Organization System Namespace Document

Description

This vocabulary is organized as a hierarchy, where the first level categories are: fel:BaseFormClass, fel:PartOfSpeechClass, fel:OverlapClass and fel:ReferenceClass. Each of them is a partition of the universe of mentions with the goal of categorizing different types of entities. Additionally, we link to related external terms (in gray).
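
The first two levels of this hierarchy can be sketched in Turtle as follows. This is a minimal sketch, not a copy of the published serialization: it assumes that the first-level categories are declared as OWL classes and that each "has super-classes" entry in the class listings of this document corresponds to an rdfs:subClassOf triple.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix fel:  <https://w3id.org/vcb/fel#> .

# First-level categories (assumed here to be OWL classes).
fel:BaseFormClass     a owl:Class .
fel:PartOfSpeechClass a owl:Class .
fel:OverlapClass      a owl:Class .
fel:ReferenceClass    a owl:Class .

# Second level: base form of the mention.
fel:ProperForm          rdfs:subClassOf fel:BaseFormClass .
fel:NumericTemporalForm rdfs:subClassOf fel:BaseFormClass .
fel:CommonForm          rdfs:subClassOf fel:BaseFormClass .
fel:ProForm             rdfs:subClassOf fel:BaseFormClass .

# Second level: part of speech of the mention.
fel:NounPhrasePoS rdfs:subClassOf fel:PartOfSpeechClass .
fel:VerbPoS       rdfs:subClassOf fel:PartOfSpeechClass .
fel:AdjectivePoS  rdfs:subClassOf fel:PartOfSpeechClass .
fel:AdverbPoS     rdfs:subClassOf fel:PartOfSpeechClass .

# Second level: overlap with other mentions.
fel:NoOverlap           rdfs:subClassOf fel:OverlapClass .
fel:MaximalOverlap      rdfs:subClassOf fel:OverlapClass .
fel:IntermediateOverlap rdfs:subClassOf fel:OverlapClass .
fel:MinimalOverlap      rdfs:subClassOf fel:OverlapClass .

# Second level: how the mention refers to its entity.
fel:AnaphoricReference   rdfs:subClassOf fel:ReferenceClass .
fel:DirectReference      rdfs:subClassOf fel:ReferenceClass .
fel:DescriptiveReference rdfs:subClassOf fel:ReferenceClass .
fel:MetaphoricReference  rdfs:subClassOf fel:ReferenceClass .
fel:MetonymicReference   rdfs:subClassOf fel:ReferenceClass .
fel:RelatedReference     rdfs:subClassOf fel:ReferenceClass .
```

Deeper levels (for example, the Full, Short, Extended and Alias variants of fel:ProperForm) follow the same pattern.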

Vocabulary

This section provides details for each class and property defined by the FEL vocabulary.

Classes

fel:BaseFormClass c

IRI: https://w3id.org/vcb/fel#BaseFormClass

This class gathers definitions that mainly recognize Proper Nouns as entities (e.g., the MUC-6 definition), along with other more flexible definitions, such as those that allow pronouns, numbers, temporal expressions, and any mention with a related KB entity. All mentions fall into this category; the separation is provided by its subclasses: fel:ProperForm, fel:NumericTemporalForm, fel:CommonForm, and fel:ProForm.

Is defined by
https://w3id.org/vcb/fel#
has sub-classes
fel:ProperForm c
fel:NumericTemporalForm c
fel:CommonForm c
fel:ProForm c

fel:ProperForm c

IRI: https://w3id.org/vcb/fel#ProperForm

This class gathers all mentions based on names (proper nouns), e.g., 'Michael Jackson', 'USA', 'King of Pop', 'B. Obama', etc. Such mentions do not have to be nouns if they are based on proper nouns, as in the case of 'French', 'Orwellian', etc. Such mentions may use abbreviated or extended forms of names; we add a new level in the class hierarchy to separate them: Full, Extended, Short or Alias.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:BaseFormClass c
has sub-classes
fel:FullProperForm c
fel:ExtendedProperForm c
fel:ShortProperForm c
fel:AliasProperForm c
skos:closeMatch
lexinfo:ProperNoun c

fel:FullProperForm c

IRI: https://w3id.org/vcb/fel#FullProperForm

This class gathers all proper-form mentions that (almost) exactly match the label of the knowledge-base entity. For example, the mention 'Michael Jackson' targeting wiki:Michael_Jackson is considered Full. This class also includes mentions that are syntactically close to the label of the knowledge-base entity, sharing the same morpheme(s); for instance, 'German' pointing to wiki:Germany is also considered a FullProperForm.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ProperForm c

fel:ShortProperForm c

IRI: https://w3id.org/vcb/fel#ShortProperForm

This class is concerned with all the proper-name mentions that are shorter than the label of the Knowledge-Base entity while still being based on the label. For instance, the mentions 'Jackson' or 'M. Jackson' targeting wiki:Michael_Jackson are considered ShortProperForm.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ProperForm c

fel:ExtendedProperForm c

IRI: https://w3id.org/vcb/fel#ExtendedProperForm

This class gathers all proper-name mentions longer than the label of the Knowledge-Base entity but containing the label. For example, the mention 'Michael Joseph Jackson' targeting wiki:Michael_Jackson is considered an ExtendedProperForm.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ProperForm c

fel:AliasProperForm c

IRI: https://w3id.org/vcb/fel#AliasProperForm

This class is concerned with all the proper-noun mentions with a different morpheme than the primary label of the knowledge-base entity to which they refer (though the mention may be a known alias). For instance, the mention 'King of Pop' targeting wiki:Michael_Jackson is considered an AliasProperForm.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ProperForm c
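
The difference between the proper-form variants can be illustrated with two mentions of the same entity. The fragment below is a hedged sketch: the sentence, the document IRI http://example.org/doc2 and the character offsets are hypothetical and introduced only for illustration; the class assignments follow the definitions of fel:ShortProperForm and fel:AliasProperForm given above.

```turtle
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix fel:    <https://w3id.org/vcb/fel#> .

# Hypothetical context sentence (37 characters).
<http://example.org/doc2#char=0,37>
    a nif:Context , nif:RFC5147String ;
    nif:isString "Jackson was known as the King of Pop."^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "37"^^xsd:nonNegativeInteger .

# 'Jackson' is shorter than the label "Michael Jackson" but based on it.
<http://example.org/doc2#char=0,7>
    a nif:Phrase , nif:RFC5147String , fel:ShortProperForm ;
    nif:referenceContext <http://example.org/doc2#char=0,37> ;
    nif:anchorOf "Jackson"^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "7"^^xsd:nonNegativeInteger ;
    itsrdf:taIdentRef <http://en.wikipedia.org/wiki/Michael_Jackson> .

# 'King of Pop' shares no morpheme with the label "Michael Jackson".
<http://example.org/doc2#char=25,36>
    a nif:Phrase , nif:RFC5147String , fel:AliasProperForm ;
    nif:referenceContext <http://example.org/doc2#char=0,37> ;
    nif:anchorOf "King of Pop"^^xsd:string ;
    nif:beginIndex "25"^^xsd:nonNegativeInteger ;
    nif:endIndex "36"^^xsd:nonNegativeInteger ;
    itsrdf:taIdentRef <http://en.wikipedia.org/wiki/Michael_Jackson> .
```

Only the base-form category is shown here; a complete annotation would also carry one class from each of the other three first-level categories.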

fel:NumericTemporalForm c

IRI: https://w3id.org/vcb/fel#NumericTemporalForm

This class gathers all mentions based on numeric and temporal expressions, such as: '1', 'one', '12/23/2019', etc. (as were included in MUC-6).

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:BaseFormClass c
skos:narrower
lexinfo:NumeralPOS c
gold:Quantifier c

fel:CommonForm c

IRI: https://w3id.org/vcb/fel#CommonForm

This class gathers all the mentions that have a corresponding entity in the knowledge base but that do not correspond to a Proper Form, Pro-Form or Numeric/Temporal Form. For instance, the mention 'belt' referring to wiki:Belt_(clothing) is considered CommonForm.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:BaseFormClass c

fel:ProForm c

IRI: https://w3id.org/vcb/fel#ProForm

This class gathers all mentions based on pronouns, pro-adjectives, etc. For example, the mentions 'he', 'theirs', etc., are considered ProForm (assuming they link to a knowledge-base entity).

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:BaseFormClass c

fel:PartOfSpeechClass c

IRI: https://w3id.org/vcb/fel#PartOfSpeechClass

This meta-class gathers classes that divide annotations according to the part-of-speech of their mention.

Is defined by
https://w3id.org/vcb/fel#
has sub-classes
fel:NounPhrasePoS c
fel:VerbPoS c
fel:AdjectivePoS c
fel:AdverbPoS c

fel:NounPhrasePoS c

IRI: https://w3id.org/vcb/fel#NounPhrasePoS

This class gathers all the noun mentions.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:PartOfSpeechClass c
has sub-classes
fel:SingularNounPhrasePoS c
fel:PluralNounPhrasePoS c
skos:narrower
lexinfo:NounPOS c
gold:Noun c

fel:SingularNounPhrasePoS c

IRI: https://w3id.org/vcb/fel#SingularNounPhrasePoS

This class gathers all the singular noun mentions, including 'documentary', 'Germany', etc.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:NounPhrasePoS c

fel:PluralNounPhrasePoS c

IRI: https://w3id.org/vcb/fel#PluralNounPhrasePoS

This class gathers all the plural noun mentions. For instance, 'political parties' may refer to wiki:Political_party.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:NounPhrasePoS c

fel:VerbPoS c

IRI: https://w3id.org/vcb/fel#VerbPoS

This class gathers all the verb mentions. For instance the verb mention 'assassinated' may link to wiki:Assassination.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:PartOfSpeechClass c
skos:closeMatch
lexinfo:VerbPOS c

fel:AdjectivePoS c

IRI: https://w3id.org/vcb/fel#AdjectivePoS

This class gathers all the adjective mentions. For example, there is a Wikipedia page (wiki:Red) about the color 'red'.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:PartOfSpeechClass c
skos:closeMatch
lexinfo:AdjectivePOS c

fel:AdverbPoS c

IRI: https://w3id.org/vcb/fel#AdverbPoS

This class gathers all the adverb mentions. For instance, 'commercially' could be associated with wiki:Commerce.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:PartOfSpeechClass c
skos:closeMatch
lexinfo:AdverbPOS c

fel:OverlapClass c

IRI: https://w3id.org/vcb/fel#OverlapClass

This meta-class gathers classes that divide annotations based on whether or not their mention overlaps with others. For example, in the sentence 'Living with Michael Jackson is a television documentary', the mention 'documentary' does not overlap with any other mention; for this reason it is considered non-overlapping. On the other hand, the mentions 'Living with Michael Jackson' and 'Michael Jackson' overlap.

Is defined by
https://w3id.org/vcb/fel#
has sub-classes
fel:NoOverlap c
fel:MaximalOverlap c
fel:IntermediateOverlap c
fel:MinimalOverlap c

fel:NoOverlap c

IRI: https://w3id.org/vcb/fel#NoOverlap

This class gathers all the mentions without overlap.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:OverlapClass c

fel:MaximalOverlap c

IRI: https://w3id.org/vcb/fel#MaximalOverlap

This class describes all the mentions that overlap with others and that, more specifically, contain other mentions entirely inside them but are not contained in other mentions. For instance, 'Living with Michael Jackson' is considered as maximal overlap assuming 'Michael Jackson' is also annotated and it is not contained inside another mention.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:OverlapClass c

fel:IntermediateOverlap c

IRI: https://w3id.org/vcb/fel#IntermediateOverlap

This class describes all the mentions that overlap with others and that, more specifically, both contain and are contained in other mentions. For instance, in the mention 'New York Police Department Museum', the mention 'New York Police Department' has intermediate overlap because it is contained in the overall mention and contains the mention 'New York'.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:OverlapClass c

fel:MinimalOverlap c

IRI: https://w3id.org/vcb/fel#MinimalOverlap

This class describes all the mentions that overlap with others and that, more specifically, are contained in but do not contain other mentions. For instance, in the annotation 'Living with Michael Jackson', the mention 'Michael Jackson' is considered to have minimal overlap.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:OverlapClass c
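
The three overlapping cases can be seen together in the nested mention 'New York Police Department Museum' used above. The fragment below is a sketch under stated assumptions: the surrounding sentence, the document IRI http://example.org/doc3 and the offsets are hypothetical, and entity links and the other three categories are omitted so that only the overlap typing is shown.

```turtle
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix fel: <https://w3id.org/vcb/fel#> .

# Hypothetical context sentence (47 characters).
<http://example.org/doc3#char=0,47>
    a nif:Context , nif:RFC5147String ;
    nif:isString "The New York Police Department Museum reopened."^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "47"^^xsd:nonNegativeInteger .

# Outermost mention: contains other mentions, contained in none.
<http://example.org/doc3#char=4,37>
    a nif:Phrase , nif:RFC5147String , fel:MaximalOverlap ;
    nif:referenceContext <http://example.org/doc3#char=0,47> ;
    nif:anchorOf "New York Police Department Museum"^^xsd:string ;
    nif:beginIndex "4"^^xsd:nonNegativeInteger ;
    nif:endIndex "37"^^xsd:nonNegativeInteger .

# Middle mention: contained in the museum mention and contains 'New York'.
<http://example.org/doc3#char=4,30>
    a nif:Phrase , nif:RFC5147String , fel:IntermediateOverlap ;
    nif:referenceContext <http://example.org/doc3#char=0,47> ;
    nif:anchorOf "New York Police Department"^^xsd:string ;
    nif:beginIndex "4"^^xsd:nonNegativeInteger ;
    nif:endIndex "30"^^xsd:nonNegativeInteger .

# Innermost mention: contained in other mentions, contains none.
<http://example.org/doc3#char=4,12>
    a nif:Phrase , nif:RFC5147String , fel:MinimalOverlap ;
    nif:referenceContext <http://example.org/doc3#char=0,47> ;
    nif:anchorOf "New York"^^xsd:string ;
    nif:beginIndex "4"^^xsd:nonNegativeInteger ;
    nif:endIndex "12"^^xsd:nonNegativeInteger .
```

Note that the overlap class of a mention depends on which other mentions are annotated: if 'New York' were not annotated, 'New York Police Department' would have minimal rather than intermediate overlap.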

fel:ReferenceClass c

IRI: https://w3id.org/vcb/fel#ReferenceClass

This meta-class gathers classes that divide annotations based on how the mention references its entity. Examples of types of reference include Anaphoric, Direct, Descriptive, Metaphoric, Metonymic and Related.

Is defined by
https://w3id.org/vcb/fel#
has sub-classes
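fel:AnaphoricReference c
fel:DirectReference c
fel:DescriptiveReference c
fel:MetaphoricReference c
fel:MetonymicReference c
fel:RelatedReference c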

fel:AnaphoricReference c

IRI: https://w3id.org/vcb/fel#AnaphoricReference

This class gathers mentions that are pro-forms referring to an antecedent or postcedent in the text. For instance, in the sentence 'His son was widely regarded ...' the mention 'His' may be an anaphoric reference to wiki:Joe_Jackson_(manager). (Note that noun phrases such as 'His son' referring to wiki:Michael_Jackson should rather be marked as descriptive references.)

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ReferenceClass c

fel:DirectReference c

IRI: https://w3id.org/vcb/fel#DirectReference

This class gathers mentions with references based on the direct, literal meaning of the words and names. For instance, the reference 'Michael Jackson' referring to wiki:Michael_Jackson, or the reference 'talent manager' referring to wiki:Talent_manager, are considered direct references.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ReferenceClass c

fel:DescriptiveReference c

IRI: https://w3id.org/vcb/fel#DescriptiveReference

This class gathers mentions that describe the entities they refer to. For instance, the mention 'the capital of Peru' refers descriptively to wiki:Lima, and in the sentence 'Michael Jackson and his father', the mention 'his father' refers to wiki:Joe_Jackson_(manager). Note that pro-forms should rather be marked as anaphoric references.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ReferenceClass c

fel:MetaphoricReference c

IRI: https://w3id.org/vcb/fel#MetaphoricReference

This class gathers mentions that make reference based on a figurative rather than literal meaning of the words. For example, in the phrase 'the King of Pop', the mention 'King' can be considered a metaphoric reference to wiki:King; in the sentence 'they added spice to their relationship', the mention 'spice' (wiki:Spice) is again a metaphoric reference.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ReferenceClass c

fel:MetonymicReference c

IRI: https://w3id.org/vcb/fel#MetonymicReference

This class gathers mentions that refer to something specific by reference to a broader related entity (often, but not always, countries). For example, in the phrase 'Russia announced today', the mention 'Russia' is a metonymic reference to wiki:Government_of_Russia; in the phrase 'Poland won 3-2 on penalties', 'Poland' may be a metonymic reference to wiki:Poland_national_football_team, etc.

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ReferenceClass c

fel:RelatedReference c

IRI: https://w3id.org/vcb/fel#RelatedReference

This class gathers mentions that refer to something for which there is (only) something closely related in the knowledge base. For instance, in the phrase 'The Russian daily RBK', the mention 'daily' refers to a daily newspaper, but in Wikipedia we only have wiki:Newspaper, so 'daily' can be seen as a reference to the closely related wiki:Newspaper. (Such references are sometimes reflected, for example, with redirects in Wikipedia, or pointers to a subsection of an entity's article.)

Is defined by
https://w3id.org/vcb/fel#
has super-classes
fel:ReferenceClass c
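
As an illustration of how a reference class is attached to a mention, the two metonymic examples given for fel:MetonymicReference can be written in NIF as follows. This is a sketch rather than an excerpt from an annotated corpus: the document IRIs http://example.org/doc4 and http://example.org/doc5 and the offsets are hypothetical, and the wiki: links from the text are expanded to English Wikipedia URLs by assumption.

```turtle
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix fel:    <https://w3id.org/vcb/fel#> .

# Context for the first phrase (22 characters).
<http://example.org/doc4#char=0,22>
    a nif:Context , nif:RFC5147String ;
    nif:isString "Russia announced today"^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "22"^^xsd:nonNegativeInteger .

# 'Russia' names the country but stands for its government.
<http://example.org/doc4#char=0,6>
    a nif:Phrase , nif:RFC5147String , fel:MetonymicReference ;
    nif:referenceContext <http://example.org/doc4#char=0,22> ;
    nif:anchorOf "Russia"^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "6"^^xsd:nonNegativeInteger ;
    itsrdf:taIdentRef <http://en.wikipedia.org/wiki/Government_of_Russia> .

# Context for the second phrase (27 characters).
<http://example.org/doc5#char=0,27>
    a nif:Context , nif:RFC5147String ;
    nif:isString "Poland won 3-2 on penalties"^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "27"^^xsd:nonNegativeInteger .

# 'Poland' names the country but stands for its national football team.
<http://example.org/doc5#char=0,6>
    a nif:Phrase , nif:RFC5147String , fel:MetonymicReference ;
    nif:referenceContext <http://example.org/doc5#char=0,27> ;
    nif:anchorOf "Poland"^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "6"^^xsd:nonNegativeInteger ;
    itsrdf:taIdentRef <http://en.wikipedia.org/wiki/Poland_national_football_team> .
```

An annotator who instead linked 'Russia' to the country itself would mark the same mention as fel:DirectReference; making that design choice explicit is the purpose of this category.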

Object Properties

fel:entityType op

IRI: https://w3id.org/vcb/fel#entityType

Specifies the entity type of a KB entity. The domain is the URIs/IRIs of KB entities, and the range is the types of entities, e.g., Organization, Place, Person, etc.

Is defined by
https://w3id.org/vcb/fel#

Legend

c: Classes
op: Object Properties
dp: Data Properties
ni: Named Individuals

Example

The following example shows, in NIF format, the sentence "The program 'Living with Michael Jackson' was broadcast." with three annotated mentions. Each annotation carries its categorization, and each linked entity carries its entity type.

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix fel: <https://w3id.org/vcb/fel#> .

<http://example.org/doc1#char=0,56>
    a nif:String , nif:Context , nif:RFC5147String ;
    nif:isString """The program 'Living with Michael Jackson' was broadcast."""^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "56"^^xsd:nonNegativeInteger ;
    nif:sourceUrl <http://example.org/doc1> .

<http://example.org/doc1_sentence0>
    a nif:String , nif:Context , nif:RFC5147String ;
    nif:isString """The program 'Living with Michael Jackson' was broadcast."""^^xsd:string ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "56"^^xsd:nonNegativeInteger ;
    nif:broaderContext <http://example.org/doc1#char=0,56> .

<http://example.org/doc1_sentence0#char=13,40>
    a nif:String , nif:Context , nif:Phrase , nif:RFC5147String, fel:FullProperForm,
    fel:SingularNounPhrasePoS, fel:MaximalOverlap, fel:DirectReference ;
    nif:referenceContext <http://example.org/doc1_sentence0> ;
    nif:context <http://example.org/doc1#char=0,56> ;
    nif:anchorOf """Living with Michael Jackson"""^^xsd:string ;
    nif:beginIndex "13"^^xsd:nonNegativeInteger ;
    nif:endIndex "40"^^xsd:nonNegativeInteger ;
    itsrdf:taIdentRef <http://en.wikipedia.org/wiki/Living_with_Michael_Jackson> .

<http://en.wikipedia.org/wiki/Living_with_Michael_Jackson> fel:entityType fel:Miscellany .

<http://example.org/doc1_sentence0#char=25,40>
    a nif:String , nif:Context , nif:Phrase , nif:RFC5147String, fel:FullProperForm,
    fel:SingularNounPhrasePoS, fel:MinimalOverlap, fel:DirectReference ;
    nif:referenceContext <http://example.org/doc1_sentence0> ;
    nif:context <http://example.org/doc1#char=0,56> ;
    nif:anchorOf """Michael Jackson"""^^xsd:string ;
    nif:beginIndex "25"^^xsd:nonNegativeInteger ;
    nif:endIndex "40"^^xsd:nonNegativeInteger ;
    itsrdf:taIdentRef <http://en.wikipedia.org/wiki/Michael_Jackson> .

<http://en.wikipedia.org/wiki/Michael_Jackson> fel:entityType fel:Person .

<http://example.org/doc1_sentence0#char=46,55>
    a nif:String , nif:Context , nif:Phrase , nif:RFC5147String, fel:CommonForm, fel:VerbPoS,
    fel:NoOverlap, fel:DirectReference ;
    nif:referenceContext <http://example.org/doc1_sentence0> ;
    nif:context <http://example.org/doc1#char=0,56> ;
    nif:anchorOf """broadcast"""^^xsd:string ;
    nif:beginIndex "46"^^xsd:nonNegativeInteger ;
    nif:endIndex "55"^^xsd:nonNegativeInteger ;
    itsrdf:taIdentRef <http://en.wikipedia.org/wiki/Broadcasting> .

<http://en.wikipedia.org/wiki/Broadcasting> fel:entityType fel:Miscellany .

References

[Grishman et al., 1996]
Grishman, R., and Sundheim, B. Message understanding conference-6: A brief history. In COLING 1 (1996)

[Eckhardt et al., 2014]
Eckhardt, A., Hreško, J., Procházka, J., Smrž, O. Entity linking based on the co-occurrence graph and entity probability. ERD, ACM (2014) 37–44

[Uren et al., 2006]
Uren, V., Cimiano, P., Iria, J., Handschuh, S., Vargas-Vera, M., Motta, E., Ciravegna, F. Semantic annotation for knowledge management: Requirements and a survey of the state of the art. Journal of Web Semantics, 4(1) (2006) 14–28

[Perera et al., 2016]
Perera, S., Mendes, P. N., Alex, A., Sheth, A. P., Thirunarayan, K. Implicit entity linking in tweets. In ISWC, (2016) 118–132

[Borrega et al., 2007]
Borrega, O., Taulé, M., Martí, M.A. What do we mean when we speak about Named Entities. In Proceedings of Corpus Linguistics, 2007

[Ling et al., 2015]
Ling, X., Singh, S., Weld, D. S. Design challenges for entity linking. TACL 3 (2015) 315–328

Citation

If you use the FEL vocabulary in your research, please cite the following paper, which describes the categories in detail (note that it does not describe the vocabulary itself):

Henry Rosales-Méndez, Aidan Hogan, Barbara Poblete. "A Fine-Grained Categorisation for Entity Linking". In the Proceedings of the Conference on Empirical Methods in Natural Language Processing and International Joint Conference on Natural Language Processing (EMNLP–IJCNLP), Hong Kong, China, November 3–7, 2019.