Table of contents
1. Introduction
2. Named Entity Linking
2.1. Named Entity Recognition (NER)
2.2. Named Entity Linking (NEL)
3. NEL using DBpedia Spotlight
4. Main Steps in Entity Linking in General
5. Applications
6. Frequently Asked Questions
7. Conclusion
Last Updated: Aug 13, 2025

Entity Linking: A primary NLP task for Information Extraction

Introduction

In natural language processing, entity linking, also referred to as named-entity linking (NEL), named-entity disambiguation (NED), named-entity recognition and disambiguation (NERD), or named-entity normalization (NEN), is the task of assigning a unique identity to entities (such as famous individuals, locations, or companies) mentioned in a text.

In entity linking, words of interest (names of persons, locations and companies) are mapped from an input text to corresponding unique entities in a target knowledge base. Words of interest are called named entities (NEs), mentions, or surface forms.

For example, given the sentence "Paris is the capital of France," we want to know whether the word "Paris" refers to the French capital, another city, Paris Hilton, or any of a number of other alternatives, such as:

"Paris (surname), a list of fictional characters".

"Paris, a prince of Troy in Greek mythology".

Let's look at another example to further grasp what entity linking is.

Assume we're creating an automated stock trader that will trade on news events. In a news story one morning, our algorithm comes across the phrase "Tesla Crashes, Jim Cramer Expects Rally." We immediately see an issue that our algorithmic trader must be able to solve: is the term "Tesla" referring to the public corporation, a Tesla vehicle, or (nonsensically) Nikola Tesla? If 'Tesla' refers to the firm, we know the headline is implying that Tesla stock has collapsed and that now is a good time to acquire Tesla stock at its current cheap price. If 'Tesla' refers to a Tesla vehicle, on the other hand, we know that this is a negative headline for Tesla (maybe a problem with self-driving technology caused a Tesla to crash on the highway), indicating that now is a good time to sell Tesla shares.

As humans, we find this challenge trivial: given the mentions of the entities 'Jim Cramer' and 'Rally' and their relationship to finance, we can quickly and reliably deduce from the context of the rest of the story that the headline refers to Tesla stock.

Named Entity Linking

Information extraction comprises multiple sub-tasks. In most cases, the following sub-tasks are performed in order to extract information from unstructured data:
1. Named Entity Recognition (NER)
2. Named Entity Linking (NEL)
3. Relation Extraction

Named Entity Recognition (NER)

A named entity is a real-world object, such as a person, location, or organization. NER identifies named entity occurrences in text and classifies them into pre-defined categories. NER is modelled as the task of assigning a tag to each word in a sentence. Below is an example result from an NER system.

NER tells us which words are entities and what their types are. In the above example, NER identifies "Sebastian Thrun" as a person. But we still don't know exactly which "Sebastian Thrun" the text refers to. NEL is the next sub-task, and it answers this question.
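
To make the idea of per-word tagging concrete, here is a minimal sketch of how tags can be decoded into entity spans. It assumes the BIO scheme (B = beginning, I = inside, O = outside), which is one common convention and not necessarily the one used by the NER system referred to above. The sentence, its tags, and the helper name `decode_bio` are hand-written for illustration and are not the output of a real model.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; flush any entity in progress.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the current entity.
            current_tokens.append(token)
        else:
            # "O" (outside) or an inconsistent tag ends the current entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical sentence with hand-assigned tags (not model output).
tokens = ["Sebastian", "Thrun", "worked", "at", "Google", "in", "2007", "."]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "O", "O", "O"]

entities = decode_bio(tokens, tags)
print(entities)
```

The decoded spans carry only a surface string and a type such as PER or ORG. Nothing in them says which real-world person or organization is meant, which is the gap NEL fills.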

Named Entity Linking (NEL)

NEL assigns a unique identity to entities mentioned in the text. In other words, NEL is the task of linking entities mentioned in the text to their corresponding entities in a knowledge base [1]. The target knowledge base depends on the application, but for open-domain text we can use knowledge bases derived from Wikipedia. In the example above, we can determine exactly which "Sebastian Thrun" is meant by linking the entities to DBpedia, a structured knowledge base extracted from Wikipedia. This process of linking entities to Wikipedia is also called Wikification.
 

NEL is also referred to as Entity Linking, Named Entity Disambiguation (NED), Named Entity Recognition and Disambiguation (NERD) or Named Entity Normalization (NEN). NEL has a wide range of applications other than Information Extraction. NEL is used in Information Retrieval, Content Analysis, Intelligent Tagging, Question Answering systems, Recommender Systems, etc.

NEL also plays a significant role in the Semantic Web. The Semantic Web is a term coined by Tim Berners-Lee for a web of data that can be processed by machines. A vital issue in the Semantic Web is automatically populating and enriching existing knowledge bases with newly extracted facts, and NEL is considered an essential subtask of knowledge base population.

NEL using DBpedia Spotlight

There are many libraries available for implementing NEL, but here we are going to use DBpedia Spotlight. The target knowledge base for NEL here is DBpedia. DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs, was developed as a step towards interconnecting the Web of Documents with the Web of Data [3].

DBpedia Spotlight is deployed as a web service, and we can use the provided Spotlight API to perform NEL. You can even check the status of the DBpedia Spotlight server. Below is sample Python code that uses the Spotlight API to do NEL.

import requests
from IPython.display import display, HTML


class APIError(Exception):
    """Raised when the Spotlight API returns a non-200 status."""

    def __init__(self, status):
        super().__init__(status)
        self.status = status

    def __str__(self):
        return "APIError: status={}".format(self.status)


# Base URL for the Spotlight API
base_url = "https://api.dbpedia-spotlight.org/en/annotate"
# Parameters
# 'text' - text to be annotated
# 'confidence' - confidence score threshold for linking
params = {
    "text": "My name is Taneesh. I am final year student at IIT Ropar. I love Natural Language Processing.",
    "confidence": 0.35,
}
# Response content type
headers = {"accept": "text/html"}
# GET request
res = requests.get(base_url, params=params, headers=headers)
if res.status_code != 200:
    # Something went wrong
    raise APIError(res.status_code)
# Display the result as HTML in a Jupyter Notebook
display(HTML(res.text))

The output is the input text rendered as HTML, in which DBpedia Spotlight has linked the located entities to the DBpedia knowledge base.

Pyspotlight is a Python library that wraps the Spotlight API, so we can also use it directly, which may be more convenient.
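
If you want the links as data rather than rendered HTML, the Spotlight service can also be asked for JSON by sending `accept: application/json`. The sketch below is a hedged illustration that uses only the standard library. The field names (`Resources`, `@surfaceForm`, `@URI`, `@similarityScore`) are what the public service is assumed to return and should be checked against the current API. The helper names `annotate` and `extract_links` and the sample payload are hypothetical and written for this example.

```python
import json
import urllib.parse
import urllib.request

# Public Spotlight endpoint (assumed to be reachable from your machine).
SPOTLIGHT_URL = "https://api.dbpedia-spotlight.org/en/annotate"


def annotate(text, confidence=0.35, timeout=10):
    """Call Spotlight and return the parsed JSON response (needs network access)."""
    query = urllib.parse.urlencode({"text": text, "confidence": confidence})
    request = urllib.request.Request(
        f"{SPOTLIGHT_URL}?{query}",
        headers={"Accept": "application/json"},
    )
    # urlopen raises urllib.error.HTTPError on a non-2xx status.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def extract_links(payload):
    """Return (surface form, DBpedia URI, similarity score) for each linked entity."""
    links = []
    # "Resources" is treated as optional in case no entity was linked.
    for resource in payload.get("Resources", []):
        links.append(
            (
                resource["@surfaceForm"],
                resource["@URI"],
                float(resource["@similarityScore"]),
            )
        )
    return links


# Hand-written payload in the assumed response shape (not real API output).
sample_payload = {
    "@text": "I love Natural Language Processing.",
    "Resources": [
        {
            "@URI": "http://dbpedia.org/resource/Natural_language_processing",
            "@surfaceForm": "Natural Language Processing",
            "@similarityScore": "0.99",
        }
    ],
}

for surface_form, uri, score in extract_links(sample_payload):
    print(f"{surface_form} -> {uri} (score={score:.2f})")
```

With network access, `extract_links(annotate("some text"))` would run the same parsing on a live response. Keeping the parsing separate from the HTTP call makes it easy to test offline.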

Main Steps in Entity Linking in General

  • Recognize: Recognize the entities mentioned in the text and generate candidates for them. In this module, for each entity mention m ∈ M, the entity linking system filters out irrelevant entities in the knowledge base and retrieves a candidate entity set E_m, which contains the entities that mention m may refer to.
  • Rank: Give each candidate a score. In most cases, the candidate entity set E_m contains more than one entity. To rank the candidate entities in E_m, researchers use many types of evidence to find the entity e ∈ E_m that is the most plausible link for the mention m.
  • Link: Connect each recognized mention to its top-ranked entity in the knowledge graph.
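
The three steps above can be sketched with a toy example. This is a minimal illustration, not how DBpedia Spotlight or any production linker works: the knowledge base, its entity IDs, and the function names are all hypothetical, and the ranking evidence is reduced to simple keyword overlap between the context and each candidate.

```python
# Toy knowledge base: entity ID -> aliases and context keywords.
# All entries are hypothetical illustrations, not real knowledge-base records.
KNOWLEDGE_BASE = {
    "Paris_(city)": {
        "aliases": {"paris"},
        "keywords": {"capital", "france", "city", "seine"},
    },
    "Paris_Hilton": {
        "aliases": {"paris", "paris hilton"},
        "keywords": {"hotel", "heiress", "celebrity", "television"},
    },
    "Paris_(mythology)": {
        "aliases": {"paris"},
        "keywords": {"troy", "prince", "greek", "helen"},
    },
}


def generate_candidates(mention):
    """Recognize step: build the candidate set E_m for a mention m."""
    key = mention.lower()
    return [eid for eid, entry in KNOWLEDGE_BASE.items() if key in entry["aliases"]]


def rank_candidates(candidates, context):
    """Rank step: score each candidate by keyword overlap with the context."""
    context_words = {word.strip(".,!?\"'").lower() for word in context.split()}
    scored = [
        (len(KNOWLEDGE_BASE[eid]["keywords"] & context_words), eid)
        for eid in candidates
    ]
    # Highest score first; ties broken alphabetically by entity ID.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def link(mention, context):
    """Link step: return the top-ranked entity, or None if nothing matches."""
    ranked = rank_candidates(generate_candidates(mention), context)
    if not ranked or ranked[0][0] == 0:
        return None  # no candidate, or no supporting evidence in the context
    return ranked[0][1]


print(link("Paris", "Paris is the capital of France."))
print(link("Paris", "Paris was a prince of Troy in Greek mythology."))
```

Real systems replace the keyword overlap with richer evidence, such as how often a surface form refers to each entity or how similar the context is to the entity's description, but the recognize, rank, link structure stays the same.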

Applications

Entity linking is a critical task in NLP and is used in many significant NLP applications. It has a large impact in health care, information retrieval systems, speech recognition, bot creation, and more. Almost all of these applications need to understand how real-world objects are linked to other real-world objects, which makes entity linking a very active area of NLP.

Frequently Asked Questions

1. What is spaCy Entity Linker?
spaCy Entity Linker is a pipeline for spaCy that performs Linked Entity Extraction with Wikidata on a given Document. The Entity Linking System operates by matching potential candidates from each sentence (subject, object, prepositional phrase, compounds, etc.) to aliases from Wikidata.
2. What is a linked entity?
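In NLP, a linked entity is a mention in the text that has been mapped to its corresponding entry in a knowledge base. The term is also used in an unrelated sense in Power BI dataflows, where linked entities point to the entities in other dataflows and do not copy or duplicate the data. There, linked entities are read-only, so if you want to create transformations for a linked entity, you must create a new computed entity with reference to the linked entity.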
3. What is word sense disambiguation in NLP?
In natural language processing, word sense disambiguation (WSD) is the challenge of detecting which "sense" (meaning) of a word is triggered by its use in a given context. People perform this disambiguation largely without being aware of it.
4. How many steps of NLP are there?
NLP is commonly described as having five phases: lexical (structure) analysis, parsing, semantic analysis, discourse integration, and pragmatic analysis.

Conclusion

In a nutshell, entity linking is a crucial step in NLP: by resolving each mention in a text to a unique entity in a knowledge base, it helps systems interpret the meaning of a sentence more precisely.
