What are the main approaches to automatic summarization?
There are two main approaches to summarizing text in NLP:
Extraction-based summarization
The extractive text summarization technique involves extracting essential words or phrases from a source document and combining them to create a summary. The extraction follows a defined scoring measure, and the extracted text is not modified.
Here is an example:
Source text: Mary and Joseph rode on a donkey to attend the yearly event in Jerusalem. In that city, Mary gave birth to Jesus.
Extractive Summary: Mary and Joseph attend event Jerusalem. Mary birth Jesus.
In the example above, the words in the summary are taken directly from the source text and joined together, which is why the result can sometimes be grammatically strange.
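To make the idea concrete, here is a minimal sketch of one common, simplified variant of extraction: instead of picking individual words as in the example above, it scores whole sentences by word frequency and copies the best ones verbatim. The function names, the tiny stop-word list, and the scoring rule are all assumptions made for illustration, not a reference implementation.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "in", "on", "to", "of", "that", "is", "was"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]

def extractive_summary(text, num_sentences=1):
    """Return the top-scoring sentences, copied verbatim and kept in source order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))
    # Score = sum of the document-wide frequencies of the sentence's content words.
    scores = [sum(freq[w] for w in tokenize(s)) for s in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore original order
    return " ".join(sentences[i] for i in keep)

source = ("Mary and Joseph rode on a donkey to attend the yearly event in Jerusalem. "
          "In that city, Mary gave birth to Jesus.")
summary = extractive_summary(source, num_sentences=1)
print(summary)
```

Because the sentences are copied unchanged, this variant avoids the broken grammar of word-level extraction, at the cost of a longer summary. Note that summing frequencies favors long sentences; dividing by sentence length is a common adjustment.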
Abstraction-based summarization
In the abstraction approach, parts of the source document are paraphrased and shortened. When abstraction is applied to text summarization with deep learning, it can overcome the grammatical errors of the extractive method.
The abstractive text summarization algorithms, like humans, generate new phrases and sentences that communicate the essential information from the original text.
As a result, abstraction can produce more fluent summaries than extraction. On the other hand, the text summarization algorithms required for abstraction are more difficult to build, which is why extraction is still widely used.
Here is an example:
Abstractive Summary: Mary and Joseph came to Jerusalem, where Jesus was born.
How does a text summarization algorithm work?
Text summarization is typically a supervised machine learning problem in NLP (where future outcomes are predicted based on provided data). The following is a typical example of how an extraction-based technique for text summarization can work:
- Create a method for extracting candidate keyphrases from the source document. To find them, you can use part-of-speech tagging, word sequences, or other linguistic patterns.
- Collect text documents with positively labeled keyphrases. The keyphrases must be compatible with the chosen extraction method. You can also create negatively labeled keyphrases to improve accuracy.
- Train a binary machine learning classifier to make the text summary. You can use the following features:
  - The length of the keyphrase
  - The frequency of the keyphrase
  - The most frequently occurring word in the keyphrase
  - The number of characters in the keyphrase
- Finally, in the test phase, generate all candidate keyphrases and sentences and classify them accordingly.
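The steps above can be sketched in a few dozen lines of plain Python. This is only an illustration under stated assumptions: candidate phrases are word sequences of one or two words with no stop words, the "most frequently occurring word" feature is interpreted as that word's count in the document, the classifier is a small hand-written logistic regression, and the labeled phrases are invented toy data. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"a", "an", "and", "the", "in", "on", "to", "of", "that"}

def words(text):
    """Lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def count_phrase(parts, tokens):
    """Number of times the word sequence `parts` occurs in `tokens`."""
    n = len(parts)
    return sum(tokens[i:i + n] == parts for i in range(len(tokens) - n + 1))

def candidate_keyphrases(text, max_len=2):
    """Step 1: word sequences of up to `max_len` words that contain no stop word."""
    tokens = words(text)
    found = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(w in STOP_WORDS for w in gram):
                found.add(" ".join(gram))
    return sorted(found)

def features(phrase, text):
    """Step 3: a constant bias term followed by the four listed features."""
    tokens = words(text)
    word_freq = Counter(tokens)
    parts = phrase.split()
    return [
        1.0,                                      # bias
        float(len(parts)),                        # length of the phrase in words
        float(count_phrase(parts, tokens)),       # frequency of the phrase
        float(max(word_freq[w] for w in parts)),  # count of its most frequent word
        float(len(phrase)),                       # number of characters
    ]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in math.exp
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, labels, epochs=200, lr=0.05):
    """Binary logistic regression fitted with stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, x)))
            weights = [w + lr * (y - p) * v for w, v in zip(weights, x)]
    return weights

def is_keyphrase(weights, x):
    """Step 4: classify one candidate from its feature vector."""
    return sum(w * v for w, v in zip(weights, x)) > 0.0

doc = ("Mary and Joseph rode on a donkey to attend the yearly event in Jerusalem. "
       "In that city, Mary gave birth to Jesus.")
# Step 2: toy labels invented for this sketch (1 = keyphrase, 0 = not a keyphrase).
labeled = {"mary": 1, "joseph": 1, "jerusalem": 1, "jesus": 1,
           "rode": 0, "yearly": 0, "city": 0, "gave": 0}
weights = train([features(p, doc) for p in labeled], list(labeled.values()))
candidates = candidate_keyphrases(doc)
predictions = {p: is_keyphrase(weights, features(p, doc)) for p in candidates}
```

With only eight labeled phrases from a single document, the predictions are not meaningful; the point is the shape of the pipeline. In practice you would train on many labeled documents and would probably use an established library classifier rather than a hand-written one.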
Frequently Asked Questions
- What is text summarization?
Text summarization is the practice of condensing the essential information from a source (or sources) to create an abridged version for a specific user (or users) and task (or tasks).
- What is text summarization in NLP?
Automatic Text Summarization is a technique in which computer software shortens lengthy texts and provides summaries to convey the desired content. It is a prevalent challenge in machine learning and natural language processing (NLP).
- What is text summarization in machine learning?
Text summarization is the process of creating summaries from large amounts of data while keeping the material's original context. Throughout the summary, the language should be fluent and concise.
- Is text summarization supervised or unsupervised?
Text summarization is typically a supervised machine learning problem in NLP (where future outcomes are predicted based on provided data), although unsupervised approaches, such as ranking sentences by word frequency, also exist.
Conclusion
In this article, we discussed text summarization: the extractive and abstractive approaches, and how an extraction-based algorithm can work.
Isn't machine learning exciting? We hope this blog has helped you enhance your knowledge of text summarization. If you would like to learn more, check out our articles on the MACHINE LEARNING COURSE. Do upvote our blog to help other ninjas grow. Happy Coding!