Why should we remove Stop Words?
Stop words are in plenty in any natural language, and therefore by removing them, we can filter out the low-level information and focus on more important words.
For example, if we search “How to study NLP” on a search engine and the engine matches pages containing “how” and “to”, we will get a lot of unwanted results, since these words appear in almost every English web page. Removing “how” and “to” lets the search engine focus on the more important words “study” and “NLP”, so we get resources that are actually of interest to us.
Therefore, removing stop words shrinks the data, which reduces the training time of the model. It can also improve performance, as the search engine example above shows: dropping the stop words led to more relevant results.
When do we remove Stop Words?
Do you think that we can remove stop words in every task?
The answer is a big NO!
Let’s say we want to predict the sentiment of the sentence “The decoration was not good.”
After removing the stop words, we are left with “decoration good.”
Although the original sentence is a negative review, removing the stop words, including “not”, flips it into a positive one.
So removing stop words is not suitable for this case.
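To see this concretely, here is a minimal sketch using a tiny illustrative stop word set (NLTK’s full English list, shown later in this article, also contains “not”):
# Minimal illustration: treating "not" as a stop word throws away the negation
stop_words = {"the", "was", "not"}  # tiny illustrative set, not a library's full list
review = "The decoration was not good."
kept = [w for w in review.split() if w.lower().strip(".") not in stop_words]
print(" ".join(kept))  # -> decoration good.
The sentiment-bearing word “not” is gone, which is exactly why the filtered review now reads as positive.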
In general, stop word removal suits tasks like text classification, but it can be a curse in tasks like sentiment analysis, machine translation, etc. Therefore, research your task before removing stop words to check whether it is appropriate.
Removing stop words using NLTK
Natural Language Toolkit (NLTK) is a popular suite of Python libraries for working on NLP.
There is no universally accepted list of stop words, but most libraries provide their own list, and we can add or remove words from that list as per our task.
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
stop_words = stopwords.words('english')
print("Number of stop words is ", len(stop_words))
print("Stop words are:",stop_words)

Output
Number of stop words is 179
Stop words are: ['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're", "you've", "you'll", "you'd", 'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', "she's", 'her', 'hers', 'herself', 'it', "it's", 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves', 'what', 'which', 'who', 'whom', 'this', 'that', "that'll", 'these', 'those', 'am', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into', 'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', "don't", 'should', "should've", 'now', 'd', 'll', 'm', 'o', 're', 've', 'y', 'ain', 'aren', "aren't", 'couldn', "couldn't", 'didn', "didn't", 'doesn', "doesn't", 'hadn', "hadn't", 'hasn', "hasn't", 'haven', "haven't", 'isn', "isn't", 'ma', 'mightn', "mightn't", 'mustn', "mustn't", 'needn', "needn't", 'shan', "shan't", 'shouldn', "shouldn't", 'wasn', "wasn't", 'weren', "weren't", 'won', "won't", 'wouldn', "wouldn't"]
As we can see, NLTK’s English stop word list contains 179 words.
Let us see how we can remove the stop words.
text = "Coding ninjas is one of the best learning platforms."
words = [word for word in text.split() if word.lower() not in stop_words]
modified_text = " ".join(words)
print("Original text:", text)
print("Modified text:", modified_text)

Output
Original text: Coding ninjas is one of the best learning platforms.
Modified text: Coding ninjas one best learning platforms.
As we can see, “is”, “of”, and “the” have been removed from the text.
In the second line of code, we first split the text into words, since stop words are stored as individual words. Each word is lowercased only for the comparison, because the words in the stop word list are lowercase, and we keep the words that are not in the stop_words list. In the third line, we join the kept words with spaces and print the modified sentence.
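Note that text.split() leaves punctuation attached to the last word (“platforms.”). If you need cleaner tokens, you can tokenize with NLTK instead of split(); the following is a small sketch, assuming the tokenizer data has been downloaded (the resource is named 'punkt', or 'punkt_tab' in newer NLTK versions):
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download('punkt')  # tokenizer data; newer NLTK versions may ask for 'punkt_tab'
stop_words = set(stopwords.words('english'))

text = "Coding ninjas is one of the best learning platforms."
tokens = word_tokenize(text)  # splits the trailing period into its own token
filtered = [t for t in tokens if t.lower() not in stop_words]
print(filtered)  # roughly: ['Coding', 'ninjas', 'one', 'best', 'learning', 'platforms', '.']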
We can also edit the list of stop words.
To add a single word to the stop words list:
stop_words = stopwords.words('english') # returns a list of stop words in English language
print("Original length:", len(stop_words))
stop_words.append('example') # to add a single word
print("Modified length:", len(stop_words))

Output
Original length: 179
Modified length: 180
To add a list of words to the stop_words list:
stop_words = stopwords.words('english') # returns a list of stop words in English language
print("Original length:", len(stop_words))
stop_words.extend(['stopwordone', 'stopwordtwo']) # to add a list of stop words
print("Modified length:", len(stop_words))

Output
Original length: 179
Modified length: 181
To remove a word from the list of stop words:
stop_words = stopwords.words('english') # returns a list of stop words in English language
print("Original length:", len(stop_words))
stop_words.remove('a') # to remove a word from the list of stop words
print("Modified length:", len(stop_words))

Output
Original length: 179
Modified length: 180
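Putting these edits together, a common pattern for sentiment-style tasks is to keep the negation words and to convert the list to a set for faster membership checks. A minimal sketch:
from nltk.corpus import stopwords

# Keep negations so that "not good" does not collapse to "good"
custom_stop_words = set(stopwords.words('english')) - {"not", "no", "nor"}
review = "The decoration was not good."
kept = [w for w in review.split() if w.lower().strip(".") not in custom_stop_words]
print(" ".join(kept))  # -> decoration not good.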
NLTK has stop words in 24 different languages, which we can check below.
print(stopwords.fileids()) # to see the available languages

Output
['arabic', 'azerbaijani', 'bengali', 'danish', 'dutch', 'english', 'finnish', 'french', 'german', 'greek', 'hungarian', 'indonesian', 'italian', 'kazakh', 'nepali', 'norwegian', 'portuguese', 'romanian', 'russian', 'slovene', 'spanish', 'swedish', 'tajik', 'turkish']
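Any of these names can be passed to stopwords.words() to load that language’s list, for example:
from nltk.corpus import stopwords

# The same API works for any language returned by stopwords.fileids()
french_stop_words = stopwords.words('french')
print("Number of French stop words:", len(french_stop_words))
print("First few:", french_stop_words[:10])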
FAQs
1. What is removing stop words in Python?
Ans: Stop words are the most frequently occurring words, such as “a”, “an”, and “the”, that add little to the meaning of the data, and hence they can be removed from it.
2. What are the advantages of removing stop words?
Ans: Removing stop words can help reduce data size and hence reduce the training time of the model. It can also increase the model's performance by giving more accurate results.
3. Is it mandatory to remove stop words?
Ans: No, it depends on your task. Removing stop words in a sentiment analysis task can hamper your performance as it can flip the intent of the sentence.
4. What is NLTK?
Ans: Natural Language Toolkit (NLTK) is a super helpful library that can be used for NLP tasks like text preprocessing, removing stop words, etc.
5. How do you remove stop words in NLP?
Ans: You can use various libraries like NLTK, spaCy, etc., to remove stop words.
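For reference, here is a minimal spaCy sketch, assuming the small English model has been installed with python -m spacy download en_core_web_sm:
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Coding ninjas is one of the best learning platforms.")

# Each token carries an is_stop flag based on spaCy's built-in stop word list
filtered = [token.text for token in doc if not token.is_stop]
print(filtered)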
Key Takeaways
This article discussed stop words, and when and how we can remove them using NLTK.
We hope this blog has helped you enhance your knowledge regarding stop words and NLTK. If you would like to learn more, check out our free content on NLP and more unique courses. Do upvote our blog to help other ninjas grow.
Happy Coding!