Implementation of NLTK
To install NLTK, you need Python 3.7, 3.8, 3.9, or 3.10.
Installing NLTK in Windows
- Install Python 3.10 using https://www.python.org/downloads/ if you do not have Python installed.
- Install NLTK from https://pypi.python.org/pypi/nltk, for example by running “pip install nltk” in the command prompt.
- Run the “import nltk” command in a Python interpreter to check if NLTK is installed properly.
Installing NLTK in MAC/Unix
- Install Python 3.10 using https://www.python.org/downloads/ if you do not have Python installed.
- Run the command “pip install --user -U nltk”.
- Run the “import nltk” command to check if NLTK is installed properly.
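The installation check above can also be scripted. The following is a minimal sketch, not part of the original instructions: the helper name is_installed is hypothetical, and it relies only on the standard library's importlib to see whether the package can be found before importing it.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if Python can locate the given top-level package."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("nltk"):
    import nltk
    # __version__ reports which NLTK release is installed
    print("NLTK version:", nltk.__version__)
else:
    print("NLTK is not installed; try: pip install --user -U nltk")
```

Either branch prints a short message, so the script is safe to run before and after installing NLTK.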
Using NLTK in Google Colab
We can install NLTK using the pip command.
pip install nltk #installing nltk
Now, run the following command to check if NLTK is installed properly.
import nltk #importing nltk
If the import runs without errors, NLTK is installed properly and ready to use.
NLTK provides many datasets and pre-trained models for easy use. The NLTK website lists them in detail.
Let’s use the famous Brown corpus present in NLTK.
nltk.download('brown') #first we need to download the data
from nltk.corpus import brown #then we can import the data
print(brown.words())
Output
[nltk_data] Downloading package brown to /root/nltk_data...
[nltk_data] Unzipping corpora/brown.zip.
['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]
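Once the corpus is loaded, its words can be explored with ordinary Python. The sketch below is an illustration under stated assumptions: the helper names most_common_words and brown_news_summary are hypothetical, and the second function assumes that NLTK is installed and that the 'brown' package can be downloaded. The counting logic itself works on any list of words.

```python
from collections import Counter


def most_common_words(words, n=5):
    """Return the n most frequent alphabetic tokens, ignoring case."""
    counts = Counter(w.lower() for w in words if w.isalpha())
    return counts.most_common(n)


def brown_news_summary():
    """Print some categories and frequent words from the Brown corpus.

    Assumes NLTK is installed and the 'brown' package can be downloaded.
    """
    import nltk
    from nltk.corpus import brown

    nltk.download('brown', quiet=True)   # no-op if already downloaded
    print(brown.categories()[:5])        # first few category names
    print(most_common_words(brown.words(categories='news')))


# Call brown_news_summary() in an environment where NLTK is available.
```

The counting is kept in its own function so it can be reused with any other corpus or token list.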
Instead of downloading the datasets separately, we can download everything in a single go using the following command.
nltk.download('all')
Output
[nltk_data] Downloading collection 'all'
[nltk_data] |
[nltk_data] | Downloading package abc to /root/nltk_data...
[nltk_data] | Unzipping corpora/abc.zip.
[nltk_data] | Downloading package alpino to /root/nltk_data...
[nltk_data] | Unzipping corpora/alpino.zip.
…
[nltk_data] | Downloading package words to /root/nltk_data...
[nltk_data] | Unzipping corpora/words.zip.
[nltk_data] | Downloading package ycoe to /root/nltk_data...
[nltk_data] | Unzipping corpora/ycoe.zip.
[nltk_data] |
[nltk_data] Done downloading collection all
True
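Similarly, if we want to download only the corpora, we can use nltk.download('all-corpora').
Now, let’s try out some functions of NLTK. The tokenizer and tagger used below need their data packages, which are included in the 'all' collection downloaded above.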
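When only one or two packages are needed, it can be convenient to download a package only if it is missing. The following is a rough sketch of that idea, not an official NLTK recipe: ensure_nltk_data is a hypothetical helper, and its find and download parameters exist only so the logic can be tried without NLTK installed. By default they fall back to nltk.data.find and nltk.download.

```python
def ensure_nltk_data(resource_path, package_id, find=None, download=None):
    """Download package_id only when resource_path is not already available.

    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily; assumes NLTK is installed
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)       # raises LookupError when the data is missing
        return False
    except LookupError:
        download(package_id)
        return True


# Example call (needs NLTK and network access):
# ensure_nltk_data('corpora/brown', 'brown')
```

This avoids repeating the download messages every time a notebook cell is re-run.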
import nltk
sentence = "Coding Ninjas is one of the best learning platforms."
tokens = nltk.word_tokenize(sentence)
print(tokens)
Output
['Coding', 'Ninjas', 'is', 'one', 'of', 'the', 'best', 'learning', 'platforms', '.']
Our sentence is now split into tokens in a single step using word_tokenize() of the NLTK package.
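To see what word_tokenize() adds over plain string splitting, the sketch below compares Python's built-in split() with the token list shown in the output above. The NLTK tokens are copied from that output rather than recomputed, so this snippet runs without NLTK.

```python
sentence = "Coding Ninjas is one of the best learning platforms."

# Plain whitespace splitting keeps the full stop attached to the last word
whitespace_tokens = sentence.split()
print(whitespace_tokens)
# ['Coding', 'Ninjas', 'is', 'one', 'of', 'the', 'best', 'learning', 'platforms.']

# Tokens reported by nltk.word_tokenize() in the output above
nltk_tokens = ['Coding', 'Ninjas', 'is', 'one', 'of', 'the', 'best',
               'learning', 'platforms', '.']

# The tokenizer yields one extra token because '.' is separated out
print(len(whitespace_tokens), len(nltk_tokens))  # 9 10
```

Separating punctuation into its own token matters for later steps such as POS tagging, where '.' receives its own tag.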
Now, if we want to tag each token with its part of speech (POS), we can use pos_tag() as follows.
tagged_tokens = nltk.pos_tag(tokens)
print(tagged_tokens)
Output
[('Coding', 'VBG'), ('Ninjas', 'NNP'), ('is', 'VBZ'), ('one', 'CD'), ('of', 'IN'), ('the', 'DT'), ('best', 'JJS'), ('learning', 'NN'), ('platforms', 'NNS'), ('.', '.')]
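The tagged output is a list of (word, tag) tuples, so it can be filtered with ordinary Python. As a small sketch, the code below reuses the tagged list from the output above (copied in, so NLTK is not needed to run it) and picks out the nouns. It assumes the Penn Treebank convention, in which noun tags start with NN.

```python
from collections import defaultdict

# Tagged tokens copied from the pos_tag() output above
tagged_tokens = [('Coding', 'VBG'), ('Ninjas', 'NNP'), ('is', 'VBZ'),
                 ('one', 'CD'), ('of', 'IN'), ('the', 'DT'), ('best', 'JJS'),
                 ('learning', 'NN'), ('platforms', 'NNS'), ('.', '.')]

# Keep only words whose tag starts with 'NN' (NN, NNS, NNP, NNPS)
nouns = [word for word, tag in tagged_tokens if tag.startswith('NN')]
print(nouns)  # ['Ninjas', 'learning', 'platforms']

# Group words by tag for a quick overview
words_by_tag = defaultdict(list)
for word, tag in tagged_tokens:
    words_by_tag[tag].append(word)
print(dict(words_by_tag))
```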
FAQs
1. What is NLTK used for?
The Natural Language Toolkit (NLTK) is used for NLP tasks such as removing stopwords, tokenizing words, and POS tagging.
2. What is the difference between NLP and NLTK?
Natural Language Processing (NLP) aims to understand and interpret human language to perform tasks such as language translation and automatic question answering. The Natural Language Toolkit (NLTK) is a Python package that contains multiple libraries for performing NLP tasks.
3. Why is NLTK the best?
NLTK is a strong choice because it provides many pre-trained models and algorithms that make NLP tasks quick and easy.
4. How do I use NLTK in Python?
You can use Google Colab to try NLTK in Python easily. Install it using the command “pip install nltk” and import it using the command “import nltk”.
5. What is an alternative for NLTK?
spaCy is a popular alternative library that offers functionality similar to NLTK.
Key Takeaways
This article discussed the package NLTK, its use cases, installation, and implementation.
We hope this blog has helped you enhance your knowledge of the NLTK package in NLP. If you would like to learn more, check out our free content on NLP and our other courses. Do upvote our blog to help other ninjas grow.
Happy Coding!