Who does not want to improve their productivity and save time? Who does not want to get the maximum amount of work done in the same amount of time? The same holds for data scientists. Selecting the right software before diving into any data science project is crucial. There are many options available on the market, but choosing one is tricky.
So, in this article, I will share why Anaconda is one of the best tools for practicing data science.
Anaconda is often compared to a virtual environment tool (a way to keep an isolated collection of tools or packages), but it offers more than environment management. Anaconda has its own package manager called conda, and we can still use other package managers such as pip and pipenv alongside it. Conda is a package manager that takes care of packages by installing, updating, and removing them. Anaconda supports other languages like R too. So let us have a closer look at what Anaconda is.
Anaconda is a software distribution. Unlike Miniconda, which ships only conda, Python, and a handful of supporting packages, Anaconda comes with over 150 pre-installed data science packages, covering most of what we need day to day.
A package is code that someone else has written and that can be reused for a specific purpose. We can use a package at our convenience.
Packages are helpful because, without them, we would have to write much more code to get the same work done.
Anaconda is like a hardware store for data scientists, with tools for every step from exploring datasets to modeling them.
In short, Anaconda is an all-in-one package.
Why do we use Anaconda?
No directory needed: Unlike other virtual environment tools like virtualenv, we do not have to specify a directory before setting up an environment. Conda keeps named environments in its own location, so we can start a new virtual environment without worrying about its path.
Python Version: As long as the requested Python version is available on the package server, conda can create the environment by fetching that exact version of Python.
Conda Package Manager: Conda does more than install packages. It also updates and removes them, and it manages the environments they are installed in.
Anaconda provides the ability to share the foundation of our work. Anaconda ensures that if someone else reproduces our work, they have the same tools as us.
Suppose we are working on a dataset where we need tools to explore the data, train our model, and visualize graphs for better understanding; this is where Anaconda comes into the picture. Anaconda provides all these essential tools in one place, making our tasks more manageable.
What makes a Conda environment different from other virtual environments?
A virtual environment is used to manage Python packages for different projects; however, there are a few differences between Conda environments and other virtual environments.
Other virtual environments are not language agnostic, i.e., tools such as virtualenv are Python-specific, whereas conda environments are not.
Virtual environments depend on the base system install of Python, whereas Conda environments are independent of the base install.
Conda packages are always prebuilt binaries, whereas libraries installed with pip into virtual environments may come as source distributions that have to be built on our machine.
Now, let's look at the installation of Anaconda in our system.
Installation
Installing software can be very tedious, but it's quite the opposite with Anaconda. Installation of Anaconda is user-friendly. Anaconda is available for Linux, Windows, and macOS. We have to visit the Anaconda download site, select the installer that matches our machine, and follow the instructions.
After installation, if we search for Anaconda on our system, we can see Anaconda Navigator in the results.
Clicking on Anaconda Navigator opens the Navigator window.
That is all for the installation. Now we can launch Jupyter Notebook from Navigator; it opens in our web browser, where we can create a new notebook.
We do not have to install Jupyter Notebook separately. It is already included in Anaconda.
Now, when we open the Anaconda terminal, we can see (base) written at the start of the command line. It shows that our base conda environment is set up and active.
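A quick way to confirm the setup from the terminal is sketched below. It assumes a Unix-like shell such as bash; in the Windows Anaconda prompt, the two conda commands can be typed directly. The CONDA_BIN variable is only an illustrative helper and is not part of conda.

```shell
# Look up conda on PATH; CONDA_BIN stays empty when it cannot be found.
CONDA_BIN="$(command -v conda || true)"

if [ -n "$CONDA_BIN" ]; then
    echo "conda found: $CONDA_BIN"
    conda --version   # print the installed conda version
    conda info        # show the active environment, channels, and install locations
else
    echo "conda is not on PATH; open the Anaconda terminal and try again"
fi
```

If conda is not found in a regular terminal, that does not necessarily mean the installation failed; the Anaconda terminal sets up the path for us.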
Now let us look at some of the basic commands which make our task more manageable.
View Installed Packages
The base conda environment comes with some pre-installed packages. To list them, we use the following command:
conda list
The list shows that many packages are already installed. We can search for a package too by using:
Searching
conda search <package_name>
We can add and remove packages too by using:
Installing
conda install <package_name>=<version>
Removing
conda remove <package_name>
Those are some of the basic commands of conda. The commands are intuitive because each one is named after the task it performs. Moving on, let us see what we can do in a conda environment.
Working with Conda Environments
Conda enables us to create, activate, deactivate, share, and remove environments. All environments are isolated and can hold different packages in different versions without interfering with one another.
We create an environment with conda create. The name after -n is the name of the new environment. We can specify the Python version, and we can list the packages we want in this newly created environment.
Once we activate the environment with conda activate, its name appears at the start of the command line, indicating that we are inside our environment.
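As a minimal sketch of creating and entering an environment, the commands could look like the following. The environment name ds_project, the Python version, and the package list are illustrative choices, and a Unix-like shell such as bash is assumed; in the Windows Anaconda prompt, the conda commands can be typed on their own.

```shell
# Hypothetical environment name, used only for this example.
ENV_NAME="ds_project"

if command -v conda >/dev/null 2>&1; then
    # Create the environment with a chosen Python version and two packages.
    # -y answers the confirmation prompt automatically.
    conda create -y -n "$ENV_NAME" python=3.10 numpy pandas

    # Run one command inside the new environment without activating it.
    # This form also works from scripts.
    conda run -n "$ENV_NAME" python --version
else
    echo "conda is not on PATH; open the Anaconda terminal and try again"
fi

# In an interactive Anaconda terminal, we would activate it instead:
#   conda activate ds_project
```

conda run is used here because conda activate expects an interactive, initialized shell; when typing commands by hand, conda activate is the usual choice.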
Deactivate the Environments:
conda deactivate
After this, the prompt reverts to base.
Listing All Environments:
conda env list
It is a quick command to check all the available environments present.
Removing an Environment:
conda env remove -n environment_name
Here, environment_name is the name of the environment you want to remove.
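For instance, removing an environment and then checking that it is gone might look like this. The name ds_project is hypothetical, and a Unix-like shell such as bash is assumed.

```shell
# Hypothetical environment name, used only for this example.
ENV_NAME="ds_project"

if command -v conda >/dev/null 2>&1; then
    # Delete the environment together with the packages installed in it.
    conda env remove -y -n "$ENV_NAME"

    # List the remaining environments; the removed name should no longer appear.
    conda env list
else
    echo "conda is not on PATH; open the Anaconda terminal and try again"
fi
```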
Exporting Environments:
One of the main reasons for using conda environments is sharing the same Python packages with identical versions. Conda exports the environment information to a YAML file, unlike pip, which typically uses requirements.txt.
conda env export > environment.yml
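The other half of sharing is rebuilding the environment from such a file. The sketch below writes a small hand-written file of the same shape and then creates an environment from it. The file name example_environment.yml, the environment name, the channel, and the package list are all illustrative; a real export is longer and lists every installed package with its version. A Unix-like shell such as bash is assumed.

```shell
# Write a small example file. A real file produced by conda env export
# contains many more entries than this hand-written one.
cat > example_environment.yml <<'EOF'
name: ds_project
channels:
  - defaults
dependencies:
  - python=3.10
  - numpy
  - pandas
EOF

if command -v conda >/dev/null 2>&1; then
    # Build a new environment with the name and packages listed in the file.
    conda env create -f example_environment.yml
else
    echo "conda is not on PATH; open the Anaconda terminal and try again"
fi
```

A collaborator who receives the exported environment.yml can run the same conda env create -f command against it to get matching packages.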
That's all for the basics of the conda environment. Once you are familiar with the basics, you can also refer to the Anaconda cheat sheet.
Frequently Asked Questions
1. Is Anaconda cloud-based? Ans. Anaconda itself is installed locally, but Anaconda Cloud hosts many packages and notebooks for a variety of applications. Users do not need an account to use those packages.
2. What is the purpose of using Anaconda? Ans. Anaconda has many built-in tools that are frequently used in data science and machine learning projects. In addition, we can create separate environments to isolate different libraries and their versions.
3. Should we add Anaconda to the Windows path? Ans. It is generally not recommended to add Anaconda to the Windows PATH because it can interfere with other software. Instead, it is recommended to use the Anaconda prompt (Anaconda's own terminal).
Key Takeaways
This article covered the conda environment and its relatively easy installation. Further, we looked into what makes the conda environment different from other environments. Moving forward, we learned about some of the basic commands used for manipulating packages, and finally, we saw how to create an environment and control it.
Today, Anaconda is one of the most popular tools used by data scientists. Companies like HSBC, Samsung, Cisco, and BMW use Anaconda for their work.