Introduction
Importance of Data Visualization
Data Visualization is the process of taking raw data and transforming it into graphs, charts, or images to derive meaningful insights from it.
It gives us a qualitative understanding of the data by helping us identify patterns, trends, and outliers. We can show the key relationships in the data, along with numerical measures, in different plots and graphs, which helps us and our stakeholders get an overall sense of the data.
Thousands of rows of data can be summarized in a single graph or pie chart. For example, a product-based company can understand how its product performs across regions far more easily by visualizing each region's sales in a pie chart than by reading the raw sales figures.
Data Visualization is therefore an essential technique for businesses, and data can be expressed in many ways using plots such as the line plot, scatter plot, box and whisker plot, histogram, pie chart, and many more.
In this blog, we will be studying the line and scatter plots.
Introduction to Matplotlib
Matplotlib is a visualization library in Python that offers us a wide variety of visualizations such as line, bar, scatter, histogram, boxplot, and many more. We can create beautiful visual charts and graphs with ease and define our custom labels for the axes, the plot's title, the color of the plot, and a lot more.
We can easily draw and customize our plots using the functions in the pyplot module of matplotlib.
import matplotlib.pyplot as plt
The above Python code imports the pyplot module and gives it the alias plt, which we then use to call the functions available in the module.
We will learn to utilize the pyplot module to generate line and scatter plots on a data sample in Python.
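As a first taste of the customization described above, here is a minimal sketch that sets the title, the axis labels, and the color of a plot. The data values, label text, and output filename are made up for illustration only.

```python
import matplotlib.pyplot as plt

# Hypothetical sample values, used only for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create a figure with a single set of axes
fig, ax = plt.subplots()

# Draw the data and choose the color of the plot
(line,) = ax.plot(x, y, color="green")

# Custom title and axis labels
ax.set_title("A simple plot")
ax.set_xlabel("x values")
ax.set_ylabel("y values")

# Save the figure to an image file (the filename is an arbitrary choice)
fig.savefig("simple_plot.png")
```

In an interactive session or a notebook, calling plt.show() instead of saving the file displays the figure directly.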
Line Plot
Line plots are used to display data collected at regular intervals or to show the relationship between two values, i.e., how an observation changes as a specific variable changes. For example, we could plot the change in a city's population (observation) over time (variable), or the temperature (observation) at different times of the day (variable).
Usually, the x-axis represents the variable, and the y-axis represents the observation.
Therefore, line plots are helpful in presenting time series data as well as any sequence data where there is an ordering between observations.
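The temperature example can be sketched as follows, with the time of day (variable) on the x-axis and the temperature (observation) on the y-axis. The readings below are invented for illustration and do not come from any real dataset.

```python
import matplotlib.pyplot as plt

# Hypothetical readings: hour of the day and temperature in degrees Celsius
hours = [0, 3, 6, 9, 12, 15, 18, 21]
temperatures = [14, 13, 15, 19, 24, 26, 22, 17]

fig, ax = plt.subplots()

# The variable goes on the x-axis and the observation on the y-axis;
# marker="o" draws a dot at each recorded reading
ax.plot(hours, temperatures, marker="o")

ax.set_title("Temperature over one day")
ax.set_xlabel("Hour of day (variable)")
ax.set_ylabel("Temperature in °C (observation)")

# Save the figure to an image file (the filename is an arbitrary choice)
fig.savefig("temperature_line_plot.png")
```

Because the hours are ordered, the line connecting the points shows at a glance when the temperature rises and falls.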
Why are Line Plots effective?
We can understand why Line plots are effective by looking at this line plot.
This is a line plot of a company's sales (observation) for each month (variable).
Source: excelinexcel.in
Looking at this plot, it is easy to see the rate of change of both the actual sales and the target sales from month to month. We can see how the actual sales changed each month and compare them directly with the target. We can also tell in which months the sales fell short of the target, in which months they exceeded it, and by how much.
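A comparison of this kind can be sketched by drawing two lines on the same axes. The monthly figures below are hypothetical and are not the values from the plot above; the list comprehension at the end simply picks out the months in which the actual sales were below the target.

```python
import matplotlib.pyplot as plt

# Hypothetical monthly figures, used only for illustration
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
target_sales = [100, 110, 120, 130, 140, 150]
actual_sales = [90, 115, 118, 140, 135, 160]

fig, ax = plt.subplots()

# One line per series; the labels are picked up by the legend
ax.plot(months, actual_sales, marker="o", label="Actual sales")
ax.plot(months, target_sales, marker="o", linestyle="--", label="Target sales")

ax.set_title("Actual vs. target sales")
ax.set_xlabel("Month (variable)")
ax.set_ylabel("Sales (observation)")
ax.legend()

# Save the figure to an image file (the filename is an arbitrary choice)
fig.savefig("sales_line_plot.png")

# Months in which the actual sales did not meet the target
below_target = [
    month
    for month, actual, target in zip(months, actual_sales, target_sales)
    if actual < target
]
print(below_target)
```

The months collected in below_target are exactly those where the solid line for actual sales sits under the dashed target line.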