Introduction
Big data refers to data sets that are too large to handle with conventional tools and that keep growing exponentially. It encompasses everything from live stock data to payment transactions to audio and images. In the rest of this article, we will discuss big data types, the integration of data types into a big data environment, and metadata.
Big data is any kind of data source that shares these characteristics:
- High volume of data
- Wide variety of data
- High velocity of data
Big data is essential because it helps organisations and individuals gather, handle, manipulate, and organise large amounts of data in a timely way and draw accurate insights from it. Two main data types make up big data: structured and unstructured.
Now we will discuss these two main types of data.
Types of Data:
There are generally two types of data:
- Structured Data
- Unstructured Data
Structured Data:
Structured data refers to data that has a defined format, such as numbers, dates, and strings (combinations of digits and words, for example a house address). This kind of data is generally stored in a database, and we can query it using a query language like SQL.
Most of us are used to working with this kind of data. Structured data accounts for about 20 percent of the data out there.
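To make the idea of a defined format concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and chosen only for illustration; the point is that every row follows the same schema, so a single SQL query can filter it.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT, address TEXT, signup_date TEXT)"
    )
    # Every row has the same fields: a name, an address, and an ISO-format date.
    conn.executemany(
        "INSERT INTO customers (name, address, signup_date) VALUES (?, ?, ?)",
        [
            ("Asha", "12 Lake Road", "2021-03-14"),
            ("Ben", "7 Hill Street", "2022-11-02"),
            ("Chen", "45 Park Avenue", "2022-01-20"),
        ],
    )
    return conn


def customers_since(conn, date):
    """Return names of customers who signed up on or after the given ISO date."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE signup_date >= ? ORDER BY name",
        (date,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(customers_since(conn, "2022-01-01"))  # ['Ben', 'Chen']
```

Because the dates are stored as ISO strings (YYYY-MM-DD), comparing them as text gives the same order as comparing them as dates, which keeps the query simple.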
As technology evolves, structured data is taking on a new role in big data, and new sources are producing structured data in large amounts and in real time. The two primary sources of structured data are:
- Computer-generated data: data generated by the computer itself. Some examples are financial data, weblog data, and sensor data.
- Human-generated data: data that humans supply in interaction with computers. Some examples are input data, gaming-related data, and clickstream data.
Now, let’s take a brief look at unstructured data.
Unstructured Data:
Unstructured data is data that, unlike structured data, does not follow any specific format. If around 20 percent of the data we are dealing with is structured, then about 80 percent of the data we will encounter is unstructured.
Unstructured data is everywhere. Like structured data, it can be either computer-generated or human-generated. Let's have a look at some examples of each.
- Computer-generated data: data generated by computers, such as photographs and videos, satellite data, RADAR and SONAR data, and scientific data.
- Human-generated data: data generated by humans with the aid of computers. Examples are survey results, website content, mobile data, and social media data.
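As a contrast with querying a table, the following rough sketch shows what working with unstructured text can look like. The sample posts and the helper names (extract_hashtags, hashtag_counts) are made up for illustration; the idea is that free text has no fixed fields, so any structure has to be extracted first, here with a simple regular expression.

```python
import re
from collections import Counter

# Hypothetical social media posts: free text with no fixed schema.
SAMPLE_POSTS = [
    "Loving the new phone! #Tech #gadgets",
    "Rainy day in the city... #weather",
    "Best #tech conference so far, met great people",
]


def extract_hashtags(text):
    """Pull hashtags out of free text and normalise them to lower case."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


def hashtag_counts(posts):
    """Count how often each hashtag appears across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(extract_hashtags(post))
    return counts


if __name__ == "__main__":
    print(hashtag_counts(SAMPLE_POSTS).most_common(1))  # [('tech', 2)]
```

Real unstructured sources such as images, video, or sensor sweeps usually need far heavier processing than a regular expression, so this is only a small-scale illustration of the extra extraction step.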
Now we will see how these data types are integrated into a big data environment.