Introduction
The growing volume of data collected each year makes extracting usable information increasingly important. The data is usually stored in a data warehouse, a repository that gathers data from various sources, such as company databases and summarised data from internal systems.
Simple query and reporting functions, statistical analysis, more advanced multidimensional analysis, and data mining are all examples of data analysis. Multidimensional analysis, which requires powerful data manipulation and computational capabilities, is commonly associated with online analytical processing (OLAP).
Business intelligence (BI) has become a prominent topic as the amount of data created each year grows. With this growing attention on BI, several major companies have expanded their presence in the sector, concentrating the market around some of the world's top software providers.
Business intelligence includes data warehousing, database management systems, and online analytical processing, as well as data analysis and data mining.
Data Mining
Data mining is the process of extracting data, examining it from multiple angles, and producing a meaningful summary that reveals relationships within the data. Data mining can be descriptive or predictive.
Descriptive Data Mining- Descriptive data mining provides information about existing data. It examines past data for shared similarities or groupings to explain why something worked or didn't, for example by categorizing customers based on product preferences or attitudes.
Descriptive Modelling techniques include
1. Clustering- Putting similar records in the same group.
2. Anomaly detection- Identifying multidimensional outliers.
3. Association rule learning- Detecting relationships between items that occur together across records.
4. Principal component analysis- Detecting relationships between variables.
5. Affinity grouping- Grouping people with common interests or similar goals.
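To make the first technique in the list concrete, here is a minimal sketch of clustering using plain k-means. The sample points, the starting centroids, and the function name `k_means` are all hypothetical and chosen only for illustration; real projects would normally rely on an established library rather than this hand-written version.

```python
from math import dist
from statistics import mean


def k_means(points, centroids, max_iter=100):
    """Group 2-D points around centroids using plain k-means.

    `centroids` holds the starting guesses (an assumption of this sketch;
    other initialisation strategies exist).
    """
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(mean(axis) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Nothing moved, so the grouping is stable.
        centroids = new_centroids
    return centroids, clusters


# Hypothetical records, e.g. (visits per month, purchases per month).
records = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(records, [(0, 0), (10, 10)])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

With these sample records the low-activity and high-activity customers end up in separate groups, which is the kind of shared similarity descriptive data mining looks for.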
Predictive Data Mining- Predictive data mining generates predictions based on the data. This type of modeling goes a step further to classify future events or estimate unknown outcomes; credit scoring, for example, estimates a person's likelihood of repaying a loan. Customer attrition, marketing response, and credit defaults are all areas where predictive modeling can help.
Predictive Modelling techniques include
1. Regression- A method for estimating the relationship between one dependent variable and a set of independent variables.
2. Neural network- Computer programs that detect patterns, make predictions, and learn.
3. Support vector machine- Supervised learning models, with associated learning algorithms, used for classification and regression.
4. Decision tree- Tree-shaped diagrams in which each branch represents a possible outcome.
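As an illustration of the first predictive technique, the sketch below fits a straight line with ordinary least squares, the simplest form of regression, using one independent variable. The data values and the names `fit_line` and `predict` are hypothetical; this is one simple way to show the idea, not a complete modelling workflow.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Estimate the dependent variable for a new value of x."""
    return slope * x + intercept


# Hypothetical history: x = marketing spend, y = resulting sales.
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(spend, sales)
print(slope, intercept)               # 2.0 1.0
print(predict(6, slope, intercept))   # 13.0
```

The fitted line is then used to estimate an outcome that has not been observed yet, which is what separates predictive from descriptive data mining.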
There are two types of data.
Structured Data- Structured data follows a predefined layout and is saved in a specific format.
Unstructured Data- Unstructured data comprises many different kinds of data stored in their original formats.
Data warehouses store structured data, whereas data lakes are used to store unstructured data. Both can be stored in the cloud; however, structured data typically takes up less space, while unstructured data requires more.
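The difference between the two types can be sketched in a few lines. The sample text, the field names, and the "paid <number>" pattern below are all made up for illustration: the structured data can be read directly by column name, while the unstructured text needs an extraction step before it yields the same kind of value.

```python
import csv
import io
import re

# Structured: every row follows the same columns (hypothetical sample).
structured = "customer_id,product,amount\n101,laptop,1200\n102,phone,650\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total = sum(int(row["amount"]) for row in rows)
print(total)  # 1850

# Unstructured: free text kept in its original form (hypothetical sample).
unstructured = "Customer 101 called to say the laptop (paid 1200) arrived late."
# The amount has to be dug out with a pattern we assume the text follows.
amounts = [int(match) for match in re.findall(r"paid (\d+)", unstructured)]
print(amounts)  # [1200]
```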
So, what is the significance of data mining? We've seen the startling figures: the amount of data produced doubles every two years, and unstructured data makes up 90% of the digital universe.
Data Mining helps us in the following ways.
1. Data mining helps us understand what's important and then use that knowledge to predict potential outcomes.
2. It increases the speed at which we can make well-informed decisions.