Data Mining
Data mining is a technique for extracting useful information from raw data.
Data mining is a part of data science. In a DBMS, we store and retrieve data as required
by using database queries.
This raises a question for learners: what is the use of data mining?
Database queries can only retrieve data that has already been stored.
If we must analyze the data and decide which parts of it are useful and which are not,
database queries alone will not work.
Data mining handles this situation: with it,
we can make predictions by analyzing previous data. Data mining lets us extract the essence
of the available information and generate reports, views, or summaries of the data for better decision making.
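The difference between retrieving stored data and predicting from it can be shown with a small sketch. It uses Python's built-in sqlite3 module; the sales table, its figures, and the function names are assumptions invented for illustration, and the least-squares trend line stands in for whatever mining technique a real project would choose.

```python
import sqlite3

# Hypothetical monthly sales figures, invented purely for illustration.
SALES = [(1, 100.0), (2, 110.0), (3, 120.0), (4, 130.0)]


def make_db():
    """Create an in-memory database holding the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (month INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", SALES)
    return conn


def stored_sales(conn, month):
    """A database query: returns only what was stored, or None."""
    row = conn.execute(
        "SELECT amount FROM sales WHERE month = ?", (month,)
    ).fetchone()
    return row[0] if row else None


def predicted_sales(conn, month):
    """A simple mining step: fit a least-squares line to the stored
    data and extrapolate it to the requested month."""
    rows = conn.execute("SELECT month, amount FROM sales").fetchall()
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * month
```

Here stored_sales(conn, 5) finds nothing because month 5 was never stored, while predicted_sales(conn, 5) estimates a value from the trend in the earlier months.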
For example, Google shows recommendations related to what we search for most often. It analyzes the topics a user looks at most and predicts the user's interests from their browsing history. This is the work of data mining.
Data mining is also known as KDD (Knowledge Discovery in Databases). KDD refers to the broad procedure of discovering knowledge in data and emphasizes the high-level application of specific data mining techniques.
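The recommendation example can be sketched in a few lines. This is only a toy illustration, assuming a browsing history stored as (page, topic) pairs; the data and the predict_interests function are hypothetical, and real recommendation systems are far more elaborate than a frequency count.

```python
from collections import Counter

# Hypothetical browsing history of (page, topic) pairs, invented for illustration.
HISTORY = [
    ("page1", "cricket"),
    ("page2", "cooking"),
    ("page3", "cricket"),
    ("page4", "travel"),
    ("page5", "cricket"),
    ("page6", "cooking"),
]


def predict_interests(history, top_n=2):
    """Return the top_n topics the user visited most often."""
    counts = Counter(topic for _, topic in history)
    return [topic for topic, _ in counts.most_common(top_n)]
```

Calling predict_interests(HISTORY) ranks the topics by how often they appear, which is the simplest form of predicting interest from past behavior.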
Process of Data Mining
Data mining is an iterative process. It goes through the following phases, laid down by the Cross-Industry Standard Process for Data Mining (CRISP-DM) process model.
Problem Definition
The problem definition comes first. Business aims and objectives are determined from an analysis of the current background, which helps guide future direction.
Data Exploration
The required data is collected and analyzed using various statistical methods, taking the current underlying problems into account.
Data Preparation
The raw data is cleansed and formatted so that it is ready for modeling. The meaning of the data is not changed during preparation.
Modeling
A data model is built using mathematical functions (which help predict data from the existing raw data) and modeling techniques. In this phase, mathematical models are used to determine data patterns. After the model is created, it undergoes validation and verification.
- Based on the business objectives, suitable modeling techniques should be selected for the prepared data set.
- Create a scenario to test the quality and validity of the model. Run the model on the prepared data set.
- Results should be assessed by all stakeholders to make sure that the model meets the data mining objectives.
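The exploration, preparation, and modeling phases described above can be sketched end to end. This is a minimal illustration under stated assumptions, not a prescribed method: the records (hours studied, exam score) are invented, all function names are hypothetical, and a least-squares line with one held-out row stands in for the modeling technique and test scenario a real project would select.

```python
import statistics

# Hypothetical raw records (hours studied, exam score), invented for illustration.
# None marks a missing value; one row is an exact duplicate.
RAW = [
    (1.0, 52.0), (2.0, 54.0), (2.0, 54.0), (3.0, None),
    (4.0, 58.0), (5.0, 60.0), (6.0, 62.0),
]


def explore(rows):
    """Data exploration: basic statistics over the scores that are present."""
    scores = [y for _, y in rows if y is not None]
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),
        "missing": len(rows) - len(scores),
    }


def prepare(rows):
    """Data preparation: drop incomplete rows and exact duplicates.
    The remaining values are left unchanged."""
    seen, clean = set(), []
    for row in rows:
        if None in row or row in seen:
            continue
        seen.add(row)
        clean.append(row)
    return clean


def fit_line(rows):
    """Modeling: fit a least-squares line y = intercept + slope * x."""
    mean_x = statistics.mean([x for x, _ in rows])
    mean_y = statistics.mean([y for _, y in rows])
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_abs_error(model, rows):
    """Validation: average absolute gap between predicted and actual values."""
    intercept, slope = model
    return statistics.mean([abs(y - (intercept + slope * x)) for x, y in rows])


summary = explore(RAW)
clean = prepare(RAW)
train, test = clean[:-1], clean[-1:]  # hold out the last row as a test scenario
model = fit_line(train)
error = mean_abs_error(model, test)
```

The held-out row plays the role of the test scenario: the model never sees it during fitting, so the error measured on it gives stakeholders a simple figure to assess against the objectives.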