Data Mining

Data mining is a technique for extracting useful information from raw data, and it falls under the broader field of data science. In a DBMS, we store and retrieve data as required by using database queries.
Learners often ask: if queries already retrieve the data, what is the use of data mining?
A database query can only retrieve data that was stored by the user. When we must analyze the data and decide which parts are useful and which are not, queries alone are not enough. Data mining handles this situation: by analyzing past data, it can predict future data. It also lets us extract the essence of the available information and generate reports, views, or summaries of the data for better decision making.
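
The contrast between retrieving stored data and mining it can be illustrated with a minimal sketch. The purchase records and the function names `query_by_customer` and `frequent_pairs` are hypothetical, made up for this illustration; counting item pairs is only one simple form of pattern discovery, not a complete data mining method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records: (customer, items bought together).
transactions = [
    ("ann", {"bread", "butter", "milk"}),
    ("bob", {"bread", "butter"}),
    ("cat", {"milk", "tea"}),
    ("dan", {"bread", "butter", "tea"}),
]

def query_by_customer(name):
    """Query-style retrieval: return exactly what was stored for one customer."""
    return [items for customer, items in transactions if customer == name]

def frequent_pairs(min_count):
    """Mining-style analysis: find item pairs bought together at least min_count times."""
    counts = Counter()
    for _, items in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

print(query_by_customer("bob"))
print(frequent_pairs(min_count=3))  # {('bread', 'butter'): 3}
```

The query returns only what was recorded for one customer, while the pair count surfaces a relationship (bread and butter sell together) that no single stored record states.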

For example, Google shows recommendations related to what we search for most often. It analyzes the topics the user engages with most and predicts the user’s interests from the browsing history. This job is done by data mining. Data mining is also known as KDD (Knowledge Discovery in Databases), which refers to the broad process of discovering knowledge in data and emphasizes the high-level application of specific data mining techniques.


Process of Data Mining


Data mining is an iterative process that goes through the following phases, laid down by the Cross-Industry Standard Process for Data Mining (CRISP-DM) process model:

Problem Definition

The problem definition comes first. Business aims and objectives are determined from an analysis of the current background, which helps guide future direction.

Data Exploration

The required data is collected and analyzed using various statistical methods, with the current underlying problems taken into account.
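
As a rough sketch of what exploring data with statistical methods can look like, the snippet below summarizes one numeric column. The `monthly_sales` values and the `summarize` helper are hypothetical and exist only for this illustration; real exploration would cover many more columns and checks.

```python
import statistics

# Hypothetical monthly sales figures; None marks a missing reading.
monthly_sales = [120.0, 135.0, None, 128.0, 410.0, 131.0]

def summarize(values):
    """Return basic descriptive statistics, counting missing values separately."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "stdev": statistics.stdev(present),
    }

summary = summarize(monthly_sales)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mean sits well above the median, which hints that a single value (410.0) may be an outlier worth investigating before modeling.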

Data Preparation

The raw data is cleansed and formatted so that it is ready for modeling. The meaning of the data is not changed during preparation.
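
A minimal sketch of cleansing and formatting, assuming records arrive as dictionaries with `name`, `city`, and `age` fields (all hypothetical). The `prepare` helper trims whitespace, normalizes letter case and types, and drops incomplete or duplicate rows. It changes only the form of the values, not their meaning.

```python
def prepare(records):
    """Cleanse raw records without changing what they mean."""
    cleaned = []
    seen = set()
    for rec in records:
        # Normalize text: strip stray whitespace and unify letter case.
        name = (rec.get("name") or "").strip().title()
        city = (rec.get("city") or "").strip().title()
        age = rec.get("age")
        # Drop rows that lack a required field.
        if not name or age is None:
            continue
        age = int(age)  # Unify the type: "34" and 34 become the same value.
        key = (name, city, age)
        # Drop rows that duplicate an earlier one after normalization.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "city": city, "age": age})
    return cleaned

# Hypothetical raw input with inconsistent formatting.
raw = [
    {"name": "  alice ", "city": "PARIS", "age": "34"},
    {"name": "Alice", "city": "paris", "age": 34},
    {"name": "bob", "city": "Rome", "age": None},
    {"name": "carol", "city": " oslo", "age": 29},
]

prepared = prepare(raw)
print(prepared)
```

The two Alice rows collapse into one because they describe the same person once the formatting differences are removed, and the row with no age is left out rather than filled with a guess.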

Modeling

A data model is prepared using mathematical functions (which help predict data from the existing raw data) and modeling techniques. After the model is created, it undergoes validation and verification.
In this phase, mathematical models are used to determine data patterns.
  • Based on the business objectives, select suitable modeling techniques for the prepared data set.
  • Create a scenario to test the quality and validity of the model, then run the model on the prepared data set.
  • Have all stakeholders assess the results to make sure the model can meet the data mining objectives.
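
The modeling steps above can be sketched with a very small example. It assumes a straight-line (least-squares) model is a suitable technique, which is an illustrative choice rather than one the process prescribes. The `spend` and `units` figures, the 6/2 train/test split, and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_absolute_error(slope, intercept, xs, ys):
    """Average absolute gap between predicted and actual values."""
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical prepared data: advertising spend versus units sold.
spend = [1, 2, 3, 4, 5, 6, 7, 8]
units = [12, 14, 17, 19, 22, 24, 27, 29]

# Test scenario: fit on the first six points, validate on the last two.
train_x, test_x = spend[:6], spend[6:]
train_y, test_y = units[:6], units[6:]

slope, intercept = fit_line(train_x, train_y)
test_error = mean_absolute_error(slope, intercept, test_x, test_y)

print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"mean absolute error on held-out data: {test_error:.3f}")
```

Holding back part of the data gives a simple check of model quality: the error is measured on points the model never saw during fitting, and that figure is what stakeholders would compare against the data mining objectives.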

Evaluation

After the model is created, a team of experts verifies whether it satisfies the business objectives.

Deployment

After evaluation, the model is deployed, and plans are made for its ongoing maintenance. Finally, a well-organized report is prepared with a summary of the work done.