Data mining is an iterative process that aims to discover and identify relations in a data set or data stream, through manual or automatic methods. This analysis is divided into two types of activity: predictive analysis and descriptive analysis.
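The two activities can be illustrated on a toy data set. The following is only a minimal sketch under stated assumptions: the records, the function names `describe` and `fit_line`, and the choice of a least-squares line as the predictive model are all hypothetical, chosen for illustration. The descriptive step summarises patterns already present in the data, while the predictive step returns something executable that can be applied to unseen inputs.

```python
from statistics import mean

# Hypothetical toy data: (advertising spend, units sold) for five months.
records = [(1.0, 12.0), (2.0, 19.0), (3.0, 29.0), (4.0, 37.0), (5.0, 45.0)]


def describe(rows):
    """Descriptive analysis: summarise patterns already present in the data."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    # Pearson correlation: how strongly the two columns move together.
    return {"mean_x": mx, "mean_y": my, "correlation": cov / (var_x * var_y) ** 0.5}


def fit_line(rows):
    """Predictive analysis: fit y = slope * x + intercept by least squares
    and return the fitted model as a callable."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


summary = describe(records)   # what the data looks like
model = fit_line(records)     # what the data lets us predict

print(summary)
print(model(6.0))             # estimate for an unseen spend level
```

Both functions read the same records and neither replaces the other: the correlation in `summary` is what suggests that a straight-line model is a sensible thing to fit in the first place.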
It is wrong to think that one activity is better than the other. In fact, the results of the two are complementary and serve the same goal. While descriptive analysis tries to find patterns and other new information, predictive analysis produces an executable model in the form of code, useful for the prediction, estimation and identification of a process. In a nutshell, data mining is the activity carried out on big data to make it intelligible to everyone, and to extract from it the predictive information useful to those who requested it. The main data mining techniques are:
Data mining can be seen as the union of two disciplines: statistics and machine learning. It can be defined as the process of discovering models and descriptions from a data set. Such a process cannot consist of applying machine-learning methods and statistical tools at random. Instead, it must be well planned and structured, so that it is useful and fully descriptive of the system being examined. This information extraction plan usually follows a five-step experimental procedure:
These phases are not independent: the data mining process necessarily involves an iterative approach. By observing the results obtained from a given phase, the data set can be processed again so as to solve the problem at hand.
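The feedback between phases can be sketched as a loop. This is a hedged illustration, not the procedure itself: the three stages used here (prepare, model, evaluate) are simplified stand-ins, and the helper names `fit_line` and `mine`, the residual `tolerance`, and the toy data are all assumptions. The result of the evaluation stage decides whether the data set is processed again.

```python
from statistics import mean


def fit_line(rows):
    """Modelling phase: least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    mx = mean(xs)
    my = mean(y for _, y in rows)
    slope = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def mine(rows, tolerance, max_rounds=10):
    """Repeat model -> evaluate -> re-prepare until no row is rejected."""
    for round_no in range(1, max_rounds + 1):
        slope, intercept = fit_line(rows)
        # Evaluation phase: keep only rows the current model explains well.
        kept = [(x, y) for x, y in rows
                if abs(y - (slope * x + intercept)) <= tolerance]
        # Stop when nothing was rejected, or when too few rows would remain.
        if len(kept) == len(rows) or len(kept) < 3:
            return slope, intercept, rows, round_no
        # Back to the preparation phase with the cleaned data set.
        rows = kept
    slope, intercept = fit_line(rows)
    return slope, intercept, rows, max_rounds


# Hypothetical data on the line y = 2x + 1, with one corrupted reading at x = 3.
raw = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 40.0), (4.0, 9.0), (5.0, 11.0)]
slope, intercept, kept, rounds = mine(raw, tolerance=10.0)
print(slope, intercept, len(kept), rounds)
```

Dropping rows by residual threshold is only one possible reaction to a poor result; in practice the feedback might equally lead to collecting more data or choosing a different model.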
WHY IS DATA MINING IMPORTANT?
The fields of application of data mining are countless, but they can be grouped into a few macro-categories. Below is a list of the main fields and the advantages that data mining can bring to each of them.