A quantitative methodology to identify related features in data sets
Description
This paper develops a methodology that quantifies the dependence between features in a data set. The methodology is based on the Ameva discretization algorithm; in particular, it uses the Ameva coefficient to quantify the dependence. In addition, a new coefficient, called entropy, is proposed for cases where the Ameva discretization algorithm cannot be applied. Several interdependence matrices are then built, each providing a degree of dependence between two features. Finally, to verify the qualities of the methodology, a simple feature-discarding method based on it is applied to a well-known data set in a classification process, with promising results for the resulting system.
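As a rough illustration of how such an interdependence matrix might be assembled, the sketch below computes a chi-square-based coefficient for every ordered pair of already-discretized features. It assumes the commonly cited form of the Ameva coefficient, chi-square divided by k(l - 1), where k is the number of intervals of one feature and l the number of distinct values of the other, which is treated as the class. The function names `ameva_coefficient` and `dependence_matrix`, the toy data, and the choice to treat the second feature as the class are hypothetical; the record does not describe the paper's exact procedure, and the entropy coefficient is not covered here.

```python
from collections import Counter


def ameva_coefficient(intervals, labels):
    """Ameva-style coefficient between two discrete sequences (a sketch).

    intervals: interval index of a discretized feature, one per sample.
    labels:    discrete value of the other feature, treated as the class.
    Assumed formula: chi2 / (k * (l - 1)), with
    chi2 = N * (sum_ij n_ij^2 / (n_i. * n_.j) - 1).
    """
    if len(intervals) != len(labels):
        raise ValueError("both sequences must have the same length")
    n = len(intervals)
    rows = Counter(intervals)               # n_i. : samples per interval
    cols = Counter(labels)                  # n_.j : samples per class value
    joint = Counter(zip(intervals, labels)) # n_ij : joint counts
    k, l = len(rows), len(cols)
    if n == 0 or l < 2:
        return 0.0  # coefficient is undefined here; 0.0 is a convention of this sketch
    total = sum(c * c / (rows[i] * cols[j]) for (i, j), c in joint.items())
    chi2 = n * (total - 1.0)
    return chi2 / (k * (l - 1))


def dependence_matrix(columns):
    """Coefficient for every ordered pair of distinct features.

    columns: dict mapping a feature name to its list of discrete values.
    The result is generally not symmetric, because k and l swap roles.
    """
    names = list(columns)
    return {
        a: {b: ameva_coefficient(columns[a], columns[b]) for b in names if b != a}
        for a in names
    }


# Toy data (hypothetical): f2 mirrors f1, f3 is unrelated to f1.
toy = {
    "f1": [0, 0, 1, 1],
    "f2": ["a", "a", "b", "b"],
    "f3": ["x", "y", "x", "y"],
}
matrix = dependence_matrix(toy)
```

In this toy example the entry for the pair (f1, f2) is large because the two features determine each other, while the entry for (f1, f3) is zero. A feature-discarding rule could then threshold such entries, although the threshold and rule used in the paper are not stated in this record.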
Abstract
Ministerio de Ciencia e Innovación TIN2009-14378-C02-01
Additional details
- URL
- https://idus.us.es/handle//11441/146581
- URN
- urn:oai:idus.us.es:11441/146581
- Origin repository
- USE