Low Dimensionality or Same Subsets as a Result of Feature Selection: An In-Depth Roadmap
Description
This paper addresses a situation that can arise after feature subset selection: the number of selected features is very small, or different algorithms return the same subset. The data mining community has long worked under the assumption that meaningful attributes are either highly correlated with the class or form a consistent subset, that is, one with no inconsistencies. We analysed around a hundred varied data sets, each with fewer than one hundred attributes, no more than fifty thousand instances and fewer than fifty classes. In a first round, we applied two different feature subset selection methods and recorded the resulting reduction in dimensionality. We then divided the data sets into groups according to the number of selected attributes. Next, we analysed each group in more depth and added a further feature selection procedure. Finally, we assessed the performance of the original problem and of the reduced subsets with four classifiers, and we outline some prospective directions.
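To make the notion of a "consistent subset" concrete, the following minimal Python sketch computes an inconsistency rate for a candidate feature subset on discrete data. It is an illustration only: the record does not say which consistency measure or selection algorithms the paper uses, so the function name `inconsistency_rate`, the majority-class definition of inconsistency, and the toy data are all assumptions.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Share of instances that disagree with the majority class of their pattern.

    Hypothetical helper: `subset` lists the column indices kept by a selector.
    Two instances are inconsistent when they match on every kept column
    but carry different class labels.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)  # projection onto the subset
        groups[pattern][label] += 1
    # Within each pattern, everything outside the majority class is inconsistent.
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


# Toy data (assumed): three discrete attributes, binary class.
rows = [
    ("sunny", "hot", "a"),
    ("sunny", "mild", "b"),
    ("rainy", "hot", "a"),
    ("rainy", "mild", "b"),
]
labels = ["no", "no", "yes", "yes"]

rate_first = inconsistency_rate(rows, labels, [0])   # attribute 0 alone
rate_second = inconsistency_rate(rows, labels, [1])  # attribute 1 alone
```

On this toy data the first attribute alone already separates the classes, so a consistency-driven selector could stop at a single feature. That is one way the very low dimensionality discussed above can come about.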
Abstract
Comisión Interministerial de Ciencia y Tecnología TIN2014-55894-C2-R
Abstract
Junta de Andalucía P11-TIC-7528
Additional details
- URL
- https://idus.us.es/handle//11441/145549
- URN
- urn:oai:idus.us.es:11441/145549
- Origin repository
- USE