No description
-
March 28, 2022 (v1)Publication
Uploaded on: December 4, 2022
-
May 3, 2023 (v1)Publication
Progress in digital data acquisition and storage technology has resulted in the growth of huge databases. Nevertheless, these techniques often have a high computational cost. It is therefore advisable to apply a preprocessing phase to reduce the time complexity. These preprocessing techniques are fundamentally oriented toward one of the following goals:...
Uploaded on: May 4, 2023 -
July 6, 2016 (v1)Publication
Gene expression microarray is a rapidly maturing technology that provides the opportunity to assay the expression levels of thousands or tens of thousands of genes in a single experiment. We present a new heuristic to select relevant gene subsets in order to further use them for the classification task. Our method is based on the statistical...
Uploaded on: March 27, 2023 -
May 24, 2022 (v1)Publication
The problem of protein structure prediction (PSP) is one of the main challenges in structural bioinformatics. To tackle this problem, PSP can be divided into several subproblems. One of these subproblems is the prediction of disulfide bonds. The disulfide connectivity prediction problem consists in identifying which nonadjacent cysteines...
Uploaded on: December 5, 2022 -
May 8, 2023 (v1)Publication
This paper describes an approach based on evolutionary algorithms, HIDER (HIerarchical DEcision Rules), for learning rules in continuous and discrete domains. The algorithm produces a hierarchical set of rules, that is, the rules are obtained sequentially and must therefore be tried in order until one is found whose conditions are satisfied. In...
Uploaded on: May 10, 2023 -
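The ordered evaluation described in this abstract (rules tried in sequence until one matches) can be illustrated with a minimal sketch. It shows only how a hierarchical rule set might be applied to an example, not how HIDER learns the rules; the function `classify`, the attributes `a` and `b`, the labels, and the default label are all hypothetical.

```python
def classify(rules, default_label, example):
    """Try each (condition, label) pair in order and return the first match."""
    for condition, label in rules:
        if condition(example):
            return label
    # No rule fired: fall back to a default label (an assumption of this sketch).
    return default_label


# Hypothetical hierarchical rule set mixing a continuous and a discrete attribute.
rules = [
    (lambda x: x["a"] >= 5.0, "A"),
    (lambda x: x["b"] == "red", "B"),
]

first = classify(rules, "C", {"a": 6.0, "b": "red"})    # first rule fires
second = classify(rules, "C", {"a": 1.0, "b": "red"})   # falls through to second rule
third = classify(rules, "C", {"a": 1.0, "b": "blue"})   # no rule fires
```

Because the rules are ordered, the second rule is only reached by examples that the first rule rejects, so each rule implicitly carries the negation of all earlier conditions.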
May 9, 2023 (v1)Publication
Attribute selection techniques for supervised learning, used in the preprocessing phase to emphasize the most relevant attributes, make classification models simpler and easier to understand. Depending on the method to apply: starting point, search organization, evaluation strategy, and the stopping criterion, there is an added...
Uploaded on: May 11, 2023 -
April 7, 2016 (v1)Publication
In this work, we suggest a new feature selection technique that lets us use the wrapper approach for finding a well-suited feature set for distinguishing experiment classes in high-dimensional data sets. Our method is based on the relevance and redundancy idea, in the sense that a ranked feature is chosen if additional information is gained by...
Uploaded on: March 27, 2023 -
March 31, 2016 (v1)Publication
Attribute selection techniques for supervised learning, used in the preprocessing phase to emphasize the most relevant attributes, make classification models simpler and easier to understand. The algorithm (SOAP: Selection of Attributes by Projection) has some interesting characteristics: lower computational cost (O(m n log n) m...
Uploaded on: March 27, 2023 -
April 7, 2016 (v1)Publication
We propose a new feature selection criterion not based on calculated measures between attributes, or complex and costly distance calculations. Applying a wrapper to the output of a new attribute ranking method, we obtain a minimum subset with the same error rate as the original data. The experiments were compared to two other algorithms with...
Uploaded on: March 27, 2023 -
March 1, 2022 (v1)Publication
Traditional gene selection methods often select the top-ranked genes according to their individual discriminative power. We propose applying feature evaluation measures that are widely used in the machine learning field but less popular in the DNA microarray field. In addition, the application of sequential gene subset selection approaches is included. In...
Uploaded on: December 4, 2022 -
March 30, 2016 (v1)Publication
Attribute selection techniques for supervised learning, used in the preprocessing phase to emphasize the most relevant attributes, make classification models simpler and easier to understand. Depending on the method to apply: starting point, search organization, evaluation strategy, and the stopping criterion, there is an added...
Uploaded on: December 4, 2022 -
May 3, 2023 (v1)Publication
The enormous increase in the size of databases makes finding an optimal subset of features extremely difficult. In this paper, a new feature selection method is proposed that will allow any subset evaluator, including the wrapper evaluation method, to be used to find a group of features that will allow a distinction to be made between the...
Uploaded on: May 4, 2023 -
May 4, 2023 (v1)Publication
Comisión Interministerial de Ciencia y Tecnología TIN2004-06689-C03-03
Uploaded on: May 5, 2023 -
April 1, 2016 (v1)Publication
In this paper, we propose a new feature selection criterion. It is based on the projections of data set elements onto each attribute. The main advantages are its speed and simplicity in the evaluation of the attributes. The measure allows features to be sorted in ascending order of importance in the definition of the class. In order to test the...
Uploaded on: March 27, 2023 -
March 31, 2016 (v1)Publication
Attribute selection techniques for supervised learning, used in the preprocessing phase to emphasize the most relevant attributes, make classification models simpler and easier to understand. The algorithm has some interesting characteristics: lower computational cost (O(m n log n), with m attributes and n examples in the data set) with...
Uploaded on: December 4, 2022 -
July 11, 2016 (v1)Publication
Data mining methods in software engineering are becoming increasingly important as they can support several aspects of the software development life-cycle such as quality. In this work, we present a data mining approach to induce rules extracted from static software metrics characterising fault-prone modules. Due to the special characteristics...
Uploaded on: December 4, 2022 -
May 3, 2023 (v1)Publication
The application of data mining methods to software engineering is of growing importance for various aspects of the software life cycle. In this work, we present a methodology for inducing rules that allow us to establish which metrics and thresholds characterize the appearance of faulty modules....
Uploaded on: May 4, 2023 -
March 1, 2022 (v1)Publication
The majority of biclustering approaches for microarray data analysis use the Mean Squared Residue (MSR) as the main evaluation measure for guiding the heuristic. MSR has been proven to be inefficient at recognizing several kinds of interesting patterns in biclusters. Transposed Virtual Error (VEt) has recently been discovered to overcome...
Uploaded on: March 25, 2023 -
March 8, 2022 (v1)Publication
Interest in extracting useful knowledge from gene expression data has grown enormously in recent years with the development of microarrays. Biclustering techniques are applied to obtain subsets of genes that are expressed similarly under certain conditions in a microarray. One way to measure...
Uploaded on: March 25, 2023 -
March 3, 2022 (v1)Publication
The most widespread biclustering algorithms use the Mean Squared Residue (MSR) as the measure for assessing the quality of biclusters. MSR can correctly identify shifting patterns, but fails to discover biclusters presenting scaling patterns. Virtual Error (VE) is a measure that improves on the performance of MSR in this sense, since it is...
Uploaded on: December 4, 2022 -
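The claim about shifting versus scaling patterns can be checked with a small sketch. It assumes the standard definition of MSR (the mean of squared residues, where each residue is the cell value minus its row mean, minus its column mean, plus the overall mean); the function name and the toy data are illustrative, and Virtual Error itself is not implemented here.

```python
def mean_squared_residue(rows):
    """MSR of a bicluster given as a list of equal-length rows of numbers."""
    n_rows, n_cols = len(rows), len(rows[0])
    row_means = [sum(r) / n_cols for r in rows]
    col_means = [sum(r[j] for r in rows) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            residue = value - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


base = [1.0, 2.0, 4.0]  # toy expression profile (illustrative values)

# Shifting pattern: every row is the base profile plus a constant.
shifting = [[v + s for v in base] for s in (0.0, 3.0, 5.0)]
# Scaling pattern: every row is the base profile times a constant.
scaling = [[v * k for v in base] for k in (1.0, 2.0, 3.0)]

msr_shifting = mean_squared_residue(shifting)  # zero up to rounding error
msr_scaling = mean_squared_residue(scaling)    # clearly positive
```

Under this definition a perfect shifting bicluster scores essentially zero, while a perfect scaling bicluster does not, which is the limitation of MSR that the abstract describes.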
March 8, 2022 (v1)Publication
In recent years, interest in extracting useful knowledge from gene expression data has grown enormously with the development of microarray techniques. Biclustering is a recent technique that aims at extracting a subset of genes that show similar behaviour for a subset of conditions. It is important, therefore, to measure the...
Uploaded on: December 4, 2022