Optimization of multi-classifiers for computational biology: application to gene finding and expression

Creators: Romero Zaliz, Rocío; Rubio Escudero, Cristina; Zwir, Igor; Val, Coral del

Others: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos; Universidad de Sevilla. TIC-254: Data Science and Big Data Lab; Ministerio de Ciencia y Tecnología (MCYT). España; Junta de Andalucía

Description

The genomes of many organisms have been sequenced over the last few years. However, transforming such raw sequence data into knowledge remains a hard task. A great number of prediction programs have been developed to address part of this problem: the location of genes along a genome and their expression. We propose a multi-objective methodology that combines state-of-the-art algorithms into an aggregation scheme in order to obtain optimal aggregations of methods. The results show a major improvement in sensitivity when our methodology is compared with the individual methods on gene finding and gene expression problems. The proposed methodology is an automatic method generator and a step toward exploiting all existing methods: it provides alternative optimal aggregations of methods that answer concrete queries about a given biological problem with maximized prediction accuracy. As more approaches are integrated for each of the presented problems, de novo accuracy can be expected to improve further.

Funding

Ministerio de Ciencia y Tecnología TIN2006-12879

Junta de Andalucía TIC-02788
