Published July 7, 2016
| Version v1
Publication
Inferring gene regression networks with model trees
Description
Background: Novel strategies are required to handle the huge amount of data produced by microarray
technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between
genes, building the so-called gene co-expression networks. These networks are typically generated using correlation
statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two
genes have a strong global similarity, but they do not detect local similarities.
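The distinction between global and local similarity can be illustrated with a small sketch. The two expression profiles below are hypothetical and chosen only for illustration: `gene_b` depends perfectly on `gene_a` within each half of the range, yet the Pearson correlation over the whole range is close to zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression profiles (not taken from the paper's data sets).
gene_a = [i / 10 for i in range(-10, 11)]   # -1.0, -0.9, ..., 1.0
gene_b = [abs(a) for a in gene_a]           # V-shaped dependence on gene_a

# Global similarity: computed over all samples.
global_r = pearson(gene_a, gene_b)

# Local similarity: computed only where gene_a is positive or negative.
pos = [(a, b) for a, b in zip(gene_a, gene_b) if a > 0]
neg = [(a, b) for a, b in zip(gene_a, gene_b) if a < 0]
local_r_pos = pearson([a for a, _ in pos], [b for _, b in pos])
local_r_neg = pearson([a for a, _ in neg], [b for _, b in neg])

print(f"global r = {global_r:.3f}")
print(f"local r (gene_a > 0) = {local_r_pos:.3f}")
print(f"local r (gene_a < 0) = {local_r_neg:.3f}")
```

A purely correlation-based network would leave this pair unconnected, although the relationship is exact within each region.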
Results: We propose model trees as a method to identify gene interaction networks. While correlation-based
methods analyze each pair of genes, our approach generates a single regression tree for each gene from the
remaining genes. Finally, a graph is built from all the relationships among output and input genes, taking into
account whether the relationship between each pair of genes is statistically significant. To this end, we apply a
statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is
experimentally tested on two well-known data sets: a Saccharomyces cerevisiae and an E. coli data set. First, the
biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database)
is used as a control to compare the results to those of a correlation-based method. This experiment shows that
REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and
first-order correlation-based methods.
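The edge-filtering step can be sketched as follows. The abstract does not name the statistical procedure used to control the false discovery rate, so the Benjamini-Hochberg step-up procedure is assumed here purely for illustration; the gene names, the p-values, and the function name `benjamini_hochberg` are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    by the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at most the cutoff rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Hypothetical candidate edges (output gene, input gene) with p-values.
candidate_edges = {
    ("geneA", "geneB"): 0.001,
    ("geneA", "geneC"): 0.021,
    ("geneB", "geneD"): 0.029,
    ("geneC", "geneD"): 0.200,
    ("geneB", "geneE"): 0.600,
}

edges = list(candidate_edges)
significant = benjamini_hochberg([candidate_edges[e] for e in edges])
network = [e for e, keep in zip(edges, significant) if keep]
print(network)
```

In this example the second edge fails its own rank threshold (0.021 > 0.02) but is still retained, because the third edge passes (0.029 <= 0.03) and the step-up rule keeps every edge ranked at or below the largest passing rank.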
Conclusions: REGNET generates gene association networks from gene expression data, and differs from
correlation-based methods in that the relationship between one gene and the others is calculated simultaneously.
Model trees are very useful techniques for estimating the numerical values of the target genes by linear regression
functions. They are very often more precise than linear regression models because they can fit different
linear regressions to separate areas of the search space, favoring the inference of localized similarities over a
more global similarity. Furthermore, experimental results show the good performance of REGNET.
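The advantage of fitting separate linear regressions to separate regions can be sketched with a deliberately simplified stand-in for a model tree: a single split on one regulator gene, with an ordinary least-squares line in each leaf. This is not the model tree algorithm used by REGNET, and the data, the function names, and the `min_leaf` parameter are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def sse(xs, ys, model):
    """Sum of squared errors of a fitted line on the given points."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

def best_single_split(xs, ys, min_leaf=3):
    """Try every split point and fit one line per side; return the
    (total SSE, threshold) of the split with the smallest total SSE."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        left_x, left_y = zip(*pairs[:i])
        right_x, right_y = zip(*pairs[i:])
        total = (sse(left_x, left_y, fit_line(left_x, left_y))
                 + sse(right_x, right_y, fit_line(right_x, right_y)))
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or total < best[0]:
            best = (total, threshold)
    return best

# Hypothetical data: the target gene rises with the regulator up to 1.0
# and falls afterwards, so no single line describes the whole range.
regulator = [i / 10 for i in range(21)]                     # 0.0 .. 2.0
target = [x if x < 1 else 3 - 2 * x for x in regulator]

global_sse = sse(regulator, target, fit_line(regulator, target))
split_sse, split_threshold = best_single_split(regulator, target)

print(f"single linear regression SSE: {global_sse:.3f}")
print(f"two-leaf piecewise fit SSE:   {split_sse:.3e} (split near {split_threshold:.2f})")
```

The single regression leaves a large residual error, while the two-leaf fit recovers both local linear relationships almost exactly.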
Abstract
Ministerio de Ciencia e Innovación TIN2011-68084-C02-00
Ministerio de Ciencia e Innovación PCI2006-A7-0575
Junta de Andalucía P07-TIC-02611
Junta de Andalucía TIC-200
Additional details
Identifiers
- URL: https://idus.us.es/handle/11441/43335
- URN: urn:oai:idus.us.es:11441/43335
Origin repository
- Origin repository
- USE