Published March 9, 2022 | Version v1
Publication

Improved biclustering on expression data through overlapping control

Description

Purpose – The purpose of this paper is to present a novel control mechanism for avoiding overlapping among biclusters in expression data. Design/methodology/approach – Biclustering is a technique used in analysis of microarray data. One of the most popular biclustering algorithms is introduced by Cheng and Church (2000) (Ch&Ch). Even if this heuristic is successful at finding interesting biclusters, it presents several drawbacks. The main shortcoming is that it introduces random values in the expression matrix to control the overlapping. The overlapping control method presented in this paper is based on a matrix of weights, that is used to estimate the overlapping of a bicluster with already found ones. In this way, the algorithm is always working on real data and so the biclusters it discovers contain only original data. Findings – The paper shows that the original algorithm wrongly estimates the quality of the biclusters after some iterations, due to random values that it introduces. The empirical results show that the proposed approach is effective in order to improve the heuristic. It is also important to highlight that many interesting biclusters found by using our approach would have not been obtained using the original algorithm. Originality/value – The original algorithm proposed by Ch&Ch is one of the most successful algorithms for discovering biclusters in microarray data. However, it presents some limitations, the most relevant being the substitution phase adopted in order to avoid overlapping among biclusters. The modified version of the algorithm proposed in this paper improves the original one, as proven in the experimentation.

Abstract

Ministerio de Ciencia y Tecnología TIN2007-68084-C02- 00

Additional details

Created:
March 25, 2023
Modified:
November 28, 2023