Published May 20, 2022
| Version v1
Publication
Contact map prediction using a large-scale ensemble of rule sets and the fusion of multiple predicted structural features
Description
Motivation: The prediction of a protein's contact map has become,
in recent years, a crucial stepping stone for the prediction of the
complete 3D structure of a protein. In this article, we describe a
methodology for this problem that was shown to be successful in
CASP8 and CASP9. The methodology is based on (i) the fusion of the
prediction of a variety of structural aspects of protein residues, (ii)
an ensemble strategy used to facilitate the training process and
(iii) a rule-based machine learning system from which we can
extract human-readable explanations of the predictor and derive
useful information about the contact map representation.
Results: The main part of the evaluation is the comparison against
the sequence-based contact prediction methods from CASP9,
where our method achieved the best rank in five out of the six
evaluated metrics. We also assess the impact of the size of the
ensemble used in our predictor to show the trade-off between
performance and training time of our method. Finally, we study
the rule sets generated by our machine learning system. From this
analysis, we are able to estimate the contribution of the attributes in
our representation and how these interact to derive contact
predictions.
Additional details
Identifiers
- URL
- https://idus.us.es/handle//11441/133498
- URN
- urn:oai:idus.us.es:11441/133498
Origin repository
- Origin repository
- USE