Learning a confidence score and the latent space of a new supervised autoencoder for diagnosis and prognosis in clinical metabolomic studies

Creators: Chardin, David; Gille, Cyprien; Pourcher, Thierry; Humbert, Olivier; Barlaud, Michel

Others: Université Côte d'Azur (UCA); Centre de Lutte contre le Cancer Antoine Lacassagne [Nice] (UNICANCER/CAL); UNICANCER-Université Côte d'Azur (UCA); UMR E4320 (TIRO-MATOs); Université Nice Sophia Antipolis (1965 - 2019) (UNS); COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Côte d'Azur (UCA); ANR-19-P3IA-0002,3IA@cote d'azur,3IA Côte d'Azur(2019)

Description

Abstract

Background: Presently, there is a wide variety of classification methods and deep neural network approaches in bioinformatics. Deep neural networks have proven effective for classification tasks and have outperformed classical methods, but they suffer from a lack of interpretability. These methods are therefore not appropriate for decision support systems in healthcare. Indeed, to allow clinicians to make informed and well-thought-out decisions, the algorithm should provide the main pieces of information used to compute the predicted diagnosis and/or prognosis, as well as a confidence score for this prediction.

Methods: Herein, we used a new supervised autoencoder (SAE) approach for the classification of clinical metabolomic data. This new method provides a confidence score for each prediction, thanks to a softmax classifier, together with a meaningful latent space visualization. It also includes a new, efficient feature selection method with a structured constraint, which allows for biologically interpretable results.

Results: Experimental results on three metabolomics datasets of clinical samples illustrate the effectiveness of our SAE and its confidence score. The supervised autoencoder provides an accurate localization of the patients in the latent space and an efficient confidence score. Experiments show that the SAE outperforms classical methods (PLS-DA, Random Forests, SVM, and neural networks (NN)). Furthermore, the metabolites selected by the SAE were found to be biologically relevant.

Conclusion: In this paper, we describe a new, efficient SAE method to support diagnostic or prognostic evaluation based on metabolomics analyses.
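The abstract attributes the per-prediction confidence score to a softmax classifier but does not give its exact definition. The following is only a minimal sketch, assuming the score is the largest softmax probability over the classifier's outputs; the function names and the example values are hypothetical and are not taken from the authors' code.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw class scores."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits: List[float]) -> Tuple[int, float]:
    """Return (predicted class index, confidence).

    Assumption: confidence is the maximum softmax probability.
    """
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


# Hypothetical classifier outputs for two patients in a two-class problem.
confident_label, confident_score = predict_with_confidence([4.0, -1.0])
uncertain_label, uncertain_score = predict_with_confidence([0.2, 0.1])
```

Under this assumption, a score close to 1 means the classifier strongly favors one class, while a score close to 0.5 in a two-class setting flags a prediction that a clinician may want to treat with caution.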

International audience
