Two-stage computational bio-network discovery approach for metabolites: ovarian cancer as a case study.
Abstract
Machine learning and other computational techniques have been applied to identify biomarkers and to construct predictive models for the early diagnosis of ovarian cancer. Most studies focus on large biopolymers such as DNA, RNA, and proteins, whereas small metabolic molecules have received significantly less attention. In addition, studies have focused only on the classification performance of biomarkers chosen by various feature selection methods and do not consider possible temporal relationships among feature subsets. In this paper, we propose a two-stage bio-network discovery approach for ovarian cancer metabolites. In the first stage, feature selection is carried out using four different selection methods, and the feature subset with the best overall classification performance is retained. In the second stage, a Dynamic Bayesian Network (DBN) is used to model the temporal relationships among the stratified features. The results show that 39 of the 592 metabolomics features, selected by the Least Absolute Shrinkage and Selection Operator (LASSO) feature selection method, yielded the highest predictive accuracy of 93%. Two DBN methods are then used to model the temporal relationships among these 39 features. The results show consistently significant relationships between the cancer biomarkers at features 219 and 225, and between features 543 and 219, across time points. Among these metabolites, around 20 metabolic compounds have been identified, which could be regarded as potential biomarkers associated with ovarian cancer.
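The first stage described above relies on LASSO's ability to drive the weights of uninformative features to exactly zero. The following is a minimal sketch of that idea only, not the implementation used in the paper: it assumes LASSO is fitted as a linear model on ±1 class labels by cyclic coordinate descent, and it runs on synthetic data. The function names, the toy data generator, and the penalty value `lam=0.3` are all hypothetical choices made for illustration.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=50):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    X is a list of rows (samples); no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes j's own term.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated weight.
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_wj
    return w


def make_toy_data(n_samples=400, n_features=20, seed=0):
    """Hypothetical stand-in data: only features 2 and 7 determine the label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [1.0 if row[2] - row[7] > 0 else -1.0 for row in X]
    return X, y


X, y = make_toy_data()
weights = lasso_coordinate_descent(X, y, lam=0.3)
# Features with a non-zero weight are the ones LASSO retains.
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)
```

In a setting like the one in the abstract, the retained indices would then be passed to a classifier to measure predictive accuracy, and the penalty would normally be tuned by cross-validation rather than fixed by hand.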