In Silico Discovery of Significant Pathways in Colorectal Cancer Metastasis Using a Two-Stage Optimization Approach
Abstract
Accurate and reliable modelling of protein-protein interaction networks for complex diseases such as colorectal cancer can help researchers better understand disease mechanisms and potentially discover new drugs. Machine learning methods such as Empirical Mode Decomposition combined with Least Squares Support Vector Machine, and the Discrete Fourier Transform (DFT), have been widely utilised as classifiers and for the automatic discovery of biomarkers for diagnosis of the disease. The existing methods are, however, less efficient because they tend to ignore interactions with the classifier. In this study, we propose a two-stage optimization approach to effectively select biomarkers and discover interactions among them. At the first stage, Particle Swarm Optimization (PSO) and Differential Evolution (DE) are used to optimize the parameters of the Support Vector Machine Recursive Feature Elimination (SVM-RFE) algorithm; at the second stage, a Dynamic Bayesian Network is used to predict temporal relationships between biomarkers across two time points. Results show that the 18 and 25 biomarkers selected by the PSO-based and DE-based approaches, respectively, yield the same accuracy of 97.3% and F1-scores of 97.7% and 97.6%, respectively. The stratified analysis reveals that Alpha-2-HS-glycoprotein is a dominant hub gene with multiple interactions with other genes, including Fibrinogen alpha chain, which is also a potential biomarker for colorectal cancer.
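To make the first stage more concrete, the following is a minimal sketch of a basic global-best PSO loop, written in plain Python. It is an illustration only, not the study's implementation: the function names (`pso_minimize`, `toy_cv_error`), the swarm settings, the search bounds, and the objective are all assumptions. In particular, `toy_cv_error` is a hypothetical stand-in for the cross-validated error of an SVM-RFE model; in practice that objective would train and evaluate the classifier for each candidate parameter vector.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Positions start uniformly inside the bounds; velocities start at zero.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_cv_error(params):
    """Hypothetical objective: a smooth bowl with an arbitrary minimum.

    Stands in for the cross-validated error of SVM-RFE as a function of
    (log10 of the SVM penalty C, number of features to keep).
    """
    log_c, n_features = params
    return (log_c - 1.0) ** 2 + ((n_features - 20.0) / 10.0) ** 2


best_params, best_err = pso_minimize(
    toy_cv_error, bounds=[(-3.0, 3.0), (1.0, 50.0)])
print(best_params, best_err)
```

A DE-based variant would keep the same objective and bounds and replace only the update rule, which is why the two optimizers can be compared on equal terms.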