Improving the drug discovery process by using multiple classifier systems

Date

2019-03-01

Publisher

Elsevier

Type

Article

Peer reviewed

Yes

Abstract

Machine learning methods have become an indispensable tool for utilizing large knowledge and data repositories in science and technology. In the pharmaceutical domain, the amount of acquired knowledge about the design and synthesis of pharmaceutical agents and bioactive molecules (drugs) is enormous. The primary challenge in automatically discovering new drugs from molecular screening information is the high dimensionality of the datasets, where a wide range of features is included for each candidate drug. Improved techniques that ensure adequate manipulation and interpretation of the data are therefore mandatory. To mitigate this problem, our tool (called D2-MCS) can homogeneously split the dataset into several groups (subsets of features) and subsequently determine the most suitable classifier for each group. Finally, the tool determines the biological activity of each molecule through a voting scheme. The D2-MCS tool was tested on a standardized, high-quality dataset gathered from ChEMBL, and the results show that it outperforms well-known single classification models.
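The three steps named in the abstract (split the features into groups, pick the best classifier per group, combine by voting) can be illustrated with a minimal sketch. This is not the D2-MCS implementation: the round-robin feature split, the two toy classifiers, the hold-out selection criterion, the plain majority vote, and every name below are assumptions chosen only to keep the example self-contained.

```python
from collections import Counter
from statistics import mean


def split_features(n_features, n_groups):
    # Assumption: round-robin split, a stand-in for the paper's homogeneous grouping.
    return [list(range(g, n_features, n_groups)) for g in range(n_groups)]


def project(rows, idx):
    # Keep only the feature columns listed in idx.
    return [[row[i] for i in idx] for row in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            members = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*members)]
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]


class OneNearestNeighbour:
    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        return [self.y[min(range(len(self.X)), key=lambda i: sq_dist(x, self.X[i]))]
                for x in X]


def accuracy(model, X, y):
    return sum(p == t for p, t in zip(model.predict(X), y)) / len(y)


def train_mcs(X_train, y_train, X_val, y_val, n_groups):
    # For each feature group, keep the candidate with the best hold-out accuracy.
    ensemble = []
    for idx in split_features(len(X_train[0]), n_groups):
        Xt, Xv = project(X_train, idx), project(X_val, idx)
        candidates = [NearestCentroid().fit(Xt, y_train),
                      OneNearestNeighbour().fit(Xt, y_train)]
        best = max(candidates, key=lambda m: accuracy(m, Xv, y_val))
        ensemble.append((idx, best))
    return ensemble


def predict_mcs(ensemble, X):
    # Each group's classifier casts one vote per sample; the majority label wins.
    votes = [model.predict(project(X, idx)) for idx, model in ensemble]
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*votes)]


# Toy data (invented for illustration): four features, two activity classes.
X_train = [[0.1, 0.2, 0.0, 0.1], [0.2, 0.1, 0.1, 0.0],
           [0.9, 1.0, 0.8, 0.9], [1.0, 0.9, 0.9, 1.0]]
y_train = ["inactive", "inactive", "active", "active"]
X_val = [[0.0, 0.1, 0.2, 0.1], [0.8, 0.9, 1.0, 0.9]]
y_val = ["inactive", "active"]

ensemble = train_mcs(X_train, y_train, X_val, y_val, n_groups=3)
predictions = predict_mcs(ensemble, [[0.05, 0.1, 0.1, 0.05], [0.95, 0.9, 0.95, 1.0]])
print(predictions)
```

Using an odd number of groups avoids tied votes in a two-class setting; a real system would likely replace the round-robin split with a grouping based on feature similarity and draw its candidates from a much larger pool of classifiers.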

Description

The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.

Keywords

Drug discovery, Machine learning algorithms, Feature clustering, Multiple classifier systems

Citation

Ruano-Ordas, D. et al. (2019) Improving the drug discovery process by using multiple classifier systems. Expert Systems with Applications, 121, pp. 292-303

Research Institute

Cyber Technology Institute (CTI)