Multiobjective deep clustering and its applications in single-cell RNA-seq data

dc.cclicence: N/A [en]
dc.contributor.author: Wang, Yunhe
dc.contributor.author: Biao, Chuang
dc.contributor.author: Wong, Ka-Chun
dc.contributor.author: Li, Xiangtao
dc.contributor.author: Yang, Shengxiang
dc.date.acceptance: 2021-09-09
dc.date.accessioned: 2021-10-07T09:31:38Z
dc.date.available: 2021-10-07T09:31:38Z
dc.date.issued: 2021-09-21
dc.description: The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. [en]
dc.description.abstract: Single-cell RNA sequencing is a transformative technology that enables us to study the heterogeneity of tissue at the cellular level. Clustering is the key computational approach for grouping cells by their transcriptome profiles in single-cell RNA-seq data. However, accurate identification of distinct cell types is challenged by high dimensionality, and clustering applied directly to the original transcriptome can produce uninformative clusters. To address this challenge, an evolutionary multiobjective deep clustering (EMDC) algorithm is proposed in this study to identify cell types in single-cell RNA-seq data. First, EMDC removes redundant and irrelevant genes by applying differential gene expression analysis to identify differentially expressed genes across biological conditions. Next, a deep autoencoder projects the high-dimensional data into different low-dimensional nonlinear embedding subspaces under different bottleneck layers. Then, a basic clustering algorithm is applied in those nonlinear embedding subspaces to generate basic clustering results that form the cluster ensemble. To lessen the unnecessary cost incurred by those clusterings, multiobjective evolutionary optimization is designed to prune the basic clustering results in the ensemble, unleashing its cell type discovery performance under three objective functions. Experiments on 30 synthetic single-cell RNA-seq datasets and six real single-cell RNA-seq datasets reveal that EMDC outperforms eight other clustering methods and three multiobjective optimization algorithms in cell type identification. In addition, extensive comparisons demonstrate the impact of each component of the proposed EMDC. [en]
dc.funder: Other external funder (please detail below) [en]
dc.funder.other: National Natural Science Foundation of China [en]
dc.identifier.citation: Wang, Y., Biao, C., Wong, K-C., Yang, S. and Li, X. (2021) Multiobjective deep clustering and its applications in single-cell RNA-seq data. IEEE Transactions on Systems, Man and Cybernetics: Systems. [en]
dc.identifier.doi: https://doi.org/10.1109/tsmc.2021.3112049
dc.identifier.issn: 2168-2216
dc.identifier.uri: https://dora.dmu.ac.uk/handle/2086/21321
dc.language.iso: en_US [en]
dc.peerreviewed: Yes [en]
dc.projectid: 62076109 [en]
dc.publisher: IEEE Press [en]
dc.researchinstitute: Institute of Artificial Intelligence (IAI) [en]
dc.subject: Evolutionary multiobjective deep clustering [en]
dc.subject: multiobjective optimization [en]
dc.subject: single-cell RNA-seq dataset [en]
dc.title: Multiobjective deep clustering and its applications in single-cell RNA-seq data [en]
dc.type: Article [en]
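
Illustrative note: the abstract describes combining several basic clusterings into a cluster ensemble. The record does not state how EMDC forms its consensus or which three objectives it optimizes, so the following is only a minimal sketch of one generic ensemble technique (a co-association matrix followed by thresholded linking), not the authors' method. The function names `co_association` and `consensus_clusters`, the 0.5 threshold, and the toy labelings are all assumptions made for illustration.

```python
def co_association(base_labelings):
    """Fraction of base clusterings that place each pair of cells together."""
    n = len(base_labelings[0])
    m = len(base_labelings)
    counts = [[0] * n for _ in range(n)]
    for labels in base_labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_clusters(co, threshold=0.5):
    """Link cells whose co-association exceeds the threshold (assumed value)
    and label each connected component as one consensus cluster."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy ensemble: three hypothetical base clusterings of six cells.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, different label names
    [0, 0, 1, 1, 2, 2],  # a noisier base clustering
]
co = co_association(base)
consensus = consensus_clusters(co)
print(consensus)
```

Because the matrix counts only whether two cells share a cluster, it is unaffected by how each base clustering names its clusters. Pruning the ensemble, as the abstract describes, would amount to choosing which rows of `base` to keep before building the matrix.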

Files

Original bundle
Name: IEEETSMCA21.pdf
Size: 2.71 MB
Format: Adobe Portable Document Format
Description: Main article
Name: Supplement.pdf
Size: 421.92 KB
Format: Adobe Portable Document Format
Description: Supplementary material
License bundle
Name: license.txt
Size: 4.2 KB
Format: Item-specific license agreed upon to submission
Description: