Clustering Deviation Index (CDI): a robust and accurate internal measure for evaluating scRNA-seq data clustering.

Authors

Fang, J; Chan, C; Owzar, K; Wang, L; Qin, D; Li, Q-J; Xie, J

Abstract

Most single-cell RNA sequencing (scRNA-seq) analyses begin with cell clustering; thus, the clustering accuracy considerably impacts the validity of downstream analyses. In contrast with the abundance of clustering methods, the tools to assess the clustering accuracy are limited. We propose a new Clustering Deviation Index (CDI) that measures the deviation of any clustering label set from the observed single-cell data. We conduct in silico and experimental scRNA-seq studies to show that CDI can select the optimal clustering label set. As a result, CDI also informs the optimal tuning parameters for any given clustering method and the correct number of cluster components.
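The abstract describes a workflow: score several candidate clustering label sets against the observed count data, then keep the set that deviates least. This also settles the tuning parameters and the number of clusters. The abstract does not give the CDI formula, so the sketch below illustrates only the selection step, using a stand-in score (a BIC-style penalized Poisson log-likelihood) that is an assumption made for illustration and is not the published CDI. The function names `poisson_bic_score` and `select_label_set` are hypothetical.

```python
import numpy as np


def poisson_bic_score(counts, labels):
    """Stand-in deviation score; lower is better.

    ASSUMPTION: this is NOT the published CDI formula. It fits one Poisson
    mean per (cluster, gene) and returns a BIC-style penalized negative
    log-likelihood, purely to illustrate label-set selection.
    """
    counts = np.asarray(counts, dtype=float)  # shape: cells x genes
    labels = np.asarray(labels)
    n_cells, n_genes = counts.shape
    clusters = np.unique(labels)

    neg_loglik = 0.0
    for c in clusters:
        block = counts[labels == c]
        # Maximum-likelihood Poisson mean per gene, floored to avoid log(0).
        mu = np.maximum(block.mean(axis=0), 1e-8)
        # The log(x!) term is the same for every label set, so it is omitted.
        neg_loglik -= np.sum(block * np.log(mu) - mu)

    n_params = len(clusters) * n_genes
    return 2.0 * neg_loglik + n_params * np.log(n_cells)


def select_label_set(counts, candidates):
    """Score each candidate label set and return the best name and all scores.

    `candidates` maps a name (for example a method and its parameter setting)
    to one label per cell.
    """
    scores = {name: poisson_bic_score(counts, lab) for name, lab in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

In use, `candidates` would hold the label sets produced by different clustering methods or parameter settings on the same cells, and the name returned first identifies the setting to keep. The penalty term is what stops the score from always favoring the label set with the most clusters.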

Citation

Fang, Jiyuan, Cliburn Chan, Kouros Owzar, Liuyang Wang, Diyuan Qin, Qi-Jing Li, and Jichun Xie. “Clustering Deviation Index (CDI): a robust and accurate internal measure for evaluating scRNA-seq data clustering.” Genome Biol 23, no. 1 (December 27, 2022): 269. https://doi.org/10.1186/s13059-022-02825-5.
