ccImpute: an accurate and scalable consensus clustering based algorithm to impute dropout events in the single-cell RNA-seq data

6Citations
Citations of this article
6Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: In recent years, the introduction of single-cell RNA sequencing (scRNA-seq) has enabled the analysis of a cell’s transcriptome at an unprecedented granularity and processing speed. The experimental outcome of applying this technology is a M× N matrix containing aggregated mRNA expression counts of M genes and N cell samples. From this matrix, scientists can study how cell protein synthesis changes in response to various factors, for example, disease versus non-disease states in response to a treatment protocol. This technology’s critical challenge is detecting and accurately recording lowly expressed genes. As a result, low expression levels tend to be missed and recorded as zero - an event known as dropout. This makes the lowly expressed genes indistinguishable from true zero expression and different than the low expression present in cells of the same type. This issue makes any subsequent downstream analysis difficult. Results: To address this problem, we propose an approach to measure cell similarity using consensus clustering and demonstrate an effective and efficient algorithm that takes advantage of this new similarity measure to impute the most probable dropout events in the scRNA-seq datasets. We demonstrate that our approach exceeds the performance of existing imputation approaches while introducing the least amount of new noise as measured by clustering performance characteristics on datasets with known cell identities. Conclusions: ccImpute is an effective algorithm to correct for dropout events and thus improve downstream analysis of scRNA-seq data. ccImpute is implemented in R and is available at https://github.com/khazum/ccImpute.

References Powered by Scopus

Principal component analysis

9925Citations
N/AReaders
Get full text

Top 10 algorithms in data mining

4424Citations
N/AReaders
Get full text

Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq

2198Citations
N/AReaders
Get full text

Cited by Powered by Scopus

Imputation Methods for scRNA Sequencing Data

6Citations
N/AReaders
Get full text

Predicting the Hall-Petch slope of magnesium alloys by machine learning

5Citations
N/AReaders
Get full text

A Framework for Comparison and Assessment of Synthetic RNA-Seq Data

2Citations
N/AReaders
Get full text

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Cite

CITATION STYLE

APA

Malec, M., Kurban, H., & Dalkilic, M. (2022). ccImpute: an accurate and scalable consensus clustering based algorithm to impute dropout events in the single-cell RNA-seq data. BMC Bioinformatics, 23(1). https://doi.org/10.1186/s12859-022-04814-8

Readers' Seniority

Tooltip

PhD / Post grad / Masters / Doc 1

50%

Researcher 1

50%

Readers' Discipline

Tooltip

Computer Science 2

67%

Biochemistry, Genetics and Molecular Bi... 1

33%

Article Metrics

Tooltip
Mentions
Blog Mentions: 1

Save time finding and organizing research with Mendeley

Sign up for free