scHinter: imputing dropout events for single-cell RNA-seq data with limited sample size.

Pengchao Ye, Wenbin Ye, Congting Ye, Shuchao Li, Lishan Ye, Guoli Ji, Xiaohui Wu.

Abstract

MOTIVATION: Single-cell RNA-sequencing (scRNA-seq) is fast becoming a powerful technique for studying dynamic gene regulation at unprecedented resolution. However, scRNA-seq data suffer from extremely high dropout rates and cell-to-cell variability, demanding new methods to recover lost gene expression. Despite the availability of various dropout imputation approaches for scRNA-seq, most studies focus on data with a medium or large number of cells, while few have explicitly investigated how performance differs across sample sizes or whether an approach is applicable to small or imbalanced data. It is imperative to develop new imputation approaches that generalize better to data with various sample sizes.
RESULTS: We proposed a method called scHinter for imputing dropout events for scRNA-seq with special emphasis on data with limited sample size. scHinter incorporates a voting-based ensemble distance and leverages the synthetic minority oversampling technique for random interpolation. A hierarchical framework is also embedded in scHinter to increase the reliability of the imputation for small samples. We demonstrated the ability of scHinter to recover gene expression measurements across a wide spectrum of scRNA-seq datasets with varied sample sizes. We comprehensively examined the impact of sample size and cluster number on imputation. Comprehensive evaluation of scHinter across diverse scRNA-seq datasets with imbalanced or limited sample size showed that scHinter achieved higher and more robust performance than competing approaches, including MAGIC, scImpute, SAVER and netSmooth.
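The two ingredients named above, a voting-based ensemble distance and SMOTE-style random interpolation, can be illustrated with a minimal sketch. This is not the scHinter implementation. The abstract does not say which distance metrics are combined, so the three used here (Euclidean, Manhattan, Pearson) are assumptions, as are all function names. Interpolating between two voted neighbors and treating every zero as a dropout are also simplifications made for illustration.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def pearson_distance(a, b):
    # 1 - Pearson correlation; constant vectors get the neutral distance 1.0.
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 1.0
    return 1.0 - cov / math.sqrt(va * vb)


# Hypothetical metric set; the abstract does not specify the actual one.
METRICS = (euclidean, manhattan, pearson_distance)


def vote_neighbors(cells, target, k, metrics=METRICS):
    """Pick k neighbors of cells[target] by majority vote across metrics.

    Each metric ranks all other cells; a cell earns one vote for every
    metric that places it among that metric's k closest cells.
    """
    others = [i for i in range(len(cells)) if i != target]
    votes = {i: 0 for i in others}
    for metric in metrics:
        ranked = sorted(others, key=lambda i: metric(cells[target], cells[i]))
        for i in ranked[:k]:
            votes[i] += 1
    # Most votes first; ties are broken by cell index.
    return sorted(others, key=lambda i: (-votes[i], i))[:k]


def impute_cell(cells, target, k=2, rng=None):
    """Return a copy of cells[target] with zero entries filled in.

    Each zero is replaced by a SMOTE-style random point on the segment
    between the values of two voted neighbors: a + lam * (b - a),
    with lam drawn uniformly from [0, 1).
    """
    rng = rng or random.Random(0)
    neighbors = vote_neighbors(cells, target, k)
    imputed = list(cells[target])
    if not neighbors:
        return imputed
    for gene, value in enumerate(imputed):
        if value != 0:
            continue  # observed values are left untouched
        if len(neighbors) >= 2:
            a, b = rng.sample(neighbors, 2)
        else:
            a = b = neighbors[0]
        lam = rng.random()
        imputed[gene] = cells[a][gene] + lam * (cells[b][gene] - cells[a][gene])
    return imputed


# Toy expression matrix: rows are cells, columns are genes.
cells = [
    [5.0, 0.0, 3.0, 1.0],  # target cell; gene 1 is treated as a dropout
    [5.0, 4.0, 3.0, 1.0],
    [6.0, 6.0, 3.0, 1.0],
    [0.0, 0.0, 9.0, 9.0],  # dissimilar cell, should not be voted in
]
imputed = impute_cell(cells, target=0, k=2)
```

In this toy matrix all three metrics agree that cells 1 and 2 are closest to cell 0, so the imputed value for gene 1 falls between their values for that gene, and the observed entries of cell 0 are unchanged. Voting across metrics is one way to reduce dependence on any single distance definition, which may matter more when only a few cells are available.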
AVAILABILITY AND IMPLEMENTATION: Freely available for download at https://github.com/BMILAB/scHinter.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Year: 2020  PMID: 31392316  DOI: 10.1093/bioinformatics/btz627

Source DB: PubMed  Journal: Bioinformatics  ISSN: 1367-4803  Impact factor: 6.937


  4 in total

Review 1.  Discovery of alternative polyadenylation dynamics from single cell types.

Authors:  Congting Ye; Juncheng Lin; Qingshun Q Li
Journal:  Comput Struct Biotechnol J       Date:  2020-04-20       Impact factor: 7.271

Review 2.  An Overview of Algorithms and Associated Applications for Single Cell RNA-Seq Data Imputation.

Authors:  Zarrin Basharat; Sania Majeed; Humaira Saleem; Ishtiaq Ahmad Khan; Azra Yasmin
Journal:  Curr Genomics       Date:  2021-12-30       Impact factor: 2.689

3.  Single-cell specific and interpretable machine learning models for sparse scChIP-seq data imputation.

Authors:  Steffen Albrecht; Tommaso Andreani; Miguel A Andrade-Navarro; Jean Fred Fontaine
Journal:  PLoS One       Date:  2022-07-01       Impact factor: 3.752

4.  MSPJ: Discovering potential biomarkers in small gene expression datasets via ensemble learning.

Authors:  HuaChun Yin; JingXin Tao; Yuyang Peng; Ying Xiong; Bo Li; Song Li; Hui Yang
Journal:  Comput Struct Biotechnol J       Date:  2022-07-14       Impact factor: 6.155
