Intrinsic entropy model for feature selection of scRNA-seq data.

Lin Li, Hui Tang, Rui Xia, Hao Dai, Rui Liu, Luonan Chen.

Abstract

Recent advances in single-cell RNA sequencing (scRNA-seq) technologies have led to extensive study of cellular heterogeneity and cell-to-cell variation. However, the high frequency of dropout events and noise in scRNA-seq data compromises downstream analysis, in particular clustering analysis, whose accuracy depends heavily on the selected feature genes. Here, by deriving an entropy decomposition formula, we propose a feature selection method, the intrinsic entropy (IE) model, to identify the informative genes for accurate clustering analysis. Specifically, by eliminating the 'noisy' fluctuation, or extrinsic entropy (EE), we extract the IE of each gene from its total entropy (TE), where TE = IE + EE. We show that the IE of each gene reflects the regulatory fluctuation of that gene in a cellular process, and thus high-IE genes provide rich information for cell type or state analysis. To validate the performance of the high-IE genes, we conduct computational analyses on both simulated and real single-cell datasets and compare the IE model with other representative methods. The results show that the IE model is not only broadly applicable and robust across different clustering and classification methods, but also sensitive to novel cell types. Our results also demonstrate that the intrinsic entropy/fluctuation of a gene, in contrast to its total entropy/fluctuation, serves as information rather than noise.
© The Author(s) (2022). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, CEMCS, CAS.
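
The abstract states the decomposition TE = IE + EE but does not give the formula itself. As a loose illustration only, the sketch below uses the standard chain rule for Shannon entropy, H(X) = I(X; S) + H(X | S), to split a gene's total entropy into a part explained by cell state and a residual part. This is an analogy, not the authors' derivation: the function names, the toy data, and above all the use of known state labels `states` are assumptions, since a feature selection method applied before clustering would not have such labels available.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def decompose(expr, states):
    """Split one gene's total entropy with the chain rule H(X) = I(X;S) + H(X|S).

    expr   : discretized expression values of one gene, one per cell
    states : hypothetical known cell-state labels, one per cell (assumption)
    Returns (total, state_related, residual).
    """
    n = len(expr)
    total = entropy(expr)
    # Group the expression values by cell state.
    groups = {}
    for x, s in zip(expr, states):
        groups.setdefault(s, []).append(x)
    # Conditional entropy H(X|S): weighted average of within-state entropies.
    residual = sum(len(g) / n * entropy(g) for g in groups.values())
    # Mutual information I(X;S): what remains of the total entropy.
    state_related = total - residual
    return total, state_related, residual


# Toy data (invented for illustration): 8 cells in two states.
states = ["A"] * 4 + ["B"] * 4
marker = [0, 0, 0, 0, 1, 1, 1, 1]  # expression tracks the cell state
noisy = [0, 1, 0, 1, 0, 1, 0, 1]   # expression is unrelated to the state

for name, gene in [("marker", marker), ("noisy", noisy)]:
    total, state_related, residual = decompose(gene, states)
    print(f"{name}: total={total:.2f} state_related={state_related:.2f} "
          f"residual={residual:.2f}")
```

Both toy genes have the same total entropy, yet only the first carries information about cell state. That is the general point the abstract makes about total versus intrinsic entropy, although the IE model obtains its decomposition by a different route.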

Keywords:  entropy decomposition; extrinsic entropy; feature selection; informative genes; intrinsic entropy; scRNA-seq

Year:  2022        PMID: 35102420      PMCID: PMC9175189          DOI: 10.1093/jmcb/mjac008

Source DB:  PubMed          Journal:  J Mol Cell Biol        ISSN: 1759-4685            Impact factor:   8.185


