A Zipf-plot based normalization method for high-throughput RNA-seq data.

Bin Wang

Abstract

Normalization is crucial in RNA-seq data analysis. Because the data contain excessive zeros and a large number of small measurements, it is challenging to find reliable linear rescaling normalization parameters. We propose a Zipf-plot-based normalization method (ZN) that assumes all gene profiles have similar upper-tail behavior in their expression distributions. The new method uses global information from all genes in the same profile without altering expression at the gene level. It does not require that the majority of genes be non-differentially expressed (non-DE), and it can be applied to data in which the majority of genes are weakly expressed or not expressed at all. Two normalization schemes are implemented with ZN: a linear rescaling scheme and a non-linear transformation scheme. The linear rescaling scheme can be applied alone or together with the non-linear scheme. The performance of ZN is benchmarked against five popular linear normalization methods for RNA-seq data. Results show that the linear rescaling scheme by itself works well and is robust. The non-linear scheme can further improve the normalization outcome, and it is optional if the Zipf plots show parallel patterns.
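
The idea of aligning upper tails by linear rescaling can be illustrated with a minimal Python sketch. It is not the ZN estimator from the paper, whose details the abstract does not give. It relies only on the fact that multiplying a profile by a constant c shifts its Zipf plot (log expression against log rank) vertically by log(c), so tails with similar shape can be brought together with one scale factor per sample. The function names, the `top_k` cutoff, and the use of the mean log expression of the top-ranked genes as the tail level are all assumptions made for illustration.

```python
import numpy as np


def upper_tail_log_level(sample, top_k):
    """Mean log expression over the top_k largest positive values.

    This is the vertical position of the Zipf plot's upper tail
    (hypothetical summary, chosen for illustration only).
    """
    positive = sample[sample > 0]          # zeros carry no tail information
    if positive.size < top_k:
        raise ValueError("fewer than top_k positive values in sample")
    tail = np.sort(positive)[-top_k:]      # the top_k highest-ranked genes
    return float(np.mean(np.log(tail)))


def tail_scale_factors(counts, top_k=100):
    """Return one linear scale factor per sample (column) of counts.

    counts is a genes x samples array. Each factor shifts that sample's
    upper tail onto the average tail level across samples.
    """
    counts = np.asarray(counts, dtype=float)
    levels = np.array([
        upper_tail_log_level(counts[:, j], top_k)
        for j in range(counts.shape[1])
    ])
    return np.exp(levels.mean() - levels)


def rescale(counts, top_k=100):
    """Apply the per-sample factors; zeros stay zero under rescaling."""
    counts = np.asarray(counts, dtype=float)
    return counts * tail_scale_factors(counts, top_k)
```

If one sample is an exact multiple of another, this sketch makes the two profiles identical after rescaling. When the tails are not parallel, a single factor per sample cannot align them, which is the case where the abstract suggests the non-linear scheme.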

Year:  2020        PMID: 32271772      PMCID: PMC7144957          DOI: 10.1371/journal.pone.0230594

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


  2 in total

1.  PsiNorm: a scalable normalization for single-cell RNA-seq data.

Authors:  Matteo Borella; Graziano Martello; Davide Risso; Chiara Romualdi
Journal:  Bioinformatics       Date:  2021-09-09       Impact factor: 6.937

2.  cdev: a ground-truth based measure to evaluate RNA-seq normalization performance.

Authors:  Diem-Trang Tran; Matthew Might
Journal:  PeerJ       Date:  2021-10-04       Impact factor: 2.984
