Abstract
In the past 5 years, RNA-Seq has become a powerful tool in transcriptome analysis even though computational methods dedicated to the analysis of high-throughput sequencing data are yet to be standardized. It is, however, now commonly accepted that the choice of a normalization procedure is an important step in such a process, for example in differential gene expression analysis. The present article highlights the similarities between three normalization methods: TMM from edgeR R package, RLE from DESeq2 R package, and MRN. Both TMM and DESeq2 are widely used for differential gene expression analysis. This paper introduces properties that show when these three methods will give exactly the same results. These properties are proven mathematically and illustrated by performing in silico calculations on a given RNA-Seq data set.
Keywords: DESeq2; RNA-seq data; comparison of methods; edgeR; normalization
Year: 2016 PMID: 27695478 PMCID: PMC5025571 DOI: 10.3389/fgene.2016.00164
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Default normalization factors for the fruit set RNA-Seq data.
| TMM | 0.98012 | 0.92236 | 0.71989 | 1.05807 | 0.98130 | 0.88352 | 1.13027 | 1.19388 | 1.24130 |
| RLE | 1.01712 | 0.80899 | 0.72660 | 0.86594 | 1.23622 | 0.73647 | 1.28172 | 1.27220 | 1.37315 |
| MRN | 0.87105 | 0.75416 | 0.91430 | 0.79324 | 1.20131 | 0.80461 | 1.33984 | 1.25330 | 1.29317 |
Normalization factors of tomato fruit set samples are obtained from TMM, RLE, and MRN normalization methods with default settings.
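One property of the tabled values can be checked directly. The edgeR documentation states that its normalization factors are rescaled so that they multiply to 1. The short Python sketch below computes the product and geometric mean of each row; the variable names are illustrative, and the values are copied from the table above. On these rounded values the TMM and MRN products come out close to 1, while the RLE product is noticeably larger. Read as an inference from the numbers alone, this suggests that the RLE row is not expressed on the same scale as the other two.

```python
import math

# Default normalization factors, copied from the table above (nine samples).
default_factors = {
    "TMM": [0.98012, 0.92236, 0.71989, 1.05807, 0.98130,
            0.88352, 1.13027, 1.19388, 1.24130],
    "RLE": [1.01712, 0.80899, 0.72660, 0.86594, 1.23622,
            0.73647, 1.28172, 1.27220, 1.37315],
    "MRN": [0.87105, 0.75416, 0.91430, 0.79324, 1.20131,
            0.80461, 1.33984, 1.25330, 1.29317],
}

# Product of the factors per method, and the corresponding geometric mean.
products = {m: math.prod(v) for m, v in default_factors.items()}
geo_means = {m: products[m] ** (1 / len(default_factors[m])) for m in products}

for method in default_factors:
    print(f"{method}: product = {products[method]:.4f}, "
          f"geometric mean = {geo_means[method]:.4f}")
```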
Figure 1. Normalization factors for the fruit set RNA-Seq data depending on corresponding library sizes. All three studied normalization methods are carried out with default settings. For all three methods, regression (dashed) lines are estimated from a simple linear regression modeling the relationship between default normalization factors and library sizes. Color key: TMM, RLE, and MRN are respectively colored in green, blue, and red. Key to symbols: Bud, Ant, and Pos stages are respectively drawn with circles, squares, and triangles.
Description of the three normalization methods.
| I | Pre-normalization by library size |
| II | Reference sample |
| III | Relative sizes of transcriptomes and reference sample |
| IV | |
| V | Taking into account both the relative size and the library size |
| VI | Normalization factors |
| VII | Normalization of counts |
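As a rough illustration of the kind of computation these steps describe, the sketch below computes median-of-ratios size factors, the idea commonly associated with RLE normalization, on a small made-up count matrix. It is a simplified sketch under stated assumptions, not the DESeq2 or edgeR implementation: the function name `rle_size_factors` and the toy counts are hypothetical, genes with a zero count in any sample are simply dropped, and no rescaling by library size is applied.

```python
import math
from statistics import median


def rle_size_factors(counts):
    """Median-of-ratios size factors (hypothetical helper, simplified).

    counts: list of genes, each a list of raw counts, one per sample.
    """
    n_samples = len(counts[0])
    # Keep only genes observed in every sample so the geometric mean is defined.
    usable = [gene for gene in counts if all(c > 0 for c in gene)]
    # Pseudo-reference sample: per-gene log geometric mean across samples.
    log_ref = [sum(math.log(c) for c in gene) / n_samples for gene in usable]
    # Size factor of sample j: median over genes of the log ratio
    # count / reference, mapped back from the log scale.
    return [
        math.exp(median(math.log(gene[j]) - ref
                        for gene, ref in zip(usable, log_ref)))
        for j in range(n_samples)
    ]


# Toy data (made up): three samples whose counts scale as 1 : 2 : 4.
toy_counts = [
    [10, 20, 40],
    [100, 200, 400],
    [5, 10, 20],
    [0, 3, 7],  # contains a zero, so it is excluded from the reference
]

factors = rle_size_factors(toy_counts)
# Normalized counts: each raw count divided by its sample's size factor.
normalized = [[c / f for c, f in zip(gene, factors)] for gene in toy_counts]
print(factors)
```

On this toy matrix the three samples differ only by a constant scaling, so dividing by the size factors brings every fully observed gene to the same value across samples.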
Normalization factors of tomato fruit set samples, obtained from TMM and MRN normalization methods with parameters of Proposition 1.
| TMM | 0.97654 | 0.92966 | 0.72054 | 1.06259 | 0.97360 | 0.87363 | 1.14166 | 1.19541 | 1.23937 |
| MRN | 0.97658 | 0.92957 | 0.72079 | 1.06280 | 0.97361 | 0.87361 | 1.14189 | 1.19599 | 1.23792 |
Normalization factors of some pairs of tomato fruit set samples, obtained from RLE and MRN normalization methods with parameters of Proposition 2.
| RLE | 1.1015522 | 0.9078099 | 0.7870385 | 1.2705859 | 0.8248517 | 1.2123391 |
| MRN | 1.1015522 | 0.9078099 | 0.7870385 | 1.2705859 | 0.8248517 | 1.2123391 |
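The agreement shown in the last two tables can be quantified with a few lines of Python. The sketch below uses values copied from the tables; the helper name `max_abs_diff` is illustrative. It computes the largest absolute difference between the paired rows under the parameters of Proposition 1 (TMM vs. MRN) and Proposition 2 (RLE vs. MRN).

```python
# Factors under the parameters of Proposition 1 (nine samples).
prop1 = {
    "TMM": [0.97654, 0.92966, 0.72054, 1.06259, 0.97360,
            0.87363, 1.14166, 1.19541, 1.23937],
    "MRN": [0.97658, 0.92957, 0.72079, 1.06280, 0.97361,
            0.87361, 1.14189, 1.19599, 1.23792],
}

# Factors under the parameters of Proposition 2 (six tabled values).
prop2 = {
    "RLE": [1.1015522, 0.9078099, 0.7870385,
            1.2705859, 0.8248517, 1.2123391],
    "MRN": [1.1015522, 0.9078099, 0.7870385,
            1.2705859, 0.8248517, 1.2123391],
}


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two rows."""
    return max(abs(x - y) for x, y in zip(a, b))


diff_prop1 = max_abs_diff(prop1["TMM"], prop1["MRN"])
diff_prop2 = max_abs_diff(prop2["RLE"], prop2["MRN"])

# The largest TMM/MRN gap is in the last sample (1.23937 vs. 1.23792).
print(f"Proposition 1, TMM vs MRN: max |diff| = {diff_prop1:.5f}")
print(f"Proposition 2, RLE vs MRN: max |diff| = {diff_prop2:.7f}")
```

At the printed precision, the RLE and MRN rows coincide in every tabled value, while the TMM and MRN rows differ only from the fourth decimal place onward, except in the last sample, where the gap reaches the third decimal.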