Kai Song, Li Li, Guofan Zhang.
Abstract
RNA-seq is a recently developed approach widely used for transcriptome profiling in biological analyses that use next-generation sequencing technologies. Accurate estimation of gene expression levels is critical for answering biological questions. Here, we show that the commonly used measure of gene expression levels, fragments per kilobase of transcript per million mapped reads (FPKM), is biased by transcript length, GC content, and dinucleotide frequencies in the RNA-seq analysis of marine species. We used a generalized linear model to correct the observed biases of FPKM. We used RNA-seq data sets from eight species obtained by different sequencing methods to evaluate the correction methods. Our work contributes to the understanding of potential technical artifacts in RNA-seq experiments for marine species, and presents a means by which more accurate gene expression measures can be obtained.
Keywords: GC content; Gene expression; RNA-seq analysis bias; Transcriptome profiling
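For readers unfamiliar with the measure, the following is a minimal sketch of how an uncorrected FPKM value is conventionally computed from its definition (fragments per kilobase of transcript per million mapped fragments). The function name `fpkm` and its parameter names are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the authors' generalized linear model correction.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Return the uncorrected FPKM for a single transcript.

    fragment_count: fragments mapped to this transcript
    transcript_length_bp: transcript length in base pairs
    total_mapped_fragments: all mapped fragments in the library
    (All names are hypothetical, chosen for illustration.)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Dividing by length in kilobases (bp / 1e3) and by library size in
    # millions (fragments / 1e6) is the same as multiplying by 1e9.
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # 500 fragments on a 2 kb transcript in a library of 10 million fragments
    print(fpkm(500, 2000, 10_000_000))
```

Because this normalization only divides by length and library size, it does not by itself account for the GC-content or dinucleotide-frequency effects the abstract describes.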
Year: 2017 PMID: 28884399 DOI: 10.1007/s10126-017-9773-5
Source DB: PubMed Journal: Mar Biotechnol (NY) ISSN: 1436-2228 Impact factor: 3.619