Modelling RNA-Seq data with a zero-inflated mixture Poisson linear model.

Siyun Liu, Yuan Jiang, Tao Yu.

Abstract

RNA sequencing (RNA-Seq) has been frequently used in genomic studies and has generated a vast amount of data. RNA-Seq data are composed of two parts: (a) a sequence of nucleotides of the genome; and (b) a corresponding sequence of counts, each giving the number of short reads whose mapped positions start at that position of the genome. One common feature of these count data is that they are typically nonuniform; recent studies have revealed that the nonuniformity is partly due to a systematic bias resulting from sequencing preference. Existing work models the nonuniformity with a single-component Poisson linear model that incorporates the effects of sequencing preference. However, we consistently observe that the short reads mapped to a gene may have a mixture structure and can be zero-inflated, so a single-component model may not suffice to capture the complexity of such data. In this paper, we propose a zero-inflated mixture Poisson linear model for RNA-Seq count data and derive a fast expectation-maximisation-based algorithm for estimating the unknown parameters. Numerical studies are conducted to illustrate the effectiveness of our method.
© 2019 Wiley Periodicals, Inc.
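
To make the model class in the abstract concrete, the following is a minimal sketch of expectation-maximisation for a zero-inflated mixture of Poisson components. It is a simplified illustration, not the authors' algorithm: each component here has a single constant rate, whereas the paper's Poisson linear model also includes covariates for sequencing preference, which the abstract does not specify. The names `poisson_pmf` and `fit_zimp` and all default settings are hypothetical.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability P(Y = y) for rate lam > 0, evaluated via logs."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def fit_zimp(counts, n_components=2, n_iter=500, tol=1e-9):
    """EM for P(y) = w[0] * 1[y == 0] + sum_k w[k] * Poisson(y; lam[k-1]).

    Assumption: intercept-only components (no sequencing-preference
    covariates), so this is a simplification of the model in the abstract.
    Requires at least one positive count.
    """
    n = len(counts)
    k = n_components
    positive = sorted(c for c in counts if c > 0)
    # Crude initialisation: spread the rates over quantiles of positive counts.
    lam = [positive[int((j + 0.5) / k * len(positive))] + 0.5 for j in range(k)]
    w = [1.0 / (k + 1)] * (k + 1)  # w[0] is the zero-inflation weight
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(n_iter):
        resp_sum = [0.0] * (k + 1)  # summed responsibilities per component
        resp_y = [0.0] * k          # responsibility-weighted sums of counts
        ll = 0.0
        # E-step: posterior membership probabilities under the current
        # parameters, accumulated as sufficient statistics.
        for y in counts:
            dens = [w[0] if y == 0 else 0.0]
            dens += [w[j + 1] * poisson_pmf(y, lam[j]) for j in range(k)]
            total = sum(dens)
            ll += math.log(total)
            for j in range(k + 1):
                r = dens[j] / total
                resp_sum[j] += r
                if j > 0:
                    resp_y[j - 1] += r * y
        # M-step: closed-form updates for the weights and the Poisson rates.
        w = [s / n for s in resp_sum]
        lam = [max(resp_y[j] / max(resp_sum[j + 1], 1e-12), 1e-8)
               for j in range(k)]
        # Stop once the log-likelihood (at the pre-update parameters) stalls.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return w, lam, ll
```

Calling `fit_zimp(counts, n_components=2)` on a list of non-negative integer counts returns the mixture weights (zero-inflation weight first), the component rates, and the log-likelihood evaluated in the last E-step. Fits with different `n_components` could be compared with a criterion such as the Bayesian information criterion listed in the keywords, although the abstract does not say how the paper applies it.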

Keywords:  Bayesian information criterion; RNA-Seq; mixture Poisson linear model; nonuniformity; overdispersion; zero-inflated count data

Year:  2019        PMID: 31328831      PMCID: PMC6763381          DOI: 10.1002/gepi.22246

Source DB:  PubMed          Journal:  Genet Epidemiol        ISSN: 0741-0395            Impact factor:   2.135


