AREM: aligning short reads from ChIP-sequencing by expectation maximization.

Daniel Newkirk, Jacob Biesinger, Alvin Chon, Kyoko Yokomori, Xiaohui Xie.

Abstract

High-throughput sequencing coupled to chromatin immunoprecipitation (ChIP-Seq) is widely used in characterizing genome-wide binding patterns of transcription factors, cofactors, chromatin modifiers, and other DNA binding proteins. A key step in ChIP-Seq data analysis is to map short reads from high-throughput sequencing to a reference genome and identify peak regions enriched with short reads. Although several methods have been proposed for ChIP-Seq analysis, most existing methods only consider reads that can be uniquely placed in the reference genome, and therefore have low power for detecting peaks located within repeat sequences. Here, we introduce a probabilistic approach for ChIP-Seq data analysis that utilizes all reads, providing a truly genome-wide view of binding patterns. Reads are modeled using a mixture model corresponding to K enriched regions and a null genomic background. We use maximum likelihood to estimate the locations of the enriched regions, and implement an expectation-maximization (E-M) algorithm, called AREM (aligning reads by expectation maximization), to update the alignment probabilities of each read to different genomic locations. We apply the algorithm to identify genome-wide binding events of two proteins: Rad21, a component of cohesin and a key factor involved in chromatid cohesion, and Srebp-1, a transcription factor important for lipid/cholesterol homeostasis. Using AREM, we were able to identify 19,935 Rad21 peaks and 1,748 Srebp-1 peaks in the mouse genome with high confidence, including 1,517 (7.6%) Rad21 peaks and 227 (13%) Srebp-1 peaks that were missed using only uniquely mapped reads. The open source implementation of our algorithm is available at http://sourceforge.net/projects/arem.
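The idea of re-weighting ambiguous reads by expectation maximization can be illustrated with a toy sketch. This is not the AREM implementation: it assumes a simplified model in which every candidate genomic bin acts as its own mixture component and there is no explicit null background term. The function name `em_allocate`, the bin labels, and the alignment scores are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def em_allocate(reads, n_iter=50):
    """Toy EM for multi-read allocation (illustrative only, not AREM itself).

    reads: list of dicts, each mapping a candidate bin label to a positive
           alignment score for that read (hypothetical input format).
    Returns a list of dicts mapping bin label -> posterior alignment probability.
    """
    # Initialise each read's posterior in proportion to its alignment scores.
    posteriors = []
    for candidates in reads:
        total = sum(candidates.values())
        posteriors.append({b: s / total for b, s in candidates.items()})

    for _ in range(n_iter):
        # M-step: expected number of reads in each bin under current posteriors.
        density = defaultdict(float)
        for post in posteriors:
            for b, p in post.items():
                density[b] += p

        # E-step: re-weight each read's candidates by score times bin density.
        updated = []
        for candidates in reads:
            weights = {b: s * density[b] for b, s in candidates.items()}
            total = sum(weights.values())
            updated.append({b: w / total for b, w in weights.items()})
        posteriors = updated

    return posteriors


# Five reads map uniquely to binA; one read maps equally well to binA and binB.
reads = [{"binA": 1.0} for _ in range(5)] + [{"binA": 1.0, "binB": 1.0}]
posteriors = em_allocate(reads)
print(posteriors[-1])  # posterior of the ambiguous read
```

In this toy input the ambiguous read ends up assigned almost entirely to `binA`, because the uniquely mapped reads make that bin look enriched. A fuller model along the lines described in the abstract would add a background component and restrict the enriched components to K regions.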

Year:  2011        PMID: 22035330      PMCID: PMC3216101          DOI: 10.1089/cmb.2011.0185

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


