
A novel method for predicting activity of cis-regulatory modules, based on a diverse training set.

Wei Yang, Saurabh Sinha.

Abstract

MOTIVATION: With the rapid emergence of technologies for locating cis-regulatory modules (CRMs) genome-wide, the next pressing challenge is to assign precise functions to each CRM, i.e. to determine the spatiotemporal domains or cell-types where it drives expression. A popular approach to this task is to model the typical k-mer composition of a set of CRMs known to drive a common expression pattern, and assign that pattern to other CRMs exhibiting a similar k-mer composition. This approach does not rely on prior knowledge of transcription factors relevant to the CRM or their binding motifs, and is thus more widely applicable than motif-based methods for predicting CRM activity, but is also prone to false positive predictions.
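The k-mer composition approach described above can be illustrated with a minimal sketch. It is not taken from any published tool: the function names, the choice of k = 3, the use of cosine similarity, and the toy sequences are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"


def kmer_profile(seq, k=3):
    """Return the normalized frequency of every possible k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # k-mers containing characters outside ACGT (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def mean_profile(seqs, k=3):
    """Average the k-mer profiles of a set of CRMs (the 'typical' composition)."""
    profiles = [kmer_profile(s, k) for s in seqs]
    return [sum(col) / len(profiles) for col in zip(*profiles)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy "training sets"; real CRMs are hundreds of bases long.
at_rich_set = ["ATATATATTTAA", "TTAATATAATAT"]
gc_rich_set = ["GCGCGGCCGCGC", "CCGGCGCGGGCC"]
candidate = "ATTATAATATTA"

score_at = cosine(kmer_profile(candidate), mean_profile(at_rich_set))
score_gc = cosine(kmer_profile(candidate), mean_profile(gc_rich_set))
print("similarity to AT-rich set:", score_at)
print("similarity to GC-rich set:", score_gc)
```

In this sketch the candidate would be assigned the pattern of whichever training set it resembles most. A threshold on a single similarity score of this kind is the sort of decision rule that can yield false positives.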
RESULTS: We present a novel strategy to improve the above-mentioned approach: to predict if a CRM drives a specific gene expression pattern, assess not only how similar the CRM is to other CRMs with similar activity but also to CRMs with distinct activities. We use a state-of-the-art statistical method to quantify a CRM's sequence similarity to many different training sets of CRMs, and employ a classification algorithm to integrate these similarity scores into a single prediction of the CRM's activity. This strategy is shown to significantly improve CRM activity prediction over current approaches.
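The strategy of scoring a CRM against several training sets and combining the scores in a classifier can be sketched as follows. The abstract does not name the similarity method or the classifier, so this sketch substitutes stand-ins: cosine similarity over dinucleotide counts and a small logistic regression trained by stochastic gradient descent. All names, sequences, and hyperparameters are hypothetical, and this is not the IMMBoost implementation.

```python
import math
from collections import Counter


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def similarity(seq, training_set, k=2):
    """Cosine similarity between seq and the pooled k-mer counts of a set."""
    a = kmer_counts(seq, k)
    b = Counter()
    for s in training_set:
        b.update(kmer_counts(s, k))
    dot = sum(a[m] * b[m] for m in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def features(seq, training_sets, k=2):
    """One similarity score per training set, in a fixed (sorted) order."""
    return [similarity(seq, training_sets[name], k) for name in sorted(training_sets)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic-regression weights by plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(x, w, b):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical training sets, one per expression pattern.
training_sets = {
    "pattern_A": ["ATATTAATTATA", "TTATAATATTAA"],
    "pattern_B": ["GCGGCCGCGGCC", "CCGCGGGCCGCG"],
    "pattern_C": ["ACGTACGTTGCA", "TGCATGACGTAC"],
}

# Labeled CRMs for the target question "does this CRM drive pattern A?".
# They are kept separate from the training sets so that no sequence is
# scored against a set that contains it.
labeled = [
    ("ATTATATAATAT", 1), ("TATAATTATTAA", 1),
    ("GCCGCGGCGCCG", 0), ("CGCGGCCGGCGC", 0),
    ("ACGTTGCAACGT", 0), ("TGACGTACTGCA", 0),
]

X = [features(seq, training_sets) for seq, _ in labeled]
y = [label for _, label in labeled]
w, b = train_logistic(X, y)

p_at = predict_proba(features("TAATATATTATT", training_sets), w, b)
p_gc = predict_proba(features("GGCCGCGCGGCC", training_sets), w, b)
print("P(pattern A | AT-rich candidate):", p_at)
print("P(pattern A | GC-rich candidate):", p_gc)
```

The point of the design is that the classifier sees similarity to the distinct-activity sets (patterns B and C here) as well as to the target set, so a candidate that resembles every set equally need not be called positive.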
AVAILABILITY AND IMPLEMENTATION: Our implementation of the new method, called IMMBoost, is freely available as source code at https://github.com/weiyangedward/IMMBoost
CONTACT: sinhas@illinois.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Year:  2016        PMID: 27609510      PMCID: PMC6075022          DOI: 10.1093/bioinformatics/btw552

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


