Matthew T Weirauch, Ally Yang, Mihai Albu, Atina G Cote, Alejandro Montenegro-Montero, Philipp Drewe, Hamed S Najafabadi, Samuel A Lambert, Ishminder Mann, Kate Cook, Hong Zheng, Alejandra Goity, Harm van Bakel, Jean-Claude Lozano, Mary Galli, Mathew G Lewsey, Eryong Huang, Tuhin Mukherjee, Xiaoting Chen, John S Reece-Hoyes, Sridhar Govindarajan, Gad Shaulsky, Albertha J M Walhout, François-Yves Bouget, Gunnar Ratsch, Luis F Larrondo, Joseph R Ecker, Timothy R Hughes.
Abstract
Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.
Year: 2014 PMID: 25215497 PMCID: PMC4163041 DOI: 10.1016/j.cell.2014.08.009
Source DB: PubMed Journal: Cell ISSN: 0092-8674 Impact factor: 41.582