Cory C Funk, Alex M Casella, Segun Jung, Matthew A Richards, Alex Rodriguez, Paul Shannon, Rory Donovan-Maiye, Ben Heavner, Kyle Chard, Yukai Xiao, Gustavo Glusman, Nilufer Ertekin-Taner, Todd E Golde, Arthur Toga, Leroy Hood, John D Van Horn, Carl Kesselman, Ian Foster, Ravi Madduri, Nathan D Price, Seth A Ament.
Abstract
Characterizing the tissue-specific binding sites of transcription factors (TFs) is essential to reconstruct gene regulatory networks and predict functions for non-coding genetic variation. DNase-seq footprinting enables the prediction of genome-wide binding sites for hundreds of TFs simultaneously. Despite the public availability of high-quality DNase-seq data from hundreds of samples, a comprehensive, up-to-date resource for the locations of genomic footprints is lacking. Here, we develop a scalable footprinting workflow using two state-of-the-art algorithms: Wellington and HINT. We apply our workflow to detect footprints in 192 ENCODE DNase-seq experiments and predict the genomic occupancy of 1,515 human TFs in 27 human tissues. We validate that these footprints overlap true-positive TF binding sites from ChIP-seq. We demonstrate that the locations, depth, and tissue specificity of footprints predict effects of genetic variants on gene expression and capture a substantial proportion of genetic risk for complex traits.
Entities:
Keywords: DNase-seq; ENCODE; footprinting; gene regulation; motifs; psychiatric genetics; transcription factors
Year: 2020
PMID: 32814038
PMCID: PMC7462736
DOI: 10.1016/j.celrep.2020.108029
Source DB: PubMed
Journal: Cell Rep
Impact factor: 9.995
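The abstract states that footprints were validated by their overlap with ChIP-seq binding sites. As a rough illustration of what such an overlap check can look like, here is a minimal Python sketch. It is not the authors' workflow: the function name `overlap_fraction`, the `(chrom, start, end)` tuple layout, and the half-open coordinate convention are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def overlap_fraction(footprints, peaks):
    """Return the fraction of footprints overlapping at least one peak.

    Both inputs are iterables of (chrom, start, end) tuples with
    half-open coordinates [start, end). This layout is an assumption
    of the sketch, not something specified in the abstract.
    """
    # Group peak intervals by chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    # Merge overlapping or adjacent peaks so that each chromosome holds
    # disjoint intervals sorted by both start and end.
    starts, ends = {}, {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for s, e in intervals[1:]:
            if s <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], e)
            else:
                merged.append([s, e])
        starts[chrom] = [m[0] for m in merged]
        ends[chrom] = [m[1] for m in merged]

    total = 0
    hits = 0
    for chrom, start, end in footprints:
        total += 1
        chrom_ends = ends.get(chrom, [])
        # First merged peak whose end lies strictly after the footprint start.
        i = bisect_right(chrom_ends, start)
        # It overlaps if it also begins before the footprint ends.
        if i < len(chrom_ends) and starts[chrom][i] < end:
            hits += 1

    return hits / total if total else 0.0


if __name__ == "__main__":
    # Toy coordinates invented for illustration only.
    toy_footprints = [("chr1", 100, 120), ("chr1", 500, 520), ("chr2", 10, 30)]
    toy_peaks = [("chr1", 90, 110), ("chr1", 105, 200), ("chr2", 30, 60)]
    print(overlap_fraction(toy_footprints, toy_peaks))
```

In the toy data only the first footprint overlaps a peak; the `chr2` footprint merely touches a peak boundary, which does not count as overlap under half-open coordinates. Real analyses typically work from BED files and per-TF peak sets rather than in-memory tuples.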