Artur Jaroszewicz (1,2), Jason Ernst (1,2,3,4,5,6)
1. Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA.
2. Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA.
3. Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90095, USA.
4. Computer Science Department, University of California, Los Angeles, Los Angeles, CA 90095, USA.
5. Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA 90095, USA.
6. Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Abstract
MOTIVATION: Chromatin interactions play an important role in genome architecture and gene regulation. The Hi-C assay generates maps of such interactions genome-wide, but at relatively low resolution (e.g. 5-25 kb), which is substantially coarser than the resolution of transcription factor (TF) binding sites or open chromatin sites that are potential sources of such interactions.
RESULTS: To predict the sources of Hi-C-identified interactions at high resolution (e.g. 100 bp), we developed a computational method that integrates data from DNase-seq and from ChIP-seq of TFs and histone marks. Our method, χ-CNN, first uses these data to train a convolutional neural network (CNN) to discriminate between called Hi-C interactions and non-interactions. χ-CNN then predicts the high-resolution source of each Hi-C interaction using a feature attribution method. We show that these predictions recover the original Hi-C peaks after being extended to a coarser resolution. We also show that χ-CNN predictions are enriched for evolutionarily conserved bases, eQTLs and CTCF motifs, supporting their biological significance. χ-CNN provides an approach for analyzing important aspects of genome architecture and gene regulation at a higher resolution than previously possible.
AVAILABILITY AND IMPLEMENTATION: χ-CNN software is available on GitHub (https://github.com/ernstlab/X-CNN).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
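The attribution step summarized in the abstract can be illustrated with a minimal sketch. The abstract does not say which feature attribution method χ-CNN uses, so this example swaps in simple occlusion (zero out one 100 bp bin at a time and measure the drop in score) and replaces the trained CNN with a toy scoring function. All names here (`toy_score`, `occlusion_attribution`, `predict_source_bin`) and the synthetic data are hypothetical and are not taken from the χ-CNN software.

```python
import numpy as np

BIN_SIZE_BP = 100  # high-resolution bin width, following the abstract's "e.g. 100 bp"


def toy_score(features, weights):
    """Toy stand-in for a trained classifier (NOT the chi-CNN model).

    features: (n_bins, n_tracks) signal, e.g. DNase-seq / ChIP-seq per 100 bp bin.
    weights:  (n_tracks,) hypothetical per-track weights.
    Returns the sum over bins of a ReLU-activated weighted signal.
    """
    return float(np.maximum(features @ weights, 0.0).sum())


def occlusion_attribution(features, score_fn):
    """Score drop caused by zeroing each bin in turn (occlusion attribution)."""
    baseline = score_fn(features)
    drops = np.zeros(features.shape[0])
    for i in range(features.shape[0]):
        occluded = features.copy()
        occluded[i, :] = 0.0  # remove all signal from bin i
        drops[i] = baseline - score_fn(occluded)
    return drops


def predict_source_bin(features, score_fn):
    """Index of the bin whose removal lowers the score the most."""
    return int(np.argmax(occlusion_attribution(features, score_fn)))


# Synthetic 5 kb Hi-C anchor split into fifty 100 bp bins with three signal tracks.
rng = np.random.default_rng(0)
n_bins, n_tracks = 50, 3
features = rng.uniform(0.0, 0.1, size=(n_bins, n_tracks))  # weak background
features[17, :] = [5.0, 4.0, 6.0]                          # planted strong signal

weights = np.array([1.0, 0.5, 2.0])  # hypothetical, all positive


def score_fn(x):
    return toy_score(x, weights)


attribution = occlusion_attribution(features, score_fn)
source_bin = predict_source_bin(features, score_fn)
source_start_bp = source_bin * BIN_SIZE_BP  # offset within the coarse anchor
print(source_bin, source_start_bp)
```

In this toy setting the planted bin dominates the attribution scores, so the predicted source falls at the offset of that bin within the coarse anchor. A real application would replace `toy_score` with the trained network's output for an interacting pair of anchors.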