Yaqiang Cao (1), Zhaoxiong Chen (1,2), Xingwei Chen (1), Daosheng Ai (1), Guoyu Chen (1,2), Joseph McDermott (1,2), Yi Huang (1), Xiaoxiao Guo (2), Jing-Dong J Han (1,2).
1. CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.
2. Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing 100871, China.
Abstract
MOTIVATION: Sequencing-based 3D genome mapping technologies can identify loops formed by interactions between regulatory elements hundreds of kilobases apart. Existing loop-calling tools are mostly restricted to a single data type, depend for their accuracy on a contact matrix of predefined resolution or on previously called peaks, and can have prohibitive hardware costs.
RESULTS: Here, we introduce cLoops ('see loops') to address these limitations. cLoops is based on the clustering algorithm cDBSCAN, which directly analyzes the paired-end tags (PETs) to find candidate loops, and it uses a permuted local background to estimate statistical significance. These two data-type-independent processes enable loops to be reliably identified for both sharp and broad peak data, including but not limited to ChIA-PET, Hi-C, HiChIP and Trac-looping data. Loops identified by cLoops showed much less distance-dependent bias and higher enrichment relative to local regions than those identified by existing tools. Altogether, cLoops improves the accuracy of detecting 3D genomic loops from sequencing data, is versatile, flexible and efficient, and has modest hardware requirements.
AVAILABILITY AND IMPLEMENTATION: cLoops, with documentation and example data, is freely available at https://github.com/YaqiangCao/cLoops.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
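The idea of clustering PETs directly, rather than binning them into a contact matrix first, can be illustrated with a small sketch. The code below is a plain textbook DBSCAN in pure Python, not the authors' cDBSCAN and not the cLoops API; the function names, the toy coordinates, and the `eps` and `min_pts` values are all assumptions chosen for illustration. Each PET is treated as a 2D point (left-end position, right-end position), so a dense cluster of points corresponds to a candidate loop. The permuted-background significance test is not shown.

```python
from collections import deque


def region_query(points, i, eps):
    """Return indices of all points within Euclidean distance eps of points[i]."""
    xi, yi = points[i]
    eps_sq = eps * eps
    return [
        j
        for j, (xj, yj) in enumerate(points)
        if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps_sq
    ]


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)  # None = not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical toy PETs as (left-end position, right-end position) in bp.
pets = [
    (1000, 50000), (1100, 50100), (1050, 49950), (980, 50020),  # dense group A
    (200000, 400000), (200200, 400100), (199900, 399900),       # dense group B
    (5000, 900000),                                             # isolated PET
]
labels = dbscan(pets, eps=500, min_pts=3)
print(labels)
```

On this toy input the two dense groups come out as separate clusters and the isolated PET is labeled noise. A real loop caller would additionally need a significance test against a background model, which this sketch leaves out.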