Piotr Dittwald, Tomasz Gambin, Przemyslaw Szafranski, Jian Li, Stephen Amato, Michael Y Divon, Lisa Ximena Rodríguez Rojas, Lindsay E Elton, Daryl A Scott, Christian P Schaaf, Wilfredo Torres-Martinez, Abby K Stevens, Jill A Rosenfeld, Satish Agadi, David Francis, Sung-Hae L Kang, Amy Breman, Seema R Lalani, Carlos A Bacino, Weimin Bi, Aleksandar Milosavljevic, Arthur L Beaudet, Ankita Patel, Chad A Shaw, James R Lupski, Anna Gambin, Sau Wai Cheung, Pawel Stankiewicz.
Abstract
We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our chromosomal microarray analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large data set allowed us to investigate the relative frequencies of genomic rearrangements and to identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/velocardiofacial syndrome, 166). In the ∼25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5'-CCNCCNTNNCCNC-3', correlate with the frequencies of the recurrent CNV events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13, were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease.
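As an illustration of one of the features named in the abstract, the following is a minimal sketch of how the concentration of the HR hot spot motif 5'-CCNCCNTNNCCNC-3' in a sequence might be measured. The abstract does not define "concentration", so the definition used here (overlapping matches on both strands, per kilobase) is an assumption, and the function names are hypothetical rather than taken from the study.

```python
import re

# HR hot spot motif from the abstract; N is the IUPAC code for any base.
MOTIF = "CCNCCNTNNCCNC"


def motif_regex(motif):
    """Compile the motif into a regex; the lookahead lets overlapping hits count."""
    body = "".join("[ACGT]" if base == "N" else base for base in motif)
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_motif_hits(seq, motif=MOTIF):
    """Count motif occurrences on the forward and reverse strands (assumed definition)."""
    seq = seq.upper()
    pattern = motif_regex(motif)
    forward = sum(1 for _ in pattern.finditer(seq))
    reverse = sum(1 for _ in pattern.finditer(reverse_complement(seq)))
    return forward + reverse


def motif_density(seq, motif=MOTIF):
    """Motif hits per kilobase of sequence (assumed unit, not the paper's)."""
    if not seq:
        return 0.0
    return 1000.0 * count_motif_hits(seq, motif) / len(seq)
```

Applied to each DP-LCR sequence, a value of this kind could then be compared with the observed CNV frequencies alongside length, identity, and GC content.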
Year: 2013 PMID: 23657883 PMCID: PMC3759717 DOI: 10.1101/gr.152454.112
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. Schematic representation of LCR clustering. Horizontal arrows indicate LCR elements and their orientation; the same color represents a pair of paralogous LCRs. A hierarchical clustering tree is depicted above; the dashed horizontal line (violet) shows the height threshold for cutting this tree. Directly oriented paralogous LCRs (DP-LCRs) can potentially mediate NAHR events. The structure of LCR clusters (subunit structure, orientation, etc.) as well as the DNA sequence homology between LCR clusters flanking NAHR-prone regions often revealed extensive complexity, in contradistinction to the concept of a “segmental duplication” and more consistent with “complex LCR clusters” and with currently accepted models for generating duplications and complex genomic rearrangements; e.g., FoSTeS (Lee et al. 2007) or MMBIR (Hastings et al. 2009) (Supplemental Fig. S1).
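The tree-cutting step in Figure 1 can be sketched in a few lines. The caption does not state the linkage method, the distance measure, or the threshold, so the example below assumes single linkage with the distance between two LCR elements taken as the genomic gap between them; under those assumptions, cutting the tree at height `max_gap` is equivalent to merging sorted elements whose gap does not exceed that height. The function name and coordinates are hypothetical.

```python
def cluster_lcr_elements(intervals, max_gap):
    """Group (start, end) intervals on one chromosome into clusters.

    Equivalent to cutting a single-linkage tree at height max_gap when the
    distance between elements is the gap between them (assumed, not sourced).
    """
    clusters = []
    current = []
    current_end = None
    for start, end in sorted(intervals):
        if current and start - current_end <= max_gap:
            # Close enough to the running cluster: extend it.
            current.append((start, end))
            current_end = max(current_end, end)
        else:
            # Gap exceeds the threshold: close the cluster and start a new one.
            if current:
                clusters.append(current)
            current = [(start, end)]
            current_end = end
    if current:
        clusters.append(current)
    return clusters


# Illustrative coordinates only, not taken from the study.
example = [(10000, 10100), (100, 200), (250, 300)]
clusters = cluster_lcr_elements(example, max_gap=100)
```

Here the first two elements fall into one cluster and the distant third forms its own; a different linkage rule or a homology-based distance would generally give different clusters.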
Figure 2. Site frequency spectrum of known pathogenic (de novo, inherited, or unknown origin) deletions and duplications in the MGL BCM CMA database. The most commonly observed regions of genomic instability are NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DGS/VCFS, 166).
Figure 3. Known recurrent CNVs found in the MGL BCM CMA database divided into de novo (colored) and inherited (white), excluding ∼75% of events of unknown parental origin (i.e., no parental study was performed). Among de novo events, more deletions were found than reciprocal duplications.
Correlation of DP-LCR characteristics and frequency of de novo recurrent rearrangements
Correlation of LCR clusters' characteristics and frequency of de novo recurrent rearrangements
Figure 4. Four novel NAHR-prone regions on chromosome 2q12.2q13. (Top) Schematic representation of paralogous DP-LCRs (colored arrows) with their sequence homology and the distances between them. (Middle) UCSC display of LCR clusters and deletion CNVs found in patients 1–8. (Bottom) Deletion (red) and duplication (blue) CNVs from the DECIPHER and ISCA databases. Green arrows indicate the ST6GAL2, SLC5A7, EDAR, and RANBP2 genes proposed to contribute to the patients' phenotypes.
Figure 5. DNA sequence homology between four LCR clusters in the 2q12.2q13 region (chr2:106,985,338-110,870,754) for paralogous subunits larger than 1 kb in size (hg19). (Top and bottom) UCSC Segmental Duplications (segdup) track representing the 2q12.2q13 region. (Middle) Results of Miropeats program analysis among all four clusters.