We identified complex genomic rearrangements consisting of intermixed duplications and triplications of genomic segments at the MECP2 and PLP1 loci. These complex rearrangements were characterized by a triplicated segment embedded within a duplication in 11 unrelated subjects. Notably, only two breakpoint junctions were generated during each rearrangement formation. All the complex rearrangement products share a common genomic organization, duplication-inverted triplication-duplication (DUP-TRP/INV-DUP), in which the triplicated segment is inverted and located between directly oriented duplicated genomic segments. We provide evidence that the DUP-TRP/INV-DUP structures are mediated by inverted repeats that can be separated by >300 kb, a genomic architecture that apparently leads to susceptibility to such complex rearrangements. A similar inverted repeat-mediated mechanism may underlie structural variation in many other regions of the human genome. We propose a mechanism that involves both homology-driven events, via inverted repeats, and microhomologous or nonhomologous events.
One of the surprising outcomes of clinical implementation of tiling-path high-resolution comparative genomic hybridization arrays (aCGH) is the frequent observation of complex genomic rearrangements, some of which include triplications. Despite the clinical relevance of genomic triplications encompassing dosage sensitive genes to both diagnosis and prognosis, the molecular mechanism(s) for triplication formation are poorly understood. Triplications have remained an enigma potentially due to both a paucity of patients reported in the literature and experimental challenges to breakpoint determination; the latter information is a prerequisite to infer mechanism. We recently reported a cohort of 30 patients with MECP2 duplications in which we identified six patients (20%) with a triplicated segment embedded within the duplication. Preliminary molecular characterization of three tandem duplications revealed microhomology in two cases (3-4 bp in length). Breakpoints from one triplication suggested potential inversion occurring concomitant with triplication formation, but the mechanism for complex duplication/triplication formation remained perplexing.
Our observations led us to hypothesize that a replication-based mechanism, such as Break-Induced Replication (BIR)[1-5], Fork Stalling and Template Switching (FoSTeS)[6,7] and/or Microhomology-Mediated Break Induced Replication (MMBIR)[4,5] underlies formation of complex rearrangements including triplications and inversions. We also hypothesized that triplication involving dosage-sensitive genes, such as MECP2, could potentially produce a more severe clinical phenotype than duplications[8]. Therefore, to obtain further insight into the mechanism for triplication formation, as well as to learn how triplications impact the clinical phenotype, we studied nine patients from eight families with unique duplication and triplication rearrangements encompassing the MECP2 gene. 
Remarkably, our data unveiled a rearrangement product with shared structural features (Fig. 1) suggesting a common mechanism for complex duplication (DUP)/triplication (TRP) formation. Further analysis supports a role for a replication-based mechanism that relies on the presence of low copy repeats (LCRs) in an inverted orientation. The same structural pattern for rearrangement products was also observed in patients with triplications embedded in duplications at the PLP1 locus in Xq22, suggesting that the same specific mechanism might underlie triplication formation at other loci in the human genome.
Figure 1
General genomic structure of the complex rearrangements: triplications embedded in duplications
(a) Copy number dosage alteration inferred from aCGH. A typical aCGH experiment is shown for a complex DUP-TRP-DUP rearrangement. Transitions of copy number dosage alterations are demonstrated by black vertical dotted arrows; the sizes of the genomic segments defined by those boundaries vary in each individual complex rearrangement, and the segments are denoted a, b, c. The horizontal line below depicts the array data. Duplications are represented in red and triplications in blue; yellow arrows represent inverted repeats. (b) Representative figure of the genomic structure as determined by further analysis of copy number breakpoint junctions (jct1 and jct2) in five independent rearrangements using multiple molecular experimental approaches. The genomic segments involved are denoted a, b, c whereas the respective copy number gains are denoted as a’, b’, c’. These findings were corroborated by three additional independent rearrangements for which we obtained information for either jct1 or jct2 from two genomic loci. Change in orientation of the genomic segment (represented by black arrows) occurred at each of the junctions formed. DUPp: proximal transition/breakpoint of duplications; TRPp: proximal transition/breakpoint of triplication; DUPd: distal transition/breakpoint of duplications; TRPd: distal transition/breakpoint of triplications.
The severity of disease observed in patients with triplication positively correlates with the copy number status of MECP2 and IRAK1 in multiple patients studied, further confirming observations from case reports. Our findings elucidate a common structure DUP-TRP/INV-DUP as one potential outcome for human genome rearrangements that utilize inverted repeats as substrates for recombination. We further observed that an incremental increase of MECP2 dosage from two to three copies results in a more severe phenotype with additional novel and distinct clinical findings.
Results
Triplications embedded in duplications spanning MECP2
We previously identified six complex rearrangements (triplications embedded within duplications) in a cohort of 30 patients with MECP2 duplication by high-resolution human genome analysis using customized high-density array CGH[9]. An additional four subjects with a complex DUP-TRP-DUP pattern were identified by array CGH suggesting that complex rearrangements are a relatively frequently observed outcome of genomic alterations at this locus. We systematically investigated these complex rearrangements to characterize the molecular features of the rearrangement product. In total, we studied nine patients with triplications embedded in duplications; in five cases the MECP2 gene is included within the triplicated segment (Fig. 2a).
Figure 2
Individuals carrying complex triplications of Xq28
(a) Genomic region harboring the alterations involving MECP2. Duplications are represented in red and triplications in blue. Arrows on top of the BAB2769 and BAB2772 rearrangements indicate the position of the transitions to gain according to array-based Comparative Genomic Hybridization (aCGH); in 6 out of 8 cases (BAB2796 and BAB2980 are brothers), the distal transition/breakpoint of both duplications (DUPd) and triplications (TRPd) (indicated by arrows) cluster within a pair of LCRs termed K1 and K2 contrasting with the scattered nature of the proximal breakpoints of both duplications (DUPp) and triplications (TRPp) in the same group of patients. Vertical lines embedded within the rearrangement bars represent low copy repeat regions (LCRs) for which the copy numbers were not inferred due to poor probe coverage. (b) aCGH (Agilent Technologies, Santa Clara, CA) result for family HOU1217. Carrier mother BAB3115 with a de novo complex triplication that was transmitted to her son (BAB3114).
Both triplication and duplication sizes are unique in each family and ranged from 41 kb to 537 kb and from 444 kb to 5.7 Mb, respectively. The triplicated region includes the entire MECP2 gene in five patients: BAB2797, BAB2801, BAB2805, BAB3053 and BAB3114 (Fig. 2a). Oligonucleotide-array CGH revealed that all of the complex rearrangements were inherited from a carrier mother, except for BAB3053 who harbors a translocation to Yq11.22. The breakpoint at Yq11.22 was not precisely mapped due to the paucity of unique sequences on the Y chromosome.
We independently confirmed genomic triplications in each of the nine families by both Multiplex Ligation-Dependent Probe Amplification (MLPA) and Fluorescence in situ Hybridization (FISH) (Supplementary Fig. 1 and data not shown). The mothers and grandmothers when available for study were tested by both CGH and MLPA and were shown to carry the same complex rearrangement as their sons or grandsons in all but one family (pedigree HOU1217, Fig. 2b), in which aCGH studies revealed that the complex rearrangement was a de novo event in the mother (BAB3115) (Fig. 2b). X-chromosome inactivation (XCI) studies revealed 100% advantageously skewed XCI patterns in all carrier females tested (data not shown); i.e. consistent with preferential inactivation of the X chromosome harboring the complex rearrangement. Family pedigrees are displayed in Supplementary Fig. 2.
Duplicated and triplicated segments likely originate from the same chromosome
To examine for potential interchromosomal exchanges during rearrangement product formation, we evaluated marker haplotypes from the genomic interval spanning the complex rearrangement using the Illumina HumanOmni1-Quad microarray. All patients except BAB3053 were notable for absence of heterozygosity for all SNPs tested using this platform, including SNPs localized to both duplicated and triplicated genomic intervals (Supplementary Fig. 3, Supplementary Table 1). Subject BAB3053 carries a translocation of MECP2 sequences to Yq11 and shows multiple heterozygous SNPs, suggesting that this complex rearrangement was generated by a distinct mechanism.
The absence of heterozygosity observed in all nontranslocation DUP-TRP-DUP products suggests that the substrate(s) for these complex genomic rearrangements originated from a single chromosome. This contention is supported by the results obtained in family HOU1217, in which the patient BAB3114 inherited the complex rearrangement from his mother (BAB3115) who is a de novo carrier. SNP array analysis revealed that the segment to which the rearrangement maps was inherited from his maternal grandfather (Supplementary Fig. 4 and Table 2), suggesting a premeiotic event during male gametogenesis.
Triplicated segments are inserted in inverted orientation amid the duplications
Complex rearrangements can be defined by multiple breakpoint junctions or join points that juxtapose discrete genomic segments. The genomic rearrangement complexity was revealed by aCGH; however, aCGH provides neither orientation nor genomic positional information for the complex rearrangement but rather only copy number information. Based on aCGH results that demonstrated distinct transitions at gains of genomic intervals (i.e. duplication versus normal, triplication versus duplication), we initially hypothesized the existence of at least four potential breakpoints per patient; two for transitions to and from duplications (proximal and distal) and two for transitions to and from triplications (proximal and distal) (Fig. 1a and 2a). However, the simplest hypothesis is that each of the two duplication/triplication breakpoints was joined during rearrangement formation, ultimately producing only two breakpoint junctions, designated breakpoint junction 1, jct1 and breakpoint junction 2, jct2 (Fig. 1b).
To test this latter hypothesis, we first sought to obtain breakpoint junctions using both conventional and long-range PCR and by attempting to use primer pairs in all possible orientations; i.e. inwardly-facing, outwardly-facing, forward primer pairs, and reverse primer pairs. These primers were designed at the apparent boundaries, as denoted by transitions signifying a gain of each duplicated or triplicated segment relative to the reference genome as inferred from the aCGH results. In cases of failure to obtain breakpoint junctions using this assay, alternative experimental approaches were attempted. 
These alternative approaches included inverse PCR (iPCR) or Southern analyses, both of which have the advantage of not relying on any preconceived notion of genome structure for the rearrangement.
Southern blotting was used to analyze the recurrent breakpoint junctions mapping to the inverted repeat pair of low copy repeats (LCRs) K, which is involved in six out of eight independent complex rearrangements in our cohort (Fig. 2a). This assay was performed as described previously[10]; for males, the expected result was either a 30.7 kb band corresponding to a reference size structural variation haplotype (H1) or an 18.2 kb band corresponding to a polymorphic inversion of the region flanked by the LCR K1 and the LCR K2 that is present in 18% of the population of European-descent[10] (H2) (Fig. 3a). Females could potentially carry either one allele (the 30.7 kb or 18.2 kb) in the homozygous state or both alleles as heterozygotes (NA15510, Fig. 3b). To our surprise, all male samples carrying dup/trip involving the LCR K1 and the LCR K2 (BAB2772, BAB2796/BAB2980, BAB2797, BAB2801, BAB2805 and BAB3114) yielded the same pattern consisting of two bands, 18.2 kb and 30.7 kb, corresponding to those usually observed with the H2 and H1 inversion haplotype structures, respectively. We surmised that the unexpected presence of both bands in all male patients was a result of rearrangement formation which suggests that all seven samples have a common jct1 structure. In addition, an 18.2 kb band is expected if the centromeric-flanking region (which contains the TKTL1 gene) is duplicated and inverted whilst still flanking the LCR K1 and the LCR K2 on either the reference (H1dup) or the inverted structure (H2dup) on the ancestral chromosome (Fig. 3a). Therefore, we hypothesized that the 18.2 kb band corresponds to jct1 and, by inference that the 30.7 kb band corresponds to the ancestral state (H1 structure) in these chromosomes. 
We confirmed this hypothesis using the haplotype data obtained from SNP arrays; all patients in our cohort carry the SNP haplotype associated with the H1 structure (Supplementary Fig. 5).
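The band logic described above can be summarized in a short sketch. This is only an illustration of the reasoning, not part of the original analysis; the function name `expected_bands` and its arguments are hypothetical, and the only values taken from the text are the 30.7 kb (H1) and 18.2 kb (H2 or jct1) fragment sizes.

```python
# Band sizes (kb) for the BglII digest probed at EMD, as reported in the text.
H1_BAND_KB = 30.7  # reference orientation (H1)
H2_BAND_KB = 18.2  # inversion haplotype (H2); also the size expected for jct1


def expected_bands(haplotypes, has_jct1=False):
    """Return the sorted band sizes expected on the Southern blot.

    haplotypes: iterable of "H1"/"H2", one entry per X chromosome
                (one for males, two for females).
    has_jct1:   hypothetical flag, True if a DUP-TRP/INV-DUP junction
                at LCRs K1/K2 is present on one chromosome.
    """
    sizes = {"H1": H1_BAND_KB, "H2": H2_BAND_KB}
    bands = {sizes[h] for h in haplotypes}
    if has_jct1:
        # The junction fragment is expected at 18.2 kb on either background.
        bands.add(H2_BAND_KB)
    return sorted(bands)


print(expected_bands(["H1"]))                 # [30.7]
print(expected_bands(["H1"], has_jct1=True))  # [18.2, 30.7]
print(expected_bands(["H1", "H2"]))           # [18.2, 30.7]
```

Under these assumptions a male with an H1 chromosome carrying jct1 gives the same two-band pattern as a heterozygous female, which is why the haplotype data are needed to assign the 30.7 kb band to the ancestral H1 structure.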
Figure 3
Southern blot analysis of the region flanked by LCRs K1 and K2 at the Xq28 chromosome
LCRs K1 and K2 are approximately 11.3 kb in length, and are located ~38 kb apart in inverted orientation. Their 99% nucleotide sequence identity is likely maintained by frequent gene conversion[34]. These LCRs flank two genes, FLNA and EMD. (a) Yellow arrows indicate an inversion: cent-FLNA/EMD-tel (H1) and the alternative genomic orientation cent-EMD/FLNA-tel (H2). An 18.2 kb band is expected to be produced in either inversion haplotype background (H1dup or H2dup) upon duplication and inversion; we hypothesize that the 18.2 kb band includes the breakpoint junction 1 (jct1). To test the haplotype of our cohort and to map the duplication breakpoints, we performed a Southern blot assay as shown here. Genomic DNA was digested with BglII; EMD was targeted using a PCR-based probe. The reference genome H1 produces a 30.7 kb band whereas the inversion haplotype (H2) yields an 18.2 kb band. (b) Southern blot results for patients carrying triplications embedded within duplications: BAB2769, BAB2772, BAB2796/BAB2980, BAB2797, BAB2801, BAB2805 (left) and family HOU1217 (right). NA10851: male carrying the reference haplotype (H1); NA15510: heterozygous female carrying both reference and inversion haplotypes (H1 and H2). BAB2771: patient carrying MECP2 duplication not involving LCR K1 and LCR K2 (H1).
Three important conclusions can be drawn from these experimental observations: 1) the inverted LCRs K1 and K2 likely mediated the rearrangements; 2) the new segment copied (containing the TKTL1 gene) was inserted in an inverted orientation with respect to the original copy; and 3) a second event, likely represented by jct2, must have occurred in order to “reverse” the inversion process. Supporting our experimental observations, a de novo complex rearrangement occurring in association with sporadic disease in family HOU1217 revealed a novel formation of the 18.2 kb band in addition to the 30.7 kb band already present in all family members. As anticipated, BAB2769 has a 30.7 kb band size corresponding to the reference structural H1 haplotype but no 18.2 kb band, consistent with the fact that BAB2769 is the only sample for which the complex rearrangement does not include the LCRs K1 and K2.
Jct1 for patient BAB2769 was obtained by sequencing across the junction using reverse primer pairs positioned at the proximal ends of the duplication and triplication, respectively. Remarkably, the junction consists of two identical 149 bp segments, present as two small inverted repeats (856 bp) located 317.8 kb apart from each other in the haploid reference human genome sequence (Fig. 4). These inverted repeats are 98% identical in sequence. Thus, in all seven cases in which jct1 was identified, an inverted repeat was located at the breakpoint junction.
Figure 4
Rearrangement structure for patients BAB2769, BAB2772, BAB2796/BAB2980, BAB2797 and BAB2805 based on aCGH, Southern blotting and breakpoint sequencing
(a) Genomic region harboring duplications and triplications spanning chromosome Xq28 according to aCGH: duplications are represented in red, triplications in blue. Arrows on top of the rearrangements indicate the position of the breakpoints; inverted repeats involved in the rearrangement are represented as yellow arrows. Letters a, b and c represent the segments with copy-number gain. (b) Individual genomic structure of the region involved in the rearrangement as inferred by analysis of breakpoint junction 1 (jct1) and breakpoint junction 2 (jct2) for each patient. Jct1* represents those junctions analyzed by Southern blotting (please refer to Fig. 3); all others were sequenced. Genomic positions of each of the junctions are shown. Breakpoint junction sequences are color-coded to highlight their segment of origin in the reference genome (duplications in red and triplications in blue). The triplicated segment (b’) is inserted amid a normal (a, b, c) and a duplicated copy (a’, c’) in inverted orientation as supported by jct1 and jct2 analysis. This structure was further confirmed by FISH for patient BAB2805 (Supplementary Fig. 6). Microhomologies observed at the junctions are represented by underlined black letters; insertions or mismatches at the junctions are represented in green; deletions are represented by dashes; mismatches between the reference sequences and patient sequences are marked with a green asterisk underneath.
Jct2 in five out of eight rearrangements (BAB2769, BAB2772, BAB2796/BAB2980, BAB2797 and BAB2805) was obtained by PCR (regular, long-range or inverse PCR). For patients BAB2772, BAB2796/BAB2980, BAB2797 and BAB2805, the breakpoints were obtained using reverse primers at the proximal ends of the duplication and the triplication, respectively. Jct2 in patient BAB2769 was obtained using forward primer pairs designed at the distal ends of the duplication and triplication, respectively. Routine PCR was attempted first followed by sequencing of the PCR products. One junction was obtained by iPCR (BAB2805); three samples (BAB2801, BAB3053 and BAB3114) were refractory to all attempts to amplify a unique breakpoint junction.
Analysis of the breakpoint sequences of jct2 revealed that the triplicated segment is inverted relative to the duplicated segment in all patients (BAB2769, BAB2772, BAB2796/BAB2980, BAB2797 and BAB2805). Microhomologies of 2 to 4 nucleotides were observed in two out of five cases (BAB2772 and BAB2769, Fig. 4); in two cases, one nucleotide A or two nucleotides AA were inserted at the junction (BAB2796/BAB2980 and BAB2797); in one case (BAB2805), the junction was perfectly joined. In all five cases, one of the breakpoints occurred within or adjacent to a repetitive sequence element such as a SINE or a LINE (Table 1). A few nucleotide dissimilarities flanking the junctions were observed in two cases (BAB2772: transversion C=>G; BAB2805: deletion of one G, Fig. 4); we interpret these dissimilarities to be likely population polymorphisms that are not yet deposited in the dbSNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/). Alternatively, there is evidence that the polymerase(s) involved in break-induced-replication (BIR) are ‘error prone’, with poor processivity[2] at initiation followed by lower replication fidelity compared to normal DNA replication[11]. The jct2 for subject BAB2769 reveals a break that we interpret as two template-switching events. 
The first event is represented by a GC microhomology that connects the distal duplication breakpoint to the distal triplication breakpoint; the second event is represented by a microhomology of CAGC accompanying a deletion of 23 bp on the distal triplication side (Fig. 4).
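The three junction classes described above (microhomology, inserted nucleotides, perfect join) can be distinguished by comparing a junction read against the two reference flanks. The following is a minimal sketch under simplifying assumptions: the read is already oriented, `left_ref` is a reference sequence aligned to the start of the read, `right_ref` is a reference sequence aligned to the end of the read, and there are no sequencing errors. The function names and the toy sequences are hypothetical and are not the patient sequences shown in Fig. 4.

```python
def common_prefix_len(a, b):
    """Length of the shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def classify_junction(read, left_ref, right_ref):
    """Classify a breakpoint junction as microhomology, blunt, or insertion.

    left_ref is compared with the read from its first base onward;
    right_ref is compared with the read from its last base backward.
    """
    left = common_prefix_len(read, left_ref)
    right = common_prefix_len(read[::-1], right_ref[::-1])
    overlap = left + right - len(read)
    if overlap > 0:
        # Bases that match both references at the join point.
        return ("microhomology", read[len(read) - right:left])
    if overlap == 0:
        return ("blunt", "")
    # Bases matching neither reference are treated as inserted.
    return ("insertion", read[left:len(read) - right])


# Toy sequences (hypothetical), chosen so that "GC" is shared by both flanks.
print(classify_junction("AAAAGCGGGG", "AAAAGCTTTT", "CCCCGCGGGG"))
# ('microhomology', 'GC')
```

A real analysis would align reads to the reference genome and tolerate mismatches; this sketch only shows how the overlap between the two flank matches determines the junction class.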
Table 1
Presence of inverted repeats at the breakpoint junctions of genomic triplications observed in the present study and in the literature
Chromosomal alteration | Patient ID | Size[*] | Method used to detect complexity | Complexity pattern | Jct1 | Jct2 | Reference
dup PLP1 | BAB1290 | 800 kb | aCGH | dup-trp[a]-dup | IRs at dist brkpt | NA | [6]
dup PLP1 | BAB1612 | 125 kb | aCGH | dup-trp/inv[a]-dup | IRs at dist brkpt | Microhomology ACCT, Prox trip: L2 | [35]; This study
dup PLP1 | BAB2389 | 4.0 Mb | aCGH | dup-trp[a]-dup | IRs at dist brkpt | NA | [6]
dup MECP2 | BAB2769 | 651 kb | aCGH, Brkpt seq, Southern | dup-trp/inv-dup | IRs at prox brkpt | Microhomologies GC, CAGC, 23 nt del; Dist dup: MER73 |
As defined in this present study, jct1 and jct2 are join points of duplications and triplications produced in the same molecular event. In this table, they were arbitrarily assigned to each of the two duplication/triplication join points of complex rearrangements published by other groups.
[*] This is the size of the whole region involved in the complex rearrangement.
** aCGH was performed in the mother only.
[a] The orientation of the triplicated segment is unknown.
[b] There is no evidence for duplication.
In summary, analysis of two breakpoint junctions (jct1 and jct2) from each of five unrelated patients with triplications embedded within duplications at the Xq28 chromosome reveals a common structure in that the triplication was inserted in an inverted orientation within the duplication (i.e. DUP-TRP/INV-DUP). FISH experiments in patient BAB2805 reveal a pattern consistent with this DUP-TRP/INV-DUP genomic structure (Supplementary Fig. 6). Furthermore, in all cases one of the junctions of the rearrangement involved an inverted repeat pair, with the inverted genomic segments either closely approximated (38 kb) or separated by a sizable distance (>300 kb). These shared genomic architectural features are observed at breakpoints of all complex duplication/triplication alterations at the MECP2 locus analyzed herein.
Inverted repeats mediate triplication at the PLP1/Xq22 region
We have provided evidence to demonstrate that inverted repeats between 856 bp and 11.3 kb in length with at least 98% sequence identity and separated by ~38 kb to ~318 kb can mediate complex triplications (DUP-TRP/INV-DUP) at the MECP2 locus; and that the genomic rearrangement likely involved only one chromosome homologue. We applied these “rules” to reanalyze the breakpoint junctions of previously published DUP-TRP-DUP cases involving the PLP1 locus at Xq22[6,12] (see Table 1 and Supplementary Fig. 7). Remarkably, all three cases present the same pattern observed in DUP-TRP-DUP cases at Xq28: clustering of distal duplication and triplication breakpoints at a pair of inverted repeats (jct1) with high identity between each paralogous segment (~98.9% nucleotide identity, in this case separated by ~64 kb) plus scattered proximal breakpoints (jct2). In addition, sequencing of the proximal triplication breakpoint junction (jct2) in patient BAB1612 demonstrated inversion in regard to the reference genome and connection to the proximal duplicated segment consistent with a DUP-TRP/INV-DUP structure.
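The “rules” above amount to a simple filter on repeat pairs. The sketch below is illustrative only: the `RepeatPair` record and `within_observed_range` function are hypothetical names, the thresholds are the ranges observed at the MECP2 locus as stated in the text, and they should not be read as biological limits on which repeats can mediate such events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepeatPair:
    name: str
    length_bp: int     # length of each repeat copy
    identity: float    # fractional sequence identity between copies
    spacing_bp: int    # distance separating the two copies
    inverted: bool     # True if the copies are in inverted orientation


# Ranges observed at the MECP2 locus (values from the text).
MIN_LENGTH_BP, MAX_LENGTH_BP = 856, 11_300
MIN_IDENTITY = 0.98
MIN_SPACING_BP, MAX_SPACING_BP = 38_000, 318_000


def within_observed_range(pair):
    """True if the pair falls inside the ranges seen for DUP-TRP/INV-DUP."""
    return (
        pair.inverted
        and MIN_LENGTH_BP <= pair.length_bp <= MAX_LENGTH_BP
        and pair.identity >= MIN_IDENTITY
        and MIN_SPACING_BP <= pair.spacing_bp <= MAX_SPACING_BP
    )


lcr_k = RepeatPair("LCR K1/K2", 11_300, 0.99, 38_000, True)
small_ir = RepeatPair("BAB2769 IR pair", 856, 0.98, 317_800, True)
direct = RepeatPair("hypothetical direct repeat", 5_000, 0.99, 50_000, False)

print([within_observed_range(p) for p in (lcr_k, small_ir, direct)])
# [True, True, False]
```

Applied genome-wide, a filter of this kind would only nominate candidate regions; each candidate would still need experimental confirmation.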
Phenotypic consequences of DUP-TRP/INV-DUP
The complex DUP-TRP/INV-DUP products vary in size for both triplicated and duplicated intervals (Fig. 2a). In the case of complex Xq28 genomic rearrangements, the MECP2 gene was either duplicated or triplicated. This distinction provided a unique opportunity to assess the phenotypic consequences of incremental increases in MECP2 gene dosage. The MECP2 gene was entirely mapped within the triplicated genomic interval in five patients with the complex DUP-TRP/INV-DUP rearrangement. Similar to observations in a previous case report[8] and observations in patients without precise breakpoint junction mapping[13], the phenotype associated with MECP2 triplication was more severe than that observed for MECP2 duplication.
The most salient clinical findings are summarized in Supplementary Table 3 (for complete clinical descriptions, please see Supplementary Note). Note that early respiratory insufficiency with an oxygen or ventilation requirement, early dysphagia and requirement for a feeding tube, hearing loss, and minor cardiac defects are much more commonly observed with MECP2 triplication (100%) compared with MECP2 duplication (0% to 25%), a robust observation even when compared to the collective published data on boys with MECP2 duplication[13]. Moreover, polyhydramnios and intestinal pseudoobstruction were observed clinically only in subjects with triplication. Interestingly, patients BAB2805 and BAB3114 were reported to have Xq28/MECP2 duplications by the diagnostic laboratories that performed their clinical chromosome microarray analysis. We correctly anticipated that the MECP2 gene was triplicated in patients BAB2805 and BAB3114 based on the observed clinical phenotype. Routine clinical diagnostic testing correctly identified Xq28/MECP2 triplication in the remaining three children with MECP2 triplications.
Discussion
We demonstrate from repeated independent cases of complex rearrangement at the MECP2 locus that a particular genomic rearrangement structure DUP-TRP/INV-DUP is associated with a specific and common pattern of underlying genomic architecture, namely the presence of inverted repeats separated by distances of up to hundreds of kb, including one pair too small (i.e. <1 kb) to be called segmental duplication under the current definition[14]. We show that these complex rearrangement events appear to involve a single X-chromosome homologue, likely in the male germline, generating carrier daughters and affected grandsons. The involvement of a single homologue is consistent with studies of copy number gain in other X-chromosome loci including duplications involving the DMD locus[15] and PLP1[16]. Furthermore, we provide evidence that DUP-TRP/INV-DUP occurring at the PLP1 locus is also associated with underlying inverted repeat genomic architecture.
Whereas many genomic disorders have been shown to result from CNV due to either duplication or deletion at a given locus, our data clearly show that triplication of MECP2 conveys a more severe, distinct and clinically recognizable syndrome.
Triplications embedded in duplications may share the same general genomic structure
We observed triplications embedded in duplications in at least 20% of the rearrangements involving MECP2 copy number gain[9]. This observation is supported by two recent reports in which triplications were observed in 2 out of 9 patients[17] and in 2 out of 4 patients[18]. Our data show that triplications embedded in duplications at Xq28 share a common structure and reveal a potential common formation mechanism. This mechanism: i) requires two breakpoint junctions: one (jct1) invariably maps within inverted repeats with at least 98% sequence identity that can be separated by up to hundreds of kb, the second (jct2) is scattered and does not occur at sites of sequence homology; ii) the triplicated segment is inserted in an inverted orientation between duplicated sequences in direct orientation (one of the copies in direct orientation corresponds to the original copy); iii) the second breakpoint junction presents no extensive homology although some microhomologies may be found at the junction (e.g. BAB2769); iv) all extra segments (duplications and triplications) apparently originated from only one chromosome. The same pattern was also observed in patients carrying triplications embedded in a duplication reported at the PLP1 locus (Table 1 and Supplementary Fig. 6), suggesting that the mechanism producing triplication at Xq28 is also responsible for triplication formation elsewhere in the genome.
Towards a mechanism of formation of triplications embedded in duplications
We propose that DUP-TRP/INV-DUP complex rearrangements are formed by a combination of homology-directed BIR with microhomology-mediated BIR or non-homologous end joining (NHEJ) as described in Fig. 5. BIR is a mechanism that uses homologous recombination to restart a collapsed (broken) replication fork[1-5]. During this process, a 3’-tail at the broken DNA end invades the sister molecule from which it broke. The 3’-end primes DNA synthesis and forms a replication fork. BIR, in the cases discussed here, occurred non-allelically (ectopically), using the homology of the inverted LCR. This has the effect of synthesizing a length of chromatid back in the opposite direction from that in which the fork had been traveling, forming a large inverted duplication. This is unlikely to result in a healthy viable cell unless a second compensating inversion event occurs. If the reversed replication fork again collapses, or if there is a double-strand break in the chromatid carrying the inverted duplication, then there exists a new DNA end. This end could again be repaired by BIR. However, in the patients we studied, rejoining did not use homologous sequence. Instead, ends joined in inverted orientation to the unchanged chromatid by non-homologous end joining[19] (if there was a second break), or by a replicative mechanism such as MMBIR[4] or break-induced serial replication slippage (BISRS)[20]. These mechanisms are suggested to substitute for BIR when, for any reason, homologous recombination repair is unavailable. Such mechanisms yield the non-homologous or microhomologous joints that we see, including the complexities that are characteristic of events attributed to MMBIR. The complex microhomologous events observed in BAB2769 (Fig. 4) are characteristic of events attributed to the MMBIR mechanism[4,6,9,21,22].
Figure 5
Proposed model for generation of the common DUP-TRP/INV-DUP rearrangement product
(a) The rearrangement may have occurred during spermatogenesis in the ancestral male on his X chromosome, likely during S or G2 phase. (b) During replication of the sister chromatid, the replication fork may collapse and induce break-induced replication (BIR). (c) BIR uses homologous recombination to re-establish a new fork using ectopic homology provided by inverted repeats forming jct1. (d) This event initiates replication that forms a length of chromatid back in the opposite direction from that in which the fork had been traveling before the collapse, forming a large inverted duplication (e). If the reversed replication again collapses, or if there is a double-strand break in the chromatid carrying the inverted duplication, then there is a new DNA end (f). In our patient cohort, this new end rejoining (jct2 formation) occurred by either non-homologous end joining (NHEJ) requiring a double-strand break (DSB) on the original strand followed by ligation of that segment to the end of the rearranged newly replicated strand, or by a replication mechanism such as microhomology-mediated break-induced replication (MMBIR) (g) that requires a new strand annealing and extension to the end of the replicon or the chromosome. (h) Representative structure obtained after two steps of homologous and nonhomologous mechanisms. Duplicated and triplicated segments are represented in red and blue colors, respectively.
If this second inverted join links the duplication-carrying chromatid to the intact sister chromatid in direct orientation so that it compensates for the first inversion, and if the joint occurs beyond the length already duplicated, then a triplication embedded in a duplication will result (Fig. 5). Six out of seven of the events reported here are interpreted as having initiated from replication forks oriented away from the centromere, and the seventh event commenced from a fork oriented towards the centromere. However, the sequence of events in the two configurations is the same. This mixture of orientations indicates that a dicentric intermediate was not necessary in this process. Thus, the model we propose is a two-step mechanism: BIR followed by a non-homologous or microhomologous mechanism, probably occurring during S or G2 phase in a single pre-meiotic cell in a male gonad.
The relevance of this model for the formation of triplications in other parts of the genome, as well as for the formation of novel inversions, remains to be determined. Duplications embedded in triplications inserted in inverted orientation have been reported, such as a triplication encompassing chromosome 9q34[23] and large interstitial triplications associated with inversions detected by FISH, including 15q11-q13[24], 2q11.2-q21[25] and 13q12q22[26] (Table 1). The observation of co-occurring triplications and inversions in other genomic disorders suggests that this mechanism may underlie triplication formation at other sites of the human genome.
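The copy-number consequence of the two-step model can be checked with a small sketch. The Python snippet below is purely illustrative and is not part of the analysis reported here. It treats the rearranged chromatid as a walk over reference coordinates and collapses each inverted repeat to a single point. The breakpoints a < b < c < d, the chromosome length and the function name copy_number_profile are all hypothetical values chosen for this example.

```python
def copy_number_profile(traversals):
    """Return (start, end, copies) for each interval between breakpoints.

    Each traversal is a (from, to) pair of reference coordinates;
    from > to means the segment was synthesized in inverted orientation.
    """
    points = sorted({p for t in traversals for p in t})
    profile = []
    for lo, hi in zip(points, points[1:]):
        copies = sum(
            1 for s, e in traversals
            if min(s, e) <= lo and hi <= max(s, e)
        )
        profile.append((lo, hi, copies))
    return profile


# Hypothetical coordinates in arbitrary units (assumptions, not patient data).
a, b, c, d = 100, 200, 300, 400
CHROM_END = 500

traversals = [
    (0, d),          # replication proceeds until the fork collapses at d
    (c, b),          # jct1: BIR via the inverted repeat, synthesis runs backwards
    (a, CHROM_END),  # jct2: rejoining to the sister chromatid in direct orientation
]

profile = copy_number_profile(traversals)
junctions = len(traversals) - 1

for start, end, copies in profile:
    print(f"{start}-{end}: {copies} cop{'y' if copies == 1 else 'ies'}")
print(f"breakpoint junctions: {junctions}")
```

Under these assumptions the walk has three traversals and therefore two junctions, and the copy number across the five intervals reads 1, 2, 3, 2, 1: a triplicated segment, one copy of which is inverted, flanked by duplicated segments. Moving a closer to b or d closer to c shortens the duplicated flanks without changing the overall pattern.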
Inverted repeat genome architecture
Evidence that inverted repeats and palindromes can interfere with the replication process and lead to chromosomal rearrangements has been accumulating. Lebofsky and Bensimon[27] studied the replication of the palindrome-laden human rDNA gene array using DNA molecular combing in HeLa cells and observed fork arrest associated with the presence of palindromic structures. Inverted Alu repeats close enough to form hairpins can cause a replication blockage in E. coli, yeast and mammalian cells in a homology-dependent manner[28]. The inverted repeats involved in the rearrangements observed in our cohort are not palindromes, as the spacer distance is too long; therefore, thus far there is no evidence that in these cases secondary structures such as hairpins and cruciforms are causing fork stalling or fork collapse. However, our data add evidence that inverted repeats, even at a distance, can lead to rearrangements and can contribute to local instability in the human genome.
Recently, Paek et al.[29] observed fusion of nearby inverted repeats in budding yeast, and Mizuno et al.[30] observed similar events in fission yeast. Paek et al.[29] demonstrated that formation of dicentric and acentric fragments in budding yeast leads to further chromosome instability; Mizuno et al.[30] also showed that formation of dicentric and acentric chromosomes followed replication fork arrest within palindromes. There was no evidence of involvement of either double-strand breaks or homologous recombination proteins in this process, which is stimulated upon disruption of DNA replication[29]. The mechanism that Paek et al.[29] proposed, termed “faulty template switching”, relies on homology between the inverted repeats to restart a stalled fork that has undergone fork reversal; if the nascent strand pairs with the nearby inverted copy, this leads to an inverted repeat fusion and results in the formation of an unstable dicentric chromosome prone to undergo further rearrangements[29]. Interestingly, paralleling our data, the inverted repeats involved could be several kilobases apart and share stretches of nucleotide identity as short as 20 bp. It remains unknown whether fork collapse potentially associated with inverted repeat-directed DUP-TRP/INV-DUP formation can be stimulated by fork reversal at inverted repeats, perhaps brought into proximity by a replication factory[31] or by long-distance transcriptional regulatory complexes, or by ‘head-on’ and/or co-directional collisions between replication and transcription[32,33].
In conclusion, we document that inverted LCRs in the vicinity of MECP2 mediate genomic disorder-associated complex rearrangements that have a particular genomic structure, DUP-TRP/INV-DUP. Furthermore, such a structure is also observed at the PLP1 locus in association with inverted repeats. These genomic instability considerations are likely to apply wherever inverted repeats occur within a range of hundreds of kilobase pairs of DNA in the human genome. Moreover, structural variation in personal genomes may result in individual-specific structural haplotypes that are more susceptible to the events reported herein. Of note, inversion during rearrangement formation can generate remarkable complexity with only two breakpoint junctions. Furthermore, multiple genic changes (e.g. gene interruptions, fusions and dosage changes) can arise from a single mutational event, suggesting that complex genomic rearrangements such as DUP-TRP/INV-DUP may have an important role in evolution.