Zeinab Fadaie, Kornelia Neveling, Tuomo Mantere, Ronny Derks, Lonneke Haer-Wigman, Amber den Ouden, Michael Kwint, Luke O'Gorman, Dyon Valkenburg, Carel B Hoyng, Christian Gilissen, Lisenka E L M Vissers, Marcel Nelen, Frans P M Cremers, Alexander Hoischen, Susanne Roosing.
Abstract
The lack of molecular diagnoses in rare genetic diseases can be explained by limitations of current standard genomic technologies. Upcoming long-read techniques have complementary strengths to overcome these limitations, with a particular strength in identifying structural variants. By using optical genome mapping and long-read sequencing, we aimed to identify the pathogenic variant in a large family with X-linked choroideremia. In this family, aberrant splicing of exon 12 of the choroideremia gene CHM was detected in 2003, but the underlying genomic defect remained elusive. Optical genome mapping and long-read sequencing approaches have now revealed an intragenic 1,752 bp inverted duplication including exon 12 and surrounding regions, located downstream of the wild-type copy of exon 12. Both breakpoint junctions were confirmed with Sanger sequencing and segregate with the X-linked inheritance in the family. The breakpoint junctions displayed sequence microhomology suggestive of an erroneous replication mechanism as the origin of the structural variant. The inverted duplication is predicted to result in hairpin formation of the pre-mRNA with the wild-type exon 12, leading to exon skipping in the mature mRNA. The identified inverted duplication is deemed the hidden pathogenic cause of disease in this family. Our study shows that optical genome mapping and long-read sequencing have significant potential for the identification of (hidden) structural variants in rare genetic diseases.
Keywords: CHM; RNA hairpin structure; choroideremia; inverted duplication; long-read sequencing; optical genome mapping; structural variation
Year: 2021 PMID: 35047838 PMCID: PMC8756506 DOI: 10.1016/j.xhgg.2021.100046
Source DB: PubMed Journal: HGG Adv ISSN: 2666-2477
Figure 1. The pedigree of family A affected with choroideremia
The identified inverted duplication segregates with the disease in family A. The affected female individuals manifest the phenotype as well. DNA material of the affected male individual indicated by the arrow was utilized for optical mapping and long-read sequencing analysis.
Figure 2. Identification of the intragenic inverted duplication through optical mapping and long-read sequencing
(A) Schematic representation of the inverted duplication in the CHM locus that has been identified in family A. (B) The result of optical genome mapping revealed an insertion of 1,573 and 1,549 bp within a 15.9 kb region upstream of CHM exon 12 in the affected individual compared to the reference genome. The green bar demonstrates the genome map of the reference genome. The blue bars show the genome maps of the affected individual; these two maps are only distinguished by one label difference. Both structural variant calls are shown on top, both calling an insertion within the region of interest. (C) By using long-read sequencing, the insertion first seen by optical mapping was identified as an intragenic inverted duplication. Two out of four reads covering this region span the inverted duplication completely (reads 1 and 4), whereas the two other reads (reads 2 and 3) do not span the entire event. CHM is located on the minus strand (3′ to 5′); however, the results shown in this figure are provided for the plus strand.
Figure 3. Assessment of microhomology at 5′ and 3′ breakpoints
(A) A schematic representation of the genomic region of exon 13 to exon 12 of CHM (5′→3′). (B and C) The 5′ and 3′ breakpoint regions of the inverted duplication event were assessed for the presence of microhomology using multiple sequence alignment with the Clustal Omega tool. (B) Analysis of the reference fragment spanning the insertion site c.1510+693 and c.1510+694 (upper sequence) and the reference sequence spanning the c.1510+402 region (BP-5′, lower sequence) showed a microhomology region of 8 nucleotides. (C) Analysis of the reference fragment spanning the insertion site c.1510+693 and c.1510+694 (upper sequence) and the reference sequence spanning c.1414−1244 (BP-3′, lower sequence) showed microhomology of 4 nucleotides. 60 bp reference sequences spanning each position were used as input. The start and end positions of the assessed sequences are provided. The reference sequence is indicated in black; the observed sequence as in family A is marked in red and green. Homology between the reference and observed sequence is shown with a vertical black line, and the regions of microhomology are highlighted in the yellow boxes.
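The idea of a shared microhomology stretch between two breakpoint-flanking sequences can be illustrated with a minimal sketch. It does not reproduce the Clustal Omega alignment used for this figure; it swaps in a simpler technique, a longest-common-substring search. The function name `longest_shared_run` and both input sequences are hypothetical toy values, not the CHM reference sequences.

```python
def longest_shared_run(a: str, b: str) -> str:
    """Return the longest substring found in both a and b.

    Simplified stand-in for alignment-based microhomology assessment:
    classic dynamic programming over matching suffix lengths.
    On ties, the run found first (scanning a left to right) is returned.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous position of both strings.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical toy sequences (NOT taken from the CHM locus).
insertion_site = "CCCCGATTACAGTTTT"
breakpoint_region = "AAATGATTACAGCCAA"

shared = longest_shared_run(insertion_site, breakpoint_region)
print(shared, len(shared))
```

In practice, 60 bp windows around each breakpoint would be supplied as input, and a short shared run near the junction would be the candidate microhomology.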
Figure 4. A proposed FoSTeS/MMBIR mechanism underlying the origin of the inverted duplication
(A) A schematic representation of the genomic region from intron 11 to intron 13 of CHM is shown as present on the reverse strand. The relevant nucleotides for the proposed model are depicted. A red dotted line represents the location of the sticky end break. (B) The proposed mechanism of the SV is illustrated, i.e., (1) the DNA polymerase synthesizes the DNA from 5′ to 3′, (2) the polymerase stalls due to the sticky end break at position c.1510+693, and (3) template switching to the forward strand of CHM (indicated in red) occurs due to the presence of 8 bp microhomology. (4) The polymerase continues DNA replication of the strand and thereby generates the inverted duplication containing a second copy of exon 12. (5) A 4 bp microhomology region at position c.1414−1244 in the forward strand stalls the DNA replication, and (6) template switching occurs to the reverse strand. (7) The DNA mismatch repair mechanism completes the 3′ sticky overhang by 3′ flap cleavage and fill-in synthesis, leading to a 4 bp random nucleotide insertion. From there, (8) DNA replication continues in the original strand. (C) The resulting CHM allele, specific for family A, containing the inverted duplication is shown, occurring through the FoSTeS/MMBIR mechanism.
Figure 5. Hairpin formation putatively underlying the observed CHM exon 12 skipping in mature mRNA
(A) Schematic representation of CHM exons 10 to 13 of the reference genome and the affected individual with the intragenic inverted duplication downstream of exon 12. (B) Enlargement of the basal part of the hairpin stem at the nucleotide level. The first base pair of the hairpin stem is formed between c.1414−1243 and the 1,742nd nucleotide of the inverted duplication. The last nucleotide of the inverted duplication and the 4 bp inserted sequence (highlighted in green) do not contribute to the hairpin stem. (C) Enlargement of the top part of the hairpin stem at the nucleotide level. The hairpin stem is terminated by the last base pair, from c.1510+410 to c.1510+686 of the wild-type sequence. The 274-nucleotide single-stranded RNA from c.1510+411 to c.1510+685 forms the loop of the hairpin structure.
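Why an inverted duplication can pair with its original copy in the pre-mRNA can be illustrated with a minimal sketch: on a single transcript, an inverted copy of a segment is its reverse complement, so the two copies can form a continuous stem. The helper names `reverse_complement` and `stem_pairs` and the 10-nucleotide segment are hypothetical and purely illustrative; only Watson-Crick pairs are counted (G-U wobble pairs are ignored), and no RNA folding energetics are modeled.

```python
# Watson-Crick partners in RNA.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(PAIR[base] for base in reversed(rna))


def stem_pairs(left: str, right: str) -> int:
    """Count consecutive Watson-Crick pairs from the base of a putative stem.

    left[0] is paired with right[-1], left[1] with right[-2], and so on,
    stopping at the first mismatch.
    """
    count = 0
    for x, y in zip(left, reversed(right)):
        if PAIR[x] != y:
            break
        count += 1
    return count


# Hypothetical toy segment (NOT the CHM exon 12 sequence).
segment = "GGCAUCGUAC"
inverted_copy = reverse_complement(segment)  # what an inverted duplicate looks like in the transcript
loop = "AAAA"                                # unpaired spacer between the two copies

transcript = segment + loop + inverted_copy
print(transcript)
print(stem_pairs(segment, inverted_copy))
```

In this toy case every nucleotide of the segment pairs with the inverted copy, which mirrors, in miniature, the stem-and-loop arrangement described in the caption.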