Belén de la Morena-Barrio, Jonathan Stephens, María Eugenia de la Morena-Barrio, Luca Stefanucci, José Padilla, Antonia Miñano, Nicholas Gleadall, Juan Luis García, María Fernanda López-Fernández, Pierre-Emmanuel Morange, Marja Puurunen, Anetta Undas, Francisco Vidal, Frances Lucy Raymond, Vicente Vicente, Willem H Ouwehand, Javier Corral, Alba Sanchis-Juan.
Abstract
The identification of inherited antithrombin deficiency (ATD) is critical to prevent potentially life-threatening thrombotic events. Causal variants in SERPINC1 are identified for up to 70% of cases, the majority being single-nucleotide variants and indels. The detection and characterization of structural variants (SVs) in ATD remain challenging due to the high number of repetitive elements in SERPINC1. Here, we performed long-read whole-genome sequencing on 10 familial and 9 singleton cases with type I ATD proven by functional and antigen assays, who were selected from a cohort of 340 patients with this rare disorder because genetic analyses were either negative, ambiguous, or not fully characterized. We developed an analysis workflow to identify disease-associated SVs. This approach resolved, independently of their size or type, all eight SVs detected by multiplex ligation-dependent probe amplification (MLPA), and identified for the first time a complex rearrangement previously misclassified as a deletion. Remarkably, we identified the mechanism explaining ATD in 2 of the 11 cases with a previously unknown defect: the insertion of a novel 2.4 kb SINE-VNTR-Alu retroelement, which was characterized by de novo assembly and verified by specific polymerase chain reaction amplification and sequencing in the probands and affected relatives. The nucleotide-level resolution achieved for all SVs allowed breakpoint analysis, which revealed repetitive elements and microhomologies supporting a common replication-based mechanism for all the SVs. Our study underscores the utility of long-read sequencing technology as a complementary method to identify, characterize, and unveil the molecular mechanism of disease-causing SVs involved in ATD, and enlarges the catalogue of genetic disorders caused by retrotransposon insertions.
This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/).
Year: 2022 PMID: 35764313 PMCID: PMC9393088 DOI: 10.1055/s-0042-1749345
Source DB: PubMed Journal: Thromb Haemost ISSN: 0340-6245 Impact factor: 6.681
Cohort of individuals included in this study—demographic, antithrombin values, and genetic results
| Participant | Antithrombin anti-FXa (%) | Antithrombin Ag (%) | Family history | Gender | MLPA | PGM | CGHa | LR-PCR and Illumina sequencing | WGS ONT | Algorithm | Genotype | Coordinates | Length (bp) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 30 | 30 | Yes | M | Deletion exon 1 | – | Negative | Deletion exon 1 | Deletion exon 1 | Nanosv; sniffles; svim | Het | 1:173916704–173935703 | 18,999 |
| P2 | 54 | 41 | Yes | M | Deletion exon 1 | – | Negative | Deletion exons 1, 2 | CxSV (Deletion exon 1; duplication exon 3) | Nanosv; sniffles | Het;Het | 1:173911379–173915115; 1:173912151–173919034 | 3,737;6,884 |
| P3 | 44 | 41 | Yes | F | Complete deletion | – | Deletion 2 genes | – | Deletion 2 genes | Nanosv; sniffles | Het | 1:173879820–173925989 | 46,169 |
| P4 | 45 | 38 | No | M | Complete deletion | – | Deletion 20 genes | – | Deletion 20 genes | Nanosv; sniffles | Het | 1:173847847–174816147 | 968,005 |
| P5 | 36 | 50 | Yes | F | Complete deletion | – | – | – | Deletion 5 genes | Nanosv | Het | 1:173850996–173950174 | 99,178 |
| P6 | 61 | 46 | Yes | M | Duplication exons 1, 2, and 4; deletion exon 6 | – | Negative | Tandem duplication exons 1–5 | Tandem duplication exons 1–5 | Nanosv | Het | 1:173908412–173919816 | 11,404 |
| P7 | 45 | 38 | No | M | Deletion exons 1–5 | – | Deletion exons 1–5 + 1 gene | – | Deletion 2 genes | Nanosv; sniffles | Het | 1:173908334–174103015 | 194,389 |
| P8 | 52 | 37 | Yes | F | Deletion exons 2–5 | – | – | – | Deletion exons 2–5 | Nanosv; sniffles | Het | 1:173908218–173915405 | 7,187 |
| P9 | 56 | 61 | Yes | F | Negative | Negative | – | Negative | Insertion SVA | Nanosv | Het | 1:173905922 | 2,440 |
| P10 | 50 | 46 | Yes | F | Negative | Negative | – | Negative | Insertion SVA | Visual inspection | Het | 1:173905922 | 2,440 |
| P11 | 40 | 41 | Yes | F | Negative | Negative | – | Negative | Negative |  |  |  |  |
| P12 | 73 | 62 | No | F | Negative | Negative | – | Negative | Negative |  |  |  |  |
| P13 | 63 | 58 | No | M | Negative | – | – | Negative | Negative |  |  |  |  |
| P14 | 69 | NA | No | F | Negative | Negative | – | Negative | Negative |  |  |  |  |
| P15 | 56 | 45 | Yes | F | Negative | – | – | – | Negative |  |  |  |  |
| P16 | 68 | 54 | No | M | Negative | Negative | – | Negative | Negative |  |  |  |  |
| P17 | 66 | 67 | No | M | Negative | Negative | – | Negative | Negative |  |  |  |  |
| P18 | 67 | 70 | No | F | Negative | – | – | – | Negative |  |  |  |  |
| P19 | 50 | 70 | Yes | M | Negative | – | – | – | Negative |  |  |  |  |
Abbreviations: Ag, antigen; bp, base pair; Het, heterozygous.
Note: SERPINC1 gene-driven tests include MLPA, PGM sequencing (Ion Torrent), and long-range PCR (LR-PCR) amplification followed by MiSeq sequencing (Illumina). Genome-wide tests are CGHa and whole-genome sequencing (WGS) using nanopore technology (ONT). Coordinates have been confirmed by Sanger sequencing. Length refers to the size of the structural variant.
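The Coordinates and Length columns can be cross-checked against each other. The following is a minimal sketch, assuming the `chrom:start–end` notation used in the table and taking length as `end - start`, which reproduces the listed values for the rows shown here. The helper names `parse_interval` and `span_bp` are hypothetical and are not part of the study's workflow.

```python
import re

# Matches "chrom:start–end", accepting either an en dash or a plain hyphen.
_COORD = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)[–-](?P<end>\d+)$")


def parse_interval(text):
    """Split a table coordinate string into (chrom, start, end)."""
    match = _COORD.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised coordinate string: {text!r}")
    return match["chrom"], int(match["start"]), int(match["end"])


def span_bp(text):
    """Return end - start for a coordinate string (an assumed length convention)."""
    _, start, end = parse_interval(text)
    return end - start


# Coordinates and listed lengths copied from the table for three participants.
rows = {
    "P1": ("1:173916704–173935703", 18999),
    "P3": ("1:173879820–173925989", 46169),
    "P8": ("1:173908218–173915405", 7187),
}

for participant, (coords, listed) in rows.items():
    print(participant, span_bp(coords), listed)
```

Single-position entries such as the SVA insertion (`1:173905922`) do not match this pattern; for insertions the length is the size of the inserted sequence, not a reference span.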
Fig. 1 Long-read sequencing workflow and results. (A) Overview of the general stages of the SV discovery workflow. Algorithms used are depicted in yellow boxes. (B) Nanopore sequencing results. (i) Sequence length template distribution. Average read length was 4,499 bp (SD ± 4,268); the maximum read length observed was 2.5 Mb. (ii) Genome median coverage per participant. The average across all samples was 16× (SD ± 7.7). (C) Filtering approach and number of SVs obtained per step. SERPINC1 + promoter region corresponds to [GRCh38/hg38] Chr1:173,903,500–173,931,500. (D) Anti-FXa percentage levels for the participants with a variant identified (P1–P10), cases without a candidate variant (P11–P19), and 300 controls from our internal database. Statistical significance is denoted by asterisks: *** p < 0.001, **** p ≤ 0.0001. p-Values were calculated by one-way ANOVA with Tukey's post hoc test for repeated measures. ATD, antithrombin deficiency; ONT, Oxford Nanopore Technologies; SV, structural variant.
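The region-based filtering step in panel C amounts to an interval-overlap test against the SERPINC1 + promoter window. The sketch below illustrates that idea only; it assumes the table coordinates are on the same GRCh38 build as the window, the function name `overlaps` is hypothetical, and the last call in the list is an invented control interval, not a study result.

```python
# SERPINC1 + promoter window from the Fig. 1C legend (GRCh38/hg38).
REGION = ("1", 173_903_500, 173_931_500)


def overlaps(chrom, start, end, region=REGION):
    """True if [start, end] shares at least one base with the region."""
    r_chrom, r_start, r_end = region
    return chrom == r_chrom and start <= r_end and end >= r_start


# (label, chrom, start, end); insertions are treated as single positions.
calls = [
    ("P1", "1", 173916704, 173935703),
    ("P8", "1", 173908218, 173915405),
    ("P9", "1", 173905922, 173905922),
    ("hypothetical-distal", "1", 174000000, 174001000),  # invented control
]

kept = [label for label, chrom, start, end in calls if overlaps(chrom, start, end)]
print(kept)
```

A closed-interval test is used so that an SV is kept even when only one breakpoint falls inside the window, as with the P1 deletion, which extends past the window's end.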
Fig. 2 Candidate SVs identified by long-read sequencing. (A) Schematic of chromosome 1 followed by protein-coding genes falling in the zoomed region (1q25.1). SVs for each participant (P) are colored in red (deletions) and blue (duplications). The insertion identified in P9 and P10 is shown with a black line. (B) Schematic of the SERPINC1 gene (NM_000488) followed by repetitive elements (REs) in the region. SINEs and LINEs are colored in light and dark gray, respectively. Asterisks are present where the corresponding breakpoint falls within an RE. (C) Characteristics of the antisense-oriented SINE-VNTR-Alu (SVA) retroelement (with respect to the canonical sequence) observed in P9. Lengths of the fragments are subject to errors from nanopore sequencing. SV, structural variant; TSD, target site duplication.
Fig. 3 Resolution of a complex SV. Schematic representation of genetic diagnostic methods used to characterize the SVs in participant P2. Results from MLPA, LR-PCR, and nanopore are shown in white boxes. Primers used for the LR-PCR and Sanger validation experiments are shown at their genomic locations as orange and green arrows, respectively. The SERPINC1 gene in the IGV screenshot is represented in blue and exons are indicated. J1 and J2 correspond to the newly formed junctions described in Fig. S5. J = new junction; M1k = 1 kb molecular weight marker; M = 100 bp molecular weight marker; P = patient; C = control; B = blank. LR-PCR, long-range polymerase chain reaction; MLPA, multiplex ligation-dependent probe amplification; SV, structural variant.
Fig. 4 Schematic representation of genetic diagnostic methods used to characterize the SVs in participant P6. Results from MLPA, LR-PCR, and nanopore are shown in white boxes. Primers used for the LR-PCR and Sanger validation experiments are shown at their genomic locations as orange and green arrows, respectively. The SERPINC1 gene in the IGV screenshot is represented in blue and exons are indicated. J1 corresponds to the newly formed junction described in Fig. S5. J = new junction; M = 1 kb or 100 bp molecular weight marker; P = patient; C = control; B = blank. For the LR-PCR results, C1 and P1 correspond to PCR 1 (done with Primer F + Primer R), and C2 and P2 correspond to PCR 2 (done with Primer F + Primer R2). LR-PCR, long-range polymerase chain reaction; MLPA, multiplex ligation-dependent probe amplification; SV, structural variant.