
Next Generation Sequencing Based Multiplex Long-Range PCR for Routine Genotyping of Autoinflammatory Disorders.

Ferhat Guzel, Micol Romano, Erdi Keles, David Piskin, Seza Ozen, Hakan Poyrazoglu, Ozgur Kasapcopur, Erkan Demirkaya.

Abstract

Background: During the last decade, massive sequencing with next-generation sequencing (NGS) technologies has brought remarkable progress in the identification of disease-associated genes for autoinflammatory diseases (AIDs). An international group of experts described the ideal genetic screening method as one that gives information about SNVs, InDels, copy number variations (CNVs) and GC-rich regions. We aimed to develop and validate a molecular diagnostic method for the NGS platform as an inexpensive and fast screening tool with extended and uniform coverage, comprising nine genes known to be associated with various AIDs.
Methods: For the validation of basic and expanded panels, long-range multiplex models were set up on healthy samples without any known variations in the MEFV, MVK, TNFRSF1A, NLRP3, PSTPIP1, IL1RN, NOD2, NLRP12 and LPIN2 genes. Patients with AIDs who had already known causative variants in these genes were sequenced for analytical validation. As a last step, multiplex models were validated on patients with a pre-diagnosis of AIDs. All sequencing steps were performed on the Illumina NGS platform. Validity steps included the selection of related candidate genes, primer design, development of screening methods, validation and verification of the product. The GDPE (Gentera) bioinformatics pipeline was followed.
Results: Although no nonsynonymous variation was found in 21 healthy samples, 107 synonymous variant alleles and some intronic and UTR variants were detected. In the 10 patients who underwent analytical validation, 11 additional nonsynonymous variant alleles and a total of 81 synonymous variants were found besides the 11 known nonsynonymous variant alleles. In the clinical validation phase, 46 patients were sequenced with the multiplex panels, and genetic and clinical findings were combined for diagnosis.
Conclusion: In this study, we describe the development and validation of an NGS-based multiplex array enabling the "long-amplicon" approach for targeted sequencing of nine genes associated with common AIDs. This screening tool is less expensive and more comprehensive compared to other methods and more informative than traditional sequencing. The proposed panel offers advantages over WES or hybridization probe equivalents in terms of CNV analysis, high sensitivity and uniformity, GC-rich region sequencing, InDel detection and intron coverage.
Copyright © 2021 Guzel, Romano, Keles, Piskin, Ozen, Poyrazoglu, Kasapcopur and Demirkaya.


Keywords:  autoinflammation; genetic diagnosis; long range PCR; multiplex; next generation sequencing - NGS

Year:  2021        PMID: 34177904      PMCID: PMC8219981          DOI: 10.3389/fimmu.2021.666273

Source DB:  PubMed          Journal:  Front Immunol        ISSN: 1664-3224            Impact factor:   7.561


Introduction

The term autoinflammation was originally used to define diseases not accompanied by high titers of antibodies or T cells. It is now understood that these diseases are caused by mutations in genes regulating innate immune responses (1). The cells mediating the pathogenesis of hereditary autoinflammatory diseases (AIDs) belong to the innate immune system, such as dendritic cells, neutrophils, monocytes and macrophages (2). The diagnosis of AIDs is generally based on clinical manifestations, symptoms and biochemical parameters. A detailed clinical history and physical examination are the first steps in the diagnosis and management of autoinflammatory diseases in childhood (3–5). Some of these diseases occur soon after birth and can be fatal if diagnosis and treatment are delayed or unavailable. With the help of a genetic screening service, the differential diagnosis can be established so that treatment can begin before the disease progresses. Specific clinical manifestations and genetic analysis are both important for making a differential diagnosis. However, many patients share similar clinical features, and 50% of patients do not have confirmation by molecular genetic testing (6, 7). Genetic analysis of patients with AIDs allows early and accurate diagnosis and the administration of appropriate treatments. Molecular genetics has greatly contributed to correct diagnosis, especially in atypical presentations (1, 8). Without a genetic diagnosis, patients are treated empirically with costly drugs, and many either use medications off label or cannot use them at all because no causative mutation has been detected. Mutational screening may not be comprehensive because of the technique used (Sanger sequencing, Strip Assay, etc.), partial gene screening, or screening of only known genes, mutational hotspots or a subset of coding regions.
During the last decade, massive sequencing with next-generation sequencing (NGS) technologies has brought remarkable progress in the identification of disease-associated genes for AIDs (1, 8–10). NGS has become an instrumental technology for finding single-gene defects with a comprehensive approach in undiagnosed patients with early-onset symptoms (11–14). NGS has advanced the field of autoinflammation by identifying disease-causing genes that point to pathways not previously known to regulate cytokine signaling or inflammation. Unfortunately, in many countries it is still not available in routine practice because of its high cost (15). Accurate diagnosis of AIDs is essential for access to treatment. Because disease manifestations overlap among AIDs, genetic testing is the only way to reach a diagnosis. An international group of experts recently published a guideline that suggests sequencing 8 genes at a minimum and, if possible, additional AIDs genes from the list referenced in Infevers (16). According to this guideline, an ideal genetic screening method should give information about SNVs, InDels, copy number variations (CNVs), GC-rich regions and mosaicism, and it must use deep sequencing (1000−10,000×). However, no tool currently provides these data for both the clinician and the geneticist. Our aim was to develop and validate a molecular diagnostic method for the NGS platform as an inexpensive and fast screening tool with extended and uniform coverage, comprising nine genes known to be associated with various AIDs.

Material and Methods

Sample Collection and DNA Extraction

DNA extraction from blood samples was performed using the QIAamp DNA mini kit (Qiagen, Germany). The concentration and purity of the DNA were measured spectrophotometrically (NanoDrop 2000c, Thermo Scientific) using the 260/280 nm absorbance ratio (17). The DNA was isolated at a concentration of 25 ng/µl and stored at 4°C until use.

Primer Design, Multiplex PCR Amplification

Two multiplex panels were designed: a basic panel and an expanded panel, of which the basic panel is a subset. The basic panel contains the CDS, UTRs and some intronic regions of the MEFV, MVK, TNFRSF1A and NLRP3 genes. The expanded panel contains PSTPIP1, IL1RN, NOD2, NLRP12 and LPIN2 in addition to the basic panel (Table 1). The regions covered by the panels are shown in Figure 1. First, gene regions were determined and primers were designed to cover all coding sequences (CDS) and UTRs of the genes using NCBI Primer-BLAST (18). The long PCR method was chosen to reduce the number of reactions. With this method, amplicons up to 11.78 Kb were obtained. Each of the primers was tested with healthy control DNAs. For multiplex PCR amplification, primers were designed to minimize primer-dimer formation, and care was taken to keep their Tm values close. Reaction contents and conditions were also optimized in this context. The gene regions in the basic panel were amplified in one tube, while the expanded panel regions were amplified in four tubes. The 25 µl multiplex PCR reaction contained 1.25 units of LA Taq Hot-Start DNA Polymerase, 1.25 units of PrimeSTAR GXL DNA Polymerase (TaKaRa Bio, Shiga, Japan), 2.5 µl of 10X LA PCR Buffer II (Mg2+ plus), 4 µl of dNTP mixture (2.5 mM each), 5 µl of betaine (Sigma), 1 µl (0.2 µM each) of primer mixture (Table 2), 1-25 ng of template genomic DNA and ddH2O to volume. After determining the appropriate amounts of DNA input and PCR components, the PCR cycle and running parameters were adjusted for effective amplification. The cycling parameters were as follows: initial denaturation at 95°C for 3 min, followed by 34 cycles of 30 s denaturation at 95°C, 40 s annealing at 58°C and 12 min extension at 68°C, and a final extension at 72°C for 12 min. The PCR reactions were performed on a C1000 Thermal Cycler (Bio-Rad). Amplification was seen as a single band in gel electrophoresis. 
A 1 Kb DNA ladder was used to assess band size. PCR products were purified by Agencourt AMPure XP (Beckman Coulter) magnetic beads and quantified by Qubit 2.0 Fluorometer (Thermo Fisher) with dsDNA BR and HS Reagent Kit.
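As a rough planning aid, the cycling program described above can be added up to estimate how long one run occupies the thermal cycler. The sketch below is only illustrative: it uses the hold times stated in the text, ignores temperature ramping (which depends on the instrument), and the function names are hypothetical.

```python
# Cycling parameters as stated in the text (all durations in seconds).
INITIAL_DENATURATION_S = 3 * 60   # 95°C, 3 min
CYCLES = 34
DENATURE_S = 30                   # 95°C
ANNEAL_S = 40                     # 58°C
EXTEND_S = 12 * 60                # 68°C
FINAL_EXTENSION_S = 12 * 60       # 72°C


def total_hold_time_s() -> int:
    """Sum of programmed hold times; ramping between steps is ignored."""
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


def format_hms(seconds: int) -> str:
    """Render a duration in seconds as hours, minutes and seconds."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours} h {minutes} min {secs} s"


if __name__ == "__main__":
    print(format_hms(total_hold_time_s()))  # 7 h 42 min 40 s
```

The 12 min extension step dominates the run time, which is the price of amplifying fragments of up to 11.78 Kb in a single reaction.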
Table 1

Expanded panel content and covered region percentage in exons and introns.

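| Disease | Gene | Chr | Protein | Inheritance | Gene Length (bp) | Number of Exons | Exon Length (bp) | Intron Length (bp) | Total Amplicon Length (bp) | Exon Coverage Percent | Intron Coverage Percent |
|---|---|---|---|---|---|---|---|---|---|---|---|
| FMF | MEFV | 16 | Pyrin | AR | 14,750 | 10 | 3,506 | 11,244 | 15,962 | 100% | 100% |
| MKD/HIDS | MVK | 12 | Mevalonate kinase | AR | 24,332 | 11 | 2,833 | 21,499 | 21,205 | 100% | 85% |
| TRAPS | TNFRSF1A | 12 | TNF-R1 | AD | 13,306 | 10 | 2,171 | 11,135 | 9,080 | 100% | 62% |
| CAPS | NLRP3 | 1 | NLRP3 | AD | 32,645 | 9 | 3,545 | 29,100 | 28,529 | 100% | 85% |
| PAPA | PSTPIP1 | 15 | PSTPIP1 | AD | 42,350 | 15 | 1,941 | 40,409 | 18,769 | 100% | 41% |
| DIRA | IL1RN | 2 | IL1Ra | AR | 16,124 | 4 | 1,814 | 14,310 | 14,470 | 100% | 88% |
| Blau Syndrome | NOD2 (CARD15) | 16 | NOD2 | AD | 33,696 | 12 | 4,414 | 29,282 | 27,155 | 100% | 77% |
| FCAS2 | NLRP12 | 19 | NLRP12 | AD | 30,661 | 10 | 3,550 | 27,111 | 26,953 | 100% | 79% |
| Majeed Syndrome | LPIN2 | 18 | Phosphatidate phosphatase LPIN2 | AR | 96,322 | 20 | 6,318 | 90,004 | 24,526 | 100% | 20% |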

MEFV, Mediterranean Fever; MVK, Mevalonate Kinase; TNFRSF1A, Tumor Necrosis Factor Receptor Superfamily Member 1A; NLRP3, NLR Family Pyrin Domain Containing 3; PSTPIP1, Proline-Serine-Threonine Phosphatase Interacting Protein 1; IL1RN, Interleukin-1 Receptor Antagonist; NOD2, Nucleotide-binding Oligomerization Domain-containing Protein 2; NLRP12, NLR Family Pyrin Domain Containing 12; LPIN2, Phosphatidate Phosphatase LPIN2; AR, Autosomal Recessive; AD, Autosomal Dominant.

Figure 1

Outline of the targeted PCR regions in the 9 genes. (A) MEFV, Mediterranean Fever, (B) MVK, Mevalonate Kinase, (C) TNFRSF1A, Tumor Necrosis Factor Receptor Superfamily Member 1A, (D) NLRP3, NLR Family Pyrin Domain Containing 3, (E) PSTPIP1, Proline-Serine-Threonine Phosphatase Interacting Protein 1, (F) IL1RN, Interleukin-1 Receptor Antagonist, (G) NOD2, Nucleotide-binding Oligomerization Domain-containing Protein 2, (H) NLRP12, NLR Family Pyrin Domain Containing 12, (I) LPIN2, Phosphatidate Phosphatase LPIN2. Brown and orange boxes indicate UTRs and exons, respectively.

Table 2

Primer Sequence and Multiplex Volumes.

| Multiplex Tube | Amplicon Name | Primer Sequence (3’-5’) | Tm (UCSC) | Amplicon Length (bp) | Volume (µl) |
|---|---|---|---|---|---|
| 1 | MEFV-1F | CCTCACTTGCCTTCTTGGG | 60.8 | 8490 | 2.5 |
| 1 | MEFV-1R | AGGTTGCTCTCTACCATCTTCT | 56.4 | | 2.5 |
| 1 | MEFV-2F | TAACCAGCAGCCAAGGGTAAG | 61.5 | 7472 | 1 |
| 1 | MEFV-2R | AAGGGAGAATCGGGAATAAGACA | 62.3 | | 1 |
| 1 | MVK-1F | ACTCACCTGTCCCCGTCTT | 59.5 | 8540 | 1.5 |
| 1 | MVK-1R | CATCTAACCTGCTGCCCTCT | 59.5 | | 1.5 |
| 1 | MVK-2F | CACCTCTGCCCGTTCTTCTT | 61.7 | 8797 | 1 |
| 1 | MVK-2R | CTGGGTCTGTCTCCTGCTTG | 61 | | 1 |
| 1 | MVK-3F | ACAAGCAGGAGACAGACCCA | 60.9 | 3868 | 0.6 |
| 1 | MVK-3R | GAAGACAACAGCAGGGAAGG | 59.8 | | 0.6 |
| 1 | TNFRSF1A-1F | TTACAGGAACCCCAGGAGACA | 61.8 | 1219 | 1.5 |
| 1 | TNFRSF1A-1R | ACTTCACCAGCCGCCAAAA | 64 | | 1.5 |
| 1 | TNFRSF1A-2F | CCTGAGACTGCAAAGCACAC | 59.6 | 7861 | 1 |
| 1 | TNFRSF1A-2R | GAGGGAATGTGGTGGTGGAA | 63.1 | | 1 |
| 1 | NLRP3-1F | CCTGCCACATACCAGCCATT | 63.2 | 11783 | 2.5 |
| 1 | NLRP3-1R | TCTCCACCTTCCACCTCACT | 59.7 | | 2.5 |
| 1 | NLRP3-2F | GAGTGAGGTGGAAGGTGGAG | 59.7 | 9499 | 2.5 |
| 1 | NLRP3-2R | AAGAGCAGGTGATACAGGGAA | 58.8 | | 2.5 |
| 1 | NLRP3-3F | AGAGAGGTGGACAGAAAGGG | 58.3 | 7247 | 2 |
| 1 | NLRP3-3R | GTGTTTGTAGCAGGAAGGCA | 58.9 | | 2 |
| 2 | NOD2-2F | ATCTCGCCTCCTGGGTTGAT | 63.2 | 6910 | 1 |
| 2 | NOD2-2R | CTCGGTGCTCCCACACTTAG | 60 | | 1 |
| 2 | NOD2-3F | CAAGAGGAGTGGCAGACAGG | 61 | 7612 | 1 |
| 2 | NOD2-3R | TCACCAAACCAGCAAACCCA | 64.7 | | 1 |
| 2 | NLRP12-1F | TCCAAGAGTGCTAAGGAGGC | 59.6 | 3618 | 2 |
| 2 | NLRP12-1R | GACCATCATCCTGCCTACCG | 62.8 | | 2 |
| 2 | NLRP12-2F | CAGAGCTGACAAGGGAGGA | 59.1 | 10137 | 2 |
| 2 | NLRP12-2R | TGGGGTGGAAAAGAGGAGAA | 61.9 | | 2 |
| 2 | NLRP12-3F | TGGGGATCAGTCACAAAGGT | 60.4 | 6940 | 1 |
| 2 | NLRP12-3R | AGGCAAAGAGGGGACAGAGA | 61.3 | | 1 |
| 2 | NLRP12-4F | CGTGGGTAGAAGTGCTCAAA | 58.9 | 6258 | 1 |
| 2 | NLRP12-4R | ACTATGTTCCGATGCAGCCA | 61.6 | | 1 |
| 2 | IL1RN-1F | GTAACTGGAAGCGGGATGGA | 62.3 | 4581 | 1 |
| 2 | IL1RN-1R | TAAGGCAGCAGGACAGGTTT | 59.9 | | 1 |
| 2 | IL1RN-2F | GTTGGTTGGAAGATGTGTTGGT | 61.1 | 9889 | 1 |
| 2 | IL1RN-2R | TTTGCTGCCTTGCCTGTTTC | 63.6 | | 1 |
| 3 | PSTPIP1-1F | CTGGGAGGTGATGGGAAGGA | 64.6 | 2440 | 1 |
| 3 | PSTPIP1-1R | ACTGAGGCTTGGAGACAGAAA | 59.1 | | 1 |
| 3 | PSTPIP1-2F | CTGGAGTGTGCGTGACCTT | 59.9 | 2296 | 1 |
| 3 | PSTPIP1-2R | CTGTTAGGGTGGCTGTGTCC | 60.6 | | 1 |
| 3 | PSTPIP1-3F | TCCTCTGACCCTTGGCTTCT | 61.3 | 9984 | 1 |
| 3 | PSTPIP1-3R | CACTCCTTTCTGCCCCTTCC | 63.3 | | 1 |
| 3 | PSTPIP1-4F | CCACTCTTCAGCCACCCTTC | 62.1 | 4049 | 1 |
| 3 | PSTPIP1-4R | ACCTTTGCCCACGCACTT | 61.7 | | 1 |
| 3 | NOD2-1F | CTTCCCCTCCTCTCCTGTCT | 59.8 | 4311 | 1 |
| 3 | NOD2-1R | GCCACCACACACTTCCTCT | 58.7 | | 1 |
| 3 | NOD2-4F | TTACATTGAGAGCCCTTGGAGT | 60.1 | 8322 | 2 |
| 3 | NOD2-4R | TGGAGCCACTTTGAGGGAATC | 63.7 | | 2 |
| 3 | LPIN2-1F | AGCCTCTCTGTCCACTTCTAAC | 56.8 | 1899 | 1 |
| 3 | LPIN2-1R | TTCCAAACCACTGCCTACCAA | 62.7 | | 1 |
| 4 | LPIN2-2F | GCAGGAGGTCAGGGTTCTTT | 60.6 | 802 | 1 |
| 4 | LPIN2-2R | TTCAGTTTCCCTTTCCCTTGA | 60.9 | | 1 |
| 4 | LPIN2-3F | CTGAACCGTGTGAGGTGAGG | 61.7 | 4052 | 1 |
| 4 | LPIN2-3R | GAAACTGCCTTTGCTGCTTG | 61.1 | | 1 |
| 4 | LPIN2-4F | TTTATAGTGGGTGGCATTGGTG | 61.8 | 3717 | 1 |
| 4 | LPIN2-4R | CAAGAGACAGCAGTGTGCAAG | 59.8 | | 1 |
| 4 | LPIN2-5F | CCTCGTTCCCAATGCAGGT | 63.4 | 4206 | 1 |
| 4 | LPIN2-5R | GCCACAGGCCAAACTGAGAA | 63.6 | | 1 |
| 4 | LPIN2-6F | GGGAAGCCCTCATTCACTCT | 60.6 | 6439 | 1 |
| 4 | LPIN2-6R | GCTCCCACACCATCAACAGG | 63.9 | | 1 |
| 4 | LPIN2-7F | TAAGGGCTCGTGGAGTTGTC | 60.3 | 3411 | 1 |
| 4 | LPIN2-7R | GGCGTTTGTGGGTTCCTAAT | 61.1 | | 1 |
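
The amplicon lengths in Table 2 can be cross-checked against the per-gene "Total Amplicon Length" column of Table 1. The short sketch below, with values transcribed from the two tables, sums the amplicons for each gene; the variable and function names are illustrative and not part of the original workflow.

```python
from collections import defaultdict

# (amplicon name, length in bp), transcribed from Table 2.
AMPLICONS = [
    ("MEFV-1", 8490), ("MEFV-2", 7472),
    ("MVK-1", 8540), ("MVK-2", 8797), ("MVK-3", 3868),
    ("TNFRSF1A-1", 1219), ("TNFRSF1A-2", 7861),
    ("NLRP3-1", 11783), ("NLRP3-2", 9499), ("NLRP3-3", 7247),
    ("NOD2-1", 4311), ("NOD2-2", 6910), ("NOD2-3", 7612), ("NOD2-4", 8322),
    ("NLRP12-1", 3618), ("NLRP12-2", 10137), ("NLRP12-3", 6940), ("NLRP12-4", 6258),
    ("IL1RN-1", 4581), ("IL1RN-2", 9889),
    ("PSTPIP1-1", 2440), ("PSTPIP1-2", 2296), ("PSTPIP1-3", 9984), ("PSTPIP1-4", 4049),
    ("LPIN2-1", 1899), ("LPIN2-2", 802), ("LPIN2-3", 4052), ("LPIN2-4", 3717),
    ("LPIN2-5", 4206), ("LPIN2-6", 6439), ("LPIN2-7", 3411),
]

# Total amplicon length per gene (bp), transcribed from Table 1.
TABLE1_TOTAL = {
    "MEFV": 15962, "MVK": 21205, "TNFRSF1A": 9080, "NLRP3": 28529,
    "PSTPIP1": 18769, "IL1RN": 14470, "NOD2": 27155, "NLRP12": 26953,
    "LPIN2": 24526,
}


def total_by_gene(amplicons):
    """Sum amplicon lengths per gene; the gene is the name before the last '-'."""
    totals = defaultdict(int)
    for name, length in amplicons:
        gene = name.rsplit("-", 1)[0]
        totals[gene] += length
    return dict(totals)


if __name__ == "__main__":
    totals = total_by_gene(AMPLICONS)
    for gene, expected in TABLE1_TOTAL.items():
        status = "ok" if totals[gene] == expected else "MISMATCH"
        print(f"{gene:9s} {totals[gene]:6d} bp  (Table 1: {expected:6d} bp)  {status}")
```

The per-gene sums agree with Table 1, the list contains the 31 amplicons shown in the singleplex gel, and the longest amplicon (NLRP3-1, 11,783 bp) matches the "up to 11.78 Kb" stated in the text.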

NGS Using Illumina NextSeq 500 System

Barcoded libraries were prepared from the PCR products with the Nextera XT Sample Prep Kit (Illumina) according to the manufacturer’s protocol for amplicons > 500 bp. For each DNA library, 1 ng of total multiplex PCR product was used. PCR products were enzymatically fragmented with the Nextera XT kit (19). Each DNA library was tagged with a unique index by 12 cycles of PCR. After clean-up and normalization, all libraries were pooled. Sequencing was performed on the NextSeq 500 Sequencing System (Illumina) with the 300-cycle Mid-Output Kit.

Bioinformatics

Sequencing data were analyzed for rare pathogenic variants that might be associated with the disease. The Gentera Data Processing Engine (GDPE) [Gentera (20), Turkey], an easy-to-use automatic pipeline, was used for analyzing genomes. GDPE provides high-accuracy variant detection by using different algorithms. A sample sheet and raw data were used as input. The 5′ and 3′ ends of the sequence reads were trimmed to certain lengths according to quality parameters. The reads from the targeted genes were aligned to the human reference genome (GRCh38) using BWA (21). After alignment, the following steps were carried out with GATK (22): realignment in InDel regions, recalibration of quality scores, parameter optimization for variations, variant annotation, filtering of variants by strand bias with an upper limit of 20%, and elimination of unreliable variations detected at a variant percentage below 15%. FastQC (23) was used to evaluate data quality. The raw VCF file was annotated with Annovar (24). The dbSNP150 (25) database was used to determine SNP annotations, amino acid and nucleotide changes and locations in the final report. “Sorting Intolerant from Tolerant” [SIFT (26)] and “Polymorphism Phenotyping” [PolyPhen (27)] were used to assess the possible impact of variations on proteins. CNV analysis was performed with the coverage-based CNVpytor (a Python extension of CNVnator) to determine copy number and large structural variations (28). CNVpytor refined the data with multiple-bandwidth partitioning and GC correction approaches.
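
The two numeric filters mentioned above (a 20% upper limit on strand bias and a 15% minimum variant percentage) can be illustrated with a minimal sketch. This is not the GDPE or GATK implementation: the text does not define how strand bias is scored, so the `strand_bias` field below is a hypothetical, precomputed fraction between 0 and 1, and treating both thresholds as inclusive is an assumption. All names and example calls are invented for illustration.

```python
from dataclasses import dataclass

MAX_STRAND_BIAS = 0.20    # upper limit stated in the text (20%)
MIN_ALT_FRACTION = 0.15   # variants below 15% are treated as unreliable


@dataclass(frozen=True)
class VariantCall:
    name: str            # hypothetical label for the call
    alt_reads: int       # reads supporting the variant allele
    total_reads: int     # total reads covering the position
    strand_bias: float   # hypothetical precomputed bias score in [0, 1]

    @property
    def alt_fraction(self) -> float:
        """Fraction of reads supporting the variant (0.0 if uncovered)."""
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def passes_filters(call: VariantCall) -> bool:
    """Keep calls with acceptable strand bias and a sufficient variant fraction."""
    return (call.strand_bias <= MAX_STRAND_BIAS
            and call.alt_fraction >= MIN_ALT_FRACTION)


# Invented example calls, not data from the study.
CALLS = [
    VariantCall("het_snv", alt_reads=480, total_reads=1000, strand_bias=0.05),
    VariantCall("low_fraction", alt_reads=90, total_reads=1000, strand_bias=0.05),
    VariantCall("strand_biased", alt_reads=500, total_reads=1000, strand_bias=0.35),
    VariantCall("no_coverage", alt_reads=0, total_reads=0, strand_bias=0.0),
]

if __name__ == "__main__":
    for call in CALLS:
        verdict = "keep" if passes_filters(call) else "drop"
        print(f"{call.name:14s} fraction={call.alt_fraction:.2f} "
              f"bias={call.strand_bias:.2f} -> {verdict}")
```

A fixed 15% floor removes low-level noise but would also discard true low-fraction mosaic variants, so the threshold is a trade-off rather than a universal setting.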

Verifications of the Singleplex and Multiplex Models in Healthy Controls

With the primers designed as described, a trial study was carried out on samples without any known variations in the genes being sequenced. Each of the 9 genes was sequenced separately in different DNA samples with the singleplex model and visualized by gel electrophoresis. The singleplex PCR products checked on the gel were sequenced as described for the NGS system. The same procedure was performed for the basic and expanded panels on healthy controls. Multiplex PCR primers and conditions were prepared as described, and the products were visualized by gel electrophoresis ( ). After the PCR products were sequenced as described for the NGS system, the panels were optimized by decreasing the proportion of amplicons with high read counts, increasing the proportion of amplicons with low read counts, and changing reaction conditions.
Figure 2

NGS Workflow and detailed protocols for 1 sample by PCR, library prep, sequencing and bioinformatics for 4-gene singleplex, basic multiplex panel, 9-gene singleplex and expanded multiplex panel, respectively.


Analytical Validations of the Multiplex Models in Known Mutation Carriers

Genomic DNA samples from patients with AIDs who had already been found to carry at least one variation in one of the known causative genes previously tested through Sanger sequencing or Strip Assay Kits (ViennaLab) were collected. These variation-positive patients were taken into consideration for developing an NGS-based diagnostic protocol.

Clinical Validations of the Multiplex Models in Undiagnosed Patients

Patients underwent screening history, physical examination, and laboratory evaluation, in the outpatient department in ten different pediatric and adult rheumatology centers where these groups of patients are mainly followed up in Turkey and pre-diagnosed with FMF, MKD (HIDS), TRAPS, CAPS, PAPA, DIRA, Blau, FCAS2 and Majeed syndrome. Initially, data was collected on their clinical parameters such as presence and duration of fever, frequency of attacks, abdominal pain, age of onset, organ system involvement, the presence of visible lesions (rashes, purpura, nodules etc.). Molecular diagnostics were also considered based on their clinical presentations and response to therapies. Blood samples from previously consented patients and, in some cases, unaffected family members, were collected to extract DNA, and perform NGS analysis with basic and expanded panels ( ).

Results

Verifications of the Singleplex and Multiplex Models in Healthy Controls

For singleplex optimization, the PCR procedure was applied to all primer pairs separately on 9 healthy control DNA samples. In gel electrophoresis, a clear single band for each reaction showed that the targeted PCR product was amplified (Figure 3). The basic and expanded panels were developed after optimizing primer composition and PCR conditions with specific primer sets. Ten healthy control DNA samples were screened with the basic panel, and PCR product patterns similar to those of singleplex electrophoresis were observed. Eight PCR products longer than 7.5 Kb were observed as a single thick band, and two bands of 3.8 and 1.2 kb were observed separately (Figure 3). Adequate quality was observed in all regions covered by the primers. Two healthy control DNA samples were tested with the expanded panel, and bands were observed in the 0.8 kb to 10 kb range. In total, 21 control samples were sequenced, and no nonsynonymous variation was detected. A total of 107 synonymous variations were detected; 55% of these were found in MEFV (n=13) and 25% in NLRP3 (n=13). Of the 130 UTR variations detected, 48% were in MEFV and 30% in NLRP3. In addition, 16% of the 1221 intronic variations were found in MEFV, 17% in MVK (n=13) and 42% in NLRP3 (Table 3).
Figure 3

Band patterns of the PCR products. (A) Singleplex 31 amplicons band patterns of MEFV, MVK, TNFRSF1A, PSTPIP1, IL1RN, NOD2, NLRP12 and LPIN2 are shown respectively. L label indicates the 1 kb sized DNA ladder. (B) The multiplex band patterns of the MEFV, MVK, TNFRSF1A and NLRP3 are shown in Multiplex Tube-1. PSTPIP1, IL1RN, NOD2, NLRP12 and LPIN2 patterns are shown in Multiplex Tube 2-4. L label indicates the 1 kb sized DNA ladder.

Table 3

Detected variant alleles at verification, analytical validation and clinical validation steps in patients with AID.

Variant TypeAA ChangedbSNP CodeExon numberNumber of Variant Allele at Verification (n=21)Number of Variant Allele at Analytical Validation (n=10)Number of Variant Allele at Clinical Validation (n=46)Total Number of Variant Allele
MEFVNonsynonymousSNVE148Qrs37439302350
G304Rrs7597770121
R202Qrs2242222123
P369Srs1146602333
R408Qrs1146602433
M680I*rs289405801012
M694V*rs617527171017
V726Ars2899405791012
A744Srs61732874102
SynonymousSNV5932170261
UTRSNV632291176
Insertion1262846
Deletion941831
IntronicSNV20457619880
Insertion541928
Deletion408101149
MVKNonsynonymousSNVP11Lrs8766610012117
S52N*rs79576193412
SynonymousSNV342027
UTRSNV653546
Insertion321217
Deletion0
IntronicSNV2101226981030
Insertion322530
Deletion4123135199
TNFRSF1ANonsynonymousSNVR92Q*rs4149584411
SynonymousSNV852740
UTRSNV0
Insertion44
Deletion22
IntronicSNV7230177279
Insertion3912
Deletion741627
NLRP3NonsynonymousSNVQ703K*rs358294193369
SynonymousSNV2720101148
UTRSNV4028134202
Insertion11103859
Deletion1053853
IntronicSNV51825416502422
Insertion6527163255
Deletion10344304451
PSTPIP1NonsynonymousSNVA196V*rs370965231911
SynonymousSNV33
UTRSNV11
Insertion134
Deletion11
IntronicSNV176481
Insertion34916
Deletion235
IL1RNSynonymousSNV24511
UTRSNV561324
Insertion0
Deletion112
IntronicSNV65147212
Insertion369
Deletion33
NOD2NonsynonymousSNVP268Srs206684242318
R676Crs574327741
G775D*16:50712235 (GRCh38)41
M491L16:50711382 (GRCh38)43
N852Srs10489546761
G908Rrs206684581
V955I*rs5743291933
SynonymousSNV491629
UTRSNV4122137
Insertion0
Deletion0
IntronicSNV232492139
Insertion145
Deletion541221
NLRP12NonsynonymousSNVN394K*rs201241894311
SynonymousSNV471526
UTRSNV224
Insertion0
Deletion1236
IntronicSNV7814166258
Insertion101213
Deletion171330
LPIN2NonsynonymousSNVE601Krs617353931412
SNVP348L*rs3467669171
SynonymousSNV22
UTRSNV10162349
Insertion2439
Deletion0
IntronicSNV346094
Insertion11
Deletion10616

*Known variations (AA change) in analytical validation of multiplex model.


Analytical Validations of the Multiplex Models in Known Mutation Carriers

At this step, 5 patients with known MEFV:M694V, MEFV:M680I, MVK:S52N, TNFRSF1A:R92Q and NLRP3:Q703K/Q703K variations were screened with the basic panel. Another 5 patients with known PSTPIP1:A196V, NOD2:V955I, NOD2:G775D, NLRP12:N394K and LPIN2:P348L variations were screened with the expanded panel. In addition to these known nonsynonymous variations, 11 more nonsynonymous variant alleles were detected. A total of 81 synonymous variations were detected; 39% of these were found in MEFV (n=10) and 24% in NLRP3 (n=10). Of the 89 UTR variations detected, 24% were in MEFV and 31% in NLRP3. In addition, 50% of the 501 intronic variations were found in NLRP3 and 24% in MVK (n=10) (Table 3).

Clinical Validations of the Multiplex Models in Undiagnosed Patients

The panels were validated in 46 patients with a pre-diagnosis of AIDs made by their primary attending physician. The symptoms reported in association with the episodes of these undiagnosed patients are summarized in Table 4. The most common symptoms were fever >38°C (82.6%), abdominal pain (52.2%) and arthralgia (52.2%). Skin involvement, such as urticarial rash, maculo-papular rash, pseudo-folliculitis and erysipelas-like rash, was described in 50% of the patients. The number of episodes per year was reported by the patients or their parents as more than 12 (10.9%), 6-12 (34.8%) and between 2 and 6 (54.3%). The duration of episodes in our cohort was 2-5 days in 71.7%, 5-10 days in 8.7% and >10 days in 19.6%.
Table 4

Clinical manifestations of the patients.

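Clinical manifestations in 46 patients

| Category | Manifestation | n (%) |
|---|---|---|
| Constitutional Symptoms | Fever (>38°C) | 38 (82.6%) |
| Constitutional Symptoms | Headache | 10 (21.7%) |
| Constitutional Symptoms | Fatigue | 15 (32.6%) |
| Musculoskeletal | Skeletal dysplasia | 1 (2.2%) |
| Musculoskeletal | Skull anomaly | 1 (2.2%) |
| Musculoskeletal | Overgrowth patella | 1 (2.2%) |
| Musculoskeletal | Myalgias | 18 (39.1%) |
| Musculoskeletal | Arthralgia | 24 (52.2%) |
| Musculoskeletal | Arthritis | 7 (15.2%) |
| Mucocutaneous | Aphthous stomatitis | 6 (13%) |
| Mucocutaneous | Exudative pharyngitis | 5 (10.9%) |
| Mucocutaneous | Cold urticaria | 1 (2.2%) |
| Mucocutaneous | Maculo-papular rash | 11 (23.9%) |
| Mucocutaneous | Urticarial rash | 6 (13%) |
| Mucocutaneous | Erysipelas-like rash | 2 (4.3%) |
| Mucocutaneous | Pseudo-folliculitis | 2 (4.3%) |
| Cardiorespiratory | Thoracic pain | 9 (19.6%) |
| Cardiorespiratory | Pleurisy | 3 (6.5%) |
| Cardiorespiratory | Pericarditis | 1 (2.2%) |
| Lymphoid | Lymphadenopathy | 8 (17.4%) |
| Lymphoid | Splenomegaly | 7 (15.2%) |
| Gastrointestinal | Abdominal pain | 24 (52.2%) |
| Gastrointestinal | Peritoneal adhesion | 2 (4.3%) |
| Gastrointestinal | Constipation | 4 (8.7%) |
| Gastrointestinal | Diarrhea | 11 (23.9%) |
| Gastrointestinal | Vomiting | 9 (19.6%) |
| Ocular | Conjunctivitis | 4 (8.7%) |
| Ocular | Keratitis | 2 (4.3%) |
| Ocular | Periorbital edema | 5 (10.9%) |
| Ocular | Visual loss | 1 (2.2%) |
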
Expanded panels were tested in 14 samples and basic panels in 32 samples, according to the possible clinical diagnosis. Primary physicians requested screening of a total of 129 genes in the 46 patients, with preliminary diagnoses of FMF (n=18), CAPS (n=25), MKD (n=19), TRAPS (n=23), Blau (n=10), DIRA (n=9), Majeed (n=9), PAPA (n=8) and NALP12 (n=8). Exonic and intronic variations were detected, grouped according to variant type and summarized in Table 3. A total of 77 nonsynonymous variations were detected: 59% of these were found in MEFV (n=46), 17% in MVK (n=46), 15% in NOD2 (n=14) and 7% in NLRP3 (n=46). A total of 359 synonymous variations were detected; 47% of these were found in MEFV and 28% in NLRP3. Of the 320 UTR variations detected, 28% were in MEFV and 41% in NLRP3. In addition, 50% of the 3673 intronic variations were found in NLRP3, 19% in MVK and 16% in MEFV. Uniform coverage was obtained across exonic, intronic and UTR regions (Figure 4). No CNV alteration was seen in any sample (Figure 5).
Figure 4

Advantages of our multiplex model. (A) Uniform coverage demonstration on the MEFV first exon* (B) Minimum and maximum coverage depth of the MEFV second exon with high GC content* (C) A deletion in the PSTPIP1 UTR as an example of InDel determination (D) Detection of a heterozygous NOD2 SNV at high sequencing depth (E) An intronic NLRP12 insertion site that is deeply covered (F) An intronic LPIN2 variation site that causes a frameshift. *These figures show the change of coverage depth along the exon. The G+C content graph can be seen at the bottom.

Figure 5

CNV analysis Manhattan plot across the sequenced chromosomes.


Discussion

In this study, we describe the development and validation of an NGS-based multiplex array enabling the “long-amplicon” approach, which is compatible with both long- and short-read NGS systems, including Illumina (MiSeq, HiSeq and NextSeq 500), Oxford Nanopore and other NGS platforms, for targeted sequencing of the nine genes associated with the most common AIDs. Validity steps included the selection of related candidate genes, primer design, development of screening methods, validation and verification of the product. This screening tool is less expensive and more comprehensive compared to other methods and more informative than traditional sequencing. Given the complexity of the clinical and molecular diagnosis of AIDs, it should be emphasized that this diagnostic kit tests all nine included genes and their known variants simultaneously. Our long-range multiplex model can screen exon-intron boundaries and deep introns. WGS studies show an increasing number of pathogenic variants located within introns at least 100 bp away from the exon-intron junction (29). Moreover, genome-wide association studies (GWAS) have revealed that many intronic variants are significantly associated with diseases (30, 31). Deep intronic variants that affect protein translation and expression have been observed in cystic fibrosis and collagenopathies (32). Deep intronic variants that cause AIDs have not yet been described. WES, hybridization probe-based sequencing and microarray-based analysis fail to detect intronic variations (29). Nevertheless, a relationship between intronic variants and AIDs has been reported. In a patient with periodic fever syndrome, an intronic NLRP12 pathogenic variant (rs104895565) activated a cryptic splice site upstream in exon 3 and caused a frameshift followed by an early stop codon (33). 
Similarly, an intronic LPIN2 pathogenic variant (rs80338808), reported in a patient with chronic recurrent multifocal osteomyelitis (CRMO) and congenital dyserythropoietic anemia (CDA), caused a frameshift in exon 17 that resulted in an early stop codon (34). Our results showed that our panel detects potential intronic variations such as the previously described rs104895565 and rs80338808 and can be extended in the light of new information ( ). Intron coverage was lowest in the LPIN2 gene (20%) and highest in the MEFV gene (100%). We are able to cover approximately 80% or more of the intronic regions in six of the genes in our panel. Long-read sequencing instruments detect small insertions and deletions (InDels) and large complex structural variants with high accuracy compared to short-read systems (35). Our panel is compatible with long-read sequencing systems, which allow high coverage and more accurate results. Rowczenio et al. used a targeted autoinflammatory panel to investigate the molecular cause of persistent fevers; a 24 nt pathogenic mutation was identified in a patient with TRAPS and confirmed as the first known case of gonosomal TNFRSF1A mosaicism (36). Our panels allow us to detect InDels in exonic, UTR ( ) and intronic regions. Methods with high specificity and uniform coverage can achieve adequate quality even with less sequence data, which also makes sequencing more cost effective. The specificity and uniformity of amplicon-based methods are considerably higher than those of WES or targeted hybrid capture-based methods (37). Our panels are highly specific to the exons of the genes they contain ( ). To increase specificity, we performed in-silico analysis and BLAST searches for our panel so that the primers do not hit a different region of the human genome. Our NGS results showed that uniform coverage depth was obtained except for GC-rich and repetitive regions ( ). 
Sequencing difficulties in GC-rich sequences of AIDs-related genes (such as MEFV exon 2) have been underlined as an important concern for diagnosis (38). We obtained a minimum of 552X and a maximum of 3512X coverage depth for MEFV exon 2 with our multiplex panel ( ). To improve GC-rich sequence read quality, we used betaine in the PCR step (39), and PCR conditions were optimized for GC-rich sequence amplification. Our results revealed that high-quality sequence reads were obtained for the targeted GC-rich regions.

Copy number variations (CNVs), which are intermediate structural variants, refer to DNA copy number changes ranging from 1 kb to 5 Mb (40). CNV calling strategies differ according to the type of NGS analysis and include paired-end read (41), read-depth (42) and split-read (43) approaches. CNV analysis in whole exome sequencing (WES) studies requires considerable effort (44). Even so, previous studies have demonstrated limitations such as low sensitivity and high false-positive rates (45). Shinar et al. highlighted the importance of CNVs in the genetic analysis of patients with AIDs (16). Sensitive CNV analysis requires an average read depth above 1000X (46) and uniform sequence coverage (47). In particular, detection of structural variants is crucial and recommended for the diagnosis of AIDs (16). Mosaicism arises from single nucleotide variations (SNVs) and CNVs, and NGS is a useful method for identifying, categorizing, verifying and validating it (48). The coverage above 1000X obtained outside GC-rich and repetitive regions, together with the uniform read depth, made our multiplex model suitable for CNV analysis. Attention was paid to ensuring that the algorithm chosen for data analysis was suitable for these parameters. However, no major structural changes or copy number changes were encountered ( ).

Depending on the method and chemistry used, our panels have some limitations. The panel, which contains nine genes, does not include all genes related to AIDs.
Our effort to expand the panel content and increase the number of genes continues. Focusing on specific regions of the genome reduces the possibility of finding novel variations (49). Compared to WGS, our panel is limited in detecting large structural variants and CNVs (50). DNA polymerase is used in both PCR amplification and library preparation, and it introduces artifacts. These artifacts cause base substitution errors at rates between 1/3,200 and 1/300,000 errors per base (51). To reduce polymerase-induced errors and increase amplification efficiency, we chose a high-fidelity enzyme for our studies (52).

In conclusion, in this study we described the development and validation of an NGS-based multiplex array enabling the “long-amplicon” approach for targeted sequencing of nine AIDs genes. This screening tool is less expensive and more comprehensive than other methods and more informative than traditional sequencing. The proposed panel has an advantage over WES or hybridization probe equivalents in terms of CNV analysis, high sensitivity and uniformity, GC-rich region sequencing, InDel detection and intron coverage ( ). Currently, there are 29 genes that have been associated with more than 30 hereditary auto-inflammatory disorders (http://fmf.igh.cnrs.fr/ISSAID/infevers/). The screening tool will be updated from time to time to incorporate newly discovered genes.
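
The read-depth strategy for CNV calling mentioned in the discussion can be illustrated with a minimal Python sketch. It is not the algorithm used in the study: the amplicon names, the depth values and the log2-ratio cutoffs (-0.6 for loss, 0.4 for gain) are all assumptions chosen for illustration. The idea is to normalize each amplicon's depth by the sample median, compare it with a reference sample, and flag amplicons whose log2 ratio departs from zero.

```python
from math import log2
from statistics import median


def log2_ratios(sample_depths, reference_depths):
    """Per-amplicon log2 ratio of median-normalized sample depth to reference depth."""
    s_med = median(sample_depths.values())
    r_med = median(reference_depths.values())
    ratios = {}
    for amplicon, depth in sample_depths.items():
        ref_depth = reference_depths[amplicon]
        if depth <= 0 or ref_depth <= 0:
            raise ValueError(f"non-positive depth for {amplicon}")
        ratios[amplicon] = log2((depth / s_med) / (ref_depth / r_med))
    return ratios


def call_cnv(ratio, loss=-0.6, gain=0.4):
    """Classify a log2 ratio; the cutoffs are assumed, illustrative values."""
    if ratio <= loss:
        return "loss"
    if ratio >= gain:
        return "gain"
    return "neutral"


# Hypothetical mean depths per amplicon (not study data).
sample = {"amp1": 2000, "amp2": 2100, "amp3": 1000, "amp4": 1900, "amp5": 2050}
reference = {"amp1": 2000, "amp2": 2000, "amp3": 2000, "amp4": 2000, "amp5": 2000}

ratios = log2_ratios(sample, reference)
calls = {amplicon: call_cnv(r) for amplicon, r in ratios.items()}
for amplicon in sample:
    print(amplicon, round(ratios[amplicon], 3), calls[amplicon])
```

With these toy inputs, the amplicon at half the expected depth is flagged as a loss, which is the signal expected for a heterozygous deletion. A comparison of this kind is only meaningful when depth is high and uniform, which is why the discussion stresses both requirements.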
Table 5

Comparison of the methods in terms of quality, duration and cost.

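| Parameter | Item | Whole Exome Seq | Hybridization Probe Targeted Seq | Singleplex Long-Range Amplicon, 4-Gene | Singleplex Long-Range Amplicon, 9-Gene | Multiplex Long-Range Amplicon, 4-Gene | Multiplex Long-Range Amplicon, 9-Gene |
|---|---|---|---|---|---|---|---|
| Hands-on Time (Rate for 1 Sample) | PCR | | | 3X | 8X | 1X | 2X |
| | Gel Electrophoresis | | | 3X | 5X | 0.5X | 1X |
| | Library Preparation | 30X | 8X | 6.5X | 6.5X | 6.5X | 6.5X |
| | Total | 30X | 8X | 12.5X | 19.5X | 8X | 9.5X |
| Coverage | High Output (120 Gb) | Mod. | High | High | High | High | High |
| | Mid Output (32 Gb) | Low | High | High | High | High | High |
| | Low Output (5 Gb) | | Mod. | Mod. | Mod. | Mod. | Mod. |
| Region | | Exons | Exons | Exons & Some Introns | Exons & Some Introns | Exons & Some Introns | Exons & Some Introns |
| CNV Analysis | | Low | Low | High | High | High | High |
| InDels Detection | | Low | Low | High | High | High | High |
| Uniform Coverage | | Low | Low | High | High | High | High |
| Compatibility with Different Sequencing Platforms | | | | + | + | + | + |
| Sensitivity | | Low | Low | High | High | High | High |
| GC-Rich Content Detection | | Low | Low | High | High | High | High |
| Estimated Cost (Rate for 1 Sample) | High Output (120 Gb) | 35X | 24X | 10X | 15X | 7X | 8X |
| | Mid Output (32 Gb) | 25X | 20X | 7X | 12X | 5X | 6X |
| | Low Output (5 Gb) | | 19X | 6X | 11X | 4X | 5X |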

The data in the table are based on 2 × 150 bp reads, and the estimated cost was calculated from consumable prices.
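
The hands-on time rows of Table 5 can be checked with simple arithmetic. The sketch below assumes that the four PCR and gel electrophoresis values belong to the four long-range amplicon columns (singleplex and multiplex, 4-gene and 9-gene); under that reading, the step values add up to the reported totals. The dictionary layout and variable names are illustrative only.

```python
# Hands-on time per sample from Table 5, in the table's relative "X" units.
# Assumption: PCR and gel values apply to the four amplicon columns only.
hands_on = {
    "Singleplex 4-Gene": {"PCR": 3.0, "Gel Electrophoresis": 3.0, "Library Preparation": 6.5},
    "Singleplex 9-Gene": {"PCR": 8.0, "Gel Electrophoresis": 5.0, "Library Preparation": 6.5},
    "Multiplex 4-Gene": {"PCR": 1.0, "Gel Electrophoresis": 0.5, "Library Preparation": 6.5},
    "Multiplex 9-Gene": {"PCR": 2.0, "Gel Electrophoresis": 1.0, "Library Preparation": 6.5},
}

# Totals as reported in the "Total" row of the table.
reported_total = {
    "Singleplex 4-Gene": 12.5,
    "Singleplex 9-Gene": 19.5,
    "Multiplex 4-Gene": 8.0,
    "Multiplex 9-Gene": 9.5,
}

# Sum the steps for each method and compare with the reported total.
computed_total = {method: sum(steps.values()) for method, steps in hands_on.items()}

for method, total in computed_total.items():
    matches = abs(total - reported_total[method]) < 1e-9
    print(f"{method}: computed {total}X, reported {reported_total[method]}X, match={matches}")
```

On these figures, moving from singleplex to multiplex for the 9-gene panel roughly halves the hands-on time (19.5X versus 9.5X), with the saving coming entirely from the PCR and gel electrophoresis steps.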

Data Availability Statement

All data relevant to the study are included in the article or uploaded as supplementary information. The raw datasets presented in this article are not readily available because of institutional restrictions and patient privacy. Requests to access the datasets should be directed to the corresponding author. The semi-processed data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics Statement

The studies involving human participants were reviewed and approved by Cerrahpasa Medical Faculty, Istanbul, Turkey, with the registry number of 83045809/604.01/02-312418/A-31 on 07.10.2015. Written informed consent to participate in this study was provided by the participants’ legal guardian/next of kin.

Author Contributions

ED and FG coordinated the study, drafted the manuscript, and analyzed the data. FG and EK performed experiments. FG, DP, and MR performed statistical analysis. ED, SO, HP, and OK participated in writing the manuscript with input from all authors. ED, HP, and OK included patients and provided clinical information and samples. ED revised the manuscript critically for important intellectual content and gave final approval of the version to be submitted for publication. ED, as PI, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the analysis. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References (47 in total)

1.  Sensitive and accurate detection of copy number variants using read depth of coverage.

Authors:  Seungtai Yoon; Zhenyu Xuan; Vladimir Makarov; Kenny Ye; Jonathan Sebat
Journal:  Genome Res       Date:  2009-08-05       Impact factor: 9.043

2.  Genome-Wide Copy Number Variation Detection Using NGS: Data Analysis and Interpretation.

Authors:  Wei Shen; Philippe Szankasi; Jacob Durtschi; Todd W Kelley; Xinjie Xu
Journal:  Methods Mol Biol       Date:  2019

Review 3.  Deep intronic mutations and human disease.

Authors:  Rita Vaz-Drago; Noélia Custódio; Maria Carmo-Fonseca
Journal:  Hum Genet       Date:  2017-05-12       Impact factor: 4.132

4.  Detecting copy-number variations in whole-exome sequencing data using the eXome Hidden Markov Model: an 'exome-first' approach.

Authors:  Satoko Miyatake; Eriko Koshimizu; Atsushi Fujita; Ryoko Fukai; Eri Imagawa; Chihiro Ohba; Ichiro Kuki; Megumi Nukui; Atsushi Araki; Yoshio Makita; Tsutomu Ogata; Mitsuko Nakashima; Yoshinori Tsurusaki; Noriko Miyake; Hirotomo Saitsu; Naomichi Matsumoto
Journal:  J Hum Genet       Date:  2015-01-22       Impact factor: 3.172

5.  Brief Report: Association of Tumor Necrosis Factor Receptor-Associated Periodic Syndrome With Gonosomal Mosaicism of a Novel 24-Nucleotide TNFRSF1A Deletion.

Authors:  Dorota M Rowczenio; Hadija Trojer; Ebun Omoyinmi; Juan I Aróstegui; Grigor Arakelov; Anna Mensa-Vilaro; Anna Baginska; Caroline Silva Pilorz; Guosu Wang; Thirusha Lane; Paul Brogan; Philip N Hawkins; Helen J Lachmann
Journal:  Arthritis Rheumatol       Date:  2016-08       Impact factor: 10.995

Review 6.  Classification criteria for autoinflammatory recurrent fevers.

Authors:  Marco Gattorno; Michael Hofer; Silvia Federici; Federica Vanoni; Francesca Bovis; Ivona Aksentijevich; Jordi Anton; Juan Ignacio Arostegui; Karyl Barron; Eldad Ben-Cherit; Paul A Brogan; Luca Cantarini; Isabella Ceccherini; Fabrizio De Benedetti; Fatma Dedeoglu; Erkan Demirkaya; Joost Frenkel; Raphaela Goldbach-Mansky; Ahmet Gul; Veronique Hentgen; Hal Hoffman; Tilmann Kallinich; Isabelle Kone-Paut; Jasmin Kuemmerle-Deschner; Helen J Lachmann; Ronald M Laxer; Avi Livneh; Laura Obici; Seza Ozen; Dorota Rowczenio; Ricardo Russo; Yael Shinar; Anna Simon; Nataša Toplak; Isabelle Touitou; Yosef Uziel; Marielle van Gijn; Dirk Foell; Claudia Garassino; Dan Kastner; Alberto Martini; Maria Pia Sormani; Nicolino Ruperto
Journal:  Ann Rheum Dis       Date:  2019-04-24       Impact factor: 19.103

7.  RNA splicing. The human splicing code reveals new insights into the genetic determinants of disease.

Authors:  Hui Y Xiong; Babak Alipanahi; Leo J Lee; Hannes Bretschneider; Daniele Merico; Ryan K C Yuen; Yimin Hua; Serge Gueroussov; Hamed S Najafabadi; Timothy R Hughes; Quaid Morris; Yoseph Barash; Adrian R Krainer; Nebojsa Jojic; Stephen W Scherer; Benjamin J Blencowe; Brendan J Frey
Journal:  Science       Date:  2014-12-18       Impact factor: 47.728

8.  Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction.

Authors:  Jian Ye; George Coulouris; Irena Zaretskaya; Ioana Cutcutache; Steve Rozen; Thomas L Madden
Journal:  BMC Bioinformatics       Date:  2012-06-18       Impact factor: 3.169

Review 9.  The innate immune perspective of autoimmune and autoinflammatory conditions.

Authors:  Shirly Frizinsky; Soad Haj-Yahia; Diti Machnes Maayan; Yulia Lifshitz; Ramit Maoz-Segal; Irean Offengenden; Mona Kidon; Nancy Agmon-Levin
Journal:  Rheumatology (Oxford)       Date:  2019-11-01       Impact factor: 7.580

10.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937
