Literature DB >> 32547748

Whole exome sequencing identifies compound heterozygous variants of CR2 gene in monozygotic twin patients with common variable immunodeficiency.

Adiratna Mat Ripen¹, Hamidah Ghani², Chai Teng Chear¹, Mei Yee Chiow², Sharifah Nurul Husna Syed Yahya¹, Asiah Kassim³, Saharuddin Bin Mohamad^2,4.

Abstract

OBJECTIVES: A pair of female Malay monozygotic twins who presented with recurrent upper respiratory tract infections, hepatosplenomegaly, bronchiectasis and bicytopenia were recruited in this study. Both patients were suspected with primary immunodeficiency diseases. However, the definite diagnosis was not clear due to complex disease phenotypes. The objective of this study was to identify the causative gene mutation in these patients.
METHODS: Lymphocyte subset enumeration test and whole exome sequencing were performed.
RESULTS: We identified a compound heterozygous CR2 mutation (c.1916G>A and c.2012G>A) in both patients. These variants were then confirmed using Sanger sequencing.
CONCLUSION: Whole exome sequencing analysis of the monozygotic twins revealed compound heterozygous missense mutations in CR2.

Entities: Chemical

Keywords: Respiratory medicine; allergy/immunology; paediatric; primary immunodeficiencies; whole exome sequencing

Year: 2020 PMID： 32547748 PMCID： PMC7249565 DOI： 10.1177/2050312120922652

Source DB: PubMed Journal: SAGE Open Med ISSN： 2050-3121

Introduction

Primary immunodeficiency diseases (PID) are rare genetic diseases that result in defective immune function.[1] Common variable immunodeficiency (CVID) is a type of PID with estimated incidence of 3.8 per 100,000 live births in Denmark, 3.1 in Iceland, 2.6 in Norway, 2.17 in Taiwan, 0.7 in France and 0.6 in Spain.[2,3] CVID comprises heterogeneous immune disorders such as antibody deficiency, increased susceptibility to autoimmunity and lymphoproliferation.[4,5] The diagnosis and management of the disease are difficult due to the polygenic nature of CVID and similar clinical symptoms shared with the other PID.[6,7] Moreover, CVID cases have been reported to inherit in both autosomal dominant and autosomal recessive patterns.[8] Owing to the wide clinical spectrum and genetic heterogeneity, genotype–phenotype relationship of CVID remains unclear.[9] Advancement of next-generation sequencing (NGS) provides a better insight into the genetic diagnosis of complex diseases.[10,11] Whole exome sequencing (WES), an NGS method that targets the protein-coding regions in the genome, had been widely used in the identification of genetic mutation in rare disorders.[12-14] The use of WES has enabled the identification of a single hemizygous, missense XIAP variant in X-linked lymphoproliferative disease 2 which may not be easily diagnosed using traditional genetic techniques due to unusual clinical manifestation.[15,16] A pair of monozygotic (MZ) twins (P1 and P2) of non-consanguineous Malay parents (Figure 1) were recruited in this study. In the first few years of life, they had several episodes of upper respiratory tract infections, which were resolved without requiring hospital admission. The patients were first hospitalised for respiratory tract infection at the age of 7, when they presented with tachypnoea and were noted to have digital clubbing, hepatosplenomegaly, failure to thrive, bronchiectasis and bicytopenia. However, bone marrow aspiration did not suggest any malignancy infiltration. The parents and three other siblings appeared healthy (Figure 1). These patients were suspected with PID. Therefore, WES was performed in both patients to determine the causative gene responsible for the disease phenotype.

Figure 1.

Family pedigree of the patients. Roman numerals i and ii indicate generations, squares indicate males, circles indicate females, filled circles indicate the affected MZ twins and crossed-out circles indicate the deceased. The unaffected family members were not studied. The values under each symbol represent the age in years.

Materials and methods

Subjects and ethics statement

The study protocol was approved by the Medical Research and Ethics Committee (MREC), Ministry of Health Malaysia (KKM/NIHSEC/P16-837). MZ twins suspected with PID were recruited in this study.

Lymphocyte subset enumeration

Whole blood (10 mL) from each patient was obtained in ethylenediaminetetraacetic acid (EDTA) blood tube for lymphocyte subset enumeration and DNA extraction. The patient’s whole blood (50 µL) was stained with two cocktails containing monoclonal antibodies in a BD Trucount tube, respectively: BD Multitest™ CD3 FITC (clone SK7)/CD8 PE (clone SK1)/CD45 PerCP (clone 2D1(HLe-1))/CD4 APC (clone SK3) and BD Multitest CD3 FITC (clone SK7)/CD16+ CD56 PE (clone B73.1 and clone NCA16.2)/CD45 PerCP(clone2D1(HLe-1))/CD19 APC (clone SJ25C1). The tubes were incubated for 15 min in dark. Then, the mixture was lysed with BD FACS™ Lysing solution and incubated for 10 minutes. The percentage and absolute count of the lymphocyte subsets were analysed using BD FACSDiva™ software on a BD FACS Canto™ II flow cytometer (Becton Dickinson, USA).

DNA extraction

Peripheral blood mononuclear cells (PBMCs) were isolated from whole blood using the standard Ficoll-Paque centrifugation procedure. Genomic DNA was extracted using QIAamp® DNA Mini Kit (Qiagen, Germany) following the manufacturer’s instructions. The quality of DNA samples was tested using 1% agarose gel electrophoresis. Qubit 2.0 Fluorometer (Life Technologies, USA) was used for DNA quantification.

Exome capture and sequencing

Both patients’ genomic DNA was subjected to WES, with average sequencing depth of 100× coverage, using Agilent SureSelect Human All Exon V5 (Agilent, USA) that targets 50-Mb exonic sequence.

Quality control check and processing of sequencing reads

Quality of raw sequence data from all FASTQ files was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Trimmomatic was used to trim raw reads with poor base quality and sequence adapters.[17] Remaining clean reads were aligned to the human reference genome GRCh38 (https://www.ncbi.nlm.nih.gov/assembly/GCA_000001405.27) using Burrows-Wheeler Aligner (BWA).[18] SAMtools was used for sorting reads and collecting statistical metrics.[19] Next, Picard tools (http://picard.sourceforge.net/) were utilised to mark the duplicate reads and collect sequencing artefact metrics.

Variant calling and annotation

Variant calling for single nucleotide polymorphisms (SNPs) and insertions/deletions (Indels) was performed using Genome Analysis Toolkit (GATK).[20] These variants were subsequently annotated using web-based ANNOVAR (wANNOVAR).[21]

Genotype concordance and discordance analysis

To determine the concordant and discordant variants in both twins, all exonic SNPs from their variant call format (VCF) files were compared. The concordant and discordant variants were extracted and annotated using BCFtools (http://samtools.github.io/bcftools) and wANNOVAR, respectively. The variants were considered concordant if the genotypes are identical in both VCF files. Whereas variants with genotypes that are unique to only one twin were categorised as discordant variants.

Variant filtering and prioritisation

Annotated variants were filtered and prioritised based on three criteria. First criterion was variant classification such as synonymous, non-synonymous, frameshift and non-frameshift mutations. Second criterion was allele frequency reported by Genome Aggregation Database (gnomAD). Third criterion was in silico protein functional prediction for variants by Sorting Intolerant From Tolerant (SIFT),[22] Polymorphism Phenotyping v2 (PolyPhen-2)[23] and MutationTaster.[24]

Polymerase chain reaction and Sanger sequencing

The CR2 variant identified by WES was confirmed by Sanger sequencing. The results were viewed using CLC Genomics Workbench v7.5.1 software.

Results

Immunological analysis

The initial diagnostic assessment revealed reduced T, B and natural killer (NK) cell counts in both patients (Figure 2 and Table 1). We found that the total T-cell values were low, 671 × 106/L and 545 × 106/L for P1 and P2, respectively. In addition, the total B-cell values were also low, 26 × 106/L and 16 × 106/L for both patients.

Figure 2.

MZ: monozygotic; SSC: side-scatter; FITC: fluorescein isothiocyanate; APC: allophycocyanine; PE: phycoerythrin.

Table 1.

Clinical and laboratory findings of the patients.

Features	P1	P2
Age at presentation	7 years	7 years
Sex	Female	Female
Geographic location	Malaysia	Malaysia
Race	Malay	Malay
Infection	URTI	URTI
Lymphoproliferation	Hepatosplenomegaly	Hepatosplenomegaly
Lung disease	Bronchiectasis	Bronchiectasis
Bone marrow aspiration	Not suggestive for any malignancy infiltration
Others	Anaemia, digital clubbing, persistent tachypnoea, failure to thrive
Treatment	IVIG	IVIG
Status	Passed away at 10 years	Passed away at 11 years
Immunoglobulins	Low IgG and IgA	Low IgG and IgA
IgG (g/L)	4.22 (N = 5.2–15.6)	4.16 (N = 5.2–15.6)
IgA (g/L)	<0.02 (N = 0.54–3.6)	<0.02 (N = 0.54–3.6)
IgM (g/L)	1.27 (N = 0.13–2.4)	9.97 (N = 0.13–2.4)
Lymphocyte populations	Lymphopenia	Lymphopenia
T cells (×10⁶/L)	671 (N = 1400–2000)	545 (N = 1400–2000)
B cells (×10⁶/L)	26 (N = 300–500)	16 (N = 300–500)
NK cells (×10⁶/L)	79 (N = 200–600)	95 (N = 200–600)

IVIG: intravenous immunoglobulin therapy; URTI: Upper respiratory tract infection; N: age-matched reference range; NK: natural killer.

Flow cytometry data on lymphocyte subset enumeration. Dot plots show the distribution of lymphocyte sub-populations in the MZ twins and a normal control in (a) CD3+ T cells versus SSC; (b) CD3+ CD8+ T cells versus CD3+ CD4+ T cells; and (c) CD16+ CD56+ NK cells versus CD19+ B cells. Each dot represents an individual cell analysed by the flow cytometer. MZ: monozygotic; SSC: side-scatter; FITC: fluorescein isothiocyanate; APC: allophycocyanine; PE: phycoerythrin. Clinical and laboratory findings of the patients. IVIG: intravenous immunoglobulin therapy; URTI: Upper respiratory tract infection; N: age-matched reference range; NK: natural killer.

WES analysis

WES generated 78,835,626 and 77,270,546 paired-end reads in P1 and P2, respectively. The reads were mapped to the human reference genome GRCh38. Both patients had achieved high percentage of mapping, which were 99.26% for P1 and 99.39% for P2. We investigated variations such as SNPs and Indels in both patients as these are the two largest types of human genetic variations.[25]

Evaluation of genotype concordance and discordance

We identified approximately 22,000 exonic SNPs in both twins, 93.4% and 90.5% of which were concordant between the twins, respectively (Figure 3(a)). Besides, we also investigated the number of private and shared SNPs between each twin and an unrelated control. Conversely, when compared with an unrelated control, only 59%–64% of exonic SNPs were found concordant (Figure 3(b) and (c)).

Figure 3.

Genotype concordance and discordance of two samples. Venn diagrams depict the number of shared and private SNPs identified between (a) P1 and P2, (b) P1 and unrelated control and (c) P2 and unrelated control.

PID gene screening

We screened the WES datasets for 240 PID-related genes as reported by the International Union of Immunological Societies for Primary Immunodeficiency to identify the genetic variations in PID-related genes.[26] We interpreted the candidate variants according to the standards and guidelines as recommended by American College of Medical Genetics and Genomics and the Association for Molecular Pathology.[27] Since the patients’ clinical symptoms resembled CVID, we narrowed the analysis to the 31 genes known to be associated with CVID. We identified 17 missense SNPs (Table 2) and 15 silent SNPs in CVID-associated genes from the WES datasets of both patients. The deleterious effect of missense SNPs on resulting proteins was evaluated using SIFT, PolyPhen-2 and MutationTaster. Homozygous SNPs were identified in CD19, CD27, CD86, ADAM28 and STXBP2. However, the damaging variant prediction tools predicted them as non-damaging variants (Table 2). Three missense SNPs (two SNPs in CR2 and one SNP in FANCA) were predicted to be damaging variants by at least one of the variant prediction softwares (Table 2). A heterozygous SNP in FANCA (rs1800282) identified in both patients was not considered as disease-associated variant because FANCA has been associated with autosomal recessive inheritance.[28] Two SNPs in CR2, c.1916G>A (rs17615) and c.2012G>A (rs17616), were identified in both patients (Table 2). Subsequent Sanger sequencing revealed that the patients’ father carried c.2012G>A (rs17616) SNP, while the mother carried c.1916G>A (rs17615) SNP (Figure 4). This indicated that both patients inherited the alleles from each of their parents leading to compound heterozygosity status.

Table 2.

Summary of missense variants found in the CVID-related genes.

Gene (dbSNP)	RefSeq	Chr	Exon	Nucleotide change	AA change (change in polarity)	Zygosity	SIFT	PolyPhen-2	MutationTaster	MAF in gnomAD
CD19 (rs2904880)	NM_001178098	16	3	c.520C>G	p. Leu174Val (both are NP)	Hom	T	B	P	0.7183
CD27 (rs2532502)	NM_001242	12	6	c.698A>G	p. His233Arg (both are Pr)	Hom	T	B	P	0.9928
CD86 (rs2681417)	NM_001206924	3	3	c.217G>A	p. Val73Ile (both are NP)	Hom	T	B	P	0.9211
CR2 (rs17615)	NM_001006658	1	10	c.1916G>A	p. Ser639Asn (both are Pr)	Het	T	D	P	0.2609
CR2 (rs17616)	NM_001006658	1	11	c.2012G>A	p. Arg671His (both are Pr)	Het	T	D	P	0.2597
LRBA (rs1782360)	NM_001199282	4	23	c.3269C>G	p. Ala1090Gly (both are NP)	Het	T	B	P	0.1245
ADAM28 (rs7814768)	NM_014265	8	22	c.2293G>A	p. Val765Met (both are NP)	Hom	T	B	P	0.9875
SDK1 (rs671694)	NM_001079653	7	7	c.383A>G	p. His128Arg (both are Pr)	Het	T	B	P	0.7519
SDK1 (rs138116831)	NM_152744	7	15	c.2161G>A	p. Val721Ile (both are NP)	Het	T	B	P	0.0008
CTLA4 (rs231775)	NM_001037631	2	1	c.49A>G	p. Thr17Ala (Pr>NP)	Het	T	B	P	0.415
STXBP2 (rs6791)	NM_001127396	19	18	c.1567A>G	p. Ile523Val (both are NP)	Hom	T	B	P	0.6342
FANCA (rs1800282)	NM_000135	16	1	c.17T>A	p. Val6Asp (NP>Pr)	Het	D	B	N	0.0769
FANCA (rs11646374)	NM_000135	16	14	c.1235C>T	p. Ala412Val (both are NP)	Het	T	B	P	0.0657
FANCA (rs2239359)	NM_000135	16	16	c.1501G>A	p. Gly501Ser (NP>Pr)	Het	T	B	P	0.5015
FANCA (rs17232910)	NM_000135	16	22	c.1927C>G	p. Pro643Ala (both are NP)	Het	T	B	P	0.0669
FANCA (rs7195066)	NM_000135	16	26	c.2426G>A	p. Gly809Asp (NP>Pr)	Het	T	B	P	0.471
FANCA (rs9282681)	NM_000135	16	40	c.3982A>G	p. Thr1328Ala (Pr>NP)	Het	T	B	P	0.0653

Chr: chromosome; AA: amino acid; NP: non-polar; Pr: polar; Hom: homozygous; Het: heterozygous; T: tolerated; D: damaging; P: polymorphism; N: neutral; MAF: minor allele frequency; gnomAD: Genome Aggregation Database; B: benign.

Figure 4.

Confirmation of compound heterozygous CR2 variants identified from WES using Sanger sequencing. DNA sequence chromatograms illustrate the variants in CR2 at (a) c.1916G>A and (b) c.2012G>A for normal control, P1, P2, mother and father of the MZ twins. The patients’ mother has a heterozygous c.1916G>A variant whereas the father has a heterozygous c.2012G>A variant, leading to the compound heterozygosity in patients.

C: normal control; M: mother; F: father.

Summary of missense variants found in the CVID-related genes. Chr: chromosome; AA: amino acid; NP: non-polar; Pr: polar; Hom: homozygous; Het: heterozygous; T: tolerated; D: damaging; P: polymorphism; N: neutral; MAF: minor allele frequency; gnomAD: Genome Aggregation Database; B: benign. Confirmation of compound heterozygous CR2 variants identified from WES using Sanger sequencing. DNA sequence chromatograms illustrate the variants in CR2 at (a) c.1916G>A and (b) c.2012G>A for normal control, P1, P2, mother and father of the MZ twins. The patients’ mother has a heterozygous c.1916G>A variant whereas the father has a heterozygous c.2012G>A variant, leading to the compound heterozygosity in patients. C: normal control; M: mother; F: father.

Discussion

Here, we performed WES analysis on a pair of MZ twins suspected with PID to identify the causative genetic variation responsible for their condition. WES results demonstrated that CR2 gene could be the possible disease-associated gene in both patients. CR2 deficiency is known to be a causative factor for CVID phenotype.[29] CVID is a primary immunodeficiency characterised by a decrease in serum IgG level, a decrease in either IgA or IgM, and a poor response to vaccines in children of at least 2 years of age, after excluding other causes of hypogammaglobulinemia.[30] Complement receptor 2 (CR2), an essential receptor for Epstein–Barr virus, that expressed primarily on B cells and follicular dendritic cells was also reported on T cells.[31] This 145-kDa membrane protein is encoded by CR2 gene with 19 exons at chromosomal location of 1q32.[32] CR2 protein functions as a coordinator for humoral response and complement system.[33] The CR2 protein is composed of 15 or 16 extracellular short consensus repeats (SCRs), one transmembrane domain and an intracytoplasmic region.[34] Various types of mutations (nonsense, frameshift, splice site and missense) have been reported across all domains of CR2 protein.[32] The mutation sites (c.1916G>A and c.2012G>A) found in our patients occurred in the exon 10 and exon 11 of CR2 gene. These mutations were predicted to cause amino acid changes in the protein-coding sequence (p. Ser639Asn and p. Arg671His), respectively, in which the resulting amino acid residues are located in the conserved sushi domain or SCRs of CR2 protein. However, protein expression analysis was unable to be conducted because both patients deceased in the middle of the study. Similarly, functional assays (in vitro or in vivo) which are crucial to evaluate the pathogenicity of causative gene especially in compound heterozygous case were unable to be performed. CR2 deficiency has been reported as autosomal recessive disorder.[26] Besides, a compound heterozygous mutation in CR2 gene was recently described to be associated with CR2 deficiency. Hypogammaglobulinemia and reduced memory B cells were noted in the patient with CR2 deficiency.[32] In our study, Sanger sequencing showed that each of our patients inherited a mutated allele from each parent, suggesting compound heterozygous status in both patients. Therefore, these variants (c.1916G>A and c.2012G>A) are most likely to be associated with the disease phenotypes seen in our patients. In addition, copy number variation (CNV) analysis in patients with complex phenotypes could be used to further investigate the concordant and discordant CNVs between MZ twins.

Conclusion

In summary, this report describes an analysis based on WES dataset in a pair of Malay-descendant MZ twins. Based on evidence from Sanger sequencing and bioinformatics analysis, the compound heterozygous missense mutations (exon 10: c.1916G>A and exon 11: c.2012G>A) in CR2 gene resulted in a change to unrelated amino acids (p. Ser639Asn and p. Arg671His) that may cause CR2 deficiency in both patients.

33 in total

1. MutationTaster evaluates disease-causing potential of sequence alterations.

Authors: Jana Marie Schwarz; Christian Rödelsperger; Markus Schuelke; Dominik Seelow
Journal: Nat Methods Date: 2010-08 Impact factor: 28.547

2. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm.

Authors: Prateek Kumar; Steven Henikoff; Pauline C Ng
Journal: Nat Protoc Date: 2009-06-25 Impact factor: 13.491

3. Four study design principles for genetic investigations using next generation sequencing.

Authors: Clinton C Mason
Journal: BMJ Date: 2017-10-12

Review 4. Impacts of variation in the human genome on gene regulation.

Authors: Rajini R Haraksingh; Michael P Snyder
Journal: J Mol Biol Date: 2013-07-16 Impact factor: 5.469

5. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.

Authors: Sue Richards; Nazneen Aziz; Sherri Bale; David Bick; Soma Das; Julie Gastier-Foster; Wayne W Grody; Madhuri Hegde; Elaine Lyon; Elaine Spector; Karl Voelkerding; Heidi L Rehm
Journal: Genet Med Date: 2015-03-05 Impact factor: 8.822

6. Evaluation of Clinical and Immunological Characteristics of Children with Common Variable Immunodeficiency.

Authors: Gülsüm Alkan; Sevgi Keles; İsmail Reisli
Journal: Int J Pediatr Date: 2018-04-23

7. Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors: Heng Li; Richard Durbin
Journal: Bioinformatics Date: 2009-05-18 Impact factor: 6.937

Review 8. Common variable immunodeficiency: an update.

Authors: Ulrich Salzer; Klaus Warnatz; Hans Hartmut Peter
Journal: Arthritis Res Ther Date: 2012-09-24 Impact factor: 5.156

9. Whole-exome sequencing and functional studies identify RPS29 as a novel gene mutated in multicase Diamond-Blackfan anemia families.

Authors: Lisa Mirabello; Elizabeth R Macari; Lea Jessop; Steven R Ellis; Timothy Myers; Neelam Giri; Alison M Taylor; Katherine E McGrath; Jessica M Humphries; Bari J Ballew; Meredith Yeager; Joseph F Boland; Ji He; Belynda D Hicks; Laurie Burdett; Blanche P Alter; Leonard Zon; Sharon A Savage
Journal: Blood Date: 2014-05-14 Impact factor: 25.476

10. Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors: Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal: Bioinformatics Date: 2014-04-01 Impact factor: 6.937