Whole exome sequencing of a Saudi family and systems biology analysis identifies CPED1 as a putative causative gene to Celiac Disease.

Hifaa A Bokhari^1, Noor Ahmad Shaik^1,2, Babajan Banaganapalli^1,2, Khalidah Khalid Nasser^2,3, Hossain Ibrahim Ageel^4, Ali Saad Al Shamrani^5, Omran M Rashidi^1, Omar Yaseen Al Ghubayshi^1, Jilani Shaik^6, Aftab Ahmad^7, Nuha Mohammad Alrayes^3,2, Jumana Yousuf Al-Aama^1,2, Ramu Elango^1,2, Omar Ibrahim Saadah^8.

Abstract

Celiac disease (CD) is a gastrointestinal disorder whose genetic basis is not fully understood. We therefore studied a Saudi family with two CD-affected siblings to discover the causal genetic defect. Through whole exome sequencing (WES), we found that both siblings had inherited an extremely rare and deleterious CPED1 variant (c.241 A > G; p.Thr81Ala) segregating as an autosomal recessive mutation, suggesting a putative causal role in CD. Saudi population-specific minor allele frequency (MAF) analysis confirmed that the variant is extremely rare in the homozygous condition (MAF is 0.0004). Sanger sequencing confirmed the absence of this homozygous variant in 100 sporadic Saudi CD cases. Genotype-Tissue Expression (GTEx) data revealed that CPED1 is abundantly expressed in gastrointestinal mucosa. Using a combination of systems biology approaches, including protein 3D modeling, stability analysis and nucleotide sequence conservation analysis, we further established that this variant is deleterious to the structural and functional aspects of the CPED1 protein. To the best of our knowledge, this variant has not been previously reported in CD or any other gastrointestinal disease. Cell culture and animal model studies could provide further insight into the exact role of the CPED1 p.Thr81Ala variant in the pathophysiology of CD. In conclusion, using WES and systems biology analysis, the present study reports for the first time CPED1 as a potential causative gene for CD in a Saudi family, with potential implications for both disease diagnosis and genetic counseling.

Keywords: Autosomal recessive; CPED1; Celiac Disease (CD); Molecular analysis; Whole exome sequencing

Year: 2020 PMID: 32489286 PMCID: PMC7254030 DOI: 10.1016/j.sjbs.2020.04.011

Source DB: PubMed Journal: Saudi J Biol Sci ISSN: 2213-7106 Impact factor: 4.219

Introduction

Celiac Disease (CD) is a chronic enteropathy elicited by gluten, a storage protein found in rye, barley, and wheat, in genetically susceptible individuals (Fasano and Catassi, 2012). CD has a worldwide prevalence of 0.5%-1%. A recent meta-analysis reported variable prevalence of CD in Asians (0.6%), North Americans (0.5%), South Americans (0.4%), and Europeans (0.8%) (Singh et al., 2018). Interestingly, CD was found to be prevalent in 1.5% of Saudi children (Al-Hussaini et al., 2017). The prevalence of CD varies among regions of the kingdom. For example, the highest seropositivity prevalence was reported in the central region (3.2%), followed by the southwestern region (2.1%), and the lowest (1.8%) was reported in the western region (Aljebreen et al., 2013). CD primarily damages the small intestinal tissues, although it shows broad clinical manifestations. According to the Oslo classification, the clinical presentation of CD can be classical, non-classical, silent, latent or refractory, among other specific subtypes (Ludvigsson et al., 2013). Therefore, one of the main complications of CD is delayed diagnosis due to its clinical diversity (Fuchs et al., 2018). Nevertheless, duodenal biopsy for small bowel villous atrophy and serological tests for anti-tTG antibodies, anti-endomysium antibodies (EmA), and deamidated gliadin peptide (DGP) antibodies are still considered the gold standard diagnostic tests for confirmation of CD (Volta et al., 2010). To date, CD treatment involves keeping patients on a strict gluten-free diet to reduce the symptoms, but not the histological manifestations (Wada et al., 2018). Like other autoimmune diseases, CD is caused by complex interactions between genetic factors, non-genetic factors and immune responses.
Gluten consumption along with HLA and non-HLA predisposition alleles contributes to its complexity, making CD not just a multifactorial but also a polygenic disorder. In particular, the HLA-DQ2.5 and HLA-DQ8 haplotypes play a crucial role in stimulating the abnormal reaction against gluten in the gastrointestinal tract. In the Saudi population, high-risk HLA-DQ alleles were found in 52.7% of healthy children from Riyadh province (Al-Hussaini et al., 2018). Although HLA genes can increase the risk of the disease, they are not sufficient for developing CD. For example, the HLA-DQ2.5 and DQ8 alleles together explain only 25%-40% of the CD genetic risk (Lindfors et al., 2019). This implies that unknown non-HLA genes also contribute substantially to disease development. Genome-wide association studies (GWAS) in CD patients have helped to discover the role of 56 non-HLA genetic loci in modifying the disease risk (Hunt et al., 2008, Trynka et al., 2009, Dubois et al., 2010, Trynka et al., 2011). As per the current understanding, non-HLA genetic susceptibility loci contribute only 10% of the CD risk (Withoff et al., 2016). Although GWAS have provided well-recognized knowledge about CD genetics, most studies were carried out among Caucasians or Asians (Hrdlickova et al., 2018a, Hrdlickova et al., 2018b). This makes it necessary to look for genetic markers in other ethnic populations such as Saudi Arabians, who possess a unique genetic architecture due to the high rate of consanguineous marriage (Al-Hussaini et al., 2018, Alnaqeb et al., 2018). Furthermore, the majority of classical GWAS were carried out among sporadic cases and could not identify any causal genes. This highlights the need to study CD in rare familial cases to unveil the role of rare variants in genes that cannot be detected by genome-wide association studies conducted in sporadic cases.
While common variants in CD have been extensively studied through microarray-based genome-wide genotyping, studying rare variants has only become feasible with next-generation sequencing approaches like whole exome sequencing (WES), which scans the coding regions of the human genome. Therefore, owing to the limited scientific data available on familial aggregation of CD, this study aimed to find potential rare variants causal to CD in a Saudi Arabian family using a combination of whole exome sequencing and systems biology approaches.

Materials and methods

Recruitment of CD family and sporadic cases

A Saudi family with two CD-affected siblings was referred to our Pediatric Gastroenterology clinic at King Abdulaziz University Hospital. Both children met the standard diagnostic guidelines of the European Society for Pediatric Gastroenterology, Hepatology and Nutrition (ESPGHAN) for CD (Husby et al., 2012). In brief, CD diagnosis included clinical examination by pediatric gastroenterologists, intestinal endoscopy and a serological test for tissue transglutaminase antibodies (anti-tTG). A multi-generation pedigree was drawn by geneticists after collecting the full family history. To validate the genetic findings in the CD family, we additionally recruited 100 sporadic CD cases who had been regularly followed up at our gastroenterology clinic since diagnosis. The study protocol was approved by the Research Ethics Committee, King Abdulaziz University Hospital, Jeddah (KAUH). Written informed consent was obtained from the parents after the study protocol was explained to them. Approximately 2.5–3 ml of venous blood was collected from each family member in EDTA tubes and stored at −80 °C until genetic analysis was performed.

DNA extraction

Genomic DNA was extracted and purified with QIAamp DNA blood Kit, following the manufacturer’s instructions. DNA concentration and purity were measured by Nanodrop (ND-1000 UV–VIS) spectrophotometer.

WES analysis

Genomic DNA (100 ng/µl; 260/280 ratio of 1.8–2.0) was used for library preparation. Whole exome sequencing was performed on a HiSeq2000 Next Generation Sequencer (Illumina, San Diego, CA). DNA was sheared using the Agilent SureSelect Target V6 Enrichment capture kit to obtain fragments in the optimal size range. The fragmented DNA was hybridized with ultra-long 120-mer biotinylated cRNA library baits. Library concentration and size were examined by capillary electrophoresis. Different adapters were integrated during enrichment, allowing the samples to be amplified for sequencing. The sequencing reads (in FASTQ format) were aligned against the human genome reference sequence build 38 (GRCh38.p12) supported by BLAST (version 0.6.4d) for variant calling and annotation. The Genome Analysis Toolkit (GATK) was used for downstream high-quality variant calling (McKenna et al., 2010). The SAMtools suite was used to differentiate SNPs and short insertions/deletions in each read (Basha et al., 2018). The read coverage was ~100×, representing 87% of the targeted region (Fig. 1).

Fig. 1

Workflow of whole-exome sequencing. Phase I: Sample preparation; Phase II: Sequence read mapping; Phase III: Variant calling; Phase IV: Variant validation and genotyping.

Variant filtering and candidate gene selection

A CD candidate gene list was compiled from different databases (NCBI, GWAS catalog, celiac gene panels) to verify the presence of these genes in the exome data of our patients. From the WES data, variants were retained if: (1) they passed sequencing quality control; (2) they were found in coding or regulatory sites; (3) they had a minor allele frequency (MAF) of < 1.5% in the general population and in African, East Asian and South Asian populations; (4) they had an allele frequency of < 1.49 × 10⁻² in the gnomAD database; and (5) they were homozygous or compound heterozygous, consistent with an autosomal recessive inheritance pattern. Moreover, all genes with variants were filtered by their functional relevance to autoimmune diseases in general, and to CD in particular. Genes whose expression was reported in the small intestine or any related part of the gastrointestinal system were considered potential candidates for CD. The rarity of the filtered variants was further verified in the WES data of 2379 healthy Saudi volunteers, hosted on the Saudi Human Genome Project webserver (SHGP: https://www.saudigenomeproject.org/en/).
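The five retention criteria can be sketched as a small filter. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Variant` fields, the region labels and the toy records are all hypothetical, and compound heterozygosity is approximated as "two or more rare heterozygous variants in the same gene" without the parental phasing that a real analysis would need.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

MAF_CUTOFF = 0.015        # criterion (3): population MAF below 1.5%
GNOMAD_CUTOFF = 1.49e-2   # criterion (4): gnomAD allele frequency
KEPT_REGIONS = {"coding", "5_prime_UTR", "3_prime_UTR"}  # hypothetical labels


@dataclass
class Variant:
    gene: str
    passed_qc: bool
    region: str
    population_maf: float  # highest MAF across the reference populations
    gnomad_af: float
    zygosity: str          # "hom" or "het" in the affected sibling


def passes_basic_filters(v: Variant) -> bool:
    """Criteria (1) to (4): quality, location and rarity."""
    return (
        v.passed_qc
        and v.region in KEPT_REGIONS
        and v.population_maf < MAF_CUTOFF
        and v.gnomad_af < GNOMAD_CUTOFF
    )


def recessive_candidates(variants: List[Variant]) -> List[Variant]:
    """Criterion (5), simplified: keep homozygous variants, plus heterozygous
    variants in genes that carry at least two rare heterozygous variants."""
    kept = [v for v in variants if passes_basic_filters(v)]
    het_per_gene = Counter(v.gene for v in kept if v.zygosity == "het")
    return [v for v in kept if v.zygosity == "hom" or het_per_gene[v.gene] >= 2]


# Toy records, invented for illustration only.
toy = [
    Variant("GENE_A", True, "coding", 0.0011, 0.0011, "hom"),
    Variant("GENE_B", True, "coding", 0.0020, 0.0015, "het"),
    Variant("GENE_B", True, "3_prime_UTR", 0.0005, 0.0004, "het"),
    Variant("GENE_C", True, "coding", 0.0010, 0.0010, "het"),    # single het
    Variant("GENE_D", True, "intronic", 0.0001, 0.0001, "hom"),  # excluded region
    Variant("GENE_E", True, "coding", 0.0738, 0.0700, "hom"),    # too common
]
selected_genes = sorted({v.gene for v in recessive_candidates(toy)})
print(selected_genes)
```

With these toy records only GENE_A (rare homozygous) and GENE_B (two rare heterozygous variants) are retained.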

Sanger sequencing of candidate variant

The potential WES-derived candidate variants were further confirmed through Sanger sequencing. For this purpose, we designed specific oligonucleotide primers using the open-source NCBI Primer-BLAST webserver (Ye et al., 2012), with a target amplicon size of 400–600 bp (Supplementary Table 1). All sequence reads were aligned and analyzed against the reference mRNA sequence (ENST00000310396.10) of the CPED1 gene with the BioEdit program (http://www.mbio.ncsu.edu/). The mutation position was numbered with the “A” of the first ATG codon of the mRNA as position 1.
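Counting from the “A” of the first ATG ties a coding position to a codon by simple integer arithmetic. The sketch below is an illustration with a hypothetical helper name, `codon_location`. It checks that coding position 241 is the first base of codon 81, and that an A > G change at the first base of any threonine codon in the standard genetic code yields an alanine codon, in line with p.Thr81Ala.

```python
THR_CODONS = {"ACT", "ACC", "ACA", "ACG"}  # threonine, standard genetic code
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}  # alanine, standard genetic code


def codon_location(coding_pos: int) -> tuple:
    """Map a coding position (A of ATG = 1) to (codon number, base within codon)."""
    if coding_pos < 1:
        raise ValueError("coding positions start at 1")
    codon_number = (coding_pos - 1) // 3 + 1
    base_in_codon = (coding_pos - 1) % 3 + 1
    return codon_number, base_in_codon


codon_number, base_in_codon = codon_location(241)

# Replace the first base (A) of every threonine codon with G.
mutated = {"G" + codon[1:] for codon in THR_CODONS}

print(codon_number, base_in_codon, mutated == ALA_CODONS)
```

Running the sketch prints `81 1 True`.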

Computational analysis of candidate protein

The human CPED1 sequences (both nucleotide and amino acid) were aligned against the CPED1 sequences of 13 selected primates, with the help of the comparative genomics orthologue option available in the Ensembl browser (www.ensembl.org), to examine whether the causative variant is located in a conserved sequence region. We also sought to explore the consequences of the candidate rare variant on protein structure by simulating its 3D structure through either ab-initio or homology modeling approaches, depending on the availability of an experimentally solved structure (Rafi et al., 2019). The confidence of each model is quantified by the C-score, based on (a) the significance of the threading template alignments and (b) the convergence parameters of the structure assembly simulations. The best model, selected by the highest C-score, was subjected to energy minimization with the GROMACS steepest descent method using the NOMAD-Ref web server (Lindahl et al., 2006). The energy-minimized native model was then entered into the DUET software to build the mutated model and to predict the impact of the variant on protein stability (Abduljaleel, 2019). In addition, variations in backbone atoms between the superimposed proteins were analyzed at both the whole-structure and residue levels by calculating their positional root mean square deviation (RMSD) values. The RMSD cutoff considered for whole-structure alterations was > 2.0 Å and for residue-level alterations > 0.2 Å (Ahmed Awan et al., 2020, Masoodi et al., 2019, Kufareva and Abagyan, 2012). Finally, network analysis results from the STRING v.9.1 webserver were used to identify the key pathways and gene networks based on known protein–protein interactions between the candidate proteins and other genes (Franceschini et al., 2013, Safaei et al., 2016).
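The RMSD comparison described above can be illustrated with a minimal sketch. It assumes that the two structures are already superimposed and that their backbone atoms are listed in the same order. The function names and toy coordinates are hypothetical; only the 2.0 Å and 0.2 Å cutoffs come from the text.

```python
import math

WHOLE_STRUCTURE_CUTOFF = 2.0  # Å, cutoff for whole-structure alterations
RESIDUE_CUTOFF = 0.2          # Å, cutoff for residue-level alterations


def per_atom_deviation(native, mutant):
    """Distance between matching atoms of two pre-superimposed structures."""
    if len(native) != len(mutant) or not native:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    return [math.dist(a, b) for a, b in zip(native, mutant)]


def rmsd(native, mutant):
    """Root mean square deviation over all matched atoms."""
    deviations = per_atom_deviation(native, mutant)
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Toy coordinates (Å), invented for illustration only.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
mutant = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 3.0)]

deviations = per_atom_deviation(native, mutant)
flagged = [i for i, d in enumerate(deviations) if d > RESIDUE_CUTOFF]
overall = rmsd(native, mutant)
print(flagged, round(overall, 3), overall > WHOLE_STRUCTURE_CUTOFF)
```

In this toy case only the third atom exceeds the residue-level cutoff, while the overall RMSD (the square root of 3, about 1.732 Å) stays below the whole-structure cutoff.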

Results

Case presentation

The proband III.1 (3 years) was born to unrelated parents with no family history of celiac disease (Fig. 2). With the introduction of formula milk, she developed gastrointestinal symptoms including severe abdominal pain, bloating and diarrhea. She also showed other symptoms, including mild osteoporosis, eczema, severe skin rash, short stature, weight loss and lack of appetite. In addition, she had previously been diagnosed with left congenital renal and ureter disease. Endoscopy showed fissures on the folds and absence of intestinal villi, and her tTG antibody screening confirmed that she was CD positive. Her younger sister III.2 (2 years) suffered from severe diarrhea, vomiting and abdominal bloating, which were triggered when solid food was introduced at six months of age. She was also put on a gluten-free diet. She has a normal weight chart and no further complications were reported. All symptoms disappeared after both sisters were kept on a gluten-free diet.

Fig. 2

Sanger sequencing analysis of the CD family. Probands are indicated by an arrow. Exome-sequenced individuals are marked with an asterisk (*), and the electrophoretic traces for the CPED1 mutation are shown. The probands are homozygous for the mutation in exon 2 (c.241 G/G). Both parents are heterozygous for the mutation (c.241 A/G).

Exome sequencing

The terms ‘variant’ and ‘mutation’ are used interchangeably in this manuscript. Exome sequencing yielded approximately 100 K variants for each individual. Non-coding variants (downstream, upstream, intergenic and intronic variants) were excluded, except for regulatory region variants (3′-UTR and 5′-UTR). All data were filtered to exclude shared variants with MAF of > 0.015 based on 1000 Genomes Project data. The exome data were screened for both homozygous and heterozygous mutations. In these patients, about 2600 unique and rare variants were detected in the form of missense, frameshift and regulatory region mutations (Fig. 3).

Fig. 3

The genetic variant yield of the CD family. The yellow section indicates all variants generated from whole exome sequencing. The green section indicates known variants with MAF > 1.5%. The orange section indicates variants with MAF < 1.5%. The blue section indicates the final filtered variants. Note: Filtering criteria for variants: exonic, unknown or extremely rare (MAF < 0.015).

All the filtered variants were further segregated based on their functional annotations to CD or other gastrointestinal diseases with a known autoimmune background. The final list consisted of 16 pathogenic variants (8 missense, 2 frameshift and 6 UTR), spanning 9 genes located on chromosomes 7, 12, 14, 18, 19 and 22. All these genes are known to have a putative functional relationship with inflammatory responses or autoimmune diseases (Table 1, Table 2). Among these 16 variants, a homozygous missense mutation in the cadherin-like and PC-esterase domain containing 1 (CPED1) gene was the only one that survived our stringent variant filtering criteria, as detailed in the Methods. The interim release of UK Biobank (UKBB) data on gene-phenotype associations reported that a number of CPED1 variants are significantly associated with gastrointestinal clinical phenotypes (Staley et al., 2016, Kamat et al., 2019). The putative CD causal variant of the CPED1 gene was found in the homozygous condition in both siblings and in the heterozygous condition in both parents.

Table 1

List of homozygous nucleotide variants which showed autosomal recessive inheritance pattern in CD family.

No.	Gene name	Chromosomal position	Ref/Alt Bases	Effect	HGVS.c	HGVS.p	dbSNP ID	1000G	ExAC
1	HEBP1	12: 13,000,192	C/CGGCGGCAGGGCGGCAG	5′ UTR variant	c.-94_-79dupCTGCCGCCCTGCCGCC	.	rs71064363	.	.
2	NFATC1	18: 79,400,367	C/CGCCCCG	5′ UTR variant	c.-14_-9dupCGGCCC	.	rs749710747	.	.
3	CPED1	7: 120,989,862	A/G	Missense variant	c.241A > G	p.T81A	rs117047013	0.0011	0.0011

Table 2

List of compound heterozygous nucleotide variants following autosomal recessive inheritance pattern in CD family.

No.	Gene name	Chromosomal position	Ref/Alt Bases	Effect	HGVS.c	HGVS.p	dbSNP ID	1000G	ExAC
1	CYP27B1	12: 57,766,052	G/C	Missense variant	c.341C > G	p.T114R	rs1247839240	.	.
	CYP27B1	12: 57,766,119	C/T	Missense variant	c.274G > A	p.A92T	rs958027919	.	.
2	HIF1A	14: 61,697,832	A/AT	5′ UTR variant	c.-9dupT	.	rs769491528	.	.
	HIF1A	14: 61,738,090	C/T	Missense variant	c.1325C > T	p.T442I	rs41508050	0.0039	0.0036
3	SMAD4	18: 51,082,732	G/GT	3′ UTR variant	c.*4277dupT	.	rs559471969	.	.
	SMAD4	18: 51,084,012	G/GCACA	3′ UTR variant	c.5574_5577dupCACA	.	rs1491420266	.	.
4	MUC16	19: 8,888,826	T/TCCGA	Frameshift variant	c.40672_40673insTCGG	p.K13558fs	rs769228524	.	.
	MUC16	19: 8,974,159	GA/G	Frameshift variant	c.6979delT	p.S2327fs	rs759867982;rs208917	.	.
	MUC16	19: 8,974,184	T/C	Missense variant	c.6955A > G	p.I2319V	rs200019597;rs208918	.	0.0003
5	FCGBP	19: 39,886,005	T/G	Missense variant	c.8174A > C	p.N2725T	rs2916067	.	0.00005
	FCGBP	19: 39,902,423	C/T	Missense variant	c.4405G > A	p.G1469R	rs143680639	.	0.0738
6	SLC7A4	22: 21,029,052	C/T	3′ UTR variant	c.*3G > A	.	rs140809496	0.0023	.
	SLC7A4	22: 21,031,031	G/T	Missense variant	c.782C > A	p.A261D	.	.	.

This homozygous nucleotide variant, c.241A > G (rs117047013), results in the substitution of threonine by alanine at residue position 81. The variant had an allele frequency of 0.02985 (~3%) among 2379 individuals from Saudi Arabia, of whom only one had a homozygous genotype (1/2379 = 0.0004). In the 1000 Genomes Project, the allele frequency was reported to be 0.1%, with only one homozygous individual. Moreover, in gnomAD, which hosts the genetic data of ~251 K individuals from across the world, the frequency of this variant is 0.1%. Hence, the rarity of this variant strongly suggests that it is pathogenic and has a potential role in the pathophysiology of CD.
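The Saudi population figures quoted above can be rechecked with a few lines of arithmetic. The sketch below recomputes the observed homozygote frequency from the reported counts. As an additional assumption that is not part of the original analysis, it also estimates how many homozygotes Hardy-Weinberg proportions would predict from the reported allele frequency. All variable names are illustrative.

```python
N_INDIVIDUALS = 2379        # healthy Saudi volunteers with WES data
ALT_ALLELE_FREQ = 0.02985   # reported allele frequency of c.241A > G
OBSERVED_HOMOZYGOTES = 1    # reported number of homozygous individuals

# Observed homozygote frequency, as quoted in the text (1/2379).
observed_hom_freq = OBSERVED_HOMOZYGOTES / N_INDIVIDUALS

# Assumption: Hardy-Weinberg proportions, so homozygote frequency is q squared.
expected_hom_freq = ALT_ALLELE_FREQ ** 2
expected_hom_count = expected_hom_freq * N_INDIVIDUALS

print(f"observed homozygote frequency: {observed_hom_freq:.4f}")
print(f"allele frequency as a percentage: {ALT_ALLELE_FREQ * 100:.1f}%")
print(f"expected homozygotes under Hardy-Weinberg: {expected_hom_count:.2f}")
```

Under this assumption about two homozygotes would be expected among 2379 individuals, against the single homozygote reported.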

Validation of family specific variant using Sanger sequencing

Sanger sequencing confirmed that both affected sisters (III.1 and III.2) were homozygous for the mutation c.241A > G (GG genotype), while both parents (II.1 and II.2) were heterozygous carriers (AG genotype), consistent with autosomal recessive inheritance of this variant in the family. The variant was absent in 98% of sporadic Saudi CD cases, whereas the remaining 2% were heterozygous carriers. None of the sporadic cases was homozygous, which supports the unique familial segregation of this variant and the proposed causative role of CPED1 in celiac disease.

Computational analysis of CPED1 mutant and wild type gene products

Cross-species alignment of CPED1 nucleotide sequences showed that the variant site falls in a highly conserved region (Fig. 4), further supporting our initial suspicion that this variant is pathogenic. Additionally, amino acid sequence analysis confirmed the high-level conservation of the residue and its location across nine primates, indicating its critical role in the function of the CPED1 protein.

Fig. 4

(A) Phylogenetic tree of the human Cadherin-like and PC-esterase domain containing 1 (CPED1) aligned with 13 primates, showing the aligned sequence in dark blue sections and gaps in white sections. (B) Nucleotide sequence alignment of the human and primate CPED1 gene showing the highly conserved site of the CPED1 variant. (C) Alignment region containing the CPED1 variant on chromosome 7.

Although an open reading frame for the CPED1 gene has been defined, its three-dimensional protein structure has not been solved through conventional X-ray crystallography methods. Therefore, we adopted an ab-initio modeling approach using I-TASSER, which uses only the most significant threading alignments to build the CPED1 protein model. The predicted CPED1 protein model had a C-score of −1.16, a TM-score of 0.57 ± 0.15 and an RMSD score of 11.8 ± 4.5 Å. Overall, these scores indicate the high quality of the predicted structure and accurate folding of the CPED1 protein chains. Furthermore, superimposition at the protein residue level revealed significant divergence of the CPED1 protein in the mutant condition (RMSD score is 1.8735 Å) (Fig. 5).

Fig. 5

Molecular view of protein superimposition of native (green in color) and mutant (blue in color) version of CPED1 proteins.

The DUET integrated computational approach, which combines the two methods mCSM and SDM using SVM regression with a Radial Basis Function kernel, predicted that the p.Thr81Ala substitution destabilizes the CPED1 protein (ΔΔG is −0.562 kcal/mol). Furthermore, protein–protein interaction network analysis showed the functional association of CPED1 with ten protein partners, of which Wnt family member 16 (WNT16) and Protein Phosphatase 6 Regulatory Subunit 3 (PPP6R3) had the highest confidence scores of 0.61 and 0.57, respectively (Fig. 6).

Fig. 6

Predicted interactions in protein network analysis for CPED1 (red colored node) were exported from STRING online database (http://string-db.org). Circled protein nodes indicate the strongest protein partners for CPED1 protein.

Discussion

HLA loci, spanning several genes and variants, are by and large the major known genetic risk factor contributing to the development of CD (Abadie et al., 2020). In the quest to resolve the non-HLA genetic basis of CD, several GWAS with a case-control design were conducted in several ethnic populations. However, most studies reported associations of genetic variants that could explain disease risk only for the specific ethnic population studied. The first GWAS, conducted with 778 CD cases and 1422 controls, revealed a genetic variant (rs13119723) in a linkage disequilibrium block encompassing the KIAA1109/Tenr/IL2/IL21 genes as a novel susceptibility factor for CD in the British population (van Heel et al., 2007). Later, the list of potential CD risk variants was expanded with the CCR3, IL12A, IL18RAP, RGS1, SH2B3 and TAGAP genes, which regulate immune reactions in humans (Hunt et al., 2008). GWAS have so far identified 57 risk loci spanning multiple genes such as PTPN2, IL18RAP, CCR4, ICOSLG, TAGAP, DUSP10 and PUS10 (Dubois et al., 2010, Festen et al., 2011, Garner et al., 2014, Hrdlickova et al., 2018a, Hrdlickova et al., 2018b, Ostensson et al., 2013, Trynka et al., 2011, Zhernakova et al., 2011). However, these GWAS were conducted on patients of European and American ethnicity, leaving a knowledge gap on CD in the Arab population, which has a high inbreeding rate (Al-Hussaini et al., 2018, Alnaqeb et al., 2018). A single case-control study from Saudi Arabia reported that a 518 Thr/Thr polymorphism in MMEL1 had an independent association with susceptibility to CD in Saudi patients (Saadah et al., 2015). However, all the above studies highlighted only statistically associated common variants, not the specific causal factors of CD etiology.
In contrast, WES has created a paradigm shift in the genetic diagnosis of complex diseases, which are otherwise very difficult to study by conventional Sanger sequencing alone (Li et al., 2018, Maffucci et al., 2016). A study applying WES and deep resequencing to a large cohort of large families of Caucasian origin with many affected members was unable to identify any new CD mutations (Mistry et al., 2015). Moreover, a recent study of a Saudi family found a putative protective role of a rare homozygous insertion (c.1683_1684insATT) in the AK5 gene, which modifies the risk of CD development in the Saudi population (Al-Aama et al., 2017). With the recent availability of Saudi Human Genome Project (SHGP) data, it is now feasible to find true positive disease causative genes (Al-Hamed et al., 2019, Yemni et al., 2019). Here, using WES as a hypothesis-free survey to scan for potential rare coding variants, we identified a rare CPED1 genetic mutation (c.241A > G). Remarkably, this variant has not been previously associated with any disease. This study is the first to propose CPED1 as a causative gene of CD in a Saudi family. The gene is located on the long arm of chromosome 7 at position 31.31 (7q31.31). It is composed of 3081 nucleotides and 23 exons and encodes a 1,026 amino acid long protein (Zhu et al., 2012). The protein comprises an N-terminal signal peptide that guides the protein product to the secretory pathway. It consists of two domains mainly involved in acyl-transferase and acyl-esterase activities for the modification of glycoproteins: a cadherin-like domain and a PC-esterase domain with an N-terminal fused ATP-grasp domain (Anantharaman and Aravind, 2010). Our in-silico analysis showed decreased stability of the assembly of these two domains, which may affect protein integrity and might also contribute to CD.
Moreover, CD has been reported to share common susceptibility risk loci with other immune diseases such as inflammatory bowel disease and rheumatoid arthritis. A meta-analysis of GWAS data reported four risk loci (PUS10, PTPN2, TAGAP and IL18RAP) shared between CD and Crohn’s disease owing to their common genetic pathway (Festen et al., 2011), and seven loci shared with rheumatoid arthritis (Zhernakova et al., 2011), reflecting the shared mechanisms of antigen presentation and T-cell activation in the two diseases. Furthermore, a cross-disease study based on gene expression showed upregulation of the FASLG, PLEK, CCR4 and TAGAP genes in both CD and ulcerative colitis (Medrano et al., 2019). Overall, our results show a similar correlation, linking CPED1 as a CD causal gene, based on genome-wide association studies in UKBB, with inflammatory bowel disease, rheumatoid arthritis and multiple intestinal diseases (Supplementary table 2). Moreover, a GWAS among 3,475 individuals of Finnish ancestry found an association of an intronic CPED1 variant (rs77670845) with alterations in plasma interleukin 2 (IL-2) levels (Ahola-Olli et al., 2017). IL-2 has an important role in the immune system, especially in the prevention of autoimmune disease, where it promotes the differentiation of particular immature T-cells into regulatory T-cells, which suppress other T-cells that are otherwise primed to attack normal healthy cells in the body (Zhang et al., 2019). IL-2 reportedly plays a role in the immune response to gluten consumption in CD patients (Tye-Din et al., 2019), suggesting that elevated serum IL-2 could serve as a reliable diagnostic marker in celiac patients (Goel et al., 2019). Because CPED1 is a novel candidate gene with largely unknown functions, reports of CPED1 mutations are few and limited to 20 citations in PubMed.
Data from the Human Protein Atlas indicate that CPED1 is expressed in several tissues and is most abundant in the gastrointestinal mucosa. CPED1 is likewise expressed in myeloid dendritic cells (DCs), plasmacytoid DCs, intermediate monocytes, classical monocytes and non-classical monocytes. Protein network predictions linked CPED1 to the protein PPP6R3, which is known for its important role in maintaining immune self-tolerance (Coulibaly et al., 2019). Another predicted protein–protein association was between CPED1 and the WNT16 protein. WNT signaling has a fundamental role in cellular development, proliferation and cell fate regulation (Loh et al., 2016). WNT signaling is important for intestinal cell proliferation and gut development (Gregorieff et al., 2005), and alterations in such signaling pathways can affect intestinal epithelial proliferation, as it is expressed early in human infants (Dudhwala et al., 2019). This suggests a therapeutic potential for activating WNT signaling in the compromised intestinal epithelia of IBD and CD patients (Kuo, 2005). WNT16 activates the β-catenin/T-cell factor (TCF) pathway in dendritic cells. Recent studies have reported that activation of the Wnt/β-catenin pathway in DCs plays a crucial role in mucosal tolerance and the suppression of chronic autoimmune pathologies (Suryawanshi et al., 2016). Emerging evidence supports our hypothesis that mutations in the CPED1 gene may alter WNT signaling in DCs of the intestinal epithelium and hence most likely contribute to celiac pathogenesis. Human GWAS have repeatedly mapped the uncharacterized CPED1 gene to bone mineral density (BMD) phenotypes and osteoporosis development (Chesi et al., 2017, Estrada et al., 2012, Medina-Gomez et al., 2012, Medina-Gomez et al., 2018). These findings may also explain the short stature phenotype in our patients and the growth failure in Saudi CD patients (Saadah, 2020).
Shifting from genetic causes to environmental factors, one study showed that cesarean delivery can play a crucial role in the development of CD by influencing the mucosal immune system through the infant’s microbiome (Decker et al., 2011). Although more recent studies have reported otherwise (Dydensborg Sander et al., 2018, Koletzko et al., 2018), we found that both affected individuals in this family were born by cesarean section. This study has a few limitations. First, the affected sisters have no other siblings with whom to compare the inheritance pattern of the variant, and the family has no history of CD. Second, CPED1 has no known function, which makes it more challenging to support our claim through in-silico analysis. In conclusion, we showed how WES can be used as a powerful genetic strategy for resolving the molecular basis of complex or polygenic diseases like CD, and how it presents an opportunity to discover new potential candidate genes and therapeutic targets helpful in disease diagnosis and treatment. Our WES analysis identified a missense variant in the CPED1 gene (c.241A > G) and proposes CPED1 as a novel causal gene for CD in a Saudi family. This study expands the spectrum of celiac mutations that could be useful for the genetic diagnosis and counseling of families with CD. Cell line based functional studies of the gastrointestinal system are needed to confirm the specific molecular mechanisms through which CPED1 mutations contribute to the biological pathways connected to CD.

63 in total

Review 1. Coeliac disease.

Authors: Katri Lindfors; Carolina Ciacci; Kalle Kurppa; Knut E A Lundin; Govind K Makharia; M Luisa Mearin; Joseph A Murray; Elena F Verdu; Katri Kaukinen
Journal: Nat Rev Dis Primers Date: 2019-01-10 Impact factor: 52.329

2. A locus at 7p14.3 predisposes to refractory celiac disease progression from celiac disease.

Authors: Barbara Hrdlickova; Chris J Mulder; Georgia Malamut; Bertrand Meresse; Mathieu Platteel; Yoichiro Kamatani; Isis Ricaño-Ponce; Roy L J van Wanrooij; Maria M Zorro; Marc Jan Bonder; Javier Gutierrez-Achury; Christophe Cellier; Alexandra Zhernakova; Petula Nijeboer; Pilar Galan; Sebo Withoff; Mark Lathrop; Gerd Bouma; Ramnik J Xavier; Bana Jabri; Nadine C Bensussan; Cisca Wijmenga; Vinod Kumar
Journal: Eur J Gastroenterol Hepatol Date: 2018-08 Impact factor: 2.566

3. Methods of protein structure comparison.

Authors: Irina Kufareva; Ruben Abagyan
Journal: Methods Mol Biol Date: 2012

4. Newly identified genetic risk variants for celiac disease related to the immune response.

Authors: Karen A Hunt; Alexandra Zhernakova; Graham Turner; Graham A R Heap; Lude Franke; Marcel Bruinenberg; Jihane Romanos; Lotte C Dinesen; Anthony W Ryan; Davinder Panesar; Rhian Gwilliam; Fumihiko Takeuchi; William M McLaren; Geoffrey K T Holmes; Peter D Howdle; Julian R F Walters; David S Sanders; Raymond J Playford; Gosia Trynka; Chris J J Mulder; M Luisa Mearin; Wieke H M Verbeek; Valerie Trimble; Fiona M Stevens; Colm O'Morain; Nicholas P Kennedy; Dermot Kelleher; Daniel J Pennington; David P Strachan; Wendy L McArdle; Charles A Mein; Martin C Wapenaar; Panos Deloukas; Ralph McGinnis; Ross McManus; Cisca Wijmenga; David A van Heel
Journal: Nat Genet Date: 2008-03-02 Impact factor: 38.330

5. Coeliac disease-associated risk variants in TNFAIP3 and REL implicate altered NF-kappaB signalling.

Authors: G Trynka; A Zhernakova; J Romanos; L Franke; K A Hunt; G Turner; M Bruinenberg; G A Heap; M Platteel; A W Ryan; C de Kovel; G K T Holmes; P D Howdle; J R F Walters; D S Sanders; C J J Mulder; M L Mearin; W H M Verbeek; V Trimble; F M Stevens; D Kelleher; D Barisani; M T Bardella; R McManus; D A van Heel; C Wijmenga
Journal: Gut Date: 2009-02-24 Impact factor: 23.059

6. Life-Course Genome-wide Association Study Meta-analysis of Total Body BMD and Assessment of Age-Specific Effects.

Authors: Carolina Medina-Gomez; John P Kemp; Katerina Trajanoska; Jian'an Luan; Alessandra Chesi; Tarunveer S Ahluwalia; Dennis O Mook-Kanamori; Annelies Ham; Fernando P Hartwig; Daniel S Evans; Raimo Joro; Ivana Nedeljkovic; Hou-Feng Zheng; Kun Zhu; Mustafa Atalay; Ching-Ti Liu; Maria Nethander; Linda Broer; Gudmar Porleifsson; Benjamin H Mullin; Samuel K Handelman; Mike A Nalls; Leon E Jessen; Denise H M Heppe; J Brent Richards; Carol Wang; Bo Chawes; Katharina E Schraut; Najaf Amin; Nick Wareham; David Karasik; Nathalie Van der Velde; M Arfan Ikram; Babette S Zemel; Yanhua Zhou; Christian J Carlsson; Yongmei Liu; Fiona E McGuigan; Cindy G Boer; Klaus Bønnelykke; Stuart H Ralston; John A Robbins; John P Walsh; M Carola Zillikens; Claudia Langenberg; Ruifang Li-Gao; Frances M K Williams; Tamara B Harris; Kristina Akesson; Rebecca D Jackson; Gunnar Sigurdsson; Martin den Heijer; Bram C J van der Eerden; Jeroen van de Peppel; Timothy D Spector; Craig Pennell; Bernardo L Horta; Janine F Felix; Jing Hua Zhao; Scott G Wilson; Renée de Mutsert; Hans Bisgaard; Unnur Styrkársdóttir; Vincent W Jaddoe; Eric Orwoll; Timo A Lakka; Robert Scott; Struan F A Grant; Mattias Lorentzon; Cornelia M van Duijn; James F Wilson; Kari Stefansson; Bruce M Psaty; Douglas P Kiel; Claes Ohlsson; Evangelia Ntzani; Andre J van Wijnen; Vincenzo Forgetta; Mohsen Ghanbari; John G Logan; Graham R Williams; J H Duncan Bassett; Peter I Croucher; Evangelos Evangelou; Andre G Uitterlinden; Cheryl L Ackert-Bicknell; Jonathan H Tobias; David M Evans; Fernando Rivadeneira
Journal: Am J Hum Genet Date: 2018-01-04 Impact factor: 11.025

7. Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture.

Authors: Karol Estrada; Unnur Styrkarsdottir; Evangelos Evangelou; Yi-Hsiang Hsu; Emma L Duncan; Evangelia E Ntzani; Ling Oei; Omar M E Albagha; Najaf Amin; John P Kemp; Daniel L Koller; Guo Li; Ching-Ti Liu; Ryan L Minster; Alireza Moayyeri; Liesbeth Vandenput; Dana Willner; Su-Mei Xiao; Laura M Yerges-Armstrong; Hou-Feng Zheng; Nerea Alonso; Joel Eriksson; Candace M Kammerer; Stephen K Kaptoge; Paul J Leo; Gudmar Thorleifsson; Scott G Wilson; James F Wilson; Ville Aalto; Markku Alen; Aaron K Aragaki; Thor Aspelund; Jacqueline R Center; Zoe Dailiana; David J Duggan; Melissa Garcia; Natàlia Garcia-Giralt; Sylvie Giroux; Göran Hallmans; Lynne J Hocking; Lise Bjerre Husted; Karen A Jameson; Rita Khusainova; Ghi Su Kim; Charles Kooperberg; Theodora Koromila; Marcin Kruk; Marika Laaksonen; Andrea Z Lacroix; Seung Hun Lee; Ping C Leung; Joshua R Lewis; Laura Masi; Simona Mencej-Bedrac; Tuan V Nguyen; Xavier Nogues; Millan S Patel; Janez Prezelj; Lynda M Rose; Serena Scollen; Kristin Siggeirsdottir; Albert V Smith; Olle Svensson; Stella Trompet; Olivia Trummer; Natasja M van Schoor; Jean Woo; Kun Zhu; Susana Balcells; Maria Luisa Brandi; Brendan M Buckley; Sulin Cheng; Claus Christiansen; Cyrus Cooper; George Dedoussis; Ian Ford; Morten Frost; David Goltzman; Jesús González-Macías; Mika Kähönen; Magnus Karlsson; Elza Khusnutdinova; Jung-Min Koh; Panagoula Kollia; Bente Lomholt Langdahl; William D Leslie; Paul Lips; Östen Ljunggren; Roman S Lorenc; Janja Marc; Dan Mellström; Barbara Obermayer-Pietsch; José M Olmos; Ulrika Pettersson-Kymmer; David M Reid; José A Riancho; Paul M Ridker; François Rousseau; P Eline Slagboom; Nelson L S Tang; Roser Urreizti; Wim Van Hul; Jorma Viikari; María T Zarrabeitia; Yurii S Aulchenko; Martha Castano-Betancourt; Elin Grundberg; Lizbeth Herrera; Thorvaldur Ingvarsson; Hrefna Johannsdottir; Tony Kwan; Rui Li; Robert Luben; Carolina Medina-Gómez; Stefan Th Palsson; Sjur Reppe; Jerome I Rotter; Gunnar Sigurdsson; Joyce B J van Meurs; Dominique Verlaan; Frances M K Williams; Andrew R Wood; Yanhua Zhou; Kaare M Gautvik; Tomi Pastinen; Soumya Raychaudhuri; Jane A Cauley; Daniel I Chasman; Graeme R Clark; Steven R Cummings; Patrick Danoy; Elaine M Dennison; Richard Eastell; John A Eisman; Vilmundur Gudnason; Albert Hofman; Rebecca D Jackson; Graeme Jones; J Wouter Jukema; Kay-Tee Khaw; Terho Lehtimäki; Yongmei Liu; Mattias Lorentzon; Eugene McCloskey; Braxton D Mitchell; Kannabiran Nandakumar; Geoffrey C Nicholson; Ben A Oostra; Munro Peacock; Huibert A P Pols; Richard L Prince; Olli Raitakari; Ian R Reid; John Robbins; Philip N Sambrook; Pak Chung Sham; Alan R Shuldiner; Frances A Tylavsky; Cornelia M van Duijn; Nick J Wareham; L Adrienne Cupples; Michael J Econs; David M Evans; Tamara B Harris; Annie Wai Chee Kung; Bruce M Psaty; Jonathan Reeve; Timothy D Spector; Elizabeth A Streeten; M Carola Zillikens; Unnur Thorsteinsdottir; Claes Ohlsson; David Karasik; J Brent Richards; Matthew A Brown; Kari Stefansson; André G Uitterlinden; Stuart H Ralston; John P A Ioannidis; Douglas P Kiel; Fernando Rivadeneira
Journal: Nat Genet Date: 2012-04-15 Impact factor: 38.330

8. Cesarean Section on the Risk of Celiac Disease in the Offspring: The Teddy Study.

Authors: Sibylle Koletzko; Hye-Seung Lee; Andreas Beyerlein; Carin A Aronsson; Michael Hummel; Edwin Liu; Ville Simell; Kalle Kurppa; Åke Lernmark; William Hagopian; Marian Rewers; Jin-Xiong She; Olli Simell; Jorma Toppari; Anette-G Ziegler; Jeffrey Krischer; Daniel Agardh
Journal: J Pediatr Gastroenterol Nutr Date: 2018-03 Impact factor: 3.288

9. Mode of delivery is not associated with celiac disease.

Authors: Stine Dydensborg Sander; Anne Vinkel Hansen; Ketil Størdal; Anne-Marie Nybo Andersen; Joseph A Murray; Steffen Husby
Journal: Clin Epidemiol Date: 2018-03-19 Impact factor: 4.790

10. AKIRIN1: A Potential New Reference Gene in Human Natural Killer Cells and Granulocytes in Sepsis.

Authors: Anna Coulibaly; Sonia Y Velásquez; Carsten Sticht; Ana Sofia Figueiredo; Bianca S Himmelhan; Jutta Schulte; Timo Sturm; Franz-Simon Centner; Jochen J Schöttler; Manfred Thiel; Holger A Lindner
Journal: Int J Mol Sci Date: 2019-05-09 Impact factor: 5.923
