Literature DB >> 24937328

Exome sequencing identifies DLG1 as a novel gene for potential susceptibility to Crohn's disease in a Chinese family study.

Shufang Xu1, Feng Zhou2, Jinsheng Tao3, Lu Song1, Siew Chien Ng4, Xiaobing Wang1, Liping Chen2, Fengming Yi1, Zhihua Ran5, Rui Zhou6, Bing Xia2.   

Abstract

BACKGROUND: Genetic variants make some contributions to inflammatory bowel disease (IBD), including Crohn's disease (CD) and ulcerative colitis (UC). More than 100 susceptibility loci were identified in Western IBD studies, but susceptibility gene has not been found in Chinese IBD patients till now. Sequencing of individuals with an IBD family history is a powerful approach toward our understanding of the genetics and pathogenesis of IBD. The aim of this study, which focuses on a Han Chinese CD family, is to identify high-risk variants and potentially novel loci using whole exome sequencing technique.
METHODS: Exome sequence data from 4 individuals belonging to a same family were analyzed using bioinformatics methods to narrow down the variants associated with CD. The potential risk genes were further analyzed by genotyping and Sanger sequencing in family members, additional 401 healthy controls (HC), 278 sporadic CD patients, 123 UC cases, a pair of monozygotic CD twins and another Chinese CD family.
RESULTS: From the CD family in which the father and daughter were affected, we identified a novel single nucleotide variant (SNV) c.374T>C (p.I125T) in exon 4 of discs large homolog 1 (DLG1), a gene has been reported to play multiple roles in cell proliferation, T cell polarity and T cell receptor signaling. After genotyping among case and controls, a PLINK analysis showed the variant was of significance (P<0.05). 4 CD patients of the other Chinese family bore another non-synonymous variant c.833G>A (p.R278Q) in exon 9 of DLG1.
CONCLUSIONS: We have discovered novel genetic variants in the coding regions of DLG1 gene, the results support that DLG1 is a novel potential susceptibility gene for CD in Chinese patients.

Entities:  

Mesh:

Substances:

Year:  2014        PMID: 24937328      PMCID: PMC4061034          DOI: 10.1371/journal.pone.0099807

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Crohn’s disease (CD) and ulcerative colitis (UC) are classified as chronic, idiopathic inflammatory bowel diseases (IBD) [1], [2]. Familial aggregation, high concordance in twins and a higher prevalence of the disease in a certain ethnic population imply a strong genetic influence on the risk of disease development [3], [4]. Identifying the genetic loci or rare detrimental mutations in different populations or families with the disease will help elucidate the pathogenesis of these complex traits and facilitate the development of more targeted therapy. It is now widely recognized that common variants shown in GWAS can explain only relatively modest proportions of risk for diseases. Numerous functional and deleterious variants in the population are at frequencies of 0.5 to 5% that are too low to be detected by GWAS [5], [6]. As predisposing variants will present at a much higher frequency in the affected relatives of an index case, family studies may facilitate the detection of the ‘missing heritability’ not identified by GWAS [7]. Exome sequencing, which is a technique that focuses on the protein-coding portion of the genome, is not limited by the detailed and complete pedigree data that are necessary for classical linkage analysis and can be performed on only a few patients for the detection of causal mutations [8], [9]. Researchers have successfully identified a causal hemizygous mutation in the XIAP gene [10] and novel compound heterozygous mutations in interleukin-10 receptor 1 (IL-10R1) [11], using exome sequencing in children presenting with very early-onset and intractable IBD. The sequencing of eight pediatric IBD patients’ exomes revealed various profiles of specific variants with a limited number in each case [12]. Numerous candidate genes for Western IBD patients have been shown, but causality for specific variants in Chinese IBD patients is largely absent. In this study, we applied whole exome sequencing to 4 individuals belonging to a same family (Family A) to discover novel deleterious genetic variants associated with IBD and then validated these findings in other 10 family members of Family A, 401 healthy controls (HC), 278 subjects with sporadic CD, 123 subjects with UC, a pair of monozygotic twins and another Han Chinese CD family (Family B).

Materials and Methods

Patients and Controls

The familial patients included in this study were selected from the Hubei Clinical Center & Key Lab of Intestinal & Colorectal Diseases. Written informed consent was obtained from all subjects and the next of kin on behalf of the children enrolled in the study. This study was approved by the ethics committee of the Zhongnan Hospital of Wuhan University as part of the human subjects’ protocol to study the genetics of IBD in humans. The CD patients and HC were all unrelated subjects of Chinese descent and born to non-consanguineous parents. The ancestry of the patients and control individuals was assessed by self-report and appearance. Phenotypic data were acquired from a review of medical records, phone interviews and photographs. A combination of symptom assessment, laboratory and radiological examinations and endoscopy with histology was applied to make the diagnosis. For whole exome sequencing, we selected a Han Chinese family (Family A) including a daughter and a father both affected with CD from Hubei province. In this family, the father is the proband, the proband’s unaffected mother and wife were taken as exome sequencing controls. The father, who was diagnosed with CD at the age of 31 years in 1999 with terminal ileitis and proctitis, was treated with oral prednisolone and aminosalicylic acid (5-ASA). Small intestine computed tomography enterography (CTE) showed a thickened ileum wall in 2012. The affected daughter developed CD at the age of 16 years in 2012, with high fever, diarrhea, oral ulcers and an anal fistula. Endoscopy showed upper digestive tract ulcers, aphthous ulcers at the ileocecal junction and colitis involving the rectum, sigmoid colon, descending colon and transverse colon. A biopsy showed non-specific granulomatous inflammation and staining was negative for acid-fast bacilli. She was finally diagnosed with CD and was treated with an intravenous injection of corticosteroids, 5-ASA, immunosuppressants and infliximab for severe refractory disease. She is now in remission with azathioprine and 5-ASA (the supporting data are provided in Fig. 1).
Figure 1

Clinical characteristics of two patients in Chinese CD family A.

The index patient in the family, the father (Panels A to D) had evidence of mucous membrane granulation, polypoid proliferation and hyperemia in his colonoscopy, as shown in Panels A and B. Panel C shows the patient’s pathological findings of chronic intestinal inflammation. Panel D shows the thickening of the ileum wall by small intestine computed tomography enterography (CTE). The daughter in the family is another Patient (Panels E to H). Panel E shows her anal fistula at disease onset. Endoscopy showed intestinal poly-ulcers in Panel F. A biopsy showed non-specific granulomatous inflammation, as shown in Panel G, and the higher magnification of the pathology shown in Panel H reveals negative acid-fast staining granulomas. All of the images were collected in March 2012 in Zhongnan Hospital of Wuhan University.

Clinical characteristics of two patients in Chinese CD family A.

The index patient in the family, the father (Panels A to D) had evidence of mucous membrane granulation, polypoid proliferation and hyperemia in his colonoscopy, as shown in Panels A and B. Panel C shows the patient’s pathological findings of chronic intestinal inflammation. Panel D shows the thickening of the ileum wall by small intestine computed tomography enterography (CTE). The daughter in the family is another Patient (Panels E to H). Panel E shows her anal fistula at disease onset. Endoscopy showed intestinal poly-ulcers in Panel F. A biopsy showed non-specific granulomatous inflammation, as shown in Panel G, and the higher magnification of the pathology shown in Panel H reveals negative acid-fast staining granulomas. All of the images were collected in March 2012 in Zhongnan Hospital of Wuhan University. An additional 10 healthy members’ blood DNA samples from family A were also taken as Sanger sequencing controls to validate the co-segregation of the mutations in the CD family. In addition, 278 sporadic CD and 123 sporadic UC patients were enrolled from the Inflammatory Bowel Disease Center of Zhongnan Hospital of Wuhan University (131 CD, 76 UC), Renji Hospital of Shanghai Jiaotong University School of Medicine (40 CD) and the Institute of Digestive Disease of The Chinese University of Hong Kong (107 CD, 47 UC) from January 2001 to December 2012; 401 HC were from Wuhan, China (Table 1).
Table 1

Characteristics of 401 sporadic inflammatory bowel disease (IBD) patients and 401 healthy controls.

IndexCD 278UC 123HC
Male (n = )17672239
Female (n = )10251162
Average age (years)32.25±13.3835.40±10.6336.42±12.42
Moreover, we collected 25 young and intractable CD cases (Table 2), including a pair of monozygotic twins (Patient ID in Table 2: 24 and 25) and 23 cases selected from 131 sporadic CD patients of Hubei province. Another CD family (family B) was also from Wuhan city.
Table 2

Data of 25 young and intractable CD patients.

PatientIDAge atdiagnosis(year)SexDiseaselocationTreatment
Diagnosticanti-TBAminosalicylatesCorticosteroidsImmunosuppressiveagentesBiologicalagentesSurgery
125FemaleTerminal ileumNOYESNONONOYES
215MaleTerminal ileumNOYESYESNOYESNO
324MaleTerminal ileumNOYESNOYESNONO
434MaleTerminal ileum and ascendingcolonYESYESNOYESNOYES
516MaleTerminal ileumand sigmoid colonYESYESYESYESYESNO
614MaleTerminal ileumYESYESNOYESYESNO
719FemaleTerminal ileum and right sidedcolonNOYESYESYESYESYES
817MaleTerminal ileum and descendingcolonYESYESYESYESNONO
921MaleSmall intestineNOYESYESYESNONO
1020MaleTerminal ileum and right sidedcolonYESYESYESYESYESYES
1124FemaleTerminal ileumNOYESNONOYESNO
1221MaleTerminal ileumYESYESYESYESYESNO
1322FemaleTerminal ileumand sigmoid colonYESYESYESYESYESNO
1423MaleTerminal ileum and right sidedcolonNOYESYESYESYESNO
1523MaleSmall intestineNOYESYESYESYESNO
1617MaleTerminal ileum and right sidedcolonNOYESYESYESYESNO
1713FemaleTerminal ileum and ascendingcolonYESYESYESYESYESYES
1824FemaleTerminal ileumNOYESYESYESYESNO
1925MaleTerminal ileum and ascendingcolonNOYESYESYESYESYES
2011MaleTerminal ileum and right sidedcolonYESYESYESYESYESYES
2121FemaleTerminal ileum and ascendingcolonYESYESYESYESNONO
2214FemaleTerminal ileum and right sidedcolonNOYESYESYESNOYES
2311MaleTerminal ileum and right sidedcolonYESYESYESYESYESYES
2426MaleTerminal ileumNOYESYESYESNOYES
2529MaleTerminal ileumNOYESYESYESYESYES

TB: tuberculosis.

TB: tuberculosis.

DNA Extraction

Genomic DNA was extracted from EDTA-anticoagulated peripheral venous blood samples using a QIAamp DNA Blood Midi Kit (Qiagen, Germany) according to the manufacturer’s instructions.

Whole Exome Sequencing and Variant Detection

Using an E210 ultrasonicator (Covaris, MA, USA), the genomic DNA samples were randomly fragmented into 250–300 bp fragments and subjected to library preparation according to NimbleGen’s standard protocol. Target region enrichment was performed for the shotgun libraries using the NimbleGen SeqCap EZ custom design kit (NimbleGen, Madison, WI, USA), which consisted of SeqCap EZ Human Exome Library v2.0 and a continuous region covering the MHC genes. The enriched shotgun libraries were sequenced using the Hiseq2000 platform, and 90-bp paired-end reads were generated. Raw image data and base calling were processed by Illumina Pipeline software version 1.7 with the default parameters. Quality control for the reads was performed by discarding adaptor-containing reads and low-quality reads. For SNP calling, SOAP aligner [13] was used to align the reads to the human reference genome (hg19), and SOAP snp [14] was then used to assemble the consensus sequence and call SNPs. As another quality control, low-quality SNPs satisfying one of the four following criteria were discarded: (i) genotype quality<20; (ii) total reads covering the variant site<4; (iii) estimated copy number >2; (iv) distance from the nearest SNP<5 bp (except for SNPs present in dbSNP). For indel calling, high-quality reads were aligned to the human reference genome using BWA (version 0.5.9-r16) [15]. GATK Indel Realigner was used to realign reads around insertion/deletion sites, and then small indels were called using the IndelGenotyperV2 tool from GATK (version v1.0.4705) [16], [17]. Indels were called as heterozygous and homozygous if indel-supporting reads consisted of 30–70% and >70% of the total reads, respectively. SNP and indel detection was performed only for the targeted regions and flanking regions within 200 bp of the targeted regions.

Variant Annotation and Prioritization

The detected variants were annotated based on four databases, including NCBI CCDS, RefSeq, Ensembl and Encode (http://genome.ucsc.edu/ENCODE/). Exclusion steps were taken to help identify candidate mutations. Variants falling within intergenic, intronic and untranslated regions and synonymous substitutions were excluded; variants documented in 4 public genetic variant databases, including dbSNP132, 1000 Genomes, HapMap and YH (http://yh.genomics.org.cn/), with an allele frequency >0.5% (except for YH) were rejected; and variants shared by 2 exome sequenced cases and absent from 2 exome sequenced controls were kept. Additionally, we used the following criteria to evaluate and prioritize the candidate genes: (i) SIFT (http://sift.bii.a-star.edu.sg/), MutationTaster (http://www.mutationtaster.org/), PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/index.shtml) and PMut (http://mmb.pcb.ub.es/PMut/) were used to predict whether single amino acid changes in genes would alter the protein function; (ii) the conservation of candidate mutations was analyzed by evaluating the GERP score (http://snp.gs.washington.edu/SeattleSeqAnnotation137/); (iii) the candidate mutations’ total frequency of occurrence in the EVS (http://evs.gs.washington.edu/EVS/) and the BGI in-house database was analyzed and (iiii) the tissue distributions and functions were analyzed using the online tools BioGPS (http://biogps.org/), Entrez Gene (http://www.ncbi.nlm.nih.gov/gene) and Proteinatlas (http://www.proteinatlas.org/). As a final step, we compared and prioritized the remaining candidate genes using 4 internet tools: GeneDistiller2 (http://www.genedistiller.org/), SUSPECTS (http://www.genetics.med.ed.ac.uk/suspects/index.shtml), ToppGene (http://toppgene.cchmc.org/) and Endeavour (http://www.esat.kuleuven.be/endeavour). To prioritize the SNVs and genes, we chose 20 reported CD susceptibility genes (NOD2, ATG16L1, STAT3, IL23R, IL10R2, IL10R1, JAK2, ICOSLG, CDKAL1, MST1, PTGER4, IRGM, TNFSF15, ZNF365, NKX2-3, PTPN2, PTPN22, IL12B, XIAP and ITLN1) as the training set.

Validation Phase

All shared SNVs of the two affected individuals were verified for all members acquired from family A to detect co-segregation, by direct polymerase chain reaction (PCR) amplification followed by Sanger sequencing (PCR primers are listed in Table 3, Invitrogen). The sequencing reactions were conducted on an ABI 3730XL DNA Analyzer.
Table 3

The PCR primers of 22 candidate SNPs by Sanger Sequencing.

SNP_IDGeneF-primer sequences (5′-3′)R-primer sequences (5′-3′)
chr17_55183450 AKAP1 TCAGAGTCCTCGGGCATT CTGCTACATACTCTTCCTCC
chr20_30232655 COX4I2 ACAGTCCTTGGGGTCTAA CCACTGCTTCTTCTCATAG
chr9_110249480 KLF4 AGTCCCGCCGCTCCATTA TCTTTGGCTTGGGCTCCT
chr1_26368197 SLC30A2 ACTGCCTTATTCTGAACTGT GAAGCATAATCCTCACCC
chr7_73279329 WBSCR28 GAGAATCGCCCGAAACC CCAGGCACTGAGCAAGG
chr9_125582872 PDCL GATTCTTGTTGTGCCTCAG TTCCTGGTGAACTGACTGC
chr6_90418252 MDN1 AACCTCTTCCCCATCAT TCCAACACCCCACAACT
chr20_420894 TBC1D20 GACCTGACACCTGCCTTTC ACCCAGCATTTCCCAACT
chr9_2717768 KCNV2 CCACAGCCAGGAGGAAA CTCGTAGTCGTCGCACA
chr16_20492206 ACSM2A CAGGGCAGGGGATTTAG TTGCTGGATCGTATGGTAGTT
chr15_41275952 INO80 AGCCAAAGCAGCCTCAAC GGAATCAGGACCTTACCC
chr6_168366533 MLLT4 GAAGCAGGAGGCTGAGAA TTGAGGTAGGAGGCGTTT
chr3_196921405 DLG1 GGTAAGAAATGAGCAATCAATATTCAG GGGCGAACCTACATGAAAGAATA
chr1_1470881 TMEM240 GACGCCTCCGAGAACTACTTTG ACAGCTTGGGCAGCCAGGTC
chr16_16170185 ABCC1 AACCCGTGGCTGATGTC TGTCCAAGGCTGCTGTA
chr15_39885853 THBS1 TGGGTGCTGAGGATGTC TGGTGATGCTGGGAACT
chr10_116225553 ABLIM1 TTCCTTGGCAGTGTTTG GGAAATGTTTAGTCGTTGA
chr7_98602860 TRRAP TTTCCCGTGACAGTTCG CTCTTGGTGGTCTCCTTT
chr6_152536152 SYNE1 TTGGCTTTTCGCTATTC ACCTTGACTGCGGACTT
chr14_21860964 CHD8 GCCCAAGGTAACAAACAG CCAGGAGTCAATGAGGGA
chr7_99160120 ZNF655 TATGGGCTTTATTCCGTAG CGGAGAAGACGATGTGAA
chr7_87051466 ABCB4 ATCCAAGTGGGCGTTTT TGAATGTCTGCTGAGGG
Genotyping was conducted by the MassARRAY (MALDI-TOF MS) method using the SEQUENOM System (Sequenom, Inc.) to screen the candidate genes in an additional 401 HC individuals (278 sporadic CD patients and 123 UC cases), and the data were analyzed using TYPER 4.0 software. The primer sequences for genotyping were designed and synthesized using Primer 5.0 software (PCR primer sequences are listed in Table 4, and the primers were synthesized by Invitrogen). To further study the genes (DLG1 and PDCL) that we identified through the series of steps listed above, we applied PCR amplification followed by Sanger sequencing to examine all of the exons of DLG1 and PDCL in 25 young and intractable CD cases (the PCR primer sequences are listed in Table 5).
Table 4

The primers of the 22 candidate SNPs for the MassARRAY method.

SNP_IDGeneF-primer sequences (5′-3′)R-primer sequences (5′-3′)
chr17_55183450 AKAP1 ACGTTGGATGAGAGGGCAAGAGAGACAGGT ACGTTGGATGACAGAGCTTCTTCAAGCACC
chr20_30232655 COX4I2 ACGTTGGATGAGCGCATGCTGGACATGAAG ACGTTGGATGCTGCTTCTTCTCATAGTCCC
chr9_110249480 KLF4 ACGTTGGATGTCTTTGGCTTGGGCTCCTCT ACGTTGGATGATGATGCTCACCCCACCT
chr1_26368197 SLC30A2 ACGTTGGATGAACCTTGACCATCCTGAGAG ACGTTGGATGAAGAGCAAAAAGGGAGCCAC
chr7_73279329 WBSCR28 ACGTTGGATGTGATGGCTGACGGTTGTCTC ACGTTGGATGGGAGCAGGAAATTATAGAGG
chr9_125582872 PDCL ACGTTGGATGTGACTCTGAAGGAGTTTGCC ACGTTGGATGATTCGCTGCTTCCGGTACTG
chr6_90418252 MDN1 ACGTTGGATGTTTGATGGACTTTGACCCAC ACGTTGGATGTGCAGCTGATTCTAAAAGGG
chr20_420894 TBC1D20 ACGTTGGATGATGGGTGATGGTGAACCCAG ACGTTGGATGACCCACTGATGCCGATTTAC
chr9_2717768 KCNV2 ACGTTGGATGAGCCATGCTCAAACAGAGTG ACGTTGGATGCCTCATTCTCCGTCGTGTTC
chr16_20492206 ACSM2A ACGTTGGATGGGTAGAGAATGCACTGATGG ACGTTGGATGACGGGGTCTGGGCTGCTGAT
chr15_41275952 INO80 ACGTTGGATGACAACCAAACCAGTGCTGGG ACGTTGGATGGTCTCAGATACCGTGAATGG
chr6_168366533 MLLT4 ACGTTGGATGAGACAGCACGACGAGGCGG ACGTTGGATGTAGTCCCGGGGAAGCGGAG
chr3_196921405 DLG1 ACGTTGGATGGAACCAATTCTGGACCTATC ACGTTGGATGGGATGAAGATACACCTCCTC
chr1_1470881 TMEM240 ACGTTGGATGAGCCGCCTGACCGCCCCTGT ACGTTGGATGTGCACAGCTTGGGCAGCCAG
chr16_16170185 ABCC1 ACGTTGGATGTGTCCCTGACATGTCTCTGT ACGTTGGATGTGAATGTGGCATTCCTCACG
chr15_39885853 THBS1 ACGTTGGATGTGGCGAGCACCTGCGGAAC ACGTTGGATGTCCAGGGCTTTGCTTCTTAC
chr10_116225553 ABLIM1 ACGTTGGATGTGTACACAGGGGAGTTGATG ACGTTGGATGAGGATGTTCGGGATCGGATG
chr7_98602860 TRRAP ACGTTGGATGTGCAACACACGCTCCTCTC ACGTTGGATGTGGCAAGATCTACCCATACC
chr6_152536152 SYNE1 ACGTTGGATGCTTCCTTCTAGGGACAGATG ACGTTGGATGGGTAACCTATATCCAAGCTC
chr14_21860964 CHD8 ACGTTGGATGCACAGCTAGTACTCAGACTC ACGTTGGATGCGAGGTCAATACGGTTTATC
chr7_99160120 ZNF655 ACGTTGGATGGATAAACCGAATAATAAGG ACGTTGGATGACCTCTACAGAGAAGTGATG
chr7_87051466 ABCB4 ACGTTGGATGGCAGAAGTGCAACATATTCTC ACGTTGGATGACCTACCTGAAGGAAGAAAG
Table 5

The PCR primers of all exons of DLG1 and PDCL.

Primer IDF-primer sequences (5′-3′)R-primer sequences (5′-3′)Annealing temperature (°C)
DLG1
Exon 1 CCGACTTCTGTCTGTTCTT GGACCGTGCTGTCTCAT 54
Exon 2 CTCCTCCGTTTTCTAATG GTTACCGAATGCCTCAG 51.5
Exon 3 GTTAAGTAGTTTGCCTGAACTTGTAGC CAGATGAAGCCTTGTTGAGGTCT 62
Exon 4 GGTAAGAAATGAGCAATCAATATTCAG GGGCGAACCTACATGAAAGAATA 55.5
Exon 5 TTTATCTTTATGGCACAGC AAATGGCAAATCCTGACT 51.5
Exon 6 TTCTGTTTGGTGCTGGAG GGTCTTCGCATTTGTATC 55.5
Exon 7 CAGAGAAGGATCGGAGGTTGA GTAAATGGAAACTCTTGGGACTATC 58.8
Exon 8 CCTCCAGAACAAGTCCA GTATTTATCCCTTATCCAGTC 51.5
Exon 9 TGTTCCTTTTGCTGGCCCTT ATGACTGCACCACTGGACTC 63
Exon 10 TTCGTAACTCTAGGAGCAGCTGT CTGTGCATACAAGCCCTCAAC 61
Exon 11 AGACTGGGAGAATAGGAGG TCACTAATGGCATCACAAC 55.5
Exon 12 TTGAGACTAACCTGGGCAACAT AAGGACAATTTACCAAGCCTCAA C 61
Exon 13 CTTCTAAGTAGGGGCAGTG AATAGGTCCAGTGAAAATAAC 54
Exon 14 CAGTAGGCGTGAGAATGTGGC GCCTGGGCAGTAAGAGTGGA 63
Exon 15 GATTACTGCTGTCTGATGC GCCTCCTTTGCTACTATG 55.5
Exon 16 TCAATATAACTTACCATTGGATTACAATC AGTACTATTACCTGTAGTTGCCATGCT 57
Exon 17 TTAAACTCAGAAATGGTGCCTCA GGTCTGTGAAATGGGTGCTTG 61
Exon 18 AGGTATAAATGAACTATGCTGTCTGAA CCTTGAAGACAATTAGCAACCTG 58.8
Exon 19 AGTTTGTCCCCTTTGCC TCAGAATCCCTCCACCC 55.5
Exon 20 AAATAAAGGAGTAGCACATAGC GAAAGAAGTGGGATAAACAG 54
Exon 21 CATCTTTGGTTGATGGTAGAGTGAG AGAAAGGACAATAATATGGAGGATG 58.8
Exon 22 ATCCATCCTCCATATTATTGTCCT ACCCGGCCCTTATCTCCT 57
Exon 23 TTTCATTTCCTATCTAAAGTTTGCTG ATGGTTCTGCCTCACATTCTGT 57
Exon 24 TGTGTCATCTCTCCTTTGCCA GAGCCGAGTCATACCATTGC 62
Exon 25 CTATGGGATTGTACCCAGTTTCC GGTCAGGCCATTCCATCTTC 57
PDCL
Exon 1 TGTCCTGGAAATTGTAGGATCTCA GACTAGGTTACCTCTGAAAGTGGGA 60.5
Exon 2 ATGTTGGGCATTAGCTTGGC TTTGACAGGGCTCTATGATTTCTC 60.5
Exon 3-1 TCAAGTGATCCGCTCGTCT AGCTTCAAGGTCCACAGCA 60.5
Exon 3-2 GCCAGCAGTCAGTTCACCAG TTTGACAGGGCTCTATGATTTCTC 60.5
SPSS17.0 statistical software was used for statistical analysis, the measurement data were expressed as means +/− standard deviation (SD). PLINK was performed on analysis of genotype data. P values<0.05 were considered as significant.

Results

Whole Exome Sequencing of the CD Family

Whole exome sequencing was performed on DNA extracted from the peripheral blood of 4 members of Family A using next-generation sequencing technology. As shown in Table 6, we obtained at least 88.5 million reads that mapped to the target region for each exome, more than 98.5% of the target region was covered and the mean depth of the target region was 128.64×, 148.90×, 202.26× and 158.25×. The summary statistics of the total quality-passing SNPs and indels are all listed in Table 6.
Table 6

Summary of original exome sequencing data of four familial individuals.

Exome Capture StatisticsDaughterFatherGrandmotherMother
Target region (bp)48959543490622234895954348959543
Raw reads243896508204592452253503938192147514
Raw data yield (Mb)21951184132281517293
Reads mapped to genome204193470145810600202993882156292035
Reads mapped to target region(2) 88581652101995817138982919108872112
Data mapped to target region (Mb)6298.257305.569902.467748.06
Mean depth of target region (X)128.64148.90202.26158.25
Coverage of target region (%)98.7798.5698.8198.75
Average read length (bp)89.8789.8489.7889.85
Total quality-passing SNPs116950114204119780117371
Total quality-passing indels7442736177737500

Bioinformatic Analysis Identifies 22 Candidate Genes

In total, 82 variants shared by the 2 cases remained through the exclusion of 4 public genetic databases (the procedures are shown in Table 7), and no reported IBD single nucleotide variant was found. After performing filtering steps for gene function and mutation prediction, we obtained 22 candidate genes (Table 8). Using 4 internet tools, we acquired the top 6 genes from the 22 candidates: THBS1, KLF4, SYNE1, CHD8, PDCL and DLG1. These genes were the most likely to be the genetic cause of the 2 affected patients.
Table 7

Filtration of SNPs/Indels.

Individual IDGrandmotherMotherDaughterFather
Total SNPs and indels173991+7773172848+7500172005+7442173055+7361
Quality-passing SNPs and indels119780+7773117371+7500116950+7442114204+7361
Protein-disrupting SNPs and indels (PDSI)14553+136914345+131414581+128014547+1294
PDSI after filtering against dbSNP2144+3822138+3662108+3482129+359
PDSI after filtering against dbSNP+1000 Genomes1459+2201498+2211469+1891448+208
PDSI after filtering against dbSNP+1000 Genomes+HapMap1457+2201497+2211467+1891446+208
PDSI after filtering against dbSNP+1000 Genomes+HapMap+YH1420+2201460+2211438+1891413+208
PDSI after filtering against dbSNP+1000 Genomes+HapMap+YH+inhouse dataand fitting a dominant model (shared by two cases) 0082+0
Filtered candidate genes 22
Sanger sequence for validation 22
Table 8

List of 22 candidate genes and mutations prediction.

NOChromosomePositionReferenceGene nameCodonsSIFT PredictionMutationTaster PredictionConsScore GERP
1chr690418252C MDN1 GAC7861CACDAMAGINGpolymorphism2.23
2chr1421860964C CHD8 CGT5636CATDAMAGINGdisease causing5.34
3chr9125582872T PDCL GAT398GGTDAMAGINGdisease causing5.47
4chr1755183450G AKAP1 GTG625ATGDAMAGINGpolymorphism4.22
5chr1539885853G THBS1 GGA3251GAADAMAGINGdisease causing5.78
6chr1616170185G ABCC1 GGG1915TGGDAMAGINGdisease causing4.11
7chr11470881G TMEM240 TCG380TTGDAMAGINGdisease causing3.37
8chr773279329C WBSCR28 CAG79AAGDAMAGINGpolymorphism4.43
9chr92717768C KCNV2 TCC29TGCDAMAGINGdisease causing4.45
10chr799160120A ZNF655 3.92
11chr1541275952G INO80 2.69
12chr10116225553G ABLIM1 CGG1345TGGDAMAGINGdisease causing3.37
13chr126368197T SLC30A2 ATG685GTGDAMAGINGdisease causing5.6
14chr2030232655T COX4I2 GTG464GCGDAMAGINGdisease causing4.38
15chr6167570520G GPR31 ACG800ATGDAMAGINGpolymorphism2.65
16chr6168366533G MLLT4 GGA4993AGADAMAGINGdisease causing5.03
27chr1620492206C ACSM2A ACG1472ATGDAMAGINGpolymorphism3.26
18chr20420894C TBC1D20 GTG766ATGDAMAGINGdisease causing5.65
19chr9110249480G KLF4 3.45
20chr3196921405A DLG1 ATC374ACCTOLERATEDdisease causing5.17
21chr6152536152C SYNE1 CGT22022CATTOLERATEDdisease causing5.07
22chr787051466T ABCB ATT2287GTTTOLERATEDpolymorphism4.85

Sanger Sequencing and Genotyping Combined with Bioinformatic Analyses Identifies DLG1 as a Potential Susceptibility Gene

Sanger sequencing confirmed the presence of the 22 mutations in the affected father and daughter. 10 healthy members of family A were sequenced to test for these variants. We found that one family member carried the variant in the KLF4 gene. The other 21 mutations absent in healthy family members showed co-segregation. The genotyping of the 22 SNVs indicated that 8 variants in THBS1, SYNE1, CHD8, TMEM240, AKAP1, COX4I2, ZNF655 and KCNV2 were positive in 401 HC, whereas the other 14 variants were negative. We again focused on the 6 top candidate genes (THBS1, KLF4, SYNE1, CHD8, PDCL and DLG1) identified through the prioritization analysis. In contrast to THBS1, KLF4, SYNE1 and CHD8, none of the 401 HC was found to carry PDCL or DLG1 mutations. Subsequent genotyping of 22 SNVs in 401 sporadic IBD cases indicated that one female CD patient aged 21 years carried a mutation in DLG1 (Table 9), and no patients had variation in PDCL. A PLINK analysis showed the variant in DLG1was of significance (P<0.05).
Table 9

Distributions of rare variants in the DLG1 gene.

Patient IDGenderAge (years)NucleotidechangeAmino acidchangeChromosomePositionExonSequencing method
CD Family A
Father(diagnosed)Male44c.374T>Cp.I125Tchr3_1969214054Exome
Daughter(diagnosed)Female16c.374T>Cp.I125Tchr3_1969214054Exome
Mother(unaffected)Female42Exome
Grandma(unaffected)Female81Exome
CD Family B
Case 1(diagnosed)Male39c.833G>Ap.R278Qchr3_1968652429Direct PCR sequencing
Case 2(diagnosed)Male42c.833G>Ap.R278Qchr3_1968652429Direct PCR sequencing
Case 3(diagnosed)Female32c.833G>Ap.R278Qchr3_1968652429Direct PCR sequencing
Case 4(diagnosed)Female24c.833G>Ap.R278Qchr3_1968652429Direct PCR sequencing
CJ 5(undiagnosed)Female56c.833G>Ap.R278Qchr3_1968652429Direct PCR sequencing
CJ 6(undiagnosed)Female6c.833G>Ap.R278Qchr3_1968652429Direct PCR sequencing
CJ 7(undiagnosed)Female6c.833G>Ap.R278Qchr3_1968652429Direct PCR sequencing
CJ 8(undiagnosed))Male7c.833G>Ap.R278Qchr3_1968652429Direct PCR sequencing
By examining all of the exons of PDCL and DLG1 in 25 young and intractable CD patients, we found two cases (Table 2, Patient ID are 3 and 4) who carried another variant in DLG1 (Figure 2, exon 9, c.833G>A, p.R278Q). We traced Patient 3, 4 and their families, and found that two cousin sisters (Cases CJ2 and CJ3) and one brother (Case CJ4) of Patient 4 who were unexpectedly found to have ulcers in the terminal ileum by endoscopy, and a biopsy showed non-specific chronic inflammation. After being treated with 5-ASA and azathioprine, four affected cases in this family have almost achieved their colonic mucosal healing. Cases CJ2, CJ3 and CJ4 were all found to be carriers of mutation R278Q (c.833G>A) by Sanger sequencing, and the family was called family B. We found 4 unaffected carriers (CJ5, CJ6, CJ7 and CJ8) of this variant after sequencing the other 15 members of family B, and these individuals will be followed up. CJ5 received a diagnosis of rheumatic heart disease with arthritis. The variants and carriers of DLG1 are listed in Table 9. Neither of the monozygotic CD twins carried any mutation in all 3 exons of PDCL or in all 25 exons of DLG1.
Figure 2

Chromatogram of DLG1 gene mutations.

The Sanger sequence traces from normal human controls are shown in panel A and B; the mutations were heterozygous at the corresponding locus (orange arrows indicating) in panel C and D.

Chromatogram of DLG1 gene mutations.

The Sanger sequence traces from normal human controls are shown in panel A and B; the mutations were heterozygous at the corresponding locus (orange arrows indicating) in panel C and D. Bioinformatics analyses were used to dissect the two non-synonymous mutations of DLG1 found in the study described above. MutationTaster showed that the variant in DLG1 (Figure 2, c.374T>C, p.I125T) was likely to be disease-causing. We compared the SNV sequence of species at different evolutionary distances by GERP and found that the amino acid substitution of DLG1 was highly conserved. Regarding another variant of DLG1 (Figure 2, exon 9, c.833G>A, p.R278Q), the PMut analysis of the mutation indicated that it is pathological (http://mmb.pcb.ub.es/PMut/), and the prediction from PolyPhen-2 was that the mutation was most likely damaging; however, the MutationTaster analysis indicated polymorphism, and SIFT predicted the mutation to be tolerated.

Discussion

Rare and low-frequency variants might have substantial effect sizes in complex disorders such as IBD [18]. A main goal of human genetic studies is to identify uncommon variants that play important roles in pathogenesis and reveal the familial transmission of diseases [6], [8]. Furthermore, uncommon alleles shared by affected individuals in a family are more prone to familial clustering of disease than common alleles carried in a population. In this study, we applied whole exome sequencing to anatomize the genetic background of a Chinese family with CD and successfully identified genetic variants in the coding regions of the DLG1 gene that may be associated with increased risk of CD. We first identified a novel SNV c.374T>C (p.I125T) in exon 4 of DLG1 through whole exome sequencing and bioinformatic analysis. In subsequent validation studies, we also identified 4 CD patients of another Han Chinese family harbored the variant c.833G>A. Altogether these data suggest that DLG1 is a susceptible gene for CD. DLG1 encodes a multi-domain scaffolding protein, which may have a role in septate junction formation, signal transduction, cell proliferation, synaptogenesis and lymphocyte activation (http://www.ncbi.nlm.nih.gov/gene/). The DLG1 protein is composed of an N-terminal L27b oligomerization domain, a proline-rich domain (PRD), three PDZ (PSD-95, Dlg and ZO-1) domains, an SH3 (Src Homology 3) domain and a catalytically inactive GUK (GUanylate Kinase) domain. During antigen recognition, these modular domains allow DLG1 to co-localize with synaptic actin, translocate into sphingolipid-rich microdomains within the IS and associate with Lck, ZAP-70, Vav, WASp Ezrin and p38 [19]. DLG1 has been shown to play roles in T cell polarity and T cell receptor signal specificity [20], [21], and be involved in the generation of memory T cells [22]. The loss of DLG1 leads to increased invasion in response to pro-tumorigenic cytokines, such as IL-6 and TNF-α [23], [24]. In accord with the suggested autoimmune nature of CD, strong evidence has implicated T cells and T-cell migration to the gut in initiating and perpetuating the intestinal inflammatory process and tissue destruction [25], [26]. Anti-cytokine agents are therefore likely to be useful in the treatment of IBD [27], [28]. After intravenous injection with six cycles of infliximab, the affected daughter in Family A has almost achieved mucosal healing of her colonic disease and was likely to have a better prognosis than those DLG1 mutation carriers who did not accept infliximab treatments in our study. It was corroborative evidence that DLG1 was causative for the CD patients of the two Chinese families. Complex human disease is a large collection of individually rare, even private variants [29]. A single locus can harbor both common variants of weak effect and rare variants of strong effect [30]. The results of our study of two CD families indicated genetic heterogeneity and susceptibility. We analyzed family A using an autosomal dominant model, and several factors were important to the success of this study. First, according to the database at our center [31], although the incidence of CD and UC is still low, the number of cases and severity of disease are increasing in China [32], [33], which provides the appropriate conditions to recruit patients for the subsequent validations. Second, a stepwise approach was taken to help narrow down the list of genetic variants responsible for this disease. For the genetic susceptibility of CD, despite the success of GWAS in identifying significantly associated loci [34], the currently identified variants are estimated to account for less than a quarter of the predicted heritability [35]. Uncommon alleles may be maintained at a lower frequency in the population through negative selection, and it is not possible to create a complete catalog in the general population [36]. Therefore, rare causal variants are not likely to be found in public SNP databases and control exomes [37]. We did not find mutations in any reported susceptibility genes that were shared by the affected father and daughter, which suggested that other variants may be associated with CD in these 2 individuals. To predict the impact of nonsynonymous variants, we applied 4 popular methods (PolyPhen2, SIFT, MutationTaster and PMut) [38], [39]. However, none of these methods was perfectly sensitive or specific. Regarding the mutation c.374T>C, SIFT and MutationTaster predicted it to be tolerated and disease causing, respectively. Different prediction algorithms used different information, and each had its own relative merits. It is thought to be better to use predictions from multiple algorithms rather than relying on a single one [40], [41]. We also used several different bioinformatic methods to filter and prioritize the SNVs and genes to increase the robustness of the analysis results. Finally, to confirm the results and identify the susceptibility gene, we used genotyping and Sanger sequencing methods for validation. Traditional Sanger sequencing is the gold standard for mutation detection [9]. We were able to narrow the scope to only a few genes through these steps. By scanning all exons of DLG1 and PDCL, a nonsynonymous variant c.833G>A of DLG1 was found in family B, thus confirming that DLG1 is a gene whose mutation is associated with high risk. Some limitations must be addressed. First, IBD patients with family history are rare among Han Chinese. In this family study, there were only two affected members, so the size of the pedigree was small. Second, the patients studied did not have an onset as early as those were previous reported in Caucasian population [42], [43]. Third, because of genetic heterogeneity, the variants appear to be present only in a subset of CD patients, and were not carried by the pair of monozygotic twins studied. Furthermore, in complex diseases, a central problem is that each variant only makes a small contribution to the disorder [44]. Other candidate genes discovered by us, such as THBS1, KLF4, SYNE1, CHD8 and PDCL, may also contribute to CD. However, variation in these genes must be identified in more cases and controls. Additionally, considering that the variant was also present in the unaffected individuals of family B, other disease-causing factors lying outside of our set of candidate genes may also exist [45]. Finally, functional analyses are needed to elucidate the biological role of this gene in CD susceptibility. In conclusion, we report the discovery of coding region variants in DLG1 in human CD through whole exome sequencing and bioinformatic analysis and identify DLG1 as a potential susceptibility gene for CD in the Chinese population. Our study also demonstrates that whole exome sequencing is an efficient and cost-effective genetic strategy. Bioinformatic approaches are likely to become useful tools for the discovery of genes and to provide important guidance for finding rare variants in a complex disorder. Finding different, rare and pathogenic mutations in the same gene in unrelated individuals with the same phenotype provides important support for our study. However, confirmation of DLG1’s involvement in CD pathogenesis still requires validation in further functional experiments and clinical trials. Personalized medicine is also anticipated to be developed based on definite biological processes and molecular causes.
  45 in total

1.  The second European evidence-based Consensus on the diagnosis and management of Crohn's disease: Definitions and diagnosis.

Authors:  Gert Van Assche; Axel Dignass; Julian Panes; Laurent Beaugerie; John Karagiannis; Mathieu Allez; Thomas Ochsenkühn; Tim Orchard; Gerhard Rogler; Edouard Louis; Limas Kupcinskas; Gerassimos Mantzaris; Simon Travis; Eduard Stange
Journal:  J Crohns Colitis       Date:  2010-01-15       Impact factor: 9.071

2.  SNP detection for massively parallel whole-genome resequencing.

Authors:  Ruiqiang Li; Yingrui Li; Xiaodong Fang; Huanming Yang; Jian Wang; Karsten Kristiansen; Jun Wang
Journal:  Genome Res       Date:  2009-05-06       Impact factor: 9.043

3.  SOAP2: an improved ultrafast tool for short read alignment.

Authors:  Ruiqiang Li; Chang Yu; Yingrui Li; Tak-Wah Lam; Siu-Ming Yiu; Karsten Kristiansen; Jun Wang
Journal:  Bioinformatics       Date:  2009-06-03       Impact factor: 6.937

4.  The diploid genome sequence of an Asian individual.

Authors:  Jun Wang; Wei Wang; Ruiqiang Li; Yingrui Li; Geng Tian; Laurie Goodman; Wei Fan; Junqing Zhang; Jun Li; Juanbin Zhang; Yiran Guo; Binxiao Feng; Heng Li; Yao Lu; Xiaodong Fang; Huiqing Liang; Zhenglin Du; Dong Li; Yiqing Zhao; Yujie Hu; Zhenzhen Yang; Hancheng Zheng; Ines Hellmann; Michael Inouye; John Pool; Xin Yi; Jing Zhao; Jinjie Duan; Yan Zhou; Junjie Qin; Lijia Ma; Guoqing Li; Zhentao Yang; Guojie Zhang; Bin Yang; Chang Yu; Fang Liang; Wenjie Li; Shaochuan Li; Dawei Li; Peixiang Ni; Jue Ruan; Qibin Li; Hongmei Zhu; Dongyuan Liu; Zhike Lu; Ning Li; Guangwu Guo; Jianguo Zhang; Jia Ye; Lin Fang; Qin Hao; Quan Chen; Yu Liang; Yeyang Su; A San; Cuo Ping; Shuang Yang; Fang Chen; Li Li; Ke Zhou; Hongkun Zheng; Yuanyuan Ren; Ling Yang; Yang Gao; Guohua Yang; Zhuo Li; Xiaoli Feng; Karsten Kristiansen; Gane Ka-Shu Wong; Rasmus Nielsen; Richard Durbin; Lars Bolund; Xiuqing Zhang; Songgang Li; Huanming Yang; Jian Wang
Journal:  Nature       Date:  2008-11-06       Impact factor: 49.962

Review 5.  Genetic mapping in human disease.

Authors:  David Altshuler; Mark J Daly; Eric S Lander
Journal:  Science       Date:  2008-11-07       Impact factor: 47.728

Review 6.  Finding the missing heritability of complex diseases.

Authors:  Teri A Manolio; Francis S Collins; Nancy J Cox; David B Goldstein; Lucia A Hindorff; David J Hunter; Mark I McCarthy; Erin M Ramos; Lon R Cardon; Aravinda Chakravarti; Judy H Cho; Alan E Guttmacher; Augustine Kong; Leonid Kruglyak; Elaine Mardis; Charles N Rotimi; Montgomery Slatkin; David Valle; Alice S Whittemore; Michael Boehnke; Andrew G Clark; Evan E Eichler; Greg Gibson; Jonathan L Haines; Trudy F C Mackay; Steven A McCarroll; Peter M Visscher
Journal:  Nature       Date:  2009-10-08       Impact factor: 49.962

7.  DLG1/SAP97 modulates transforming growth factor alpha bioavailability.

Authors:  Anne-Laure Surena; Giselle P de Faria; Jeanne-Marie Studler; Franck Peiretti; Morgane Pidoux; Jacques Camonis; Hervé Chneiweiss; Etienne Formstecher; Marie-Pierre Junier
Journal:  Biochim Biophys Acta       Date:  2008-09-27

8.  Familial Crohn's disease in single or multiple first-degree relatives.

Authors:  Hugh J Freeman
Journal:  J Clin Gastroenterol       Date:  2002-07       Impact factor: 3.062

Review 9.  Decanalization and the origin of complex disease.

Authors:  Greg Gibson
Journal:  Nat Rev Genet       Date:  2009-01-02       Impact factor: 53.242

10.  Discs large (Dlg1) complexes in lymphocyte activation.

Authors:  Ramnik Xavier; Shahrooz Rabizadeh; Kazuhiro Ishiguro; Niko Andre; J Bernabe Ortiz; Heather Wachtel; David G Morris; Marco Lopez-Ilasaca; Albert C Shaw; Wojciech Swat; Brian Seed
Journal:  J Cell Biol       Date:  2004-07-19       Impact factor: 10.539

View more
  8 in total

Review 1.  Cytoskeletal Organization and Cell Polarity in the Pathogenesis of Crohn's Disease.

Authors:  Chengcen Guo; Jun Shen
Journal:  Clin Rev Allergy Immunol       Date:  2021-04       Impact factor: 8.667

Review 2.  Genetics of inflammatory bowel disease from multifactorial to monogenic forms.

Authors:  Anna Monica Bianco; Martina Girardelli; Alberto Tommasini
Journal:  World J Gastroenterol       Date:  2015-11-21       Impact factor: 5.742

Review 3.  Regulation of spermatid polarity by the actin- and microtubule (MT)-based cytoskeletons.

Authors:  Linxi Li; Baiping Mao; Siwen Wu; Qingquan Lian; Ren-Shan Ge; Bruno Silvestrini; C Yan Cheng
Journal:  Semin Cell Dev Biol       Date:  2018-07-12       Impact factor: 7.727

Review 4.  The Same Intestinal Inflammatory Disease despite Different Genetic Risk Factors in the East and West?

Authors:  Rashid N S Lui; Siew C Ng
Journal:  Inflamm Intest Dis       Date:  2016-07-15

5.  Gene Co-Expression Network Analysis for Identifying Modules and Functionally Enriched Pathways in Type 1 Diabetes.

Authors:  Ignacio Riquelme Medina; Zelmina Lubovac-Pilav
Journal:  PLoS One       Date:  2016-06-03       Impact factor: 3.240

6.  Polymorphisms of the TNF Gene and Three Susceptibility Loci Are Associated with Crohn's Disease and Perianal Fistula Crohn's Disease: A Study among the Han Population from South China.

Authors:  Min Zhang; Xiaoyan Wang; Xiaodong Jiang; Xiangling Yang; Chuangyu Wen; Min Zhi; Xiang Gao; Pinjin Hu; Huanliang Liu
Journal:  Med Sci Monit       Date:  2019-12-17

7.  Investigation of inflammatory bowel disease risk factors in 4 families in central China.

Authors:  Shufang Xu; Hui Zou; Heng Zhang; Siying Zhu; Rui Zhou; Jin Li
Journal:  Exp Ther Med       Date:  2017-11-28       Impact factor: 2.447

Review 8.  Genetic Studies of Inflammatory Bowel Disease-Focusing on Asian Patients.

Authors:  Sung Chul Park; Yoon Tae Jeen
Journal:  Cells       Date:  2019-05-01       Impact factor: 6.600

  8 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.