Zhen Zhang1, Qi Li1, Mei Diao1, Na Liu2, Wei Cheng3,4, Ping Xiao5, Jizhen Zou5, Lin Su6, Kaihui Yu7, Jian Wu2, Long Li1, Qian Jiang8.
Abstract
Hirschsprung disease (HSCR) is a common cause of functional colonic obstruction in children. The currently available genetic testing is often inadequate because it focuses mainly on RET and several other genes, which account for only 15-20% of cases. To identify novel, potentially pathogenic variants, we assembled a panel of genes from a whole-exome sequencing study, published mouse aganglionosis phenotypes, enteric nervous system development, and a literature review. The coding exons of 172 genes were analyzed in 83 sporadic patients using next-generation sequencing. Rare stop-gain variants, splice-site variants, frameshift and in-frame insertions/deletions, and non-synonymous variants (conserved and predicted to be deleterious) were prioritized as the variants most likely to have an effect on HSCR and were subjected to burden analysis. The GeneMANIA interaction database was used to identify protein-protein interaction-based networks. In addition, 6 genes (PTPN13, PHKB, AGL, ZFHX3, LAMA1, and AP3B2) were prioritized for follow-up studies: their temporal and spatial expression patterns in mouse and human colon showed that they are good candidates for predicting pathogenicity. The results of this study broaden the mutational spectrum of HSCR candidate genes and provide insight into the relative contributions of individual genes to this highly heterogeneous disorder.
Year: 2017 PMID: 29093530 PMCID: PMC5666020 DOI: 10.1038/s41598-017-14835-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Clinical categorization of 83 HSCR patients.

| | | | | Total |
|---|---|---|---|---|
| Gender | | | | |
| Male | 12 | 14 | 25 | 51 |
| Female | 20 | 5 | 7 | 32 |
| Age at surgery (months) | | | | |
| 0–6 | 16 | 8 | 18 | 42 |
| 6–12 | 4 | 5 | 9 | 18 |
| 12–36 | 7 | 3 | 4 | 14 |
| >36 | 5 | 3 | 1 | 9 |
Figure 1. Schematic of the analytical workflow. A two-step variant filtering and prioritization approach was used to select candidate disease-causing variants. In the first step (on the left), all variants present in the in-house control cohort (MutInNormal) and all synonymous variants were excluded, while in the alternative step (on the right), all variants that had ever been reported were retained, regardless of their effects. Variants remaining after either step were then filtered to remove high-frequency variants (alternative allele frequency ≥0.05 in any of the 1000 Genomes Project, the NHLBI Exome Sequencing Project, the Exome Aggregation Consortium data, or an internal exome database of ~500 individuals) and low-quality calls or potential false positives (MutRatio <25% and MutCount <5), before being merged into a non-redundant, integrated list of 2756 variants. Two categories of variants were then extracted with the following definitions: LGDstrict refers to stop-gain variants, canonical splice-site (±2) variants, and frameshift InDels. LGDbroad refers to stop-gain variants, splice region (±5) variants, frameshift InDels, in-frame InDels, and non-synonymous (missense) variants predicted to be damaging by at least three bioinformatics tools.
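The frequency, quality, and category rules in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the `Variant` record, its field names, and the consequence labels are hypothetical, and reading MutRatio and MutCount as the fraction and number of reads supporting the alternative allele is an assumption. The thresholds (0.05, 25%, 5, ±2, ±5, three tools) are taken from the caption.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names below are hypothetical, chosen for illustration only.
    consequence: str          # "stop_gain", "frameshift", "inframe", "splice", "missense", ...
    splice_offset: int = 0    # distance (bp) from the exon boundary, used for "splice" only
    max_pop_af: float = 0.0   # highest alternative allele frequency across reference databases
    mut_ratio: float = 1.0    # assumed: fraction of reads supporting the alternative allele
    mut_count: int = 100      # assumed: number of reads supporting the alternative allele
    damaging_calls: int = 0   # number of prediction tools calling the variant damaging


def passes_filters(v: Variant) -> bool:
    """Drop common variants and likely false positives, as described in the caption."""
    if v.max_pop_af >= 0.05:
        return False
    # The caption joins the two quality criteria with "and"; that is followed literally here.
    if v.mut_ratio < 0.25 and v.mut_count < 5:
        return False
    return True


def is_lgd_strict(v: Variant) -> bool:
    """Stop-gain, canonical splice-site (within 2 bp), or frameshift InDel."""
    if v.consequence in {"stop_gain", "frameshift"}:
        return True
    return v.consequence == "splice" and abs(v.splice_offset) <= 2


def is_lgd_broad(v: Variant) -> bool:
    """LGDstrict classes plus splice region (within 5 bp), in-frame InDels,
    and missense variants called damaging by at least three tools."""
    if v.consequence in {"stop_gain", "frameshift", "inframe"}:
        return True
    if v.consequence == "splice":
        return abs(v.splice_offset) <= 5
    return v.consequence == "missense" and v.damaging_calls >= 3
```

Under these definitions every LGDstrict variant is also LGDbroad, which matches the nesting of the two categories in the caption.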
List of 19 LGDstrict variants identified among 83 Chinese HSCR patients and validated by Sanger sequencing.

| Patient ID | Gene | Position | rs ID | Nucleotide change (a) | Amino acid change | Prediction (b) | GERP (c) | 1000G | ESP6500 | ExAC (d) | Inhouse | Gender | Segment (e) | Inheritance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HSCR0058 | TLX2 | Chr2: 74742218 | — | c.293insG | p.A98Gfs*268 | — | — | 0 | 0 | 0 | 0 | Male | S | — |
| HSCR0116 | RET | Chr10: 43600528 | — | c.754G>T | p.E252X | DC | 2.1 | 0 | 0 | 0 | 0 | Male | TCA | |
| HSCR0117 | RET | Chr10: 43615164 | — | c.2578C>T | p.Q860X | DC | 4.42 | 0 | 0 | 0 | 0 | Male | TCA | |
| HSCR0129 | PTPN13 | Chr4: 87728936 | — | c.6396G>A | p.W2132X | DC | 5.68 | 0 | 0 | 0 | 0 | Female | L | Mother |
| HSCR0129 | RET | Chr10: 43600563 | — | c.789C>G | p.Y263X | DC | −3.43 | 0 | 0 | 0 | 0 | Female | L | |
| HSCR0013 | RET | Chr10: 43613869 | — | c.2333delT | p.V778Afs*1 | — | — | 0 | 0 | 0 | 0 | Male | S | — |
| HSCR0009 | SACS | Chr13: 23912864 | rs761184491 | c.4710delA | p.K1570Nfs*7 | — | — | 0 | 0 | 0.0003 | 0 | Female | S | — |
| HSCR0022 | PLEKHH1 | Chr14: 68053899 | rs111462449 | c.4042C>T | p.R1348X | PO | 3.11 | 0.01 | 0.008626 | 0.0076 | 0 | Male | L | — |
| HSCR0033 | TRAP1 | Chr16: 3740957 | — | c.118delA | p.T40Qfs*76 | — | — | 0 | 0 | 0 | 0 | Male | TCA | Mother |
| HSCR0034 | EXO1 | Chr1: 242042155 | — | c.1619C>G | p.S540X | DC | 3.26 | 0 | 0 | 0 | 0 | Female | L | — |
| HSCR0038 | ZEB2 | Chr2: 145158778 | rs587784571 | c.832C>T | p.R278X | DC | 4.43 | 0 | 0 | 0 | 0 | Female | L | |
| HSCR0055 | AGL | Chr1: 100327076 | rs781580050 | c.49C>T | p.R17X | DC | 2.92 | 0 | 0 | 0.00000831 | 0 | Female | S | Mother |
| HSCR0055 | RET | Chr10: 43619117 | — | c.2802-2A>G | - | DC | 5.33 | 0 | 0 | 0 | 0 | Female | S | |
| HSCR0057 | SEMA3D | Chr7: 84628812 | — | c.2278C>T | p.R760X | DC | 5.05 | 0 | 0 | 0 | 0 | Male | L | Father |
| HSCR0070 | PHKB | Chr16: 47694459 | — | c.2014C>T | p.R672X | DC | 3.39 | 0 | 0 | 0 | 0 | Female | S | — |
| HSCR0074 | RET | Chr10: 43613844 | rs775711017 | c.2308C>T | p.R770X | DC | 3.49 | 0 | 0 | 0.000008277 | 0 | Male | TCA | |
| HSCR0082 | HHIPL2 | Chr1: 222721107 | rs748262144 | c.279_280insGA | p.H94Dfs*58 | — | — | 0 | 0 | 0.00002486 | 0.001997 | Male | TCA | — |
| HSCR0146 | RET | Chr10: 43596087 | — | c.254G>A | p.W85X | DC | 5.51 | 0 | 0 | 0 | 0 | Male | TCA | — |
| HSCR0146 | ENO3 | Chr17: 4859895 | rs550460218 | c.1095G>A | p.W365X | DC | 5.32 | 0.000199681 | 0 | 0.0001 | 0 | Male | TCA | — |

(a) Nucleotide numbering of the exonic variants reflects cDNA numbering, with +1 corresponding to the A of the ATG translation initiation codon in the reference sequence: RefSeq NM_016170 for TLX2, RefSeq NM_020630 for RET, RefSeq NM_080684 for PTPN13, RefSeq NM_001278055 for SACS, RefSeq NM_020715 for PLEKHH1, RefSeq NM_016292 for TRAP1, RefSeq NM_003686 for EXO1, RefSeq NM_001171653 for ZEB2, RefSeq NM_000645 for AGL, RefSeq NM_152754 for SEMA3D, RefSeq NM_000293 for PHKB, RefSeq NM_024746 for HHIPL2, and RefSeq NM_001976 for ENO3.
(b) DC, disease-causing; PO, polymorphism.
(c) GERP, Genomic Evolutionary Rate Profiling. Score ranges from −15 (not conserved) to 7 (conserved).
(d) The allele frequency is based on the Exome Aggregation Consortium data (http://exac.broadinstitute.org/) of all individuals.
(e) S, short-segment HSCR; L, long-segment HSCR; TCA, total colonic aganglionosis.
Figure 2. Top 30 HSCR-implicated genes in which LGDbroad variants were detected in several patients. The number of HSCR patients who carried a candidate variant in each gene is shown on the x axis. LGDbroad candidates consist of five categories of rare variants: frameshift indels (red), in-frame indels (blue), splice region (±5) variants (purple), stop-gain variants (orange), and non-synonymous variants (green) predicted to be damaging by at least three bioinformatics tools.
Figure 3. Distribution of the LGD variants across gender and segment length. LGDstrict variants occurred slightly more frequently in males than in females (57.9% vs 42.1%), and a similar proportion was found for the LGDbroad variants (59.0% vs 41.0%) (A). In contrast, the more severe phenotype groups, namely L-HSCR and TCA, accounted for the majority of both the LGDstrict and LGDbroad variants: 13 LGDstrict variants (68.4%) and 330 LGDbroad variants (63.6%) were present in the L-HSCR + TCA group (B). S, L, and TCA denote short-segment HSCR, long-segment HSCR, and total colonic aganglionosis.
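The LGDstrict percentages quoted in the Figure 3 caption can be checked directly against the 19 rows of the LGDstrict variant table. The short Python tally below is a sketch of that check. The (gender, segment) pairs are transcribed from the table in row order, and the variable names are illustrative.

```python
from collections import Counter

# (gender, segment) for each of the 19 LGDstrict variants, in table row order.
lgd_strict_rows = [
    ("Male", "S"), ("Male", "TCA"), ("Male", "TCA"), ("Female", "L"),
    ("Female", "L"), ("Male", "S"), ("Female", "S"), ("Male", "L"),
    ("Male", "TCA"), ("Female", "L"), ("Female", "L"), ("Female", "S"),
    ("Female", "S"), ("Male", "L"), ("Female", "S"), ("Male", "TCA"),
    ("Male", "TCA"), ("Male", "TCA"), ("Male", "TCA"),
]

total = len(lgd_strict_rows)
by_gender = Counter(gender for gender, _ in lgd_strict_rows)
by_segment = Counter(segment for _, segment in lgd_strict_rows)

# Share of variants carried by males and females.
male_pct = round(100 * by_gender["Male"] / total, 1)
female_pct = round(100 * by_gender["Female"] / total, 1)

# Share of variants in the combined long-segment and TCA group.
severe_count = by_segment["L"] + by_segment["TCA"]
severe_pct = round(100 * severe_count / total, 1)

print(male_pct, female_pct, severe_count, severe_pct)
```

The tally gives 11 male and 8 female variant carriers (57.9% vs 42.1%) and 13 variants in the L-HSCR + TCA group (68.4%), in agreement with the caption. The counts are per variant, so patients carrying two LGDstrict variants contribute twice.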
Significant findings of functional annotation cluster analysis for LGDstrict genes with a nominal P < 0.05.

| Category | Term | Count | % | | Fold enrichment | P value | |
|---|---|---|---|---|---|---|---|
| GOTERM_BP_FAT | GO:0048484~enteric nervous system development | 2 | 15.4 | | 450.93 | 3.99E-03 | 0.564 |
| GOTERM_BP_FAT | GO:0006006~glucose metabolic process | 3 | 23.1 | | 26.53 | 4.34E-03 | 0.596 |
| GOTERM_BP_FAT | GO:0048729~tissue morphogenesis | 3 | 23.1 | | 22.55 | 5.96E-03 | 0.712 |
| GOTERM_BP_FAT | GO:0019318~hexose metabolic process | 3 | 23.1 | | 21.14 | 6.76E-03 | 0.756 |
| GOTERM_BP_FAT | GO:0005996~monosaccharide metabolic process | 3 | 23.1 | | 18.28 | 8.95E-03 | 0.846 |
| GOTERM_BP_FAT | GO:0048483~autonomic nervous system development | 2 | 15.4 | | 142.40 | 1.26E-02 | 0.928 |
| GOTERM_BP_FAT | GO:0001755~neural crest cell migration | 2 | 15.4 | | 117.63 | 1.52E-02 | 0.959 |
| GOTERM_BP_FAT | GO:0048598~embryonic morphogenesis | 3 | 23.1 | | 13.22 | 1.66E-02 | 0.969 |
| GOTERM_BP_FAT | GO:0006091~generation of precursor metabolites and energy | 3 | 23.1 | | 12.97 | 1.73E-02 | 0.973 |
| GOTERM_BP_FAT | GO:0014033~neural crest cell differentiation | 2 | 15.4 | | 81.99 | 2.17E-02 | 0.990 |
| GOTERM_BP_FAT | GO:0014032~neural crest cell development | 2 | 15.4 | | 81.99 | 2.17E-02 | 0.990 |
| GOTERM_BP_FAT | GO:0005977~glycogen metabolic process | 2 | 15.4 | | 77.30 | 2.31E-02 | 0.992 |
| GOTERM_BP_FAT | GO:0006073~cellular glucan metabolic process | 2 | 15.4 | | 75.16 | 2.37E-02 | 0.993 |
| GOTERM_BP_FAT | GO:0044042~glucan metabolic process | 2 | 15.4 | | 75.16 | 2.37E-02 | 0.993 |
| GOTERM_BP_FAT | GO:0001667~ameboidal cell migration | 2 | 15.4 | | 73.12 | 2.44E-02 | 0.994 |
| GOTERM_BP_FAT | GO:0006112~energy reserve metabolic process | 2 | 15.4 | | 62.92 | 2.83E-02 | 0.997 |
| GOTERM_BP_FAT | GO:0001838~embryonic epithelial tube formation | 2 | 15.4 | | 62.92 | 2.83E-02 | 0.997 |
| GOTERM_BP_FAT | GO:0035148~tube lumen formation | 2 | 15.4 | | 61.49 | 2.89E-02 | 0.998 |
| GOTERM_BP_FAT | GO:0048762~mesenchymal cell differentiation | 2 | 15.4 | | 53.05 | 3.34E-02 | 0.999 |
| GOTERM_BP_FAT | GO:0014031~mesenchymal cell development | 2 | 15.4 | | 53.05 | 3.34E-02 | 0.999 |
| GOTERM_BP_FAT | GO:0060485~mesenchyme development | 2 | 15.4 | | 52.03 | 3.41E-02 | 0.999 |
| GOTERM_BP_FAT | GO:0044264~cellular polysaccharide metabolic process | 2 | 15.4 | | 51.05 | 3.47E-02 | 0.999 |
| GOTERM_BP_FAT | GO:0016331~morphogenesis of embryonic epithelium | 2 | 15.4 | | 46.65 | 3.79E-02 | 1.000 |
| GOTERM_BP_FAT | GO:0060562~epithelial tube morphogenesis | 2 | 15.4 | | 40.38 | 4.37E-02 | 1.000 |
Figure 4. Protein–protein interaction network of the LGDstrict variant-associated gene set identified using GeneMANIA. We used GeneMANIA to identify the genes most related to the LGDstrict gene set and to search for a possible protein–protein interaction network among them. The query genes (black circles) were assigned a label value of 1. Label propagation was then applied to the entire network, and the resulting labels were saved as the score attribute in the node table. This score indicated the relevance of each gene to the original list based on the selected networks. Higher scores (larger circles) indicate genes that are more likely to be functionally related. This analysis found an interconnected network of 11 out of 13 proteins through co-expression, co-localization, shared protein domains, or predicted interactions.
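The scoring idea in the Figure 4 caption (query genes start with a label of 1, and labels then spread over the network) can be illustrated with a generic label-spreading scheme. The sketch below is not GeneMANIA's actual algorithm or API. It is a simple weighted-neighbor iteration with a hypothetical damping parameter `alpha`, run on a toy network whose node names and edge weights are invented for illustration.

```python
def propagate_labels(edges, query_nodes, alpha=0.5, iterations=100):
    """Spread a label of 1 from query nodes over a weighted, undirected network.

    edges: list of (node_a, node_b, weight) tuples.
    Returns a dict mapping each node to its relevance score.
    """
    nodes = sorted({n for a, b, _ in edges for n in (a, b)} | set(query_nodes))
    neighbors = {n: {} for n in nodes}
    for a, b, w in edges:
        neighbors[a][b] = w
        neighbors[b][a] = w

    # Initial labels: 1 for query genes, 0 for everything else.
    initial = {n: 1.0 if n in query_nodes else 0.0 for n in nodes}
    scores = dict(initial)

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            weight_sum = sum(neighbors[n].values())
            if weight_sum:
                # Weighted average of the neighbors' current scores.
                spread = sum(w * scores[m] for m, w in neighbors[n].items()) / weight_sum
            else:
                spread = 0.0
            # Mix the node's own initial label with what flows in from neighbors.
            updated[n] = (1 - alpha) * initial[n] + alpha * spread
        scores = updated
    return scores


# Toy network: two query nodes share a neighbor N1, and N2 hangs off N1.
toy_edges = [("Q1", "N1", 1.0), ("Q2", "N1", 1.0), ("N1", "N2", 1.0)]
toy_scores = propagate_labels(toy_edges, {"Q1", "Q2"})
```

In this toy example the query nodes keep the highest scores, the shared neighbor N1 scores higher than the more distant N2, and every node connected to a query gene ends up with a non-zero score, which is the sense in which larger circles in the figure mark genes more closely related to the query set.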
Figure 5. Temporal and spatial expression of candidate genes in the colon. (A) Quantitative gene expression of Ptpn13, Phkb, Agl, Zfhx3, Lama1, Ap3b2, and Ret in mouse colon tissue during embryonic and postnatal development. We analyzed gene expression using real-time PCR in fetal (E8.5, E10, E12, and E16) and postnatal (1 and 6 weeks) mice. Replicate experiments were normalized to Gapdh as a control. Ret was subsequently assigned a value of one for comparison of relative expression levels across the 6 time points. Each gene is color coded identically across all time points (y axis in log10 scale). (B) Immunohistochemical staining of colon tissue from two human controls revealed intense expression of PTPN13, PHKB, AGL, ZFHX3, AP3B2, and RET, but not LAMA1, in both the mucosal layer and the myenteric plexuses.
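The two-stage normalization described for panel A (first to Gapdh, then scaling so that Ret equals one) can be sketched as follows. The caption does not state the quantification formula, so the 2^(-ΔCt) calculation, the assumption of 100% amplification efficiency, and the Ct values below are all hypothetical and serve only to show the arithmetic.

```python
import math


def relative_expression(ct_values, reference="Gapdh", calibrator="Ret"):
    """Return expression of each gene relative to the calibrator gene (set to 1).

    ct_values: dict mapping gene name to its Ct value at one time point.
    Assumes the 2^(-delta Ct) method with perfect amplification efficiency.
    """
    ref_ct = ct_values[reference]
    # Step 1: normalize each gene to the reference gene.
    normalized = {
        gene: 2.0 ** -(ct - ref_ct)
        for gene, ct in ct_values.items()
        if gene != reference
    }
    # Step 2: rescale so that the calibrator gene has a value of one.
    return {gene: value / normalized[calibrator] for gene, value in normalized.items()}


# Hypothetical Ct values for a single time point.
example_ct = {"Gapdh": 18.0, "Ret": 24.0, "Ptpn13": 26.0}
example_rel = relative_expression(example_ct)

# Values plotted on a log10 axis, as in the figure.
example_log10 = {gene: math.log10(value) for gene, value in example_rel.items()}
```

With these made-up inputs, Ptpn13 amplifies two cycles later than Ret, so its relative expression is 0.25 and Ret sits at 1 (0 on the log10 axis).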