Sagar J Pathak, James L Mueller, Kevin Okamoto, Barun Das, Jozef Hertecant, Lynn Greenhalgh, Trevor Cole, Vered Pinsk, Baruch Yerushalmi, Odul E Gurkan, Michael Yourshaw, Erick Hernandez, Sandy Oesterreicher, Sandhia Naik, Ian R Sanderson, Irene Axelsson, Daniel Agardh, C Richard Boland, Martin G Martin, Christopher D Putnam, Mamata Sivagnanam.
Abstract
The epithelial cell adhesion molecule gene (EPCAM, previously known as TACSTD1 or TROP1) encodes a membrane-bound protein that is localized to the basolateral membrane of epithelial cells and is overexpressed in some tumors. Biallelic mutations in EPCAM cause congenital tufting enteropathy (CTE), which is a rare chronic diarrheal disorder presenting in infancy. Monoallelic deletions of the 3' end of EPCAM that silence the downstream gene, MSH2, cause a form of Lynch syndrome, which is a cancer predisposition syndrome associated with loss of DNA mismatch repair. Here, we report 13 novel EPCAM mutations from 17 CTE patients from two separate centers, review EPCAM mutations associated with CTE and Lynch syndrome, and structurally model pathogenic missense mutations. Statistical analyses indicate that the c.499dupC (previously reported as c.498insC) frameshift mutation was associated with more severe treatment regimens and greater mortality in CTE, whereas the c.556-14A>G and c.491+1G>A splice site mutations were not correlated with treatments or outcomes significantly different than random simulation. These findings suggest that genotype-phenotype correlations may be useful in contributing to management decisions of CTE patients. Depending on the type and nature of EPCAM mutation, one of two unrelated diseases may occur, CTE or Lynch syndrome.
Keywords: EPCAM; Lynch syndrome; congenital tufting enteropathy; genotype-phenotype correlation; in silico simulation; protein modeling
Year: 2018 PMID: 30461124 PMCID: PMC6328345 DOI: 10.1002/humu.23688
Source DB: PubMed Journal: Hum Mutat ISSN: 1059-7794 Impact factor: 4.878
Novel variants of EPCAM in CTE patients
| Patient | Coding DNA | Protein | Type of mutation | Zygosity | Ethnicity | Gender | TPN requirement | Transplant |
|---|---|---|---|---|---|---|---|---|
| 1 | c.113G>A; c.48_68del21 | p.(C38Y); p.(A18_Q24del) | Missense; in‐frame deletion | Compound heterozygous | Caucasian | M | Full TPN | No |
| 2 | c.1A>C; c.38_62dup25 | p.(M1L); p.(A22Cfs*17) | Missense; frameshift | Compound heterozygous | Palestinian | F | Off TPN due to limited access | No |
| 3 | c.267G>C; c.*118T>C | p.(Q89H); unknown consequence | Missense; unknown | Compound heterozygous | English | M | Full TPN | No |
| 4 | c.307G>A; c.492‐5T>C | p.(G103R); splicing defect | Missense; splicing defect | Compound heterozygous | English‐Italian | M | Partial TPN | No |
| 5.1 | c.491+1G>A; c.556‐14A>G | p.(W143_T164del) [del exon 4]; p.(Y186_D219del*) | Splicing defect; splicing defect | Compound heterozygous | Bangladeshi | M | Full TPN | No |
| 5.2 | c.491+1G>A; c.556‐14A>G | p.(W143_T164del) [del exon 4]; p.(Y186_D219del*) | Splicing defect; splicing defect | Compound heterozygous | Bangladeshi | M | Full TPN | No |
| 6 | c.492‐1G>A; c.491+1G>A | p.(W143_T164del) [del exon 4]; p.(W143_T164del) [del exon 4] | Splicing defect; splicing defect | Compound heterozygous | Hispanic | M | Full TPN | No |
| 7 | c.509_511delTCA | p.(I170del) | In‐frame deletion | Homozygous | Pakistani | Unknown | Unknown | Unknown |
| 8 | c.555+1G>C | p.(A65Mfs*23) [del exon 5] | Splicing defect | Homozygous | Turkish | F | Weaned off | No |
| 9.1 | c.579delT | p.(I193Mfs*17) | Frameshift | Homozygous | Bedouin | M | Full TPN | No |
| 9.2 | c.579delT | p.(I193Mfs*17) | Frameshift | Homozygous | Bedouin | F | Full TPN | No |
| 10 | c.589C>T | p.(Q197*) | Truncation | Homozygous | Iraqi | M | Full TPN | No |
| 11.1 | c.540delT; c.491+1G>A | p.(F180Lfs*30); p.(W143_T164del) [del exon 4] | Frameshift; splicing defect | Compound heterozygous | Unknown | M | Partial TPN | No |
| 11.2 | c.540delT; c.491+1G>A | p.(F180Lfs*30); p.(W143_T164del) [del exon 4] | Frameshift; splicing defect | Compound heterozygous | Unknown | M | Partial TPN | No |
| 12 | c.757G>A | p.(D253N) | Missense | Homozygous | Turkish | M | Full TPN | No |
| 13 | g.29411_34526del5116 | p.(Q24_Y142del) | Deletion | Homozygous | Hispanic | F | Partial TPN | No |
| 14 | c.227C>G | p.(S76*) | Truncation | Homozygous | Israeli | F | Full TPN | No |
Siblings are annotated with decimals, e.g., patients 5.1 and 5.2.
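Because siblings share a genotype, it may be useful to count each variant both per patient and per family when reading the table. The following is a minimal sketch, not the authors' method: the `rows` list transcribes only a few table rows (with ASCII hyphens), and the helper names `family_id` and `tally` are hypothetical.

```python
from collections import Counter

# A few rows transcribed from the table: (patient ID, variants, zygosity).
# ASCII hyphens are used in the variant names for ease of typing.
rows = [
    ("5.1", ["c.491+1G>A", "c.556-14A>G"], "Compound heterozygous"),
    ("5.2", ["c.491+1G>A", "c.556-14A>G"], "Compound heterozygous"),
    ("6", ["c.492-1G>A", "c.491+1G>A"], "Compound heterozygous"),
    ("9.1", ["c.579delT"], "Homozygous"),
    ("9.2", ["c.579delT"], "Homozygous"),
    ("10", ["c.589C>T"], "Homozygous"),
]


def family_id(patient_id):
    """Return the part before the decimal, so '5.1' and '5.2' share family '5'."""
    return patient_id.split(".")[0]


def tally(rows):
    """Count, for each variant, the patients and the distinct families carrying it."""
    per_patient = Counter()
    family_sets = {}
    for patient_id, variants, _zygosity in rows:
        for variant in variants:
            per_patient[variant] += 1
            family_sets.setdefault(variant, set()).add(family_id(patient_id))
    per_family = {variant: len(ids) for variant, ids in family_sets.items()}
    return per_patient, per_family


patients, families = tally(rows)
for variant in sorted(patients):
    print(variant, "patients:", patients[variant], "families:", families[variant])
```

In this subset, c.491+1G>A is carried by three patients from two families, which illustrates how per-patient counts can overstate the number of independent observations.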
Figure 1. Quantitative distribution of EPCAM mutations identified in CTE patients. Each patient is represented by a single box. Gray boxes correspond to the presence of the mutation in homozygous patients, whereas white boxes correspond to the presence of a mutation in heterozygous patients.
Figure 2. Reported mutations in EPCAM identified in CTE patients. (a) Reported mutations are shown based on their effect on the gene (top), mRNA (middle), or protein (bottom). (b) The positions of missense mutations are depicted on the EPCAM dimer structure; theoretical membrane and transmembrane helix structures are shown to indicate protein orientation.
Figure 3. 3′ deletions of EPCAM associated with Lynch syndrome. (a) Illustration of the homologies of two Alu–Alu recombination events causing deletions of EPCAM exons 8 and 9. The top and bottom sequences correspond to the two Alu sequences involved in the recombination event, and the middle sequence is the novel junction. The sequences between the colons correspond to the identical microhomology at the deletion junction. (b) Diagram of the EPCAM deletions. Exons 1–9 of EPCAM and the first four exons of MSH2 are depicted at the top. Deletions are shown in the middle as thick black lines. The Alu repeats involved in the homology‐mediated rearrangements for each deletion are shown as labeled black boxes; black boxes above the deletion line are Alu sequences encoded on the + strand, whereas black boxes below the deletion line are Alu sequences encoded on the – strand. Repetitive elements on the + and – strand identified by RepeatMasker version 406 (Smit, 2013–2015) with the Repeat Library Version 20150807 are shown at bottom. Alu‐related sequences are shown as grey and black boxes; other repeats are shown as white boxes. Note that some deletions have previously been reported under different annotations: c.426‐544_*3904del (c.423‐545_*3903del), c.555+402_*1220del (AC079775.6:g.72468_82822del10355), c.555+927_*14226del (c.555+894_*14194del), c.858+1211_*4529del (c.85811211_4529del), c.858+1358_*4793del_insAG (c.858+1364_*4793del_insAG), c.858+2478_*4507del (AC079775.6:g.77436_86109del8674), c.859‐2524_*10762del (AC079775.6:g.77631_92364del14734), c.859‐1605_*5826del (c.859‐1605_*5862del), c.859‐696_*3914del (AC079775.6:g.79459_85516del6058), c.859‐692_*1990del (c.859‐672_*2170del), c.859‐689_*14697del (AC079775.6:g.79465_96299del16834), and c.859‐645_*10911del (AC079775.6:g.79509_92513del13004). In addition, two deletions reported as annotations but without sequence do not support the reported junction microhomologies: c.492‐509_*13721del and c.858‐353_*618del.
Figure 4. Detailed views of the positions of reported missense mutations in EPCAM and c.491+1G>A. (a) The F105C and G103R amino acid substitutions are positioned to affect the C66–C99 disulfide bond through disulfide bond isomerization (F105C) or through steric interactions (G103R) as shown by the molecular surface of the arginine mutant that clashes with Y32. (b) The C38Y amino acid substitution disrupts one of the disulfides in the N‐terminal domain. (c) The I146N amino acid substitution places a polar residue within the hydrophobic core of the C‐terminal domain. (d) The N120I and D253N substitutions disrupt the side chain–main chain hydrogen bonds between the thyroglobulin‐like domain and the C‐terminal domain. (e) The T127I amino acid substitution would disrupt side chain–main chain interactions adjacent to the C118–C135 disulfide bond in the thyroglobulin‐like domain. (f) The c.491+1G>A splice site mutation that causes exon 4 skipping would delete a core region (red cartoon) of the C‐terminal domain.
Figure 5. Quantitative distribution of genotypes in the current literature. Colors denote current treatment status or clinical outcome of patients with the indicated genotype; each patient is represented by a single box. Transplanted patients who were deceased are reported as transplanted.
Figure 6. Computer simulation of patient treatment/outcomes suggests a correlation between c.499dupC and more severe disease. (a) The number of times a category of mutation was associated with a particular treatment or outcome (vertical arrow) was compared to the expected distribution based on random simulations (grey bars). Reported P‐values are derived from the expected distributions. Statistically significant P‐values are in bold. (b) Analysis of individual mutations using random simulation as in panel (a). Of the three mutations that could be analyzed, only the c.499dupC mutation had an observed count that was statistically different from the random distribution.
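The comparison of an observed count against a randomly simulated distribution, as described in the Figure 6 caption, can be sketched as a label-shuffling test. This is an illustration under stated assumptions rather than the authors' actual procedure: the cohort below is made-up toy data (not patient data from the study), and the function name `simulate_p_value`, the one-sided test direction, and the add-one correction are all choices made for this sketch.

```python
import random


def simulate_p_value(carrier_flags, outcomes, target_outcome, n_sim=10_000, seed=0):
    """Empirical one-sided P-value for co-occurrence of a mutation and an outcome.

    Outcome labels are shuffled across patients to build the null distribution
    of the number of carriers who have the target outcome.
    """
    rng = random.Random(seed)
    observed = sum(
        1 for carrier, outcome in zip(carrier_flags, outcomes)
        if carrier and outcome == target_outcome
    )
    shuffled = list(outcomes)
    at_least_as_extreme = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        count = sum(
            1 for carrier, outcome in zip(carrier_flags, shuffled)
            if carrier and outcome == target_outcome
        )
        if count >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (at_least_as_extreme + 1) / (n_sim + 1)


# Toy cohort (hypothetical, for illustration only): 20 patients, 6 carriers.
carriers = [True] * 6 + [False] * 14
outcomes = ["deceased"] * 5 + ["alive"] + ["deceased"] + ["alive"] * 13

observed, p_value = simulate_p_value(carriers, outcomes, "deceased")
print("observed count:", observed, "empirical P-value:", p_value)
```

Shuffling the outcome labels keeps the number of carriers and the number of patients with each outcome fixed, so the simulated counts reflect what would be expected if genotype and outcome were unrelated in a cohort of the same composition.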