
Identification and validation of candidate risk genes in endocytic vesicular trafficking associated with esophageal atresia and tracheoesophageal fistulas.

Guojie Zhong1,2, Priyanka Ahimaz3, Nicole A Edwards4, Jacob J Hagen1,3, Christophe Faure5, Qiao Lu1,3, Paul Kingma6, William Middlesworth7, Julie Khlevner8, Mahmoud El Fiky9, David Schindel10, Elizabeth Fialkowski11, Adhish Kashyap4, Sophia Forlenza6,12, Alan P Kenny6,12, Aaron M Zorn4, Yufeng Shen1,13, Wendy K Chung3,14.   

Abstract

Esophageal atresias/tracheoesophageal fistulas (EA/TEF) are rare congenital anomalies caused by aberrant development of the foregut. Previous studies indicate that rare or de novo genetic variants significantly contribute to EA/TEF risk, and most individuals with EA/TEF do not have pathogenic genetic variants in established risk genes. To identify the genetic contributions to EA/TEF, we performed whole genome sequencing of 185 trios (probands and parents) with EA/TEF, including 59 isolated and 126 complex cases with additional congenital anomalies and/or neurodevelopmental disorders. There was a significant burden of protein-altering de novo coding variants in complex cases (p = 3.3 × 10−4), especially in genes that are intolerant of loss-of-function variants in the population. We performed simulation analysis of pathway enrichment based on background mutation rate and identified a number of pathways related to endocytosis and intracellular trafficking that as a group have a significant burden of protein-altering de novo variants. We assessed 18 variants for disease causality using CRISPR-Cas9 mutagenesis in Xenopus and confirmed 13 with tracheoesophageal phenotypes. Our results implicate disruption of endosome-mediated epithelial remodeling as a potential mechanism of foregut developmental defects. Our results suggest significant genetic heterogeneity of EA/TEF and may have implications for the mechanisms of other rare congenital anomalies.
© 2022 The Author(s).


Keywords:  Xenopus; aerodigestive; congenital anomaly; esophageal atresia; tracheoesophageal fistula

Year:  2022        PMID: 35519826      PMCID: PMC9065433          DOI: 10.1016/j.xhgg.2022.100107

Source DB:  PubMed          Journal:  HGG Adv        ISSN: 2666-2477


Introduction

Esophageal atresia (EA) is a congenital abnormality of the esophagus, co-occurring with tracheoesophageal fistula (TEF) in 70%–90% of cases. The overall worldwide incidence of EA/TEF is 2.4 per 10,000 births. Approximately 55% of individuals with EA/TEF are complex, with additional congenital anomalies in the cardiovascular, musculoskeletal, urinary, gastrointestinal, or central nervous system. The genetic causes of EA/TEF include chromosome anomalies or variants in genes involved in critical developmental processes that are dosage sensitive. Known EA/TEF risk genes include the transcriptional regulators SOX2, MYCN, CHD7, FANCB, and members of the FOX transcription factor family. VACTERL association frequently includes EA/TEF and is often of unknown etiology. Mouse models have demonstrated that precise regulation of the transcription factors Nkx2-1, Sox2, and Foxf1 by WNT, bone morphogenetic protein 4 (BMP4), and Hedgehog signaling pathways is required for patterning of the fetal foregut and separation of the esophagus and trachea. Moreover, EFTUD2 haploinsufficiency leads to syndromic EA. EFTUD2 encodes one of the major components of the spliceosome, emphasizing the necessity of mRNA maturation through the spliceosome complex for normal development. Recently we have shown that de novo variants are major contributors to EA/TEF genetic risk, especially in genes that are targets of SOX2 or EFTUD2. However, it remains unclear how developmental signaling pathways, transcription factors, and RNA metabolism control the cellular behavior of tracheoesophageal morphogenesis. Despite previous studies of the genetics of several syndromes that include EA/TEF and of mouse models, the etiology in most cases of EA/TEF is still unexplained. To identify the genetic etiologies of EA/TEF, we performed whole genome sequencing (WGS) of 185 individuals with EA/TEF, most without a family history of EA/TEF, and their biological parents.
We confirmed our previous results from a smaller EA/TEF cohort, demonstrating an overall enrichment of de novo coding variants in complex cases. Functional enrichment analysis identified a striking convergence of putative risk genes in biological pathways related to endocytosis, membrane dynamics, and intracellular transport. We then used CRISPR-generated Xenopus mutant models to successfully confirm 13 of 18 candidate risk genes for EA/TEF. Together with recent reports that endosome-mediated membrane remodeling is required for tracheoesophageal morphogenesis in animal models, this suggests that disruptions in endosome trafficking may be a feature of many complex EA/TEF cases.

Methods

Participant recruitment

Individuals with EA/TEF were recruited as part of the CLEAR consortium from Columbia University Irving Medical Center in New York, USA; Centre Hospitalier Universitaire Sainte-Justine in Montreal, Canada; Cincinnati Children’s Hospital in Ohio, USA; Cairo University General Hospital in Cairo, Egypt; University of Texas Southwestern Medical Center in Texas, USA; and Oregon Health and Science University in Portland, USA. Participants eligible for the study included those diagnosed with EA/TEF without an identified genetic etiology based upon medical record review. All participants provided informed consent. The overall study was approved by the Columbia University institutional review board and each affiliated site. Blood and/or saliva samples were obtained from the probands and both biological parents. A three-generation family history was taken at the time of enrollment, and clinical data were extracted from the medical records and by participant and parental interview. We performed whole genome sequencing (WGS) on 185 probands diagnosed with EA/TEF, none of whom had prior sequence-based genetic testing, and their parents. DNA from 75 probands was isolated from saliva samples, and DNA from the remaining 110 probands was isolated from blood samples. Individuals with only EA/TEF were classified as isolated cases (59 in total), and individuals with other types of congenital abnormalities or neurodevelopmental disorders were classified as complex cases (126 in total; Table S1).

WGS analysis

We identified de novo coding variants using previously published procedures with heuristic filters, augmented with in silico confirmation by DeepVariant (Table S2). We used ANNOVAR and VEP to annotate variants with population allele frequency (gnomAD and ExAC), protein-coding consequences, and predicted damaging scores for missense variants. Variants were classified as LGD (likely gene disrupting, including frameshift, stop gained/lost, start lost, splice acceptor/donor, and splicing damage variants [spliceAI DS score ≥ 0.8]), missense, or synonymous. In-frame deletions/insertions (multiples of three nucleotides) and other splice region variants were excluded from the following analysis. Variants in olfactory receptor genes, HLA genes, or the MUC gene family were filtered out of further analysis. We identified de novo copy number variants (CNVs) using a customized pipeline as described in our previous study. Briefly, we applied CNVnator (v0.3.3) with the bin size set to 100 bp to predict CNV segments from read depth evidence, and Lumpy v0.2.13 and SVtyper v0.1.4 to quantify paired-end/split-read (PE/SR) evidence. We only included the CNVs supported by both read depth and PE/SR in downstream analysis. Among the CNVs called in probands with Mendelian errors (i.e., not called in either parent), we called de novo CNVs by visualization of both normalized read depth and allele fraction of SNP sites. We mapped de novo CNVs to GENCODE v29 protein-coding genes with at least 1 bp in the shared interval. We annotated the genes with the variant intolerance metric ExAC pLI, the haploinsufficiency metric Episcore, haploinsufficiency and triplosensitivity of genes from the ClinGen genome dosage map, and CNV syndromes from DECIPHER v11.1.
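The variant classes and gene filters described above can be sketched as a small rule set. This is an illustration under stated assumptions rather than the pipeline used in this study: the consequence labels, the function names, and the gene-symbol patterns for olfactory receptor, HLA, and MUC genes are hypothetical. Only the spliceAI DS cutoff of 0.8 and the class definitions come from the text.

```python
import re

# Consequence labels are hypothetical shorthand, not the exact annotator output.
LGD_CONSEQUENCES = {
    "frameshift", "stop_gained", "stop_lost", "start_lost",
    "splice_acceptor", "splice_donor",
}
SPLICEAI_THRESHOLD = 0.8  # spliceAI DS score cutoff stated in the text

# Assumed symbol patterns for olfactory receptor, HLA, and MUC genes.
EXCLUDED_GENE_PATTERN = re.compile(r"^(OR\d+[A-Z]+\d+|HLA-.+|MUC\d+.*)$")


def classify_variant(consequence, spliceai_ds=0.0):
    """Return 'LGD', 'missense', 'synonymous', or None if the variant is excluded."""
    if consequence in LGD_CONSEQUENCES or spliceai_ds >= SPLICEAI_THRESHOLD:
        return "LGD"
    if consequence == "missense":
        return "missense"
    if consequence == "synonymous":
        return "synonymous"
    # In-frame indels and other splice region variants are not analyzed.
    return None


def keep_gene(symbol):
    """True if the gene is not in one of the filtered gene families."""
    return EXCLUDED_GENE_PATTERN.match(symbol) is None
```

In this sketch a synonymous or missense variant with a spliceAI DS score at or above the cutoff is counted as LGD, which is one reading of "splicing damage variants" in the classification above.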

Burden test

We divided the cohort into two categories based on their phenotypes (isolated and complex) and performed burden tests on both groups and the aggregated group. For each group, we divided de novo coding variants into four types: synonymous, LGD, missense, and protein altering (defined as the combination of LGD and missense variants). For each variant type, we calculated the expected number of variants based on a background mutation rate model. We used a one-sided Poisson test to test whether the number of observed de novo variants was significantly higher than expected. We performed the test in all genes, genes intolerant of loss-of-function variants (“constrained genes” based on gnomAD pLI ≥ 0.5), and non-constrained genes. Population attributable risk (PAR) was calculated as follows: $\mathrm{PAR} = (m_{\mathrm{obs}} - m_{\mathrm{exp}})/N$, where $m_{\mathrm{obs}}$, $m_{\mathrm{exp}}$, and $N$ are the observed number of individuals with heterozygous protein-altering variants, the expected number of individuals with heterozygous protein-altering variants, and the number of all cases, respectively.
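As a minimal sketch of the burden test, the fold enrichment, one-sided Poisson p value, and PAR can be computed from observed and expected counts. The function names are hypothetical, the PAR expression (observed minus expected, divided by the number of cases) is our reading of the definition above, and the example counts are the protein-altering counts reported in Table 2.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected), with expected > 0."""
    if observed <= 0:
        return 1.0
    total, k = 0.0, observed
    while True:
        # Poisson probability of exactly k events, computed in log space.
        term = math.exp(k * math.log(expected) - expected - math.lgamma(k + 1))
        total += term
        # Past the mean the terms shrink; stop once they no longer matter.
        if k > expected and term <= 1e-17 * total:
            return min(total, 1.0)
        k += 1


def burden(observed, expected, n_cases):
    """Fold enrichment, one-sided p value, and PAR for one variant class."""
    return {
        "fold": observed / expected,
        "p_value": poisson_upper_tail(observed, expected),
        "par": (observed - expected) / n_cases,
    }


# Protein-altering de novo variants (counts from Table 2).
complex_cases = burden(observed=144, expected=106.6, n_cases=126)
isolated_cases = burden(observed=47, expected=49.9, n_cases=59)
```

With these inputs the complex-case fold is about 1.35 and the PAR is close to 0.30, in line with the values reported in the Results.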

Pathway enrichment analysis of de novo protein-altering variants in complex cases

To identify the pathways associated with de novo protein-altering variants, we performed pathway enrichment analysis on the gene ontology (GO) pathways and human phenotype ontology (HPO) terms from the GSEA database (version v7.2) in complex cases. We only considered the pathways with at least two protein-altering variants (defined as the combination of LGD and missense variants) expected by chance based on the background mutation rate model. Based on these criteria, we selected a total of 907 pathways for downstream analysis. We performed a one-sided Poisson test of observed variants versus expectation in each pathway. Since many pathways have shared genes, we performed simulations under the null hypothesis to estimate the family-wise error rate (FWER) for a given p value. In each round, we randomly generated de novo LGD or missense variants based on the background mutation rate and calculated p values for each pathway. Based on simulation results, we estimated FWER as follows: $\mathrm{FWER}(p) = N_p / M$, where $N_p$ is the total number of pathways that have p values smaller than or equal to $p$ in all simulations, and $M$ is the number of simulations. We used both the Jaccard index and correlation to show the overlap of two pathways. For each pair of pathways, the Jaccard index was defined as the aggregated mutation rate of overlapping genes divided by the aggregated mutation rate of all genes, and correlations were calculated as the Pearson correlation during simulation. Network layout was generated by the “Prefuse Force Directed OpenCL Layout” algorithm in Cytoscape.
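The simulation-based FWER estimate and the mutation-rate-weighted Jaccard index can be illustrated on a toy example. This is a sketch under stated assumptions, not the code used in this study: the gene names, per-gene expected counts, and gene sets are hypothetical and deliberately inflated so that each toy pathway expects at least two variants. The estimator counts pathways with p values at or below the threshold across all simulations and divides by the number of simulations, following the description above, and "all genes" in the Jaccard index is taken to mean the union of the two pathways.

```python
import math
import random

# Hypothetical per-gene expected numbers of protein-altering de novo variants.
GENE_RATE = {"g1": 0.9, "g2": 0.8, "g3": 0.7, "g4": 0.6, "g5": 1.5}
# Hypothetical overlapping gene sets standing in for GO/HPO terms.
PATHWAYS = {"A": {"g1", "g2", "g3"}, "B": {"g2", "g3", "g4"}, "C": {"g4", "g5"}}


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected), with expected > 0."""
    if observed <= 0:
        return 1.0
    total, k = 0.0, observed
    while True:
        term = math.exp(k * math.log(expected) - expected - math.lgamma(k + 1))
        total += term
        if k > expected and term <= 1e-17 * total:
            return min(total, 1.0)
        k += 1


def sample_poisson(lam, rng):
    """Draw one Poisson variate (Knuth's method, adequate for small lam)."""
    limit, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1


def rate_sum(genes):
    return sum(GENE_RATE[g] for g in sorted(genes))


def estimate_fwer(p_threshold, n_sim=2000, seed=0):
    """Pathways with p <= threshold over all null simulations, per simulation."""
    rng = random.Random(seed)
    expected = {name: rate_sum(genes) for name, genes in PATHWAYS.items()}
    hits = 0
    for _ in range(n_sim):
        # One null data set: variants are drawn per gene, so pathways that
        # share genes receive correlated counts.
        counts = {g: sample_poisson(rate, rng) for g, rate in GENE_RATE.items()}
        for name, genes in PATHWAYS.items():
            observed = sum(counts[g] for g in genes)
            if poisson_upper_tail(observed, expected[name]) <= p_threshold:
                hits += 1
    return hits / n_sim


def weighted_jaccard(a, b):
    """Mutation rate of shared genes over mutation rate of the union."""
    return rate_sum(PATHWAYS[a] & PATHWAYS[b]) / rate_sum(PATHWAYS[a] | PATHWAYS[b])
```

Drawing variants per gene rather than per pathway is what lets the simulation account for shared genes, which a Bonferroni-style correction over 907 correlated pathways would not.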

Protein-protein interaction analysis

We tested protein interactions of genes with de novo protein-altering variants in complex cases using STRING (v11.0) with default settings and default interaction sources. Edges were filtered by STRING score ≥ 0.4 and visualized in Cytoscape. Proteins that were not connected to any other proteins after interaction filtering were removed from the network. Network layout was generated by the “Prefuse Force Directed OpenCL Layout” algorithm in Cytoscape. For each gene, degree was calculated as the sum of all StringDB scores.
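The edge filter and degree definition above amount to a score-weighted degree. The sketch below assumes an edge list of (protein, protein, score) tuples with scores on a 0–1 scale; the protein names and scores are hypothetical placeholders, not interactions from this study.

```python
from collections import defaultdict

SCORE_CUTOFF = 0.4  # STRING score threshold stated in the text

# Hypothetical edges: (protein_a, protein_b, score).
example_edges = [
    ("geneA", "geneB", 0.62),
    ("geneA", "geneC", 0.45),
    ("geneD", "geneE", 0.91),
    ("geneF", "geneB", 0.21),  # below the cutoff, so this edge is dropped
]


def weighted_degree(edges, cutoff=SCORE_CUTOFF):
    """Sum of retained edge scores per protein; unconnected proteins are omitted."""
    degree = defaultdict(float)
    for a, b, score in edges:
        if score >= cutoff:
            degree[a] += score
            degree[b] += score
    return dict(degree)


degrees = weighted_degree(example_edges)
```

Here `geneF` has no edge that passes the cutoff, so it does not appear in the result, mirroring the removal of unconnected proteins from the network.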

F0 Xenopus tropicalis CRISPR-Cas9 mutagenesis screen

All Xenopus experiments were performed using guidelines approved by the CCHMC Institutional Animal Care and Use Committee (IACUC 2019-0053). Xenopus tropicalis adult frogs were purchased from NASCO (USA) or raised in house and maintained in the CCHMC vivarium under normal housing conditions. Xenopus embryos were obtained by in vitro fertilization or natural mating as previously described. Germ line sox2 embryos (F2 generation) were obtained by mating sox2+/− adults obtained from the National Xenopus Resource (NXR, USA; RRID: SCR_013731). For F0 CRISPR-Cas9 indel mutagenesis, guide RNAs (gRNAs) were designed using CRISPRScan based on the Xenopus tropicalis v9.1 genome assembly on Xenbase. gRNAs were designed to generate either null mutations (early in the coding sequence) or mutations in the coding region similar to the corresponding mutation in our human cohort. In vitro transcribed gRNAs were synthesized using the MEGAshortscript T7 Transcription Kit (ThermoFisher, USA) according to the manufacturer’s instructions, or purchased as AltR-crRNA (Integrated DNA Technologies, USA). CrRNAs were annealed with AltR-tracrRNA prior to embryo injections according to the manufacturer’s guidelines. Guide RNAs (500–700 pg) were complexed with recombinant Cas9 protein (1 ng, PNA Biosciences) and injected into X. tropicalis embryos at the one- or two-cell stage. For negative controls, a gRNA targeting tyrosinase (tyr) was injected to calculate a baseline percentage of defective tracheoesophageal development in Xenopus (∼2%). Three-day-old injected tadpoles (stage NF44) were fixed and processed for wholemount immunostaining as previously described using the following primary antibodies: mouse anti-SOX2 (Abcam, ab79351, 1:1,000), goat anti-FOXF1 (R&D systems, AF4798, 1:300), and rabbit anti-NKX2-1 (SCBT, sc-13040X, 1:300). Imaging was performed using a Nikon A1 inverted LUNA confocal microscope with constant laser settings for all embryos.
Image analysis was performed using NIS Elements (Nikon, USA). After image analysis, each embryo was genotyped by PCR amplification of the target region followed by Sanger sequencing. Since F0 CRISPR-mutagenesis is mosaic, different cells can have different mutations, so we used the Synthego ICE software tool to deconvolute the proportion and sequence of each indel mutation in each embryo (Figure S1). Genotyping primers and gRNA sequences are in Table S3. We only scored phenotypic data from embryos that had a >40% mutation rate. For each gene, CRISPR-mutagenesis experiments were independently repeated at least twice in different batches of embryos, analyzing 5–15 individual mutant tadpoles per experiment. A candidate risk gene was determined to be likely causative if more than 10% of mutant tadpoles had a tracheal or esophageal defect (laryngotracheoesophageal cleft [LTEC], occluded esophagus, or failed separation), compared to the baseline rate of ∼2% in control injected tadpoles.

Results

A total of 185 individuals with EA/TEF were enrolled into the study, including 102 (55%) male and 83 (45%) female probands. Probands were between the ages of 2 days and 54.5 years with an average of 8.2 years old at enrollment (Table 1). The majority (52.4%) were type C EA/TEFs. Fifty-nine probands had isolated EA/TEF, and 126 probands had neurodevelopmental delay and/or at least one additional congenital anomaly and were classified as non-isolated. Of the non-isolated cases, the most common associated anomalies were congenital heart defects (65; 51.5%), skeletal defects (48; 38%), and renal defects (40; 31.7%). Other congenital anomalies included genitourinary defects (non-renal) (16; 12.7%), laryngotracheal defects (13; 10.3%), gastrointestinal defects (9; 7%), limb defects (7; 5.5%), neural tube defects (5; 3.9%), craniofacial defects (5; 3.9%), and other anomalies were seen in 12 probands (9.5%). Twenty-five probands (19.8%) had neurodevelopmental delay. Fifty-five of the cases (30%) previously had a normal clinical karyotype and/or chromosome microarray, and none had exome sequencing. The majority of probands were self-identified White (80.5%), and the remaining were Black/African American (8%), Asian (6%), more than one race (3.2%), or unknown (2.2%). One of the probands reported a family history of EA/TEF, and 14 reported a family history of other congenital anomalies.
Table 1

Clinical table of 185 individuals enrolled into the study

| Characteristics | N = 185 |
| --- | --- |
| Mean age at enrollment (range) | 8.22 years (2 days–54.5 years) |
| Sex | |
| Male | 102 (55%) |
| Female | 83 (45%) |
| Race and ethnicity | |
| White | 149 (80.5%) |
| Black/African American | 15 (8%) |
| Asian | 11 (6%) |
| American Indian/Alaska Native | 0 |
| Native Hawaiian or Pacific Islander | 0 |
| More than one race | 6 (3.2%) |
| Unknown | 4 (2.2%) |
| Hispanic | 13 (7%) |
| Non-Hispanic | 169 (91.3%) |
| Unknown | 3 (1.6%) |
| Type of EA/TEF | |
| Type A | 27 (14.5%) |
| Type B | 5 (2.7%) |
| Type C | 97 (52.4%) |
| Type D | 3 (1.6%) |
| Type H | 8 (4.3%) |
| Unknown | 45 (24.3%) |
| Clinical presentation | |
| Isolated | 59 (32%) |
| Non-isolated | 126 (68%) |
| Cardiac defects | 65 (51.5%) |
| Skeletal defects | 48 (38%) |
| Renal defects | 40 (31.7%) |
| Neurodevelopmental delay | 21 (16.6%) |
| Genitourinary defects | 16 (12.7%) |
| Laryngotracheal defects | 13 (10.3%) |
| Gastrointestinal defects | 9 (7%) |
| Limb defects | 7 (5.5%) |
| Neural tube defects | 5 (3.9%) |
| Craniofacial defects | 5 (3.9%) |
| Other | 12 (9.5%) |

Complex EA/TEF cases with additional anomalies have significant burden of de novo coding variants

We identified 249 de novo coding variants in 185 probands with EA/TEF (Table S4). The average number of de novo coding variants per proband is 1.35. We classified LGD and missense variants as protein-altering variants. We identified 191 protein-altering variants across all probands, including 47 in 59 isolated cases and 144 in 126 complex cases. We identified 13 de novo CNVs in 134 probands, including two individuals with heterozygous deletions of 22q11. None of the CNVs overlapped with any of the genes with de novo sequence variants (Table S5). We performed a burden test for enrichment of de novo coding variants in all cases, isolated cases, and complex cases, respectively (Table 2). The number of synonymous variants is close to expectation (fold = 0.94, p = 0.7). Overall, there is a significant burden of de novo protein-altering variants (LGD or missense) (fold = 1.22, p = 4.2 × 10−3). The burden is almost entirely observed in complex cases (fold = 1.35, p = 3.3 × 10−4), as there is no evidence of de novo burden in isolated cases (fold = 0.94, p = 0.68). In complex cases (Table 3), the burden of LGD variants is mostly in genes that are intolerant of loss-of-function variants (defined as gnomAD pLI ≥ 0.5, “constrained genes”; fold = 2.8, p = 2.3 × 10−3), similar to other developmental disorders. The burden of de novo missense variants is also higher in constrained genes compared to non-constrained genes (fold = 1.57 versus 1.22), although it is marginally significant in both constrained and non-constrained gene sets (p = 3.3 × 10−3 and 0.045, respectively). We estimate that about 38 genes carrying these variants in the complex cases are risk genes. Overall, de novo protein-altering variants explain about 30% of PAR of complex EA/TEF.
Table 2

Burden of de novo variants in all cases

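| Variant type | All cases (n = 185): Obs | Exp | Fold | p value | Isolated cases (n = 59): Obs | Exp | Fold | p value | Complex cases (n = 126): Obs | Exp | Fold | p value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Synonymous | 58 | 61.6 | 0.94 | 0.7 | | | | | | | | |
| LGD | 23 | 19.2 | 1.20 | 0.2 | 3 | 6.1 | 0.49 | 0.9 | 20 | 13.1 | 1.53 | 0.045 |
| Missense | 168 | 137.3 | 1.22 | 0.0062 | 44 | 43.8 | 1.0 | 0.51 | 124 | 93.5 | 1.33 | 0.0015 |
| Protein altering (LGD + missense) | 191 | 156.5 | 1.22 | 0.0042 | 47 | 49.9 | 0.94 | 0.68 | 144 | 106.6 | 1.35 | 3.3 × 10−4 |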

Burdens were calculated in all cases, isolated cases, and complex cases. Protein-altering variants were defined as LGD and missense variants. LGD is likely gene disrupting. Obs is observed. Exp is expected.

Table 3

Burden of protein-altering de novo variants in complex cases stratified by gene variant intolerance

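| Gene group | Type of variants | Obs | Exp | Fold | p value |
| --- | --- | --- | --- | --- | --- |
| Constrained genes (pLI ≥ 0.5; n = 4,365) | LGD | 11 | 3.9 | 2.82 | 0.0023 |
| | missense | 44 | 28.1 | 1.57 | 0.0033 |
| | protein altering (LGD + missense) | 55 | 32.0 | 1.72 | 1.4 × 10−4 |
| Non-constrained genes (n = 15,021) | LGD | 9 | 9.2 | 0.98 | 0.57 |
| | missense | 80 | 65.5 | 1.22 | 0.045 |
| | protein altering (LGD + missense) | 89 | 74.6 | 1.19 | 0.057 |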
We assessed de novo protein-coding variants for pathogenicity using the ACMG criteria (Table 4 and Table S6). Of the 185 cases, only two clearly had a molecular diagnosis consistent with the phenotype (EFTUD2 and MYCN, associated with mandibulofacial dysostosis, Guion-Almeida type [OMIM: 610536] and Feingold syndrome [OMIM: 164280], respectively). One individual with a de novo p.G365S variant in SMAD6 with a CADD score of 32 has a phenotype partially overlapping with conditions associated with SMAD6 and may represent an expansion of the phenotypes associated with SMAD6. One individual with a de novo p.T647I variant in GLS with a CADD score of 27.1 has a phenotype that at the age of 2 does not overlap with OMIM: 618339 with infantile cataracts, skin abnormalities, and intellectual disability. Of the 24 cases with de novo LGD variants, 21 were associated with complex phenotypes (Table 4).
Table 4

De novo LGD variants. LGD are likely gene disrupting

| Gene | Variant | Protein | Variant type | CADD score | gnomAD pLI | OMIM | Individual phenotype | ACMG variant class |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CAMK2B | c.558del | p.R187Afs∗16 | LGD | . | 0.74 | autosomal dominant mental retardation (607707) | EA + TEF type C, long gap, extra ribs, congenital scoliosis, developmental delay | pathogenic |
| GTF2I | c.761_762del | p.Q254Rfs∗5 | LGD | . | 1 | none | EA + TEF, atrial septal defect, bilateral inguinal hernia | VUS |
| AMER3 | c.2236C>T | p.R746∗ | LGD | 35 | 0.62 | non-OMIM gene | EA + TEF, clubfeet, pyelectasis, atrial septal defect, developmental delay | VUS |
| EFTUD2 | c.2419del | p.Q807Rfs∗21 | LGD | . | 1 | mandibulofacial dysostosis, Guion-Almeida type (610536) | EA + TEF, clubfeet, pyelectasis, atrial septal defect, developmental delay | pathogenic |
| ARHGAP21 | c.1711C>T | p.R571∗ | LGD | 31 | 1 | none | EA + TEF type C, multicystic dysplastic left kidney, patent ductus arteriosus | VUS |
| ARHGAP17 | c.499C>T | pra9021 | LGD | 37 | 0.02 | none | EA + TEF type C | VUS |
| MYCN | c.153_154insC | p.K52Qfs∗3 | LGD | . | 0.89 | Feingold syndrome (164280) | EA + TEF type C, microcephaly, clinodactyly, developmental delay | pathogenic |
| USP9X | c.4775del | p.G1592Vfs∗4 | LGD | . | 1 | X-linked mental retardation (300968) | EA, extra thumbs and dysmorphic features, rectus abdominis diastasis, severe laryngomalacia, seizures, hypotonia, intellectual disability | pathogenic |
| ADRM1 | c.214-28_223del | . | LGD | . | 1 | none | EA + TEF type C, atrial septum defect, ventricular septum defect, developmental delay | VUS |
| ADD1 | c.1A>G | p.M1? | LGD | 25.1 | 0.99 | none | EA + TEF type C, vertebral anomalies, extra ribs, patent ductus arteriosus, horseshoe kidney, bilateral radial hypoplasia, thumb anomaly, imperforate anus | pathogenic |
| FBXO10 | c.1419+1G>A | . | LGD | 26 | 0 | none | EA + TEF type C, vertebral anomaly, coarctation of aorta | VUS |
| CHERP | c.1306-1G>A | . | LGD | 23.5 | 1 | none | EA + TEF type C, renal ectopia, atrial septal defect, scoliosis | VUS |
| IL32 | c.450_451insC | p.G151Rfs∗13 | LGD | . | 0 | none | EA + TEF long gap, duodenal atresia, small hole in heart | VUS |
| RASA2 | c.82del | p.D28Tfs∗32 | LGD | . | 0 | none | EA + TEF type C, extra ribs, congenital scoliosis, developmental delay | VUS |
| AMACR | c.197dup | p.R67Afs∗75 | LGD | . | 0.03 | alpha-methylacyl-CoA racemase deficiency (AR-614307); bile acid synthesis defect (AR-214950) | EA + TEF type C and developmental delay | pathogenic |
| HACE1 | c.805C>T | p.R269∗ | LGD | 40 | 0 | spastic paraplegia and psychomotor retardation with or without seizures (AR-616756) | EA + TEF type D, ventricular septal defect, and atrial septal defect | pathogenic |
| MANBAL | c.72C>G | p.Y24∗ | LGD | 34 | 0.03 | none | EA + TEF type C, short gap | VUS |
| CLDN10 | c.242T>C | p.M81T | LGD | 25.5 | 0 | HELIX syndrome (AR-617671) | EA + TEF type C, short gap | pathogenic |
| IVD | c.456T>A | p.Y152∗ | LGD | 36 | 0 | isovaleric acidemia (AR-243500) | EA + TEF long gap, hypospadias, minor duplex kidney, developmental delay | pathogenic |
| OR5K4 | c.765del | p.L256Sfs∗22 | LGD | . | 0 | none | EA + TEF type C, short gap, ventricular septal defect, horseshoe kidney | VUS |
| TMPRSS3 | c.1048+1G>A | . | LGD | 26.1 | 0 | deafness (AR-601072) | EA, duodenal atresia, malrotation, annular pancreas, atrial septal defect, polycystic kidney, imperforate anterior anus, missing rib | pathogenic |
| ITPR1 | c.6247-5A>G | . | LGD | . | 1 | spinocerebellar ataxia 15 (606658); spinocerebellar ataxia 29, congenital nonprogressive (117360); Gillespie syndrome (206700) | EA + TEF type C, atrial septum defect, aortic irregularity, anomaly of both thumbs, vertebral anomalies, spina bifida occulta | pathogenic |
| KIF17 | c.2092C>T | p.Q698∗ | LGD | 35 | 0 | none | EA, vertebral anomalies, arterial canal | VUS |
| SCP2 | c.723del | p.F241Lfs∗19 | LGD | . | 0 | leukoencephalopathy with dystonia and motor neuropathy (AR-613724) | EA + TEF type C, left aortic arch, aberrant right subclavian artery, butterfly vertebra, extra ribs, small patent ductus arteriosus | pathogenic |

ACMG is American College of Medical Genetics. VUS is variant of uncertain significance.


Protein-altering variants in complex cases are involved in endosome trafficking and developmental pathways

While complex cases have a significant burden of de novo variants, no one gene harbors more than one LGD or missense de novo variant, making it impossible to identify individual risk genes with sufficient statistical support. To investigate the aggregate properties of risk genes, we performed pathway enrichment analysis on protein-altering de novo variants in complex cases (n = 126). We focused on GO pathways and HPO terms. To ensure sufficient statistical power, we only considered the pathways that are expected to have at least two protein-altering variants by chance in 126 subjects. We compared the observed variants in each pathway to the expected number of variants estimated from background mutation rate and tested the enrichment using a Poisson test. We corrected the multi-testing p values to FWER based on simulations. Eight GO pathways and five HPO terms are enriched with protein-altering de novo variants with FWER ≤ 0.05 (Figure 1A, Table S7). The enriched GO pathways are related to autophagy processes, membrane regulation, and intracellular transport and localization, while the HPO terms are related to other developmental disorders (Figure 1B). A total of 86 genes are involved in at least one significant pathway. Fifty-five genes are involved in endocytosis and transcytosis pathways. Forty-five genes are involved in pathways related to other developmental disorders. The enrichment in GO pathways is mostly driven by de novo missense variants, whereas the enrichment in HPO terms is driven by both LGD and missense variants (Figure 1A). These results remain consistent if we exclude the two cases with the 22q11 deletion (Table S8). These findings are consistent with animal model studies in which pleiotropic signaling pathways and endosome-mediated epithelial remodeling are required for tracheoesophageal morphogenesis.
Figure 1

Pathway enrichment analysis

(A) Volcano plot. Each dot represents a pathway. X axis represents the enrichment rate in log scale, and Y axis is the Poisson test p value in log10 scale. The horizontal dashed line marks family-wise error rate (FWER) of 0.05. Significant pathways (FWER < 0.05) are colored by the percentage of LGD variants, and other pathways are colored gray.

(B) Pathway overlaps. Each circle represents a pathway with FWER < 0.05. Circle size is proportional to the number of observed de novo variants in the pathway; circle color represents the FWER; edge width is determined by the Jaccard index between two pathways, and edge color represents the correlation coefficient of the two pathways under the null in simulations.

We also investigated the functional interactions among the genes (n = 143) with protein-altering de novo variants in complex cases. Based on StringDB (v11.0), the number of protein-protein interactions is significantly larger than expected (PPI enrichment p value = 0.0021; Figure 2).
Figure 2

StringDB of LGD and missense genes in complex cases

Dots are colored to indicate whether the gene is involved in one of the significant pathways. Constrained genes (pLI ≥ 0.5) with LGD mutations are colored black. Edge width represents the StringDB score. Genes not involved in any of the annotation groups are not shown.


CRISPR mutation of candidate risk genes in Xenopus disrupts trachea-esophageal morphogenesis

The underlying biology of trachea and esophageal development is conserved between humans and other terrestrial vertebrates, and animal models have proven effective in assessing candidate risk variants from human affected individuals. We therefore turned to the rapid functional genomics possible in the amphibian Xenopus, which is increasingly being used to model human developmental disorders, including tracheoesophageal birth defects. We tested candidate risk variants by CRISPR-Cas9 mutagenesis of the orthologous genes in Xenopus tropicalis, assaying F0 mutant embryos rather than establishing multi-generational lines since this is faster and more closely mimics the de novo mutations in human EA/TEF individuals. F0 mutagenesis results in embryos with a range of mosaic indel mutations. We found that F0 mutagenesis of sox2, a known EA/TEF risk gene in humans, resulted in a trachea-esophageal phenotype indistinguishable from F2 sox2 germline mutants with a failure of the foregut to separate into distinct esophagus and trachea (Figures 3B and 3C). Moreover, unlike human EA individuals with heterozygous SOX2 mutations, heterozygous mouse and Xenopus sox2 mutants do not exhibit tracheoesophageal defects.
Figure 3

CRISPR-mutation of candidate risk genes in Xenopus disrupts trachea-esophagus morphogenesis

(A–G) Representative confocal microscopy images of NF44 foregut from Xenopus CRISPR mutants. Sox2 F0 CRISPR mutants (C) have the same trachea-esophageal phenotype as sox2−/− F2 germline mutants (B), validating the F0 screen. Compared to control tyr mutations in which the trachea (t) and esophagus (e) have completely separated (A), mutation of 13/18 genes caused a failure of the foregut to separate into distinct trachea and esophagus (D–E and H–L) and/or resulted in a disrupted esophagus with multiple lumens (F and G). Dashed lines indicate the esophagus, trachea, and foregut lumens. Arrows point to a tracheoesophageal cleft. Numbers indicate embryos with a tracheoesophageal defect (TED) phenotype/total injected. Scale bars represent 50 μm.

We prioritized and selected 18 candidate risk genes to test based on (1) the likelihood that the variant in the affected individual was damaging, (2) expression in the Xenopus and mouse fetal foregut, and (3) the predicted function, focusing on genes implicated in endosome trafficking or signaling pathways that pattern the fetal foregut (Table 5). gRNAs were designed to generate loss-of-function (null) mutations or, in a few cases where early embryonic lethality was predicted, an affected individual-like mutation targeting a conserved sequence near the corresponding variant. We genotyped each CRISPR-injected embryo and assessed the trachea-esophageal phenotype in embryos with >40% damaging indel mutations. At 3 days of development (stage NF44), when the trachea and esophagus have normally separated (Figure 3A), tadpoles were fixed and assessed by confocal immunostaining (Figure 3).
Table 5

EA/TEF candidate genes screened in Xenopus.

| Gene | Function | Xenopus TED frequency (n) | Co-occurring defects | % indels | Mutation type |
|---|---|---|---|---|---|
| sox2 | transcription factor | 100% (14) | | 100% (germline) | null |
| sox2 | transcription factor | 65% (17) | microphthalmia | 91% | null |
| disp1 | Hedgehog signaling | 71% (14) | | 62% | null |
| amer3 | Wnt signaling | 62% (21) | | 57% | null |
| eftud2 | mRNA splicing | 55% (22) | microphthalmia | 92% | affected individual-like |
| abra | Rho signaling | 45% (20) | craniofacial | 92% | null |
| itsn1 | endocytosis | 47% (19) | microcephaly | 71% | null |
| itsn1 | endocytosis | 42% (43) | | 72% | affected individual-like |
| apc2 | Wnt signaling | 37% (19) | | 87% | null |
| smad6 | BMP signaling | 32% (31) | craniofacial, heart looping | 68% | affected individual-like |
| arhgap21 | Rho signaling | 30% (23) | craniofacial | 86% | null |
| itgb4 | integrin | 29% (14) | heart looping | 76% | null |
| ap1g2 | endocytosis | 24% (17) | microphthalmia | 92% | null |
| rapgef3 | Ras signaling | 20% (5) | | 91% | null |
| celsr2 | Wnt/PCP signaling | 17% (24) | | 79% | null |
| ptpn14 | RTK signaling | 13% (23) | | 94% | null |
| add1 | cytoskeleton | 8% (24) | craniofacial | 89% | null |
| map4k3 | MAPK signaling | 8% (13) | | 71% | null |
| rab3gap2 | endocytosis | 6% (18) | | 81% | null |
| arhgap17 | Rho signaling | 0% (11) | gut looping | 88% | null |
| pcdh1 | cell-cell adhesion | 0% (7) | | 47% | null |
| tyr (control) | pigmentation | 2% (71) | | n/d | null |

TED, tracheoesophageal defect.

Thirteen of 18 genes screened exhibited defective trachea-esophageal development in >10% of mutated tadpoles (Table 5, Figure 3). The most common phenotype was an LTEC, in which the trachea and esophagus failed to separate near the larynx (e.g., sox2, eftud2, itsn1) (Figures 3C–3E), or a disorganized esophageal epithelium, likely leading to EA later in development (e.g., arhgap21 and disp1) (Figures 3F and 3G). This failure of the embryonic foregut to separate is commonly observed in both mouse and Xenopus embryos with mutations in known EA risk genes. Interestingly, several of the gene mutations also resulted in co-occurring defects in other organ systems resembling those in the human EA/TEF cases, including microphthalmia, microcephaly, and craniofacial malformations. Notably, five genes are implicated in signaling pathways known to regulate foregut patterning (amer3, apc2, celsr2, disp1, smad6), while five other genes are implicated in endocytosis and/or intracellular trafficking (abra, arhgap21, ap1g2, itsn1, rapgef3) (Table 5).
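The 13-of-18 tally can be re-derived from the frequencies in Table 5. The short Python sketch below does so under two assumptions that are an interpretation rather than something the table states: sox2 (the validation mutant) and tyr (the negative control) are not counted among the 18 candidates, and a gene tested with two mutation types (itsn1) is scored by its higher frequency. The function and variable names are illustrative.

```python
# (gene, TED frequency in %, n embryos), transcribed from Table 5.
TABLE_5 = [
    ("sox2", 100, 14), ("sox2", 65, 17), ("disp1", 71, 14),
    ("amer3", 62, 21), ("eftud2", 55, 22), ("abra", 45, 20),
    ("itsn1", 47, 19), ("itsn1", 42, 43), ("apc2", 37, 19),
    ("smad6", 32, 31), ("arhgap21", 30, 23), ("itgb4", 29, 14),
    ("ap1g2", 24, 17), ("rapgef3", 20, 5), ("celsr2", 17, 24),
    ("ptpn14", 13, 23), ("add1", 8, 24), ("map4k3", 8, 13),
    ("rab3gap2", 6, 18), ("arhgap17", 0, 11), ("pcdh1", 0, 7),
    ("tyr", 2, 71),
]

# Assumption: these two are controls, not candidate genes.
CONTROLS = {"sox2", "tyr"}


def genes_above_threshold(rows, threshold_pct=10, controls=CONTROLS):
    """Return (sorted genes above the threshold, number of candidates screened)."""
    best = {}
    for gene, pct, _n in rows:
        if gene in controls:
            continue
        # Keep the highest frequency when a gene appears more than once.
        best[gene] = max(pct, best.get(gene, 0))
    hits = sorted(g for g, pct in best.items() if pct > threshold_pct)
    return hits, len(best)


hits, n_screened = genes_above_threshold(TABLE_5)
print(f"{len(hits)} of {n_screened} genes exceed 10% TED frequency")
print(", ".join(hits))
```

Under these assumptions the count matches the text: 13 of 18 candidates exceed the 10% cutoff, against a 2% background in the tyr control.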

Discussion

In this study, we identified 249 de novo coding variants in 185 EA/TEF individuals, including 23 LGD variants and 168 missense variants. Only two cases were associated with pathogenic variants in genes previously established to cause EA/TEF, suggesting that most of our findings identify associations in genes not previously linked to EA/TEF. Protein-altering de novo variants are enriched in complex cases. Consistent with previous studies of congenital anomalies, those variants showed greater enrichment in constrained genes.

Pathway analysis showed that endocytosis, membrane regulation, and intracellular trafficking-related processes are enriched with protein-altering variants. Considering recent findings in mouse and Xenopus that endosome-mediated epithelial remodeling acts downstream of Hedgehog-Gli signaling to regulate tracheoesophageal morphogenesis, it is possible that disruption of endocytic vesicular trafficking is a common mechanism in many EA/TEF individuals. Endocytic vesicular trafficking is regulated by small GTPases (Rab/Rho) that link endocytosis of membrane-bound vesicles to the actin intracellular transport machinery, which moves vesicles to different subcellular compartments: to lysosomes in the case of autophagy, to different membrane domains in the case of recycling endosomes, from the Golgi and ER to the cell surface for maturation of membrane proteins, and from basal to apical membranes in the case of transcytosis [43–45]. Endocytic trafficking can influence morphogenesis in many ways: by changing cell shape, by dynamic remodeling of cell adhesion and junctional complexes, and by regulation of cell migration or cell signaling [46–50]. Moreover, one of the candidate genes that we tested, ITSN1, encodes a multidomain adaptor protein that coordinates the intracellular transport of endocytic vesicles. ITSN1 is also an autism risk locus and, consistent with the neurodevelopmental disorders also present in the EA/TEF individual, Itsn1 is required for neural dendrite formation in rodents, where it physically interacts with the core endocytic protein Dnm2 and acts as a Cdc42-GEF to promote actin-mediated endosome transport. Thus, the finding that several candidate genes validated in Xenopus are implicated in endocytosis or GTPase activity (abra, arhgap21, ap1g2, itsn1, rapgef3, rab3gap2) suggests that the EA/TEF in the individuals may have been due to disrupted foregut morphogenesis.

Our analysis also revealed that LGD and missense variants in complex cases are involved in other developmental disorders, suggesting disruptions to pleiotropic pathways with roles in multiple organ systems. Indeed, Xenopus mutagenesis validated several genes implicated in signaling pathways known to regulate foregut patterning as well as the development of other organ systems, including amer3, apc2, and celsr2 in the Wnt pathway, smad6 and sox2 in the BMP pathway, and disp1, which is required for secretion of Hedgehog ligands. In the future, as more functional data are collected on EA/TEF risk variants, it may be possible to link distinct signaling pathways or cellular mechanisms such as endocytosis to different co-occurring anomalies in specific organ systems.

Overall, the genetics of EA/TEF is heterogeneous. Although the 126 complex cases are, overall, significantly enriched for de novo protein-altering variants, we did not find any gene with such variants in multiple cases. This indicates that the number of risk genes contributing through de novo variants is large. A sustained effort to expand the cohort with genome sequencing is critical to improve statistical power to identify risk genes in humans. EA/TEF, like most other congenital anomalies, does not yet have a ClinGen expert panel and has not yet had a formal ClinGen evidence review to establish gene-disease validity for the phenotype of EA/TEF. Some syndromes have been assessed by the syndromic disorders expert panel, but none of the assessed conditions is frequently or consistently associated with EA/TEF. Given the apparent genetic heterogeneity and the small number of genomic studies of EA/TEF, it will likely be some time before there is sufficient evidence to classify any genes, beyond perhaps those associated with Fanconi anemia, as having more than limited evidence. However, functional data such as those we present in this manuscript add significantly to the evidence review once there are six or more independent de novo predicted loss-of-function individuals with a similar phenotype.

One interesting observation is that all CRISPR-generated Xenopus mutants had severe tracheoesophageal clefts rather than atresia or fistulas. We expect that this is because the CRISPR editing strategy results in high mutagenesis rates and often loss-of-function alleles, producing more severe tracheoesophageal phenotypes than in the individuals, who have heterozygous variants. Indeed, in almost all reported cases where EA/TEF risk alleles have been modeled in mouse or Xenopus, heterozygous variants do not result in an EA/TEF phenotype, whereas null mutations exhibit a cleft with a single undivided foregut. This difference could be due to hypomorphic human variants versus null alleles in animals. In humans, null alleles in pleiotropic developmental genes are likely to be embryonic lethal and may not be viable to term. An additional factor is likely that animal models are inbred, whereas humans have diverse genetic backgrounds that likely carry modifying alleles. In the future, it will be important to test these possibilities with the exact affected individual alleles in animal models to better assess the genotype-phenotype relationship of these conditions.

Limitations of this study

Because of the limited sample size, our study had limited statistical power to identify individual EA/TEF risk genes from the human genetic data alone. Collaborative human genetic studies of EA/TEF will be necessary to increase sample sizes and better understand the spectrum of phenotypes associated with each gene. If somatic mutations play a significant role in disease pathogenesis, genetic analyses of blood or saliva may be insufficient to detect these genetic variants. However, even with a modest human sample size and only a single individual with a de novo variant in a given gene, we demonstrate the ability to effectively select disease-causing variants and functionally confirm the majority of the candidate genes using a moderate-throughput F0 mutagenesis system. The combination of human genetics and modeling in model organisms is powerful for rare human genetic conditions associated with morphological defects. By examining pathways common across genes, we implicate endocytosis, membrane regulation, and intracellular trafficking in tracheoesophageal development, and these same processes are likely related to other congenital anomalies and neurodevelopmental disorders.

Data and code availability

The code generated during this study is available on GitHub at https://github.com/ShenLab/pathways; the repository contains the code for pathway enrichment analysis of de novo variants with family-wise error rate estimation. The accession number for the raw whole genome sequencing data reported in this paper is dbGaP:phs002161.
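For readers unfamiliar with simulation-based enrichment testing, the general idea can be sketched as follows. This is not the code from the repository above, and it is not claimed to reproduce the authors' procedure: it is a minimal Python illustration in which de novo variants are placed on genes in proportion to assumed background mutation rates, each pathway gets an empirical p value, and a family-wise error rate (FWER) adjustment is obtained with a min-p step over the same simulations. All gene names, rates, pathway definitions, and observed counts are made up, and the function names are hypothetical.

```python
import bisect
import random


def simulate_counts(mutation_rates, pathways, n_variants, n_sims, seed=0):
    """Draw null data sets: place n_variants on genes in proportion to rates."""
    rng = random.Random(seed)
    genes = list(mutation_rates)
    weights = [mutation_rates[g] for g in genes]
    names = list(pathways)
    sims = []
    for _ in range(n_sims):
        draws = rng.choices(genes, weights=weights, k=n_variants)
        sims.append([sum(g in pathways[p] for g in draws) for p in names])
    return names, sims


def enrichment_pvalues(observed, mutation_rates, pathways, n_variants,
                       n_sims=2000, seed=0):
    """Return {pathway: (empirical p, FWER-adjusted p)} via a min-p step."""
    names, sims = simulate_counts(mutation_rates, pathways, n_variants,
                                  n_sims, seed)
    columns = [sorted(row[k] for row in sims) for k in range(len(names))]

    def tail(k, value):
        # Number of simulated counts for pathway k that are >= value.
        return n_sims - bisect.bisect_left(columns[k], value)

    # Smallest null p value across pathways within each simulated data set.
    min_null_p = [min(tail(k, row[k]) / n_sims for k in range(len(names)))
                  for row in sims]

    results = {}
    for k, name in enumerate(names):
        raw = (1 + tail(k, observed[name])) / (1 + n_sims)
        n_as_extreme = sum(p <= raw for p in min_null_p)
        results[name] = (raw, (1 + n_as_extreme) / (1 + n_sims))
    return results


# Toy inputs, for illustration only (not data from the study).
genes = [f"gene{i}" for i in range(100)]
rates = {g: 1e-5 for g in genes}          # hypothetical per-gene rates
toy_pathways = {"pathway_A": set(genes[:10]), "pathway_B": set(genes[10:20])}
toy_observed = {"pathway_A": 15, "pathway_B": 5}

results = enrichment_pvalues(toy_observed, rates, toy_pathways, n_variants=50)
for name, (raw_p, fwer_p) in results.items():
    print(f"{name}: empirical p = {raw_p:.4f}, FWER-adjusted p = {fwer_p:.4f}")
```

Reusing one set of simulated data sets for both the per-pathway p values and the min-p adjustment preserves the correlation between overlapping pathways, which a Bonferroni correction would ignore.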