Literature DB >> 29089047

Whole exome sequencing in 342 congenital cardiac left sided lesion cases reveals extensive genetic heterogeneity and complex inheritance patterns.

Alexander H Li1, Neil A Hanchard2, Dieter Furthner3, Susan Fernbach2, Mahshid Azamian2, Annarita Nicosia2, Jill Rosenfeld2, Donna Muzny4, Lisa C A D'Alessandro5, Shaine Morris5, Shalini Jhangiani4, Dhaval R Parekh5, Wayne J Franklin5, Mark Lewin6, Jeffrey A Towbin7, Daniel J Penny5, Charles D Fraser8, James F Martin9, Christine Eng2, James R Lupski2,4,5,10, Richard A Gibbs2,4, Eric Boerwinkle1,4, John W Belmont11,12,13.   

Abstract

BACKGROUND: Left-sided lesions (LSLs) account for an important fraction of severe congenital cardiovascular malformations (CVMs). The genetic contributions to LSLs are complex, and the mutations that cause these malformations span several diverse biological signaling pathways: TGFB, NOTCH, SHH, and more. Here, we use whole exome sequence data generated in 342 LSL cases to identify likely damaging variants in putative candidate CVM genes.
METHODS: Using a series of bioinformatics filters, we focused on genes harboring population-rare, putative loss-of-function (LOF), and predicted damaging variants in 1760 CVM candidate genes constructed a priori from the literature and model organism databases. Gene variants that were not observed in a comparably sequenced control dataset of 5492 samples without severe CVM were then subjected to targeted validation in cases and parents. Whole exome sequencing data from 4593 individuals referred for clinical sequencing were used to bolster evidence for the role of candidate genes in CVMs and LSLs.
RESULTS: Our analyses revealed 28 candidate variants in 27 genes, including 17 genes not previously associated with a human CVM disorder, and revealed diverse patterns of inheritance among LOF carriers, including 9 confirmed de novo variants in both novel and newly described human CVM candidate genes (ACVR1, JARID2, NR2F2, PLRG1, SMURF1) as well as established syndromic CVM genes (KMT2D, NF1, TBX20, ZEB2). We also identified two genes (DNAH5, OFD1) with evidence of recessive and hemizygous inheritance patterns, respectively. Within our clinical cohort, we also observed heterozygous LOF variants in JARID2 and SMAD1 in individuals with cardiac phenotypes, and collectively, carriers of LOF variants in our candidate genes had a four times higher odds of having CVM (odds ratio = 4.0, 95% confidence interval 2.5-6.5).
CONCLUSIONS: Our analytical strategy highlights the utility of bioinformatic resources, including human disease records and model organism phenotyping, in novel gene discovery for rare human disease. The results underscore the extensive genetic heterogeneity underlying non-syndromic LSLs, and posit potential novel candidate genes and complex modes of inheritance in this important group of birth defects.

Entities:  

Keywords:  Cardiac malformation; Congenital heart disease; Developmental disorder; Rare disease; Whole exome sequence

Mesh:

Year:  2017        PMID: 29089047      PMCID: PMC5664429          DOI: 10.1186/s13073-017-0482-5

Source DB:  PubMed          Journal:  Genome Med        ISSN: 1756-994X            Impact factor:   11.117


Background

Congenital cardiovascular malformations (CVMs) occur in 5 to 8 of 1000 live births and have a high mortality rate compared to other birth defects [1, 2]. Left-sided lesion (LSL) disorders comprise 15–20% of severe CVMs [3, 4] and include hypoplastic left heart syndrome (HLHS), aortic valve stenosis (AS), coarctation of the aorta (CoA), interrupted aortic arch type A (IAAA), mitral valve atresia and stenosis (MA, MS), and Shone’s complex (SC). Despite the diversity of the cardiac malformations encompassed by LSLs, this epidemiologically grouped family is also thought to be developmentally unified by altered or obstructed flow through the left side of the heart during embryonic development [5]. Most importantly, LSLs, and HLHS in particular, contribute disproportionately high costs to long-term disability and mortality from CVM, making a better understanding of their underlying etiology an important area of study. Although genetic factors are known to contribute significantly to the development of LSLs, the spectrum and nature of these genetic contributions are complex and heterogeneous, spanning single nucleotide substitutions, chromosome abnormalities including aneuploidies [6], structural variants including copy number variants (CNVs) causing genomic disorders [7], and oligogenic inheritance [8]. More than 30 genes have been implicated in human syndromes that include left-sided heart malformations. These loci include those implicated in HLHS (ZIC3, TBX5, CREBBP, ACVR2B, LEFTY2, DTNA, DHCR7, EVC1-2, FOXF1-FOXC2-FOXL1, and PEX genes), AS (NOTCH1, FOXC1, FGD1), and CoA (JAG1, NOTCH2, NF1, PTPN11, KRAS, SOS1, RAF1, NRAS, BRAF, SHOC2, CBL, ZIC3, CREBBP, MLL2, FGD1, DHCR7, NSDHL, KCNJ2, MKS1) (Additional file 1: Table S1). Many of the associated syndromes have characteristic extra-cardiac features indicating early pleiotropic roles for the underlying molecular pathways in normal organ development. LSLs without overt extra-cardiac abnormalities (apparently isolated or non-syndromic LSL) appear to have a more complex origin. Familial clustering of cases [9] and an increased risk of LSL in first-degree relatives [10] are consistent with single gene and/or oligogenic inheritance; however, the consistent observation of sporadic cases, particularly in the context of more severe LSLs (e.g., HLHS) that may negatively impact reproductive fitness, suggests a potential role for de novo mutations and a potential role for multi-locus variation [11]. De novo mutations have been reported within the broader context of congenital heart defects (CHDs) [12, 13], but their specific role in LSLs remains unknown. This confluence of multiple disease models, potential modes of inheritance, and disparate candidate genes presents a challenge for identifying likely disease-causing alleles in LSLs. The availability of large-scale, public databases that aid in annotating and curating extensive genomic data offers a lens through which to focus on the most robust CVM candidate genes. In order to gain a deeper understanding of the spectrum of genetic variation associated with LSLs, we performed whole exome sequencing (WES) of 342 unrelated LSL cases without known extra-cardiac features, i.e., apparent non-syndromic LSLs. We then focused on putative loss-of-function (LOF) or predicted damaging variants rarely observed in public databases and not observed in a large cohort not enriched for LSLs that was sequenced on a similar platform. We then applied a series of bioinformatic filters to a standing list of potential CHD candidate genes derived from the literature and publicly available databases in order to focus the set of WES-derived genomic candidates. The intersection of case-exclusive, rare, putatively damaging, variation with this a priori LSL candidate gene list formed the foundation of our discovery strategy (Fig. 1). Candidate variants identified were then validated and genotyped in available parents to determine patterns of inheritance. Genes with verified de novo variants were then queried for variants with LOF consequences and algorithmically damaging variants in a large clinical database consisting of individuals with developmental concerns referred for clinical sequencing.
Fig. 1

Discovery strategy for LSL cohort. Imposing a candidate list, constructed independently, of a priori disease gene candidates on rare-variant exome-wide analyses, with integration of pedigree information, facilitates genes discovery. LOF putative loss-of-function variants, DNS damaging non-synonymous variation predicted by > 3 of 6 predictive algorithms, ARIC Atherosclerosis Risk in Communities, ExAC Exome Aggregation Consortium, EVS Exome Variant Server, OP ratio of observed to potential LOF alleles, RVIS residual variation intolerance score; *includes de novo and inherited dominant alleles

Discovery strategy for LSL cohort. Imposing a candidate list, constructed independently, of a priori disease gene candidates on rare-variant exome-wide analyses, with integration of pedigree information, facilitates genes discovery. LOF putative loss-of-function variants, DNS damaging non-synonymous variation predicted by > 3 of 6 predictive algorithms, ARIC Atherosclerosis Risk in Communities, ExAC Exome Aggregation Consortium, EVS Exome Variant Server, OP ratio of observed to potential LOF alleles, RVIS residual variation intolerance score; *includes de novo and inherited dominant alleles

Methods

Subject selection

The discovery sample included 342 unrelated LSL cases without clinically evident extra-cardiac malformations or unexplained developmental concerns at the time of recruitment, which in most cases was during infancy or early childhood. Cases were recruited through the Texas Children’s Hospital (TCH) in Houston, TX and included 42 cases with HLHS originally recruited at Children’s Hospital in Linz, Austria. Parents and affected family members of LSL cases (where available) were also recruited. TCH participants were recruited as an extension of a previously published [9] cohort of LSL cases, and all participants provided informed consent prior to their inclusion. In brief, patients were eligible if they had a characteristic LSL CVM, — AS, CoA, HLHS, MA or MS — or a combination of aortic valve atresia or stenosis with hypoplasia of the left ventricle and aortic arch known as Shone’s complex (SC). We also included individuals if they had bicuspid aortic valve (BAV), ventricular septal defect (VSD), or other associated cardiac defects when they were present in combination with typical LSL malformations (Table 1). Diagnoses were confirmed by echocardiography, cardiac catheterization, or open cardiac surgery. Cases with clear evidence of extra-cardiac involvement at the time of evaluation by the referring physician (including dysmorphic or syndromic cases or those with other birth defects or congenital anomalies) were excluded. In some cases, particularly among medically unstable neonates, a comprehensive physical examination was not possible, and cases were included as long as there was not a strong clinical suspicion of a syndromic diagnosis. All cases were genotyped by chromosome microarray as part of a genome-wide association study of the LSL phenotype [14]. Cases found to have large (> 1 megabase) genomic defects were excluded from further analysis [15]. The same single nucleotide polymorphism (SNP) genotype data were also used to infer ancestry from the first two principal components on multi-dimensional scaling (MDS).
Table 1

Overview of LSL cases

CategoryDescriptionCases (n)
AncestryAfrican American1
Caucasian246
Hispanic95
SexMale233
Female108
CVMAortic valve stenosis (AS)63
Coarctation of the aorta (CoA)130
Hypoplastic left heart syndrome (HLHS)136
Mitral valve stenosis (MS)2
Shone’s complex (SC)6
Other4

This table summarizes demographic and clinical information of 342 LSL probands

CVM cardiovascular malformation

Overview of LSL cases This table summarizes demographic and clinical information of 342 LSL probands CVM cardiovascular malformation

Secondary cohorts

Our analysis utilized two other cohorts as part of our approach. First, we used WES data from 5492 European American (EA) individuals from the population-based Atherosclerosis Risk in Communities (ARIC) study [16] as a comparison group without known severe LSLs or significant congenital CVMs. ARIC WES data [17] were generated on the same sequencing platform using a similar capture reagent and were annotated in the same way as the LSL cohort (see below). The ARIC data were primarily used as a sequencing control dataset to reduce the likelihood of case-only variants arising because of differences in sequencing platforms in public databases. ARIC samples with any of the following criteria were excluded from these analyses: heart failure, major Q-wave, or left ventricular hypertrophy (LVH) by the Cornell definition. In the second stage, genes with de novo LOF variation ascertained in the discovery stage were interrogated for similar damaging/LOF variants in an independent clinical sample of 4750 clinical exomes from the Baylor Genetics Laboratory (http://www.bmgl.com). These individuals had samples submitted to the laboratory for clinical WES for a diverse set of clinical indications, including children with neurodevelopmental concerns or congenital birth defects in keeping with previous reports from this cohort [18, 19]. Only the primary indication and reported clinical data were available for interrogation; cardiovascular lesions, where indicated, were noted, but were not systematically assessed.

Whole exome sequencing

WES was performed on cases and comparison samples with the Illumina HiSeq platform using the Mercury pipeline [20]. ARIC samples were captured using VCRome 2.1 (42 Mb) reagents with an average coverage of 88×, LSL cases were captured using the Human Genome Sequencing Center (HGSC) core (52 Mb), and all analyses were restricted to exonic regions shared between these two reagents. Overall exon coverage and read depth in the two cohorts were highly comparable (> 90% of samples with ≥ 20× coverage at shared sites). In both cohorts, read mapping to Genome Reference Consortium Human Build 37 (GRCh37) was performed with Burrows-Wheeler alignment [21], and allele calling was performed with the Atlas2 suite (Atlas-SNP, Atlas-Indel) [22]. The Variant Call Format (VCF) file contained flagged low-quality variants including SNPs with posterior probability lower than 0.95, total depth of coverage less than 10×, fewer than three variant reads, allelic fraction less than 10%, 99% reads in a single direction, and homozygous reference alleles with < 6× coverage. We increased stringency to remove low-quality indels with a total depth less than 30× and allelic fraction below 30%. WES of LSL cases initially revealed 243,609 variants within the capture regions (239,726 single nucleotide substitutions and 3883 small indels), with indel length ranging from –51 bp (deletion) to +26 bp (insertion). On average, each case presented a total of 14,669 heterozygous and 8321 homozygous non-reference genotypes (Additional file 2: Table S2). Population frequency of variants was determined by comparison to the 1000 Genomes Project [23], Exome Sequencing Project, Exome Aggregation Consortium (ExAC) v0.3 [24], and participants from the ARIC study [16]. Only novel and rare sites were included in these analyses, defining rare as minor allele frequency (MAF) < 0.5% for LOF dominant, recessive, and X-linked segregation analyses.

Functional variant prediction

Variants were annotated to RefSeq gene definitions using ANNOVAR [25]. Conservative LOF annotation was performed by selecting only included premature stopgains in the non-terminal exon, variants disrupting essential splice sites used by all gene isoforms, and frameshift indels similarly mapping to all isoforms. Damaging non-synonymous (DNS) variation was defined as protein-altering substitutions predicted to be damaging by a consensus of at least three out of six prediction scores downloaded via dbNSFP [26] (SIFT, Polyphen2 HDIV, LRT, MutationTaster, MutationAssessor, FATHMM). A Phred-like scaled C-score (Combined Annotation Dependent Depletion (CADD) [27]) was also used to assess pathogenicity of variants (LOF and DNS) but was not used to exclude candidate sites. Residual variation intolerance scores (RVISs) [28] for genes were also interrogated as a means of further prioritizing likely candidate genes.

A priori gene prioritization

In order to facilitate novel gene and variant discovery, we assembled a priori evidence from public resources to identify potential novel LSL genes. We compiled a list of 1712 human genes with a putative role in the development of CVM from a variety of public resources (Additional file 3: Table S3). Genes related to overlapping human disorders including CVM were ascertained from the National Center for Biotechnology Information (NCBI), Online Mendelian Inheritance in Man (OMIM), and literature searches. Relevance to biological pathways and interactions (SHH, NOTCH, TGFB, PITX) was determined using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [29]. Two model organism databases, the Zebrafish Information Network (ZFIN [30]) and Mouse Genome Informatics (MGI [31]), were also used to ascertain genes related to cardiac malformation in model organisms. ZFIN was queried for genes expressed in the zebrafish heart, and MGI was queried for genes causing abnormal cardiac morphology in mouse models (MP:0000266).

Web resources

Web resources utilized in the study are provided below: 1000 Genomes, http://grch37.ensembl.org/index.html ExAC Browser, http://exac.broadinstitute.org/ dbNSFP, http://varianttools.sourceforge.net/Annotation/DbNSFP OMIM, http://www.omim.org/ University of California Santa Cruz (UCSC) Genome Browser, http://genome.ucsc.edu HMZDelfinder, https://github.com/BCM-Lupskilab/HMZDelFinder ClinVar, https://www.ncbi.nlm.nih.gov/clinvar/ Human Gene Mutation Database (HGMD), http://www.hgmd.cf.ac.uk/ac/index.php

Analytical approach

Candidate mutations in the LSL cohort were prioritized using two main criteria: (1) extremely low allele frequency compared to that observed in the ARIC database (minor allele < 0.05%) and public databases (1000 Genomes Project, ESP, and ExAC; minor allele < 0.5%); (2) prediction of a deleterious functional effect including LOF and DNS variation. Initially, we focused on the most damaging class of variation — extremely rare LOF — and conservatively omitted recurrent LOF sites (seen more than two times) due to concerns about potential ethnic stratification. We further enriched for genes likely to contribute to LSLs by quantifying the observed LOF constraint of genes in ARIC. First, for each gene, we counted the number of LOF alleles in ARIC (gene-wise observed LOF). Next, we simulated all potential nucleotide substitutions in exonic regions to determine the number of total potential LOF sites for each gene, and calculated the ratio of observed to potential LOF alleles (OP ratio [32]). Genes with a very low OP ratio (zero, or in the lowest 30th percentile) were considered stronger candidates for disease. In general, OP ratios correlated well with the functional gene constraint (pLI) metric in the Exome Aggregation Consortium Database (Table 1). Finally, we filtered for variants occurring in our set of a priori compiled cardiac genes (Additional file 3: Table S3) in order to identify those with supporting evidence for a role in LSLs or CHDs. Once we had ascertained a set of candidate genes with LOF variation, we next performed a “working gene” expansion to include DNS sites in the same genes. All prioritized variants were validated using an orthogonal platform (dideoxy-Sanger sequencing), and where parental samples for surveyed probands were available, these samples were used to assess Mendelian segregation of the variant: de novo vs inherited dominant (heterozygous variants), recessive (homozygous or compound-heterozygous variants in trans in a given gene), or X-linked (maternally inherited hemizygous variants on the X chromosome). Finally, having defined a set of candidate genes, we interrogated our clinical database for similarly damaging variants in the same genes and compiled a similar list from publicly available clinical sequencing repositories. The analytical strategy is summarized in Fig. 1.

Results

Variant prioritization

WES of LSL cases identified 132,182 (129,329 single nucleotide variants (SNVs), 2853 indels) novel or extremely rare (MAF < 0.5%) variants within the LSL cases. In silico ‘functional annotation’ determined that 4161 of these rare/novel sites were predicted LOF variants (1469 premature stop, 602 splice, and 2090 frameshift, MAF < 0.5%) representing 1660 genes, and 34,100 predicted DNS variants from 11,822 genes. Novel candidate variants for LSLs were subject to stricter, computationally derived, functional criteria and lower frequency thresholds in population comparison groups (Methods). The mean number of rare variants per LSL case after filtering was 54.4 LOF and 118.8 DNS (range LOF = 35–74; range DNS = 88–158, Additional file 4: Figure S1). The intersection of our a priori candidate gene list with validated WES variants from these LSL cases revealed 27 genes harboring rare or case-exclusive LOF alleles, which we prioritized for further study. These alleles were observed in 26 cases (one case presented with two distinct variants, Table 2), or 7.6% (26/342) of our starting cohort, which we next assessed for mode of inheritance (Fig. 1).
Table 2

Discovery genes ascertained via case-exclusive LOF sites with evidence for a role in LSLs

GeneChr.LSLClinical cohortGene supportLOF OP (%ile)pLI
LSL IDModeCVMNon-CVM
ACVR1 2LO1462De novo00MGI, HE, TGFB0.2010.96
JARID2 6LO0189De novo10MGI, PITX200.99
KMT2D 12LO0785De novo713CVM01
NF1 17LO2000De novo14CVM, MGI, HE0.1661
NR2F2 15LO0260De novo00CVM, MGI00.91
PLRG1 4LO0943De novo00MGI00.99
SMURF1 7LO1765De novo10TGFB00.99
TBX20 7LO0746De novo00CVM, MGI, HE00.34
ZEB2 2LO0747De novo64CVM, ZFIN00.99
ARHGEF11 1LO2218Inherited20ZFIN0.1560.99
CCDC91 12LO0222Inherited13PITX203.2E-05
CDH2 18LO1263Inherited22MGI, HE, PITX20.0771
E2F6 2LO0970Inherited00PITX200.55
FGF19 11LO1957Inherited00MGI00.29
GJC1 17LO0192Inherited00MGI, ZFIN00.94
GLRX3 10LO0350Inherited00MGI, HE00.63
LATS2 13LO0238Inherited24MGI0.1190.87
LTBP1 2LO2218Inherited12MGI, TGFB00.53
MNDA 1LO0369Inherited00CHD candidate0.2657.2E-11
PCDHGA2 5LO0605Inherited04HE08.2E-07
PCSK6 15LO1210Inherited23MGI00.01
RAC1 7LO0938Inherited00MGI, HE00.57
DNAH5 5LO1298Recessive630CVM, MGI, HE, PCD0.2995.8E-37
OFD1 XLO1488X-linked13CVM, MGI, ZFIN00.98
BMP1 8LO0605Unknown01MGI00.87
JMJD6 17LO0295Unknown01MGI00.82
ROCK1 18LO0453Unknown15MGI, TGFB, PITX201

Inheritance was determined by Sanger sequencing in cases and parents. The number of samples in the clinical laboratory cohort with any LOF alleles in these genes is also provided along with current CVM status. Unknown inheritance indicates only one parent was available for validation and did not carry the mutation

Gene support symbols are defined as follows: CVM known role in human cardiovascular malformation, MGI overlapping phenotype in mouse, HE human heart expressed, PITX2 related to PITX2 transcription, ZFIN overlapping phenotype in zebrafish, TGFB transforming growth factor beta pathway

LOF OP ratio percentile calculated from ARIC EA participants. Functional gene constraint values (pLI) from the Exome Aggregation Consortium (ExAC) Browser (cleaned_exac_r03_march16_z_pli_rec_null_data) are provided for reference

Discovery genes ascertained via case-exclusive LOF sites with evidence for a role in LSLs Inheritance was determined by Sanger sequencing in cases and parents. The number of samples in the clinical laboratory cohort with any LOF alleles in these genes is also provided along with current CVM status. Unknown inheritance indicates only one parent was available for validation and did not carry the mutation Gene support symbols are defined as follows: CVM known role in human cardiovascular malformation, MGI overlapping phenotype in mouse, HE human heart expressed, PITX2 related to PITX2 transcription, ZFIN overlapping phenotype in zebrafish, TGFB transforming growth factor beta pathway LOF OP ratio percentile calculated from ARIC EA participants. Functional gene constraint values (pLI) from the Exome Aggregation Consortium (ExAC) Browser (cleaned_exac_r03_march16_z_pli_rec_null_data) are provided for reference

Discovery genes

Sanger sequencing of these 27 alleles in parents revealed nine to have arisen de novo, all in different genes (ACVR1, JARID2, KMT2D, NF1, NR2F2, PLRG1, SMURF1, TBX20, and ZEB2) (Table 2). Mutations in KMT2D (MIM 147920), NF1 (MIM 162200), and ZEB2 (MIM 235730) are known to cause human monogenic Mendelian syndromes (Kabuki syndrome, neurofibromatosis type 1, and Mowat-Wilson syndrome, respectively), and cardiac malformations in these syndromes occur in 3–50% of patients. Mutations in NR2F2 (MIM 615779) and TBX20 (MIM 611363) have previously been associated with non-syndromic congenital heart defects [33, 34]. The remaining de novo alleles were found in genes that are not yet well established as human CVM genes, but are known to play a role in critical cardiac development pathways (e.g., TGFbeta signaling: SMURF1, ACVR1; PITX2 transcription factor target: JARID2 [35]). Model organism mutants recapitulate cardiac developmental phenotypes in seven of these de novo genes (see MGI in Table 2); including PLRG1, for which mutant alleles cause malformation of the left ventricle in mouse models [36]. When we expanded our evaluation of these genes to include rare/novel DNS variation (Fig. 1), we identified three additional de novo DNS variants in three genes (KMT2D, TBX20, ZEB2), providing further confirmation of their role in LSLs (Additional file 5: Table S4). Outside of de novo variants, the majority of LOF variants observed in our isolated LSL cohort were found in the heterozygous state and were transmitted from an apparently unaffected parent (Table 2, Additional file 5: Table S4); however, we did not perform echocardiograms in parents, leaving open the possibility of incomplete penetrance or variable expressivity of the phenotype, such as BAV and other asymptomatic cardiac anomalies. In addition, we did not systematically investigate for potential mosaicism in the unaffected parent. We also found evidence for recessive trait inheritance (i.e., homozygous or compound heterozygous alleles) and X-linked inheritance of LOF variants in DNAH5 and OFD1, respectively. A homozygous LOF variant was observed in DNAH5, which is a reported cause of primary ciliary dyskinesia (MIM 608644), an autosomal recessive condition that may involve complex cardiac malformations in a fraction of patients [37]. We also observed an LOF variant in OFD1 (X-linked) that was maternally transmitted to an affected male case. Pathologic variation in OFD1 is the cause of orofaciodigital syndrome (MIM 311200), which can include congenital heart malformations within its phenotypic spectrum [38]. Upon re-evaluation, the individual with this variant did not manifest other features of this syndrome. These observations support recessive and sex-linked forms of LSL and highlight the complex genetic mechanisms underlying these malformations. Having ascertained 27 LSL candidate genes from our a priori candidate list, we then sought to utilize our clinical diagnostic exome sequencing database (Baylor Genetics; see Methods) to provide additional evidence for likely pathogenicity among these genes. In the first instance, we aimed to characterize the relationship between LOF variation in our identified LSL candidate genes and the presence of CVM among clinical cases. This analysis revealed 79 individuals with LOF variation in 17 of our 27 LSL discovery genes (Fig. 2, Additional file 6: Table S5). Nearly half of these individuals (34/79 = 43%) also presented with CVM (Table 2). Conversely, among the 4778 unique case identifiers available for interrogation, 755 (15.8%) had phenotype accession data consistent with a cardiac phenotype; i.e., carriers of LOF variants in our discovery genes are four times more likely to have a cardiac phenotype among clinical cases (odds ratio (OR) = 4.03, 95% confidence interval (CI) 2.5–6.47). When we expanded this analysis to include algorithmically supported damaging variation in the same genes, however, there was no significant enrichment of these variants among cases with reported cardiac disease in their clinical requisition, which primarily reflects the referring diagnosis for clinical WES and not the full clinical phenotype.
Fig. 2

Rare predicted damaging variation in known and novel human cardiovascular malformation (CVM) genes. The x-axis describes counts of CVM cases carrying predicted loss-of-function (LOF) and damaging non-synonymous (DNS) variation with observed population frequency < 0.0005. CVM cases include LSL discovery (n = 342) and clinical cases referred to Baylor Genetics Lab (BG) presenting with cardiac malformations (Additional file 6: Table S5). Known CVM indicates counts of cases with variants in genes previously implicated with human CVM in OMIM; Phenotypic Expansion indicates genes associated with a human disorder not previously associated with CVM; Novel Human CVM genes have not previously been associated with human CVM but were ascertained by our candidate gene strategy (Additional file 3: Table S3)

Rare predicted damaging variation in known and novel human cardiovascular malformation (CVM) genes. The x-axis describes counts of CVM cases carrying predicted loss-of-function (LOF) and damaging non-synonymous (DNS) variation with observed population frequency < 0.0005. CVM cases include LSL discovery (n = 342) and clinical cases referred to Baylor Genetics Lab (BG) presenting with cardiac malformations (Additional file 6: Table S5). Known CVM indicates counts of cases with variants in genes previously implicated with human CVM in OMIM; Phenotypic Expansion indicates genes associated with a human disorder not previously associated with CVM; Novel Human CVM genes have not previously been associated with human CVM but were ascertained by our candidate gene strategy (Additional file 3: Table S3) In the clinical cohort, two candidate genes with observed de novo mutations from our primary analysis — JARID2 and SMURF1 — both had individual LOF carriers with CVM (Table 2), aortic and pulmonic stenosis with JARID2 and dextrocardia with SMURF1. A missense mutation in JARID2 was recently reported to segregate with BAV and a dilated aorta in a single family with LSLs [39]. In murine models, Jarid2 is expressed throughout the embryonic heart and is necessary for normal cardiac development [40, 41] such that Jarid2 null mice show cardiac defects including double outlet right ventricle and ventricular septal defects [42]. SMURF1 (SMAD Specific E3 Ubiquitin Protein Ligase 1) encodes a downstream protein effector of transforming growth factor beta (TGFB) activity, which, in mouse models, is important to atrioventricular valve formation [43]. A de novo mutation in SMURF1 was recently reported in a single individual with craniosynostosis [44]; no cardiac features were reported in the proband. Taken together, these results support a role for damaging variation in JARID2 and SMURF1 as causes of human CVM.

Variant database annotation

Finally, within our samples we evaluated the prevalence of variants previously described as pathogenic in clinical databases. We first catalogued a priori candidate gene variants identified in LSL cases that were also represented in either the HGMD, 2015.1 or NCBI’s ClinVar. LSL cases presented 120 protein-altering variants in 55 distinct a priori candidate genes previously reported as pathogenic in these two databases (Additional file 7: Table S6). Evidence for pathogenicity was strongest at 29 sites that were extremely rare — 14 were case-exclusive and associated with human disorders related to cardiac development, and 15 were severely depleted across all comparison groups Minor Allele Count (MAC < 5). It is interesting to note that 14 of the latter variants were reported at a frequency higher than expected (MAF > 0.5%) in the ARIC comparison group, which was not ascertained for CVM; as a result, we do not consider these variants to be strong candidates for LSLs. However, we report our observed frequencies to inform future studies of pedigree segregation and functional models aimed at assessing the pathogenicity of the variants identified in our case cohort.

Discussion

Case-only cohort designs

The primary focus of our analysis was to identify likely pathogenic candidate variation in biologically plausible CVM genes occurring in individuals with a known CVM. We utilized a conservative, in silico criterion for definition for pathogenic variation alongside a broad bioinformatics-driven collation of genes with relevant biological and molecular function. We demonstrate that this approach can facilitate gene discovery for isolated, non-syndromic cardiac malformation. Novel candidate mutation in nine of our cases (2.6%) was confirmed to have arisen de novo, with inherited LOF mutations being observed in a further 15 patients (4.4%), inclusive of two patients with recessive or X-linked inheritance. In addition, 23 patients (6.7%) carried previously reported pathogenic variants for similar disease conditions involving CVM, which were also severely depleted in the comparison populations. In aggregate, WES of unrelated LSL cases identified candidate mutation in 49 (14.3%) of our LSL cases. Our rare-case-only cohort design integrates well with current high-throughput sequencing pipelines since it utilizes bioinformatic support for prioritizing novel genes and filtering allele frequencies from large comparison cohorts and public databases. Improving integration with bioinformatic resources may allow for automated generation of a priori candidate genes for any rare condition. Moreover, such integration may be enabled further as a structured ontology, such as the Human Phenotype Ontology (HPO [45]) of clinical terms, is applied clinically and harmonization [11] of phenotype ontology between human, model organism, and biological pathway databases continues to improve. Clinical expertise, however, will remain crucial to further curating and refining candidate gene lists for patient care. Comparison of allele frequencies from large population-based resources remains a key aspect of assessing pathogenicity of known and novel candidate mutations, as low frequency variants can have distinct biological effects from higher frequency variants with similar functional annotation [46]. Combining this approach with disease segregation within pedigrees (particularly de novo variation), using Mendelian genomics and evidence for biological function relevant to the phenotype in question, brings our pipeline in line with recent clinical recommendations for inferring pathogenicity of genomic variants [47]. The rapid expansion of both clinical and bioinformatics resources thus bodes well for the future utility of case-only cohort screens for rare disease and suggests that this approach will continue to grow in power alongside population-based sequencing efforts.

LSL gene discovery

By focusing on the intersection of case-exclusive LOF variation and an a priori candidate gene list, we identified novel candidate pathogenic variants in 7.9% (27/342) cases from our starting LSL cohort. Of the 27 resulting high-confidence genes implicated by intersection with our a priori gene list, interrogation of additional cases and clinical cohorts provided further evidence for LSL pathogenicity in nine genes; of these, SMURF1, PLRG1, and ACVR1 have not been previously established in human LSLs or CHDs and emerged as the strongest novel potential disease gene candidates from our analyses. Integration of our candidate genes with individual disease cohorts and large collaborative projects that are ongoing in the USA and Europe holds the promise of confirming yet more of these candidates in the future. For example, our observation of JARID2 variants in both our discovery and clinical cohorts adds to the previous single-family report in the literature [39] and strengthens arguments for JARID2 as a bona fide LSL gene. Similarly, large-scale model organism knockout projects that include efforts to characterize developmental and cardiac phenotypes, such as the Knockout Mouse Phenotyping Program (KOMP2), have great potential to further facilitate the confirmation of putative human disease genes like those offered here. Lastly, expansion of annotated variation beyond LOF to include DNS variation will also aid disease gene discovery in CHDs.

Complex mechanisms underlying non-syndromic LSLs

Our results also offer insights into the complex etiologies of abnormal cardiac development. First, virtually all our high-confidence pathogenic changes were in different genes and spanned different developmental cardiac gene pathways. This is reflective of the broader LSL literature, which has implicated several molecular pathways in the development of CHDs more generally and LSLs in particular [12, 39]. Further underscoring the complexity of LSLs, we also found evidence of additional modes of inheritance beyond the anticipated dominant inheritance, including both X-linked and recessive models in known syndromic genes. De novo dominant mutations have been reported to account for as many as 10% of CVM cases [13]. A general role for this genetic mechanism in CHDs is supported by our observation of de novo mutation in nine of the 342 (2.6%) of cases reported here. Our neonatal ascertainment of cases meant that we were limited in our ability to definitively exclude neurodevelopmental syndromes from our cohort; however, our results are broadly consistent with previous reports [48], intimating that de novo dominant mutation is less common among non-syndromic CHD cases and that the majority of rare LOF variants are inherited. We, and others, have previously established that up to 20% of parents and siblings of LSL probands will show subtle left-sided cardiac abnormalities including BAV and mitral valve leaflet redundancy, in which lesions can go undetected without specific cardiac imaging [49-51]. Despite this, cardiac imaging is not consistently undertaken in parents of children with LSLs, and we did not undertake systematic cardiovascular evaluation on parents recruited to our study; we are therefore unable to distinguish between incomplete penetrance and variable expressivity among the parent LOF carriers in our study. Nonetheless, our results confirm the notion that transmitted LOF (and DNS) variants observed in putative CHD candidates may yet contribute to cardiac developmental phenotypes across a phenotypic spectrum that includes clinically evident LSLs on the severe end of a spectrum of variably manifest CVMs. This adds to the emerging literature of complex interactions in cardiac and other congenital defects in which both cis-acting and trans-acting pathways can modify the expression of the disease [39, 44, 48, 52, 53]. Future studies in larger cohorts that are powered to systematically explore the complex potential mechanisms underlying this observation, including modifier genes and epistatic interactions, will be key to unraveling the complex genetic architecture of CHDs.

Conclusions

Through a rigorous interrogation of known and suspected human CHD genes using available bioinformatic data resources, we have provided important insights into the genetic landscape of an important class of CHD. We find that the genetics underlying the development of LSLs, though complex and heterogeneous, is tractable in the context of large-scale databases, modern-day sequencing technologies, and carefully phenotyped clinical cohorts. This suggests that the expansion of international consortia and like collaboration could pay significant dividends for future studies addressing the most common class of birth defects. Known LSL genes. Excel worksheet. (XLSX 11 kb) Quantitative comparison of whole exome sequencing data from LSL discovery cohort and ARIC cohort. Excel worksheet. (XLSX 8 kb) A priori gene list. Excel worksheet. (XLSX 110 kb) Distribution of rare sites within LSL cases. Adobe PDF. (PDF 208 kb) Details of candidate variants identified in LSL cohort. Excel worksheet. (XLSX 12 kb) Variants in candidate genes identified in clinical look-up cohort (Baylor Genetics). Excel worksheet. (XLSX 79 kb) LSL cohort variants observed in ClinVar and/or HGMD. Excel worksheet. (XLSX 28 kb)
  53 in total

Review 1.  Genetic basis of congenital cardiovascular malformations.

Authors:  Seema R Lalani; John W Belmont
Journal:  Eur J Med Genet       Date:  2014-04-30       Impact factor: 2.708

2.  Familial Incidence of Cardiovascular Malformations in Hypoplastic Left Heart Syndrome.

Authors:  Angela M Kelle; Muhammad Y Qureshi; Timothy M Olson; Benjamin W Eidem; Patrick W O'Leary
Journal:  Am J Cardiol       Date:  2015-09-10       Impact factor: 2.778

3.  Inheritance analysis of congenital left ventricular outflow tract obstruction malformations: Segregation, multiplex relative risk, and heritability.

Authors:  Kim L McBride; Ricardo Pignatelli; Mark Lewin; Trang Ho; Susan Fernbach; Andres Menesses; Wilbur Lam; Suzanne M Leal; Norman Kaplan; Paul Schliekelman; Jeffrey A Towbin; John W Belmont
Journal:  Am J Med Genet A       Date:  2005-04-15       Impact factor: 2.802

4.  Resolution of Disease Phenotypes Resulting from Multilocus Genomic Variation.

Authors:  Jennifer E Posey; Tamar Harel; Pengfei Liu; Jill A Rosenfeld; Regis A James; Zeynep H Coban Akdemir; Magdalena Walkiewicz; Weimin Bi; Rui Xiao; Yan Ding; Fan Xia; Arthur L Beaudet; Donna M Muzny; Richard A Gibbs; Eric Boerwinkle; Christine M Eng; V Reid Sutton; Chad A Shaw; Sharon E Plon; Yaping Yang; James R Lupski
Journal:  N Engl J Med       Date:  2016-12-07       Impact factor: 91.245

5.  Human genomics. Effect of predicted protein-truncating genetic variants on the human transcriptome.

Authors:  Manuel A Rivas; Matti Pirinen; Donald F Conrad; Monkol Lek; Emily K Tsang; Konrad J Karczewski; Julian B Maller; Kimberly R Kukurba; David S DeLuca; Menachem Fromer; Pedro G Ferreira; Kevin S Smith; Rui Zhang; Fengmei Zhao; Eric Banks; Ryan Poplin; Douglas M Ruderfer; Shaun M Purcell; Taru Tukiainen; Eric V Minikel; Peter D Stenson; David N Cooper; Katharine H Huang; Timothy J Sullivan; Jared Nedzel; Carlos D Bustamante; Jin Billy Li; Mark J Daly; Roderic Guigo; Peter Donnelly; Kristin Ardlie; Michael Sammeth; Emmanouil T Dermitzakis; Mark I McCarthy; Stephen B Montgomery; Tuuli Lappalainen; Daniel G MacArthur
Journal:  Science       Date:  2015-05-08       Impact factor: 47.728

6.  Genic intolerance to functional variation and the interpretation of personal genomes.

Authors:  Slavé Petrovski; Quanli Wang; Erin L Heinzen; Andrew S Allen; David B Goldstein
Journal:  PLoS Genet       Date:  2013-08-22       Impact factor: 5.917

7.  Two locus inheritance of non-syndromic midline craniosynostosis via rare SMAD6 and common BMP2 alleles.

Authors:  Andrew T Timberlake; Jungmin Choi; Samir Zaidi; Qiongshi Lu; Carol Nelson-Williams; Eric D Brooks; Kaya Bilguvar; Irina Tikhonova; Shrikant Mane; Jenny F Yang; Rajendra Sawh-Martinez; Sarah Persing; Elizabeth G Zellner; Erin Loring; Carolyn Chuang; Amy Galm; Peter W Hashim; Derek M Steinbacher; Michael L DiLuna; Charles C Duncan; Kevin A Pelphrey; Hongyu Zhao; John A Persing; Richard P Lifton
Journal:  Elife       Date:  2016-09-08       Impact factor: 8.140

Review 8.  The Human Phenotype Ontology in 2017.

Authors:  Sebastian Köhler; Nicole A Vasilevsky; Mark Engelstad; Erin Foster; Julie McMurry; Ségolène Aymé; Gareth Baynam; Susan M Bello; Cornelius F Boerkoel; Kym M Boycott; Michael Brudno; Orion J Buske; Patrick F Chinnery; Valentina Cipriani; Laureen E Connell; Hugh J S Dawkins; Laura E DeMare; Andrew D Devereau; Bert B A de Vries; Helen V Firth; Kathleen Freson; Daniel Greene; Ada Hamosh; Ingo Helbig; Courtney Hum; Johanna A Jähn; Roger James; Roland Krause; Stanley J F Laulederkind; Hanns Lochmüller; Gholson J Lyon; Soichi Ogishima; Annie Olry; Willem H Ouwehand; Nikolas Pontikos; Ana Rath; Franz Schaefer; Richard H Scott; Michael Segal; Panagiotis I Sergouniotis; Richard Sever; Cynthia L Smith; Volker Straub; Rachel Thompson; Catherine Turner; Ernest Turro; Marijcke W M Veltman; Tom Vulliamy; Jing Yu; Julie von Ziegenweidt; Andreas Zankl; Stephan Züchner; Tomasz Zemojtel; Julius O B Jacobsen; Tudor Groza; Damian Smedley; Christopher J Mungall; Melissa Haendel; Peter N Robinson
Journal:  Nucleic Acids Res       Date:  2016-11-28       Impact factor: 16.971

9.  An integrated map of genetic variation from 1,092 human genomes.

Authors:  Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal:  Nature       Date:  2012-11-01       Impact factor: 49.962

10.  The Mouse Genome Database: integration of and access to knowledge about the laboratory mouse.

Authors:  Judith A Blake; Carol J Bult; Janan T Eppig; James A Kadin; Joel E Richardson
Journal:  Nucleic Acids Res       Date:  2013-11-26       Impact factor: 16.971

View more
  18 in total

Review 1.  Genetics in bicuspid aortic valve disease: Where are we?

Authors:  Katia Bravo-Jaimes; Siddharth K Prakash
Journal:  Prog Cardiovasc Dis       Date:  2020-06-27       Impact factor: 8.194

2.  Genome-Wide De Novo Variants in Congenital Heart Disease Are Not Associated With Maternal Diabetes or Obesity.

Authors:  J G Seidman; Christine E Seidman; Sarah U Morton; Alexandre C Pereira; Daniel Quiat; Felix Richter; Alexander Kitaygorodsky; Jacob Hagen; Daniel Bernstein; Martina Brueckner; Elizabeth Goldmuntz; Richard W Kim; Richard P Lifton; George A Porter; Martin Tristani-Firouzi; Wendy K Chung; Amy Roberts; Bruce D Gelb; Yufeng Shen; Jane W Newburger
Journal:  Circ Genom Precis Med       Date:  2022-02-07

3.  Beyond Gene Panels: Whole Exome Sequencing for Diagnosis of Congenital Heart Disease.

Authors:  Sharon L Paige; Priyanka Saha; James R Priest
Journal:  Circ Genom Precis Med       Date:  2018-03

4.  Developmental and temporal characteristics of clonal sperm mosaicism.

Authors:  Xiaoxu Yang; Martin W Breuss; Xin Xu; Danny Antaki; Kiely N James; Valentina Stanley; Laurel L Ball; Renee D George; Sara A Wirth; Beibei Cao; An Nguyen; Jennifer McEvoy-Venneri; Guoliang Chai; Shareef Nahas; Lucitia Van Der Kraan; Yan Ding; Jonathan Sebat; Joseph G Gleeson
Journal:  Cell       Date:  2021-08-12       Impact factor: 66.850

Review 5.  Decoding the Heart through Next Generation Sequencing Approaches.

Authors:  Michal Pawlak; Katarzyna Niescierowicz; Cecilia Lanny Winata
Journal:  Genes (Basel)       Date:  2018-06-07       Impact factor: 4.096

6.  Evaluating the strength of genetic results: Risks and responsibilities.

Authors:  Gregory S Barsh; Gregory M Cooper; Gregory P Copenhaver; Giorgio Sirugo; Hua Tang; Scott M Williams
Journal:  PLoS Genet       Date:  2019-10-11       Impact factor: 5.917

7.  Congenital heart disease risk loci identified by genome-wide association study in European patients.

Authors:  Harald Lahm; Meiwen Jia; Martina Dreßen; Felix Wirth; Nazan Puluca; Ralf Gilsbach; Bernard D Keavney; Julie Cleuziou; Nicole Beck; Olga Bondareva; Elda Dzilic; Melchior Burri; Karl C König; Johannes A Ziegelmüller; Claudia Abou-Ajram; Irina Neb; Zhong Zhang; Stefanie A Doppler; Elisa Mastantuono; Peter Lichtner; Gertrud Eckstein; Jürgen Hörer; Peter Ewert; James R Priest; Lutz Hein; Rüdiger Lange; Thomas Meitinger; Heather J Cordell; Bertram Müller-Myhsok; Markus Krane
Journal:  J Clin Invest       Date:  2021-01-19       Impact factor: 14.808

8.  Novel variants in CDH2 are associated with a new syndrome including Peters anomaly.

Authors:  Linda M Reis; Nathalie S Houssin; Carlos Zamora; Omar Abdul-Rahman; Jennifer M Kalish; Elaine H Zackai; Timothy F Plageman; Elena V Semina
Journal:  Clin Genet       Date:  2019-11-10       Impact factor: 4.296

9.  Loss of Function of the Nuclear Receptor NR2F2, Encoding COUP-TF2, Causes Testis Development and Cardiac Defects in 46,XX Children.

Authors:  Anu Bashamboo; Caroline Eozenou; Anne Jorgensen; Joelle Bignon-Topalovic; Jean-Pierre Siffroi; Capucine Hyon; Attila Tar; Péter Nagy; Janos Sólyom; Zita Halász; Annnabel Paye-Jaouen; Sophie Lambert; David Rodriguez-Buritica; Rita Bertalan; Laetitia Martinerie; Ewa Rajpert-De Meyts; John C Achermann; Ken McElreavey
Journal:  Am J Hum Genet       Date:  2018-02-22       Impact factor: 11.025

10.  The E3 ubiquitin ligase SMURF1 regulates cell-fate specification and outflow tract septation during mammalian heart development.

Authors:  K Koefoed; J Skat-Rørdam; P Andersen; C B Warzecha; M Pye; T A Andersen; K D Ajbro; E Bendsen; M Narimatsu; F Vilhardt; L B Pedersen; J L Wrana; R H Anderson; K Møllgård; S T Christensen; L A Larsen
Journal:  Sci Rep       Date:  2018-06-22       Impact factor: 4.379

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.