Allison J Cox, Fillan Grady, Gabriel Velez, Vinit B Mahajan, Polly J Ferguson, Andrew Kitchen, Benjamin W Darbro, Alexander G Bassuk.
Abstract
Compound heterozygotes occur when different variants at the same locus on both maternal and paternal chromosomes produce a recessive trait. Here we present the tool VarCount for the quantification of variants at the individual level. We used VarCount to characterize compound heterozygous coding variants in patients with epileptic encephalopathy and in the 1000 Genomes Project participants. The Epi4k data contain variants identified by whole exome sequencing in patients with either Lennox-Gastaut Syndrome (LGS) or infantile spasms (IS), as well as their parents. We queried the Epi4k dataset (264 trios) and the phased 1000 Genomes Project data (2504 participants) for recessive variants. To assess enrichment, transcript counts were compared between the Epi4k and 1000 Genomes Project participants using minor allele frequency (MAF) cutoffs of 0.5% and 1.0%, including either all ancestries or only probands of European ancestry. In the Epi4k participants, we found enrichment for rare, compound heterozygous variants in six genes, including three involved in neuronal growth and development: PRTG (p = 0.00086, 1% MAF, combined ancestries), TNC (p = 0.022, 1% MAF, combined ancestries) and MACF1 (p = 0.0245, 0.5% MAF, EU ancestry). Because of the total number of transcripts considered in these analyses, the enrichment detected was not significant after correction for multiple testing, and higher-powered or prospective studies are necessary to validate the candidacy of these genes. However, PRTG, TNC and MACF1 are potential novel recessive epilepsy genes, and our results highlight that compound heterozygous variants should be considered in sporadic epilepsy.
Keywords: bioinformatics; compound heterozygous; epilepsy; genetics
Year: 2019 PMID: 31190668 PMCID: PMC7045018 DOI: 10.1017/S0016672319000065
Source DB: PubMed Journal: Genet Res (Camb) ISSN: 0016-6723 Impact factor: 1.588
Fig. 1. Flow diagram for the processing and analysis of variant lists. VCF files are annotated and filtered using SnpSift/SnpEff. The final VCF, along with the parameter and sample information files, is input to VarCount. The input files are processed to recode minor and major alleles when the MAF is >0.5 and to count the number of individuals with variants that qualify based on information in the parameter file. The final output lists, for every transcript or gene, the number of individuals with qualified variants in that locus (counts.text), which individuals have the variant(s) (countfile.txt), and which variants are harboured by each individual (output.txt).
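The recoding and counting steps described in the caption can be illustrated with a short Python sketch. This is not VarCount's code: the record layout (`id`, `transcript`, `alt_af`, phased `genotypes` strings such as `"1|0"`), the function names, the sample data and the 1% cutoff are all assumptions made for illustration. A sample is counted for a transcript only when it carries rare heterozygous minor alleles on both haplotypes, at different sites.

```python
from collections import defaultdict


def minor_allele_view(alt_af, genotype):
    """Return (MAF, phased genotype) with 1 always meaning the minor allele."""
    hap1, hap2 = (int(a) for a in genotype.split("|"))
    if alt_af > 0.5:
        # ALT is the major allele here, so swap the 0/1 coding.
        return 1.0 - alt_af, (1 - hap1, 1 - hap2)
    return alt_af, (hap1, hap2)


def count_compound_hets(variants, maf_cutoff=0.01):
    """Map each transcript to the sorted samples that are compound heterozygous."""
    # (transcript, sample) -> [sites with minor allele on hap1, ... on hap2]
    hap_sites = defaultdict(lambda: [set(), set()])
    for v in variants:
        for sample, gt in v["genotypes"].items():
            maf, haps = minor_allele_view(v["alt_af"], gt)
            if maf >= maf_cutoff or sum(haps) != 1:
                continue  # too common, or not heterozygous at this site
            hap_sites[(v["transcript"], sample)][haps.index(1)].add(v["id"])

    carriers = defaultdict(set)
    for (transcript, sample), (on_hap1, on_hap2) in hap_sites.items():
        if on_hap1 and on_hap2:  # one rare allele in trans on each haplotype
            carriers[transcript].add(sample)
    return {t: sorted(s) for t, s in carriers.items()}


# Hypothetical toy data: v2 is reported with the ALT allele as the major allele.
variants = [
    {"id": "v1", "transcript": "T1", "alt_af": 0.004,
     "genotypes": {"s1": "1|0", "s2": "1|0", "s3": "0|0"}},
    {"id": "v2", "transcript": "T1", "alt_af": 0.997,
     "genotypes": {"s1": "1|0", "s2": "0|1", "s3": "1|1"}},
    {"id": "v3", "transcript": "T1", "alt_af": 0.2,
     "genotypes": {"s1": "0|0", "s2": "0|1", "s3": "0|0"}},
]

print(count_compound_hets(variants))  # {'T1': ['s1']}
```

In this toy input, s1 carries the two rare alleles on opposite haplotypes and is counted. s2 carries both rare alleles on the same haplotype, and its only allele on the other haplotype (v3) is too common to qualify, so s2 is not counted.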
Rare (<0.5% and <1.0% minor allele frequency) compound heterozygous variants in Epi4k participants.
| Gene | Transcript | Sex | Phen.ᵃ | Anc.ᵇ | Chr:bp* | dbSNP ID | REF/ALT | ExAC AFᶜ | ExAC AFᶜ | Peptide change | Epi4k # (1%), EU (y/n)ᵈ: All (y/n) | 1 kg # (1%), EU (y/n): All (y/n) | p-value (1%), EU:All | Epi4k # (0.5%), EU (y/n): All (y/n) | 1 kg # (0.5%), EU (y/n): All (y/n) | p-value (0.5%), EU:All |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | ENST000 | F | IS | EU | 15:56032666 | rs373423650 | T/C | 9.94E-05 | 1.65E-04 | E104G | 3/204: 3/261 | 0/503: 0/2504 |  | 2/205: 2/262 | 0/503: 0/2504 | 0.0847: |
|  |  |  |  |  | 15:56032666 | rs185716584 | C/G | 0.003022 | 0.004315 | E104D |  |  |  |  |  |  |
|  |  | M | LGS | EU | 15:56032665 | rs372777171 | C/A | 8.28E-06 | 1.50E-05 | E575* |  |  |  |  |  |  |
|  |  |  |  |  | 15:55965698 | rs185716584 | C/G | 0.003022 | 0.004315 | E104D |  |  |  |  |  |  |
|  |  | M | LGS | EU | 15:56032665 | rs35718474 | G/A | 0.003033 | 0.006701 | P13S |  |  |  |  |  |  |
|  |  |  |  |  | 15:56035093 | rs148011047 | T/G | 0.002611 | 0.003608 | K977T |  |  |  |  |  |  |
|  | ENST000 | M | IS | EU | 15:55916703 | rs200118898 | C/G | 1.09E-04 | 1.98E-04 | S143W | 2/205: 2/262 | 0/503: 0/2504 | 0.0847: 0.0091 | 2/205: 2/262 | 0/503: 0/2504 | 0.292: |
|  |  |  |  |  | 22:31091324 | rs201298398 | G/A | 3.20E-04 | 5.19E-04 | R262Q |  |  |  |  |  |  |
|  |  | M | LGS | E Asia | 22:31137288 | rs576508023 | G/C | 8.42E-06 | 0 | G115R |  |  |  |  |  |  |
|  |  |  |  |  | 22:31091239 | NA | C/T | 1.66E-05 | 0 | R383W |  |  |  |  |  |  |
|  | ENST000 | F | IS | EU | 22:31283452 | rs529824818 | C/T | 8.24E-06 | 0 | C543Y | 2/205: 2/262 | 0/503: 2/2502 | 0.0847: 0.0478 | 2/205: 2/262 | 0/503: 1/2502 | 0.0847: 0.0255 |
|  |  |  |  |  | 16:48242388 | rs199839251 | C/T | 8.29E-06 | 1.51E-05 | E273K |  |  |  |  |  |  |
|  |  | M | LGS | EU | 16:48250159 | NA | C/A | NA | NA | R889M |  |  |  |  |  |  |
|  |  |  |  |  | 16:48226471 | NA | C/A | NA | NA | G636V |  |  |  |  |  |  |
|  | ENST000 | M | IS | EU | 16:48234362 | rs200401362 | C/T | 8.24E-06 | 0 | G861R | 2/205: 3/261 | 0/503: 4/2500 | 0.0847: 0.022 | 1/206: 2/262 | 0/503: 2/2502 | 0.29: 0.0478 |
|  |  |  |  |  | 9:117840315 | rs117058692 | C/G | 7.77E-04 | 0.001248 | G171R |  |  |  |  |  |  |
|  |  |  |  |  | 9:117849499 | rs143586851 | C/T | 2.23E-04 | 3.90E-04 | A39T |  |  |  |  |  |  |
|  |  | M | IS | C/S Asia | 9:117853183 | rs149986851 | C/A | 2.06E-04 | 3.75E-04 | G203V |  |  |  |  |  |  |
|  |  |  |  |  | 9:117849402 | rs139280264 | G/A | 8.51E-04 | 0.001247 | R1066C |  |  |  |  |  |  |
|  |  | M | IS | EU | 9:117835900 | rs371055558 | C/T | 2.54E-05 | 3.07E-05 | G576S |  |  |  |  |  |  |
|  |  |  |  |  | 9:117848284 | rs144032672 | C/T | 0.001336 | 0.002026 | G210S |  |  |  |  |  |  |
|  | ENST000 | M | IS | EU | 9:117849382 | NA | G/A | 1.65E-05 | 3.00E-05 | R4344Q | 3/204: 3/261 | 1/502: 11/2493 | 0.0769: 0.142 | 3/204: 3/261 | 0/503: 6/2498 |  |
|  |  |  |  |  | 1:39900231 | rs141949859 | G/T | 6.34E-04 | 1.12E-03 | V3535F |  |  |  |  |  |  |
|  |  | M | IS | EU | 1:39853797 | rs145271544 | G/T | NA | NA | A3264S |  |  |  |  |  |  |
|  |  |  |  |  | 1:39852984 | rs138819868 | T/G | 2.46E-03 | 3.85E-03 | F5885L |  |  |  |  |  |  |
|  |  | M | LGS | EU | 1:39951304 | NA | A/G | 8.28E-06 | 0 | I1066V |  |  |  |  |  |  |
|  |  |  |  |  | 1:39800136 | NA | G/C | 8.264 | 1.50E-05 | W4967C |  |  |  |  |  |  |
|  | ENST000 | M | IS | ME | 1:39910474 | rs145751447 | G/A | 9.17E-05 | 0 | D1289N | 0/207: 2/262 | 1/502: 6/2498 | 0.99: 0.173 | 0/207: 2/262 | 1/502: 2/2502 | 0.99: 0.0478 |
|  |  |  |  |  | 3:52549439 | rs147953260 | G/A | 0.001512 | 0.002524 | R1872H |  |  |  |  |  |  |
|  |  | F | IS | C/S Asia | 3:52554531 | rs189303343 | A/G | 4.05E-05 | 3.66E-05 | I590V |  |  |  |  |  |  |
|  |  |  |  |  | 3:52540204 | NA | C/T | NA | NA | R2351W |  |  |  |  |  |  |
ᵃPhen = phenotype; ᵇAnc = ancestry; ᶜAF = allele frequency; ᵈy/n corresponds to yes/no counts of individuals with qualifying variants.
*bp (base pair position) in hg19/Build37.
Ancestries: EU = European, E Asia = East Asia, C/S Asia = Central/South Asia, ME = Middle East. Phenotypes: IS = Infantile spasms, LGS = Lennox-Gastaut syndrome.
p-values in bold are the most significant for the specific analysis.
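This record does not name the statistical test behind the p-values. As an illustration only, the sketch below assumes a one-sided Fisher's exact test on the yes/no carrier counts; the function name and argument order are hypothetical. Under that assumption, the computation reproduces the tabulated EU value of 0.0847 for 2/205 versus 0/503 and the abstract's p = 0.00086 for PRTG (3/261 versus 0/2504).

```python
from math import comb


def fisher_one_sided(case_yes, case_no, ctrl_yes, ctrl_no):
    """One-sided Fisher's exact p-value for enrichment of carriers among cases.

    Returns the hypergeometric probability of seeing at least `case_yes`
    carriers among the cases, given the total number of carriers observed.
    """
    n_case = case_yes + case_no
    n_ctrl = ctrl_yes + ctrl_no
    carriers = case_yes + ctrl_yes
    denom = comb(n_case + n_ctrl, carriers)
    upper = min(carriers, n_case)
    tail = sum(
        comb(n_case, k) * comb(n_ctrl, carriers - k)
        for k in range(case_yes, upper + 1)
    )
    return tail / denom


# Counts taken from the table and abstract (yes/no in Epi4k, yes/no in 1000 Genomes).
print(round(fisher_one_sided(2, 205, 0, 503), 4))   # 0.0847
print(round(fisher_one_sided(3, 261, 0, 2504), 5))  # 0.00086
```

The exact tail sum uses integer binomial coefficients, so no statistics library is needed for counts of this size.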
Fig. 2. PRTG compound heterozygous mutations in Epi4k probands. (a) Theoretical model of the human PRTG structure spanning the plasma membrane indicating mutation locations in each child. The three pairs of in trans mutations, indicated in red, were found using a <1% MAF threshold. (b) Schematic representation of PRTG functional domains. Multiple sequence alignment of the PRTG Ig-1 domain. The E104 residue is 100% conserved across seven species. (c) Top: Electrostatic potential surface of PRTG calculated in APBS. Bottom: Close-up of the PRTG electrostatic potential surface at the site of mutation. The p.Glu104Gly mutation leads to a loss of negative charge, which may disrupt interactions with putative PRTG binding partners. The p.Glu104Asp mutation does not lead to a change in charge or electrostatic potential.
De novo variants in Epi4k probands with compound heterozygous variants in PRTG, TNC or MACF1.
| Gene | Missense z-scoreᵃ | pLIᵇ | CH variants | Sex | Phenotype | gene w/ | Variant type | PolyPhen-2ᶜ | CADDᵈ | Missense z-scoreᵃ | pLIᵇ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PRTG | –0.92 | 0.02 | E104G, E104D | F | IS |  | Missense | P | 24 | 1.77 | 0.93 |
|  |  |  | E104D, E575X | M | LGS |  | Stop-gained | NA | 35 | 4.42 | 1.00 |
|  |  |  | P13S, K977T | M | LGS |  | Missense | D | 25.1 | 1.80 | 0.43 |
|  |  |  |  |  |  |  | 3′ UTR | NA |  | –0.69 | 1.00 |
| TNC | –0.14 | 0.00 | G861R, G171R, A39T | M | IS |  | Missense | B | 23.6 | 5.82 | 1.00 |
|  |  |  |  |  |  |  | Splice donor | NA | 23.9 | 0.22 | 0.00 |
|  |  |  | G203V, R1066C | M | IS | NA | – | – | – | – | – |
|  |  |  | G576S, G210S | M | IS | NA | – | – | – | – | – |
| MACF1 | 2.63 | 1.00 | R4344Q, V3535F | M | IS |  | 5′ UTR | NA | NA | 1.11 | 0.55 |
|  |  |  |  |  |  |  | 3′ UTR | NA | NA | 3.28 | 0.91 |
|  |  |  | A3264S, F5885L | M | IS |  | 3′ UTR | NA | NA | 2.70 | 0.87 |
|  |  |  |  |  |  |  | Missense | B | 17.9 | 0.90 | 0.97 |
|  |  |  |  |  |  |  | Synonymous | NA | NA | 1.15 | 0.00 |
|  |  |  | I1066V, W4967C | M | LGS |  | Missense | P | 9.4 | NA | NA |
ᵃz-score is a measure of tolerance to missense variants, based on the ratio of expected to identified variants; ᵇpLI is the probability that a gene is intolerant to loss-of-function variants; ᶜPolyPhen-2 is a prediction of a missense variant's impact on protein structure and function: B = benign, P = possibly damaging, D = damaging (Adzhubei et al., 2010); ᵈCADD = phred-scaled score of Combined Annotation Dependent Depletion, a measure of the deleteriousness of a SNP or INDEL (Kircher et al., 2014).
Phenotypes: IS = Infantile spasms, LGS = Lennox-Gastaut syndrome.