Emily R Holzinger, Qing Li, Margaret M Parker, Jacqueline B Hetmanski, Mary L Marazita, Elisabeth Mangold, Kerstin U Ludwig, Margaret A Taub, Ferdouse Begum, Jeffrey C Murray, Hasan Albacha-Hejazi, Khalid Alqosayer, Giath Al-Souki, Abdullatiff Albasha Hejazi, Alan F Scott, Terri H Beaty, Joan E Bailey-Wilson.
Abstract
BACKGROUND: Nonsyndromic oral clefts are craniofacial malformations, which include cleft lip with or without cleft palate. The etiology of oral clefts is complex, with both genetic and environmental factors contributing to risk. Previous genome-wide association studies (GWAS) have identified multiple loci with small effects; however, many causal variants remain elusive.
Keywords: DNA sequence data analysis; genetic risk variants; oral clefts; rare variants; statistical genetics
Year: 2017 PMID: 28944239 PMCID: PMC5606860 DOI: 10.1002/mgg3.320
Source DB: PubMed Journal: Mol Genet Genomic Med ISSN: 2324-9269 Impact factor: 2.183
Number of affected individuals with nonsyndromic oral clefts and DNA sequence data
| Population | Individuals (families) with WES data | Individuals (families) with WGS data |
|---|---|---|
| Syrian | 22 (10) | 37 (14) |
| Filipino | 22 (11) | 76 (18) |
| Indian | 26 (12) | 0 |
| German | 38 (19) | 0 |
Multiple affected individuals were sequenced from multiplex families.
Three Syrian individuals from two families (total of six) have both WES and WGS data.
Seventy affected and six unaffected individuals.
Figure 1. Flowchart showing single variant analysis steps for WES and WGS data.
Variants passing family‐specific analysis filter in WES data for all cohorts
| Pop. | Fam. ID | Gene | Chr. | BP | A | R | AA | AR | RR | Location | Func. | 1000G Freq. | Predicted damaging |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Syrian | 1 | CASP9 | 1 | 15831171 | C | T | 3 | 0 | 0 | Exonic | NS | – | 2 |
| Syrian | 3 | | 19 | 603971 | G | A | 2 | 1 | 0 | Intronic | – | 0.006 | – |
| Syrian | 6 | | 2 | 233408449 | T | A | 2 | 0 | 0 | Intronic | – | 0.0012 | – |
| Syrian | 7 | | 14 | 92792313 | G | A | 2 | 0 | 0 | Exonic | NS | 0.0004 | 2 |
| Syrian | 7 | | 14 | 93179134 | T | C | 2 | 0 | 0 | Intronic | – | 0.0002 | – |
| Syrian | 7 | | 14 | 94776036 | A | G | 2 | 0 | 0 | Intronic | – | 0.0002 | – |
| Syrian | 7 | | 14 | 100126748 | A | G | 2 | 0 | 0 | Intronic | – | 0.0006 | – |
| Syrian | 9 | | 14 | 52734696 | A | G | 2 | 0 | 0 | Exonic | NS | 0.0006 | 0 |
| Syrian | 10 | | 19 | 17889669 | A | G | 2 | 0 | 0 | Exonic | NS | – | 3 |
| Syrian | 10 | | 21 | 46228597 | T | C | 2 | 0 | 0 | Intronic | – | – | – |
| Syrian | 10 | | 21 | 47572892 | G | A | 2 | 0 | 0 | Exonic | NS | – | 1 |
| German | 7 | | 13 | 109792825 | T | C | 2 | 0 | 0 | Exonic | NS | 0.0058 | 0 |
| German | 7 | | 17 | 74534592 | C | A | 2 | 0 | 0 | Upstream | – | 0.0002 | – |
| German | 10 | | 19 | 740436 | A | G | 2 | 0 | 0 | Exonic | NS | 0.0018 | 0 |
| German | 12 | | 15 | 41245692 | G | A | 2 | 0 | 0 | Exonic | NS | 0.0008 | 1 |
| German | 20 | | 19 | 1081558 | A | G | 2 | 0 | 0 | Exonic | NS | – | 7 |
| Indian | 60 | | 4 | 967071 | A | G | 2 | 0 | 0 | Exonic | NS | 0.027 | 5 |
| Filipino | 8 | | 3 | 195595358 | T | A | 2 | 0 | 0 | Exonic | NS | 0.0004 | 4 |
| Filipino | 10 | | 6 | 33059894 | G | A | 2 | 0 | 0 | Intergenic | – | 0.013 | – |
We show the family ID for each population (Fam. ID). For each variant we give the gene name, chromosome (Chr.), base pair (BP), alternate allele (A), reference allele (R), number of individuals homozygous for the alternate allele (AA), number of heterozygous individuals (AR), number of individuals homozygous for the reference allele (RR), the gene location (Location), the function of the variant if it is exonic (NS, nonsynonymous; S, synonymous), the frequency of the alternate allele for all populations in 1000 Genomes (1000G Freq.), the frequency from the Greater Middle East Variome Project (GME Freq.), and the number of sources that predict the base pair change to be damaging out of the nine present in wAnnovar. The dashes (–) represent the following: Func. column: variants in non‐exonic regions with no defined function; 1000G Freq. column: not present in 1000 Genomes; Predicted damaging column: no pathogenicity predicted.
Nonsynonymous and potentially damaging variants from Syrian Family 1 in WGS data
| Gene | Chr. | BP | A | R | AA | AR | RR | Loc. | Func. | 1000G Freq. | GME Freq. | QG Freq. | Predicted damaging |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CASP9 | 1 | 15831171 | C | T | 4 | 3 | 1 | Exonic | NS | – | – | – | 2 |
| FAT4 | 4 | 126367606 | T | G | 2 | 5 | 1 | Exonic | NS | 0.003 | 0.006 | 0.002 | 3 |
| FAT4 | 4 | 126336105 | G | A | 2 | 5 | 1 | Exonic | NS | 0.002 | 0.007 | 0.002 | 0 |
| FAT4 | 4 | 126400922 | T | C | 2 | 5 | 1 | Exonic | NS | 0.004 | 0.006 | – | 0 |
For each variant we give the gene name, chromosome (Chr.), base pair (BP), alternate allele (A), reference allele (R), number of individuals homozygous for the alternate allele (AA), number of heterozygous individuals (AR), number of individuals homozygous for the reference allele (RR), the gene location (Location), the function of the variant (NS, nonsynonymous; S, synonymous), the frequency of the alternate allele for all populations in 1000 Genomes (1000G Freq.), the frequency from the Greater Middle East Variome Project (GME Freq.), the frequency from the Qatar Genome data (QG Freq.), and the number of sources that predict the base pair change to be damaging out of the nine present in wAnnovar. The dashes (–) represent variants that were not present in the specific frequency database.
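The dash convention in the frequency columns can be handled explicitly when the table is read programmatically. The following is a minimal sketch, not the authors' pipeline: it treats a dash as "absent from that database" and reports the highest frequency observed across 1000 Genomes, GME, and Qatar Genome for each variant in the table above. The names `parse_freq` and `max_observed_freq` are hypothetical helpers introduced for illustration.

```python
from typing import Iterable, Optional

# Rows copied from the Syrian Family 1 WGS table:
# (chromosome, base pair, 1000G Freq., GME Freq., QG Freq.)
ROWS = [
    ("1", 15831171, "–", "–", "–"),
    ("4", 126367606, "0.003", "0.006", "0.002"),
    ("4", 126336105, "0.002", "0.007", "0.002"),
    ("4", 126400922, "0.004", "0.006", "–"),
]


def parse_freq(cell: str) -> Optional[float]:
    """Convert a table cell to a float; a dash means 'not in this database'."""
    cell = cell.strip()
    if cell in {"–", "-", ""}:
        return None
    return float(cell)


def max_observed_freq(cells: Iterable[str]) -> Optional[float]:
    """Highest frequency across databases, or None if absent from all of them."""
    values = [f for f in map(parse_freq, cells) if f is not None]
    return max(values) if values else None


# Key each variant as "chromosome:base pair", the notation used in the genotype table.
max_freq = {f"{chrom}:{bp}": max_observed_freq(freqs) for chrom, bp, *freqs in ROWS}

for variant, freq in max_freq.items():
    print(variant, "absent from all three databases" if freq is None else freq)
```

Keeping `None` distinct from `0.0` preserves the difference between "not present in the database" and "observed at zero frequency", which matters if a frequency threshold is applied later.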
Variants identified in the compound heterozygous analysis in Syrian Family 1 in the WES data
| Gene | Chr. | BP | A | R | AA | AR | RR | Location | Function | 1000G Freq. | GME Freq. | QG Freq. | Predicted damaging |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| COL7A1 | 3 | 48602623 | A | G | 0 | 3 | 0 | Exonic | NS | 0.001 | 0.005 | 0.003 | 5 |
| COL7A1 | 3 | 48620046 | A | G | 0 | 3 | 0 | Exonic | NS | 0.001 | 0.005 | 0.002 | 7 |
| | 3 | 48677114 | G | C | 0 | 3 | 0 | Exonic | NS | 0.019 | 0.011 | 0.013 | 4 |
| | 3 | 48691197 | T | C | 0 | 3 | 0 | Exonic | NS | 0.005 | 0.005 | 0.005 | 0 |
| TKT | 3 | 53267183 | T | C | 0 | 3 | 0 | Exonic | NS | 0.002 | 0.011 | 0.010 | 3 |
| TKT | 3 | 53269028 | T | G | 0 | 3 | 0 | Exonic | NS | 0.002 | 0.011 | 0.010 | 1 |
| | 11 | 7060948 | T | C | 0 | 3 | 0 | Exonic | NS | 0.014 | 0.040 | 0.046 | 0 |
| | 11 | 7083610 | A | T | 0 | 3 | 0 | Exonic | NS | 0.015 | 0.039 | 0.048 | 5 |
| | 11 | 7083620 | C | T | 0 | 3 | 0 | Exonic | NS | 0.022 | 0.043 | 0.051 | 0 |
For each variant we give the gene name, chromosome (Chr.), base pair (BP), alternate allele (A), reference allele (R), number of individuals homozygous for the alternate allele (AA), number of heterozygous individuals (AR), number of individuals homozygous for the reference allele (RR), the gene location (Location), the function of the variant (NS, nonsynonymous; S, synonymous), the frequency of the alternate allele for all populations in 1000 Genomes (1000G Freq.), the frequency from the Greater Middle East Variome Project (GME Freq.), and the number of sources that predict the base pair change to be damaging out of the nine present in wAnnovar.
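The pattern in this table (AA = 0, AR = 3, RR = 0 at several variants in the same gene) can be expressed as a simple screen. The sketch below is an illustration under stated assumptions, not the authors' method: it assumes a gene is a compound heterozygous candidate when at least two of its variants are heterozygous in all three sequenced individuals, and it uses only the four variants whose gene names (COL7A1, TKT) appear in the genotype table for Syrian Family 1. The function `candidate_genes` is a hypothetical helper. Genotype counts alone cannot show that the two variants lie on different parental haplotypes, so the result is a candidate list only.

```python
from collections import defaultdict

# Three Syrian Family 1 individuals have WES data (AA + AR + RR = 3 in every row).
N_SEQUENCED = 3

# (gene, chromosome, base pair, AA, AR, RR) for variants with a known gene name.
VARIANTS = [
    ("COL7A1", 3, 48602623, 0, 3, 0),
    ("COL7A1", 3, 48620046, 0, 3, 0),
    ("TKT", 3, 53267183, 0, 3, 0),
    ("TKT", 3, 53269028, 0, 3, 0),
]


def candidate_genes(variants, n_sequenced, min_sites=2):
    """Return genes with >= min_sites variants heterozygous in every individual.

    Assumed criterion for illustration; phase is not checked.
    """
    by_gene = defaultdict(list)
    for gene, chrom, bp, aa, ar, rr in variants:
        if aa == 0 and ar == n_sequenced:
            by_gene[gene].append((chrom, bp))
    return {g: sites for g, sites in by_gene.items() if len(sites) >= min_sites}


candidates = candidate_genes(VARIANTS, N_SEQUENCED)
for gene, sites in candidates.items():
    print(gene, sites)
```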
Genotypes for the eight individuals with WGS data in Syrian Family 1 for the WES and WGS validation and discovery results
| Ind. ID | Phen. | Single variant analyses | | | | Compound Het. analyses | | | |
|---|---|---|---|---|---|---|---|---|---|
| | | 1:15831171 (CASP9) | 4:126367606 (FAT4) | 4:126336105 (FAT4) | 4:126400922 (FAT4) | 3:48602623 (COL7A1) | 3:48620046 (COL7A1) | 3:53267183 (TKT) | 3:53269028 (TKT) |
| 1 (111)* | L.CL | AA | AR | AR | AR | AR | AR | AR | AR |
| 2 (118)* | B.CL, M.CP | AA | RR | RR | RR | AR | AR | AR | AR |
| 3 (125)* | R.CL | AA | AR | AR | AR | AR | AR | AR | AR |
| 4 (38) | L.CL | AA | AR | AR | AR | AR | AR | AR | AR |
| 5 (114) | B.CL | AR | AR | AR | AR | RR | RR | RR | RR |
| 6 (129) | L.CP‐I | RR | AA | AA | AA | AR | AR | AR | AR |
| 7 (150) | R.CL | AR | AA | AA | AA | RR | RR | RR | RR |
| 8 (157) | L.CL, M.CP | AR | AR | AR | AR | RR | RR | RR | RR |
The first three individuals (111, 118 and 125) have WES and WGS data, as indicated by the asterisk. We identify each variant using Chromosome:Base Pair along with the gene name in parentheses. Homozygous for the alternate allele = AA, heterozygous = AR, homozygous for the reference allele = RR. We also show the specific cleft phenotypes for each individual (Phen. column), where L. = left, R. = right, B. = bilateral, M. = midline, I = incomplete, CL = cleft lip, and CP = cleft palate.
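The per-individual genotypes can be tallied to cross-check the AA/AR/RR counts reported in the summary tables. The sketch below is a consistency check on the values transcribed from the genotype table, with hypothetical names (`GENOTYPES`, `tally`); it is not part of the original analysis.

```python
from collections import Counter

# Column order of the genotype table (Chromosome:Base Pair).
VARIANTS = [
    "1:15831171",   # CASP9
    "4:126367606",  # FAT4
    "4:126336105",  # FAT4
    "4:126400922",  # FAT4
    "3:48602623",   # COL7A1
    "3:48620046",   # COL7A1
    "3:53267183",   # TKT
    "3:53269028",   # TKT
]

# Genotypes per individual ID, transcribed row by row from the table.
GENOTYPES = {
    111: "AA AR AR AR AR AR AR AR",
    118: "AA RR RR RR AR AR AR AR",
    125: "AA AR AR AR AR AR AR AR",
    38:  "AA AR AR AR AR AR AR AR",
    114: "AR AR AR AR RR RR RR RR",
    129: "RR AA AA AA AR AR AR AR",
    150: "AR AA AA AA RR RR RR RR",
    157: "AR AR AR AR RR RR RR RR",
}


def tally(genotypes, variants):
    """Count AA / AR / RR calls per variant across all individuals."""
    counts = {v: Counter() for v in variants}
    for calls in genotypes.values():
        for variant, call in zip(variants, calls.split()):
            counts[variant][call] += 1
    return counts


counts = tally(GENOTYPES, VARIANTS)
for variant in VARIANTS:
    c = counts[variant]
    print(f"{variant}: AA={c['AA']} AR={c['AR']} RR={c['RR']}")
```

For the CASP9 variant this gives AA = 4, AR = 3, RR = 1, and for each FAT4 variant AA = 2, AR = 5, RR = 1, which agrees with the counts in the WGS table for Syrian Family 1.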