Glenn S Gerhard, Darrin V Bann, James Broach, David Goldenberg.
Abstract
Next-generation sequencing using exome capture is a common approach used for analysis of familial cancer syndromes. Despite the development of robust computational algorithms, the accrued experience of analyzing exome data sets and published guidelines, the analytical process remains an ad hoc series of important decisions and interpretations that require significant oversight. Processes and tools used for sequence data generation have matured and are standardized to a significant degree. For the remainder of the analytical pipeline, however, the results can be highly dependent on the choices made and careful review of results. We used primary exome sequence data, generously provided by the corresponding author, from a family with highly penetrant familial non-medullary thyroid cancer reported to be caused by HABP2 rs7080536 to review the importance of several key steps in the application of exome sequencing for discovery of new familial cancer genes. Differences in allele frequencies across populations, probabilities of familial segregation, functional impact predictions, corroborating biological support, and inconsistent replication studies can play major roles in influencing interpretation of results. In the case of HABP2 rs7080536 and familial non-medullary thyroid cancer, these factors led to the conclusion of an association that most data and our re-analysis fail to support, although larger studies from diverse populations will be needed to definitively determine its role.
Year: 2017 PMID: 28884020 PMCID: PMC5584869 DOI: 10.1038/s41525-017-0011-x
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 8.617
Allele frequencies for HABP2 rs7080536 in HapMap, 1000 genomes, Exome Variant Server and ExAC databases
| Population | Allele A | Allele G | Genotype A/A | Genotype A/G | Genotype G/G |
|---|---|---|---|---|---|
| HapMap (a) | | | | | |
| CSHL-HAPMAP:HapMap-CEU | 0.018 | 0.982 | | 0.036 | 0.964 |
| CSHL-HAPMAP:HapMap-HCB | 0.012 | 0.988 | | 0.023 | 0.977 |
| CSHL-HAPMAP:HAPMAP-MEX | 0.031 | 0.969 | | 0.061 | 0.939 |
| CSHL-HAPMAP:HAPMAP-CHB | 0.000 | 1.000 | | 0.000 | 1.000 |
| CSHL-HAPMAP:HapMap-JPT | 0.012 | 0.988 | | 0.024 | 0.976 |
| CSHL-HAPMAP:HapMap-YRI | 0.000 | 1.000 | | 0.000 | 1.000 |
| CSHL-HAPMAP:HAPMAP-TSI | 0.023 | 0.977 | | 0.047 | 0.953 |
| CSHL-HAPMAP:HAPMAP-GIH | 0.006 | 0.994 | | 0.011 | 0.989 |
| 1000 Genomes (b) | | | | | |
| 1000GENOMES:phase_3_CDX | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_JPT | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_CEU | 0.020 | 0.980 | | 0.040 | 0.960 |
| 1000GENOMES:phase_3_PUR | 0.019 | 0.981 | | 0.038 | 0.962 |
| 1000GENOMES:phase_3_TSI | 0.014 | 0.986 | | 0.028 | 0.972 |
| 1000GENOMES:phase_3_YRI | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_KHV | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_SAS | 0.004 | 0.996 | | 0.008 | 0.992 |
| 1000GENOMES:phase_3_GIH | 0.010 | 0.990 | | 0.019 | 0.981 |
| 1000GENOMES:phase_3_AMR | 0.014 | 0.986 | | 0.029 | 0.971 |
| 1000GENOMES:phase_3_MXL | 0.008 | 0.992 | | 0.016 | 0.984 |
| 1000GENOMES:phase_3_EUR | 0.027 | 0.973 | 0.002 | 0.050 | 0.948 |
| 1000GENOMES:phase_3_ALL | 0.008 | 0.992 | 0.000 | 0.016 | 0.984 |
| 1000GENOMES:phase_3_PEL | 0.006 | 0.994 | | 0.012 | 0.988 |
| 1000GENOMES:phase_3_GBR | 0.055 | 0.945 | 0.011 | 0.088 | 0.901 |
| 1000GENOMES:phase_3_MSL | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_CHS | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_AFR | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_FIN | 0.035 | 0.965 | | 0.071 | 0.929 |
| 1000GENOMES:phase_3_BEB | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_CHB | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_STU | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_IBS | 0.014 | 0.986 | | 0.028 | 0.972 |
| 1000GENOMES:phase_3_ASW | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_ESN | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_ASN | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_ACB | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_LWK | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_GWD | | 1.000 | | | 1.000 |
| 1000GENOMES:phase_3_PJL | 0.005 | 0.995 | | 0.010 | 0.990 |
| 1000GENOMES:phase_3_ITU | 0.005 | 0.995 | | 0.010 | 0.990 |
| 1000GENOMES:phase_3_CLM | 0.021 | 0.979 | | 0.043 | 0.957 |
| Exome Variant Server (c) | | | | | |
| EVS EuropeanAmericanAlleleCount | 0.038 | 0.961 | 0.001 | 0.075 | 0.923 |
| EVS AfricanAmericanAlleleCount | 0.007 | 0.993 | 0.000 | 0.013 | 0.987 |
| ExAC (d) | | | | | |
| European (non-Finnish) | 0.033 | 0.967 | 0.001 | | |
| European (Finnish) | 0.029 | 0.971 | 0.001 | | |
| South Asian | 0.009 | 0.991 | >0.001 | | |
| East Asian | 0.000 | 1.000 | 0.000 | | |
| African | 0.005 | 0.995 | >0.001 | | |
| Latino | 0.007 | 0.993 | >0.001 | | |
| Other | 0.030 | 0.970 | 0.000 | | |
a http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=rs7080536
b http://browser.1000genomes.org/Homo_sapiens/Variation/Population?r=10:115347546-115348546;source=dbSNP;v=rs7080536;vdb=variation;vf=4906750
c http://evs.gs.washington.edu/EVS/ServletManager?variantType=snp&popID=EuropeanAmerican&popID=AfricanAmerican&SNPSummary.x=29&SNPSummary.y=11&SNPSummary=Display+SNP+Summary
d http://exac.broadinstitute.org/variant/10-115348046-G-A
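The allele columns in the table above follow from the genotype columns: the frequency of allele A equals the homozygous A genotype frequency plus half the heterozygous frequency. The Python sketch below reproduces that arithmetic for two rows of the table (1000 Genomes EUR and GBR) and, for comparison, computes the genotype frequencies expected under Hardy-Weinberg equilibrium. The function names are illustrative assumptions and are not part of the original analysis.

```python
def allele_freqs(f_aa, f_ag, f_gg):
    """Return (freq of allele A, freq of allele G) from genotype frequencies."""
    total = f_aa + f_ag + f_gg          # normalise in case of rounding in the inputs
    p_a = (f_aa + 0.5 * f_ag) / total   # each heterozygote carries one A allele
    return p_a, 1.0 - p_a


def hardy_weinberg(p_a):
    """Expected (A/A, A/G, G/G) genotype frequencies for allele A frequency p_a."""
    q = 1.0 - p_a
    return p_a ** 2, 2 * p_a * q, q ** 2


# Genotype frequencies copied from the 1000 Genomes EUR and GBR rows.
p_eur, q_eur = allele_freqs(0.002, 0.050, 0.948)
p_gbr, q_gbr = allele_freqs(0.011, 0.088, 0.901)
print(round(p_eur, 3), round(q_eur, 3))  # 0.027 0.973
print(round(p_gbr, 3), round(q_gbr, 3))  # 0.055 0.945

# Hardy-Weinberg expectation for the EUR allele frequency.
print([round(x, 3) for x in hardy_weinberg(p_eur)])  # [0.001, 0.053, 0.947]
```

Both computed allele frequencies match the tabulated values to three decimal places, which is a quick consistency check when genotype cells have been shifted or dropped.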
Re-analysis of Gara et al. exome data
| Filtering step | Gara et al. | 1000 G (a) | 1000 G CEU (b) |
|---|---|---|---|
| (1) Variants identified | Not provided | 230,495 | 230,495 |
| (2) SNVs ≤1% in HapMap18 (c) and 1000 Genomes databases | 53,122 | 44,107 | 39,996 |
| (3) SIFT score < 0.05 or not available | 53,120 | 43,554 | 38,516 |
| (4) In exonic region | 3,024 | 6,556 | 4,486 |
| (5) Present in all three initial affected family members | 20 | 709 | 600 |
| (6) Nonsynonymous | 4 | 388 | 284 |
| (7) SNV/indel not present in unaffected/unrelated spouse | 2 | 47 | 35 |
| (8) Present in all seven affected family members based on screening of additional members | 1 | 3 | 2 |
a Global 1000 Genomes AF
b AF in 1000 Genomes CEU (Utah Residents with Northern and Western Ancestry) population
c The HapMap data was only used by Gara et al.
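The filtering steps in the table above form a cascade in which each step keeps only the variants that passed the previous one. The Python sketch below is a toy illustration of such a cascade, not the pipeline used by Gara et al. or in the re-analysis: the `Variant` record, its field names, the sample IDs, and the toy variants are all hypothetical. The only values borrowed from the source are the 1% cutoff, the SIFT threshold of 0.05, and the two allele frequencies 0.008 (1000 Genomes global) and 0.020 (1000 Genomes CEU) listed for rs7080536 in the allele frequency table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    """Toy variant record; every field name here is hypothetical."""
    name: str
    population_af: float      # allele frequency in the chosen reference panel
    sift: Optional[float]     # SIFT score, or None when not available
    exonic: bool
    nonsynonymous: bool
    carriers: frozenset       # IDs of sequenced individuals carrying the variant


def filter_cascade(variants, initial_affected, spouse, all_affected, max_af=0.01):
    """Apply steps 2-8 in table order; return (survivors, per-step counts)."""
    steps = [
        ("AF <= cutoff", lambda v: v.population_af <= max_af),
        ("SIFT < 0.05 or missing", lambda v: v.sift is None or v.sift < 0.05),
        ("exonic", lambda v: v.exonic),
        ("in all initial affected", lambda v: initial_affected <= v.carriers),
        ("nonsynonymous", lambda v: v.nonsynonymous),
        ("absent from spouse", lambda v: spouse not in v.carriers),
        ("in all affected", lambda v: all_affected <= v.carriers),
    ]
    kept = list(variants)
    counts = [("identified", len(kept))]
    for label, keep in steps:
        kept = [v for v in kept if keep(v)]
        counts.append((label, len(kept)))
    return kept, counts


# Hypothetical sample IDs: seven affected relatives and one unaffected spouse "S".
initial = frozenset({"A1", "A2", "A3"})
all_aff = frozenset({"A1", "A2", "A3", "A4", "A5", "A6", "A7"})


def toy(name, af):
    # Placeholder annotations; only the allele frequency differs between panels.
    return Variant(name, af, None, True, True, all_aff)


# v1 uses the rs7080536 frequencies: 0.008 (global) versus 0.020 (CEU).
global_panel = [toy("v1", 0.008), toy("v2", 0.002)]
ceu_panel = [toy("v1", 0.020), toy("v2", 0.002)]

kept_global, counts_global = filter_cascade(global_panel, initial, "S", all_aff)
kept_ceu, counts_ceu = filter_cascade(ceu_panel, initial, "S", all_aff)
print([v.name for v in kept_global])  # ['v1', 'v2']
print([v.name for v in kept_ceu])     # ['v2']
```

In this toy run the same variant survives a 1% cutoff against the global frequency and is removed against the CEU frequency, which illustrates how the choice of reference population alone can change the final candidate list.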