Izabela Karbassi, Glenn A Maston, Angela Love, Christina DiVincenzo, Corey D Braastad, Christopher D Elzinga, Alison R Bright, Domenic Previte, Ke Zhang, Charles M Rowland, Michele McCarthy, Jennifer L Lapierre, Felicita Dubois, Katelyn A Medeiros, Sat Dev Batish, Jeffrey Jones, Khalida Liaquat, Carol A Hoffman, Malgorzata Jaremko, Zhenyuan Wang, Weimin Sun, Arlene Buller-Burckle, Charles M Strom, Steven B Keiles, Joseph J Higgins.
Abstract
We developed a rules-based scoring system to classify DNA variants into five categories: pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, and benign. Over 16,500 pathogenicity assessments on 11,894 variants from 338 genes were analyzed based on prediction tools, population frequency, co-occurrence, segregation, and functional studies collected from internal and external sources. Scores were calculated by trained scientists using a quantitative framework that assigned differential weighting to these five types of data. We performed descriptive and comparative statistics on the dataset and tested interobserver concordance among the trained scientists. Private variants, defined as variants found within single families (n = 5,182), were either VUS (80.5%; n = 4,169) or likely pathogenic (19.5%; n = 1,013). The remaining variants (n = 6,712) were VUS (38.4%; n = 2,577), likely benign/benign (34.7%; n = 2,327), or likely pathogenic/pathogenic (26.9%; n = 1,808). Exact agreement between the trained scientists on the final variant score was 98.5% [95% confidence interval (CI) (98.0, 98.9)] with an interobserver consistency of 97% [95% CI (91.5, 99.4)]. Variant scores were stable and showed increasing odds of being in agreement with new data when re-evaluated periodically. This carefully curated, standardized variant pathogenicity scoring system provides reliable pathogenicity scores for DNA variants encountered in a clinical laboratory setting.
Keywords: clinical laboratory techniques; databases; decision support techniques; mutation; nucleic acid; polymorphism
Year: 2015 PMID: 26467025 PMCID: PMC4737317 DOI: 10.1002/humu.22918
Source DB: PubMed Journal: Hum Mutat ISSN: 1059-7794 Impact factor: 4.878
Figure 1. Multiple lines of evidence used in the variant pathogenicity scoring system. Interpretation categories are aligned to the American College of Medical Genetics (ACMG) recommendations [Richards et al., 2015]. A midpoint score of 4 (yellow) does not favor pathogenicity or benignity. Benign scores are shown in green and pathogenic scores in red. Variants of uncertain significance (VUS) have three subclasses: a score of 3 suggests the variant is benign, a score of 5 suggests the variant is pathogenic, and a score of 4 does not favor either side of the pathogenicity scale.
Summary of Variants by Type and Pathogenicity Score
| Variant type | 1 (Benign) | 2 (Likely benign) | 3 (VUS suggesting benign) | 4 (VUS) | 5 (VUS suggesting pathogenic) | 6 (Likely pathogenic) | 7 (Pathogenic) | Total |
|---|---|---|---|---|---|---|---|---|
| Missense | 468 | 200 | 643 | 2,248 | 1,307 | 322 | 552 | 5,740 |
| Coding synonymous | 846 | 363 | 1,529 | 77 | 26 | 1 | 4 | 2,846 |
| Intronic | 296 | 108 | 453 | 81 | 124 | 22 | 23 | 1,107 |
| Frameshift | 1 | 0 | 0 | 3 | 1 | 642 | 195 | 842 |
| Nonsense | 0 | 0 | 0 | 3 | 1 | 328 | 286 | 618 |
| Consensus splice site | 0 | 0 | 0 | 1 | 0 | 278 | 127 | 406 |
| In‐frame insertion or deletion | 8 | 2 | 5 | 162 | 30 | 13 | 26 | 246 |
| UTR | 33 | 2 | 8 | 44 | 0 | 1 | 1 | 89 |
| Total | 1,652 | 675 | 2,638 | 2,619 | 1,489 | 1,607 | 1,214 | 11,894 |
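The seven-point totals in the table can be collapsed into the three broad groups quoted in the abstract. The following is a minimal sketch, not the authors' code: the names `TOTALS_BY_SCORE`, `GROUPS`, and `collapse` are illustrative, and the grouping of scores 3 to 5 as VUS is an assumption taken from the Figure 1 caption.

```python
# Column totals copied from the "Total" row of the table (score -> variant count).
TOTALS_BY_SCORE = {1: 1652, 2: 675, 3: 2638, 4: 2619, 5: 1489, 6: 1607, 7: 1214}

# Assumed grouping, following the Figure 1 caption:
# scores 1-2 are benign-side, 3-5 are VUS subclasses, 6-7 are pathogenic-side.
GROUPS = {
    "likely benign/benign": (1, 2),
    "VUS": (3, 4, 5),
    "likely pathogenic/pathogenic": (6, 7),
}


def collapse(totals):
    """Sum the per-score counts into the three broad interpretation groups."""
    return {name: sum(totals[s] for s in scores) for name, scores in GROUPS.items()}


grouped = collapse(TOTALS_BY_SCORE)
n_all = sum(TOTALS_BY_SCORE.values())

for name, count in grouped.items():
    print(f"{name}: {count} ({100 * count / n_all:.1f}% of {n_all})")
```

Under this grouping the collapsed counts agree with the abstract: the VUS group equals 4,169 + 2,577, the benign group equals 2,327, and the pathogenic group equals 1,013 + 1,808.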
Figure 2. Likelihood of variant score changes as a function of new data. The percentage of cases changing classification categories is depicted by green (decreases) and red (increases) arrows; cases where the variant score stayed the same are depicted in yellow. The last column shows the odds of a variant score increasing to a more pathogenic score. The number of re-scoring events (n) in each scoring category is shown in the first column. Variants scored as 2 were lowered in 38.9% and raised in 0.8% of re-evaluations. Variants scored as 3 were lowered in 33.2% and raised in 2.1%. Variants scored as 2 or 3 had a significant (P < 0.0001) tendency to move down in scoring to classification as benign or benign/likely benign, respectively. Variants with a prior score of 5 or 6 were more likely (odds ratios of 2.88 and 7.56, respectively) to increase to more pathogenic scores (P < 0.0001).
The Distribution of Assigned Variant Scores Compared with the Results of Published Functional Studies (n = 597)
| Published effect on protein function | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total |
|---|---|---|---|---|---|---|---|---|
| Damaging | 0 | 10 | 8 | 16 | 36 | 143 | 275 | 488 |
| NOT damaging | 21 | 22 | 30 | 3 | 1 | 0 | 2 | 79 |
| Conflicting | 5 | 2 | 6 | 8 | 3 | 0 | 6 | 30 |
Variant scores were lowered in 34 of 488 variants with damaging results (7.0%).
Six out of 79 variants with functional study results of “not damaging” (7.6%) had a score of 4 or higher.
Functional studies with conflicting information were scored in the pathogenic range in nine cases (30%).
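The counts in the three notes above can be recovered from the table rows. This is a rough sketch with hypothetical names (`ROWS`, `count_in_range`); the score ranges used for each note (4 or lower, 4 or higher, 5 to 7) are assumptions chosen because they reproduce the reported counts, not definitions stated in the source.

```python
# Table rows: counts for assigned scores 1..7, in order.
ROWS = {
    "damaging": [0, 10, 8, 16, 36, 143, 275],
    "not damaging": [21, 22, 30, 3, 1, 0, 2],
    "conflicting": [5, 2, 6, 8, 3, 0, 6],
}


def count_in_range(counts, lo, hi):
    """Sum the counts for scores lo..hi inclusive (scores are 1-based)."""
    return sum(counts[lo - 1:hi])


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumed ranges: each one reproduces the count given in the matching note.
damaging_low = count_in_range(ROWS["damaging"], 1, 4)            # score 4 or lower
not_damaging_high = count_in_range(ROWS["not damaging"], 4, 7)   # score 4 or higher
conflicting_high = count_in_range(ROWS["conflicting"], 5, 7)     # scores 5 to 7

print(damaging_low, percent(damaging_low, sum(ROWS["damaging"])))
print(not_damaging_high, percent(not_damaging_high, sum(ROWS["not damaging"])))
print(conflicting_high, percent(conflicting_high, sum(ROWS["conflicting"])))
```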
SIFT and PolyPhen Predictions for Missense Variants Classified as Benign (n = 353) and Pathogenic (n = 363)
| Classification | SIFT tolerated | SIFT NOT tolerated | PolyPhen benign | PolyPhen damaging | Both benign | Both damaging |
|---|---|---|---|---|---|---|
| Pathogenic | 37 | 326 | 28 | 335 | 12 | 310 |
| Benign | 243 | 110 | 276 | 77 | 220 | 54 |
PolyPhen predictions of “probably damaging” and “possibly damaging” are combined into the damaging category.
The “Both” columns count variants for which the SIFT and PolyPhen predictions agree.
The Performance of SIFT and PolyPhen Predictions Based on the Concordance between the Prediction Tools and Variant Classification
| Parameter | SIFT | PolyPhen | Both agree |
|---|---|---|---|
| Sensitivity | 0.898 | 0.923 | 0.963 |
| Specificity | 0.688 | 0.782 | 0.803 |
| PPV | 0.748 | 0.813 | 0.852 |
| NPV | 0.868 | 0.908 | 0.948 |
| FDR | 0.252 | 0.187 | 0.148 |
| Accuracy | 0.795 | 0.853 | 0.889 |
Calculations were based on the data summarized in Table 3. The numbers of true positives and true negatives were based on the concordance between the pathogenicity scores and the SIFT and PolyPhen predictions.
Abbreviations: PPV, positive predictive value; NPV, negative predictive value; FDR, false discovery rate.
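The performance figures follow from the standard confusion-matrix definitions applied to the SIFT and PolyPhen counts. A minimal sketch, assuming that a "damaging" or "not tolerated" prediction on a pathogenic variant counts as a true positive and a "benign" or "tolerated" prediction on a benign variant counts as a true negative; the names `Confusion`, `metrics`, and `TOOLS` are illustrative and not from the source.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # pathogenic variants predicted damaging / not tolerated
    fn: int  # pathogenic variants predicted benign / tolerated
    tn: int  # benign variants predicted benign / tolerated
    fp: int  # benign variants predicted damaging / not tolerated


def metrics(c):
    """Standard confusion-matrix metrics, rounded to three decimals."""
    total = c.tp + c.fn + c.tn + c.fp
    values = {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "fdr": c.fp / (c.tp + c.fp),
        "accuracy": (c.tp + c.tn) / total,
    }
    return {name: round(value, 3) for name, value in values.items()}


# Counts copied from the SIFT and PolyPhen prediction table.
TOOLS = {
    "SIFT": Confusion(tp=326, fn=37, tn=243, fp=110),
    "PolyPhen": Confusion(tp=335, fn=28, tn=276, fp=77),
    "Both agree": Confusion(tp=310, fn=12, tn=220, fp=54),
}

for tool, counts in TOOLS.items():
    print(tool, metrics(counts))
```

Note that the "Both agree" column uses a smaller denominator (596 variants rather than 716), because variants on which the two tools disagree are excluded.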