Reuben J Pengelly, Thahmina Alom, Zijian Zhang, David Hunt, Sarah Ennis, Andrew Collins.
Abstract
Next generation sequencing is transforming clinical medicine and genome research, providing a powerful route to establishing molecular diagnoses for genetic conditions; however, challenges remain given the volume and complexity of genetic variation. A number of methods integrate patient phenotype and genotypic data to prioritise variants as potentially causal. Some methods have a clinical focus while others are more research-oriented. With clinical applications in mind, we compare results from alternative methods using 21 exomes for which the disease causal variant has been previously established through traditional clinical evaluation. In this case series we find that the PhenIX program is the most effective, ranking the true causal variant at between 1 and 10 in 85% of these cases. This is a significantly higher proportion than the combined results from five alternative methods tested (p = 0.003). The next best method is Exomiser (hiPHIVE), in which the causal variant is ranked 1-10 in 25% of cases. The widely different targets of these methods (more clinical focus, considering known Mendelian genes, in PhenIX, versus gene discovery in Exomiser) are perhaps not fully appreciated but may impact strongly on their utility for molecular diagnosis using clinical exome data.
Year: 2017 PMID: 29044180 PMCID: PMC5647373 DOI: 10.1038/s41598-017-13841-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Some phenotype-based variant prediction tools.
| Tool | Concept | Authors' benchmarks | References and software |
|---|---|---|---|
| | Integrated phenotypic and interactome analysis using model organisms (mouse, zebrafish) and human clinical data along with protein-protein interaction network data. Focussed on finding new disease genes. | Known disease-gene associations were the top hit in 97% of simulated exomes. | |
| | Integrates predicted impact of variants with haploinsufficiency and phenotype-specific gene prioritisation. Uses random forest learning trained on the Human Gene Mutation Database (HGMD). | Outperforms classical deleteriousness scores (PolyPhen, SIFT, MutationTaster). | |
| | Reduces high dimensional phenotypic and genotypic data using semantic similarity and multidimensional scaling. Interface can be used to convert clinical notes to HPO terms. | Clinical variants given median rank of 2, causal variants in top 1% of candidates (47 cases). Outperformed Phen-Gen, eXtasy, and Exomiser (hiPHIVE) for clinical variants. | |
| | Integrates human and model organism phenotypes, functional annotations, curated pathways, cellular localizations and anatomical terms using supervised learning. Exploits multiple ontologies and experimental interaction data. | Outperformed ExomeWalker. | |
| | Semantic matching of symptoms against disorder database following Phenomizer. | Causal coding variants ranked first in 88% of cases (simulation) and in 8 of 11 patient samples. Outperformed VAAST, eXtasy and Phevor by 13–58% and PHIVE by 13–16%. | |
| | Interrogates only known Mendelian genes and uses semantic similarity matching in Phenomizer. | In tests on 52 patient samples with known mutations, the correct gene achieved a mean rank of 2.1. | |
| | Uses ontologies to re-prioritise candidates identified by other variant prioritisation tools such as SIFT, PhastCons and VAAST to identify alleles not previously linked to disease. | Improved performance of tools such as SIFT and VAAST. | |
Rank positions of causal variants by method.
| Case | Condition | | | | | | |
|---|---|---|---|---|---|---|---|
| 1 | COFFIN-SIRIS SYNDROME | 2 | 95 | 132 | 1037 | 6013 | 6184 |
| 2 | EPILEPTIC ENCEPHALOPATHY | 1 | 85 | 104 | — | 1458 | 8508 |
| 3 | MYOCLONIC DYSTONIA | 7 | — | — | — | 239 | 9304 |
| 4 | MENTAL RETARDATION, AUTOSOMAL RECESSIVE 15 | 106 | 14 | 10 | 1004 | 2230 | 4511 |
| 5 | CONGENITAL FIBER-TYPE DISPROPORTION MYOPATHY | 1 | 68 | 85 | 74 | 422 | 8624 |
| 6 | EPILEPTIC ENCEPHALOPATHY, EARLY INFANTILE, 4 | — | — | — | — | — | — |
| 7 | SPASTIC ATAXIA, CHARLEVOIX-SAGUENAY TYPE | 3 | 89 | 77 | 308 | 3264 | 5032 |
| 8 | ANGELMAN SYNDROME | 12 | 74 | 77 | — | 178 | 8728 |
| 9 | PTEN HAMARTOMA TUMOR SYNDROME | 1 | 1 | 1 | — | 126 | 8822 |
| 10 | SPINAL MUSCULAR ATROPHY, LOWER EXTREMITY, AUTOSOMAL DOMINANT | 10 | 85 | 86 | 20 | 1759 | 4687 |
| 11 | DRAVET SYNDROME | 2 | 27 | 53 | 72 | 250 | 8188 |
| 12 | TREACHER COLLINS SYNDROME 3 | 9 | 99 | 92 | 45 | 259 | 8858 |
| 13 | MICROPHTHALMIA, ISOLATED 1 | 5 | 60 | 70 | 73 | — | — |
| 14 | KLEEFSTRA SYNDROME | 10 | 88 | 95 | — | — | — |
| 15 | CRANIOFRONTONASAL SYNDROME | 1 | 1 | 1 | — | 254 | 8997 |
| 16 | COSTELLO SYNDROME | 7 | 1 | 1 | 52 | 1 | 9328 |
| 17 | NOONAN SYNDROME 6 | 1 | 82 | 83 | — | 1 | 9328 |
| 18 | LEUKOENCEPHALOPATHY WITH VANISHING WHITE MATTER; VWM | 11 | — | 144 | — | 30 | 9216 |
| 19 | MUENKE SYNDROME | 1 | 1 | 1 | 50 | 7 | 9281 |
| 20 | ALPERS SYNDROME | 1 | 89 | 98 | 402 | 14 | 8876 |
| 21 | PSEUDOACHONDROPLASIA | 1 | 78 | 90 | 53 | 10 | 9310 |
‘—’ – not ranked.
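The "ranked between 1 and 10" proportions quoted in the abstract can be recomputed from the rank table. The sketch below is illustrative only: the labels `col1` to `col6` are placeholders for the six rank columns (the method names in the table header are not available in this copy), and using the number of cases ranked by at least one method as the denominator is an assumption, not a procedure stated by the authors.

```python
# Ranks transcribed from the rank table, one list per rank column,
# ordered by case number 1..21. None stands for a dash (not ranked).
# The column labels are hypothetical placeholders, not method names.
RANKS = {
    "col1": [2, 1, 7, 106, 1, None, 3, 12, 1, 10, 2, 9, 5, 10, 1, 7, 1, 11, 1, 1, 1],
    "col2": [95, 85, None, 14, 68, None, 89, 74, 1, 85, 27, 99, 60, 88, 1, 1, 82,
             None, 1, 89, 78],
    "col3": [132, 104, None, 10, 85, None, 77, 77, 1, 86, 53, 92, 70, 95, 1, 1, 83,
             144, 1, 98, 90],
    "col4": [1037, None, None, 1004, 74, None, 308, None, None, 20, 72, 45, 73,
             None, None, 52, None, None, 50, 402, 53],
    "col5": [6013, 1458, 239, 2230, 422, None, 3264, 178, 126, 1759, 250, 259,
             None, None, 254, 1, 1, 30, 7, 14, 10],
    "col6": [6184, 8508, 9304, 4511, 8624, None, 5032, 8728, 8822, 4687, 8188,
             8858, None, None, 8997, 9328, 9328, 9216, 9281, 8876, 9310],
}


def top_n_cases(ranks, n=10):
    """Return the 1-based case numbers whose rank lies between 1 and n."""
    return {case for case, rank in enumerate(ranks, start=1)
            if rank is not None and rank <= n}


def ranked_by_any(table):
    """Return the cases that received a rank in at least one column."""
    return {case
            for ranks in table.values()
            for case, rank in enumerate(ranks, start=1)
            if rank is not None}


def top_n_summary(table, n=10):
    """Map each column to (count in top n, share of cases ranked by any column)."""
    denominator = len(ranked_by_any(table))  # assumed denominator, see lead-in
    summary = {}
    for name, ranks in table.items():
        count = len(top_n_cases(ranks, n))
        summary[name] = (count, count / denominator)
    return summary


if __name__ == "__main__":
    for name, (count, share) in top_n_summary(RANKS).items():
        print(f"{name}: {count} cases in top 10 ({share:.0%})")
```

Under that assumed denominator (20 cases, since case 6 is unranked in every column), the first rank column gives 17 of 20 (85%) and the third gives 5 of 20 (25%), which agrees with the figures quoted in the abstract for PhenIX and Exomiser (hiPHIVE).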
Figure 1. Ranks for causal variants by category. Chart showing the number of cases in different rank classes for each method.
Figure 2. Intersection of pathogenic variants ranked within the top 10 by each software tool.
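The overlap summarised in Figure 2 can be approximated from the rank table by intersecting, for each pair of rank columns, the sets of cases ranked 1 to 10. This is a rough sketch rather than the authors' procedure: the sets below were read off the rank table by hand, and the labels `col1` to `col6` are hypothetical placeholders for the six rank columns, whose method names are not available in this copy.

```python
from itertools import combinations

# Case numbers ranked 1-10 in each rank column of the table.
# Column labels are placeholders, not the names of the methods.
TOP10 = {
    "col1": {1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21},
    "col2": {9, 15, 16, 19},
    "col3": {4, 9, 15, 16, 19},
    "col4": set(),
    "col5": {16, 17, 19, 21},
    "col6": set(),
}


def pairwise_overlap(sets):
    """Return the shared top-10 cases for every pair of columns."""
    return {(a, b): sets[a] & sets[b] for a, b in combinations(sets, 2)}


def shared_by_all(sets, names):
    """Return the cases that appear in the top 10 of every named column."""
    return set.intersection(*(sets[name] for name in names))


if __name__ == "__main__":
    for (a, b), cases in pairwise_overlap(TOP10).items():
        if cases:  # skip pairs with no shared top-10 case
            print(f"{a} & {b}: cases {sorted(cases)}")
    non_empty = [name for name, cases in TOP10.items() if cases]
    print("shared by all non-empty columns:",
          sorted(shared_by_all(TOP10, non_empty)))
```

Two columns have no case in the top 10 at all, so they contribute nothing to any intersection; the four remaining columns share only cases 16 and 19.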