Ornella Vitale, Roberto Preste, Donato Palmisano, Marcella Attimonelli.
Abstract
BACKGROUND: Human mitochondrial DNA plays an important role in cellular energy production through oxidative phosphorylation. Therefore, this process may be a cause of, and have an effect on, mitochondrial DNA mutability, functional alteration, and disease onset, which are related to a wide range of clinical expressions and phenotypes. Although a large part of the observed variation is fixed in a population and hence expected to be benign, estimating the degree of pathogenicity of any possible human mitochondrial DNA variant is clinically pivotal.
Keywords: annotation; mitochondria; pathogenicity; variant
Year: 2019 PMID: 31821723 PMCID: PMC7005629 DOI: 10.1002/mgg3.1085
Source DB: PubMed Journal: Mol Genet Genomic Med ISSN: 2324-9269 Impact factor: 2.183
Canonical criteria supporting the deleterious role of a novel mutation as reported in DiMauro & Schon, 2002.
| Canonical criteria of pathogenicity from DiMauro and Schon (2002) |
|---|
| Mutation must not be a known neutral polymorphism |
| The base change must affect an evolutionarily conserved and functionally important site |
| Deleterious mutations are usually heteroplasmic, although a few pathogenic mutations are homoplasmic |
| The degree of heteroplasmy in different family members ought to be in rough agreement with the severity of symptoms |
| Single‐fiber PCR is a method that allows mutational load to be correlated with functional abnormality |
The pathogenicity scoring system. The table reports the pathogenicity scoring system based on the Yarham criteria (Yarham et al., 2011), as updated and further improved in HmtVar (Preste et al., 2019).
| Pathogenicity scoring criterion | Evidence | Score |
|---|---|---|
| Variant described as pathogenic by more than one report | yes | 2 |
| | no | 0 |
| PhastCons conservation | yes | 1 |
| | no | 0 |
| PhyloP conservation | yes | 1 |
| | no | 0 |
| Evidence of heteroplasmy | yes | 2 |
| | no | 0 |
| Segregation of mutation with disease | yes | 2 |
| | no | 0 |
| Histochemical evidence of mitochondrial disease | yes | 2 |
| | no | 0 |
| Biochemical defect in OXPHOS complexes I, III, IV | yes | 2 |
| | no | 0 |
| Pathogenicity evidence in trans‐mitochondrial cybrids or mutant mt‐tRNA steady‐state level studies | yes | 5 |
| | no | 0 |
| Evidence of mutation segregation with biochemical defect from single‐fiber studies | yes | 3 |
| | no | 0 |
Each criterion is associated with a weighted score, allowing classification of human mitochondrial tRNA variant pathogenicity. The improvements applied in Preste et al. (2019) focus on the use of PhyloP and PhastCons to evaluate inter‐mammalian site conservation.
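As an illustration of how the weighted criteria combine, the following minimal Python sketch sums the scores listed in the table for the criteria a variant satisfies. The dictionary keys and the function name `pathogenicity_score` are hypothetical labels introduced here, not identifiers from HmtVar. No classification thresholds are applied, because none are given in the table.

```python
# Weights copied from the scoring table above.
# The keys are shorthand labels assumed for this sketch only.
CRITERIA_WEIGHTS = {
    "multiple_reports": 2,             # pathogenic in more than one report
    "phastcons_conservation": 1,
    "phylop_conservation": 1,
    "heteroplasmy": 2,
    "segregation_with_disease": 2,
    "histochemical_evidence": 2,
    "oxphos_biochemical_defect": 2,    # complexes I, III, IV
    "cybrid_or_trna_steady_state": 5,
    "single_fiber_segregation": 3,
}

# Highest score reachable when every criterion is met.
MAX_SCORE = sum(CRITERIA_WEIGHTS.values())


def pathogenicity_score(evidence):
    """Sum the weights of all criteria flagged True in `evidence`.

    `evidence` maps criterion labels to booleans; missing labels count as "no".
    """
    unknown = set(evidence) - set(CRITERIA_WEIGHTS)
    if unknown:
        raise ValueError(f"Unknown criteria: {sorted(unknown)}")
    return sum(
        weight
        for name, weight in CRITERIA_WEIGHTS.items()
        if evidence.get(name, False)
    )


example = {"heteroplasmy": True, "cybrid_or_trna_steady_state": True}
score = pathogenicity_score(example)
print(f"{score} / {MAX_SCORE}")
```

A criterion scored "no" contributes 0, so omitting it from the input is equivalent to setting it to `False`.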
Figure 1. Workflow describing the data mining pipeline. For each mtDNA locus, the main steps are: (a) query the PubMed database through the NCBI Entrez system by “gene name or synonym gene name” and “mtDNA variant name” in HGVS format; (b) store the retrieved list of PubMed IDs; (c) download the abstract related to each PubMed ID and keep those containing information regarding functional studies and/or the variant; (d) for true positive PubMed IDs, extract the DOI; (e) browse NCBI PubMed and download the related PDF articles
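Step (a) can be sketched as building a search request for the NCBI E-utilities `esearch` endpoint. This is only an assumption about how such a query might look: the Boolean form of the search term, the helper names, and the example gene synonyms are illustrative and are not taken from the published pipeline. The sketch builds the URL without sending any request.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Public E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_term(gene_names, variant):
    """Combine gene name/synonyms (OR) with the HGVS variant name (AND).

    The exact query syntax used by the authors is not stated; this
    Boolean form is an assumption.
    """
    genes = " OR ".join(f'"{name}"' for name in gene_names)
    return f'({genes}) AND "{variant}"'


def build_esearch_url(gene_names, variant, retmax=100):
    """Return an esearch URL that would list matching PubMed IDs as JSON."""
    params = {
        "db": "pubmed",
        "term": build_pubmed_term(gene_names, variant),
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Illustrative input: a tRNA gene, one synonym, and a variant in HGVS format.
url = build_esearch_url(["MT-TL1", "TRNL1"], "m.3243A>G")
print(url)
```

Fetching the URL and storing the returned ID list would correspond to step (b).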
Figure 2. Workflow describing the text mining pipeline. The main steps are: (a) transformation of the PDF article text into a “Corpus”; (b) preprocessing of the Corpus; (c) creation of the token list; (d) definition of the Document‐Term Matrix (DTM); (e) subsetting of the DTM against the list of variants annotated with different token formats and against the list of supervised words related to functional evidence; (f) creation of the list of mined variants and related annotations; (g) checking of the association between variants and specific functional evidence; (h) compilation of the annotation tables
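Steps (b) to (e) can be illustrated with a small standard-library sketch: tokenize each document, count tokens per document to form the rows of a DTM, then keep only the columns for variant tokens and supervised words. The tokenizer, function names, and example sentences are assumptions made for illustration; the published pipeline may preprocess text differently.

```python
from collections import Counter

# Characters trimmed from token edges; inner "." and ">" are kept so that
# variant names such as "m.3243a>g" survive as single tokens.
PUNCTUATION = ".,;:()[]\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and trim punctuation from token edges."""
    tokens = (raw.strip(PUNCTUATION) for raw in text.lower().split())
    return [tok for tok in tokens if tok]


def build_dtm(corpus):
    """Map each document ID to a Counter of token frequencies (one DTM row)."""
    return {doc_id: Counter(tokenize(text)) for doc_id, text in corpus.items()}


def subset_dtm(dtm, terms):
    """Keep only the columns for `terms`, dropping zero counts."""
    wanted = {term.lower() for term in terms}
    return {
        doc_id: {tok: n for tok, n in row.items() if tok in wanted}
        for doc_id, row in dtm.items()
    }


# Hypothetical two-document corpus.
corpus = {
    "doc1": "The m.3243A>G variant showed heteroplasmy in cybrid cells.",
    "doc2": "No biochemical defect was reported.",
}
dtm = build_dtm(corpus)
hits = subset_dtm(dtm, ["m.3243A>G", "heteroplasmy", "biochemical"])
print(hits)
```

A document whose subset row contains both a variant token and a functional-evidence word is a candidate for the association check in step (g).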
Token Formats. The table lists, in addition to the standard HGVS nomenclature, the formats most commonly used to report mitochondrial variants in the literature
| token_format |
|---|
| m.[POS][REF]>[ALT] |
| m.[POS][REF][ALT] |
| [REF][POS][ALT] |
| [POS][ALT] |
| m.[POS][REF] |
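To make the token formats concrete, the sketch below expands a single variant into each of the five formats listed in the table. The function name `variant_tokens` and the example variant are assumptions used only for illustration.

```python
# One template per row of the token format table.
TOKEN_FORMATS = [
    "m.{pos}{ref}>{alt}",  # standard HGVS form
    "m.{pos}{ref}{alt}",
    "{ref}{pos}{alt}",
    "{pos}{alt}",
    "m.{pos}{ref}",
]


def variant_tokens(pos, ref, alt):
    """Return the variant written in every listed token format."""
    return [fmt.format(pos=pos, ref=ref, alt=alt) for fmt in TOKEN_FORMATS]


for token in variant_tokens(3243, "A", "G"):
    print(token)
```

Searching a text for all of these spellings, rather than the HGVS form alone, helps catch variants reported in older or informal notation.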
Figure 3. Distribution of variants per locus. The barplot shows the total number of potential human mitochondrial single‐nucleotide variants for each mitochondrial locus (mt‐CDS: 34297, mt‐DLOOP: 3366, mt‐rRNA: 7539, mt‐tRNA: 4524)
Comparison with Mitomap, Clinvar, and OMIM databases
| Locus_type | Mitomap_variant | Pipeline_variant | Shared_variant | only_Mitomap | only_Pipeline |
|---|---|---|---|---|---|
| Protein‐coding | 337 | 465 | 185 | 152 | 280 |
| D‐loop | 23 | 162 | 11 | 12 | 151 |
| rRNA | 56 | 88 | 28 | 56 | 59 |
| tRNA | 283 | 217 | 144 | 139 | 73 |
| total | 699 | | 368 | 359 | |
The table shows the number of variants for each locus annotated in Mitomap, Clinvar, and OMIM, and the number mined by the data and text mining pipeline. The Shared_variant column reports the number of variants common to the databases and the pipeline; the only_Mitomap column (Mitomap/Clinvar/OMIM) reports the number of variants stored in these databases that the pipeline was not able to extract; the only_Pipeline column contains the number of variants with available annotation about functional data and diseases that are not present in the other databases. The full annotation and information about the human mitochondrial variants are available in the HmtVar database.
Bold indicates the resultant data extracted from the pipeline.
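The shared and exclusive columns of the comparison table correspond to simple set operations. The following sketch shows that logic on a few made-up variant names; the function name, the dictionary keys, and the example sets are hypothetical and do not reproduce the actual database contents.

```python
def compare_variant_sets(database, pipeline):
    """Split variants into shared, database-only, and pipeline-only groups."""
    database, pipeline = set(database), set(pipeline)
    return {
        "shared": database & pipeline,         # annotated in both sources
        "only_database": database - pipeline,  # missed by the pipeline
        "only_pipeline": pipeline - database,  # newly mined by the pipeline
    }


# Hypothetical variant lists for a single locus.
database_variants = {"m.3243A>G", "m.3271T>C", "m.3260A>G"}
pipeline_variants = {"m.3243A>G", "m.3260A>G", "m.3302A>G"}

result = compare_variant_sets(database_variants, pipeline_variants)
for column, variants in result.items():
    print(column, len(variants))
```

Counting the members of each group per locus type would give one row of a table with the same layout.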