Maria Taboada, Hadriana Rodríguez, Diego Martínez, María Pardo, María Jesús Sobrido.
Abstract
MOTIVATION: As the number of clinical reports in the peer-reviewed medical literature keeps growing, there is an increasing need for online search tools to find and analyze publications on patients with similar clinical characteristics. This problem is especially critical and challenging for rare diseases, where publications of large series are scarce. Through an applied example, we illustrate how to automatically identify new relevant cases and semantically annotate the relevant literature about patient case reports to capture the phenotype of a rare disease named cerebrotendinous xanthomatosis.
Year: 2014 PMID: 24903515 PMCID: PMC4207225 DOI: 10.1093/database/bau045
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Semantic annotation and indexing of case reports from PubMed.
Figure 2. Example of extraction of a snippet of information from an abstract and its subsequent annotation.
Figure 3. Example of annotation generated by the OBO annotator, using the HPO ontology.
Evaluation results of the performance of the identification of case reports
| Evaluation measure | Manual method | Automated method |
|---|---|---|
| Number of selected papers | 223 | 174 |
| Precision (%) | 97 | 99 |
| Recall (%) | 81 | 65 |
| F-measure (%) | 88 | 78 |
Annotation results for the OBO annotator and the NCBO annotator
| Annotation result | OBO annotator | NCBO annotator |
|---|---|---|
| Number of annotated abstracts | 145 | 126 |
| Percentage of annotated abstracts (%) | 63 | 55 |
| Average number of concepts per abstract | 3.3 | 2.9 |
| Standard deviation | 2.56 | 2.05 |
| Maximum number of concepts per abstract | 11 | 8 |
| Total number of annotations | 456 | 344 |
Evaluation results of the performance of our method, the NCBO annotator and the GoPubMed service
| Measure | OBO annotator | NCBO annotator | GoPubMed |
|---|---|---|---|
| Coverage | 3.86 | 3.14 | 2.54 |
| Precision (%) | 94 | 97 | 97 |
| Recall (%) | 61 | 49 | 41 |
| F-measure (%) | 74 | 65 | 58 |
Figure 4. Venn diagram showing where the two methods overlap.
Evaluation results of the performance of the identification of case reports for the three methods
| Measure | Manual method | Automated method | Combined method |
|---|---|---|---|
| Number of selected papers | 223 | 174 | 230 |
| Precision (%) | 97 | 99 | 99 |
| Recall (%) | 81 | 65 | 87 |
| F-measure (%) | 88 | 78 | 93 |
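The F-measure rows in the evaluation tables above are consistent with the balanced F-measure (F1), the harmonic mean of precision and recall. The record does not state which F-measure variant was used, so treating it as F1 is an assumption. The following minimal sketch recomputes the value from the reported precision and recall; the function name `f_measure` and the `reported` dictionary are illustrative only, and because the table inputs are already rounded to whole percentages, agreement can only be expected to within rounding.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the tables use the balanced variant (beta = 1).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F-measure %) copied from the tables above
reported = {
    "Manual method": (97, 81, 88),
    "Automated method": (99, 65, 78),
    "Combined method": (99, 87, 93),
    "OBO annotator": (94, 61, 74),
    "NCBO annotator": (97, 49, 65),
    "GoPubMed": (97, 41, 58),
}

for name, (p, r, f_reported) in reported.items():
    f = f_measure(p, r)
    print(f"{name}: computed {f:.1f}, reported {f_reported}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a method with very high precision but modest recall (the automated method, for example) scores noticeably lower than its precision alone would suggest.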
Figure 5. Percentage of papers selected by each method.
Set of concepts that are more specific in the ontology induced from the literature than in the curated ontology
| HPO concept | Correct annotation? | Recommendation |
|---|---|---|
| Abnormal emotion/affect behavior | Yes | Yes |
| Chronic diarrhea | Yes | Yes |
| Congenital cataract | No | Revise synonyms |
| Gait disturbance | Yes | Yes |
| Global developmental delay | Yes | Yes |
| Juvenile cataract | Yes | Yes |
| Lower limb spasticity | Yes | Yes |
| Paraplegia/paraparesis | Yes | Yes |
| Parkinsonism | Yes | Yes |
| Peripheral demyelination | No | Revise synonyms |
| Polyneuropathy | Yes | Yes |
| Progressive neurologic deterioration | Yes | Yes |
| Spastic gait | Yes | Yes |
Set of concepts included in the curated ontology and not in the one induced from the literature
| HPO concept | Is it in the abstracts? | Reason for the omission |
|---|---|---|
| Abnormality of central somatosensory evoked potentials | Yes | A long series of words |
| Abnormality of the dentate nucleus | Yes | Different name |
| Abnormality of the periventricular white matter | Yes | Different name |
| Angina pectoris | No | |
| Cerebral calcification | Yes | Different name |
| Delusions | No | |
| Developmental regression | No | Different concept |
| Electroencephalography with generalized slow activity | Yes | A long series of words |
| EMG: axonal abnormality | Yes | Different name |
| Hallucinations | No | |
| Limitation of joint mobility | No | |
| Lipomatous tumor | Yes | Different name |
| Malabsorption | No | |
| Myocardial infarction | No | |
| Respiratory insufficiency | No | |
| Xanthelasma | No | |