Jorge R Herskovic, Devika Subramanian, Trevor Cohen, Pamela A Bozzo-Silva, Charles F Bearden, Elmer V Bernstam.
Abstract
BACKGROUND: Electronic Health Records aggregated in Clinical Data Warehouses (CDWs) promise to revolutionize Comparative Effectiveness Research and suggest new avenues of research. However, the effectiveness of CDWs is diminished by the lack of properly labeled data. We present a novel approach that integrates knowledge from the CDW, the biomedical literature, and the Unified Medical Language System (UMLS) to perform high-throughput phenotyping. In this paper, we automatically construct a graphical knowledge model and then use it to phenotype breast cancer patients. We compare the performance of this approach to using MetaMap when labeling records.
Year: 2012 PMID: 23320851 PMCID: PMC3426800 DOI: 10.1186/1471-2105-13-S13-S2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Simplified RRI workflow. Patient records are turned into a concept representation by MetaMap, which is then used to generate patient vectors. Patient vectors are used to generate semantic vectors for UMLS concepts. Note that the random and semantic vectors can contain real numbers, and are normalized in actual use.
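The workflow in Figure 1 can be illustrated with a minimal sketch. It assumes, beyond what the caption states, that each concept receives a dense Gaussian random vector, that a patient vector is the normalized sum of the random vectors of the concepts in that patient's record, and that a concept's semantic vector is the normalized sum of the vectors of the patients whose records mention it. The function names, the dimensionality, and the toy concept labels `CONCEPT_A` and `CONCEPT_B` are hypothetical; only C0006142 (Breast Cancer) comes from the source. This is not the authors' implementation.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def build_vectors(patients, dim=64, seed=0):
    """patients maps a patient ID to the set of concept IDs found in the record
    (assumed here to be the output of a concept extractor such as MetaMap)."""
    rng = random.Random(seed)
    concepts = sorted({c for cs in patients.values() for c in cs})

    # Assumption: one real-valued random vector per concept.
    random_vecs = {
        c: normalize([rng.gauss(0.0, 1.0) for _ in range(dim)]) for c in concepts
    }

    # Patient vector = normalized sum of the random vectors of its concepts.
    patient_vecs = {}
    for pid, cs in patients.items():
        total = [0.0] * dim
        for c in sorted(cs):
            total = add(total, random_vecs[c])
        patient_vecs[pid] = normalize(total)

    # Semantic vector = normalized sum of the vectors of patients mentioning the concept.
    semantic_vecs = {}
    for c in concepts:
        total = [0.0] * dim
        for pid, cs in patients.items():
            if c in cs:
                total = add(total, patient_vecs[pid])
        semantic_vecs[c] = normalize(total)

    return patient_vecs, semantic_vecs


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Toy records: CONCEPT_A always co-occurs with C0006142; CONCEPT_B never does.
toy_patients = {
    "p1": {"C0006142", "CONCEPT_A"},
    "p2": {"C0006142", "CONCEPT_A"},
    "p3": {"CONCEPT_B"},
}
patient_vecs, semantic_vecs = build_vectors(toy_patients)
print(cosine(semantic_vecs["C0006142"], semantic_vecs["CONCEPT_A"]))
print(cosine(semantic_vecs["C0006142"], semantic_vecs["CONCEPT_B"]))
```

In this toy setup, concepts that occur in exactly the same patients end up with identical semantic vectors, which is the sense in which the second pass makes the vectors "semantic" rather than random.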
Figure 2. Graph for Breast Cancer. The graph generated by our iterative process for UMLS Concept C0006142, Breast Cancer. The edge labels specify the type of relationship; relationships prefixed with “UMLS:” were found in the UMLS. Relationships without a prefix (e.g. “PROCESS_OF”) were discovered in the literature.
Performance of MetaMap and the graphical method in determining each patient's breast cancer status, with a physician's determination as the reference
| Method | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| MetaMap | 51% | 85% | 26% | 40% |
| Graphical method | 84% | 46% | 61% | 53% |
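As a quick consistency check on the table, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two reported columns. The sketch below uses only the rounded percentages shown above; the function name `f1_score` is illustrative, and the underlying confusion-matrix counts are not given in this section.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as fractions, copied from the table.
reported = {
    "MetaMap": (0.26, 0.85, 0.40),
    "Graphical method": (0.61, 0.46, 0.53),
}

for name, (precision, recall, f1_reported) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{name}: recomputed F1 = {recomputed:.3f}, reported F1 = {f1_reported:.2f}")
```

From the rounded inputs this gives roughly 0.40 for MetaMap and roughly 0.52 for the graphical method. The one-point difference from the reported 53% is plausibly due to the precision and recall values themselves having been rounded before publication.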