| Literature DB >> 31845899 |
Honghan Wu1,2,3, Karen Hodgson4, Sue Dyson4, Katherine I Morley4,5,6, Zina M Ibrahim4,7, Ehtesham Iqbal4, Robert Stewart4,5, Richard JB Dobson4,7, Cathie Sudlow1,3.
Abstract
BACKGROUND: Much effort has been put into the use of automated approaches, such as natural language processing (NLP), to mine or extract data from free-text medical records in order to construct comprehensive patient profiles for delivering better health care. Reusing NLP models in new settings, however, remains cumbersome, as it requires validation and retraining on new data iteratively to achieve convergent results.
Keywords: clustering; electronic health records; machine learning; model adaptation; natural language processing; phenotype; phenotype embedding; text mining; word embedding
Year: 2019 PMID: 31845899 PMCID: PMC6938594 DOI: 10.2196/14782
Source DB: PubMed Journal: JMIR Med Inform
The task of recognizing contextualized phenotype mentions is to identify mentions of phenotypes in free-text records and to classify the context of each mention into one of five categories (listed in the second column of Table 1). The last two rows give examples of nonphenotype mentions: neither sentence describes an occurrence of a condition.
| Examples | Types of phenotype mentions |
| --- | --- |
| 49 year old man with | Positive mentiona |
| With no evidence of | Negated mentiona |
| …Is concerning for local | Hypothetical mentiona |
| PAST MEDICAL HISTORY: (1) | History mentiona |
| Mother was A positive, | Mention of phenotype in another persona |
| She visited the | Not a phenotype mention |
| The patient asked for information about | Not a phenotype mention |
aContextualized mentions.
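The cue phrases in the examples above hint at how context classification can work. The sketch below is purely illustrative (a few hand-written surface-cue rules, not the paper's trained NLP model) and shows the five-way classification over the sample sentences from Table 1:

```python
import re

# Ordered surface-cue rules for the five contextualized mention
# categories in Table 1. Illustrative only; the paper uses trained
# NLP models rather than hand-written rules.
CONTEXT_RULES = [
    ("history", re.compile(r"past medical history", re.I)),
    ("other_person", re.compile(r"\b(mother|father|brother|sister)\b", re.I)),
    ("negated", re.compile(r"\bno evidence of\b", re.I)),
    ("hypothetical", re.compile(r"\b(concerning for|possible)\b", re.I)),
]

def classify_context(sentence: str) -> str:
    """Return the context category for a phenotype mention's sentence."""
    for label, pattern in CONTEXT_RULES:
        if pattern.search(sentence):
            return label
    return "positive"  # default: an affirmed, current mention

print(classify_context("With no evidence of stroke"))          # negated
print(classify_context("PAST MEDICAL HISTORY: (1) diabetes"))  # history
print(classify_context("49 year old man with hypertension"))   # positive
```

A real system must also decide whether a span is a phenotype mention at all, which the last two table rows show cannot be done from cue words alone.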
Figure 1. Assessing the transferability of a pretrained model in solving a new task: discriminating between the differently inaccurate mentions identified by the model in the new setting.
Figure 2. The framework for learning contextualized phenotype embeddings from the labelled data that a natural language processing model m was trained or validated on. TIA: transient ischemic attack.
Figure 3. Architecture of the phenotype embedding-based approach for transferring pretrained natural language processing models to identify new phenotypes or to apply them to new corpora. The word and phenotype embedding model is learned from the training data of the reusable model in its source domain (the task that m was trained for). No labelled data in the target domain (the new setting) are required to guide the adaptation. NLP: natural language processing.
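The embedding step underlying this architecture can be pictured with a simplified sketch. Here a mention embedding is taken to be the mean of its context word vectors; the vocabulary and vectors are toy values, not the embeddings the paper learns from source-domain training data:

```python
import numpy as np

# Toy word vectors standing in for a learned word embedding model;
# in the real pipeline these come from source-domain training data.
word_vectors = {
    "no": np.array([0.9, 0.1]),
    "evidence": np.array([0.8, 0.2]),
    "of": np.array([0.5, 0.5]),
    "history": np.array([0.2, 0.9]),
    "stroke": np.array([0.4, 0.6]),
}

def embed_mention(context_words):
    """Embed a phenotype mention as the mean of its context word vectors."""
    vecs = [word_vectors[w] for w in context_words if w in word_vectors]
    return np.mean(vecs, axis=0)

# Mentions occurring in similar contexts land near each other,
# which is what makes the clustering in Figure 4 possible.
neg_mention = embed_mention(["no", "evidence", "of", "stroke"])
hist_mention = embed_mention(["history", "of", "stroke"])
print(np.linalg.norm(neg_mention - hist_mention))
```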
Figure 4. Clustered percentage versus separate power on difficult cases. The x-axis is the Epsilon (EPS) parameter of the DBScan clustering algorithm, that is, the longest distance between any two items within a cluster; the y-axis is the percentage. Two quantities (as functions of EPS) are plotted on each panel: clustered percentage (solid line) and SP on incorrect cases (false-positive mentions of phenotypes). The latter has two series: (1) SP by chance (dash-dotted line), obtained by clustering randomly selected mentions, and (2) SP from clustering with phenotype embeddings (dashed line). N: number of all mentions; N_f: number of false-positive mentions; SP: separate power.
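The two quantities plotted in Figure 4 can be computed as in the following sketch. It assumes scikit-learn's DBSCAN, toy 2-D mention embeddings, and a simplified reading of separate power as the fraction of false-positive mentions that fall into clusters containing no true positives; the paper's exact definition may differ:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy 2-D "mention embeddings": two tight groups standing in for
# true-positive and false-positive phenotype mentions.
true_pos = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
false_pos = np.array([[5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
X = np.vstack([true_pos, false_pos])
is_fp = np.array([False] * 3 + [True] * 3)

labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(X)  # -1 marks noise

# Clustered percentage: share of mentions assigned to some cluster.
clustered_pct = float(np.mean(labels != -1))

# Simplified "separate power": fraction of false positives landing in
# clusters that contain no true positives (assumed reading of SP).
fp_separated = 0
for c in set(labels) - {-1}:
    members = labels == c
    if members[is_fp].sum() and not members[~is_fp].sum():
        fp_separated += int(members[is_fp].sum())
sp = fp_separated / int(is_fp.sum())

print(clustered_pct, sp)  # both 1.0 for this cleanly separated toy data
```

Sweeping `eps` over a range and recording both values reproduces the shape of the curves in the figure: larger `eps` clusters more mentions but can merge correct and incorrect ones, lowering SP.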
Figure 5. Identifying new phenotypes by reusing natural language processing models pretrained for semantically close phenotypes. The four pairs of phenotype-mention identification models are chosen from SemEHR models trained on SLaM data; the DBScan Epsilon (EPS) value is 3.8, and imbalance waste is calculated with e=3, meaning at least 3 samples are needed for training from each language pattern. The x-axis is the similarity threshold, ranging from 0.0 to 0.8; the y-axes, from top to bottom, are the proportion of duplicate waste saved over the total number of mentions, macro-accuracy, and micro-accuracy.
Comparison of the performance of reusing models at different semantic similarity levels. Similarity threshold: 0.01; DBScan EPS: 0.38. Reusing models trained for more semantically similar phenotypes achieved adaptation with less effort (more duplicate waste identified) in all cases, and more accurate results in three of four cases. Performance metrics of the better reusable models are highlighted in bold.
| Model reuse cases | Duplicate waste | Macro-accuracy | Micro-accuracy |
| --- | --- | --- | --- |
| Diabetes by Type 2 Diabetesa | 0.502b | 0.966b | 0.933b |
| Diabetes by Hypercholesterolemia | 0.477 | 0.965 | 0.930 |
| Stroke by Heart Attacka | 0.711b | 0.948b | 0.955b |
| Stroke by Fatigue | 0.220 | 0.884 | 0.938 |
| Heart attack by Infarcta | 0.569b | 0.989b | 0.966b |
| Heart attack by Bruise | 0.529 | 0.821 | 0.889 |
| Multiple Sclerosis by Myasthenia Gravisa | 0.761b | 0.944 | 0.971 |
| Multiple Sclerosis by Diabetes | 0.522 | 0.993b | 0.979b |
aMore similar model reuse cases.
bPerformance metrics of better reusable models.
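The pattern in the table, namely that reusing a model trained for the most semantically similar phenotype pays off, can be sketched as picking the pretrained model whose phenotype embedding is closest by cosine similarity. The phenotype names match the table, but the vectors are toy values, not the paper's learned embeddings:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy phenotype embeddings; in the paper these are learned from the
# source-domain training data of the reusable models.
phenotype_vecs = {
    "type 2 diabetes": np.array([0.9, 0.1, 0.2]),
    "hypercholesterolemia": np.array([0.5, 0.4, 0.6]),
    "heart attack": np.array([0.1, 0.9, 0.3]),
}

def pick_model_to_reuse(new_phenotype_vec, pretrained):
    """Choose the pretrained model whose phenotype is most similar."""
    return max(pretrained, key=lambda p: cosine(new_phenotype_vec, pretrained[p]))

diabetes = np.array([0.85, 0.15, 0.25])  # embedding of the new phenotype
print(pick_model_to_reuse(diabetes, phenotype_vecs))  # type 2 diabetes
```

This mirrors the first table row: a Diabetes model is better adapted from Type 2 Diabetes than from a less related phenotype such as Hypercholesterolemia.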