Xingqiao Wang1, Xiaowei Xu1, Weida Tong2, Ruth Roberts3,4, Zhichao Liu2.
Abstract
Background: Transformer-based language models have delivered clear improvements in a wide range of natural language processing (NLP) tasks. However, these models have a significant limitation: they cannot infer causality, a prerequisite for deployment in pharmacovigilance and health care. Transformer-based language models should therefore be extended to infer causality so that they can address the key question of what causes a clinical outcome.
Keywords: artificial intelligence; causal inference; language models; natural language processing; pharmacovigilance
Year: 2021 PMID: 34136800 PMCID: PMC8202286 DOI: 10.3389/frai.2021.659622
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
FIGURE 1. Workflow of the study.
Sentence sets of Analgesics-related acute liver failure and Tramadol-related mortalities.
| Endpoints | Datasets | Number of positives | Number of negatives | Positive versus negative ratio |
|---|---|---|---|---|
| Acute liver failure | Total | 15,224 | 21,437 | 0.71 |
| | Training set | 9,798 | 13,663 | 0.71 |
| | Develop set | 2,399 | 3,467 | 0.69 |
| | Test set | 3,027 | 4,307 | 0.70 |
| Tramadol-related death | Total | 9,846 | 17,399 | 0.57 |
| | Training set | 6,250 | 11,185 | 0.56 |
| | Develop set | 1,588 | 2,722 | 0.57 |
| | Test set | 2,008 | 3,442 | 0.58 |
FIGURE 2. The distribution of sequence length: (A) Analgesics-induced acute liver failure; and (B) Tramadol-related mortalities.
Top 10 most frequent terms in the two sentence sets based on the tf-idf values.
| Terms (Analgesics-related acute liver failure) | Tf-idf value | Terms (Tramadol-related mortalities) | Tf-idf value |
|---|---|---|---|
| Acetylcysteine | 0.0318 | Abacavir | 0.0323 |
| Acinetobacter | 0.0318 | Indomethacin | 0.0323 |
| Alafenamide | 0.0318 | Glossodynia | 0.0315 |
| Altered | 0.0318 | Idiopathic | 0.0315 |
| Appendicectomy | 0.0318 | Amnestic | 0.0312 |
| Appetite | 0.0318 | Assault | 0.0312 |
| Assist | 0.0318 | Axetil | 0.0312 |
| Atherosclerosis | 0.0318 | Bradyarrhythmia | 0.0312 |
| Brucellosis | 0.0318 | Brugada | 0.0312 |
| Cabazitaxel | 0.0318 | Cardiorenal | 0.0312 |
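
The tf-idf values above rank terms by how characteristic they are of a sentence set. The record does not say which tf-idf variant was used, so the following is only a minimal sketch under an assumed textbook definition (term frequency times the log of inverse document frequency). The function name `tfidf_scores` and the toy sentences are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter

def tfidf_scores(documents):
    """Return one {term: tf-idf} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = log(number of documents / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical toy sentences, for illustration only.
docs = [
    "acetaminophen overdose acute liver failure".split(),
    "tramadol overdose death".split(),
    "acetaminophen liver injury".split(),
]
scores = tfidf_scores(docs)
# Terms of the second sentence, most characteristic first.
top_terms = sorted(scores[1].items(), key=lambda kv: kv[1], reverse=True)
```

Under this definition a term that appears in only one sentence (here `tramadol`) scores higher than an equally frequent term shared across sentences (here `overdose`). Library implementations often add smoothing or normalization, so their absolute values can differ from this sketch.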
FIGURE 3. Cross-entropy loss and accuracy as a function of training steps in the fine-tuned ALBERT models: (A) Analgesics-induced acute liver failure; and (B) Tramadol-related mortalities. Red denotes accuracy and gray denotes cross-entropy loss.
Enriched causal clinical terms by the proposed InferBERT AI model.
| Clinical categories | Clinical terms | Z-score | Average of do probabilities | Average of not do probabilities | Adjusted p-value |
|---|---|---|---|---|---|
| Analgesics-induced acute liver failure | | | | | |
| Primary suspect drug | APAP | 153.92 | 0.84 | 0.33 | < 1E-16 |
| Age | 18–39 | 36.01 | 0.54 | 0.35 | < 1E-16 |
| Gender | Female | 17.06 | 0.41 | 0.35 | < 1E-16 |
| Dose | Larger than 100 mg | 8.93 | 0.39 | 0.35 | < 1E-16 |
| Outcome | Death | 119.33 | 0.68 | 0.30 | < 1E-16 |
| Tramadol-related mortalities | | | | | |
| Adverse events | Completed suicide | 252.27 | 1.00 | 0.28 | < 1E-16 |
| Age | 40–64 | 18.33 | 0.44 | 0.32 | < 1E-16 |
| Gender | Male | 3.62 | 0.37 | 0.34 | 0.0001 |
| Dose | Drug abuse | 38.77 | 0.74 | 0.33 | < 1E-16 |
| Primary suspect drug | Hydrocodone bitartrate | 23.67 | 0.91 | 0.36 | < 1E-16 |
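
The table reports a z-score comparing the average "do" probability of each term with its average "not do" probability. The record does not give the exact test statistic, so the sketch below assumes a standard one-tailed two-sample z-test on the two sets of predicted probabilities. The function name `one_tailed_z_test` and the probability values are hypothetical, and the multiple-testing adjustment behind the last table column is not reproduced here.

```python
import math
from statistics import mean, variance

def one_tailed_z_test(do_probs, not_do_probs):
    """Assumed two-sample z statistic and one-tailed p-value.

    Tests whether the mean of do_probs exceeds the mean of not_do_probs.
    """
    diff = mean(do_probs) - mean(not_do_probs)
    # Standard error of the difference between the two sample means.
    std_err = math.sqrt(
        variance(do_probs) / len(do_probs)
        + variance(not_do_probs) / len(not_do_probs)
    )
    z = diff / std_err
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical predicted probabilities, for illustration only.
do_probs = [0.80, 0.85, 0.90, 0.82, 0.83]
not_do_probs = [0.30, 0.35, 0.33, 0.31, 0.36]
z, p_value = one_tailed_z_test(do_probs, not_do_probs)
```

A large positive z with a small p-value indicates that conditioning on the term raises the predicted probability of the endpoint. With real data the raw p-values would still need a correction for testing many terms at once.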
FIGURE 4. Causal trees for (A) Analgesics-induced acute liver failure; and (B) Tramadol-related mortalities. The number attached to each arrow denotes the z-score.
FIGURE 5. Robustness evaluation of the proposed InferBERT model. Yellow and green denote the Analgesics-induced acute liver failure and Tramadol-related mortalities datasets, respectively. The Venn diagram illustrates the overlap of the enriched causal terms across three repeated runs. The percentages of overlapping terms (POPs), shown as the dotted-line curve, represent the consistency of the ranked term order across the three repeated runs.
FIGURE 6. Comparison between the proposed InferBERT model and three conventional causal inference models (PRR, ROR, and EBGM) on the (A) Analgesics-induced acute liver failure; and (B) Tramadol-related mortalities datasets.
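
Two of the conventional measures named in the Figure 6 caption, the proportional reporting ratio (PRR) and the reporting odds ratio (ROR), are usually computed from a 2×2 contingency table of report counts. The sketch below uses the standard textbook definitions. The cell names `a`, `b`, `c`, `d` and the counts are hypothetical and are not taken from the study. EBGM relies on an empirical Bayes model and is not sketched here.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 table of report counts.

    a: target drug, target event       b: target drug, other events
    c: other drugs, target event       d: other drugs, other events
    """
    return (a / (a + b)) / (c / (c + d))

def ror(a, b, c, d):
    """Reporting odds ratio from the same 2x2 table."""
    return (a * d) / (b * c)

# Hypothetical report counts, for illustration only.
a, b, c, d = 20, 80, 10, 190
prr_value = prr(a, b, c, d)  # (20/100) / (10/200)
ror_value = ror(a, b, c, d)  # (20*190) / (80*10)
```

Both measures exceed 1 when the event is reported disproportionately often for the target drug. In practice they are read together with a confidence interval and a minimum case count, which this sketch omits.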