Xuan Qin, Xinzhi Yao, Jingbo Xia.
Abstract
BACKGROUND: Natural language processing has long been applied to biomedical knowledge inference and discovery. Enrichment analysis based on named entity recognition is a classic application for inferring enriched associations in terms of specific biomedical entities such as genes, chemicals, and mutations.
Keywords: evaluation; metric; pathway enrichment; text mining
Year: 2021 PMID: 34142969 PMCID: PMC8277388 DOI: 10.2196/28247
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Text mining systems for gene extraction and pathway construction. TEES: Turku Event Extraction System.
Figure 2. Gene pair extraction rule for the text mining systems.
The complete inverse pathway frequency metrics list.
| Inverse pathway frequency (IPF) metrics | | | |
| IPF_gene | 1 | Equation (12) | Equation (3) |
| IPF_node | 1 | Equations (5) and (7) | 1 |
| IPF_shortpath | Equation (10) | Equation (12) | 1 |
| IPF_shortpath_gene | Equation (10) | Equation (12) | Equation (3) |
| IPF_shortpath_node | Equation (10) | Equations (5) and (7) | 1 |
| IPF_gene_node | 1 | Equations (5) and (7) | Equation (3) |
| IPF_gene_node_shortpath | Equations (5), (7), and (10) | Equations (5) and (7) | Equation (3) |
Figure 3. Comparison of the pathway-enrichment metrics based on the rapamycin-related gene set. CumPer: cumulative percentage; IPF: inverse pathway frequency; TEES: Turku Event Extraction System.
Comparison of the areas under the cumulative percentage curve for the pathway-enriched methods based on the known rapamycin-related pathway.
| Inverse pathway frequency metrics | ABSTRACT | SENTENCE | DEPENDENCY | Turku Event Extraction System |
| IPF_gene | 0.634 | 0.638 | 0.628 | 0.529 |
| IPF_node | 0.647 | 0.648 | 0.672 | 0.625 |
| IPF_shortpath | 0.680^a | 0.679^a | 0.688^a | 0.635^a |
| IPF_shortpath_gene | 0.675 | 0.675 | 0.682 | 0.626 |
| IPF_shortpath_node | 0.675 | 0.675 | 0.682 | 0.626 |
| IPF_gene_node | 0.675 | 0.675 | 0.682 | 0.626 |
| IPF_gene_node_shortpath | 0.675 | 0.675 | 0.681 | 0.626 |
| | .59 | .60 | .64 | .62 |
^a Indicates that the area is significantly superior to this text mining method in terms of the pathway enrichment indicator.
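The areas in the table above summarize cumulative percentage (CumPer) curves. The following is a minimal sketch of how such an area could be computed, not the authors' implementation. It assumes that CumPer at rank k is the fraction of known relevant pathways ranked at or above k, and that the area is a trapezoidal sum over a rank axis normalized to [0, 1]. The function names `cumulative_percentage` and `area_under_curve` and the example ranks are hypothetical.

```python
def cumulative_percentage(known_ranks, n_total):
    """Return CumPer for k = 1..n_total.

    Assumption: CumPer[k] is the fraction of known pathways whose
    rank in the enrichment list is <= k (ranks are 1-based).
    """
    ranks = sorted(known_ranks)
    curve, found, idx = [], 0, 0
    for k in range(1, n_total + 1):
        # Count every known pathway that appears at or before rank k.
        while idx < len(ranks) and ranks[idx] <= k:
            found += 1
            idx += 1
        curve.append(found / len(ranks))
    return curve


def area_under_curve(curve):
    """Trapezoidal area under the curve, starting at (0, 0),
    with the rank axis rescaled to [0, 1]."""
    ys = [0.0] + list(curve)
    n = len(curve)
    return sum((ys[i] + ys[i + 1]) / 2 for i in range(n)) / n


# Hypothetical example: 2 known pathways ranked 1st and 2nd out of 4.
curve = cumulative_percentage([1, 2], n_total=4)
area = area_under_curve(curve)
```

Under these assumptions, a metric that ranks the known pathways earlier yields a curve that rises sooner and therefore a larger area, which is the sense in which the table compares metrics.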
Figure 4. Visualization of the extracted gene pairs from literature.