Yue Shang, Yanpeng Li, Hongfei Lin, Zhihao Yang.
Abstract
Automatic text summarization for a biomedical concept can help researchers efficiently get the key points of a topic from a large amount of biomedical literature. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract the semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them as a graph. 3) For relations in the relevant set, we use an information retrieval based method to extract informative sentences that can interpret them from the document collection and generate the text summary. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves performance and that our results are better than those of the MEAD system, a well-known tool for text summarization.
Year: 2011 PMID: 21887336 PMCID: PMC3162578 DOI: 10.1371/journal.pone.0023862
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Framework of the biomedical text summarization system.
Figure 2. An example of semantic relation extraction.
Direct relations with “Angina Pectoris”.
| Angina Pectoris--CO-OCCURS_WITH--Diabetes Mellitus |
| Rose extract--CAUSES--Angina Pectoris |
| Blood Platelets--LOCATION_OF--Angina Pectoris |
| Reduction (chemical)--PROCESS_OF--Angina Pectoris |
| Angina Pectoris--ISA--Symptoms <2> |
| Acute hyperglycemia--AFFECTS--Angina Pectoris |
| Angina Pectoris--CO-OCCURS_WITH--Acute myocardial infarction |
| Counterpulsation, External--TREATS--Angina Pectoris |
| Interventions--TREATS--Angina Pectoris |
| Revascularization - action--TREATS--Angina Pectoris |
| Exertion--PROCESS_OF--Angina Pectoris |
| Diabetes Mellitus--CO-OCCURS_WITH--Angina Pectoris |
| Coronary Artery Bypass--TREATS(INFER)--Angina Pectoris |
| Angina Pectoris--OCCURS_IN--Male population group |
| Depressive episode, unspecified--CO-OCCURS_WITH--Angina Pectoris |
| Angina Pectoris--CO-OCCURS_WITH--Stable angina |
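The rows above follow a `Subject--PREDICATE--Object` pattern. As a minimal sketch (not the authors' implementation), such strings could be parsed into triples and filtered for relations that directly involve a query concept. The `Relation` type and both helper functions are hypothetical names introduced here for illustration, and the parsing assumes the `--` delimiter shown in the table.

```python
from typing import Iterable, List, NamedTuple


class Relation(NamedTuple):
    """A subject-predicate-object triple (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


def parse_relation(text: str) -> Relation:
    """Parse one 'Subject--PREDICATE--Object' row, tolerating table pipes."""
    parts = [p.strip() for p in text.strip().strip("|").split("--")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields separated by '--': {text!r}")
    return Relation(*parts)


def direct_relations(relations: Iterable[Relation], concept: str) -> List[Relation]:
    """Keep relations whose subject or object equals the query concept."""
    key = concept.casefold()
    return [r for r in relations
            if key in (r.subject.casefold(), r.obj.casefold())]


# A few rows copied from the table above.
rows = [
    "Angina Pectoris--CO-OCCURS_WITH--Diabetes Mellitus",
    "Revascularization - action--TREATS--Angina Pectoris",
    "Coronary Artery Bypass--TREATS(INFER)--Angina Pectoris",
]
relations = [parse_relation(row) for row in rows]

# Subjects of TREATS / TREATS(INFER) relations that touch the query concept.
treatments = [r.subject
              for r in direct_relations(relations, "angina pectoris")
              if r.predicate.startswith("TREATS")]
```

Single hyphens inside names such as `CO-OCCURS_WITH` or `Revascularization - action` are left intact because only the double hyphen is treated as a field separator.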
Figure 3. The core relation set for “Angina Pectoris”.
Figure 4. Semantic relation network for “Angina Pectoris” after relation retrieval.
Figure 5. An example of relation and sentence retrieval.
Location scores used in our experiment.
| Location Tag | Location Score |
| BACKGROUND | 1 |
| CONCLUSIONS | 1 |
| TITLE | 1 |
| OBJECTIVES | 1 |
| MATERIAL AND METHODS | 0.5 |
| RESULTS | 0.5 |
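A location score of this kind can be represented as a simple lookup. The sketch below assumes the tags fall into the two score levels (1 and 0.5) that the table layout suggests. The `location_score` helper and its default of 0.0 for unlisted tags are assumptions made for illustration, not details from the paper, and how the score is combined with other sentence features is not shown here.

```python
# Location scores keyed by structured-abstract section tag.
# Assumption: tags are grouped into two levels, 1 and 0.5.
LOCATION_SCORES = {
    "BACKGROUND": 1.0,
    "CONCLUSIONS": 1.0,
    "TITLE": 1.0,
    "OBJECTIVES": 1.0,
    "MATERIAL AND METHODS": 0.5,
    "RESULTS": 0.5,
}


def location_score(tag: str, default: float = 0.0) -> float:
    """Return the score for a section tag; `default` (assumed) covers unknown tags."""
    return LOCATION_SCORES.get(tag.strip().upper(), default)


title_score = location_score("title")
results_score = location_score("Results")
```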
Diseases used in our experiment.
| Alzheimer's Disease | Cerebrovascular accident | Epilepsy | Myocarditis |
| Asthma | Colon Carcinoma | HIV Infections | Myotonic Dystrophy |
| Atherosclerosis | Crohn's disease | Huntington Disease | Obesity |
| Breast Carcinoma | Cystic Fibrosis | Hypertensive disease | Schizophrenia |
| Carcinoma of lung | Depressive disorder | Malaria | Parkinson Disease |
| Cerebral Amyloid Angiopathy | Mad Cow Disease | Metabolic syndrome | Prostate carcinoma |
Performance of summarization for 24 diseases.
| Disease | ROUGE-1 (MEAD) | ROUGE-1 (N_SR) | ROUGE-1 (SRE) | ROUGE-2 (MEAD) | ROUGE-2 (N_SR) | ROUGE-2 (SRE) | ROUGE-L (MEAD) | ROUGE-L (N_SR) | ROUGE-L (SRE) |
| Alzheimer's Disease | 0.2446 | 0.3613 | 0.3333 | 0.0431 | 0.0509 | 0.0518 | 0.1785 | 0.2968 | 0.2910 |
| Asthma | 0.2708 | 0.2542 | 0.2828 | 0.0310 | 0.0389 | 0.0684 | 0.2154 | 0.2542 | 0.2239 |
| Atherosclerosis | 0.2738 | 0.2825 | 0.3424 | 0.0147 | 0.0426 | 0.0511 | 0.2103 | 0.2429 | 0.2420 |
| Breast Carcinoma | 0.2062 | 0.2584 | 0.3040 | 0.0195 | 0.0259 | 0.0260 | 0.1479 | 0.1912 | 0.1888 |
| Carcinoma of lung | 0.2475 | 0.3731 | 0.3418 | 0.0441 | 0.0570 | 0.0828 | 0.1916 | 0.2994 | 0.2799 |
| Cerebral Amyloid Angiopathy | 0.3067 | 0.2905 | 0.3963 | 0.0604 | 0.0966 | 0.1067 | 0.2733 | 0.2849 | 0.2945 |
| Cerebrovascular accident | 0.2269 | 0.2606 | 0.2974 | 0.0539 | 0.0375 | 0.0601 | 0.2000 | 0.2359 | 0.2279 |
| Colon Carcinoma | 0.3414 | 0.2852 | 0.3208 | 0.0264 | 0.0300 | 0.0318 | 0.2845 | 0.2257 | 0.2692 |
| Crohn's disease | 0.2810 | 0.3393 | 0.3806 | 0.0282 | 0.0526 | 0.0620 | 0.2576 | 0.2828 | 0.3188 |
| Cystic Fibrosis | 0.3206 | 0.4122 | 0.4432 | 0.0769 | 0.1176 | 0.1176 | 0.2632 | 0.3461 | 0.3562 |
| Depressive disorder | 0.2554 | 0.2514 | 0.3527 | 0.0399 | 0.0501 | 0.0445 | 0.2289 | 0.2329 | 0.3328 |
| Mad Cow Disease | 0.3455 | 0.3698 | 0.3852 | 0.0857 | 0.1042 | 0.0838 | 0.3090 | 0.3340 | 0.3737 |
| Epilepsy | 0.2820 | 0.3086 | 0.3718 | 0.0158 | 0.0544 | 0.0517 | 0.2350 | 0.2914 | 0.3243 |
| HIV Infections | 0.3232 | 0.3172 | 0.3458 | 0.0753 | 0.0617 | 0.0462 | 0.2764 | 0.2943 | 0.3223 |
| Huntington Disease | 0.3366 | 0.2910 | 0.3218 | 0.0547 | 0.0390 | 0.0436 | 0.3168 | 0.2601 | 0.2913 |
| Hypertensive disease | 0.2609 | 0.3284 | 0.2751 | 0.0328 | 0.0388 | 0.0677 | 0.2283 | 0.2687 | 0.2424 |
| Malaria | 0.3093 | 0.3529 | 0.3699 | 0.0259 | 0.0836 | 0.0807 | 0.2680 | 0.3476 | 0.3429 |
| Metabolic syndrome | 0.2878 | 0.3249 | 0.3509 | 0.0580 | 0.0811 | 0.0727 | 0.2086 | 0.2888 | 0.2483 |
| Myocarditis | 0.2795 | 0.3171 | 0.2952 | 0.0570 | 0.0590 | 0.0735 | 0.2445 | 0.3024 | 0.2889 |
| Myotonic Dystrophy | 0.2023 | 0.3253 | 0.3146 | 0.0528 | 0.0714 | 0.0606 | 0.1873 | 0.2831 | 0.3121 |
| Obesity | 0.3067 | 0.2333 | 0.2990 | 0.0253 | 0.0383 | 0.0200 | 0.2647 | 0.2283 | 0.2924 |
| Schizophrenia | 0.3644 | 0.2419 | 0.2841 | 0.0436 | 0.0338 | 0.0324 | 0.2473 | 0.2043 | 0.2269 |
| Parkinson Disease | 0.3469 | 0.3757 | 0.4191 | 0.0498 | 0.0703 | 0.0638 | 0.3288 | 0.3333 | 0.3170 |
| Prostate carcinoma | 0.1662 | 0.2938 | 0.3567 | 0.0051 | 0.0396 | 0.0503 | 0.1662 | 0.2625 | 0.2670 |
| Average | 0.2828 | 0.3104 | 0.3410 | 0.0425 | 0.0573 | 0.0604 | 0.2388 | 0.2746 | 0.2864 |
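For readers unfamiliar with the metrics in this table, ROUGE-N counts n-gram overlap between a candidate summary and a reference summary, and ROUGE-L is based on their longest common subsequence. The following is a simplified, recall-only sketch with lowercase whitespace tokenization and a single reference. It is not the official ROUGE toolkit, and this record does not say which ROUGE variant or settings produced the scores above.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram matches divided by the number of reference n-grams."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate: str, reference: str) -> float:
    """LCS length divided by the reference length."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return lcs_length(cand, ref) / len(ref) if ref else 0.0


# Toy sentences, unrelated to the paper's data.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
r1 = rouge_n_recall(candidate, reference, n=1)  # 5 of 6 reference unigrams matched
r2 = rouge_n_recall(candidate, reference, n=2)  # 3 of 5 reference bigrams matched
rl = rouge_l_recall(candidate, reference)       # LCS "the cat on the mat" has length 5
```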
Figure 6. Comparison of summarization performance on ROUGE-1.
The impact of relation expansion, noise filtering and redundancy removal.
| Method | ROUGE-1 | ROUGE-2 | ROUGE-L |
| Baseline | 0.3196 | 0.0373 | 0.2693 |
| Expansion | 0.3263 (+2.0%) | 0.0408 (+9.4%) | 0.2723 (+1.1%) |
| Filtering | 0.3208 (+0.4%) | 0.0397 (+6.4%) | 0.2655 (-1.4%) |
| Expansion + Filtering | 0.3303 (+3.3%) | 0.0436 (+16.9%) | 0.2801 (+4.0%) |
| Expansion + Filtering + Redundancy Removal | 0.3410 (+6.7%) | 0.0604 (+61.9%) | 0.2864 (+6.3%) |
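The percentages in parentheses read as relative changes against the Baseline row. A small sketch of that arithmetic follows; the function name is hypothetical. It reproduces, for example, the +6.7% and +61.9% gains of the full system and the -1.4% ROUGE-L change for filtering alone.

```python
def relative_change_pct(baseline: float, score: float) -> float:
    """Relative change of `score` over `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


# Values copied from the table above.
full_rouge1 = round(relative_change_pct(0.3196, 0.3410), 1)       # (0.3410 - 0.3196) / 0.3196
full_rouge2 = round(relative_change_pct(0.0373, 0.0604), 1)       # (0.0604 - 0.0373) / 0.0373
filtering_rougel = round(relative_change_pct(0.2693, 0.2655), 1)  # negative: score dropped
```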
Figure 7. Relationship between ROUGE-1 and concept depth in MeSH based filtering.
Figure 8. Relationship between ROUGE-1 and the trade-off parameter
Figure 9. Relationship between ROUGE-1 and the trade-off parameter ω.