Arantxa Otegi1, Iñaki San Vicente2, Xabier Saralegi2, Anselmo Peñas3, Borja Lozano3, Eneko Agirre1.
Abstract
Biosanitary experts around the world are directing their efforts towards the study of COVID-19. This effort generates scientific publications in such volume and at such speed that acquiring new knowledge effectively becomes difficult. Information systems are therefore needed to help biosanitary experts access, consult and analyze these publications. In this work we study the variables involved in developing a Question Answering system that receives a set of questions asked by experts about the disease COVID-19 and its causal virus SARS-CoV-2, and provides a ranked list of expert-level answers to each question. In particular, we address the interrelation of the Information Retrieval and Answer Extraction steps. We found that recall-based document retrieval, which leaves the scanning of whole documents for the best answer to a neural answer extraction module, is a better strategy than relying on precise passage retrieval before extracting the answer span.
Keywords: 00-01; 99-00; COVID-19; Question answering
Year: 2021 PMID: 35002094 PMCID: PMC8719365 DOI: 10.1016/j.knosys.2021.108072
Source DB: PubMed Journal: Knowl Based Syst ISSN: 0950-7051 Impact factor: 8.038
Sample of question and answers in COVID-19 domain.
| Question | What is the origin of COVID 19? |
|---|---|
| Expected nuggets | ’spillover’, ’positive selection pressure’, ’the species barrier’, ’potential mutations’, ’genetic recombination’, ’animal-to-human transmission’, ’bat reservoirs’, ’Codon usage bias’, ’ancestral haplotypes’, ’bat coronavirus genome’, ’zoonotic origin’, ’seafood wholesale market in Wuhan’, ’evolutionary constraints’, ’pangolins’, ’interspecies transmission’, ’betacoronaviruses’, ’viral fitness’, ’molecular evolution’, ’Chinese province of Hubei’, ’Mammal species’, ’Bats’, ’virus adapation’, ’species of origin’, ’emergence’ |
| Retrieved contexts and text spans | |
| 1 | |
| 2 | Furthermore, if genetic manipulation had been performed, one of the several reverse-genetic systems available for betacoronaviruses would probably have been used. |
| 3 | However, the genetic data irrefutably show that |
| 4 | Instead, we propose |
IR results on epic-qa-dev by the fields used to build the query: (i) query; (ii) query+question: the query and question fields concatenated; and (iii) (qry+qs)+()backg: a complex query built by concatenating the query and question fields and linearly combining that concatenation with the background field.
| (a) IR Results on epic-qa-dev for | |||||||
|---|---|---|---|---|---|---|---|
| Query building | NDCG | R@500 | R@1K | R@2K | R@3K | R@4K | R@5K |
| query | 0.2977 | 0.2864 | 0.3766 | 0.4606 | 0.5397 | 0.5689 | 0.5918 |
| query+question | 0.3832 | 0.3906 | 0.4898 | 0.5848 | 0.6599 | 0.7101 | 0.7346 |
| 0.5 | 0.3901 | 0.3979 | 0.5126 | 0.6099 | 0.6795 | 0.7154 | 0.7604 |
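The linear combination named in the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each sub-query yields a score per document, and that the combination is a convex interpolation with weight `weight` on the query+question scores and `1 - weight` on the background scores. The function names and the handling of documents missing from one list are hypothetical.

```python
def combine_scores(qq_scores, bg_scores, weight=0.5):
    """Linearly interpolate two {doc_id: score} mappings.

    qq_scores: scores from the concatenated query+question (assumed input).
    bg_scores: scores from the background field (assumed input).
    A document absent from one mapping contributes 0.0 for that part.
    """
    doc_ids = set(qq_scores) | set(bg_scores)
    return {
        d: weight * qq_scores.get(d, 0.0) + (1.0 - weight) * bg_scores.get(d, 0.0)
        for d in doc_ids
    }


def rank(scores):
    """Return doc ids by descending score, ties broken by doc id."""
    return sorted(scores, key=lambda d: (-scores[d], d))
```

With `weight=0.5`, as in the last row of the table, both parts count equally; whether the original system normalizes the two score lists before combining them is not stated here.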
IR results on epic-qa-dev. Ranking-quality-oriented and recall-oriented systems are evaluated with and without neural reranking, for both passage and document retrieval strategies. The RR column (2nd) indicates whether reranking is used and, when it is, the value of .
| (a) IR Results on epic-qa-dev for | ||||||||
|---|---|---|---|---|---|---|---|---|
| PRF optimized for | RR | NDCG | R@500 | R@1K | R@2K | R@3K | R@4K | R@5K |
| NDCG | no | 0.3993 | 0.3991 | 0.5169 | 0.6252 | 0.6796 | 0.728 | 0.7599 |
| R | no | 0.3855 | 0.4002 | 0.5059 | 0.6118 | 0.6842 | 0.7253 | 0.7665 |
| NDCG | 0.9 | 0.4157 | 0.4612 | 0.5787 | 0.6724 | 0.7238 | 0.7604 | 0.76 |
| R | 0.9 | 0.403 | 0.456 | 0.5665 | 0.6705 | 0.7195 | 0.7555 | 0.76288 |
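The two kinds of measure reported in these tables, recall at a cutoff (R@500 to R@5K) and NDCG, can be sketched in their standard textbook form. The NDCG cutoff and the gain values used in the original evaluation are not given in this excerpt, so `k` and `gains` below are assumptions, as are the function names.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for d in ranked_ids[:k] if d in relevant_ids)
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids, gains, k):
    """Normalized discounted cumulative gain with a log2 rank discount.

    gains maps item id -> graded relevance; unlisted items have gain 0.
    """
    dcg = sum(
        gains.get(d, 0.0) / math.log2(i + 2)
        for i, d in enumerate(ranked_ids[:k])
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

Recall only asks whether relevant items are somewhere in the top `k`, while NDCG rewards placing them near the top, which is why a configuration can improve on one measure and not the other.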
IR Results on epic-qa-dev for Passage vs. Document retrieval experiments.
| (a) IR Results on epic-qa-dev for | |||||||
|---|---|---|---|---|---|---|---|
| | | Test EPIC-QA_docs | | | | | |
| Index | reranking | R@500 | R@1K | R@2K | R@3K | R@4K | R@5K |
| passages | yes | 0.6959 | 0.7979 | 0.8597 | 0.8693 | 0.8716 | 0.8716 |
| documents | yes | 0.7041 | 0.7867 | 0.8342 | 0.8538 | 0.8655 | 0.8686 |
Best values of the hyperparameters for the passage- and document-based systems, and their results.
| Context | Passage | Document |
|---|---|---|
| fine-tune | squad+quac | squad+quac |
| number_contexts | 1,000 | 100 |
| number_answers | 15 | 15 |
| normalization | document-level | document-level |
| combination | ir_norm+qa_norm | ir_norm+qa_norm |
| NDNS-Partial | 0.2178 | 0.3044 |
| NDNS-Relaxed | 0.2177 | 0.3051 |
| NDNS-Exact | 0.2482 | 0.3411 |
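One way to read the `ir_norm+qa_norm` combination is as the sum of a normalized retrieval score and a normalized answer-extraction score for each candidate answer. The sketch below assumes min-max normalization over all candidates for one question. The excerpt does not define the normalization scheme, and the `document-level` setting from the table is not reproduced here; the dictionary keys and function names are hypothetical.

```python
def min_max(values):
    """Scale values to [0, 1]; identical values all map to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_answers(candidates, number_answers=15):
    """Rank candidate answers by normalized IR score + normalized QA score.

    candidates: list of dicts with keys 'answer', 'ir_score', 'qa_score'
    (assumed structure). Returns the top (combined_score, answer) pairs.
    """
    if not candidates:
        return []
    ir_norm = min_max([c["ir_score"] for c in candidates])
    qa_norm = min_max([c["qa_score"] for c in candidates])
    scored = [
        (ir + qa, c["answer"])
        for ir, qa, c in zip(ir_norm, qa_norm, candidates)
    ]
    # Stable sort: candidates with equal scores keep their input order.
    scored.sort(key=lambda pair: -pair[0])
    return scored[:number_answers]
```

The default `number_answers=15` mirrors the value in the table; normalizing both scores first keeps either component from dominating the sum merely because of its scale.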
Fig. 1. NDNS-Relaxed results (y-axis) of the exploration for each of the values of the hyperparameters.
Fig. 2. NDNS-Relaxed results for different values of in linear combination.
Results of the exploration of which text field to use as a question.
| Passages-based system | |||
|---|---|---|---|
| Question field | NDNS-Partial | NDNS-Relaxed | NDNS-Exact |
| query | 0.1938 | 0.1946 | 0.2225 |
| question | 0.2178 | 0.2177 | 0.2482 |
| background | 0.2147 | 0.2160 | 0.2474 |
| question+background | 0.2173 | 0.2176 | 0.2477 |
Results on Primary dataset for passages-based and document-based full QA systems.
| System | NDNS-Partial | NDNS-Relaxed | NDNS-Exact |
|---|---|---|---|
| Passages | 0.2200 | 0.2196 | 0.2487 |
| Documents | 0.2860 | 0.2860 | 0.3241 |
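To put the gap between the two systems in relative terms, the scores in the table above can be compared directly. The snippet below is plain arithmetic on those published numbers; the helper name `relative_gain` is illustrative.

```python
# NDNS scores on the Primary dataset, copied from the table above.
passages = {"NDNS-Partial": 0.2200, "NDNS-Relaxed": 0.2196, "NDNS-Exact": 0.2487}
documents = {"NDNS-Partial": 0.2860, "NDNS-Relaxed": 0.2860, "NDNS-Exact": 0.3241}


def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline


gains = {m: relative_gain(passages[m], documents[m]) for m in passages}
for metric, gain in gains.items():
    print(f"{metric}: {gain:+.1%}")
```

On all three NDNS variants the document-based system scores roughly 30% higher than the passage-based one, which matches the conclusion stated in the abstract.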