Shaina Raza, Brian Schwartz, Laura C Rosella.
Abstract
BACKGROUND: Due to the growing volume of COVID-19 research literature, medical experts, clinical scientists, and researchers frequently struggle to stay up to date on the most recent findings. There is a pressing need to help researchers and practitioners mine and respond to COVID-19-related questions in a timely manner.
Keywords: CORD-19; COVID-19; LitCOVID; Long-COVID; Pipeline; Post-COVID-19; Question answering system; Transformer model
Year: 2022 PMID: 35655148 PMCID: PMC9160513 DOI: 10.1186/s12859-022-04751-6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Fig. 1Construction of reference-standard dataset
General details of the datasets used in this work
| Dataset | Total articles | Articles used in this work | Timeline of articles | Files |
|---|---|---|---|---|
| CORD-19 | ~ 1,450,000 articles in all formats (PDF, XML) | 7978 (only PMC articles) | March 2020 till December 2021 | Consists of the following files10: (1) document embeddings for each paper; (2) a collection of JSON files with the full text of CORD-19 papers; (3) metadata for all papers ('PMID', 'title', 'paragraphs', 'URL', 'publication date', 'DOI'). |
| LitCOVID | ~ 207,630 | 9877 (PMC articles) | April 2020 till December 2021 | Full article texts provided in JSON and XML format; we use the full texts and metadata. |
*Both datasets are updated periodically with new COVID-19 articles; we use the December 2021 data, the latest checkpoints available as of 31 December 2021
10. https://github.com/allenai/cord19
Fig. 2Example of an extractive QA system composed of a question, context and answer
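As in Fig. 2, an extractive QA system returns an answer that is a verbatim span of the supplied context. A minimal sketch of this setting, using a made-up question/context pair (not from the paper's data):

```python
# Hypothetical illustration of extractive QA: the answer is a span
# copied verbatim from the context, identified by character offsets.
def extract_span(context: str, answer: str):
    """Return (start, end) character offsets of the answer span in the context,
    or None if the answer does not occur verbatim (i.e. is not extractive)."""
    start = context.find(answer)
    if start == -1:
        return None
    return start, start + len(answer)

question = "When was the CORD-19 dataset released?"
context = "The CORD-19 dataset was released in March 2020 by the Allen Institute for AI."
answer = "March 2020"

span = extract_span(context, answer)
print(span, "->", context[span[0]:span[1]])  # (36, 46) -> March 2020
```

A real reader model predicts the start/end offsets itself; this sketch only shows the relationship among question, context, and answer span.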
Fig. 3CoQUAD system architecture
Fig. 4CoQUAD-MPNet
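Fig. 4 names an MPNet-based component. A common role for such a sentence encoder is dense retrieval: rank passages by cosine similarity between their embeddings and the query embedding. The sketch below uses random stand-in vectors in place of actual MPNet embeddings (the encoder itself is not reproduced here):

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k(query, passages, k=5):
    """Indices of the k passages most similar to the query embedding."""
    order = sorted(range(len(passages)), key=lambda i: -cosine(query, passages[i]))
    return order[:k]

# Stand-in embeddings (in the actual system these would come from the
# MPNet encoder applied to the query and to candidate passages).
random.seed(0)
passages = [[random.gauss(0, 1) for _ in range(128)] for _ in range(50)]
query = [x + 0.01 * random.gauss(0, 1) for x in passages[7]]  # query near passage 7

print(top_k(query, passages, k=3)[0])  # nearest passage index: 7
```

The retriever's output (top-k passages) is what the reader then scans for answer spans, matching the pipeline structure of Fig. 3.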
Terms used for precision, recall, and accuracy
| | Relevant | Non-relevant | Total |
|---|---|---|---|
| Retrieved | A | B | A + B |
| Not retrieved | C | D | C + D |
| Total | A + C | B + D | A + B + C + D |
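From this 2x2 contingency table, the standard definitions follow directly: precision = A/(A+B), recall = A/(A+C), and accuracy = (A+D)/(A+B+C+D). A small sketch with illustrative counts (not values from the paper):

```python
def retrieval_metrics(a, b, c, d):
    """Compute precision, recall, and accuracy from the 2x2 retrieval table:
    a = relevant & retrieved,      b = non-relevant & retrieved,
    c = relevant & not retrieved,  d = non-relevant & not retrieved."""
    precision = a / (a + b)
    recall = a / (a + c)
    accuracy = (a + d) / (a + b + c + d)
    return precision, recall, accuracy

# Illustrative counts only:
p, r, acc = retrieval_metrics(a=30, b=10, c=20, d=40)
print(p, r, acc)  # 0.75 0.6 0.7
```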
Evaluation of QA pipeline
| Evaluation metric | Top@1 | Top@5 | Top@10 | Top@20 |
|---|---|---|---|---|
| Recall (single document) | 0.495 | 0.711 | 0.720 | |
| Recall (multiple documents) | 0.494 | 0.716 | 0.720 | |
| Mean reciprocal rank (MRR) | 0.495 | 0.572 | 0.582 | |
| Precision | 0.344 | 0.342 | 0.304 | |
| Mean average precision (MAP) | 0.494 | 0.672 | 0.690 | |
| F1-Score | 0.504 | 0.636 | 0.636 | |
| Exact match (EM) | 0.539 | 0.549 | 0.698 | |
| Semantic answer similarity (SAS) | 0.503 | 0.623 | 0.687 | |
| Accuracy | 0.895 (same for all top @k) | |||
Bold means best result
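The table reports exact match (EM), F1, and mean reciprocal rank (MRR), among others. A hedged sketch of how these are conventionally computed for QA (a simplified SQuAD-style evaluation: lowercasing only, without the full answer-normalization rules; not the paper's exact script):

```python
def exact_match(prediction: str, gold: str) -> int:
    """1 if the predicted answer string equals the gold answer (case-insensitive)."""
    return int(prediction.strip().lower() == gold.strip().lower())

def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between predicted and gold answers."""
    pred, ref = prediction.lower().split(), gold.lower().split()
    ref_counts = {}
    for t in ref:
        ref_counts[t] = ref_counts.get(t, 0) + 1
    common = 0
    for t in pred:
        if ref_counts.get(t, 0) > 0:
            common += 1
            ref_counts[t] -= 1
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)

def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the first correct answer per question (None if absent)."""
    return sum(0.0 if r is None else 1.0 / r for r in ranks) / len(ranks)

print(exact_match("March 2020", "march 2020"))             # 1
print(round(token_f1("in March 2020", "March 2020"), 3))   # 0.8
print(mean_reciprocal_rank([1, 2, None, 4]))               # 0.4375
```

EM rewards only verbatim matches, while token F1 gives partial credit for overlapping spans, which is why F1 typically exceeds EM at the same top@k.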
Fig. 5SAS scores of Reader during different values of top @k
Fig. 6EM score of all models
Fig. 7F1-score of all models