Yuelun Zhang, Siyu Liang, Yunying Feng, Qing Wang, Feng Sun, Shi Chen, Yiying Yang, Xin He, Huijuan Zhu, Hui Pan.
Abstract
BACKGROUND: Systematic review is an indispensable tool for optimal evidence collection and evaluation in evidence-based medicine. However, the explosive growth of the original literature makes critical appraisal and regular updating difficult. Artificial intelligence (AI) algorithms have been applied to automate the literature screening procedure in medical systematic reviews. These studies used different algorithms and reported widely varying results. It is therefore imperative to systematically review and analyse the automatic methods developed for literature screening and their effectiveness as reported in current studies.
Keywords: Artificial intelligence; Diagnostic test accuracy; Evidence-based practice; Natural language process; Protocol; Systematic review
Year: 2022 PMID: 35031074 PMCID: PMC8760775 DOI: 10.1186/s13643-021-01881-5
Source DB: PubMed Journal: Syst Rev ISSN: 2046-4053
Review question
| Item | Description |
|---|---|
| “Participants”* | Original publications and literatures identified by electronic literature search |
| Index test | Automatic literature screening models using artificial intelligence algorithms |
| Reference standard | Traditional literature screening by human investigators |
| Outcome | Primary outcome: diagnostic accuracy, measured by sensitivity, specificity, precision, NPV, PPV, NLR, PLR, DOR, F-measure, accuracy, and AUC of automatic literature screening models.<br>Secondary outcomes: labour and time saving, mainly evaluated by the percentage of retrieved literatures that the reviewers do not have to read (because they have been screened out by the automatic literature screening models). |
Abbreviations: AUC, area under curve; DOR, diagnostic odds ratio; NLR, negative likelihood ratio; NPV, negative predictive value; PLR, positive likelihood ratio; PPV, positive predictive value
*The “participants” in our review refer to the original publications and literatures identified in a systematic literature search, rather than human participants or patients in traditional systematic reviews
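The primary outcome measures listed above can all be derived from a 2×2 table that compares the automatic screening decision (index test) with the human screening decision (reference standard). The following is a minimal sketch under that assumption, not the authors' analysis code; the function name `screening_metrics` and the example counts are hypothetical. Precision and PPV are the same quantity, and F-measure is computed here as F1. AUC is omitted because it requires ranked scores rather than a single 2×2 table.

```python
import math


def screening_metrics(tp, fp, fn, tn):
    """Accuracy measures for one screening model against human screening.

    tp: included by both the model and the human reviewers
    fp: included by the model, excluded by the human reviewers
    fn: excluded by the model, included by the human reviewers
    tn: excluded by both
    """

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else math.nan

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    ppv = ratio(tp, tp + fp)  # identical to precision
    npv = ratio(tn, tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": ppv,
        "PPV": ppv,
        "NPV": npv,
        "PLR": ratio(sensitivity, 1 - specificity),
        "NLR": ratio(1 - sensitivity, specificity),
        "DOR": ratio(tp * tn, fp * fn),
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only (not data from any study).
metrics = screening_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Returning NaN for undefined ratios (for example, DOR when there are no false positives or false negatives) is one design choice among several; a continuity correction is a common alternative in meta-analysis.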
Definitions of variables in data extraction
| Variable | Definitions |
|---|---|
| Study characteristics | |
| Year | Year of publication |
| Authors | Last name of authors |
| Study type | Article, abstract, or systematic review |
| Journal, conference | Name of journal or conference |
| Training set information | |
| Training set | Name of dataset used for training |
| Area | General medicine, detailed disease, or specific intervention |
| Source | Name of electronic databases searched for building training set |
| Time range | Time range of training set |
| Type of publication | Abstract, or full-text |
| Number of all literatures | Number of all literatures in training set |
| Number of included literatures | Number of included literatures identified by the step of screening in training set |
| Training method | Supervised, semi-supervised, or unsupervised |
| Validation set information | |
| Validation set | Name of dataset used for validation |
| Area | General, disease, or intervention |
| Source | Name of electronic database searched for building validation set |
| Time range | Time range of validation set |
| Type of publication | Abstract, or full-text |
| Number of all literatures | Number of all literatures in validation set |
| Number of included literatures | Number of included literatures identified by the step of screening in validation set |
| Gold standard | Process of screening by human investigators |
| AI algorithm information | |
| Model name | Name of model |
| Model type | Classification, regression, ranking, or others |
| Model performance | Including but not limited to sensitivity, specificity, precision, NPV, PPV, NLR, PLR, DOR, F-measure, accuracy, and AUC |
| Cost saving | Reduction in the number of literatures screened by human investigators |
Abbreviations: AUC, area under curve; DOR, diagnostic odds ratio; NLR, negative likelihood ratio; NPV, negative predictive value; PLR, positive likelihood ratio; PPV, positive predictive value
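As an illustration of the cost-saving variable, the share of retrieved literatures that human investigators no longer need to read can be computed from two counts. This is a sketch under the assumption that every record screened out by the model is skipped by the reviewers; the function name `proportion_not_read` and the example numbers are hypothetical.

```python
def proportion_not_read(n_retrieved, n_screened_out):
    """Fraction of retrieved records that reviewers do not have to read.

    n_retrieved: all records identified by the literature search
    n_screened_out: records excluded by the automatic screening model
    """
    if n_retrieved <= 0:
        raise ValueError("n_retrieved must be positive")
    if not 0 <= n_screened_out <= n_retrieved:
        raise ValueError("n_screened_out must be between 0 and n_retrieved")
    return n_screened_out / n_retrieved


# Hypothetical example: 860 of 1000 retrieved records are screened out.
saving = proportion_not_read(n_retrieved=1000, n_screened_out=860)
print(f"Reviewers skip {saving:.0%} of retrieved records")
```

The absolute reduction in workload is simply `n_screened_out`; the proportion makes results comparable across datasets of different sizes.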
The revised QUADAS-2 tool for risk of bias assessment
| Domains | Signalling questions | Answers |
|---|---|---|
| “Patient” (literature) selection | | |
| | Was a consecutive or random sample of literatures enrolled? | Yes/no/unclear |
| | Was a case-control design avoided? | Yes/no/unclear |
| | Did the study avoid inappropriate exclusions? | Yes/no/unclear |
| | Could the selection of literatures have introduced bias? | Low/high/unclear risk |
| | Is there concern that the included literatures do not match the review question? | Low/high/unclear risk |
| Index test (AI algorithms in literature screening) | | |
| | Were the index test results interpreted without knowledge of the results of the reference standard? | Yes/no/unclear |
| | If a threshold was used, was it pre-specified? | Yes/no/unclear |
| | Could the conduct or interpretation of the index test have introduced bias? | Low/high/unclear risk |
| | Is there concern that the index test, its conduct, or interpretation differ from the review question? | Low/high/unclear risk |
| Reference standard (results of screening by human investigators) | | |
| | Is the reference standard likely to correctly classify the target condition? | Yes/no/unclear |
| | Were the reference standard results interpreted without knowledge of the results of the index test? | Yes/no/unclear |
| | Could the reference standard, its conduct, or its interpretation have introduced bias? | Low/high/unclear risk |
| | Is there concern that the target condition as defined by the reference standard does not match the review question? | Low/high/unclear risk |
| Flow and timing | | |
| | Did all literatures receive a reference standard? | Yes/no/unclear |
| | Did literatures receive the same reference standard? | Yes/no/unclear |
| | Were all literatures included in the analysis? | Yes/no/unclear |
| | Could the literature flow have introduced bias? | Low/high/unclear risk |