Wichor M Bramer, Melissa L Rethlefsen, Jos Kleijnen, Oscar H Franco.
Abstract
BACKGROUND: In systematic reviews, it is advisable to search multiple databases for relevant references. However, searching databases is laborious and time-consuming because the syntax of search strategies is database-specific. We aimed to determine the optimal combination of databases needed to conduct efficient searches for systematic reviews, and whether current practice in published reviews is appropriate. While previous studies determined the coverage of databases, we analyzed the actual retrieval from the original searches for systematic reviews.
Keywords: Databases, bibliographic; Information storage and retrieval; Review literature as topic; Sensitivity and specificity
Year: 2017 PMID: 29208034 PMCID: PMC5718002 DOI: 10.1186/s13643-017-0644-y
Source DB: PubMed Journal: Syst Rev ISSN: 2046-4053
Definitions of general measures of performance in searches
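| Measure | Definition |
|---|---|
| Recall | Number of included references retrieved / total number of included references |
| Precision | Number of included references retrieved / total number of references retrieved |
| Number Needed to Read | Total number of references retrieved / number of included references retrieved |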
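The three measures can be computed directly from retrieval counts. The sketch below is illustrative only: the function names are hypothetical, and the Embase figures (85,521 results, 1500 includes) are taken from the performance table later in this record. The recall example uses made-up counts.

```python
def recall(included_retrieved: int, included_total: int) -> float:
    """Share of all included references that the search retrieved."""
    return included_retrieved / included_total


def precision(included_retrieved: int, total_retrieved: int) -> float:
    """Share of retrieved references that were included in the review."""
    return included_retrieved / total_retrieved


def number_needed_to_read(included_retrieved: int, total_retrieved: int) -> float:
    """References screened per included reference (inverse of precision)."""
    return total_retrieved / included_retrieved


# Embase row of the performance table: 85,521 results, 1500 includes.
embase_results, embase_includes = 85_521, 1_500
print(f"precision: {precision(embase_includes, embase_results):.1%}")  # 1.8%
print(f"NNR: {number_needed_to_read(embase_includes, embase_results):.0f}")  # 57

# Hypothetical counts: 8 of 10 included references retrieved.
print(f"recall: {recall(8, 10):.0%}")  # 80%
```

Precision and Number Needed to Read are reciprocals of each other, so a database with low precision implies a proportionally larger screening workload.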
Description of topics of included references (only values above 5% are shown)
| Department ( | |
| Surgery | 13 (24%) |
| Epidemiology | 10 (18%) |
| Internal medicine | 3 (5%) |
| Orthopedics | 3 (5%) |
| Patient ( | |
| Neoplasms | 6 (12%) |
| Wounds and injuries | 6 (12%) |
| Musculoskeletal diseases | 5 (10%) |
| Cardiovascular diseases | 5 (10%) |
| Nutritional and metabolic diseases | 5 (10%) |
| Intervention ( | |
| Chemicals and drugs category | 12 (39%) |
| Surgical procedures, operative | 8 (26%) |
| Food and beverages | 2 (6%) |
| Biological factors | 2 (6%) |
| Domain ( | |
| Therapy | 19 (35%) |
| Etiology | 13 (24%) |
| Epidemiology | 6 (11%) |
| Diagnosis | 6 (11%) |
| Management | 5 (9%) |
| Prognosis | 5 (9%) |
| Study types ( | |
| No limits mentioned | 48 (83%) |
| RCTs | 5 (9%) |
| RCTs and cohort studies/case control studies | 5 (9%) |
Number of unique included references by each specific database
| Database | Number of reviews that used the database | Number of reviews with unique references | Number of unique references |
|---|---|---|---|
| Embase | 58 | 29 (50%) | 132 (45%) |
| MEDLINE | 58 | 27 (47%) | 69 (24%) |
| Web of Science | 58 | 19 (33%) | 37 (13%) |
| Google Scholar | 58 | 24 (41%) | 37 (13%) |
| CINAHL | 18 | 1 (6%) | 6 (2%) |
| Scopus | 24 | 3 (13%) | 5 (2%) |
| PsycINFO | 11 | 1 (9%) | 2 (1%) |
| SportDiscus | 2 | 2 (100%) | 3 (1%) |
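Counting unique references comes down to set arithmetic. The following is a minimal sketch, assuming that a reference is "unique" to a database when no other database searched for the same review retrieved it. The function name and the reference identifiers are hypothetical and are not data from the study.

```python
def unique_references(retrieved: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each database, return the references no other database retrieved."""
    unique = {}
    for db, refs in retrieved.items():
        # Union of everything retrieved by the other databases.
        others = set().union(
            *(r for name, r in retrieved.items() if name != db)
        )
        unique[db] = refs - others
    return unique


# Hypothetical included references for one review (identifiers are made up).
example = {
    "Embase": {"r1", "r2", "r3"},
    "MEDLINE": {"r2", "r3", "r4"},
    "Web of Science": {"r3"},
}

for db, refs in unique_references(example).items():
    print(db, sorted(refs))
```

In this toy example Embase and MEDLINE each contribute one unique reference, while Web of Science contributes none.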
Performance of several databases and database combinations in terms of sensitivity and precision
| Database combination | # results | # includes | Overall recall (a) | Median recall (b) | Minimum recall (c) | Percentage 100% recall (d) | Precision (e) | Number needed to read (f) |
|---|---|---|---|---|---|---|---|---|
| Embase (EM) | 85,521 | 1500 | 85.9% | 87.3% | 45.8% | 13.8% | 1.8% | 57 |
| MEDLINE (ML) | 56,340 | 1375 | 78.8% | 82.9% | 50.0% | 8.6% | 2.4% | 41 |
| Web of Science (WoS) | 48,561 | 1189 | 68.1% | 72.5% | 13.2% | 6.9% | 2.4% | 41 |
| Google Scholar (GS) | 10,342 | 601 | 34.4% | 38.0% | 8.3% | 5.2% | 5.8% | 17 |
| EM-ML | 100,444 | 1621 | 92.8% | 94.6% | 66.7% | 24.1% | 1.6% | 62 |
| EM-WoS | 104,444 | 1585 | 90.8% | 93.8% | 57.9% | 27.6% | 1.5% | 66 |
| EM-GS | 91,411 | 1570 | 89.9% | 93.3% | 65.8% | 25.9% | 1.7% | 58 |
| ML-WoS | 75,263 | 1481 | 84.8% | 87.1% | 60.0% | 15.5% | 2.0% | 51 |
| ML-GS | 62,230 | 1459 | 83.6% | 89.8% | 63.2% | 15.5% | 2.3% | 43 |
| WoS-GS | 54,451 | 1320 | 75.6% | 85.7% | 23.7% | 13.8% | 2.4% | 41 |
| EM-ML-GS | 106,334 | 1674 | 95.9% | 97.4% | 78.9% | 41.4% | 1.6% | 64 |
| EM-ML-WoS | 119,367 | 1674 | 95.9% | 97.1% | 71.1% | 37.9% | 1.4% | 70 |
| EM-WoS-GS | 110,334 | 1638 | 93.8% | 98.1% | 65.8% | 44.8% | 1.5% | 67 |
| ML-WoS-GS | 81,153 | 1528 | 87.5% | 92.6% | 70.0% | 29.3% | 1.9% | 53 |
| EM-ML-GS-WoS | 125,257 | 1716 | 98.3% | 100.0% | 78.9% | 74.1% | 1.4% | 73 |
a Overall recall: The total number of included references retrieved by the databases divided by the total number of included references retrieved by all databases.
b Median recall: The median value of recall per review.
c Minimum recall: The lowest value of recall per review.
d Percentage 100% recall: The percentage of reviews for which the database combination retrieved all included references.
e Precision: The number of included references divided by the total number of references retrieved.
f Number Needed to Read: The total number of references retrieved divided by the number of included references.
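The four recall summaries defined in the footnotes differ only in how per-review counts are aggregated. A minimal sketch, assuming each review is represented as a pair (included references retrieved by the combination, included references retrieved by all databases); the function name and the example counts are hypothetical.

```python
from statistics import median


def summarize_recall(per_review: list[tuple[int, int]]) -> dict[str, float]:
    """Aggregate per-review (retrieved, total) counts into the four summaries."""
    recalls = [found / total for found, total in per_review]
    return {
        # Pooled over all reviews: total retrieved / total included.
        "overall": sum(f for f, _ in per_review) / sum(t for _, t in per_review),
        "median": median(recalls),
        "minimum": min(recalls),
        # Share of reviews in which every included reference was retrieved.
        "pct_full_recall": sum(f == t for f, t in per_review) / len(per_review),
    }


# Four hypothetical reviews.
reviews = [(10, 10), (8, 10), (3, 6), (20, 20)]
for name, value in summarize_recall(reviews).items():
    print(f"{name}: {value:.1%}")
```

Overall recall is dominated by reviews with many included references, whereas the median and minimum treat every review equally, which is why the table reports them separately.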
Fig. 1 Percentage of systematic reviews for which a certain database combination reached a certain recall. The x-axis represents the percentage of reviews for which a specific combination of databases, as shown on the y-axis, reached a certain recall (represented with bar colors). Abbreviations: EM Embase, ML MEDLINE, WoS Web of Science, GS Google Scholar. Asterisk indicates that the recall of all databases has been calculated over all included references. The recall of the database combinations was calculated over all included references retrieved by any database
Fig. 2 Percentage of systematic reviews of a certain domain for which the combination Embase, MEDLINE and Cochrane CENTRAL reached a certain recall
Fig. 3 Legend of Figs. 3 and 4
Fig. 4 The ratio between number of results per database combination and the total number of results for all databases
Fig. 5 The ratio between precision per database combination and the total precision for all databases
Calculation of probability of acceptable recall of a PubMed sample of systematic reviews
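| Database combination | Frequency | Frequency percentage | Probability recall > 95% (%) | Frequency percentage × probability recall > 95% | Probability recall 100% (%) | Frequency percentage × probability recall 100% |
|---|---|---|---|---|---|---|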
| EM-ML | 73 | 37 | 47 | 17 | 24 | 9 |
| ML | 41 | 21 | 16 | 3 | 9 | 2 |
| EM-ML-WoS | 40 | 20 | 64 | 13 | 36 | 7 |
| ML-WoS | 21 | 11 | 21 | 2 | 16 | 2 |
| ML-GS | 7 | 4 | 26 | 1 | 16 | 1 |
| ML-WoS-GS | 7 | 4 | 37 | 1 | 29 | 1 |
| EM-ML-GS | 5 | 3 | 76 | 2 | 41 | 1 |
| EM | 2 | 1 | 19 | 0 | 14 | 0 |
| EM-WoS | 1 | 1 | 40 | 0 | 28 | 0 |
| WoS | 1 | 1 | 7 | 0 | 7 | 0 |
| Total | 198a | 100 | | 40 | | 23 |
a Two reviews did not use any of the databases used in this evaluation.
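The totals in this table behave like a frequency-weighted average: each combination's probability of reaching a recall threshold is weighted by how often that combination was used. The sketch below reproduces that arithmetic from the rows above. Using frequency / 198 as the weight is an assumption, and the function name is hypothetical.

```python
# (combination, frequency, P(recall > 95%) in %, P(recall = 100%) in %),
# copied from the table rows.
ROWS = [
    ("EM-ML", 73, 47, 24),
    ("ML", 41, 16, 9),
    ("EM-ML-WoS", 40, 64, 36),
    ("ML-WoS", 21, 21, 16),
    ("ML-GS", 7, 26, 16),
    ("ML-WoS-GS", 7, 37, 29),
    ("EM-ML-GS", 5, 76, 41),
    ("EM", 2, 19, 14),
    ("EM-WoS", 1, 40, 28),
    ("WoS", 1, 7, 7),
]


def weighted_probability(rows, column: int) -> float:
    """Frequency-weighted mean of the probability stored in `column` (in %)."""
    total = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total


print(round(weighted_probability(ROWS, 2)))  # recall > 95%: 40
print(round(weighted_probability(ROWS, 3)))  # recall 100%: 23
```

Under this assumption the rounded results agree with the 40 and 23 reported in the Total row.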