Abstract
The number of scholarly documents available on the web is estimated using capture/recapture methods by studying the coverage of two major academic search engines: Google Scholar and Microsoft Academic Search. Our estimates show that at least 114 million English-language scholarly documents are accessible on the web, of which Google Scholar has nearly 100 million. Of these, we estimate that at least 27 million (24%) are freely available, since they do not require a subscription or payment of any kind. At a finer scale, we also estimate the number of scholarly documents on the web for fifteen fields, as defined by Microsoft Academic Search: Agricultural Science, Arts and Humanities, Biology, Chemistry, Computer Science, Economics and Business, Engineering, Environmental Sciences, Geosciences, Material Science, Mathematics, Medicine, Physics, Social Sciences, and Multidisciplinary. We further show that the percentage of documents that are freely available varies significantly among these fields, from 12% to 50%.
Year: 2014 PMID: 24817403 PMCID: PMC4015892 DOI: 10.1371/journal.pone.0093949
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. To estimate the number of scientific documents on the web, $N$, let $n_{GM}$ equal the number of citations found in both Scholar and MAS for a collection of papers, and let $n_G$ be the number of citations reported by Scholar.
Then $n_{GM}/n_G$ is an estimate of $M/N$, the fraction of documents indexed by MAS. The total number of documents $N$ would be $N = M \, n_G / n_{GM}$, where $M$ is the size of MAS.
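The capture/recapture reasoning in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration of the standard two-sample (Lincoln-Petersen style) estimate, not the authors' code: the function name, the argument names, and the numbers in the example are all hypothetical.

```python
def estimate_total(mas_size: int, n_scholar: int, n_both: int) -> float:
    """Capture/recapture estimate of the total number of documents.

    mas_size  -- number of documents indexed by MAS
    n_scholar -- citations reported by Scholar for a sample of papers
    n_both    -- how many of those citations are also found in MAS

    n_both / n_scholar estimates the fraction of all documents that MAS
    indexes, so dividing the size of MAS by that fraction gives the total.
    """
    if n_scholar <= 0 or n_both <= 0:
        raise ValueError("sample counts must be positive")
    if n_both > n_scholar:
        raise ValueError("overlap cannot exceed the Scholar count")
    return mas_size * n_scholar / n_both


# Made-up numbers, for illustration only (not taken from the paper).
total = estimate_total(mas_size=1_000_000, n_scholar=500, n_both=200)
print(total)  # 2500000.0
```

With these made-up inputs MAS would cover 200/500 = 40% of the sampled citations, so the total comes out at 2.5 times the size of MAS.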
Figure 2. Relative number of documents by scholarly search engines and databases.
Total and Google Scholar are estimates.
The estimated number of documents on the web for each field.
| Discipline | Size in MAS | Estimate of Size #1 | Estimate of Size #2 |
| Agriculture Science | 447,134 | 1,088,711 | 1,026,904 |
| Arts & Humanities | 1,373,959 | 5,286,355 | 3,155,485 |
| Biology | 4,135,959 | 8,019,640 | 9,498,798 |
| Chemistry | 4,428,253 | 10,704,454 | 10,170,091 |
| Computer Science | 3,555,837 | 6,912,148 | 8,166,468 |
| Economics & Business | 1,019,038 | 2,733,855 | 2,340,360 |
| Engineering | 3,683,363 | 7,947,425 | 8,459,349 |
| Environmental Sciences | 461,653 | 975,211 | 1,060,249 |
| Geosciences | 1,306,307 | 2,302,957 | 3,000,113 |
| Material Science | 913,853 | 3,062,641 | 2,098,789 |
| Mathematics | 1,207,412 | 2,634,321 | 2,772,987 |
| Medicine | 12,056,840 | 24,652,433 | 27,690,190 |
| Physics | 5,012,733 | 13,033,269 | 11,512,430 |
| Social Science | 1,928,477 | 6,072,285 | 4,429,012 |
| Multidisciplinary | 9,648,534 | 25,798,026 | 22,159,184 |
| Total Sum | | 121,223,731 | 117,540,415 |
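As a quick consistency check, the two totals can be recomputed from the per-field rows. The sketch below copies the figures from the table above; the variable names are illustrative only.

```python
# Rows copied from the table: (discipline, size in MAS, estimate #1, estimate #2).
ROWS = [
    ("Agriculture Science", 447_134, 1_088_711, 1_026_904),
    ("Arts & Humanities", 1_373_959, 5_286_355, 3_155_485),
    ("Biology", 4_135_959, 8_019_640, 9_498_798),
    ("Chemistry", 4_428_253, 10_704_454, 10_170_091),
    ("Computer Science", 3_555_837, 6_912_148, 8_166_468),
    ("Economics & Business", 1_019_038, 2_733_855, 2_340_360),
    ("Engineering", 3_683_363, 7_947_425, 8_459_349),
    ("Environmental Sciences", 461_653, 975_211, 1_060_249),
    ("Geosciences", 1_306_307, 2_302_957, 3_000_113),
    ("Material Science", 913_853, 3_062_641, 2_098_789),
    ("Mathematics", 1_207_412, 2_634_321, 2_772_987),
    ("Medicine", 12_056_840, 24_652_433, 27_690_190),
    ("Physics", 5_012_733, 13_033_269, 11_512_430),
    ("Social Science", 1_928_477, 6_072_285, 4_429_012),
    ("Multidisciplinary", 9_648_534, 25_798_026, 22_159_184),
]

# Totals as reported in the "Total Sum" row.
REPORTED_TOTAL_1 = 121_223_731
REPORTED_TOTAL_2 = 117_540_415

total_1 = sum(est1 for _, _, est1, _ in ROWS)
total_2 = sum(est2 for _, _, _, est2 in ROWS)

print(total_1 - REPORTED_TOTAL_1)  # 0
print(total_2 - REPORTED_TOTAL_2)  # -6
```

The first column of estimates sums exactly to the reported total. The second differs by 6 documents, which is presumably a rounding effect in the per-field values.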
Figure 3. The relative number of documents on the web for each of the 15 fields as defined by MAS.
The percentage of publicly available scholarly documents found in Google Scholar.
| Field | % of Public | 95% CI | Estimate of Size | 95% Lower Bound |
| Agriculture Science | 12 | ±6.3 | 130,645 | 72,446 |
| Arts & Humanities | 24 | ±8.3 | 1,268,725 | 897,331 |
| Biology | 25 | ±8.4 | 2,004,910 | 1,433,666 |
| Chemistry | 22 | ±8.1 | 2,354,979 | 1,625,540 |
| Computer Science | 50 | ±9.8 | 3,456,074 | 2,887,549 |
| Economics & Business | 42 | ±9.6 | 1,148,219 | 926,256 |
| Engineering | 12 | ±6.3 | 953,691 | 528,852 |
| Environmental Sciences | 29 | ±8.8 | 282,811 | 210,017 |
| Geosciences | 35 | ±9.3 | 806,034 | 625,341 |
| Material Science | 12 | ±6.3 | 367,516 | 203,799 |
| Mathematics | 27 | ±8.7 | 711,266 | 518,878 |
| Medicine | 26 | ±8.5 | 6,409,632 | 4,630,828 |
| Physics | 35 | ±9.3 | 4,561,644 | 3,539,034 |
| Social Science | 19 | ±7.6 | 1,153,734 | 761,868 |
| Multidisciplinary | 43 | ±9.7 | 11,093,151 | 8,992,160 |
| Total | | | 36,703,036 | 27,853,573 |
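The columns of this table appear to follow from the percentage of public documents and the first size estimate for each field in the earlier table. The sketch below reproduces the Computer Science row under assumptions that are not stated in this excerpt: a sample of 100 documents per field, a normal-approximation (Wald) interval with z = 1.96 for the two-sided margin, and z = 1.645 for a one-sided lower bound. These values were chosen because they match the row; the function and parameter names are hypothetical.

```python
import math


def public_estimate(field_size, pct_public, n_sample=100,
                    z_two_sided=1.96, z_one_sided=1.645):
    """Return (margin in percentage points, estimated public docs, lower bound).

    n_sample and both z values are assumptions, not taken from the source.
    """
    p = pct_public / 100.0
    # Standard error of a sample proportion (normal approximation).
    se = math.sqrt(p * (1.0 - p) / n_sample)
    margin_pct = 100.0 * z_two_sided * se
    size = p * field_size
    lower = (p - z_one_sided * se) * field_size
    return margin_pct, size, lower


# Computer Science: first size estimate 6,912,148 and 50% public.
margin, size, lower = public_estimate(6_912_148, 50)
print(round(margin, 1))  # 9.8
print(int(size))         # 3456074
print(int(lower))        # 2887549
```

Other rows come out close to the tabulated values under the same assumptions, though small differences from rounding or truncation are to be expected.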