Duncan Hull, Steve R Pettifer, Douglas B Kell.
Abstract
Many scientists now manage the bulk of their bibliographic information electronically, thereby organizing their publications and citation material from digital libraries. However, a library has been described as "thought in cold storage," and unfortunately many digital libraries can be cold, impersonal, isolated, and inaccessible places. In this Review, we discuss the current chilly state of digital libraries for the computational biologist, including PubMed, IEEE Xplore, the ACM digital library, ISI Web of Knowledge, Scopus, Citeseer, arXiv, DBLP, and Google Scholar. We illustrate the current process of using these libraries with a typical workflow, and highlight problems with managing data and metadata using URIs. We then examine a range of new applications such as Zotero, Mendeley, Mekentosj Papers, MyNCBI, CiteULike, Connotea, and HubMed that exploit the Web to make these digital libraries more personal, sociable, integrated, and accessible places. We conclude with how these applications may begin to help achieve a digital defrost, and discuss some of the issues that will help or hinder this in terms of making libraries on the Web warmer places in the future, becoming resources that are considerably more useful to both humans and machines.
Year: 2008 PMID: 18974831 PMCID: PMC2568856 DOI: 10.1371/journal.pcbi.1000204
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Table 1. A summary of some of the digital libraries described in this Review.
| Name | Domain | Size | Style of Metadata | Persistent Inbound Links? | Persistent Outbound Links? | Full Text? | Access |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ACM Digital Library | Computer science | >54,000 articles | BibTeX, EndNote | Yes, see ACM section in text | Not applicable | For subscribers | Metadata and abstract free, full paper for subscribers only |
| IEEE Xplore | Computer science | Unknown | EndNote, Procite, Refman | Yes, see Xplore section in text | Not applicable | For subscribers | Metadata and abstract free, full paper for subscribers only |
| DBLP | Mostly computer science | >900,000 articles | BibTeX | Yes, see DBLP section in text | Various, including DOIs | Links to publisher DOIs | Metadata free |
| PubMed | Life sciences and biomedicine | >17,000,000 articles | XML, NLM, DTD | Yes, see PubMed section in text | LinkOut and links to publisher sites | Links to publisher DOIs | Metadata and abstract free |
| PubMed Central | Life sciences and biomedicine | >750,000 | XML, Dublin Core, RDF | Yes, see text | Not applicable | Yes | Free access to data and metadata |
| Web of Knowledge | Broad scientific coverage | >15,000,000 | BibTeX, EndNote, Refman, Procite | No, see WoK section in text | Links to publisher sites | Links to publisher DOIs | Subscription only |
| Scopus | Broad scientific coverage | >33,000,000 | RefWorks, EndNote, Refman, Procite | Yes, see Scopus section in text | Links to publisher sites | Links to publisher DOIs | Subscription only |
| Citeseer | Broad coverage | >760,000 | BibTeX | Yes, see Citeseer section in text | Local cache and links to self-archived papers | Yes | Free access |
| Google Scholar | Broad coverage | Not published | Nothing very exportable, HTML only | Yes, see Google Scholar section in text | Direct links to publishers and self-archived grey literature | Yes (includes grey literature and self-archived) | Free access |
| arXiv | Mainly physical sciences | >44,000 | BibTeX | Yes, see section on arXiv in text | Links to self-archived material in some PDFs | Yes | Free access |
Note that this table summary does not cover all the minutiae of licensing issues.
Figure 1. A mind map [207] summarizing the contents of this article.
Figure 2. The approximate relative coverage and size of selected digital libraries described in the section Digital Libraries, DOIs, and URIs, and summarised in Table 1.
Of all the libraries described, Google Scholar probably has the widest coverage. However, it is currently not clear exactly how much information Google indexes, what the criteria are for inclusion in the index, and whether it subsumes other digital libraries in the way shown in the figure. Note: the size of the sets (circles) in this diagram is NOT proportional to the actual size of the libraries they represent, and DBLP, Scopus, and arXiv are shown as a single set for clarity rather than correctness.
Figure 3. Google Scholar search results, identified by http://scholar.google.com/scholar?q=mygrid.
Google Scholar links out to external content using a number of methods including OpenURL [89], shown here by the “Find it via JRUL” (JRUL is a local library) links. Unlike, e.g., WoK, it is relatively easy to create inbound links to individual authors and publications in Google Scholar; see text for details.
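The inbound link in the caption above is simply a search URL with the query in the `q` parameter. As a minimal sketch of how such a link could be built programmatically, the Python below uses only the standard library; the function name `scholar_search_url` is a hypothetical helper introduced for illustration, and the base URL is the one quoted in the caption. It builds the link only and does not contact Google Scholar.

```python
from urllib.parse import urlencode

# Base URL taken from the Figure 3 caption.
SCHOLAR_SEARCH = "http://scholar.google.com/scholar"


def scholar_search_url(query: str) -> str:
    """Return an inbound Google Scholar link for a free-text query.

    Hypothetical helper: it only percent-encodes the query into the
    `q` parameter; it makes no network request.
    """
    return f"{SCHOLAR_SEARCH}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Reproduces the link quoted in the caption.
    print(scholar_search_url("mygrid"))
```

Encoding the query with `urlencode` rather than string concatenation keeps the link valid when the query contains spaces or punctuation, for example an author name.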
Figure 4. A typical workflow for using a digital library representing a subset of the literature.
Tasks represented by white nodes are normally performed exclusively by humans, while tasks shown in blue nodes can be performed wholly or partly by machines of some kind. The main problematic tasks that make digital libraries difficult to use for both machines and humans are “GET” (publication) and “GET METADATA”. These are shown in bold and discussed further in the Identity Crisis section of this paper.
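Both "GET" and "GET METADATA" start from an identifier that has to be turned into a URI. The sketch below illustrates that step for the identifiers listed in this record (PMID, PMCID, DOI). The class name `PublicationIds` and its methods are hypothetical, and the URL patterns are assumptions about commonly used resolver and landing-page addresses rather than anything specified in the Review; no network request is made.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class PublicationIds:
    """Identifiers for one publication (hypothetical container)."""

    doi: str
    pmid: str
    pmcid: str

    def doi_uri(self) -> str:
        # Assumed pattern: the public DOI resolver followed by the DOI.
        return "https://doi.org/" + quote(self.doi, safe="/")

    def pubmed_uri(self) -> str:
        # Assumed pattern for a PubMed record page.
        return f"https://pubmed.ncbi.nlm.nih.gov/{self.pmid}/"

    def pmc_uri(self) -> str:
        # Assumed pattern for a PubMed Central full-text page.
        return f"https://www.ncbi.nlm.nih.gov/pmc/articles/{self.pmcid}/"


if __name__ == "__main__":
    # Identifiers as given in this record.
    ids = PublicationIds(
        doi="10.1371/journal.pcbi.1000204",
        pmid="18974831",
        pmcid="PMC2568856",
    )
    for uri in (ids.doi_uri(), ids.pubmed_uri(), ids.pmc_uri()):
        print(uri)
```

The same publication ends up with three different URIs, which is a small instance of the identity problem the caption refers to: a tool has to know that all three point at one article.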
Figure 5. Mekentosj Papers can organize large collections of locally stored PDF files, with their metadata.
It looks and feels much like the popular iTunes application, allowing users to manage their digital libraries by the categories shown at the top. It is presently available only for Mac OS X.