James A Evans, Andrey Rzhetsky.
Abstract
Life scientists today cannot hope to read everything relevant to their research. Emerging text-mining tools can help by identifying topics and distilling statements from books and articles with increased accuracy. Researchers often organize these statements into ontologies, consistent systems of reality claims. Like scientific thinking and interchange, however, text-mined information (even when accurately captured) is complex, redundant, sometimes incoherent, and often contradictory: it is rooted in a mixture of only partially consistent ontologies. We review work that models scientific reason and suggest how computational reasoning across ontologies and the broader distribution of textual statements can assess the certainty of statements and the process by which statements become certain. With the emergence of digitized data regarding networks of scientific authorship, institutions, and resources, we explore the possibility of accounting for social dependences and cultural biases in reasoning models. Computational reasoning is starting to fill out ontologies and flag internal inconsistencies in several areas of bioscience. In the not too distant future, scientists may be able to use statements and rich models of the processes that produced them to identify underexplored areas, resurrect forgotten findings and ideas, deconvolute the spaghetti of underlying ontologies, and synthesize novel knowledge and hypotheses.
Year: 2011 PMID: 21566119 PMCID: PMC3129146 DOI: 10.1074/jbc.R110.176370
Source DB: PubMed Journal: J Biol Chem ISSN: 0021-9258 Impact factor: 5.157
FIGURE 1. A, estimated number of distinct pages from the Online Computer Library Center WorldCat Database of books and journals in 71,000 libraries across 121 countries, split by manuscripts and journals, broad subject area, and the most common eight languages from 1450 to present. B, manuscript pages, by language, mapped against major historical events of the 20th century. C, distribution of volumes across libraries: number of volumes plotted against the number of Online Computer Library Center libraries in which each is held, split by manuscript and serials, subject, and language. All distributions feature a spiking tail, suggesting a core collection of books that appears in nearly all libraries. D, growth in the number and publication age of journal pages available via the Internet and freely on the Internet (without institutional subscription) from 1998 to 2006.
FIGURE 2. A, hypothetical temporal sequence of experimental findings (1 and 0 in the beakers) and published articles (1 and 0 in the papers) (15). Early findings are reflected accurately in publications, whereas scientists' interpretation of later findings takes the history of publication into account. B, the broader social network in which the scientists in A live. The positive correlation between social ties in B and the propositional agreement in A suggests that communication induces accord.
FIGURE 3. Elements, context, and processes involved in scientific reasoning. Arrows represent causes or influences. The figure emphasizes the scope of first-generation computational reasoning and the emerging second-generation reasoning we describe in this minireview.