Moushumi Sen Sarma, David Arcoleo, Radhika S Khetani, Brant Chee, Xu Ling, Xin He, Jing Jiang, Qiaozhu Mei, ChengXiang Zhai, Bruce Schatz.
Abstract
With the rapid decrease in cost of genome sequencing, the classification of gene function is becoming a primary problem. Such classification has been performed by human curators who read biological literature to extract evidence. BeeSpace Navigator is a prototype software for exploratory analysis of gene function using biological literature. The software supports an automatic analogue of the curator process to extract functions, with a simple interface intended for all biologists. Since extraction is done on selected collections that are semantically indexed into conceptual spaces, the curation can be task specific. Biological literature containing references to gene lists from expression experiments can be analyzed to extract concepts that are computational equivalents of a classification such as Gene Ontology, yielding discriminating concepts that differentiate gene mentions from other mentions. The functions of individual genes can be summarized from sentences in biological literature, to produce results resembling a model organism database entry that is automatically computed. Statistical frequency analysis based on literature phrase extraction generates offline semantic indexes to support these gene function services. The website with BeeSpace Navigator is free and open to all; there is no login requirement at www.beespace.illinois.edu for version 4. Materials from the 2010 BeeSpace Software Training Workshop are available at www.beespace.illinois.edu/bstwmaterials.php.
Year: 2011 PMID: 21558175 PMCID: PMC3125736 DOI: 10.1093/nar/gkr285
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Interaction schematic of system usage, showing services interacting with spaces. A ‘space’ is a set of scientific documents for a particular task. (a) Search (or filter) any system or user space to make new, smaller spaces. (b) Use terms of interest as seeds to cluster together related terms in a space; save the documents associated with a given cluster of terms as a smaller space. (c) Summarize functional information about a gene in the context of a space. (d) Analyze gene lists in the context of a space to reveal significant ‘concepts’ and the genes present, and select concepts and/or genes to make new, smaller spaces.
Figure 2. System spaces provide standard starting collections for new task spaces within the BioSpace.
Figure 3. Analyze transforms a gene list into a list of discriminating concepts within a space.
Figure 4. Cluster partitions a space into groups of related documents that each can become a space.
Figure 5. Summarize describes the gene functions of a specified gene within a task-specific space.
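The Analyze service (Figure 3) is described as turning a gene list into discriminating concepts through statistical frequency analysis. The record does not give the scoring method, so the following is only a minimal sketch under stated assumptions: a space is taken to be a list of plain-text documents, concepts are approximated by single lowercase tokens, and terms are ranked by a smoothed log ratio of document frequencies. The function name `discriminating_concepts`, its parameters, and the toy documents are hypothetical and are not part of BeeSpace Navigator.

```python
import math
from collections import Counter


def discriminating_concepts(docs, gene_names, top_k=3, smoothing=1.0):
    """Rank terms that occur more often in documents mentioning a listed
    gene than in the remaining documents of the space.

    Assumption: a smoothed log ratio of document frequencies stands in for
    whatever statistic the real system uses.
    """
    genes = {g.lower() for g in gene_names}
    with_gene, without_gene = Counter(), Counter()
    n_with = n_without = 0

    for text in docs:
        tokens = set(text.lower().split())  # count each term once per document
        if tokens & genes:
            n_with += 1
            with_gene.update(tokens - genes)  # do not score the gene names themselves
        else:
            n_without += 1
            without_gene.update(tokens)

    scores = {}
    for term, count in with_gene.items():
        p_with = (count + smoothing) / (n_with + 2 * smoothing)
        p_without = (without_gene[term] + smoothing) / (n_without + 2 * smoothing)
        scores[term] = math.log(p_with / p_without)

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Toy space with made-up sentences and placeholder gene names.
space = [
    "geneA regulates foraging behavior",
    "geneB influences foraging onset",
    "wing development in flies",
    "wing pigment patterns",
]
print(discriminating_concepts(space, ["geneA", "geneB"], top_k=1))  # ['foraging']
```

In this toy space, "foraging" appears in both documents that mention a listed gene and in neither of the others, so it ranks first. A real index would operate on extracted phrases rather than whitespace tokens.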