Yasunori Yamamoto, Toshihisa Takagi.
Abstract
BACKGROUND: Many online resources for the life sciences have recently been developed and introduced in peer-reviewed papers, ranging from databases and web applications to data-analysis software. Some have been introduced in special journal issues or on websites with a search function, but others remain scattered throughout the Internet and the published literature. The resources listed on such searchable sites are collected and maintained manually; they are therefore of higher quality than those on automatically updated sites, but they also require more time and effort to maintain.
DESCRIPTION: We developed an online resource search system called OReFiL to address these issues. We developed a crawler to gather all of the web pages whose URLs appear in MEDLINE abstracts and in full-text papers in the BioMed Central open-access journals. The URLs were extracted using regular expressions and rules based on our heuristic knowledge. We then indexed the online resources to facilitate their retrieval and comparison by researchers. Because every online resource has at least one PubMed ID, we can easily acquire its summary with Medical Subject Headings (MeSH) terms and confirm its credibility through reference to the corresponding PubMed entry. In addition, because OReFiL automatically extracts URLs and updates the index, minimal time and effort are needed to maintain the system.
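The URL extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual regular expressions or heuristic rules used by OReFiL, so the pattern `URL_PATTERN`, the trailing-punctuation rule, and the function name `extract_urls` below are all assumptions chosen for illustration only.

```python
import re

# Hypothetical pattern (not the one used by OReFiL): match http/https/ftp
# URLs and bare "www." hosts, stopping at whitespace, quotes, and brackets.
URL_PATTERN = re.compile(
    r"""(?:(?:https?|ftp)://|www\.)[^\s<>"'()\[\]{}]+""",
    re.IGNORECASE,
)

# Assumed heuristic: punctuation at the very end of a match usually belongs
# to the surrounding sentence, not to the URL.
TRAILING_PUNCTUATION = ".,;:!?"


def extract_urls(text):
    """Return the distinct URLs found in text, in order of first appearance."""
    urls = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0).rstrip(TRAILING_PUNCTUATION)
        # Assumed heuristic: give scheme-less "www." hosts an http:// prefix.
        if url.lower().startswith("www."):
            url = "http://" + url
        if url not in urls:
            urls.append(url)
    return urls
```

Calling `extract_urls` on an abstract's text returns a list of candidate URLs that a crawler could then visit. A production system would likely need further rules, for example for URLs broken across lines.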
Year: 2007 PMID: 17683589 PMCID: PMC1976328 DOI: 10.1186/1471-2105-8-287
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. MeSH term distribution. Distribution of MeSH terms at the second level of the hierarchy among those annotated to all the retrievable MEDLINE abstracts. Note that the following categories were excluded: "L" (Information Science), "V" (Publication Components), and "Z" (Geographic Locations).
Figure 2. Screen image of OReFiL. This image shows the search result for the query "protein protein interaction". MeSH terms annotated to the MEDLINE abstracts in the hit list, together with their conceptual ancestors in the MeSH hierarchy, are displayed in alphabetical order in the MeSH term box (encircled by a dotted line), and the font size of each term reflects its frequency. MeSH terms can also be used to filter the result, narrowing it down to those entries that have a specified MeSH term. Clicking a MeSH term in the box adds it to the query; clicking the same MeSH term a second time removes it from the query.
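The behaviour of the MeSH term box can be sketched roughly as follows. This is not OReFiL's implementation: the dotted tree-number handling in `ancestors`, the linear font scaling, the point sizes, and the set-based `toggle_filter` are assumptions, and any tree numbers and term names passed in are supplied by the caller.

```python
from collections import Counter


def ancestors(tree_number):
    """Return a dotted tree number and its ancestors: 'A.B.C' -> ['A', 'A.B', 'A.B.C']."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def term_frequencies(hits, names):
    """Count, per term name, the number of hits annotated with it or a descendant.

    hits  : list of lists of tree numbers, one inner list per hit
    names : mapping from tree number to term name (assumed lookup table)
    """
    counts = Counter()
    for tree_numbers in hits:
        expanded = set()
        for tree_number in tree_numbers:
            expanded.update(ancestors(tree_number))
        # Use a set of names so each term is counted once per hit.
        counts.update({names[t] for t in expanded if t in names})
    return counts


def font_sizes(counts, min_pt=10, max_pt=24):
    """Map each term to a font size, scaled linearly by frequency, in alphabetical order."""
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for term in sorted(counts, key=str.lower):
        scale = (counts[term] - low) / span if span else 0.0
        sizes[term] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


def toggle_filter(active_terms, term):
    """Clicking a term adds it to the filter; clicking it again removes it."""
    return active_terms ^ {term}
```

Counting each term once per hit keeps a single heavily annotated abstract from dominating the box; whether OReFiL counts this way is not stated in the caption.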
Figure 3. Growth of online resources and MEDLINE. The number of URLs that appear in MEDLINE abstracts. DNS-resolvable URLs are those whose server name can be resolved. Page-accessible URLs are those whose page can be accessed (the server returns the HTTP status code 200). MEDLINE growth is shown for reference.
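The two URL categories in Figure 3 could be checked along the following lines. This is a minimal sketch using only the Python standard library, not the crawler described in the paper. The function names, the 10-second timeout, and the injectable `resolve` and `fetch` parameters (added so the logic can be exercised without network access) are assumptions.

```python
import socket
import urllib.error
import urllib.request
from urllib.parse import urlparse


def is_dns_resolvable(url, resolve=socket.gethostbyname):
    """Return True if the URL's server name can be resolved to an address."""
    host = urlparse(url).hostname
    if not host:
        return False
    try:
        resolve(host)
        return True
    except (OSError, UnicodeError):  # socket.gaierror is a subclass of OSError
        return False


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was obtained.

    urlopen follows redirects, so this is the status of the final page.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        return None


def classify(url, resolve=socket.gethostbyname, fetch=fetch_status):
    """Classify a URL as DNS-resolvable and/or page-accessible (status 200)."""
    resolvable = is_dns_resolvable(url, resolve)
    accessible = resolvable and fetch(url) == 200
    return {"dns_resolvable": resolvable, "page_accessible": accessible}
```

With the default arguments, `classify(url)` performs a real DNS lookup and HTTP request; passing stub functions for `resolve` and `fetch` allows the classification logic to be checked offline.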