Kazuki Oshita, Masaru Tomita, Kazuharu Arakawa.
Abstract
With the availability of numerous curated databases, researchers can now make efficient use of a multitude of biological data by integrating these resources via hyperlinks and cross-references. A large proportion of bioinformatics research, however, still involves labor-intensive tasks such as fetching, parsing, and merging datasets and functional annotations from distributed multi-domain databases. This data integration issue is one of the key challenges in bioinformatics. As part of a solution to this problem, we aim to provide an identifier conversion and data aggregation service named G-Links, which 1) gathers resource URI information from 130 databases and 30 web services in a gene-centric manner so that users can retrieve all available links for a given gene, 2) provides a RESTful API for easy retrieval of links, including facet searching based on keywords and/or predicate types, and 3) produces a variety of outputs: a visual HTML page, tab-delimited text, and Semantic Web formats such as Notation3 and RDF. G-Links and the relevant documentation are available at http://link.g-language.org/.
Keywords: databases, bioinformatics, data integration, molecular biology
Year: 2014 PMID: 26673001 PMCID: PMC4670005 DOI: 10.12688/f1000research.5754.2
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. URL Syntax of G-Links.
G-Links is implemented as a RESTful service that can be queried by altering the URL. Full documentation and example queries are available at http://www.g-language.org/wiki/glinks.
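Because queries are expressed by altering the URL, a client only needs to compose a path. The following is a minimal Python sketch of that idea, not the documented syntax: the base URL and the `BRCA1_HUMAN` identifier come from this record, while the `key=value` path-segment layout, the `format` option name, and the `tsv` value are assumptions made for illustration. The helper `build_glinks_url` is hypothetical; the documentation linked above defines the actual syntax.

```python
from urllib.parse import quote

# Base URL taken from the abstract.
BASE_URL = "http://link.g-language.org"


def build_glinks_url(identifier, **options):
    """Compose a G-Links-style URL: base / identifier / key=value / ...

    ASSUMPTION: options are passed as key=value path segments; check the
    G-Links documentation for the real option names and values.
    """
    parts = [BASE_URL, quote(identifier, safe=":")]
    for key, value in options.items():
        # Percent-encode each value so it is safe inside a URL path segment.
        parts.append(f"{key}={quote(str(value), safe=':')}")
    return "/".join(parts)


# Hypothetical query: tab-delimited output for the UniProt ID of human BRCA1.
print(build_glinks_url("BRCA1_HUMAN", format="tsv"))
```

Keeping URL construction in a pure function separates it from the network request, so a query can be inspected or logged before anything is fetched.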
Figure 2. HTML output example of BRCA1_HUMAN (UniProt ID of BRCA1 gene in humans).
By default, accessing G-Links with a web browser displays the results as interactive HTML, with a related image gallery implemented with CoverFlow (http://imageflow.finnrudolph.de/) at the top, followed by a large table of annotations and cross-references.
Overview of supported databases and web services in G-Links.
Detailed lists of the input and output databases are available at http://link.g-language.org/input_list and http://link.g-language.org/output_list.
| Databases (132) | ||||
|---|---|---|---|---|
| Genome(11) | Phosphorylation(3) | |||
| Gene(6) | Ortholog(7) | Cluster(1) | Expression(4) | |
| SNP(2) | Phylogenesis(2) | |||
| Protein(4) | Structure(5) | Classification(1) | Cluster(4) | |
| Family/Domain/Motif(9) | PPI(4) | Enzyme(3) | ||
| Molecular | Pathway/Reaction(5) | Disease/Pathogen/Drug(6) | ||
| Others(15) | Paper(3) | Organisms specific(31) | ||
| Web Services (33) | ||||
| Alignment Local(1) | Data Retrieval Chemistry | |||
| Nucleic Composition(5) | Nucleic CpG Islands(1) | Nucleic Translation(1) | Nucleic Repeats(3) | |
| Protein Properties(5) | Protein 2D Structure(3) | Protein Composition(3) | Protein Motif(3) | |
| Protein Localization(4) | Protein Domains(2) | Protein Functional | ||