Investigation of protein functions through data-mining on integrated human transcriptome database, H-Invitational database (H-InvDB).

Chisato Yamasaki, Kanako O Koyanagi, Yasuyuki Fujii, Takeshi Itoh, Roberto Barrero, Takuro Tamura, Yumi Yamaguchi-Kabata, Motohiko Tanino, Jun-Ichi Takeda, Satoshi Fukuchi, Satoru Miyazaki, Nobuo Nomura, Sumio Sugano, Tadashi Imanishi, Takashi Gojobori.

Abstract

H-Invitational Database (H-InvDB) is a human transcriptome database containing integrative annotation of 41,118 full-length cDNA clones originating from 21,037 loci. H-InvDB is a product of the H-Invitational project, an international collaboration to systematically and functionally validate human genes by analysis of a unique set of high-quality full-length cDNA clones using automatic annotation and human curation under unified criteria. Here, 19,574 proteins encoded by these cDNAs were classified into 11,709 function-known and 7,865 function-unknown hypothetical proteins by similarity with protein databases and motif prediction (InterProScan). The proportion of "hypothetical proteins" in H-InvDB was as high as 40.4%. In this study, we thus conducted data-mining in H-InvDB with the aim of assigning advanced functional annotations to those hypothetical proteins. First, by data-mining in the H-InvDB version of GTOP, we identified 337 SCOP domains within 7,865 H-Inv hypothetical proteins. Second, by data-mining of predicted subcellular localization by SOSUI and TMHMM in H-InvDB, we found 1,032 transmembrane proteins within H-Inv hypothetical proteins. These results clearly demonstrate that structural prediction is effective for functional annotation of proteins with unknown functions. All the data in H-InvDB are shown in two main views, the cDNA view and the Locus view, and five auxiliary databases with web-based viewers: DiseaseInfo Viewer, H-ANGEL, Clustering Viewer, G-integra, and TOPO Viewer; the data are also provided as flat files and XML files. The data consist of descriptions of gene structures, novel alternative splicing isoforms, functional RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs in relation to orphan diseases, gene expression profiling, and comparisons with mouse full-length cDNAs in the context of molecular evolution. This unique integrative platform for conducting in silico data-mining represents a substantial contribution to resources required for the exploration of human biology and pathology.

Year:  2005        PMID: 16185827     DOI: 10.1016/j.gene.2005.05.036

Source DB:  PubMed          Journal:  Gene        ISSN: 0378-1119            Impact factor:   3.688


  14 in total

Review 1.  The transcript repeat element: the human Alu sequence as a component of gene networks influencing cancer.

Authors:  Paula Moolhuijzen; Jerzy K Kulski; David S Dunn; David Schibeci; Roberto Barrero; Takashi Gojobori; Matthew Bellgard
Journal:  Funct Integr Genomics       Date:  2010-08       Impact factor: 3.410

Review 2.  Microarrays.

Authors:  Robert Plomin; Leonard C Schalkwyk
Journal:  Dev Sci       Date:  2007-01

3.  Generalist Genes: Genetic Links Between Brain, Mind, and Education.

Authors:  Robert Plomin; Yulia Kovas; Claire M A Haworth
Journal:  Mind Brain Educ       Date:  2007-03

4.  Saturation of the human phenome.

Authors:  Mark E Samuels
Journal:  Curr Genomics       Date:  2010-11       Impact factor: 2.236

Review 5.  Bioinformatics tools and novel challenges in long non-coding RNAs (lncRNAs) functional analysis.

Authors:  Letizia Da Sacco; Antonella Baldassarre; Andrea Masotti
Journal:  Int J Mol Sci       Date:  2011-12-23       Impact factor: 5.923

6.  H-DBAS: alternative splicing database of completely sequenced and manually annotated full-length cDNAs based on H-Invitational.

Authors:  Jun-ichi Takeda; Yutaka Suzuki; Mitsuteru Nakao; Tsuyoshi Kuroda; Sumio Sugano; Takashi Gojobori; Tadashi Imanishi
Journal:  Nucleic Acids Res       Date:  2006-11-27       Impact factor: 16.971

7.  H-InvDB in 2009: extended database and data mining resources for human genes and transcripts.

Authors:  Chisato Yamasaki; Katsuhiko Murakami; Jun-ichi Takeda; Yoshiharu Sato; Akiko Noda; Ryuichi Sakate; Takuya Habara; Hajime Nakaoka; Fusano Todokoro; Akihiro Matsuya; Tadashi Imanishi; Takashi Gojobori
Journal:  Nucleic Acids Res       Date:  2009-11-23       Impact factor: 16.971

8.  Comparative genome analysis of three eukaryotic parasites with differing abilities to transform leukocytes reveals key mediators of Theileria-induced leukocyte transformation.

Authors:  Kyoko Hayashida; Yuichiro Hara; Takashi Abe; Chisato Yamasaki; Atsushi Toyoda; Takehide Kosuge; Yutaka Suzuki; Yoshiharu Sato; Shuichi Kawashima; Toshiaki Katayama; Hiroyuki Wakaguri; Noboru Inoue; Keiichi Homma; Masahito Tada-Umezaki; Yukio Yagi; Yasuyuki Fujii; Takuya Habara; Minoru Kanehisa; Hidemi Watanabe; Kimihito Ito; Takashi Gojobori; Hideaki Sugawara; Tadashi Imanishi; William Weir; Malcolm Gardner; Arnab Pain; Brian Shiels; Masahira Hattori; Vishvanath Nene; Chihiro Sugimoto
Journal:  MBio       Date:  2012-09-04       Impact factor: 7.867

9.  Distribution and effects of nonsense polymorphisms in human genes.

Authors:  Yumi Yamaguchi-Kabata; Makoto K Shimada; Yosuke Hayakawa; Shinsei Minoshima; Ranajit Chakraborty; Takashi Gojobori; Tadashi Imanishi
Journal:  PLoS One       Date:  2008-10-14       Impact factor: 3.240

10.  Low conservation and species-specific evolution of alternative splicing in humans and mice: comparative genomics analysis using well-annotated full-length cDNAs.

Authors:  Jun-Ichi Takeda; Yutaka Suzuki; Ryuichi Sakate; Yoshiharu Sato; Masahide Seki; Takuma Irie; Nono Takeuchi; Takuya Ueda; Mitsuteru Nakao; Sumio Sugano; Takashi Gojobori; Tadashi Imanishi
Journal:  Nucleic Acids Res       Date:  2008-10-05       Impact factor: 16.971
