
TACT: Transcriptome Auto-annotation Conducting Tool of H-InvDB.

Chisato Yamasaki, Hiroaki Kawashima, Fusano Todokoro, Yasuhiro Imamizu, Makoto Ogawa, Motohiko Tanino, Takeshi Itoh, Takashi Gojobori, Tadashi Imanishi.

Abstract

Transcriptome Auto-annotation Conducting Tool (TACT) is a newly developed web-based automated tool for conducting functional annotation of transcripts by the integration of sequence similarity searches and functional motif predictions. We developed the TACT system by integrating two kinds of similarity searches, FASTY and BLASTX, against protein sequence databases, UniProtKB (Swiss-Prot/TrEMBL) and RefSeq, and a unified motif prediction program, InterProScan, into the ORF-prediction pipeline originally designed for the 'H-Invitational' human transcriptome annotation project. This system successively applies these constituent programs to an mRNA sequence in order to predict the most plausible ORF and the function of the protein encoded. In this study, we applied the TACT system to 19 574 non-redundant human transcripts registered in H-InvDB and evaluated its predictive power by the degree of agreement with human-curated functional annotation in H-InvDB. As a result, the TACT system could assign functional description to 12 559 transcripts (64.2%), the remainder being hypothetical proteins. Furthermore, the overall agreement of functional annotation with H-InvDB, including those transcripts annotated as hypothetical proteins, was 83.9% (16 432/19 574). These results show that the TACT system is useful for functional annotation and that the prediction of ORFs and protein functions is highly accurate and close to the results of human curation. TACT is freely available at http://www.jbirc.aist.go.jp/tact/.

Year: 2006 PMID: 16845023 PMCID: PMC1538819 DOI: 10.1093/nar/gkl283

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Automatic prediction of transcript function is important and widely applicable to studies based on genome, cDNA and EST sequence data in various species, especially human. Human transcripts have been studied systematically and extensively to outline the human transcriptome (1–5). Other studies have reported the functional annotation of Mus musculus (6) and Arabidopsis (7) full-length cDNAs. However, the functional annotation of those transcripts relied heavily on human curation because there is currently no freely available web server that provides automated functional annotation. The human transcriptome consists of protein-coding and non-protein-coding functional RNAs. Several sequence analysis techniques provide some insight into the function of transcripts. For example, sequence similarity search tools, such as BLASTX (8) and FASTY (9), identify homologs of the transcripts in protein databases, and motif prediction programs, such as InterProScan (10), predict functional motifs in the protein-coding sequence (CDS) of the transcripts. However, none of these alone is sufficient to judge and assign the function of transcripts as protein-coding genes comprehensively. Thus, an integration of sequence analysis tools is necessary. We have previously reported the integrative annotation of human genes by the international cooperative project entitled ‘Human Full-length cDNA Annotation Invitational’ (abbreviated as H-Invitational or H-Inv) and the construction of an integrative database of the human transcriptome, named H-Invitational Database (H-InvDB) (11,12). In the H-Invitational project, we collected information about human full-length cDNAs, and conducted extensive bioinformatics analyses by making full use of biological databases and computational tools and rearranging annotation by biologists (13). 
A standard for human curation was proposed, established and applied to annotate all the collected H-Inv cDNAs. We assigned the standardized functional annotation to 19 574 representative H-Inv proteins by human curation, based on the results of similarity search and InterProScan (11,12). In this study, we describe the newly developed transcriptome auto-annotation conducting tool (TACT), a web-based automated prediction tool for functional annotation that was originally designed for the H-Invitational project. We developed the TACT system by integrating two kinds of similarity searches, BLASTX (8) and FASTY (9), against protein sequence databases and a unified motif prediction program, InterProScan (10), into the ORF-prediction pipeline. This system successively applies these constituent programs to an mRNA or cDNA sequence in order to predict the most plausible ORF and the function as a protein. Furthermore, we applied the TACT system to 19 574 non-redundant human transcripts registered in H-InvDB, and evaluated its predictive power by the degree of agreement with the functional annotation obtained by human curation.

TACT COMPUTATIONAL PIPELINE

The TACT computational pipeline was developed by integrating two sequence similarity searches, BLASTX and FASTY, and motif prediction by InterProScan. The computational analyses are divided into three pipelines (sequence analysis, ORF prediction and auto-functional annotation) and are carried out as follows (Figure 1).

Figure 1

The TACT annotation pipeline. The flowchart illustrates the TACT computational analysis and web server interfaces. The white arrows indicate the input sequence data to TACT and output annotation data from TACT to users. The thick solid arrows indicate the data flow within the TACT server during analysis.

TACT sequence analysis pipeline

All the repetitive and low-complexity sequences in a query sequence are masked using RepeatMasker with Repbase. The masked sequence is then subjected to BLASTX and FASTY against UniProtKB (Swiss-Prot/TrEMBL) and RefSeq (human) protein entries. In parallel, GeneMark (14) ORF prediction is carried out.

TACT ORF prediction pipeline

Then, TACT proceeds to predict the ORF of each sequence by using a custom-made Perl script based on the similarity with UniProtKB (Swiss-Prot/TrEMBL) and RefSeq (human) protein entries and prediction by GeneMark (14). The translated region (ORF) of each sequence is predicted by one of five methods in order of priority [illustrated in Supplementary Figure 1A in our previous study (11)]: (i) prediction based on complete sequence match with known (experimentally verified) reviewed human RefSeq or UniProtKB/Swiss-Prot protein entries; (ii) prediction based on similarity with known RefSeq, UniProtKB/Swiss-Prot or UniProtKB/TrEMBL entries; (iii) prediction based on similarity with hypothetical RefSeq, UniProtKB/Swiss-Prot or UniProtKB/TrEMBL entries; (iv) prediction by GeneMark with probability larger than 0.5 and length longer than 80 amino acids; (v) the longest possible translation of initiation to termination codon in six frames of length greater than 80 amino acids.
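The lowest-priority method (v) can be illustrated with a short sketch. The following Python code is not the authors' Perl script. It is a minimal illustration, assuming that "initiation codon" means ATG, that the standard stop codons TAA, TAG and TGA apply, and that the amino-acid length counts the codons from ATG up to but excluding the stop codon. The function names are hypothetical.

```python
# Illustrative sketch of method (v): longest ATG-to-stop ORF in six frames.
# Assumptions (not from the paper): ATG as the only start codon, standard
# stop codons, and length counted in codons excluding the stop codon.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_AA = 80  # the text requires a length greater than 80 amino acids


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq, min_aa=MIN_ORF_AA):
    """Return (strand, start, end, aa_length) of the longest ORF, or None.

    start and end are 0-based, end-exclusive offsets on the scanned strand;
    for strand "-" they refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame initiation codon
                elif start is not None and codon in STOP_CODONS:
                    aa_len = (i - start) // 3
                    if aa_len > min_aa and (best is None or aa_len > best[3]):
                        best = (strand, start, i + 3, aa_len)
                    start = None  # look for the next ORF in this frame
    return best


if __name__ == "__main__":
    example = "ATG" + "GCT" * 100 + "TAA"
    print(longest_orf(example))
```

In a priority scheme such as the one described above, a function like this would be consulted only when the similarity-based methods (i) to (iii) and GeneMark (iv) yield no ORF.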

TACT auto-functional annotation

For each sequence with predicted ORF, TACT conducts InterProScan (10). Then the automated-functional annotation for each sequence is predicted by using a custom-made Perl script applying five standards in order of priority. TACT assigns the most appropriate protein or domain ID, named as ‘data source ID’, to describe the functions of transcripts as proteins and classifies them according to five similarity criteria [illustrated in Supplementary Figure 2B in our previous study (11)]: (i) identical hit by BLASTX or FASTY (identity ≥98% and coverage 100%) to a known human protein in reviewed RefSeq or Swiss-Prot entries (Category I); (ii) similar hit by BLASTX or FASTY (identity ≥50%) to a known protein of any species in RefSeq, UniProtKB/Swiss-Prot or UniProtKB/TrEMBL entries (Category II); (iii) meaningful (with indication of protein functions) InterPro hit by InterProScan (Category III); (iv) similar hit by BLASTX or FASTY (identity ≥50%) to a hypothetical protein in RefSeq or UniProt entries (Category IV); (v) no data source ID assigned (Category V). Category I is solely for human transcripts, but the ‘data source ID’ can be assigned to the input sequence of any species without any restriction. A problem in assigning data source IDs to transcripts is that the data sources are sometimes proteins without experimental verifications. We thus introduced a text-based judgement scheme to determine ‘known proteins’ and ‘meaningful InterPro domain’. In practice, we avoided proteins with the following keywords that suggest proteins without experimental verification in the description: (i) hypothetical, (ii) similar to, (iii) names of cDNA clones (Rik, KIAA, FLJ, DKFZ, HSPC, MGC, CHGC and IMAGE) and (iv) IDs of InterPro domain frequent hitters. Hits to proteins with these keywords are automatically ignored. For any sequence with no predicted ORF, the definition of the sequence is automatically annotated as ‘Non-protein-coding transcript’. 
Several additional sequence analyses were integrated into the TACT annotation system to provide useful reference data. For example, Gene ontology (GO) terms are assigned through the relations of InterPro IDs to GO terms. All the results of the analysis and annotated data are temporarily stored in a PostgreSQL database and are made accessible through the TACT web-based interfaces (Figure 1).
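The five similarity criteria and the keyword-based judgement of 'known proteins' can be sketched as a small decision function. This is a hedged illustration, not the TACT Perl script: the `Hit` record, the function names, and the simple substring matching are assumptions made for the example. In particular, the sketch treats any hit whose description contains one of the listed keywords as a hypothetical-protein hit that can only support Category IV, and it omits the list of InterPro frequent hitters, which the text does not enumerate.

```python
from dataclasses import dataclass

# Keywords from the text that suggest a protein lacks experimental verification.
UNVERIFIED_PHRASES = ("hypothetical", "similar to")
CLONE_NAME_TAGS = ("Rik", "KIAA", "FLJ", "DKFZ", "HSPC", "MGC", "CHGC", "IMAGE")


@dataclass
class Hit:
    """One BLASTX or FASTY hit (hypothetical record layout for this sketch)."""
    description: str
    identity: float    # percent identity
    coverage: float    # percent coverage
    is_human: bool
    is_reviewed: bool  # reviewed RefSeq or Swiss-Prot entry


def is_known(description):
    """Text-based judgement: True if no 'unverified' keyword is present."""
    lowered = description.lower()
    if any(phrase in lowered for phrase in UNVERIFIED_PHRASES):
        return False
    return not any(tag in description for tag in CLONE_NAME_TAGS)


def classify(hits, has_meaningful_interpro_hit):
    """Return the similarity category ("I" to "V") in order of priority."""
    known = [h for h in hits if is_known(h.description)]
    unverified = [h for h in hits if not is_known(h.description)]
    if any(h.is_human and h.is_reviewed and h.identity >= 98 and h.coverage >= 100
           for h in known):
        return "I"    # identical to a known human protein
    if any(h.identity >= 50 for h in known):
        return "II"   # similar to a known protein of any species
    if has_meaningful_interpro_hit:
        return "III"  # InterPro domain-containing protein
    if any(h.identity >= 50 for h in unverified):
        return "IV"   # conserved hypothetical protein
    return "V"        # no data source ID assigned


if __name__ == "__main__":
    mouse_hit = Hit("67 kDa polymerase-associated factor PAF67", 67.0, 90.0, False, False)
    print(classify([mouse_hit], has_meaningful_interpro_hit=True))
```

Because the checks run in priority order, a transcript with both a Category II hit and an InterPro domain is reported as Category II, which matches the ordering given in the text.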

ACCURACY OF TACT

In the H-Invitational annotation project, we assigned each transcript the most appropriate protein or domain ID, named as ‘data source ID’, to describe the function of cDNA as a protein. The judgement was done one by one by human curation, following a standard scheme [illustrated in Supplementary Figure 4 in our previous study (11)]. For 19 574 representative H-Inv cDNAs, we conducted functional annotation by human curation using a custom-made annotation system. In this report, we evaluated the accuracy of the TACT system by the agreement between TACT annotation and human curation for 19 574 representative H-Inv cDNAs. The overall agreement was 83.9%, i.e. 16 432 TACT annotations, including those annotated as hypothetical proteins, agreed with H-Inv human curation. This result shows that by integrating three sequence analysis programs, the prediction of functional annotation becomes highly accurate and closer to the results of human curation. In our annotation pipeline, we classified proteins into five similarity categories (Table 1): Category I proteins were defined as ‘Identical to a known human protein’; Category II proteins were defined as ‘Similar to a known protein’; Category III proteins were defined as ‘Domain-containing proteins’; Category IV proteins were defined as ‘Conserved hypothetical proteins’; Category V proteins were defined as ‘Hypothetical proteins’. The agreements of TACT annotation with H-InvDB for each similarity category are summarized in Table 1. In the H-Inv annotation project we checked the abstracts in PubMed corresponding to the candidate protein entries for data source ID in the human curation procedure. The agreement was the lowest in Category II. For example, for HIT000012743 (AK056129), TACT auto-annotation output Q91YE4 as data source ID, so that the definition of the cDNA was ‘Similar to 67 kDa polymerase-associated factor PAF67’ [Category II; Similar to a mouse (Q91YE4) protein]. 
However, by checking the PubMed abstracts for Q91YE4, we found that the annotation should be altered to ‘TPR repeat containing protein’ [Category III; InterPro domain (IPR001440)-containing protein] because the function of Q91YE4 was not yet examined experimentally. This example illustrates the importance of human curation in functional annotation of transcripts. Also, exclusion of proteins described in specific PubMed entries is a possible way of improving the accuracy of the TACT auto-functional annotation pipeline. The human-curated annotation was provided at the appropriate cDNA view in H-InvDB.

Table 1

The degree of agreement of TACT annotation and human curation

Similarity category	Description	No. of H-Inv proteins examined	No. of correctly predicted proteins by TACT	The agreement of TACT annotation and human curation (%)
I	Identical to a known human protein. (Identity ≥98% and coverage =100%)	5313	4735	89.1
II	Similar to a known protein of any species. (Identity ≥50%)	5859	3469	59.2
III	InterPro domain-containing protein	1387	1320	95.2
IV	Conserved hypothetical protein	1309	1265	96.6
V	Hypothetical protein	5706	5643	98.9
Total		19 574	16 432	83.9
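The agreement percentages can be recomputed from the counts in Table 1 as the number of correctly predicted proteins divided by the number examined. The short Python sketch below does this for each category and for the total. The dictionary layout and function name are illustrative choices; only the counts come from the table.

```python
# (examined, correctly predicted) counts per similarity category, from Table 1.
TABLE_1 = {
    "I": (5313, 4735),
    "II": (5859, 3469),
    "III": (1387, 1320),
    "IV": (1309, 1265),
    "V": (5706, 5643),
}


def agreement_percent(examined, agreed):
    """Agreement between TACT annotation and human curation, in percent."""
    return round(100.0 * agreed / examined, 1)


per_category = {cat: agreement_percent(n, k) for cat, (n, k) in TABLE_1.items()}
total_examined = sum(n for n, _ in TABLE_1.values())
total_agreed = sum(k for _, k in TABLE_1.values())
overall = agreement_percent(total_examined, total_agreed)

if __name__ == "__main__":
    for cat, pct in per_category.items():
        print(f"Category {cat}: {pct}%")
    print(f"Total: {total_agreed}/{total_examined} = {overall}%")
```

The per-category counts sum to the 19 574 transcripts examined and the 16 432 agreeing annotations, which gives the overall agreement of 83.9% quoted in the text.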

TACT INPUT DATA

Nucleotide sequence data consisting of mRNA, cDNA or EST sequences may be submitted to the TACT system in one of three ways. Users may directly copy and paste FASTA-format sequence(s), upload a FASTA or DDBJ format flat file, or enter DDBJ/EMBL/GenBank accession number(s). The maximum sequence length analyzed by TACT is 30 000 bp and the maximum number of sequences analyzed by TACT is 10. When a DDBJ/EMBL/GenBank accession number is entered as input data, TACT obtains the DNA databank flat file through the getentry sequence retrieval system of the DNA Data Bank of Japan (DDBJ) and all subsequent analyses are conducted in the appropriate cascades. When an input data file is uploaded in one of the other two formats, TACT conducts all the analyses for the uploaded mRNA sequence data. Although it was originally developed to predict functions of protein-coding transcripts of human cDNA, the sequence data of any species can be analysed by TACT.
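Users preparing a FASTA submission may find it convenient to check the stated limits locally beforehand. The following Python sketch is not part of TACT; it is a minimal pre-submission check that assumes the limits apply to each pasted or uploaded FASTA text (at most 10 sequences, each at most 30 000 bp). The function names and messages are hypothetical.

```python
MAX_SEQUENCES = 10      # maximum number of sequences stated in the text
MAX_LENGTH_BP = 30_000  # maximum sequence length stated in the text


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first FASTA header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text):
    """Return a list of problems; an empty list means the limits are met."""
    records = parse_fasta(text)
    problems = []
    if not records:
        problems.append("no sequences found")
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences exceed the limit of {MAX_SEQUENCES}")
    for name, seq in records:
        if len(seq) > MAX_LENGTH_BP:
            problems.append(f"{name}: {len(seq)} bp exceeds the limit of {MAX_LENGTH_BP} bp")
    return problems


if __name__ == "__main__":
    print(check_submission(">example\nATGGCTGCTTAA\n"))
```

Returning a list of problems rather than raising on the first one lets a user fix every issue in a batch before resubmitting.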

TACT WEB-BASED INTERFACES

TACT web-based interfaces consist of the TACT top page, the TACT data submission and the TACT annotation view. In the TACT top page, general information about TACT and data input facilities are provided (Figure 2A). In the TACT data submission, the analysis options can be selected (Figure 2B). It is recommended that users select all the options to achieve the highest accuracy. After selecting the options and submitting the sequences for annotation, users can check the input sequence data and the progress of the analysis. If an e-mail address is entered, the URL for the annotation results is reported to the user by e-mail when all the analyses have been completed. The TACT annotation view shows all the annotation of the input sequence. It consists of six sections: auto-functional annotation information section; mapping information section; cDNA information section; predicted ORF section; predicted motif information section and predicted subcellular localization section. The data include the functional description as protein-coding transcripts, similarity category, predicted ORF, translation, predicted functional motifs by InterProScan, GO and input nucleotide sequence in DNA databank (Figure 2C). If the DNA accession number of the input data is already recorded in H-InvDB, then the location in the human genome, and corresponding H-Invitational transcripts (HIT) and H-Invitational cluster (HIX) IDs with hyperlinks to H-InvDB will also be provided. As shown in the sample Annotation view in Figure 2, this view also links to many external public databases including DDBJ/EMBL/GenBank, RefSeq, UniProtKB/Swiss-Prot, InterPro and GO.

Figure 2

TACT web-based interfaces. Sample views of TACT top (A), data submission (B) and annotation view (C) for HIT000017619 (AK092752) are shown. The annotation view (C) shows detailed annotation information and has links to external databases as indicated. The blue arrows indicate the flows of views during the TACT analysis and black lines indicate the links to appropriate reference data in H-InvDB or external public databases.

CONCLUSION

In this study, we showed that by integrating three sequence analysis programs, the prediction of ORFs and protein functions becomes highly accurate and closer to the results of human curation. Because of its accuracy and usefulness, TACT will be an indispensable tool to predict the function of transcripts. The TACT system can always provide the latest reference information because protein and motif databases are updated regularly and frequently. This unique system for conducting functional annotation may have a wide range of application in transcriptome studies of human and other species. TACT is freely available at http://www.jbirc.aist.go.jp/tact/.

14 in total

1. Functional annotation of a full-length Arabidopsis cDNA collection.

Authors: Motoaki Seki; Mari Narusaka; Asako Kamiya; Junko Ishida; Masakazu Satou; Tetsuya Sakurai; Maiko Nakajima; Akiko Enju; Kenji Akiyama; Youko Oono; Masami Muramatsu; Yoshihide Hayashizaki; Jun Kawai; Piero Carninci; Masayoshi Itoh; Yoshiyuki Ishii; Takahiro Arakawa; Kazuhiro Shibata; Akira Shinagawa; Kazuo Shinozaki
Journal: Science Date: 2002-03-21 Impact factor: 47.728

2. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs.

Authors: Y Okazaki; M Furuno; T Kasukawa; J Adachi; H Bono; S Kondo; I Nikaido; N Osato; R Saito; H Suzuki; I Yamanaka; H Kiyosawa; K Yagi; Y Tomaru; Y Hasegawa; A Nogami; C Schönbach; T Gojobori; R Baldarelli; D P Hill; C Bult; D A Hume; J Quackenbush; L M Schriml; A Kanapin; H Matsuda; S Batalov; K W Beisel; J A Blake; D Bradt; V Brusic; C Chothia; L E Corbani; S Cousins; E Dalla; T A Dragani; C F Fletcher; A Forrest; K S Frazer; T Gaasterland; M Gariboldi; C Gissi; A Godzik; J Gough; S Grimmond; S Gustincich; N Hirokawa; I J Jackson; E D Jarvis; A Kanai; H Kawaji; Y Kawasawa; R M Kedzierski; B L King; A Konagaya; I V Kurochkin; Y Lee; B Lenhard; P A Lyons; D R Maglott; L Maltais; L Marchionni; L McKenzie; H Miki; T Nagashima; K Numata; T Okido; W J Pavan; G Pertea; G Pesole; N Petrovsky; R Pillai; J U Pontius; D Qi; S Ramachandran; T Ravasi; J C Reed; D J Reed; J Reid; B Z Ring; M Ringwald; A Sandelin; C Schneider; C A M Semple; M Setou; K Shimada; R Sultana; Y Takenaka; M S Taylor; R D Teasdale; M Tomita; R Verardo; L Wagner; C Wahlestedt; Y Wang; Y Watanabe; C Wells; L G Wilming; A Wynshaw-Boris; M Yanagisawa; I Yang; L Yang; Z Yuan; M Zavolan; Y Zhu; A Zimmer; P Carninci; N Hayatsu; T Hirozane-Kishikawa; H Konno; M Nakamura; N Sakazume; K Sato; T Shiraki; K Waki; J Kawai; K Aizawa; T Arakawa; S Fukuda; A Hara; W Hashizume; K Imotani; Y Ishii; M Itoh; I Kagawa; A Miyazaki; K Sakai; D Sasaki; K Shibata; A Shinagawa; A Yasunishi; M Yoshino; R Waterston; E S Lander; J Rogers; E Birney; Y Hayashizaki
Journal: Nature Date: 2002-12-05 Impact factor: 49.962

3. Gene expression profiling in the human hypothalamus-pituitary-adrenal axis and full-length cDNA cloning.

Authors: R M Hu; Z G Han; H D Song; Y D Peng; Q H Huang; S X Ren; Y J Gu; C H Huang; Y B Li; C L Jiang; G Fu; Q H Zhang; B W Gu; M Dai; Y F Mao; G F Gao; R Rong; M Ye; J Zhou; S H Xu; J Gu; J X Shi; W R Jin; C K Zhang; T M Wu; G Y Huang; Z Chen; M D Chen; J L Chen
Journal: Proc Natl Acad Sci U S A Date: 2000-08-15 Impact factor: 11.205

4. Complete sequencing and characterization of 21,243 full-length human cDNAs.

Authors: Toshio Ota; Yutaka Suzuki; Tetsuo Nishikawa; Tetsuji Otsuki; Tomoyasu Sugiyama; Ryotaro Irie; Ai Wakamatsu; Koji Hayashi; Hiroyuki Sato; Keiichi Nagai; Kouichi Kimura; Hiroshi Makita; Mitsuo Sekine; Masaya Obayashi; Tatsunari Nishi; Toshikazu Shibahara; Toshihiro Tanaka; Shizuko Ishii; Jun-ichi Yamamoto; Kaoru Saito; Yuri Kawai; Yuko Isono; Yoshitaka Nakamura; Kenji Nagahari; Katsuhiko Murakami; Tomohiro Yasuda; Takao Iwayanagi; Masako Wagatsuma; Akiko Shiratori; Hiroaki Sudo; Takehiko Hosoiri; Yoshiko Kaku; Hiroyo Kodaira; Hiroshi Kondo; Masanori Sugawara; Makiko Takahashi; Katsuhiro Kanda; Takahide Yokoi; Takako Furuya; Emiko Kikkawa; Yuhi Omura; Kumi Abe; Kumiko Kamihara; Naoko Katsuta; Kazuomi Sato; Machiko Tanikawa; Makoto Yamazaki; Ken Ninomiya; Tadashi Ishibashi; Hiromichi Yamashita; Katsuji Murakawa; Kiyoshi Fujimori; Hiroyuki Tanai; Manabu Kimata; Motoji Watanabe; Susumu Hiraoka; Yoshiyuki Chiba; Shinichi Ishida; Yukio Ono; Sumiyo Takiguchi; Susumu Watanabe; Makoto Yosida; Tomoko Hotuta; Junko Kusano; Keiichi Kanehori; Asako Takahashi-Fujii; Hiroto Hara; Tomo-o Tanase; Yoshiko Nomura; Sakae Togiya; Fukuyo Komai; Reiko Hara; Kazuha Takeuchi; Miho Arita; Nobuyuki Imose; Kaoru Musashino; Hisatsugu Yuuki; Atsushi Oshima; Naokazu Sasaki; Satoshi Aotsuka; Yoko Yoshikawa; Hiroshi Matsunawa; Tatsuo Ichihara; Namiko Shiohata; Sanae Sano; Shogo Moriya; Hiroko Momiyama; Noriko Satoh; Sachiko Takami; Yuko Terashima; Osamu Suzuki; Satoshi Nakagawa; Akihiro Senoh; Hiroshi Mizoguchi; Yoshihiro Goto; Fumio Shimizu; Hirokazu Wakebe; Haretsugu Hishigaki; Takeshi Watanabe; Akio Sugiyama; Makoto Takemoto; Bunsei Kawakami; Masaaki Yamazaki; Koji Watanabe; Ayako Kumagai; Shoko Itakura; Yasuhito Fukuzumi; Yoshifumi Fujimori; Megumi Komiyama; Hiroyuki Tashiro; Akira Tanigami; Tsutomu Fujiwara; Toshihide Ono; Katsue Yamada; Yuka Fujii; Kouichi Ozaki; Maasa Hirao; Yoshihiro Ohmori; Ayako Kawabata; Takeshi Hikiji; Naoko Kobatake; Hiromi Inagaki; Yasuko Ikema; Sachiko 
Okamoto; Rie Okitani; Takuma Kawakami; Saori Noguchi; Tomoko Itoh; Keiko Shigeta; Tadashi Senba; Kyoka Matsumura; Yoshie Nakajima; Takae Mizuno; Misato Morinaga; Masahide Sasaki; Takushi Togashi; Masaaki Oyama; Hiroko Hata; Manabu Watanabe; Takami Komatsu; Junko Mizushima-Sugano; Tadashi Satoh; Yuko Shirai; Yukiko Takahashi; Kiyomi Nakagawa; Koji Okumura; Takahiro Nagase; Nobuo Nomura; Hisashi Kikuchi; Yasuhiko Masuho; Riu Yamashita; Kenta Nakai; Tetsushi Yada; Yusuke Nakamura; Osamu Ohara; Takao Isogai; Sumio Sugano
Journal: Nat Genet Date: 2003-12-21 Impact factor: 38.330

5. Geneticists lay foundations for human transcriptome database.

Authors: David Cyranoski
Journal: Nature Date: 2002-09-05 Impact factor: 49.962

6. Basic local alignment search tool.

Authors: S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal: J Mol Biol Date: 1990-10-05 Impact factor: 5.469

7. Investigation of protein functions through data-mining on integrated human transcriptome database, H-Invitational database (H-InvDB).

Authors: Chisato Yamasaki; Kanako O Koyanagi; Yasuyuki Fujii; Takeshi Itoh; Roberto Barrero; Takuro Tamura; Yumi Yamaguchi-Kabata; Motohiko Tanino; Jun-Ichi Takeda; Satoshi Fukuchi; Satoru Miyazaki; Nobuo Nomura; Sumio Sugano; Tadashi Imanishi; Takashi Gojobori
Journal: Gene Date: 2005-09-26 Impact factor: 3.688

8. Toward a catalog of human genes and proteins: sequencing and analysis of 500 novel complete protein coding human cDNAs.

Authors: S Wiemann; B Weil; R Wellenreuther; J Gassenhuber; S Glassl; W Ansorge; M Böcher; H Blöcker; S Bauersachs; H Blum; J Lauber; A Düsterhöft; A Beyer; K Köhrer; N Strack; H W Mewes; B Ottenwälder; B Obermaier; J Tampe; D Heubner; R Wambutt; B Korn; M Klein; A Poustka
Journal: Genome Res Date: 2001-03 Impact factor: 9.043

9. HUNT: launch of a full-length cDNA database from the Helix Research Institute.

Authors: H T Yudate; M Suwa; R Irie; H Matsui; T Nishikawa; Y Nakamura; D Yamaguchi; Z Z Peng; T Yamamoto; K Nagai; K Hayashi; T Otsuki; T Sugiyama; T Ota; Y Suzuki; S Sugano; T Isogai; Y Masuho
Journal: Nucleic Acids Res Date: 2001-01-01 Impact factor: 16.971

10. Integrative annotation of 21,037 human genes validated by full-length cDNA clones.

Authors: Tadashi Imanishi; Takeshi Itoh; Yutaka Suzuki; Claire O'Donovan; Satoshi Fukuchi; Kanako O Koyanagi; Roberto A Barrero; Takuro Tamura; Yumi Yamaguchi-Kabata; Motohiko Tanino; Kei Yura; Satoru Miyazaki; Kazuho Ikeo; Keiichi Homma; Arek Kasprzyk; Tetsuo Nishikawa; Mika Hirakawa; Jean Thierry-Mieg; Danielle Thierry-Mieg; Jennifer Ashurst; Libin Jia; Mitsuteru Nakao; Michael A Thomas; Nicola Mulder; Youla Karavidopoulou; Lihua Jin; Sangsoo Kim; Tomohiro Yasuda; Boris Lenhard; Eric Eveno; Yoshiyuki Suzuki; Chisato Yamasaki; Jun-ichi Takeda; Craig Gough; Phillip Hilton; Yasuyuki Fujii; Hiroaki Sakai; Susumu Tanaka; Clara Amid; Matthew Bellgard; Maria de Fatima Bonaldo; Hidemasa Bono; Susan K Bromberg; Anthony J Brookes; Elspeth Bruford; Piero Carninci; Claude Chelala; Christine Couillault; Sandro J de Souza; Marie-Anne Debily; Marie-Dominique Devignes; Inna Dubchak; Toshinori Endo; Anne Estreicher; Eduardo Eyras; Kaoru Fukami-Kobayashi; Gopal R Gopinath; Esther Graudens; Yoonsoo Hahn; Michael Han; Ze-Guang Han; Kousuke Hanada; Hideki Hanaoka; Erimi Harada; Katsuyuki Hashimoto; Ursula Hinz; Momoki Hirai; Teruyoshi Hishiki; Ian Hopkinson; Sandrine Imbeaud; Hidetoshi Inoko; Alexander Kanapin; Yayoi Kaneko; Takeya Kasukawa; Janet Kelso; Paul Kersey; Reiko Kikuno; Kouichi Kimura; Bernhard Korn; Vladimir Kuryshev; Izabela Makalowska; Takashi Makino; Shuhei Mano; Regine Mariage-Samson; Jun Mashima; Hideo Matsuda; Hans-Werner Mewes; Shinsei Minoshima; Keiichi Nagai; Hideki Nagasaki; Naoki Nagata; Rajni Nigam; Osamu Ogasawara; Osamu Ohara; Masafumi Ohtsubo; Norihiro Okada; Toshihisa Okido; Satoshi Oota; Motonori Ota; Toshio Ota; Tetsuji Otsuki; Dominique Piatier-Tonneau; Annemarie Poustka; Shuang-Xi Ren; Naruya Saitou; Katsunaga Sakai; Shigetaka Sakamoto; Ryuichi Sakate; Ingo Schupp; Florence Servant; Stephen Sherry; Rie Shiba; Nobuyoshi Shimizu; Mary Shimoyama; Andrew J Simpson; Bento Soares; Charles Steward; Makiko Suwa; Mami Suzuki; Aiko Takahashi; Gen Tamiya; Hiroshi 
Tanaka; Todd Taylor; Joseph D Terwilliger; Per Unneberg; Vamsi Veeramachaneni; Shinya Watanabe; Laurens Wilming; Norikazu Yasuda; Hyang-Sook Yoo; Marvin Stodolsky; Wojciech Makalowski; Mitiko Go; Kenta Nakai; Toshihisa Takagi; Minoru Kanehisa; Yoshiyuki Sakaki; John Quackenbush; Yasushi Okazaki; Yoshihide Hayashizaki; Winston Hide; Ranajit Chakraborty; Ken Nishikawa; Hideaki Sugawara; Yoshio Tateno; Zhu Chen; Michio Oishi; Peter Tonellato; Rolf Apweiler; Kousaku Okubo; Lukas Wagner; Stefan Wiemann; Robert L Strausberg; Takao Isogai; Charles Auffray; Nobuo Nomura; Takashi Gojobori; Sumio Sugano
Journal: PLoS Biol Date: 2004-04-20 Impact factor: 8.029
