
Evola: Ortholog database of all human genes in H-InvDB with manual curation of phylogenetic trees.

Akihiro Matsuya¹, Ryuichi Sakate, Yoshihiro Kawahara, Kanako O Koyanagi, Yoshiharu Sato, Yasuyuki Fujii, Chisato Yamasaki, Takuya Habara, Hajime Nakaoka, Fusano Todokoro, Kaori Yamaguchi, Toshinori Endo, Satoshi Oota, Wojciech Makalowski, Kazuho Ikeo, Yoshiyuki Suzuki, Kousuke Hanada, Katsuyuki Hashimoto, Momoki Hirai, Hisakazu Iwama, Naruya Saitou, Aiko T Hiraki, Lihua Jin, Yayoi Kaneko, Masako Kanno, Katsuhiko Murakami, Akiko Ogura Noda, Naomi Saichi, Ryoko Sanbonmatsu, Mami Suzuki, Jun-ichi Takeda, Masayuki Tanaka, Takashi Gojobori, Tadashi Imanishi, Takeshi Itoh.

Abstract

Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Currently, with the rapid growth of transcriptome data of various species, more reliable orthology information is a prerequisite for further studies. However, detection of orthologs could be erroneous if pairwise distance-based methods, such as reciprocal BLAST searches, are utilized. Thus, as a sub-database of H-InvDB, an integrated database of annotated human genes (http://h-invitational.jp/), we constructed a fully curated database of evolutionary features of human genes, called 'Evola'. In the process of ortholog detection, computational analysis based on conserved genome synteny and transcript sequence similarity was followed by manual curation by researchers examining phylogenetic trees. In total, 18 968 human genes have orthologs, either computationally detected or manually curated, among 11 vertebrates (chimpanzee, mouse, cow, chicken, zebrafish, etc.). Evola provides amino acid sequence alignments and phylogenetic trees of orthologs and homologs. In 'd(N)/d(S) view', natural selection on genes can be analyzed between human and other species. In 'Locus maps', all transcript variants and their exon/intron structures can be compared among orthologous gene loci. We expect Evola to serve as a comprehensive and reliable database to be utilized in comparative analyses for obtaining new knowledge about human genes. Evola is available at http://www.h-invitational.jp/evola/.

Substances:
RNA, Messenger

Year: 2007 PMID: 17982176 PMCID: PMC2238928 DOI: 10.1093/nar/gkm878

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

A large number of genome and transcript sequences accumulated in the last decade give us an opportunity for large-scale comparative analyses. In particular, detection of orthologs, groups of genes in different species that evolved by speciation, accelerates functional and evolutionary studies. Despite past efforts to develop bioinformatics methods for analyzing a large number of sequences, it is still a challenge to comprehensively identify orthologs between species. A number of automated pairwise distance-based methods for ortholog detection have been proposed, as represented by the reciprocal best BLAST hits (RBH) method (1) and the reciprocal smallest distance (RSD) method (2). However, as genes might have frequently undergone duplications and losses in the evolutionary lineages leading to human (3), pairwise distance-based methods might lead to erroneous inferences of phylogenetic relationships and thus of orthologs. Phylogenetic tree-based detection is therefore likely to be the most plausible way to obtain more reliable orthologs. Here, we developed the database ‘Evola’, a sub-database complementary to the H-Invitational database (H-InvDB), to provide orthology information for the originally annotated human genes in H-InvDB. A distinctive feature of Evola is its ortholog detection, in which genome synteny-based computational analysis was followed by manual curation of molecular phylogenetic trees. Evola differs in this way from other ortholog databases such as Inparanoid (4), Ensembl-Compara (5), Homologene (6), HOGENOM (7) and TreeFam (8). These databases are based on BLAST hits (Inparanoid), BLAST hits and synteny (Ensembl-Compara and Homologene) and phylogenetic trees (HOGENOM and TreeFam). The concept behind Evola is that a genomic region (gene locus) is the unit in which genes are duplicated or lost. In collaboration with H-InvDB, Evola enables users to compare gene structures, transcript variants and upstream/downstream regions of the genome among species.
H-InvDB is an integrated database of annotated human genes providing annotation of human full-length enriched cDNAs (9,10,11). At the meetings of the Human Full-Length cDNA Annotation Invitational held in Japan (2002 and 2003), Evola started with H-InvDB to annotate evolutionary features of the human genes. With several updates afterwards and a subsequent All Human Genes Evolutionary Annotation (AHG-EV) meeting in 2006, the current strategy of evolutionary annotation (computational analysis and manual curation) in Evola has been established. Evola currently includes orthology information for human and 11 other vertebrates: chimpanzee, macaque, mouse, rat, dog, cow, opossum, chicken, zebrafish, Tetraodon and Fugu. Several visualization tools are incorporated into the database, including a sequence alignment viewer, a natural selection plot and a graphical representation of orthologous gene loci among different species. Evola is now one of the databases listed in the Comparison of Orthology Predictions project of the HUGO Gene Nomenclature Committee (HGNC, http://www.genenames.org/).

ORTHOLOG DETECTION

Computational analysis: Ortholog detection based on conserved genomic synteny and pairwise distance

Species for ortholog detection were selected with consideration of completeness of their genome assemblies (chromosome level), abundance of transcript sequences (∼20 000) and importance in biology (intensively studied or a representative of a phylogenetic clade). Whole genome sequence assemblies of human (hg18), chimpanzee (panTro2), macaque (rheMac2), mouse (mm8), rat (rn4), dog (canFam2), cow (bosTau2), opossum (monDom4), chicken (galGal3), zebrafish (danRer4), Tetraodon (tetNig1) and Fugu (fr1) were downloaded from UCSC (http://genome.ucsc.edu/). Conserved syntenic regions were detected by a modified pairwise genome alignment method (12) using BLASTZ (13) with the options of C = 2, T = 4, Y = 3400 between human and other primates (between more similar genome sequences), and C = 2 between human and non-primate vertebrates (between less similar genome sequences). For human transcripts, H-InvDB representative transcripts (HITs) were used. Other vertebrates’ transcripts (mRNAs) were downloaded from DDBJ (http://www.ddbj.nig.ac.jp/) release 66, Ensembl (http://www.ensembl.org/) release 38 and RefSeq (http://www.ncbi.nlm.nih.gov/RefSeq/) release 17, and their genomic locations (one location per transcript) were detected on cognate genomes by a hybrid method using BLAT (14), BLAST (15) and est2genome (16), in the same way as the genomic locations of human transcripts were detected in H-InvDB. Representative transcripts (one transcript per gene locus) were determined in consideration of percent identity and coverage to the genome, number of exons, etc. of all transcripts in each locus (9,10,11). Thus, in Evola, representative transcripts were defined as genes. Lengths of overlapping exons of each gene pair between human and other species were calculated in the genome alignment. The gene pair with the maximum overlap length was selected as the best assignment (no minimum length was defined). Every gene in a species was assigned to a gene in the other species.
If two human genes were assigned to one mouse gene, this was defined as a two-to-one ortholog. As a result, Evola contains not only one-to-one orthologs but also many-to-many orthologs. For all the assignment pairs, coding sequences (CDSs) and amino acid (a.a.) sequences of other species were predicted by FASTY (17), by comparison with the amino acid sequences of the corresponding human genes. Finally, if the length of the alignable region between the human gene and its ortholog candidate in another species was ≥80 a.a., the pair was defined as computationally detected orthologs.
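The assignment step described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline used for Evola: it assumes that overlapping-exon lengths per gene pair have already been computed from the genome alignment, and it interprets "every gene was assigned to a gene in the other species" as taking the best partner from both directions. The function names and the gene identifiers in the usage note are hypothetical.

```python
MIN_ALIGNABLE_AA = 80  # threshold stated in the text (>= 80 a.a.)


def best_assignments(overlaps):
    """Pick, for every gene, the partner with the longest exon overlap.

    overlaps: iterable of (human_gene, other_gene, overlap_length).
    Returns the set of (human_gene, other_gene) pairs that are the best
    assignment for at least one of the two genes (an assumption about how
    the two directions are combined).
    """
    best_for_human = {}
    best_for_other = {}
    for human, other, length in overlaps:
        if length <= 0:
            continue  # no overlapping exons, so no candidate pair
        if human not in best_for_human or length > best_for_human[human][1]:
            best_for_human[human] = (other, length)
        if other not in best_for_other or length > best_for_other[other][1]:
            best_for_other[other] = (human, length)
    pairs = {(h, o) for h, (o, _) in best_for_human.items()}
    pairs |= {(h, o) for o, (h, _) in best_for_other.items()}
    return pairs


def relationship(pairs, pair):
    """Label a pair as 'human-count-to-other-count', e.g. '2-to-1'."""
    human, other = pair
    n_human = sum(1 for h, o in pairs if o == other)
    n_other = sum(1 for h, o in pairs if h == human)
    return f"{n_human}-to-{n_other}"


def filter_by_alignable_length(pairs, alignable_aa):
    """Keep pairs whose alignable region is at least MIN_ALIGNABLE_AA long.

    alignable_aa: dict mapping (human_gene, other_gene) to a length in a.a.
    """
    return {p for p in pairs if alignable_aa.get(p, 0) >= MIN_ALIGNABLE_AA}
```

With hypothetical input such as `[("HIX_A", "Mm_1", 900), ("HIX_B", "Mm_1", 400), ("HIX_C", "Mm_2", 300)]`, both human genes are assigned to `Mm_1`, which `relationship` reports as a two-to-one case, while `HIX_C` and `Mm_2` form a one-to-one pair.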

Manual curation: Examination of phylogenetic trees by experts

Homologs of human genes (amino acid sequences) were obtained from UniProt (http://www.uniprot.org/) and human RefSeq (NP) by FASTY similarity searches with an E-value threshold of <1e−5. For each human gene, a sequence set consisting of both the computationally detected orthologs and the homologs was prepared. For these sequence sets, phylogenetic trees were constructed by the neighbor-joining (NJ) method (18). In detail, multiple amino acid sequence alignments and phylogenetic trees were constructed by ClustalW (19) with the options of bootstrap = 1000, seed = 1, kimura, tossgaps, bootlabels = node. Phylogenetic trees were examined by experts in the field of molecular evolution, who attended the evolutionary annotation meetings described in the introduction. The trees were drawn by NJplot (http://pbil.univ-lyon1.fr/software/njplot.html) and the default rooting was used. Where necessary, the experts decided whether to discard or re-root a tree. All the ortholog pairs of human and other species detected by the computational analysis were examined (Figure 1). The primary principles checked during manual curation in Evola were as follows. [1] The topologies of the gene tree and the species tree are consistent. As a gene tree, the minimum sub-clade including the pair (a part of the tree) was examined. As a reference species tree, a phylogenetic tree indicating a trifurcation among primates, rodents and Laurasiatherian (dog, cow, etc.) species (20) was used, because the phylogenetic relationship among them has been controversial (21). In fact, we found ((human–mouse)–dog) clades for some genes and ((human–dog)–mouse) clades for other genes. [2] The outgroup includes either two or more species that are phylogenetically distant from all the species in the sub-clade, or human and other species. In the latter case, human duplicate genes might exist.
[3] Available bootstrap values of the corresponding three branches (one between the sub-clade and outgroup, and its two descendants) are all ≥900. The gene pairs consistent with all the principles were defined as ‘manually curated orthologs’; otherwise, their annotation status remained ‘computationally detected orthologs’.
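The decision rule implied by the three principles can be written down as a small predicate. This is only a sketch: in Evola, principles [1] and [2] were judged by human experts looking at the trees, so they appear here as boolean inputs rather than as computed values. The class and function names are hypothetical, and treating a tree with no available bootstrap values as passing principle [3] is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

BOOTSTRAP_MIN = 900  # out of 1000 bootstrap replicates, as stated in the text


@dataclass
class SubcladeCheck:
    topology_consistent: bool            # principle [1], judged by a curator
    outgroup_ok: bool                    # principle [2], judged by a curator
    bootstraps: Sequence[Optional[int]]  # up to three branches; None = unavailable


def annotation_status(check: SubcladeCheck) -> str:
    """Return the annotation status of a gene pair under principles [1]-[3]."""
    # Only available bootstrap values are tested (principle [3]).
    available = [b for b in check.bootstraps if b is not None]
    well_supported = all(b >= BOOTSTRAP_MIN for b in available)
    if check.topology_consistent and check.outgroup_ok and well_supported:
        return "manually curated ortholog"
    return "computationally detected ortholog"
```

A pair that fails any one check keeps its computational status rather than being discarded, which mirrors the two-tier annotation described in the text.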

Figure 1.

An example of a manually curated gene pair from H. sapiens (red underline) and Macaca sp. (blue underline). In this case, the conditions on phylogenetic topology, outgroup species (light gray background) and bootstrap values (two circles) are all satisfied (refer to the text). Thus, the pair was defined as a manually curated ortholog.

DATABASE CONTENTS

Evola contains two ortholog datasets: (1) a more comprehensive set of orthologs (computational analysis); and (2) a more reliable set of orthologs (computational analysis supported by manual curation). In the current Evola (release 4.1), orthology information for 18 968 human genes is available among 11 vertebrates: chimpanzee, macaque, mouse, rat, dog, cow, opossum, chicken, zebrafish, Tetraodon and Fugu (Table 1). Manually curated orthologs accounted for 25.4% of all computationally detected ortholog pairs (24 122/94 935) (release 4.1, 2007).

Table 1.

Number of orthologs provided in Evola (release 4.1, June 2007)

Species	Genes	Human genes
Homo sapiens (Human)	18 968	–
Pan troglodytes (Chimpanzee)	16 368	15 615
Macaca sp. (Macaque)^a	12 037	12 352
Mus musculus (Mouse)	15 570	14 574
Rattus norvegicus (Rat)	15 632	14 302
Canis familiaris (Dog)	14 730	13 916
Bos taurus (Cow)	9375	10 181
Monodelphis domestica (Opossum)	13 201	13 588
Gallus gallus (Chicken)	9266	10 738
Danio rerio (Zebrafish)	12 334	10 468
Tetraodon nigroviridis (Tetraodon)	11 505	9820
Takifugu rubripes (Fugu)	9738	9459

Numbers of genes of both human and other species are listed. Owing to lineage-specific duplication or loss, the numbers are usually different (for example, 15 570 mouse genes are orthologous to 14 574 human genes). In total, 18 968 human genes have at least one ortholog among the other 11 species.

^a Macaca mulatta, Macaca fascicularis, Macaca fuscata, etc. are included.

Evola is a sub-database of H-InvDB (9,10,11), and its orthology information forms the ‘Evolutionary annotation’ part of the comprehensive human gene annotations in H-InvDB. Thus, orthology information can be used in close conjunction with other annotations in H-InvDB. For example, 2090 human genes with orthology information belonged to the H-Inv protein similarity categories of ‘hypothetical proteins’ (similarity categories IV–VI). Molecular functions of these hypothetical proteins can be analyzed using model species. Moreover, cross references between Evola and other annotations in H-InvDB (protein–protein interaction (PPI), expression, polymorphism, disease, etc.) can produce valuable information contributing to the comprehensive understanding of the human genes.
We aimed to develop user-friendly interfaces that provide easy access to a variety of orthology information in Evola. Users can search for orthologs on the top page of Evola as well as in the search systems of H-InvDB [simple search, advanced search and navigation system (Navi)]. Users can download data for each human gene on the main page as well as all the data of Evola on the download page. On the main page of Evola (Figure 2), the following information for a human gene is available in the left frame: gene name, ortholog list with annotation status, download of sequences, alignments and phylogenetic trees, Gene ontology (22) and InterPro (23). 
In addition to the set of original ClustalW alignments, another set of alignments, including properly aligned sequences only (24), was also constructed and provided. In the latter sets, sequences with distinctively low identity to other sequences in an alignment were excluded. Based on both alignment sets, phylogenetic trees were constructed by the neighbor-joining method (18) and the NJML+ method (25).

Figure 2.

Evola main page. This page is divided into left and right frames. In the left frame, tables of orthologs, download data, Gene ontology, and InterPro are listed. Three green buttons are links to show ‘Alignment’ (A), ‘d(N)/d(S) view’ (B) and ‘Locus maps’ (C) in the right frame.

Alignment: Multiple alignments of orthologs and homologs (Figure 2A)

Amino acid sequence alignments of orthologs and homologs are displayed. Users can switch from ‘Alignment of Orthologs’ (default) to ‘Alignment of Orthologs and Homologs’, or vice versa. Each amino acid residue is color coded as defined in ClustalX (19). Accession numbers and species names of orthologs (human and other species) are colored in their species colors defined in Evola (human in red, mouse in gray, etc.). Accession numbers of homologs are linked to the original data sources of UniProt or RefSeq. While species are labeled by their scientific names (Homo, Mus, etc.), users can activate a popup window giving a species common name by placing the mouse cursor over homolog accession numbers (for example, ‘Q5R508_Pongo’). InterPro data in the left frame include positional information on a human gene, and they can be utilized to detect conserved domains in the proteins.

d(N)/d(S) view: Window analysis detecting regions under positive or negative selection (Figure 2B)

Users can select one or more species for which to show the plots in the graph. In the lower frame under the graph, the pairwise nucleotide sequence alignment of CDSs is shown. The sequence positions (a.a. or codon) appearing in the graph and alignment are those of human genes. The nonsynonymous to synonymous substitution rate ratio (d(N)/d(S)) is a commonly used measure of natural selection. In order to visualize positively and negatively selected regions, sliding window analysis was conducted (a 20-codon window with 1-codon stepping; the result for the first window appears as a plot at the 11th codon of the human gene). The statistical significance (P-value) of the difference between the ratio of the number of nonsynonymous substitutions (n) to the number of synonymous substitutions (s), n/s, and the ratio of the number of nonsynonymous sites (N) to the number of synonymous sites (S), N/S, was calculated by Fisher's exact test. d(N) and d(S) values were estimated by the modified Nei–Gojobori method (26,27). If d(N)/d(S) > 1, the score (= 1 − P-value) was plotted above the zero line (neutral), and if d(N)/d(S) < 1, the score [= −(1 − P-value)] was plotted below the zero line. The regions plotted above the red line indicate that the sites might be under positive selection (d(N)/d(S) > 1 and P < 0.01). Conversely, the regions plotted below the blue line indicate that the sites might be under negative (purifying) selection (d(N)/d(S) < 1 and P < 0.01).
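The window scoring described above can be sketched as follows. This is an illustration under several assumptions, not the code behind Evola: per-codon counts of substitutions (n, s) and sites (N, S) are taken as given rather than estimated by the modified Nei–Gojobori method, window totals are rounded to integers for the exact test, the 2×2 table is laid out as changed versus unchanged sites for each substitution class, and the sign of the score is decided by comparing n/N with s/S, which is equivalent to comparing n/s with N/S when s and S are positive. All function names are hypothetical.

```python
from math import comb

WINDOW = 20  # codons per window, as in the text
STEP = 1     # codons per step, as in the text


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def window_score(n, s, N, S):
    """Signed score: 1 - P if nonsynonymous changes are in excess, else -(1 - P)."""
    N, S = max(N, n), max(S, s)  # guard against rounding making sites < changes
    p = fisher_exact_two_sided(n, N - n, s, S - s)
    if n * S > s * N:      # n/N > s/S
        return 1 - p
    if n * S < s * N:      # n/N < s/S
        return -(1 - p)
    return 0.0


def sliding_scores(per_codon):
    """per_codon: list of (n, s, N, S) per codon; returns [(position, score)].

    The position is 1-based, so the first window is reported at codon 11.
    """
    out = []
    for start in range(0, len(per_codon) - WINDOW + 1, STEP):
        window = per_codon[start:start + WINDOW]
        n, s, N, S = (round(sum(col)) for col in zip(*window))
        out.append((start + WINDOW // 2 + 1, window_score(n, s, N, S)))
    return out
```

A window would then be flagged as a candidate for positive selection when its score exceeds 0.99 (P < 0.01 with an excess of nonsynonymous changes) and for purifying selection when it falls below −0.99.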

Locus maps: Comparative maps of orthologous gene loci (Figure 2C)

Orthologs were detected for representative transcripts (one transcript per gene locus) in Evola. However, there could be transcript variants in gene loci that have different exon–intron structures and thus produce different protein isoforms. Thus, information on other transcripts besides the representative transcript among orthologous gene loci is shown in Locus maps. In the figures, the exon/intron structure, coding sequence (CDS) and untranslated regions (UTR) of each transcript are visualized. The H-Inv cluster ID (HIX, an identifier of a gene locus), gene symbol, genomic location and a link to ‘G-integra’, an integrated genome browser of H-InvDB, are available. The flag icon denotes the representative transcript. The blue diamond icon denotes the Representative Alternative Splicing Variant (RASV), which is the representative of each group of transcripts sharing the same alternative splicing pattern (28). Representative transcripts are also RASVs, and blue diamonds do not appear if there is only one splicing isoform. In the tables, the H-Inv transcript ID (HIT) and original accession numbers (DDBJ/EMBL/GenBank, Ensembl and RefSeq) of the representative transcript and other transcripts are listed.

FUTURE DIRECTIONS

As a matter of update policy, orthology information in Evola is updated whenever the H-InvDB annotation is updated. One major update and three minor updates per year are scheduled. At the next major update in December 2007, a new duplicate gene family view is planned to be integrated into Evola. Human duplicate gene family data were originally constructed based on both amino acid sequence similarity (29) and orthology information. In the current Evola (release 4.1), parts of the human duplicate gene annotation have already been implemented. The human duplicate genes are included in the alignments and phylogenetic trees of orthologs and homologs. Finally, we expect Evola to serve as a new database for evolutionary annotation of human genes. We sincerely welcome any requests and feedback from users.

28 in total

1. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors: M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal: Nat Genet Date: 2000-05 Impact factor: 38.330

2. Flexible sequence similarity searching with the FASTA3 program package.

Authors: W R Pearson
Journal: Methods Mol Biol Date: 2000

3. Investigation of protein functions through data-mining on integrated human transcriptome database, H-Invitational database (H-InvDB).

Authors: Chisato Yamasaki; Kanako O Koyanagi; Yasuyuki Fujii; Takeshi Itoh; Roberto Barrero; Takuro Tamura; Yumi Yamaguchi-Kabata; Motohiko Tanino; Jun-Ichi Takeda; Satoshi Fukuchi; Satoru Miyazaki; Nobuo Nomura; Sumio Sugano; Tadashi Imanishi; Takashi Gojobori
Journal: Gene Date: 2005-09-26 Impact factor: 3.688

4. Rates of genome evolution and branching order from whole genome analysis.

Authors: Gavin A Huttley; Matthew J Wakefield; Simon Easteal
Journal: Mol Biol Evol Date: 2007-05-09 Impact factor: 16.240

5. Ensembl 2006.

Authors: E Birney; D Andrews; M Caccamo; Y Chen; L Clarke; G Coates; T Cox; F Cunningham; V Curwen; T Cutts; T Down; R Durbin; X M Fernandez-Suarez; P Flicek; S Gräf; M Hammond; J Herrero; K Howe; V Iyer; K Jekosch; A Kähäri; A Kasprzyk; D Keefe; F Kokocinski; E Kulesha; D London; I Longden; C Melsopp; P Meidl; B Overduin; A Parker; G Proctor; A Prlic; M Rae; D Rios; S Redmond; M Schuster; I Sealy; S Searle; J Severin; G Slater; D Smedley; J Smith; A Stabenau; J Stalker; S Trevanion; A Ureta-Vidal; J Vogel; S White; C Woodwark; T J P Hubbard
Journal: Nucleic Acids Res Date: 2006-01-01 Impact factor: 16.971

6. TreeFam: a curated database of phylogenetic trees of animal gene families.

Authors: Heng Li; Avril Coghlan; Jue Ruan; Lachlan James Coin; Jean-Karim Hériché; Lara Osmotherly; Ruiqiang Li; Tao Liu; Zhang Zhang; Lars Bolund; Gane Ka-Shu Wong; Weimou Zheng; Paramvir Dehal; Jun Wang; Richard Durbin
Journal: Nucleic Acids Res Date: 2006-01-01 Impact factor: 16.971

7. New developments in the InterPro database.

Authors: Nicola J Mulder; Rolf Apweiler; Teresa K Attwood; Amos Bairoch; Alex Bateman; David Binns; Peer Bork; Virginie Buillard; Lorenzo Cerutti; Richard Copley; Emmanuel Courcelle; Ujjwal Das; Louise Daugherty; Mark Dibley; Robert Finn; Wolfgang Fleischmann; Julian Gough; Daniel Haft; Nicolas Hulo; Sarah Hunter; Daniel Kahn; Alexander Kanapin; Anish Kejariwal; Alberto Labarga; Petra S Langendijk-Genevaux; David Lonsdale; Rodrigo Lopez; Ivica Letunic; Martin Madera; John Maslen; Craig McAnulla; Jennifer McDowall; Jaina Mistry; Alex Mitchell; Anastasia N Nikolskaya; Sandra Orchard; Christine Orengo; Robert Petryszak; Jeremy D Selengut; Christian J A Sigrist; Paul D Thomas; Franck Valentin; Derek Wilson; Cathy H Wu; Corin Yeats
Journal: Nucleic Acids Res Date: 2007-01 Impact factor: 16.971

8. Database resources of the National Center for Biotechnology Information.

Authors: David L Wheeler; Tanya Barrett; Dennis A Benson; Stephen H Bryant; Kathi Canese; Vyacheslav Chetvernin; Deanna M Church; Michael DiCuccio; Ron Edgar; Scott Federhen; Lewis Y Geer; Yuri Kapustin; Oleg Khovayko; David Landsman; David J Lipman; Thomas L Madden; Donna R Maglott; James Ostell; Vadim Miller; Kim D Pruitt; Gregory D Schuler; Edwin Sequeira; Steven T Sherry; Karl Sirotkin; Alexandre Souvorov; Grigory Starchenko; Roman L Tatusov; Tatiana A Tatusova; Lukas Wagner; Eugene Yaschenko
Journal: Nucleic Acids Res Date: 2006-12-14 Impact factor: 16.971

9. Large-scale identification and characterization of alternative splicing variants of human gene transcripts using 56,419 completely sequenced and manually annotated full-length cDNAs.

Authors: Jun-ichi Takeda; Yutaka Suzuki; Mitsuteru Nakao; Roberto A Barrero; Kanako O Koyanagi; Lihua Jin; Chie Motono; Hiroko Hata; Takao Isogai; Keiichi Nagai; Tetsuji Otsuki; Vladimir Kuryshev; Masafumi Shionyu; Kei Yura; Mitiko Go; Jean Thierry-Mieg; Danielle Thierry-Mieg; Stefan Wiemann; Nobuo Nomura; Sumio Sugano; Takashi Gojobori; Tadashi Imanishi
Journal: Nucleic Acids Res Date: 2006-08-12 Impact factor: 16.971

10. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts.

Authors: Chisato Yamasaki; Katsuhiko Murakami; Yasuyuki Fujii; Yoshiharu Sato; Erimi Harada; Jun-ichi Takeda; Takayuki Taniya; Ryuichi Sakate; Shingo Kikugawa; Makoto Shimada; Motohiko Tanino; Kanako O Koyanagi; Roberto A Barrero; Craig Gough; Hong-Woo Chun; Takuya Habara; Hideki Hanaoka; Yosuke Hayakawa; Phillip B Hilton; Yayoi Kaneko; Masako Kanno; Yoshihiro Kawahara; Toshiyuki Kawamura; Akihiro Matsuya; Naoki Nagata; Kensaku Nishikata; Akiko Ogura Noda; Shin Nurimoto; Naomi Saichi; Hiroaki Sakai; Ryoko Sanbonmatsu; Rie Shiba; Mami Suzuki; Kazuhiko Takabayashi; Aiko Takahashi; Takuro Tamura; Masayuki Tanaka; Susumu Tanaka; Fusano Todokoro; Kaori Yamaguchi; Naoyuki Yamamoto; Toshihisa Okido; Jun Mashima; Aki Hashizume; Lihua Jin; Kyung-Bum Lee; Yi-Chueh Lin; Asami Nozaki; Katsunaga Sakai; Masahito Tada; Satoru Miyazaki; Takashi Makino; Hajime Ohyanagi; Naoki Osato; Nobuhiko Tanaka; Yoshiyuki Suzuki; Kazuho Ikeo; Naruya Saitou; Hideaki Sugawara; Claire O'Donovan; Tamara Kulikova; Eleanor Whitfield; Brian Halligan; Mary Shimoyama; Simon Twigger; Kei Yura; Kouichi Kimura; Tomohiro Yasuda; Tetsuo Nishikawa; Yutaka Akiyama; Chie Motono; Yuri Mukai; Hideki Nagasaki; Makiko Suwa; Paul Horton; Reiko Kikuno; Osamu Ohara; Doron Lancet; Eric Eveno; Esther Graudens; Sandrine Imbeaud; Marie Anne Debily; Yoshihide Hayashizaki; Clara Amid; Michael Han; Andreas Osanger; Toshinori Endo; Michael A Thomas; Mika Hirakawa; Wojciech Makalowski; Mitsuteru Nakao; Nam-Soon Kim; Hyang-Sook Yoo; Sandro J De Souza; Maria de Fatima Bonaldo; Yoshihito Niimura; Vladimir Kuryshev; Ingo Schupp; Stefan Wiemann; Matthew Bellgard; Masafumi Shionyu; Libin Jia; Danielle Thierry-Mieg; Jean Thierry-Mieg; Lukas Wagner; Qinghua Zhang; Mitiko Go; Shinsei Minoshima; Masafumi Ohtsubo; Kousuke Hanada; Peter Tonellato; Takao Isogai; Ji Zhang; Boris Lenhard; Sangsoo Kim; Zhu Chen; Ursula Hinz; Anne Estreicher; Kenta Nakai; Izabela Makalowska; Winston Hide; Nicola Tiffin; Laurens Wilming; Ranajit Chakraborty; 
Marcelo Bento Soares; Maria Luisa Chiusano; Yutaka Suzuki; Charles Auffray; Yumi Yamaguchi-Kabata; Takeshi Itoh; Teruyoshi Hishiki; Satoshi Fukuchi; Ken Nishikawa; Sumio Sugano; Nobuo Nomura; Yoshio Tateno; Tadashi Imanishi; Takashi Gojobori
Journal: Nucleic Acids Res Date: 2007-12-18 Impact factor: 16.971
