CFGP: a web-based, comparative fungal genomics platform.

Jongsun Park, Bongsoo Park, Kyongyong Jung, Suwang Jang, Kwangyul Yu, Jaeyoung Choi, Sunghyung Kong, Jaejin Park, Seryun Kim, Hyojeong Kim, Soonok Kim, Jihyun F Kim, Jaime E Blair, Kwangwon Lee, Seogchan Kang, Yong-Hwan Lee.

Abstract

Since the completion of the Saccharomyces cerevisiae genome sequencing project in 1996, the genomes of over 80 fungal species have been sequenced or are currently being sequenced. Resulting data provide opportunities for studying and comparing fungal biology and evolution at the genome level. To support such studies, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr), a web-based multifunctional informatics workbench, was developed. The CFGP comprises three layers, including the basal layer, middleware and the user interface. The data warehouse in the basal layer contains standardized genome sequences of 65 fungal species. The middleware processes queries via six analysis tools, including BLAST, ClustalW, InterProScan, SignalP 3.0, PSORT II and a newly developed tool named BLASTMatrix. The BLASTMatrix permits the identification and visualization of genes homologous to a query across multiple species. The Data-driven User Interface (DUI) of the CFGP was built on a new concept of pre-collecting data and post-executing analysis instead of the 'fill-in-the-form-and-press-SUBMIT' user interfaces utilized by most bioinformatics sites. A tool termed Favorite, which supports the management of encapsulated sequence data and provides a personalized data repository to users, is another novel feature in the DUI.

Year: 2007 PMID: 17947331 PMCID: PMC2238957 DOI: 10.1093/nar/gkm758

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Fungi exert a far-reaching influence on the earth's biosphere (1). As recyclers of organic matter and as symbionts of most terrestrial plants, fungi are essential components of healthy ecosystems (2). For thousands of years, humans have exploited fungi for the production of many useful compounds and foods (3). In contrast to these benefits, fungi are also a major cause of plant diseases, significantly reducing crop yield (4). Fungi also pose a direct threat to human health: systemic mycoses are the most common cause of death in immunocompromised patients, such as bone marrow transplant recipients and individuals suffering from advanced HIV infection (5,6). Studies on fungal biology have been greatly aided by rapidly accumulating genome sequence data (7). Since the completion of the genome sequencing of Saccharomyces cerevisiae (8), genomes of more than 80 fungal species have been completely sequenced or are currently being sequenced (7,9). As new high-throughput and low-cost sequencing technologies (10) become widely available, the rate of fungal genome sequencing will continue to accelerate. Currently available fungal genome sequences cover species in four out of the seven fungal phyla, including Ascomycota, Basidiomycota, Chytridiomycota and Microsporidia (11,12) (Table 1). These genome sequences provide novel opportunities for elucidating the evolutionary and genetic basis of many different fungal lifestyle features, such as pathogenesis, symbiosis and the ability to grow on diverse substrates (9,13,14), via the use of various functional genomic and informatic tools. A better understanding of fungal biology will not only facilitate the judicious use of beneficial fungi, but also advance our efforts to control pathogenic species (15,16).

Table 1.

List of genome sequences stored in the data warehouse of the CFGP

Species	Size (Mb)	No. of ORFs	Source^a	Reference
Eubacteria (Domain)^b
Actinobacteria (Phylum)
Bifidobacterium longum	2.3	1727	NCBI	(34)
Streptomyces coelicolor A3(2)	8.7	7769	CBS	(35)
Streptomyces avermitilis MA-4680	9.0	7575	CBS
Proteobacteria (Phylum)
Escherichia coli K12	4.6	4311	NCBI	(36)
Pseudomonas fluorescens Pf-5	7.1	6137	NCBI	(37)
Eukaryota (Domain)
Cryptophyceae (Kingdom)^c
Guillardia theta	0.7	627	CBS	(38)
Euglenozoa (Kingdom)^c
Leishmania infantum	34.7	3241	SGTC	(39)
Fungi (Kingdom)^d
Ascomycota (Phylum)
Pezizomycotina (Subphylum)
Botrytis cinerea	42.7	16 448	BI
Sclerotinia sclerotiorum	38.3	14 522	BI
Aspergillus clavatus	27.9	9119	BI
Aspergillus fischerianus	32.6	10 403	BI
Aspergillus flavus	36.8	12 587	BI
Aspergillus fumigatus	28.8	9926	TIGR	(40)
Aspergillus nidulans	30.1	10 701	BI	(17)
Aspergillus oryzae	37.1	12 062	DOGAN	(41)
Aspergillus terreus	29.3	10 406	BI
Aspergillus niger	37.2	11 200	JGI
Coccidioides immitis RS	28.9	10 457	BI
Coccidioides immitis H538.4	55.6		BI
Coccidioides immitis RMSCC 2394.1	28.9		BI
Coccidioides posadasii Silveria	27.4		BI
Coccidioides posadasii RMSCC 3488	28.1		BI
Histoplasma capsulatum	33.0	9349	BI
Uncinocarpus reesii	22.3	7798	BI
Chaetomium globosum	34.9	11 124	BI
Fusarium graminearum PH-1	36.6	13 321	BI	(42)
Fusarium graminearum GZ3639^e	15.1		BI	(42)
Fusarium oxysporum	61.4	17 608	BI
Fusarium verticillioides	41.9	14 155	BI
Fusarium solani	51.3	15 707	JGI
Magnaporthe oryzae	41.6	12 841	BI	(43)
Neurospora crassa	39.2	10 620	BI	(44)
Podospora anserina	35.7	9872	IGM
Trichoderma reesei	34.5	9997	JGI
Alternaria brassicicola	32.0		WGSC
Pyrenophora tritici-repentis	38.0		BI
Mycosphaerella graminicola	41.9	11 395	JGI
Mycosphaerella fijiensis	73.4	10 313	JGI
Stagonospora nodorum	37.2	16 597	BI
Saccharomycotina (Subphylum)
Candida albicans SC5314	27.8	14 216	SGTC	(45)
Candida albicans WO-1	14.5	6157	BI
Candida dubliniensis	14.5	6027	SI
Candida glabrata	12.3	5174	CBS	(19)
Candida guilliermondii	10.6	5920	BI
Candida lusitaniae	12.1	5941	BI
Candida parapsilosis	13.1		SI
Candida tropicalis	14.7	6258	BI
Debaryomyces hansenii	12.2	6354	CBS	(19)
Eremothecium gossypii	8.7	4718	NCBI	(46)
Kluyveromyces lactis	10.7	5327	Genoscope
Kluyveromyces waltii	10.6	5214	BI	(19)
Lodderomyces elongisporus	15.5	5796	BI
Saccharomyces cerevisiae 288C	12.2	5898	SGD	(47)
Saccharomyces cerevisiae RM11-1a	11.7	5383	BI
Saccharomyces cerevisiae YJM789	11.9	5471	SI
Saccharomyces bayanus	11.5	9385	BI	(47)
Saccharomyces castellii	11.4	4677	VBI	(48)
Saccharomyces kudriavzevii	11.2	3768	VBI
Saccharomyces kluyveri	11.0	2968	WUGSC	(48)
Saccharomyces mikatae	11.5	9016	BI	(47)
Saccharomyces paradoxus	11.9	8939	BI	(47)
Pichia stipitis	15.4	5839	JGI	(49)
Yarrowia lipolytica	20.5	6524	CBS	(19)
Taphrinomycotina (Subphylum)
Pneumocystis carinii^e	6.3	4020	SI
Schizosaccharomyces pombe	12.6	5005	GeneDB	(50)
Schizosaccharomyces japonicus	11.3	5172	BI
Basidiomycota (Phylum)
Agricomycotina (Subphylum)
Postia placenta	90.9	17 173	JGI
Phanerochaete chrysosporium	30.0	10 048	JGI	(51)
Coprinus cinereus	36.3	13 544	BI
Laccaria bicolor	64.9	20 614	JGI
Cryptococcus neoformans Serotype A	19.5	7302	BI
Cryptococcus neoformans Serotype B	19.0	6870	NCBI
Cryptococcus neoformans Serotype D B3501-A	19.3	6578	SGTC	(52)
Cryptococcus neoformans Serotype D JEC21	19.1	6475	SGTC	(52)
Pucciniomycotina (Subphylum)
Sporobolomyces roseus	21.2	5536	JGI
Puccinia graminis	88.7	20 567	BI
Ustilaginomycotina (Subphylum)
Ustilago maydis 521	19.7	6689	BI	(15)
Ustilago maydis FB1	19.7		BI	(15)
Chytridiomycota (Phylum)
Batrachochytrium dendrobatidis	23.9	8818	BI
Mucoromycotina (Subphylum incertae sedis)
Rhizopus oryzae	45.3	17 467	BI
Phycomyces blakesleeanus	55.9	14 792	JGI
Microsporidia (Phylum)
Encephalitozoon cuniculi	2.5	1996	Genoscope	(53)
Antonospora locustae	6.1	2606	JBPC
Stramenopila (Kingdom)^c
Peronosporomycota (Phylum)
Phytophthora infestans	228.5	22 658	BI
Phytophthora sojae	86.0	19 276	JGI	(54)
Phytophthora ramorum	66.7	16 066	JGI	(54)
Hyaloperonospora parasitica	83.8		VBI
Chloroplastida (Kingdom)^c
Charophyta (Phylum)
Arabidopsis thaliana	119.2	28 581	TAIR	(55)
Oryza sativa var. Japonica	370.8	37 555	IRGSP	(56)
Oryza sativa var. indica	426.3	49 710	BGI	(57)
Populus trichocarpa	485.5	58 036	JGI	(58)
Medicago truncatula	251.7	40 567	MTGSP
Metazoa (Kingdom)
Arthropoda (Phylum)
Anopheles gambiae	287.8	15 802	Ensembl	(59)
Drosophila melanogaster	118.4	19 389	BDGP	(60)
Cnidaria (Phylum)
Nematostella vectensis	356.6	27 273	JGI	(61)
Nematoda (Phylum)
Caenorhabditis elegans	100.3	21 124	NCBI	(62)
Urochordata (Phylum)
Ciona intestinalis	173.5	19 744	Ensembl	(63)
Ciona savignyi	177.0	20 150	Ensembl
Vertebrata (Phylum)
Danio rerio	1636.5	14 966	Ensembl
Tetraodon nigroviridis	402.2	28 005	Ensembl
Xenopus tropicalis	1510.9	28 305	Ensembl
Bos taurus	3144.2	32 991	Ensembl
Canis familiaris	2519.8	30 308	Ensembl
Gallus gallus	1105.2	24 166	Ensembl
Pan troglodytes	4295.0	39 648	Ensembl
Mus musculus	2724.2	36 471	Ensembl	(64)
Rattus norvegicus	2718.9	32 543	Ensembl
Homo sapiens	3418.7	33 869	Ensembl	(65)
Total	28 984.2	1 353 360

^a SGTC, Stanford Genome Technology Center; SI, Sanger Institute; CBS, Center For Biological Sequences; BI, Broad Institute; WGSC, Washington Univ. Genome Sequencing Center; JGI, DOE Joint Genome Institute; DOGAN, Database Of the Genomes Analyzed at Nite; IGM, Institut de Génétique et Microbiologie; TAIR, The Arabidopsis Information Resource; IRGSP, International Rice Genome Sequencing Project; BDGP, Berkeley Drosophila Genome Project; BGI, Beijing Genome Institute; VBI, Virginia Bioinformatics Institute; JBPC, Josephine Bay Paul Center for Comparative Molecular Biology and Evolution; MTGSP, Medicago Truncatula Genome Sequencing Project.

^b Taxonomy based on (66).

^c Taxonomy based on (67).

^d Taxonomy based on (12).

^e Incomplete coverage of genome information.

The abundance of sequenced species has facilitated in-depth comparative evolutionary genomic analyses across multiple fungal taxa (17–20). Because of the large amount of data involved, a cohesive, user-friendly informatics platform that links data and analysis tools is needed to efficiently support such analyses. Despite this need, the lack of data standardization has hampered the development of such platforms. The Genome Information Management System (GIMS) provided an integrated environment for archiving and visualization of genome sequences and data on transcriptome, protein–protein interaction, Gene Ontology (GO) and metabolic pathway (21). The ‘eFungi’, an improvement from the GIMS, stores genome sequences of 34 fungal and 2 Oomycete species (http://www.e-fungi.org.uk/). Although these systems systematically archive genomic data from multiple species, they do not support analysis of archived data with bioinformatic tools.
Heterogeneity of user interface (UI) and input/output data format in different bioinformatics tools has also complicated the integration of tools in a single platform to support multifaceted analyses of multiple genome sequences. Several systems provide multiple tools via a single platform. One example is the SNAP workbench, which supports sophisticated phylogenetic analyses through a menu-driven design (22). The iNquiry (BioTeam Inc., Wayland, MA, USA; http://web.bioteam.net/metadot/index.pl?id=2187) and European Molecular Biology Open Software Suite (EMBOSS) (23) are other examples of integrated, web-based platforms with multiple bioinformatic tools. The PLATCOM integrates a number of tools for comparative analysis of multiple genomes (24,25). These platforms, integrating data and tools, significantly shorten data analysis time by eliminating the need for visiting multiple, independent web sites to collect and analyse data. The ISYS platform utilizes middleware to link many different databases to data analysis tools using JAVA and allow these tools to communicate without any modification (26). Although these examples illustrate major improvements in supporting integrative analyses of genome sequence data via a single platform, the efficiency and expandability of such platforms require continuous enhancement, in order to adequately support utilization of rapidly increasing genome sequence data. Another area that requires improvement is the UI. Many currently available web-based bioinformatic platforms employ classical UI systems that simply display a list of functions or databases and provide a ‘paste-sequence-and-press-submit’ form (http://ausweb.scu.edu.au/aw02/papers/refereed/fitch/paper.html). Such UIs are easy to construct, but are not suitable for successively analysing sequence data with multiple tools. 
To provide an effective means for analysing fungal genome sequence data through a suite of tools across multiple species, we developed the Comparative Fungal Genomics Platform (CFGP), which consists of a large-scale genomic data warehouse, bioinformatics tools useful for comparative genome analyses and a novel UI. The UI of the CFGP provides easy access to sequence data stored in the data warehouse and seamlessly supports integrative data analyses using multiple tools. The data warehouse currently houses 101 genome databases in a standardized format for rapid data exchange. Bioinformatic tools incorporated into the CFGP were wrapped by a middleware program to efficiently manage tasks and facilitate data exchange between tools.

SYSTEM ARCHITECTURE AND DESIGN

The CFGP consists of three layers—the basal layer, middleware and the UI (Figure 1). The basal layer contains a data warehouse, which is managed using MySQL. Meta information for different types of biological data, including genome sequences, species and phenotype screening data, is placed as individual objects in this layer. The middleware connects the basal layer with the UI and supports the use of data analysis tools, including BLAST (27), ClustalW for multiple sequence alignment (28), InterProScan for predicting functional domains (29), SignalP 3.0 for predicting the presence of signal peptide (30), PSORT II for predicting subcellular localization (31) and a newly developed program named BLASTMatrix for identifying and summarizing the distribution pattern of homologous genes across the genome sequences stored in the CFGP. As a result of the standardization of data exchange, the functionality of the CFGP can be easily expanded by adding any new tools that function in the UNIX environment. The UI of the CFGP developed with PHP (http://www.php.net) is based on a new concept, termed the Data-driven User Interface (DUI). By collecting sequences to be analysed first and executing analyses later, the DUI significantly reduces the time required for analysing the same sequence data via multiple tools.

Figure 1.

Overall system architecture and data flow in the CFGP. The basal layer contains a data warehouse, Favorite (a personal data repository and management tool), and external databases, such as InterPro and GO, stored in the CFGP. The wrapper in the middle layer relays requests from the UI to both the internal and external programs. The task manager at the right side of the wrapper manages tasks by assigning them to servers. At the upper layer, the DUI, a template engine developed with PHP, operates. A ‘command’ from the user goes to the middle layer. The basal layer passes the data to the middle layer as ‘input’. At the middle layer, chosen programs generate results and pass them to the upper layer for ‘representation’ and to the basal layer for ‘storage’.

The three layers of the CFGP can be manipulated and developed independently, which provides an optimal environment for maintenance and expansion of the CFGP. This was made possible by employing a standardized scheme in building each layer. In the basal layer, functions and schema of databases were standardized in both naming rules and basic structure of programming style, which enhances the efficiency of database development. In the middle layer, communications between the CFGP and external programs were standardized via PERL modules. This facilitates the future expansion of functionality, because new programs can be easily incorporated into the CFGP by constructing additional PERL modules. In the DUI, most of the interface components were standardized as a function so that a developer can easily make a new UI with selected components.
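The idea of a middle layer that wraps each program behind one shared convention and hands tasks to servers can be illustrated with a minimal sketch. The CFGP itself uses PERL modules; the Python below is only a stand-in, and the class name `Middleware`, the round-robin assignment, the server names and the two placeholder tools are all assumptions made for illustration, not the CFGP implementation.

```python
from itertools import cycle


class Middleware:
    """Registry of tool wrappers that share one input/output convention."""

    def __init__(self, servers):
        self._tools = {}
        # Assumption: tasks are assigned to servers in simple round-robin order.
        self._servers = cycle(servers)

    def register(self, name, wrapper):
        """Add a tool. `wrapper` takes [(seq_id, sequence), ...] and returns {seq_id: result}."""
        self._tools[name] = wrapper

    def submit(self, name, records):
        """Run one registered tool on the records and report which server was chosen."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        server = next(self._servers)
        return {"tool": name, "server": server, "results": self._tools[name](records)}


def length_tool(records):
    # Placeholder analysis standing in for a wrapped external program.
    return {seq_id: len(seq) for seq_id, seq in records}


def gc_tool(records):
    # Second placeholder: fraction of G and C bases in each sequence.
    return {seq_id: sum(base in "GC" for base in seq) / len(seq) for seq_id, seq in records}
```

Because every wrapper accepts and returns the same structures, adding a tool in this sketch only requires one more `register` call, which mirrors the expandability argument made above.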

FEATURES OF THE CFGP

Data warehouse

Fungal genome sequence data in the public domain are stored in heterogeneous formats, posing a hurdle in integrating the data for comparative analysis. We retrieved these data and stored all Open Reading Frame (ORF) and contig (or chromosome) sequences of individual genomes in the data warehouse of the CFGP in a single format using MySQL. Subsequently, all sequence data were encapsulated as individual objects so that they can be easily analysed through multiple data analysis tools. The data warehouse currently houses the genome sequences of 65 fungal species, 4 Oomycete species and 27 non-fungal organisms (Table 1). The fungal genome databases cover 52 species belonging to the Ascomycota, eight species in the Basidiomycota, two species each in the Mucoromycotina and the Microsporidia and one in the Chytridiomycota (12).
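As a rough illustration of storing genomes from different sources in a single format, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names (`genome`, `orf`, `locus` and so on) are hypothetical and are not taken from the CFGP schema; the sequences in any usage are placeholders.

```python
import sqlite3

# Hypothetical two-table layout: one row per genome, one row per ORF.
SCHEMA = """
CREATE TABLE genome (
    genome_id INTEGER PRIMARY KEY,
    species   TEXT NOT NULL,
    source    TEXT NOT NULL
);
CREATE TABLE orf (
    orf_id    INTEGER PRIMARY KEY,
    genome_id INTEGER NOT NULL REFERENCES genome(genome_id),
    locus     TEXT NOT NULL,
    sequence  TEXT NOT NULL
);
"""


def build_warehouse():
    """Create an empty in-memory warehouse with the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_genome(conn, species, source, orfs):
    """Insert one genome and its ORFs; `orfs` is an iterable of (locus, sequence)."""
    cur = conn.execute(
        "INSERT INTO genome (species, source) VALUES (?, ?)", (species, source)
    )
    genome_id = cur.lastrowid
    # Sequences are upper-cased on the way in so that every source ends up in one form.
    conn.executemany(
        "INSERT INTO orf (genome_id, locus, sequence) VALUES (?, ?, ?)",
        [(genome_id, locus, seq.upper()) for locus, seq in orfs],
    )
    return genome_id


def orfs_for_species(conn, species):
    """Return (orf_id, locus, sequence) rows for one species, ordered by primary key."""
    rows = conn.execute(
        "SELECT o.orf_id, o.locus, o.sequence FROM orf o "
        "JOIN genome g ON g.genome_id = o.genome_id "
        "WHERE g.species = ? ORDER BY o.orf_id",
        (species,),
    )
    return rows.fetchall()
```

Each ORF receives an integer primary key on insertion, so downstream tools in such a design can refer to a sequence by key rather than by copying the sequence itself.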

Data-driven user interface (DUI)

Most of the bioinformatics tools currently available through the web typically provide a box in the UI for pasting a query sequence. However, as the complexity of scientific inquiries increases, often requiring multiple analyses with a single query, a single analysis with multiple sequences, or a combination of both, this type of UI becomes inefficient, and a new UI design is required (32). The only current solution for analysing a large number of sequences is batch processing of data, which likely requires some level of programming knowledge by the user. We developed the DUI to seamlessly support data management and integrative analyses using a suite of data analysis tools. It consists of two compartments: the Data Frame, which supports browsing and collection of data, and the Manipulation Frame, which supports data management (Figure 2A). Four browsing tools under the ‘SEQUENCE’ menu include Contig Browser for browsing data in the data warehouse, SequenceSet Browser for browsing data in databases such as Uniprot, MyGene Browser for browsing data in the user's own computer and NR Browser for NR and NT sequences of NCBI. The Manipulation Frame provides a mechanism for storing and organizing data collected in a personalized space in the CFGP. The collection arrow transfers selected sequence data from the Data Frame to the Manipulation Frame, where they can be analysed by any bioinformatic tools in the CFGP. This data management scheme significantly enhances the efficiency of data analysis, especially when large amounts of data are involved.

Figure 2.

Structure of DUI. (A) A screenshot shows the process of data acquisition from Contig Browser. On the left side, ‘Data Frame’ displays the list of Magnaporthe oryzae proteins and ‘Manipulation Frame’ on the right side shows a list of Favorite. The ‘Collection arrow’ in the middle transfers chosen sequences from the Data Frame to the Manipulation Frame. (B) Collected sequences can be analysed by data analysis tools in Favorite. Users can choose sequences by clicking the checkbox in front of each sequence. (C) A BLAST search output is shown with Favorite in the Manipulation Frame. From the BLAST result, users can transfer sequences to Favorite via the use of the ‘Collection Arrow’.

Favorite as a bioinformatic workbench

A new UI tool named Favorite was developed to provide a personalized hub for storing and managing sequences retrieved from the data warehouse (Figure 2B). By storing only the primary keys of chosen sequences, not the sequences themselves, Favorite significantly reduces the space needed for storing data. Data stored in Favorite can be analysed with one tool or a series of tools by simply clicking the appropriate analyses in the option window (Figure 2C). Five external programs, including BLAST, ClustalW, InterProScan, SignalP 3.0 and PSORT II, are available in Favorite. A BLAST search result can be presented in six different formats. One of them is ‘interpro view’, which displays the BLAST result annotated by InterPro to provide the functional prediction of the proteins in the BLAST output. The ClustalW provides three different output formats: the multiple sequence alignment, distance matrix and the bootstrapped phylogenetic tree. The MSA viewer and Phyloviewer aid the user in manipulating the results of multiple sequence alignments and phylogenetic trees, respectively (http://phyloviewer.riceblast.snu.ac.kr; J. Park et al., unpublished data). Results from InterProScan, SignalP and PSORT II are stored in the annotation database so that all results can be displayed in the annotation page of each query sequence. All analysis outputs provide an option of storing any sequences in the output into Favorite, offering an easy way to collect selected sequences for subsequent analyses. To empower the personalized use of Favorite, user authentication is required. Besides supporting the management of individual users’ data, Favorite can also be used to exchange data with other researchers. In addition, Favorite retains the user's original reference data, which overcomes any discrepancies between analyses conducted at different time points due to the frequent updating of external databases, such as the NR database in NCBI.
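The two ideas described here, keeping only primary keys and collecting data first so that analyses can be run later, can be sketched in a few lines. The class below is a hypothetical illustration rather than the CFGP code: the `warehouse` is modelled as a plain dictionary from key to sequence, and any callable stands in for an analysis tool.

```python
class Favorite:
    """A named collection that stores only primary keys, not the sequences themselves."""

    def __init__(self, name):
        self.name = name
        self._keys = []  # ordered, duplicate-free list of primary keys

    @property
    def keys(self):
        """Return a copy of the collected keys in collection order."""
        return list(self._keys)

    def collect(self, keys):
        """Add keys to the collection, silently skipping ones already present."""
        for key in keys:
            if key not in self._keys:
                self._keys.append(key)

    def resolve(self, warehouse):
        """Look up the stored keys in `warehouse`, a dict-like mapping of key to sequence."""
        return {key: warehouse[key] for key in self._keys if key in warehouse}

    def run(self, tool, warehouse):
        """Apply one analysis callable to every collected sequence."""
        return {key: tool(seq) for key, seq in self.resolve(warehouse).items()}
```

A collection built once can then be passed through several tools in turn, for example `fav.run(len, warehouse)` followed by another callable, without re-entering any sequence.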

BLASTMatrix, a novel tool for searching and visualizing potential homologs across multiple species

With the availability of a large number of completely sequenced fungal genomes, it is possible to analyse the distribution of homologous genes across fungal taxa (7,9). Repeated BLAST searches against individual genome datasets are currently required for this task, which is iterative and cumbersome (33). To solve this problem, a new tool named the BLASTMatrix was developed and linked to the CFGP. With a query sequence, the BLASTMatrix generates a table containing the best hit in each of the species, which is then organized according to their taxonomical positions (Figure 3A), and also calculates the distribution pattern of homologous genes in different taxonomic groups (Figure 3B). The output can include InterPro or GO terms, helping the prediction of putative functions of hypothetical proteins. Further analyses can then determine the orthologous relationships between the query and its homologs in individual species.
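The core of such a table, one best hit per species grouped by taxon, can be sketched as follows. This is an assumption-laden illustration and not the BLASTMatrix code: hits are taken to be already parsed into (species, subject_id, evalue) tuples, "best" is taken to mean the lowest e-value, and the cutoff `max_evalue=1e-5` is an arbitrary default chosen for the example.

```python
def best_hits(hits):
    """Keep the lowest-e-value hit per species.

    `hits` is an iterable of (species, subject_id, evalue) tuples.
    """
    best = {}
    for species, subject_id, evalue in hits:
        if species not in best or evalue < best[species][1]:
            best[species] = (subject_id, evalue)
    return best


def blast_matrix(hits, taxonomy, max_evalue=1e-5):
    """Arrange best hits by taxon; species without a qualifying hit map to None.

    `taxonomy` maps a taxon name to an ordered list of species names.
    """
    best = best_hits(hits)
    matrix = {}
    for taxon, species_list in taxonomy.items():
        row = {}
        for species in species_list:
            hit = best.get(species)
            # A hit only counts if its e-value passes the (assumed) cutoff.
            row[species] = hit if hit is not None and hit[1] <= max_evalue else None
        matrix[taxon] = row
    return matrix
```

Keeping empty cells as `None` rather than omitting them preserves the full species-by-taxon layout, so absences are as visible as matches.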

Figure 3.

Format of BLASTMatrix output. An example of BLASTMatrix output generated using the aflatoxin gene cluster in Aspergillus nidulans as queries. The results are presented in a matrix format (A) and a distribution based on e-value (B). Additionally, BLASTMatrix analyses the pattern of conservation in the BLASTMatrix dataset (such as ‘novel gene’, ‘highly conserved gene’ or ‘taxon-specific gene’) based on the distribution pattern of matched genes in all screened taxa.
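The text does not state the rules behind the conservation labels, so the sketch below uses invented ones purely to show how a label could be derived from the distribution of matches: no taxon with a hit gives ‘novel gene’, every taxon with a hit gives ‘highly conserved gene’, and exactly one taxon with a hit gives ‘taxon-specific gene’. The fallback label ‘partially conserved gene’ is also an assumption and does not appear in the paper.

```python
def classify_conservation(matrix):
    """Label a query by which taxa contain at least one hit.

    `matrix` maps a taxon name to {species: hit or None}.
    The decision rules are illustrative assumptions, not the BLASTMatrix rules.
    """
    taxa_with_hits = [
        taxon
        for taxon, row in matrix.items()
        if any(hit is not None for hit in row.values())
    ]
    if not taxa_with_hits:
        return "novel gene"
    if len(taxa_with_hits) == len(matrix):
        return "highly conserved gene"
    if len(taxa_with_hits) == 1:
        return f"taxon-specific gene ({taxa_with_hits[0]})"
    return "partially conserved gene"
```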

FUTURE PROSPECTS

Genome sequences, along with associated functional genomics data, will continue to accumulate at an exponential rate. To efficiently utilize this inflow of data, standardization of data and efficient communication among data analysis tools are required. Enhancing the standard of communication between programs will also help future expansion by integrating more bioinformatics tools and will provide a development environment for open source projects. Additional genomic information, such as alternative splicing and expression data derived from EST, SAGE and microarray experiments, can be added to the CFGP.
