
EuPathDB: the eukaryotic pathogen database.

Cristina Aurrecoechea, Ana Barreto, John Brestelli, Brian P Brunk, Shon Cade, Ryan Doherty, Steve Fischer, Bindu Gajria, Xin Gao, Alan Gingle, Greg Grant, Omar S Harb, Mark Heiges, Sufen Hu, John Iodice, Jessica C Kissinger, Eileen T Kraemer, Wei Li, Deborah F Pinney, Brian Pitts, David S Roos, Ganesh Srinivasamoorthy, Christian J Stoeckert, Haiming Wang, Susanne Warrenfeltz.

Abstract

EuPathDB (http://eupathdb.org) resources include 11 databases supporting eukaryotic pathogen genomic and functional genomic data, isolate data and phylogenomics. EuPathDB resources are built using the same infrastructure and provide a sophisticated search strategy system enabling complex interrogations of underlying data. Recent advances in EuPathDB resources include the design and implementation of a new data loading workflow, a new database supporting Piroplasmida (i.e. Babesia and Theileria), the addition of large amounts of new data and data types and the incorporation of new analysis tools. New data include genome sequences and annotation, strand-specific RNA-seq data, splice junction predictions (based on RNA-seq), phosphoproteomic data, high-throughput phenotyping data, single nucleotide polymorphism data based on high-throughput sequencing (HTS) and expression quantitative trait loci data. New analysis tools enable users to search for DNA motifs and define genes based on their genomic colocation, view results from searches graphically (i.e. genes mapped to chromosomes or isolates displayed on a map) and analyze data from columns in result tables (word cloud and histogram summaries of column content). The manuscript herein describes updates to EuPathDB since the previous report published in NAR in 2010.


Year: 2012 PMID: 23175615 PMCID: PMC3531183 DOI: 10.1093/nar/gks1113

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The Eukaryotic Pathogen Database (EuPathDB: http://eupathdb.org) is one of the five NIAID/NIH-funded Bioinformatics Resource Centers (BRCs) supporting infectious disease pathogens and invertebrate vectors of human disease (1–5). BRC resources provide free online access to functional genomic data with tools that enable integrated data interrogation (6). Additional information regarding the BRC program is available on NIAID websites (http://www.niaid.nih.gov/labsandresources/resources/dmid/brc/Pages/default.aspx) and the BRC portal site (http://pathogenportal.org). EuPathDB is specifically tasked with providing support to research communities investigating eukaryotic pathogens, in particular (but not limited to) category A–C priority pathogens and (re-)emerging pathogens. In addition, collaborative efforts between EuPathDB and GeneDB (7) with funding from The Bill and Melinda Gates Foundation and The Wellcome Trust made it possible to develop a kinetoplastid resource (8). Currently, EuPathDB includes 11 component sites listed with their web addresses in Table 1 (1,8–13). All databases incorporate the Strategies WDK, a unique graphical search tool that enables users to perform complex combinatorial queries (14). This system has been used by other genomic resources including FungiDB (http://fungidb.org) (15), SchistoDB (http://schistodb.net) (16), TBDB (http://www.tbdb.org/wdk/) (17) and BetaCell (http://www.betacell.org) (18).

Table 1.

This table lists EuPathDB resources, their web addresses and the included organisms

Database	Web address	Supported organisms
EuPathDB	http://eupathdb.org	All EuPathDB organisms listed below
AmoebaDB	http://amoebadb.org	Entamoeba histolytica, E. dispar, E. invadens, E. moshkovskii
CryptoDB	http://cryptodb.org	Cryptosporidium parvum, C. hominis, C. muris
GiardiaDB	http://giardiadb.org	Giardia lamblia assemblages A, B and E
MicrosporidiaDB	http://microsporidiadb.org	Edhazardia aedis, Encephalitozoon cuniculi, E. hellem, E. intestinalis, Enterocytozoon bieneusi, Hamiltosporidium tvaerminnensis, Nematocida parisii, Nosema ceranae, Vavraia culicis
PiroplasmaDB	http://piroplasmadb.org	Babesia bovis, Theileria annulata, T. parva
PlasmoDB	http://plasmodb.org	Plasmodium berghei, P. chabaudi, P. falciparum, P. gallinaceum, P. knowlesi, P. reichenowi, P. vivax, P. yoelii
ToxoDB	http://toxodb.org	Toxoplasma gondii, Eimeria tenella, Gregarina niphandrodes, Neospora caninum
TrichDB	http://trichdb.org	Trichomonas vaginalis
TriTrypDB	http://tritrypdb.org	Trypanosoma brucei, T. congolense, T. cruzi, T. vivax, Leishmania major, L. infantum, L. braziliensis, L. mexicana, L. panamensis, L. tarentolae, Endotrypanum monterogeii
OrthoMCL	http://orthomcl.org	Includes proteins from over 150 organisms across bacteria, archaea and eukarya.


WHAT IS NEW IN EuPathDB

Over the past 2 years, EuPathDB has made advances in its repertoire of databases, data content, analysis and visualization tools and its infrastructure.

New databases

The latest addition to the EuPathDB family of databases is PiroplasmaDB (http://piroplasmadb.org), which supports Babesia and Theileria parasites. The look and feel of PiroplasmaDB is identical to other EuPathDB resources. Searches in this database are conducted using the search strategy system (14), which involves the sequential addition of searches using set operations to produce a refined list of results (11). Figure 1A depicts a search strategy in PiroplasmaDB that defines a list of genes predicted to contain signal peptides, transmembrane domains or both, and are differentially regulated between a virulent and an attenuated strain of Babesia bovis (19). To facilitate collaborative efforts, search strategies may be shared using a uniquely generated URL (Figure 1B). For example, the search strategy displayed in Figure 1A may be accessed using the following address: http://piroplasmadb.org/piro/im.do?s=de44813e1905d647.
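The way a strategy combines its steps with set operations can be sketched in a few lines of Python. This is an illustration only, not the Strategies WDK implementation; the gene IDs and the contents of the three step results are invented for the example.

```python
# Hypothetical step results: each search returns a set of gene IDs.
signal_peptide = {"gene_A", "gene_B", "gene_C"}
transmembrane = {"gene_C", "gene_D"}
differentially_expressed = {"gene_B", "gene_D", "gene_E"}

# Step 1 UNION step 2: genes with a signal peptide, a transmembrane domain or both.
secretory_candidates = signal_peptide | transmembrane

# INTERSECT with step 3: keep only candidates that are also differentially regulated.
result = secretory_candidates & differentially_expressed

print(sorted(result))
```

Each additional step simply applies another set operation (union, intersection or difference) to the running result, which is why steps can be added sequentially.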

Figure 1.

Screen shots of a search strategy in PiroplasmaDB and GBrowse representing HTS (C–E from ToxoDB and F and G from AmoebaDB). (A) A three-step search strategy combining genes with predicted signal peptides, transmembrane domains and microarray expression data. (B) Search strategies may be saved and shared with others using a uniquely generated URL. (C) Peptides from mass spec experiments are mapped to genes and displayed graphically. Mousing over the graphics provides additional information, such as the peptide sequence and any posttranslational modifications. In this image, peptides are from a phosphoproteomic experiment. (D) A track representing strand-specific RNA-seq data. Blue indicates reads mapping to the forward strand, whereas red represents those mapping to the reverse strand. (E) Unified splice junction track representing intron-spanning RNA-seq reads from all experiments in the database. (F) A 2 kb region with alignment of DNA sequencing reads to the genome. (G) Zooming in to 100 bp displays the actual sequence allowing data inspection. Highlighted nucleotides represent SNPs.

ReFlow workflow system

The EuPathDB data builds are complex because the project includes 11 different websites, each with its own underlying database. In each bi-monthly release cycle, some of these databases are completely rebuilt (when there are major changes to multiple genomes). The rest may receive incremental updates to add high-value data sets, such as newly sequenced and annotated genomes or new functional experiments or to revise existing ones. In both cases, the build is controlled entirely by workflows using the ReFlow workflow system developed in-house. The workflows are dependency graphs specifying every step of creating the integrated database, from data acquisition, through analysis on a compute cluster, to cross-referencing and finally loading. As an example, PlasmoDB’s workflow has approximately 5000 distinct steps, which analyze and load data from approximately 250 data sets. ReFlow is uniquely suited to building genomic databases as it supports running ‘in reverse’ to remove outdated data. ReFlow is used during each build cycle to revise outdated data sets, to recompute cross-genome analyses when we add new genomes and to redo data that our QA process has identified as having a bug.
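The idea of a workflow as a dependency graph that can also be run 'in reverse' can be sketched as follows. ReFlow's own workflow format and internals are not described here, so the step names, the graph and the functions below are hypothetical; the sketch only shows how a build order and an undo order can be derived from the same graph (Python 3.9+ for `graphlib`).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "acquire_genome": set(),
    "run_interpro": {"acquire_genome"},
    "map_rnaseq": {"acquire_genome"},
    "cross_reference": {"run_interpro", "map_rnaseq"},
    "load_database": {"cross_reference"},
}

def build_order(graph):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())

def downstream_of(graph, step):
    """Return `step` plus every step that depends on it, directly or indirectly."""
    affected = {step}
    changed = True
    while changed:
        changed = False
        for name, deps in graph.items():
            if name not in affected and deps & affected:
                affected.add(name)
                changed = True
    return affected

def undo_order(graph, step):
    """Order in which to undo `step` and its dependents: the reverse of the build order."""
    affected = downstream_of(graph, step)
    return [s for s in reversed(build_order(graph)) if s in affected]

forward = build_order(workflow)
undo = undo_order(workflow, "map_rnaseq")
print(forward)
print(undo)
```

Undoing dependents before the step itself means that an outdated data set can be removed without leaving derived results behind; the same graph then drives the rebuild.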

New data content

The data content in EuPathDB has increased both in quantity and type. An updated data content table is available at the following URL: http://eupathdb.org/eupathdb/showXmlDataContent.do?name=XmlQuestions.GenomeDataType

Genome sequence and annotation

The number of available sequenced and annotated genomes has increased dramatically owing in large part to the presence of a number of sequencing ‘white papers’ specifically tasked with sequencing eukaryotic pathogens (i.e. The Broad Institute for Plasmodium and Microsporidia; the J. Craig Venter Institute for Toxoplasma and Entamoeba; and the Genome Institute at Washington University for Kinetoplastida). Additional whole-genome sequencing data are provided by the parasite genomics section of the Sanger Institute and individual research laboratories. EuPathDB incorporates both annotated and unannotated genomes, providing searches based on the provided data (i.e. annotation, BLAST analysis, sequence retrieval and download, etc.) and based on various analyses performed in-house [i.e. InterPro scan (20), open reading frame prediction, BLAT against the NCBI, Gene Ontology searches, searches against available functional data, etc.].

New data types include

Phosphoproteomic data

Mass spectrometry-based data representing peptides with phosphorylated amino acids have been incorporated allowing users to search for genes with modified peptides and graphically visualize modified peptides. Figure 1C shows a Genome Browser (GBrowse) (21) view from ToxoDB showing phospho-peptides mapped against genes (22). Mousing over the peptide glyphs reveals information regarding the peptide amino acid sequence, modified amino acid and genomic location.

Strand-specific RNA sequence (RNA-seq) data

Data from such experiments are represented in GBrowse as histograms of depth of read coverage. Reads aligning to the forward strand are in blue and those aligning to the reverse strand are in red (Figure 1D). Currently, strand-specific RNA-seq data are available in PlasmoDB (Newbold and Berriman groups, unpublished data) and ToxoDB (Boothroyd and Gregory groups, unpublished data).

Splice junction predictions (based on RNA-seq)

Intron-spanning RNA-seq reads are aligned to the genome using the RNA-seq unified mapper (23) (Figure 1E). Intron-spanning reads from individual experiments or from all available experiments combined may be visualized. Mousing over intron spans reveals experimental information and the number of reads that support the span enabling users to evaluate the confidence of the intron and identify genes that show evidence for alternative RNA processing.

Single nucleotide polymorphism data based on high-throughput sequencing

Single nucleotide polymorphisms (SNPs) based on high-throughput sequencing (HTS) data are determined by aligning reads to the reference genome using Bowtie (24), post-processing with SAMtools (25) and GATK (26) and ultimately calling SNPs using VarScan (27). Genes can be identified based on their SNP characteristics, and search parameters such as allele frequency (based on percent allele-matched reads), P-value and the depth of coverage supporting a SNP may be adjusted. Read pileup data are available in GBrowse, including the ability to view actual aligned sequence reads (Figure 1F and G) to further assess the quality of individual SNP calls.
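The three adjustable search parameters can be illustrated with a small filter over SNP calls. This is a sketch under stated assumptions: the `SnpCall` record, the `filter_snps` function, the example calls and the threshold values are all hypothetical and are not EuPathDB defaults or VarScan output fields.

```python
from dataclasses import dataclass

@dataclass
class SnpCall:
    position: int
    ref_reads: int   # reads supporting the reference allele
    alt_reads: int   # reads supporting the alternate allele
    p_value: float

    @property
    def coverage(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def allele_percent(self) -> float:
        # Percent of reads at this position that match the alternate allele.
        return 100.0 * self.alt_reads / self.coverage if self.coverage else 0.0

def filter_snps(calls, min_allele_percent=80.0, max_p_value=0.01, min_coverage=5):
    """Keep calls that satisfy all three (illustrative) thresholds."""
    return [
        c for c in calls
        if c.allele_percent >= min_allele_percent
        and c.p_value <= max_p_value
        and c.coverage >= min_coverage
    ]

calls = [
    SnpCall(101, ref_reads=1, alt_reads=19, p_value=1e-6),   # passes all filters
    SnpCall(250, ref_reads=10, alt_reads=10, p_value=1e-3),  # allele percent too low
    SnpCall(377, ref_reads=0, alt_reads=3, p_value=0.02),    # coverage and P-value fail
    SnpCall(512, ref_reads=2, alt_reads=18, p_value=0.05),   # P-value too high
]
kept = filter_snps(calls)
print([c.position for c in kept])
```

Loosening any one threshold widens the result set, which is the trade-off a user makes when adjusting these parameters in a search.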

Expression quantitative trait loci data

Genes may be identified based on their association to genome-wide expression-level polymorphisms from a genetic cross between phenotypically distinct parasite clones of Plasmodium falciparum (HB3 and Dd2) (28). These data may be searched and visualized in multiple ways. Genes may be identified based on their association to genomic segments, expression profile similarity or similarity of genetic association. Genomic segments can be identified based on their association to genes. Regions/spans that are associated by expression quantitative trait loci (eQTL) data are displayed in a table on gene pages, and both microsatellites and haplotype blocks are available as tracks in GBrowse.

High-throughput phenotyping data

Essential Trypanosoma brucei genes can be identified based on the decreased sequence read coverage generated from sequencing the population of expression library cassettes in a genome-wide RNAi-based screen (29). The high-throughput phenotyping search is located in the ‘Putative Function’ section under the heading ‘Identify Genes by’ on the TriTrypDB home page (8). A sample strategy (http://tritrypdb.org/tritrypdb/im.do?s=0e54e90e623cbbc2) searches these data for genes that are likely essential in all stages or time points examined. Graphs and tables representing the expression and percentile values for individual genes are available in the ‘Phenotype’ section of gene pages, and GBrowse tracks of coverage plots for each sample from this experiment are available.

New Tools

Genomic segment tool

DNA segments may be defined based on their genomic location or their nucleotide sequence (DNA motif pattern) (Figure 2A). This search dynamically generates segment records allowing the incorporation of results into a search strategy (see genomic colocation, below). This new search is available under ‘Identify Other Data Types’; click on ‘Genomic Segments (DNA motif)’ then select either ‘DNA motif pattern’ or ‘Genomic location’ (Figure 2A). Figure 2B shows the DNA motif pattern search page, which allows selection of target organisms to search (example shown from GiardiaDB) and an input window for the DNA motif pattern (simple text or a regular expression may be used). Results of a DNA motif pattern search are returned as a step in a strategy and the motif records are displayed including the identified motif (Figure 2C).
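A regular-expression motif search of this kind can be sketched with Python's `re` module. The function names, the example sequence and the decision to report hits on both strands in forward-strand coordinates are assumptions made for illustration; they do not describe how the EuPathDB search is implemented.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_motif(sequence, pattern):
    """Return (start, end, strand, matched_text) tuples.

    Coordinates are 1-based, inclusive and always given on the forward strand.
    Matches are non-overlapping, as reported by re.finditer.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for m in regex.finditer(sequence):
        hits.append((m.start() + 1, m.end(), "+", m.group()))
    rc = reverse_complement(sequence)
    n = len(sequence)
    for m in regex.finditer(rc):
        # Convert reverse-complement coordinates back to the forward strand.
        hits.append((n - m.end() + 1, n - m.start(), "-", m.group()))
    return sorted(hits)

# Hypothetical 18-nt sequence searched with a TATA-box-like pattern.
hits = find_motif("GGTATAAACCTTTATACC", "TATA[AT]A")
for hit in hits:
    print(hit)
```

Each hit corresponds to one dynamically generated segment record: a location, a strand and the matched sequence.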

Figure 2.

Screen shot from GiardiaDB depicting a genomic segment search. (A) Genomic segment searches (i.e. DNA motif pattern) are available on the home page. (B) DNA motifs may be entered as a standard string of characters or using a regular expression as depicted. (C) DNA segment records are generated dynamically and results are displayed in a search strategy with results represented in a dynamic table below the strategy.

Genomic colocation tool

This tool enables searches based on a user-defined relationship between entities with defined genomic coordinates (i.e. genes, SNPs, DNA motifs, etc.). For example, one may be interested in identifying all genes that have a SNP or a DNA motif located within 500-nt upstream of the 5′-end. Figure 3 illustrates the steps taken to find all genes that have a DNA motif, defined as in Figure 2C, located within 500-nt upstream of the 5′-end. After running a DNA pattern search, a step is added to define all genes in the organism of interest (Figure 3A). Since the steps in this strategy include different result types (DNA motifs and genes), the only option available for combining the results is the genomic colocation option (Figure 3A). The next step is to define which results to retrieve based on the user-defined colocation relationship (Figure 3B). The customizable colocation popup provides a dynamic logic statement that is updated based on the chosen parameters (Figure 3B). Once the parameters are set, the logic statement in this example is ‘Return each gene from step 2 whose upstream region contains the exact region of a Genomic Segment from step 1 and is on the same strand’. Clicking on ‘Get Answer’ returns all genes that meet the colocation criteria; in addition to gene IDs, the results include the number and location of matches (Figure 3C).
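The logic statement quoted above can be expressed as a short coordinate test. The sketch below assumes 1-based inclusive coordinates and a single sequence (chromosome names are omitted for brevity); the `Feature` record, the function names and the example genes and motifs are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Feature:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"

def upstream_region(gene, length=500):
    """Coordinates of the `length` nt immediately upstream of the gene's 5' end."""
    if gene.strand == "+":
        return max(1, gene.start - length), gene.start - 1
    # On the reverse strand the 5' end is at the higher coordinate.
    return gene.end + 1, gene.end + length

def colocated_genes(genes, segments, length=500):
    """Map each gene name to the segments fully contained in its upstream
    region and lying on the same strand; genes without matches are omitted."""
    result = {}
    for gene in genes:
        lo, hi = upstream_region(gene, length)
        matches = [
            s.name for s in segments
            if s.strand == gene.strand and lo <= s.start and s.end <= hi
        ]
        if matches:
            result[gene.name] = matches
    return result

genes = [
    Feature("gene1", 1000, 2000, "+"),
    Feature("gene2", 3000, 4000, "-"),
    Feature("gene3", 6000, 7000, "+"),
]
segments = [
    Feature("motif1", 700, 705, "+"),    # within 500 nt upstream of gene1
    Feature("motif2", 4100, 4105, "-"),  # upstream of gene2 on the reverse strand
    Feature("motif3", 5800, 5805, "-"),  # upstream of gene3 but on the other strand
    Feature("motif4", 400, 405, "+"),    # more than 500 nt upstream of gene1
]
matches = colocated_genes(genes, segments)
print(matches)
```

Changing the region (for example downstream instead of upstream), the containment rule or the strand requirement corresponds to choosing different parameters in the colocation popup.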

Figure 3.

Screen shots depicting the genomic colocation query in EuPathDB resources. In this example from GiardiaDB, genes that have a DNA motif located within 500-nt upstream are identified. (A) To identify genes in relation to DNA motifs, a step searching for genes based on the organism of interest is added to the strategy. The genomic colocation option is selected by default when combining different record types, such as DNA motifs and genes. (B) The customizable colocation popup provides a dynamic logic statement that is updated based on the chosen parameters. (C) Results of colocation query. Top of the panel shows the search strategy and the bottom portion includes the results with columns for gene IDs, number of matched motifs in the defined region and match genomic coordinates.

Figure 4.

Screen shots from PlasmoDB showing in (A) a typical result list from a search strategy, (B) an alternative graphical representation of genes on chromosomes, (C) a word cloud generated by clicking on the column analysis icon for the product description column and (D) a histogram generated by clicking on the column analysis icon for the ortholog count column.

Alternative views of search results

Search results are typically visualized as a list of results in a table with customizable columns (Figure 4A) (1). A new feature provides tabs that enable users to choose alternative data views. For example, in gene results pages, users can choose a graphical visualization of their genes mapped on the genome (Figure 4B) to determine whether there is bias in the genomic distribution. A user may zoom in on individual chromosomes and click on the gene graphic to visit the gene page or a GBrowse view. For isolate results, users can select a Google map view to visualize the geographic distribution of the isolates. Clicking on a pin pops up specific information, with the option to retrieve isolate results from that country.

Column analysis

This tool enables users to analyze data within columns of the results table after running a search. To access this feature, run any search that returns a list of results, then click on the icon next to the column name (Figure 4A). Currently, this tool offers two analyses: word clouds for columns containing text (Figure 4C) and histograms for columns containing numbers (Figure 4D). Further analyses, including enrichment analysis for GO terms, EC numbers and pathways, will be implemented in the near future.
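The two column summaries amount to counting: word frequencies for a text column and binned counts for a numeric column. The following sketch uses invented column values and hypothetical function names; it shows the underlying tallies, not the graphical rendering or the EuPathDB implementation.

```python
import re
from collections import Counter

def word_frequencies(descriptions):
    """Count how often each word occurs across a text column (case-insensitive)."""
    words = []
    for text in descriptions:
        words.extend(re.findall(r"[a-z0-9]+", text.lower()))
    return Counter(words)

def histogram(values, bin_width):
    """Count integer values per bin; each key is the lower edge of its bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical column contents from a gene result table.
product_descriptions = [
    "hypothetical protein",
    "protein kinase, putative",
    "hypothetical protein, conserved",
]
ortholog_counts = [1, 3, 12, 15, 27, 8]

frequencies = word_frequencies(product_descriptions)
bins = histogram(ortholog_counts, bin_width=10)
print(frequencies.most_common(2))
print(bins)
```

In a word cloud the font size of each word would scale with its count; in the histogram each bin count becomes the height of one bar.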

Updated Genome Browser

The GMOD Genome Browser has been updated to version 2.48. The update provides several new GBrowse features to EuPathDB users, including the ability to upload BAM files in the custom tracks section allowing private display of HTS data in the context of other available data tracks. Additional features available in GBrowse may be accessed at the following URL: http://gmod.org/wiki/GBrowse_2.0_HOWTO

Future directions

EuPathDB resources will continue to expand in data content and type, and in functionality. Development projects currently underway include: integration of OrthoMCL into the Strategies WDK, which would facilitate better integration of OrthoMCL data with the rest of EuPathDB and promote integrated evolution-based queries using search strategies; incorporation of mass spectrometric metabolomic data, allowing queries for changes in the metabolome of parasites in response to developmental or environmental changes; incorporation of parasite host response data, enabling users to ask questions regarding changes in host cells (i.e. RNA-seq, microarray, proteomics, etc.) in response to infection by eukaryotic parasites; enabling direct data export from EuPathDB to a Galaxy server (30), which would allow users to perform custom analyses, such as analysis of RNA-seq results, SNP analysis and phylogenetic tree reconstruction, with data obtained from EuPathDB and their own uploaded data; and enabling GBrowse login to allow users to store their custom tracks and GBrowse preferences in their EuPathDB user profile.

FUNDING

National Institute of Allergy and Infectious Diseases (EuPathDB); National Institutes of Health, Department of Health and Human Services [Contract No. HHSN272200900038C to D.S.R., C.J.S. and J.C.K.]; Bill and Melinda Gates Foundation, The Wellcome Trust [WT085822MA, The TriTrypDB component of EuPathDB]. Funding for open access charge: NIH Contract No. [HHSN272200900038C]. Conflict of interest statement. None declared.

30 in total

1. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data.

Authors: Aaron McKenna; Matthew Hanna; Eric Banks; Andrey Sivachenko; Kristian Cibulskis; Andrew Kernytsky; Kiran Garimella; David Altshuler; Stacey Gabriel; Mark Daly; Mark A DePristo
Journal: Genome Res Date: 2010-07-19 Impact factor: 9.043

2. VarScan: variant detection in massively parallel sequencing of individual and pooled samples.

Authors: Daniel C Koboldt; Ken Chen; Todd Wylie; David E Larson; Michael D McLellan; Elaine R Mardis; George M Weinstock; Richard K Wilson; Li Ding
Journal: Bioinformatics Date: 2009-06-19 Impact factor: 6.937

Review 3. TB database 2010: overview and update.

Authors: James E Galagan; Peter Sisk; Christian Stolte; Brian Weiner; Michael Koehrsen; Farrell Wymore; T B K Reddy; Jeremy D Zucker; Reinhard Engels; Marcel Gellesch; Jeremy Hubble; Heng Jin; Lisa Larson; Maria Mao; Michael Nitzberg; Jared White; Zachariah K Zachariah; Gavin Sherlock; Catherine A Ball; Gary K Schoolnik
Journal: Tuberculosis (Edinb) Date: 2010-05-20 Impact factor: 3.131

4. Aligning short sequencing reads with Bowtie.

Authors: Ben Langmead
Journal: Curr Protoc Bioinformatics Date: 2010-12

5. The Sequence Alignment/Map format and SAMtools.

Authors: Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal: Bioinformatics Date: 2009-06-08 Impact factor: 6.937

6. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences.

Authors: Jeremy Goecks; Anton Nekrutenko; James Taylor
Journal: Genome Biol Date: 2010-08-25 Impact factor: 13.583

7. AmoebaDB and MicrosporidiaDB: functional genomic resources for Amoebozoa and Microsporidia species.

Authors: Cristina Aurrecoechea; Ana Barreto; John Brestelli; Brian P Brunk; Elisabet V Caler; Steve Fischer; Bindu Gajria; Xin Gao; Alan Gingle; Greg Grant; Omar S Harb; Mark Heiges; John Iodice; Jessica C Kissinger; Eileen T Kraemer; Wei Li; Vishal Nayak; Cary Pennington; Deborah F Pinney; Brian Pitts; David S Roos; Ganesh Srinivasamoorthy; Christian J Stoeckert; Charles Treatman; Haiming Wang
Journal: Nucleic Acids Res Date: 2010-10-24 Impact factor: 16.971

8. TriTrypDB: a functional genomic resource for the Trypanosomatidae.

Authors: Martin Aslett; Cristina Aurrecoechea; Matthew Berriman; John Brestelli; Brian P Brunk; Mark Carrington; Daniel P Depledge; Steve Fischer; Bindu Gajria; Xin Gao; Malcolm J Gardner; Alan Gingle; Greg Grant; Omar S Harb; Mark Heiges; Christiane Hertz-Fowler; Robin Houston; Frank Innamorato; John Iodice; Jessica C Kissinger; Eileen Kraemer; Wei Li; Flora J Logan; John A Miller; Siddhartha Mitra; Peter J Myler; Vishal Nayak; Cary Pennington; Isabelle Phan; Deborah F Pinney; Gowthaman Ramasamy; Matthew B Rogers; David S Roos; Chris Ross; Dhileep Sivam; Deborah F Smith; Ganesh Srinivasamoorthy; Christian J Stoeckert; Sandhya Subramanian; Ryan Thibodeau; Adrian Tivey; Charles Treatman; Giles Velarde; Haiming Wang
Journal: Nucleic Acids Res Date: 2009-10-20 Impact factor: 16.971

9. EuPathDB: a portal to eukaryotic pathogen databases.

Authors: Cristina Aurrecoechea; John Brestelli; Brian P Brunk; Steve Fischer; Bindu Gajria; Xin Gao; Alan Gingle; Greg Grant; Omar S Harb; Mark Heiges; Frank Innamorato; John Iodice; Jessica C Kissinger; Eileen T Kraemer; Wei Li; John A Miller; Vishal Nayak; Cary Pennington; Deborah F Pinney; David S Roos; Chris Ross; Ganesh Srinivasamoorthy; Christian J Stoeckert; Ryan Thibodeau; Charles Treatman; Haiming Wang
Journal: Nucleic Acids Res Date: 2009-11-13 Impact factor: 16.971

10. SchistoDB: a Schistosoma mansoni genome resource.

Authors: Adhemar Zerlotini; Mark Heiges; Haiming Wang; Romulo L V Moraes; Anderson J Dominitini; Jerônimo C Ruiz; Jessica C Kissinger; Guilherme Oliveira
Journal: Nucleic Acids Res Date: 2008-10-08 Impact factor: 16.971
