VFDB 2008 release: an enhanced web-based resource for comparative pathogenomics.

Jian Yang, Lihong Chen, Lilian Sun, Jun Yu, Qi Jin.

Abstract

The virulence factor database (VFDB) was set up in 2004 to provide current knowledge of virulence factors (VFs) from various medically significant bacterial pathogens and to facilitate pathogenomic research. Complete genome sequences of almost all the major pathogenic microbes have now been determined, which makes comparative genomics a powerful approach for uncovering novel virulence determinants and hidden aspects of pathogenesis. VFDB was therefore upgraded to present the enormous diversity of bacterial genomes in terms of virulence genes and their organization. The VFDB 2008 release includes the following new features: (i) detailed tabular comparison of the virulence composition of a given genome with other genomes of the same genus, (ii) multiple alignments and statistical analysis of homologous VFs and (iii) graphical comparison of the genomic organization of virulence genes. Comparative analysis of the numerous VFs will improve our understanding of the nature and evolution of virulence, and will aid the development of new therapeutic and preventive strategies. The VFDB 2008 release offers more user-friendly tools for comparative pathogenomics and is publicly accessible at http://www.mgc.ac.cn/VFs/.

Year:  2007        PMID: 17984080      PMCID: PMC2238871          DOI: 10.1093/nar/gkm951

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Infectious diseases remain one of the biggest threats to public health despite the advances of modern medicine in the post-genome era (1). Virulence factors (VFs) are the traits, encoded by ‘virulence genes’, that equip pathogenic microbes to cause infection. To combat infectious diseases, a better understanding of VFs is necessary to decipher the mechanisms that pathogenic microbes employ. VFDB was built to meet the challenge of providing up-to-date knowledge about VFs from various medically important bacterial pathogens (2). The term pathogenomics describes genomic approaches to studying how microbial pathogens interact with their hosts; in other words, pathogenomics is the study of pathogenic microbes and the hosts they infect at the genomic level. The availability of complete genome sequences of different microbial species enables comparative studies to identify common as well as species- or strain-specific VFs. Pathogenic bacteria have acquired various VFs that allow them to colonize diverse niches, cause infection and survive in their hosts. Commonly shared VFs indicate a universal requirement for infection by related pathogens, whereas narrowly distributed VFs often determine species- and/or strain-specific characteristics. Comparative genomic approaches were therefore introduced into VFDB to explore VFs within completely sequenced bacterial genomes. The VFDB 2008 release not only collects up-to-date knowledge about VFs from over 200 complete genomes of pathogenic bacteria, but also incorporates a set of analytical tools to meet the needs of comparative pathogenomic studies.

DATABASE UPDATES

Data source and construction for comparative analysis

Information on publicly available bacterial genomes was retrieved from the ‘Complete Microbial Genomes’ summary page at NCBI (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi). RefSeq is a curated, non-redundant collection of sequences in a uniform format (3). To simplify later data processing, only genomes available from the RefSeq database were included in the comparative analysis (both pathogenic and non-pathogenic isolates). The complete genome sequences and annotations were batch downloaded from the RefSeq FTP server (ftp://ftp.ncbi.nih.gov/genomes/Bacteria/). The VF loci in each genome were obtained from the original literature and subsequent reviews. Each VF gene was verified by sequence-similarity search against the genomes of related bacterial pathogens, using the NCBI BLAST software for local searches (4). The collected data were manually inspected, and each homolog group was further validated by multiple alignment (see below). A series of BioPerl scripts was designed to extract features for all desired loci from the downloaded genome files in a semi-automated fashion. An enhanced multiple genome map viewer (5) was employed for graphical comparison of pathogenomic organization (see below). ClustalW (6) was run in batch mode by an in-house Perl script to generate multiple alignments for each homolog group.
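The feature-extraction step can be illustrated with a minimal sketch. The authors used BioPerl scripts that are not shown in the paper; the Python below is a stand-in that assumes the standard GenBank flat-file layout (feature keys in column 6, qualifiers in column 22) and handles only single-line qualifiers. The function names, the gene names `vfA`/`vfB` and the `DEMO_` locus tags are hypothetical.

```python
import re

# Simplified patterns for the GenBank feature table layout (an assumption of
# this sketch): feature keys start in column 6, qualifiers in column 22.
FEATURE_RE = re.compile(r"^ {5}(\S+)\s+(\S+)$")
QUALIFIER_RE = re.compile(r'^ {21}/(\w+)="?([^"]*)"?$')


def parse_features(genbank_text):
    """Collect features as dicts holding key, location and single-line qualifiers."""
    features, current = [], None
    for line in genbank_text.splitlines():
        feature = FEATURE_RE.match(line)
        if feature:
            current = {"key": feature.group(1), "location": feature.group(2)}
            features.append(current)
            continue
        qualifier = QUALIFIER_RE.match(line)
        if qualifier and current is not None:
            current[qualifier.group(1)] = qualifier.group(2)
    return features


def extract_loci(genbank_text, wanted_genes):
    """Return {gene: (locus_tag, location)} for CDS features of the wanted genes."""
    loci = {}
    for feat in parse_features(genbank_text):
        if feat["key"] == "CDS" and feat.get("gene") in wanted_genes:
            loci[feat["gene"]] = (feat.get("locus_tag"), feat["location"])
    return loci


def _feature(key, location, **qualifiers):
    """Format one feature block with GenBank column positions (demo helper)."""
    lines = [" " * 5 + key.ljust(16) + location]
    lines += [" " * 21 + f'/{name}="{value}"' for name, value in qualifiers.items()]
    return "\n".join(lines)


# Made-up annotation fragment; gene names, tags and coordinates are illustrative.
SAMPLE = "\n".join([
    "FEATURES             Location/Qualifiers",
    _feature("gene", "1..900", gene="vfA", locus_tag="DEMO_0001"),
    _feature("CDS", "1..900", gene="vfA", locus_tag="DEMO_0001",
             product="hypothetical toxin"),
    _feature("CDS", "complement(1200..2400)", gene="vfB", locus_tag="DEMO_0002"),
])

if __name__ == "__main__":
    print(extract_loci(SAMPLE, {"vfA", "vfB"}))
```

A production pipeline would use a full GenBank parser rather than regular expressions, since real records contain multi-line qualifiers and split locations that this sketch ignores.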

Full tabular comparison of the pathogenomic composition

Tables are commonly used in the scientific literature for comparative analysis. For each genus, a full comparison of pathogenomic composition is given as a spreadsheet that integrates information about VFs and genomes (see Figure 1A for an example). The far-left column organizes all known VFs into functional groups (toxins, lipases, etc.), and the next column lists all VF genes. Each row gives the gene IDs (i.e. the ‘locus_tag’ in annotation files) in the respective genomes, with pseudogenes marked by stars. Each gene ID in the table is a hyperlink to an individual page with the full DNA and protein sequences. All tables can be downloaded as Excel files.
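The layout of the full comparison table can be sketched as follows. This is an illustration, not the VFDB implementation: the record fields, the genome names and the locus tags are hypothetical, and an empty cell for a missing gene is an assumption of the sketch. Only the column order (functional group, gene, one column per genome) and the star for pseudogenes come from the description above.

```python
import csv
import io


def build_full_table(records, genomes):
    """Build rows of [group, gene, locus_tag per genome]; pseudogenes get a star."""
    rows = {}
    for rec in records:
        key = (rec["group"], rec["gene"])
        cell = rec["locus_tag"] + ("*" if rec.get("pseudo") else "")
        rows.setdefault(key, {})[rec["genome"]] = cell
    table = [["Group", "Gene"] + list(genomes)]
    for group, gene in sorted(rows):
        # Assumption: a gene missing from a genome is shown as an empty cell.
        cells = [rows[(group, gene)].get(genome, "") for genome in genomes]
        table.append([group, gene] + cells)
    return table


def to_tsv(table):
    """Serialize the table as tab-separated text, which spreadsheets can open."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(table)
    return buffer.getvalue()


# Hypothetical input records for two genomes.
RECORDS = [
    {"genome": "genome_A", "group": "Toxin", "gene": "vfA", "locus_tag": "A_0001"},
    {"genome": "genome_B", "group": "Toxin", "gene": "vfA", "locus_tag": "B_0007",
     "pseudo": True},
    {"genome": "genome_A", "group": "Adherence", "gene": "vfB", "locus_tag": "A_0150"},
]

if __name__ == "__main__":
    print(to_tsv(build_full_table(RECORDS, ["genome_A", "genome_B"])))
```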

Figure 1. Comparative pathogenomic results of four sequenced Listeria genomes. (A) Full tabular comparison of pathogenomic composition. (B) Graphical overview for the comparison of pathogenomic organization. VF genes are color-coded by their functional classifications. Homologues between each adjacent pair of genomes are indicated by connecting lines for convenience of further interpretation.

For a quick overview, the fully detailed tables can be converted to simplified tables, in which each VF occupies a single row and gene details are replaced by symbols: ‘+’ for presence, ‘−’ for absence, and ‘±’ for partial or non-functional genes. The simplified table can be viewed in text mode, symbolic mode or schematic mode. Some genera have many complete genome sequences; an extreme case is Streptococcus, with 25 complete genomes to date. Because a subset of genomes may be of more interest for certain studies, a filter allows tables to be generated that contain only selected genomes. For example, a table can be generated that presents data for only the 12 S. pyogenes genomes within the genus Streptococcus.
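The conversion to a simplified table, together with the genome filter, might look like the sketch below. The aggregation rule is an assumption made for illustration: a VF is scored as present when all of its genes are found intact, as partial when at least one gene is missing or starred as a pseudogene, and as absent when no gene is found. The source does not state how VFDB derives the symbols, and all names and locus tags here are hypothetical.

```python
PRESENT, ABSENT, PARTIAL = "+", "-", "\u00b1"


def simplify(vf_genes, full_table, genomes):
    """Collapse per-gene cells into one symbol per VF and genome.

    vf_genes:   {vf_name: [gene, ...]}
    full_table: {gene: {genome: locus_tag}}, with a trailing '*' for pseudogenes
    genomes:    genomes to include, which doubles as the selection filter
    """
    simple = {}
    for vf, genes in vf_genes.items():
        row = {}
        for genome in genomes:
            cells = [full_table.get(gene, {}).get(genome) for gene in genes]
            present = [cell for cell in cells if cell]
            intact = [cell for cell in present if not cell.endswith("*")]
            if not present:
                row[genome] = ABSENT
            elif len(intact) == len(genes):
                row[genome] = PRESENT
            else:
                row[genome] = PARTIAL  # assumed rule: incomplete or pseudogene
        simple[vf] = row
    return simple


# Hypothetical data: VF_one has two genes, VF_two has one.
VF_GENES = {"VF_one": ["vfA", "vfB"], "VF_two": ["vfC"]}
FULL_TABLE = {
    "vfA": {"genome_A": "A_0001", "genome_B": "B_0007*"},
    "vfB": {"genome_A": "A_0002", "genome_B": "B_0008"},
    "vfC": {"genome_A": "A_0100"},
}

if __name__ == "__main__":
    for vf, row in simplify(VF_GENES, FULL_TABLE, ["genome_A", "genome_B"]).items():
        print(vf, row)
```

Passing a shorter `genomes` list yields a table restricted to the selected genomes, which mirrors the filter described above.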

Multiple alignments of homologous virulence genes

Analysis of homologous genes is a powerful approach for elucidating gene structure, function and evolution. The diversity of nucleotide sequences of bacterial genes often reflects the particular niches a microbe colonizes in vivo and in the environment (7). From the full-comparison table described above, a single click on a gene name returns a page with multiple alignments of both the nucleotide and amino acid sequences, if homologous genes exist. A summary table above the alignment gives statistics on unmatched overhangs, alignment length and the percentage of polymorphic sites. A configurable phylogenetic tree constructed by ClustalW is displayed beneath the alignment when more than two sequences are involved. A filter is also provided to perform multiple alignment on selected sequences only. In this case, however, there is no pre-computed result, and ClustalW must be run online, which produces an alignment in a few seconds.
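The summary statistics can be illustrated with a short sketch. The paper does not define the quantities precisely, so the definitions here are assumptions: the overhang is the number of leading and trailing columns in which at least one sequence has only terminal gaps, and a site is polymorphic when the remaining core column contains more than one distinct character (internal gaps included). The input is an already aligned set of equal-length sequences, and the example sequences are made up.

```python
def alignment_stats(aligned):
    """Summarize an alignment given as equal-length strings with '-' for gaps."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    # Assumed definition: overhang = longest run of terminal gaps on each side.
    left = max(len(seq) - len(seq.lstrip("-")) for seq in aligned)
    right = max(len(seq) - len(seq.rstrip("-")) for seq in aligned)
    core = range(left, length - right)
    # Assumed definition: a core column is polymorphic if its characters differ.
    polymorphic = sum(1 for i in core if len({seq[i] for seq in aligned}) > 1)
    percent = 100.0 * polymorphic / len(core) if len(core) else 0.0
    return {
        "alignment_length": length,
        "left_overhang": left,
        "right_overhang": right,
        "polymorphic_sites": polymorphic,
        "percent_polymorphic": percent,
    }


# Made-up aligned nucleotide sequences.
ALIGNED = ["--ATGGCTA", "ATATGACTA", "ATATGGCT-"]

if __name__ == "__main__":
    print(alignment_stats(ALIGNED))
```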

Concise graphical comparison of pathogenomic organization

Bacterial genome evolution has been driven by nucleotide substitutions and indels, as well as by changes in genome architecture through genetic rearrangements, including translocations and inversions (8). Recent comparative genomic studies have revealed that dynamic changes in genome structure contribute greatly to the adaptive evolution of certain bacterial pathogens, such as Shigella (9). To display the dynamic features of the genomes unambiguously and to compare the genomic organization of VFs among related pathogens, an enhanced multiple genome map viewer was implemented, which depicts all VF genes as clickable arrows (or bars) color-coded by functional classification. Because details about genes unrelated to virulence are hidden, the map is concise, although not to scale, and suitable for quick examination of the genomic organization of VFs among related genomes. The viewer page provides three display modes: (i) complete mode, which shows the full-scale pathogenomic map, informative but usually large; (ii) compact mode, which provides details of all virulence loci but omits flanking genes/regions; and (iii) overview mode, which scales the map to fit the full screen without giving details. To facilitate interpretation of pathogenomic synteny in overview mode, lines connect homologous VF genes of adjacent genomes when only one replicon is available (or selected by the user) for each genome. Users can also run the viewer with selected genomes of interest. The usefulness of displaying synteny is highlighted in the case of Listeria species (Figure 1B), whose genomes exhibit high synteny in virulence gene organization. This agrees with recent listerial pangenome studies, which revealed the lack of inversions or shifting of large genome segments in the sequenced Listeria genomes. A possible reason is the low occurrence of transposons and insertion sequence elements in those genomes (10).
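The connecting lines drawn in overview mode can be sketched as a pairing of homologous VF genes between adjacent genomes. This is not the viewer's code: it assumes one replicon per genome, that homologs share a gene name, and that each gene name occurs once per genome. The crossing count is an added, crude indicator of broken synteny rather than a feature described in the paper, and the genome and gene names are hypothetical.

```python
def connect_adjacent(genomes):
    """For each adjacent genome pair, link homologous VF genes by their order index.

    genomes: list of (genome_name, [gene names in genome order])
    Returns a list of (name_a, name_b, [(index_in_a, index_in_b), ...]).
    """
    links = []
    for (name_a, genes_a), (name_b, genes_b) in zip(genomes, genomes[1:]):
        index_b = {gene: j for j, gene in enumerate(genes_b)}
        pairs = [(i, index_b[gene]) for i, gene in enumerate(genes_a)
                 if gene in index_b]
        links.append((name_a, name_b, pairs))
    return links


def count_crossings(pairs):
    """Count pairs of connecting lines that cross; zero means collinear order."""
    return sum(
        1
        for x in range(len(pairs))
        for y in range(x + 1, len(pairs))
        if pairs[x][1] > pairs[y][1]
    )


# Hypothetical gene orders: genome_C carries a rearranged subset of the genes.
GENOMES = [
    ("genome_A", ["vfA", "vfB", "vfC", "vfD"]),
    ("genome_B", ["vfA", "vfB", "vfC", "vfD"]),
    ("genome_C", ["vfA", "vfC", "vfB"]),
]

if __name__ == "__main__":
    for name_a, name_b, pairs in connect_adjacent(GENOMES):
        print(name_a, name_b, pairs, "crossings:", count_crossings(pairs))
```

Under these assumptions, a set of genomes with high synteny in virulence gene organization would give zero crossings for every adjacent pair.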

DISCUSSION

Virulence involves a wide spectrum of biological activities, which is reflected in the diverse VFs that pathogenic microbes employ to colonize particular niches in their hosts. A fuller investigation of VFs is highly desirable for pathogenomic research. The VFDB 2008 release attempts to meet this challenge by providing all credible information available to date and by offering more analytical tools to users. The comparative pathogenomic results indicate that most pathogens have a flexible gene pool encoding VFs. Different combinations of VFs, different organizations on microbial genomes or different expression patterns of VFs may consequently be responsible for the diverse clinical signs of infection. The VFDB 2008 release has expanded to include eight additional pathogens: Brucella, Bartonella, Campylobacter, Clostridium, Corynebacterium, Enterococcus, Chlamydia and Mycoplasma. VFDB will continue to expand by including more medically significant pathogens and will provide up-to-date information through regular updates. For the convenience of local use, the full VFDB dataset is available for batch download in several forms, including FASTA sequences and tabular (Excel) files. Furthermore, new features and analytical tools are under development, which we anticipate will make VFDB a useful pathogenomic resource for the scientific community.

REFERENCES

1.  Infectious diseases - a global challenge.

Authors:  Katja Becker; Ying Hu; Nikola Biller-Andorno
Journal:  Int J Med Microbiol       Date:  2006-01-30       Impact factor: 3.473

2.  VFDB: a reference database for bacterial virulence factors.

Authors:  Lihong Chen; Jian Yang; Jun Yu; Zhijian Yao; Lilian Sun; Yan Shen; Qi Jin
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

3.  NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins.

Authors:  Kim D Pruitt; Tatiana Tatusova; Donna R Maglott
Journal:  Nucleic Acids Res       Date:  2006-11-27       Impact factor: 16.971

4.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors:  S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal:  Nucleic Acids Res       Date:  1997-09-01       Impact factor: 16.971

5.  ShiBASE: an integrated database for comparative genomics of Shigella.

Authors:  Jian Yang; Lihong Chen; Jun Yu; Lilian Sun; Qi Jin
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

6.  CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.

Authors:  J D Thompson; D G Higgins; T J Gibson
Journal:  Nucleic Acids Res       Date:  1994-11-11       Impact factor: 16.971

7.  Haemophilus influenzae: genetic variability and natural selection to identify virulence factors.

Authors:  Janet R Gilsdorf; Carl F Marrs; Betsy Foxman
Journal:  Infect Immun       Date:  2004-05       Impact factor: 3.441

8.  Microbial genome evolution: sources of variability.

Authors:  Alex Mira; Lisa Klasson; Siv G E Andersson
Journal:  Curr Opin Microbiol       Date:  2002-10       Impact factor: 7.934

9.  Genome dynamics and diversity of Shigella species, the etiologic agents of bacillary dysentery.

Authors:  Fan Yang; Jian Yang; Xiaobing Zhang; Lihong Chen; Yan Jiang; Yongliang Yan; Xudong Tang; Jing Wang; Zhaohui Xiong; Jie Dong; Ying Xue; Yafang Zhu; Xingye Xu; Lilian Sun; Shuxia Chen; Huan Nie; Junping Peng; Jianguo Xu; Yu Wang; Zhenghong Yuan; Yumei Wen; Zhijian Yao; Yan Shen; Boqin Qiang; Yunde Hou; Jun Yu; Qi Jin
Journal:  Nucleic Acids Res       Date:  2005-11-07       Impact factor: 16.971

10.  Pathogenomics of Listeria spp.

Authors:  Torsten Hain; Som S Chatterjee; Rohit Ghai; Carsten Tobias Kuenne; André Billion; Christiane Steinweg; Eugen Domann; Uwe Kärst; Lothar Jänsch; Jürgen Wehland; Wolfgang Eisenreich; Adelbert Bacher; Biju Joseph; Jennifer Schär; Jürgen Kreft; Jochen Klumpp; Martin J Loessner; Julia Dorscht; Klaus Neuhaus; Thilo M Fuchs; Siegfried Scherer; Michel Doumith; Christine Jacquet; Paul Martin; Pascale Cossart; Christophe Rusniock; Philippe Glaser; Carmen Buchrieser; Werner Goebel; Trinad Chakraborty
Journal:  Int J Med Microbiol       Date:  2007-05-07       Impact factor: 3.473
