Siew Woh Choo, Hamed Heydari, Tze King Tan, Cheuk Chuen Siow, Ching Yew Beh, Wei Yee Wee, Naresh V R Mutha, Guat Jah Wong, Mia Yang Ang, Amir Hessam Yazdi.
Abstract
To facilitate ongoing research on Vibrio spp., the Vibrio research community needs a dedicated platform to host the fast-growing amount of genomic data and to support the analysis of these data. We present VibrioBase, a resource platform that provides all the basic features of a sequence database together with unique analysis tools that could be valuable for the Vibrio research community. VibrioBase currently houses a total of 252 Vibrio genomes, presented in a user-friendly manner to enable the analysis of these genomic data, particularly in the field of comparative genomics. Besides general data browsing features, VibrioBase offers analysis tools such as BLAST interfaces and the JBrowse genome browser. Other important features of this platform include our newly developed in-house tools: the pairwise genome comparison (PGC) tool and the pathogenomics profiling tool (PathoProT). The PGC tool is useful for the comparative analysis of two genomes, whereas PathoProT is designed for comparative pathogenomics analysis of Vibrio strains. Both tools will enable researchers with little experience in bioinformatics to obtain meaningful information from Vibrio genomes with ease. We have tested the validity and suitability of these tools and features for use in next-generation database development.
Year: 2014 PMID: 25243218 PMCID: PMC4138799 DOI: 10.1155/2014/569324
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
Figure 1: A workflow of the PGC tool. Through an input web interface on VibrioBase, users can choose two genomes of interest (reference versus query) and the parameters for comparison. Three parameters/thresholds are available: the minimum percent identity, the merge threshold, and the link threshold. The minimum percent identity parameter displays aligned genomic regions (represented by colored links) only when their sequence identity is higher than the user-defined cut-off. The merge threshold merges two links if the distance between the two regions is lower than the user-defined cut-off. The link threshold ignores links in the diagram if their widths are lower than the user-defined cut-off. Once the job is submitted to our server, the PGC pipeline parses this information and starts the genome alignment with NUCmer. A series of in-house Perl and Python scripts parse the NUCmer output and generate different text files: (1) Karyotype.txt stores information about the contigs and their colors used in Circos; (2) Links.txt stores information about the aligned genomic regions; (3) Band_labels.txt keeps the name of each contig/chromosome. The Circos.conf configuration file, which is needed for displaying the two aligned genomes with Circos, is created using the information in the above files.
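The three thresholds described in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not the PGC pipeline's actual code: the `Link` record, the `apply_thresholds` function, the forward-strand-only merging, and the choice to give a merged link the lower of the two identities are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One aligned region pair (hypothetical record, not the PGC data model)."""
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int
    identity: float  # percent identity of the alignment


def apply_thresholds(links, min_identity, merge_threshold, link_threshold):
    """Filter and merge links following the three cut-offs in the caption."""
    # 1. Minimum percent identity: keep links whose identity is higher
    #    than the user-defined cut-off.
    kept = [l for l in links if l.identity > min_identity]

    # 2. Merge threshold: merge consecutive links when the gap between
    #    them is lower than the cut-off on both genomes.
    #    Assumption: forward-strand, non-overlapping links only.
    kept.sort(key=lambda l: (l.ref_start, l.qry_start))
    merged = []
    for link in kept:
        if merged:
            prev = merged[-1]
            ref_gap = link.ref_start - prev.ref_end
            qry_gap = link.qry_start - prev.qry_end
            if 0 <= ref_gap < merge_threshold and 0 <= qry_gap < merge_threshold:
                # Assumption: report the lower identity for the merged link.
                merged[-1] = Link(prev.ref_start, link.ref_end,
                                  prev.qry_start, link.qry_end,
                                  min(prev.identity, link.identity))
                continue
        merged.append(link)

    # 3. Link threshold: ignore links whose width on the reference is
    #    lower than the cut-off.
    return [l for l in merged if l.ref_end - l.ref_start >= link_threshold]
```

In this sketch the filters run in the order the caption lists them, so short neighbouring links get a chance to merge into a wider link before the width cut-off is applied; whether PGC uses the same order is not stated in the source.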
Figure 2: Example of a pairwise genome alignment between V. vulnificus YJ016 and V. vulnificus CMCP6. (a) The Circos plot reveals differences between the two genomes. The red circle indicates where translocations possibly occurred. Black arrowheads indicate phage insertions. (b) The structure of the two putative prophages. (Scale of the Circos plot: 1 : 1 Mbps.)
Figure 3: The profiling of virulence genes of 134 V. cholerae strains using PathoProT. The heat map was generated using thresholds of 50% completeness and 50% identity. It gives an overview of the clustered strains having closely related sets of virulence genes, sorted according to the level of similarity across the strains and genes. The flagella formation gene cluster (flgB-flgC-flgD-flgF-flgG-flgH-flgI-flgK-flgL) is the largest conserved virulence gene group among the V. cholerae strains. The flg gene cluster works with other virulence gene clusters, such as the fla gene cluster (flaA-flaB-flaC-flaD-flaE) [17] and the fli gene cluster (fliH-fliJ-fliL-fliM-fliN-fliO-fliP-fliQ-fliR-fliS), to form the flagella organelles on the bacteria [18], which is one of the common characteristics among all V. cholerae strains.
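The thresholding step behind such a heat map can be sketched in Python as below. This is an illustration only, not PathoProT's implementation: the `presence_matrix` function, the shape of the `hits` dictionary, the strain names in the usage note, and the use of "greater than or equal to" for both cut-offs are assumptions. Clustering and plotting are omitted.

```python
def presence_matrix(hits, strains, genes,
                    min_identity=50.0, min_completeness=50.0):
    """Build a strain-by-gene presence/absence matrix.

    hits maps (strain, gene) to a tuple (identity %, completeness %)
    for the best match of that virulence gene in that strain
    (hypothetical input format). A missing entry means no match.
    """
    matrix = []
    for strain in strains:
        row = []
        for gene in genes:
            identity, completeness = hits.get((strain, gene), (0.0, 0.0))
            # Assumption: a gene counts as present when both values
            # reach their cut-offs.
            present = (identity >= min_identity
                       and completeness >= min_completeness)
            row.append(1 if present else 0)
        matrix.append(row)
    return matrix
```

For example, with hypothetical strains `strain_A` and `strain_B`, a hit at 95% identity and 100% completeness is scored as present, while a hit at 60% identity but 45% completeness is scored as absent under the 50%/50% defaults.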
Figure 4: A 16S-based phylogenetic tree. V. cholerae HENC 01, V. cholerae HENC 02, and V. cholerae HENC 03 are clustered into a group (red box), rather than with other V. cholerae strains (blue box). The three strains are closely related to V. parahaemolyticus. These results suggest that the three strains might have been misclassified as V. cholerae.