Yen-Yi Liu, Ji-Wei Lin, Chih-Chieh Chen.
Abstract
With the decreasing cost of next-generation sequencing, whole-genome sequence-based bacterial genome comparison is expected to become a mainstream process in public health. Extended multilocus sequence typing (MLST) methods are increasingly popular for comparing bacterial genetic relatedness in epidemiological investigations. Several extended MLST schemes based on biological signatures have been reported; among them, whole-genome MLST (wgMLST) has gradually become one of the most widely used approaches for bacterial strain typing. Beyond typing, many researchers aim to identify the genes that differ between the compared strains, because these differences might provide insights into critical bacterial functions such as virulence and pathogenicity. Hence, we designed a web service tool that couples the wgMLST-constructed tree topology with a feature selection method to create the "canonical wgMLST (cano-wgMLST) tree." The genes that differ between strains are shown at each split of the tree, directly providing information for a comparative genomic analysis of each strain pair. We demonstrated that this tool operates efficiently on two datasets consisting of 22 Campylobacter jejuni isolates and 59 Salmonella Heidelberg isolates, respectively. We deployed the tool on a dedicated web server, cano-wgMLST_BacCompare, which lets users automatically create a wgMLST tree and a canonical wgMLST tree from their uploaded bacterial genomes, for both epidemiological investigation and comparative genomic analysis. Detailed information on how to use this service is also provided. cano-wgMLST_BacCompare is available at http://baccompare.imst.nsysu.edu.tw.
Keywords: comparative genomic analysis; epidemiological investigation; feature selection; molecular typing; next generation sequencing; whole-genome multilocus sequence typing
Year: 2019 PMID: 31396192 PMCID: PMC6668299 DOI: 10.3389/fmicb.2019.01687
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
FIGURE 1. Schematic workflow of cano-wgMLST_BacCompare.
FIGURE 2. Schematic workflow for identifying highly discriminatory loci.
FIGURE 3. Features of the cano-wgMLST_BacCompare server. (A) Input page for uploading genome contig files; (B) output page of the whole-genome scheme extraction; (C) output page of the highly discriminatory loci refinement and wgMLST tree.
FIGURE 4. Dendrogram and heatmap constructed using wgMLST profiles for 22 Campylobacter jejuni isolates. (A) The wgMLST tree generated on the basis of the Occ100 scheme (1602 loci). (B) The cano-wgMLST tree generated on the basis of the Occ100_top5 scheme (42 loci). The Occ100 scheme is the set of loci that are present in all isolates, and the Occ100_top5 scheme is the subset of Occ100 formed by the union of the five most discriminatory loci for each split. Outbreak or event-related taxa are colored red. (C) The heatmap of the 42 highly discriminatory loci. Different alleles in the same column are indicated by different colors.
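The two schemes named in the Figure 4 caption can be illustrated with a minimal Python sketch. Everything concrete here is hypothetical: the isolate and locus names, the allele numbers, and in particular the `split_score` heuristic. The source only states that a feature selection method ranks loci at each split, so the simple pair-based score below is a stand-in for illustration, not the server's method.

```python
from itertools import combinations, product

# Hypothetical allele profiles: isolate -> {locus: allele number}.
# None marks a locus with no allele call in that isolate.
profiles = {
    "iso1": {"locA": 1, "locB": 1, "locC": 2, "locD": 1},
    "iso2": {"locA": 1, "locB": 1, "locC": 2, "locD": None},
    "iso3": {"locA": 2, "locB": 1, "locC": 3, "locD": 4},
    "iso4": {"locA": 2, "locB": 2, "locC": 3, "locD": 4},
}

def occ100(profiles):
    """Loci that have an allele call in every isolate (100% occurrence)."""
    loci = set().union(*(p.keys() for p in profiles.values()))
    return sorted(
        locus for locus in loci
        if all(p.get(locus) is not None for p in profiles.values())
    )

def split_score(profiles, locus, left, right):
    """Assumed heuristic: share of cross-group isolate pairs with different
    alleles minus share of within-group pairs with different alleles."""
    def frac_diff(pairs):
        pairs = list(pairs)
        if not pairs:
            return 0.0
        differing = sum(profiles[a][locus] != profiles[b][locus] for a, b in pairs)
        return differing / len(pairs)

    across = product(left, right)
    within = list(combinations(left, 2)) + list(combinations(right, 2))
    return frac_diff(across) - frac_diff(within)

def top_k_union(profiles, splits, k=5):
    """Union over all splits of the k best-scoring Occ100 loci per split."""
    core = occ100(profiles)
    selected = set()
    for left, right in splits:
        ranked = sorted(
            core,
            key=lambda locus: split_score(profiles, locus, left, right),
            reverse=True,
        )
        selected.update(ranked[:k])
    return sorted(selected)

# Each split is the pair of isolate groups on either side of a tree node.
splits = [(["iso1", "iso2"], ["iso3", "iso4"]), (["iso3"], ["iso4"])]
core = occ100(profiles)
selected = top_k_union(profiles, splits, k=1)
print(core)
print(selected)
```

Because a locus can rank highly at more than one split, the union is usually smaller than the number of splits times `k`, which is how a top-5 selection can yield 42 loci rather than a multiple of five.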
List of the 42 highly discriminatory loci selected from the 22 Campylobacter jejuni isolates.
| # | Locus | Splits^a | Gene name / annotation |
|---|---|---|---|
| 1 | SAL0000001 | 8,9,10 | Putative glycosyltransferase EpsH |
| 2 | SAL0000002 | 2,7 | Potassium-transporting ATPase B chain |
| 3 | SAL0000007 | 10 | Hypothetical protein |
| 4 | SAL0000258 | 1 | Bifunctional purine biosynthesis protein PurH |
| 5 | SAL0000726 | 3 | Hypothetical protein |
| 6 | SAL0000810 | 4 | Metalloprotease MmpA |
| 7 | SAL0000920 | 3 | Cytolethal distending toxin subunit A precursor |
| 8 | SAL0001454 | 1 | Outer membrane protein class 4 precursor |
| 9 | SAL0001455 | 4 | Riboflavin synthase |
| 10 | SAL0001471 | 3 | NADP-dependent 3-hydroxy acid dehydrogenase YdfG |
| 11 | SAL0001477 | 1 | Putative peptidyl-prolyl |
| 12 | SAL0001479 | 1 | Hypothetical protein |
| 13 | SAL0001481 | 1 | Sensor protein BasS |
| 14 | SAL0001487 | 4 | Hypothetical protein |
| 15 | SAL0001491 | 2 | Hypothetical protein |
| 16 | SAL0001494 | 2 | Hypothetical protein |
| 17 | SAL0001508 | 8 | GDP/UDP-N,N’-diacetylbacillosamine 2-epimerase (hydrolyzing) |
| 18 | SAL0001515 | 3 | Uracil-DNA glycosylase |
| 19 | SAL0001524 | 6,8 | Heme/hemopexin transporter protein HuxB precursor |
| 20 | SAL0001525 | 4 | Hypothetical protein |
| 21 | SAL0001527 | 11 | Putative type I restriction enzymeP M protein |
| 22 | SAL0001529 | 7 | DNA polymerase III subunit tau |
| 23 | SAL0001530 | 2,9 | Hypothetical protein |
| 24 | SAL0001531 | 5 | tRNA uridine 5-carboxymethylaminomethyl modification enzyme MnmG |
| 25 | SAL0001534 | 11 | Hypothetical protein |
| 26 | SAL0001538 | 11 | Flagellar hook protein FlgE |
| 27 | SAL0001539 | 9 | Ribonuclease J 1 |
| 28 | SAL0001540 | 6 | Putative tRNA-dihydrouridine synthase |
| 29 | SAL0001542 | 10,11 | Hypothetical protein |
| 30 | SAL0001543 | 6,7 | N,N’-diacetyllegionaminic acid synthase |
| 31 | SAL0001544 | 9 | Potassium-transporting ATPase A chain |
| 32 | SAL0001548 | 6 | Heat-inducible transcription repressor HrcA |
| 33 | SAL0001557 | 11 | UvrABC system protein B |
| 34 | SAL0001560 | 5 | DNA ligase |
| 35 | SAL0001565 | 4 | Flagellar protein FlaG |
| 36 | SAL0001572 | 5 | DNA topoisomerase 1 |
| 37 | SAL0001573 | 5 | Ferrous iron permease EfeU precursor |
| 38 | SAL0001578 | 8 | Adenosylmethionine-8-amino-7-oxononanoate aminotransferase |
| 39 | SAL0001595 | 5 | Hypothetical protein |
| 40 | SAL0001599 | 7,10 | Aminoglycoside 3-N-acetyltransferase |
| 41 | SAL0001600 | 2,6,7,8,9,10 | Hypothetical protein |
| 42 | SAL0001601 | 3 | Alpha-ketoglutarate permease |
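
The table lists splits per locus, whereas a comparison at a given tree node needs the loci per split. A short Python sketch of that inversion follows. The locus-to-split pairs are transcribed from the Locus and Splits columns above, with each ID shortened to its numeric suffix; the `ROWS` layout and the variable names are illustrative choices, not part of the tool.

```python
from collections import defaultdict

# "suffix:splits" pairs transcribed from the Locus and Splits columns,
# e.g. "1:8,9,10" stands for SAL0000001 with splits 8, 9 and 10.
ROWS = """
1:8,9,10 2:2,7 7:10 258:1 726:3 810:4 920:3 1454:1 1455:4 1471:3
1477:1 1479:1 1481:1 1487:4 1491:2 1494:2 1508:8 1515:3 1524:6,8 1525:4
1527:11 1529:7 1530:2,9 1531:5 1534:11 1538:11 1539:9 1540:6 1542:10,11
1543:6,7 1544:9 1548:6 1557:11 1560:5 1565:4 1572:5 1573:5 1578:8 1595:5
1599:7,10 1600:2,6,7,8,9,10 1601:3
"""

# Invert the mapping: split number -> list of locus IDs selected at that split.
loci_by_split = defaultdict(list)
for entry in ROWS.split():
    suffix, split_list = entry.split(":")
    locus = f"SAL{int(suffix):07d}"  # rebuild the full ID, e.g. SAL0001454
    for split in split_list.split(","):
        loci_by_split[int(split)].append(locus)

for split in sorted(loci_by_split):
    print(split, ", ".join(loci_by_split[split]))
```

Grouped this way, the table gives 11 splits with five loci each, 55 assignments in total over 42 distinct loci, which matches the description of Occ100_top5 as the union of the five most discriminatory loci per split.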