David M Aanensen, Brian G Spratt.
Abstract
The unambiguous characterization of strains of a pathogen is crucial for addressing questions relating to its epidemiology, population and evolutionary biology. Multilocus sequence typing (MLST), which defines strains from the sequences at seven house-keeping loci, has become the method of choice for molecular typing of many bacterial and fungal pathogens (and non-pathogens), and MLST schemes and strain databases are available for a growing number of prokaryotic and eukaryotic organisms. Sequence data are ideal for strain characterization because they are unambiguous, so strains can readily be compared between laboratories via the Internet. Laboratories undertaking MLST can quickly progress from sequencing the seven gene fragments to characterizing their strains and relating them to those submitted by others and to the population as a whole. We provide the gateway to a number of MLST schemes, each of which contains a set of tools for the initial characterization of strains, and methods for relating query strains to other strains of the species, including clustering based on differences in allelic profiles, phylogenetic trees based on concatenated sequences, and a recently developed method (eBURST) for identifying clonal complexes within a species and displaying the overall structure of the population. This network of MLST websites is available at http://www.mlst.net.
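The comparison of strains by differences in their allelic profiles can be illustrated with a minimal Python sketch. It assumes that a profile is a tuple holding one allele number per locus. The helper names (`allelic_distance`, `closest_profiles`) and the example profiles are hypothetical and are not taken from mlst.net.

```python
from typing import Dict, List, Tuple

Profile = Tuple[int, ...]  # one allele number per locus (seven in a typical MLST scheme)


def allelic_distance(a: Profile, b: Profile) -> int:
    """Count the loci at which two allelic profiles carry different alleles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    return sum(1 for x, y in zip(a, b) if x != y)


def closest_profiles(query: Profile, database: Dict[str, Profile]) -> List[Tuple[str, int]]:
    """Return (strain name, distance) pairs ordered from most to least similar."""
    scored = [(name, allelic_distance(query, profile)) for name, profile in database.items()]
    # Sort by distance first, then by name so that ties are ordered reproducibly.
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Made-up profiles, for illustration only.
example_db = {
    "strain_A": (1, 1, 1, 1, 1, 1, 1),
    "strain_B": (1, 1, 1, 1, 1, 1, 2),
    "strain_C": (4, 5, 6, 7, 8, 9, 10),
}
query = (1, 1, 1, 1, 1, 2, 2)
ranking = closest_profiles(query, example_db)
```

Counting mismatched loci treats every allele change as a single event regardless of how many nucleotides differ, which is one simple way to build the pairwise distances that a cluster analysis would then use.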
Year: 2005 PMID: 15980573 PMCID: PMC1160176 DOI: 10.1093/nar/gki415
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Schematic representation of a typical MLST workflow. (A) Sequences can be entered locus by locus (single locus query), or as all seven loci from a single strain (multiple locus query), or (D) by uploading an XML file with a set of strains and their sequences (batch strain query). (B) For the single and multiple locus queries, sequences that are not in the database are identified, and can be compared with the sequences of all known alleles using Jalview (7), or (C) the nucleotide differences compared with the most similar alleles can be displayed. The batch strain query (D) returns a strain table (E), which shows the allele number for each locus if known, and the allelic profile if all seven alleles and the ST are known. Strains that have the most similar allelic profiles to query strains are displayed as a table or by cluster analysis (F), and further information about them can be obtained. For the pneumococcal example used here, the query strains can be compared with the reference set of pneumococcal strains and closely related streptococcal strains, to establish whether or not they are pneumococci, using the concatenated sequences to construct a neighbor-joining tree (H). The relationship of unknown strains to the whole population can also be investigated using eBURST (G).
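The query steps in this workflow (matching each sequence to a known allele, listing nucleotide differences from a similar allele, and reading off the ST once every allele is known) can be sketched in Python as follows. This is an illustration under stated assumptions, not the mlst.net implementation: the locus names, sequences, allele numbers and ST numbers are invented, the scheme is shortened to three loci for brevity, and sequences are assumed to be trimmed to the same length.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical lookup tables: locus -> {sequence: allele number}, and profile -> ST.
KNOWN_ALLELES: Dict[str, Dict[str, int]] = {
    "locusA": {"ACGT": 1, "ACGA": 2},
    "locusB": {"GGCC": 1},
    "locusC": {"TTAA": 1},
}
KNOWN_PROFILES: Dict[Tuple[int, ...], int] = {(1, 1, 1): 10, (2, 1, 1): 11}
LOCI: Tuple[str, ...] = ("locusA", "locusB", "locusC")


def assign_alleles(sequences: Dict[str, str]) -> Dict[str, Optional[int]]:
    """Return the allele number for each locus, or None if the sequence is not in the table."""
    return {
        locus: KNOWN_ALLELES.get(locus, {}).get(seq.upper())
        for locus, seq in sequences.items()
    }


def assign_st(alleles: Dict[str, Optional[int]], loci: Sequence[str] = LOCI) -> Optional[int]:
    """Return the ST only when every locus has a known allele and the profile is known."""
    if any(alleles.get(locus) is None for locus in loci):
        return None
    profile = tuple(alleles[locus] for locus in loci)
    return KNOWN_PROFILES.get(profile)


def nucleotide_differences(query: str, reference: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, query base, reference base) for equal-length sequences."""
    if len(query) != len(reference):
        raise ValueError("sequences must have the same length")
    return [(i + 1, q, r) for i, (q, r) in enumerate(zip(query.upper(), reference.upper())) if q != r]


known_strain = assign_alleles({"locusA": "acga", "locusB": "GGCC", "locusC": "TTAA"})
novel_strain = assign_alleles({"locusA": "ACGG", "locusB": "GGCC", "locusC": "TTAA"})
```

In this sketch a strain with an unrecognized sequence at any locus yields no ST, which mirrors the caption's point that the allelic profile is reported only when all alleles and the ST are known.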
Figure 2. The XML format for batch querying multiple strains using mlst.net.
Figure 3. A population snapshot of the entire S. pneumoniae MLST database showing all major and minor clonal complexes viewed using eBURST.
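The idea of partitioning a database into clonal complexes can be illustrated with a simplified Python sketch. It assumes the commonly used grouping rule in which each ST in a group differs at exactly one of the seven loci from at least one other ST in that group (a single-locus variant link). It does not reproduce eBURST itself: founder prediction, bootstrap support and the graphical display are omitted. The function name `group_by_slv` and the example profiles are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def group_by_slv(profiles: Dict[int, Tuple[int, ...]]) -> List[List[int]]:
    """Group STs that are connected through chains of single-locus variants."""
    parent = {st: st for st in profiles}

    def find(st: int) -> int:
        # Follow parent links to the group representative, compressing the path as we go.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for a, b in combinations(profiles, 2):
        mismatches = sum(x != y for x, y in zip(profiles[a], profiles[b]))
        if mismatches == 1:  # a and b are single-locus variants of each other
            parent[find(a)] = find(b)

    groups: Dict[int, List[int]] = {}
    for st in profiles:
        groups.setdefault(find(st), []).append(st)
    # Largest groups first; singletons (STs with no linked partner) come last.
    return sorted((sorted(members) for members in groups.values()), key=lambda g: (-len(g), g[0]))


# Made-up ST numbers and profiles, for illustration only.
example_profiles = {
    1: (1, 1, 1, 1, 1, 1, 1),
    2: (1, 1, 1, 1, 1, 1, 2),
    3: (1, 1, 1, 1, 1, 3, 2),
    4: (9, 9, 9, 9, 9, 9, 9),
    5: (9, 9, 9, 9, 9, 9, 8),
    6: (5, 6, 7, 8, 5, 6, 7),
}
complexes = group_by_slv(example_profiles)
```

Here ST 3 joins the first group through ST 2 even though it differs from ST 1 at two loci, which shows how chains of single-locus variants can link more distant profiles into one complex. The pairwise comparison is quadratic in the number of STs, which is acceptable for a sketch but would need indexing for a large database.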