Literature DB >> 21483493

Core gene set as the basis of multilocus sequence analysis of the subclass Actinobacteridae.

Toïdi Adékambi1, Ray W Butler, Finnian Hanrahan, Arthur L Delcher, Michel Drancourt, Thomas M Shinnick.   

Abstract

Comparative genomic sequencing is shedding new light on bacterial identification, taxonomy and phylogeny. An in silico assessment of a core gene set necessary for cellular functioning was made to determine a consensus set of genes that would be useful for the identification, taxonomy and phylogeny of the species belonging to the subclass Actinobacteridae which contained two orders Actinomycetales and Bifidobacteriales. The subclass Actinobacteridae comprised about 85% of the actinobacteria families. The following recommended criteria were used to establish a comprehensive gene set; the gene should (i) be long enough to contain phylogenetically useful information, (ii) not be subject to horizontal gene transfer, (iii) be a single copy (iv) have at least two regions sufficiently conserved that allow the design of amplification and sequencing primers and (v) predict whole-genome relationships. We applied these constraints to 50 different Actinobacteridae genomes and made 1,224 pairwise comparisons of the genome conserved regions and gene fragments obtained by using Sequence VARiability Analysis Program (SVARAP), which allow designing the primers. Following a comparative statistical modeling phase, 3 gene fragments were selected, ychF, rpoB, and secY with R2>0.85. Selected sets of broad range primers were tested from the 3 gene fragments and were demonstrated to be useful for amplification and sequencing of 25 species belonging to 9 genera of Actinobacteridae. The intraspecies similarities were 96.3-100% for ychF, 97.8-100% for rpoB and 96.9-100% for secY among 73 strains belonging to 15 species of the subclass Actinobacteridae compare to 99.4-100% for 16S rRNA. The phylogenetic topology obtained from the combined datasets ychF+rpoB+secY was globally similar to that inferred from the 16S rRNA but with higher confidence. It was concluded that multi-locus sequence analysis using core gene set might represent the first consensus and valid approach for investigating the bacterial identification, phylogeny and taxonomy.

Entities:  

Mesh:

Substances:

Year:  2011        PMID: 21483493      PMCID: PMC3069002          DOI: 10.1371/journal.pone.0014792

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Beyond the standard DNA-DNA hybridization value, the ad-hoc committee concerned with the revaluation of taxonomy of the species definition in bacteriology proposed using a small set of five housekeeping genes for quantitative evaluation of taxonomic relatedness to achieve an adequately informative level of phylogeny data and issued a call for the detection of such genes [1]. Previous studies have confirmed that sequences of housekeeping genes accurately predict genome relatedness and can be used for species-level identification [2], [3]. Several housekeeping genes have been used for bacterial phylogeny and identification [4]–[6] but no consensus has been made regarding an optimal selection of genes. Also, some previously recommended genes were absent in some species or isolates of the same genus or could not be amplified [5], [7]. For example, 695 protein coding sequences were present in Mycobacterium marinum but not in Mycobacterium ulcerans or Mycobacterium tuberculosis [8]. Admittedly, these studies were empirical and did not take advantage of currently available genome sequences. Comparative genome sequence analysis may also aid in the reasonable selection of candidate genes with defined characteristics. It has been reported that suitable genes should fulfill the following conditions; (i) the genes must be ubiquitous with orthologous sequences in all cellular life as is the case of 16S rRNA gene; (ii) the genes must be present in single copy among genomes, without close paralogues that could confuse analysis; (iii) the individual genes must be long enough (>900 bases) to contain sufficient information; (iv) the genes should not be prone to horizontal gene transfer (HGT) or recombination; (v) closely linked genes should also be avoided; (vi) the genes should contain at least two highly conserved regions to allow the design of appropriate amplification and sequencing primers and (vii) the sequence must predict whole-genome relationships with acceptable precision [3], [9]–[11]. In this study, we applied these conditions to the analysis of the subclass Actinobacteridae using genome computational methods to deduce a small set of useful genes [12]. The development of a universal approach for housekeeping genes for classification and identification (as is the case of 16S rRNA gene) may present difficulties because of the saturation of the third codon position over a long evolutionary timescale [13]. So, for the subclass Actinobacteridae, we applied these conditions to the 63 core genes originally suggested by Koonin [14] who compared 100 sequenced genomes belonging to the all cellular life (Bacteria, Archaea and Eukaryotes). A common set of genes was previously defined as the smallest possible group of genes that would be sufficient to sustain a functioning cellular life form under the most favorable conditions imaginable, that is, in the presence of a full complement of essential nutrients and in the absence of environmental stress [15], [16]. For many important pathogens, the genes common to all strains within a species [known as the core genome] are a minority component of the entire gene pool for that species (the pan-genome) [17]. The subclass Actinobacteridae contained two orders Actinomycetales and Bifidobacteriales and comprised about 85% of the actinobacteria families [18]. They are Gram-positive bacteria with a high G+C content in their DNA ranging from 51% in some Corynebacterium species to more than 70% in Streptomyces and Frankia species; 43 families and about 200 genera are recognized. This lineage comprises a wide range of morphologically diverse organisms with different phenotypic characteristics; from coccoid (e. g. Micrococcus) or rod-coccoid (e.g. Arthrobacter) to fragmenting hyphal forms (e.g. Nocardia spp.) or permanent and highly differentiated branched mycelium (e.g. Streptomyces) [18]–[20]. The actinobacteria have a common ancestry [18], [21], [22] and share conserved indels in protein sequences [23] and 23S rRNA [24] that are characteristics of this phylum. A recent comprehensive analysis of four actinobacterial genomes identified 233 proteins with unknown functions that were unique for this cluster of genomes and do not have homologues in any other currently available bacterial genome [25]. Some genera of the subclass Actinobacteridae are under taxonomic reevaluation [26]–[33]. Unique biochemical characteristics shared by all subclass members have not been demonstrated. Many members of the subclass stain acid-alcohol-fast but may not represent known families [34]. Some Mycobacteriology reference laboratories also identify Nocardia and Rhodococcus species. Therefore, it is noteworthy to find consensus genes, particularly those with robust, broad range primers useful for the taxonomic, phylogenetic, and identification analysis of this subclass. Therefore, we systemically analyzed with computational tools, genes belonging to the so-called core gene set defined based on genome examination of Actinobacteridae in order to (i) determine a standard set of genes for use in all phylogenetic levels and (ii) complement and extend the utility of the 16S rRNA gene in the identification and classification of bacteria. Among the core set genes, we attempted to detect genes useful for the first line identification and taxonomic relationship analysis of the subclass Actinobacteridae. In this study, we proposed a core gene set as consensus genes for multi-locus sequence analysis (MLSA) for the subclass Actinobacteridae.

Results

Genes selection

For the 63 ubiquitous genes identified by Koonin [14], we used the algorithm described in . More than half were eliminated because they had sequence length less than 900 bp (312–840 bp). Most of these genes belong to the family of ribosomal proteins (48%; 30/63 genes). Also, the aminoacyl-transfer-RNA synthetase genes (24%; 15/63 genes) presented evidence of HGT that limited their usefulness for bacterial taxonomy and identification [35]–[37]. Furthermore, it has been shown that ribosomal proteins and tRNA synthetases are not appropriate for use in phylogenetic analyses [38] due to their small size and HGT. Therefore, these genes were removed from the study. BLAST searches were used to eliminate candidates that had multiple copies or close paralogues in the genome ( ). After application of these criteria, only 9 candidate genes (15%) remained for further analysis of the nucleotides with sequence alignments ( ). These candidate genes were categorized in three classes; transcription (rpoB, rpoC), translation (infB, ychF, ksgA, qr17) and secretion (secY, ffh, ftsY) pathways. Among the candidate genes, 2 pairs were linked (rpoB-rpoC and ftsY-ffh).
Figure 1

Strategy to establish a comprehensive gene set useful for MLSA following the literature recommended criteria.

Primer selection

After applying Sequence VARiability Analysis Program (SVARAP) to the 9 candidate genes [39], [40], two genes, ksgA and qr17, contained only a single well-conserved region to design a low-degeneracy primer [data not shown]. Primers were designed for 6 functionally diverse genes in the conserved regions flanking the hypervariable regions. The mean variability of the highly conserved regions where the primers were selected was <15% ( ). For several of the selected genes (rpoB, rpoC, infB), there were numerous potential primer binding sites conserved across the studied genera. But our constraint was to choose a hypervariable region flanked by conserved regions which potentially had enough information to distinguish different Actinobacteridae species and strains. Also, an accommodating fragment size from 600–900 bp was needed in other to be sequenced directly in both directions with the PCR primers.
Figure 2

Mean variability for successive windows of 50 nucleotide positions using SVARAP.

(A) ychF gene sequence,(B) rpoB gene sequence (C), secY gene sequence (D), infB gene sequence and (E) rpoC gene sequence The arrows correspond to the forward and reverse amplification and sequencing primers.

Mean variability for successive windows of 50 nucleotide positions using SVARAP.

(A) ychF gene sequence,(B) rpoB gene sequence (C), secY gene sequence (D), infB gene sequence and (E) rpoC gene sequence The arrows correspond to the forward and reverse amplification and sequencing primers.

Primers tested

The primers were tested with 26 species belonging to 10 Actinobacteria genera. Four primer combinations of the ffh and two primer combinations of ftsY gene produced specific, non-specific and negative amplification among all tested species. Therefore, these genes were not considered further. One set of primers for each of the 5 genes (rpoB, rpoC infB, ychF and secY) was identified with the criteria enumerated in materials and methods ( ). There were 5 primer pairs which bind across all these bacterial taxa. For 4 of the 5 genes (exception rpoC gene), their fragments could be sequenced directly in both directions with the PCR primers by using the ABI prism 3130xl instrument. Among the 50 genomes understudy, the sizes ranged from 700–718 bp for ychF (∼65% coverage), 745–769 bp for rpoB (∼22% coverage), 766–793 bp for secY (∼60% coverage), and 733–745 bp for infB (∼27% coverage). Ultimately, we identified (see prediction of whole genome conserved region analysis) one combination of primers for each of the 3 genes (ychF, rpoB and secY) that produced a single amplicon for each of the phylogenetically diverse Actinobacteridae species screened ( ). Their size (according to M. tuberculosis numbering Genbank accession number CP000611) and annealing temperature were summarized in . The secY primers cannot amplify Atopobium rimae, an actinobacteria that is not belonging to the subclass Actinobacteridae ( ). In addition, in-silico analysis on the 12 species that belonged to the phylum Actinobacteria but not to the subclass Actinobacteridae showed in some cases restricted binding sites for either the secY or ychF primers but not for the rpoB primers. Therefore, the analysis in the present work is restricted on the subclass Actinobacteridae. The intraspecies similarities were 96.3–100% for ychF, 97.8–100% for rpoB and 96.9–100% for secY among 73 strains belonging to 15 species of Actinobacteridae compare to 99.4–100% for 16S rRNA ( ). The list of the primers specific for the common set of genes described above can be used for the amplification and sequencing of these same genes in other genera in the subclass Actinobacteridae.
Table 1

Testing amplification with broad range primers on some representative genera of the subclass Actinobacteridae. (+) PCR positif.

16S rRNA ychF rpoB secY
Nocardia asteriodes ++++
Nocardia brasiliense ++++
Nocardia otitidiscaviarum ++++
Actinomadura madurae ++++
Actinomadura pelletieri ++++
Tsukamurella aurantiacutis ++++
Tsukamurella paurometabolum ++++
Corynebacterium jeikeium ++++
Corynebacterium xerosis ++++
Corynebacterium pseudotuberculosis ++++
Gordonia bronchialis ++++
Gordonia sputi ++++
Gordonia terrae ++++
Rhodococcus equi ++++
Rhodococcus erythropolis ++++
Streptomyces somaliensis ++++
Mycobacterium avium ++++
Mycobacterium kansasii ++++
Mycobacterium fortuitum ++++
Mycobacterium lentiflavum ++++
Mycobacterium goodii ++++
Mycobacterium salmoniphilum ++++
Mycobacterium simiae ++++
Mycobacterium tuberculosis ++++
Bifidobacterium longum ++++
Atopobium rimae +++
Table 2

Broad range primer sequences for amplification and sequencing of the species of the subclass Actinobacteridae.

Genes nameSequence of the primersExcepted sequence size (M. tuberculosis numbering)Hybridization Temperature
rpoB ActRpoB2473F: GGHAAGGTSACSCCNAAGGG ActRpoB3303R: GAANCGCTGDCCRCCGAACTG 754 bp60°C
secY ActSECY238F: GGBRTBATGCCSTACATYAC ActSECY1109R: AANCCRCCRWACTKCTTCAT787 bp52°C
ychF ActYCHF208F: TTYGTBGAYATCGCVGG ActYCHF983R: ACGAYYTCVGCYTTGATGAA703 bp52°C

R = A or G; Y = C or T; K = G or T; S = G or C; W = A or T; B = C, G, or T; D = A, G, or T; H = A, C, or T; V = A, C, or G; N = A, C, G, or T.

Table 3

Core gene set intraspecies similarities among species of the subclass Actinobacteridae.

SpeciesNumber of isolates16S rRNA similarity (%) ychF similarity (%) rpoB similarity (%) secY similarity (%)
Mycobacterium abscessus sensu stricto(a) 20 100 99.6–100 99.6–100 99.7–100
Mycobacterium avium 310099.1–99.999.799.9–100
Mycobacterium leprae 2100100100100
Mycobacterium bovis 3100100100100
Mycobacterium tuberculosis 20100100100100
Mycobacterium monacense 310099.0–10010099.6–100
Rhodococcus erythropolis 210096.999.396.3
Corynebacterium glucuronolyticum 210010099.7100
Corynebacterium matruchotii 299.499.998.899.2
Streptomyces roseosporus 210010099.9100
Propionibacterium acnes 299.9–10010099.9100
Bifidobacterium adolescentis 298.7–99.997.699.299.2
Bifidobacterium animalis 499.8–100100100100
Bifidobacterium longum 499.9–10098.3–10097.8–99.998.1–100
Tropheryma whipplei 210099.899.7100

(a) The intraspecies similarities were performed experimentally on 20 clinical isolates of Mycobacterium abscessus.

R = A or G; Y = C or T; K = G or T; S = G or C; W = A or T; B = C, G, or T; D = A, G, or T; H = A, C, or T; V = A, C, or G; N = A, C, G, or T. (a) The intraspecies similarities were performed experimentally on 20 clinical isolates of Mycobacterium abscessus.

Candidate gene fragment sequences and predicting whole genome conserved region relationships

Because MLSA is based on the sequences of housekeeping gene fragments, it remained to be determined which of the candidate gene fragment sequences best satisfied the final criterion by predicting whole genome conserved region relationships. The scatter plot of the gene fragment similarities and the whole genome conserved region similarities (n = 1,224 pairwise comparisons) are shown in . A least-squares quadratic regression was computed for each gene, giving the following formulae for the whole genome conserved region relatedness (GCR) for two actinobacterial strains as a function of gene-fragment sequence similarity (SS):
Figure 3

Scatter plot of the relationship between genome conserved similarity and gene sequence fragment similarity.

(A) ychF fragment, (B) rpoB fragment, (C) secY fragment, (D), infB fragment and (E) ychF+rpoB+secY fragment.

Scatter plot of the relationship between genome conserved similarity and gene sequence fragment similarity.

(A) ychF fragment, (B) rpoB fragment, (C) secY fragment, (D), infB fragment and (E) ychF+rpoB+secY fragment. The best three gene fragments were ychF, rpoB and secY, all with R2 above 0.85 while infB had a somewhat lower value. Interestingly, combining of the three top gene fragments produced a prediction model with a high R2 value of 0.919 and formula: GCR = 0.0212SS2 ychFrpoBsecY−2.873SSychFrpoBsecY +172.14 ( ). Among the 50 genomes studied, the sizes ranged from 2,220–2,274 bp for ychF+rpoB+secY.

Phylogenetic analysis

In addition to the 50 species understudy, 83 species of the subclass Actinobacteridae that the genome sequences are in the pipeline are included in the phylogenetic analysis. Species (n = 12) that belonged to the phylum Actinobacteria but not to the subclass Actinobacteridae are used as outgroup. The phylogenetic trees constructed with different fragment of genes were shown to have moderate heterogeneity in term of topological histories (data not shown). However, the concatenation of these 3 gene fragments delivered high confidence in phylogenetic tree as shown in the by using neighour joining method. The percentage of bootstrap values greater than 65% at each node was higher for ychF+rpoB+secY (81%) compare to 16S rRNA (68%) (p = 0.01 by chi-square test). A bootstrap value >67% in the concatenated tree supported the fork separating Mycobacterium Corynebacterium, Streptomyces, and Bifidobacterium genera from the other recognized genera ( ). The phylogenetic organizations obtained from combined datasets were globally similar to that inferred from the 16S rRNA ( ) but with higher confidence indicating that ychF+rpoB+secY appeared to be useful tool in addition to the 16S rRNA gene for the investigation of evolutionary relationships among the species of the subclass Actinobacteridae.
Figure 4

Phylogenetic tree of actinobacterial species using the neighbour-joining method with Kimura's two parameter distance correction.

(A) ychF+rpoB+secY fragment, (B) 16S rRNA gene. The support of each branch, as determined from 1000 bootstrap samples, is indicated by the value at each node (as a percentage). Bar represent difference in nucleotide sequences.

Phylogenetic tree of actinobacterial species using the neighbour-joining method with Kimura's two parameter distance correction.

(A) ychF+rpoB+secY fragment, (B) 16S rRNA gene. The support of each branch, as determined from 1000 bootstrap samples, is indicated by the value at each node (as a percentage). Bar represent difference in nucleotide sequences.

Discussion

Analysis of the 16S rRNA gene sequences has served as the standard to assess Actinobacteria diversity in nature and to classify Actinobacteria species [21]. Recently, new specific 16S rRNA gene primers were designed and an Actinobacteria Amplification Resource (http://microbe2.ncl.ac.uk/MMB/AAR.htm) site was constructed to provide a visual guide to aid in the amplification of actinobacterial 16S rRNA gene in marine and terrestrial environment [41]. The appeal of these molecules lies in their ubiquitous distribution and relatively slow rate of evolution, which enables comparison among divergent Actinobacteria species. Several authors have noted shortcomings in using 16S rRNA gene sequences for assessing Actinobacteria diversity and for phylogenetic analysis. The lack of informative characters and a slow evolution rate complicates both the differentiation of closely related strains of bacteria as well as the resolution of an evolutionary tree [42]. 16S rRNA is a multiple copy gene and may be present in 1–6 copies in the subclass Actinobacteridae (www.genome.jp) with 99.4–100% similarity (Table 3). Some of these multiple copies of 16S rRNA gene exhibit different sequences [4], [43], [44]. In this situation, direct sequencing is not suitable for isolate identification because of discrepant results are produced by the different sequences. In the vast majority of bacterial genomes, the divergence between 16S rRNA gene sequence copies is <1% [45]. Also, the influence of intragenomic heterogeneity displayed by the 16S rRNA gene on bacterial phylogeny was assessed. Furthermore, despite the perceived reliability of the 16S rRNA gene sequence as a phylogenetic marker, it is known that any single measure of sequence similarity is subject both to simple stochastic variation and to the influence of recombination or HGT [35], [46]. Also, examples of HGT of the 16S rRNA in nature have been reported based on patterns of the 16S rRNA gene sequence heterogeneity, but these are limited to relatively closely related organisms, including certain Actinomycetes [44], [47], [48]. MLSA has been proposed as an alternative to 16S rRNA for some genera of the subclass Actinobacteridae such as Mycobacterium [4]–[6], Bifidobacterium [49], Microbacterium [50], [51], and Streptomyces [52]. However, there was no consensus regarding the choice of genes to be used for MLSA amongst these genera and choices have remained empirical. The increasing genomic database was examined to define a rational post-genomic study of a common set of genes that may be useful for the MLSA classification, phylogeny and identification of the species belonging to the subclass Actinobacteridae. The present study represented a first attempt in the development of a systematic measured approach for proposal of a core set of genes for MLSA useful for the subclass Actinobacteridae. The main objective was the selection of housekeeping genes as candidates that belong to the common set of genes and fulfilled the criteria noted before. The availability of 50 complete genomes of the subclass Actinobacteridae provided the stimulus for selecting the candidate loci. The hypothesis that gene fragment sequence can predict genome conserved region accurately is supported strongly in this study. The 3 gene fragments (ychF, rpoB and secY) selected appeared to be stable and evolved slowly. The phylogenetic tree derived from the concatenation of these 3 fragments is more robust than that derived from the 16S rRNA ( ). The 3 loci selected were found to be suitable for MLSA as they amplified and could be sequenced in the species of the subclass Actinobacteridae studied. Also, certain loci were linked as in the case of rpoB and rpoC (rpoB always preceding rpoC). According to the large rpoB database [53] and because the rpoC amplicon (>1300 bp) was sequenced totally only with additional sequence primers ( ), we proposed to incorporate rpoB rather than rpoC in the MLSA. To our knowledge, these 3 loci have not been incorporated in the same MLSA studies and this represents the first time that ychF gene has been proposed for bacterial taxonomy, phylogeny and identification. Furthermore, the 3 fragments distinguish more the strains of a single species than the 16S rRNA (Table 3). Although there are no validated cut-off values to delineate species of the subclass Actinobacteridae, we observed that similarity values of <96.3% for ychF, <97.8% for rpoB and <96.9% for secY effectively delineated the currently recognized species in the subclass Actinobacteridae. Similar threshold (97.7%) has been suggested for the rpoB by analysing the complete sequence [53]. However, M. marinum and M. ulcerans; Rhodococcus jostii and Rhodococcus opacus; Corynebacterium pseudogenitalium and Corynebacterium tuberculostearium; Streptomyces roseosporus and Streptomyces griseus; Streptomyces coelicolor and Streptomyces lividans; Bifidobacterium catenulatum and Bifidobacterium pseudocatenulatum seem to belong to the same species. Contrary to the slow evolution rate of the 16S rRNA gene, these 3 genes belonging to the essential gene set tend to be highly evolutionarily conserved, in terms of both the rate of sequence evolution [54] and particularly, in terms of wide phyletic spread [54]–[56]. It was suggested by Zeigler [3] that less than five genes might be sufficient to equal or surpass the power of DNA-DNA hybridizations and could predict overall phylogenetic relatedness with high precision. Recently, it was described by Edwards [57] that it was not so much the multiplicity of genes that was deemed responsible for the success of combining information via concatenation, but rather the multiplicity of characters or sites. We also demonstrated that rpoB sequence similarity was significantly correlated with DNA-DNA hybridization among two bacterial species [58] and average nucleotide identity [59]. Despite the concerns with MLSA due to the difficulties in choosing genes to be compared, the information derived from the common set of genes presented here can complement and extend the utility of the 16S rRNA sequences for resolving issues pertaining to the genetics and evolution of bacterial genomes. Based on the consensus view introduced here, this common set of genes described may serve as a convenient starting point in the logical development of MLSA for other bacterial species and may be useful in construction of a supertree [60]. In this study, we systematically selected gene fragments of ychF, rpoB and secY as suitable representative candidates to achieve the goal of creating and generating a robust and highly discriminatory supertree which infers phylogeny among members of Actinobacteria species. Moreover, these 3 fragments of genes could potentially reflect the evolution of the whole genome because they are spaced well apart on the genome and their tree heterogeneity is moderate. Using a common set of genes for MLSA would represent an easier way to standardize the identification and phylogenetical relationships of known and unknown species across the subclass Actinobacteridae. Admittedly, further studies will be necessary needed to assess the intraspecies and the interspecies variability of isolates and reference strains in the different genera to improve some guidelines for the use of the common set of genes [40]. MLSA would also favour the creation of sequence databases for comparative purposes and would allow taxonomists to compare new taxa at a remote location via the internet. The exchange of reference strains between laboratories could be reduced and this approach could aid the reorganization of the species of the subclass Actinobacteridae, which would be important for misclassified species and unnamed taxa. Finally, the approach described above may have universal application but should be tested with other bacterial subclass. The primer sets will likely have to be adapted for each subclass or bacterial group. In summary a set of broad range primers were developed that targeted housekeeping genes distributed in the subclass Actinobacteridae. From the data presented, we concluded that MLSA using the common set of genes ychF, rpoB, and secY represented a valid approach for investigating the identification, phylogeny and taxonomy of Actinobacteria genera and may represent an alternative approach to DNA-DNA hybridization.

Materials and Methods

Strains and an in silico core gene set databases

Genomes of 50 different species of the subclass Actinobacteridae including multiple strains of the same species have been sequenced to completion () [61]. An in silico core gene set database was constructed from these genomes based on the 63 ubiquitous genes identified by Koonin [14]. The bacteria families, genera and species analyzed in this study are summarized in . Only one gene sequence per taxon was retained in cases where the isolates shared 100% gene sequence similarity to avoid bias due to primer choice for a taxon. Sequencing of genomes from representatives of about 200 other high G+C content bacteria are currently in progress (http://www.genomesonline.org). The intraspecies similarities were performed in silico on 53 strains belonging to 14 species of the subclass Actinobacteridae and experimentally on 20 clinical isolates of Mycobacterium abscessus ( ).

Common set of gene database for primers designation

A common set of gene related sequences were retrieved from the available whole-genome sequences. This was facilitated with BLAST searches (http://www.ncbi.nlm.nih.gov/BLAST/genome) performed against the 50 Actinobacteria genome sequences available at the NCBI website (http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi/) using the orthologues of a common set of genes derived from Mycobacterium tuberculosis (GenBank accession number CP000611) and Propionibacterium acnes (GenBank accession number AE017283) as queries ( ). The common set of genes thus obtained were aligned using clustal X program, version 1.83, in the PHYLIP software package [62]. New broad range primers were designed for amplification and sequencing using SVARAP software which analyzes and graphically represents the variability in stretches of 50-bp along the nucleotide sequence [40], [41]. A BLAST search was performed to check potential primers for unspecific amplification. To ascertain primer efficacy and amplification efficiency ( ), a panel of genera of Actinobacteria was tested ( ). 16S rRNA amplification with primer pair fD1-rP2 [63] was used to ensure DNA extraction.

Individual gene fragment and whole-genome conserved region alignments

After alignment with clustal X, the sequence similarities of the individual gene fragment were determined using BioEdit v7.0.9 (Ibis Biosciences, Carlsbad, CA). To determine the genome conserved sequence similarity, pairs of whole genomes were aligned by using the MUMmer application [64] with the following parameters: breaken = 500, minCluster = 40, diagFactor = 0.15, maxGap = 250 and minMatch = 12. To ensure that all possible alignments were found, the reference and query files were swapped and each pair was reanalyzed. When two neighbouring sequence regions shared overlapping endpoints, the common segment was divided equally between them. Two similarity estimates were calculated from each genomic sequence comparison. DNA sequence similarity for conserved regions was calculated as the mean sequence identity of the homologous regions, weighted by each region's length in nucleotides.

Amplification and sequencing methods

PCR was conducted in 50 µl volumes with 25 µl of hotstart master mix and 20 µl water (Qiagen, Germantown, MD) with 15 min at 95°C followed by 35 cycles of 95°C for 1 min, 50°C, 52°C and 60°C depending on the primers ( ) for 1.30 min and 72°C for 1 min with a final extension step at 72°C for 10 min. Sequence reactions utilized the ABI Prism Big Dye v3.1 terminator cycle sequencing ready reaction kit (Perkin Elmer Applied Biosystems, Foster City, Calif) using the following program: an initial denaturation step of 1 min at 96°C followed by 25 cycles of denaturation at 96°C for 10 s, annealing at 50°C for 5 s and elongation at 60°C for 4 min. Products of sequencing reactions were recorded with ABI Prism 3130xl sequencer following the protocol of the supplier (Perkin Elmer Applied Biosystems). After the genomic statistical analysis, the sequences in GenBank were re-examined to increase the number of strains incorporated in the phylogenetic analysis. Phylogenetic trees were constructed from the common set of gene sequences using the neighour-joining method with Kimura's two parameter distance correction model with 1,000 bootstrap replications in MEGA version 4.0 software package [65]. The primer binding sites were eliminated from the sequences prior to computer analysis.

Statistical analysis

The chi-square test is used to compare the bootstrap values at the nodes of the phylogenetic trees. A p-value of <0.05 was considered significant. Family and species analyzed within the phylum Actinobacteria. (0.12 MB DOC) Click here for additional data file.
  62 in total

1.  Fast algorithms for large-scale genome alignment and comparison.

Authors:  Arthur L Delcher; Adam Phillippy; Jane Carlton; Steven L Salzberg
Journal:  Nucleic Acids Res       Date:  2002-06-01       Impact factor: 16.971

2.  Report of the ad hoc committee for the re-evaluation of the species definition in bacteriology.

Authors:  Erko Stackebrandt; Wilhelm Frederiksen; George M Garrity; Patrick A D Grimont; Peter Kämpfer; Martin C J Maiden; Xavier Nesme; Ramon Rosselló-Mora; Jean Swings; Hans G Trüper; Luc Vauterin; Alan C Ward; William B Whitman
Journal:  Int J Syst Evol Microbiol       Date:  2002-05       Impact factor: 2.747

3.  Gram-positive bacteria with a high DNA G+C content are characterized by a common insertion within their 23S rRNA genes.

Authors:  C Roller; W Ludwig; K H Schleifer
Journal:  J Gen Microbiol       Date:  1992-06

Review 4.  Towards a prokaryotic genomic taxonomy.

Authors:  Tom Coenye; Dirk Gevers; Yves Van de Peer; Peter Vandamme; Jean Swings
Journal:  FEMS Microbiol Rev       Date:  2005-04       Impact factor: 16.408

5.  The evolution of supertrees.

Authors:  Olaf R P Bininda-Emonds
Journal:  Trends Ecol Evol       Date:  2004-06       Impact factor: 17.712

Review 6.  Bacterial pathogenomics.

Authors:  Mark J Pallen; Brendan W Wren
Journal:  Nature       Date:  2007-10-18       Impact factor: 49.962

7.  Complete rpoB gene sequencing as a suitable supplement to DNA-DNA hybridization for bacterial species and genus delineation.

Authors:  Toïdi Adékambi; Thomas M Shinnick; Didier Raoult; Michel Drancourt
Journal:  Int J Syst Evol Microbiol       Date:  2008-08       Impact factor: 2.747

8.  A seven-gene, multilocus, genus-wide approach to the phylogeny of mycobacteria using supertrees.

Authors:  Sophie Mignard; Jean-Pierre Flandrois
Journal:  Int J Syst Evol Microbiol       Date:  2008-06       Impact factor: 2.747

9.  Mycobacterium barrassiae sp. nov., a Mycobacterium moriokaense group species associated with chronic pneumonia.

Authors:  Toïdi Adékambi; Didier Raoult; Michel Drancourt
Journal:  J Clin Microbiol       Date:  2006-10       Impact factor: 5.948

10.  Analysis of bifidobacterial evolution using a multilocus approach.

Authors:  Marco Ventura; Carlos Canchaya; Antonio Del Casale; Franco Dellaglio; Erasmo Neviani; Gerald F Fitzgerald; Douwe van Sinderen
Journal:  Int J Syst Evol Microbiol       Date:  2006-12       Impact factor: 2.747

View more
  14 in total

Review 1.  Phylogenetic framework and molecular signatures for the main clades of the phylum Actinobacteria.

Authors:  Beile Gao; Radhey S Gupta
Journal:  Microbiol Mol Biol Rev       Date:  2012-03       Impact factor: 11.056

2.  Glutamine synthetase type I (glnAI) represents a rewarding molecular marker in the classification of bifidobacteria and related genera.

Authors:  Jiří Killer; Chahrazed Mekadim; Věra Bunešová; Jakub Mrázek; Zuzana Hroncová; Eva Vlková
Journal:  Folia Microbiol (Praha)       Date:  2019-05-08       Impact factor: 2.099

3.  The 70S ribosome modulates the ATPase activity of Escherichia coli YchF.

Authors:  Marion Becker; Katherine E Gzyl; Alvin M Altamirano; Anthony Vuong; Kirstin Urban; Hans-Joachim Wieden
Journal:  RNA Biol       Date:  2012-09-20       Impact factor: 4.652

4.  A walk into the LuxR regulators of Actinobacteria: phylogenomic distribution and functional diversity.

Authors:  Catarina Lopes Santos; Margarida Correia-Neves; Pedro Moradas-Ferreira; Marta Vaz Mendes
Journal:  PLoS One       Date:  2012-10-08       Impact factor: 3.240

5.  Genomic variation in Salmonella enterica core genes for epidemiological typing.

Authors:  Pimlapas Leekitcharoenphon; Oksana Lukjancenko; Carsten Friis; Frank M Aarestrup; David W Ussery
Journal:  BMC Genomics       Date:  2012-03-12       Impact factor: 3.969

6.  Genome and Phenotype Microarray Analyses of Rhodococcus sp. BCP1 and Rhodococcus opacus R7: Genetic Determinants and Metabolic Abilities with Environmental Relevance.

Authors:  Alessandro Orro; Martina Cappelletti; Pasqualina D'Ursi; Luciano Milanesi; Alessandra Di Canito; Jessica Zampolli; Elena Collina; Francesca Decorosi; Carlo Viti; Stefano Fedi; Alessandro Presentato; Davide Zannoni; Patrizia Di Gennaro
Journal:  PLoS One       Date:  2015-10-01       Impact factor: 3.240

7.  Use of whole genome sequences to develop a molecular phylogenetic framework for Rhodococcus fascians and the Rhodococcus genus.

Authors:  Allison L Creason; Edward W Davis; Melodie L Putnam; Olivier M Vandeputte; Jeff H Chang
Journal:  Front Plant Sci       Date:  2014-08-19       Impact factor: 5.753

8.  Evolutionary transitions between beneficial and phytopathogenic Rhodococcus challenge disease management.

Authors:  Elizabeth A Savory; Skylar L Fuller; Alexandra J Weisberg; William J Thomas; Michael I Gordon; Danielle M Stevens; Allison L Creason; Michael S Belcher; Maryna Serdani; Michele S Wiseman; Niklaus J Grünwald; Melodie L Putnam; Jeff H Chang
Journal:  Elife       Date:  2017-12-12       Impact factor: 8.140

9.  Emergence of a New Population of Rathayibacter toxicus: An Ecologically Complex, Geographically Isolated Bacterium.

Authors:  Mohammad Arif; Grethel Y Busot; Rachel Mann; Brendan Rodoni; Sanzhen Liu; James P Stack
Journal:  PLoS One       Date:  2016-05-24       Impact factor: 3.240

10.  Gall-ID: tools for genotyping gall-causing phytopathogenic bacteria.

Authors:  Edward W Davis Ii; Alexandra J Weisberg; Javier F Tabima; Niklaus J Grunwald; Jeff H Chang
Journal:  PeerJ       Date:  2016-07-19       Impact factor: 2.984

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.