
A systematic study of the whole genome sequence of Amycolatopsis methanolica strain 239T provides an insight into its physiological and taxonomic properties which correlate with its position in the genus.

Biao Tang1,2, Feng Xie3, Wei Zhao2, Jian Wang3, Shengwang Dai3, Huajun Zheng4, Xiaoming Ding1, Xufeng Cen2, Haican Liu5, Yucong Yu1, Haokui Zhou6, Yan Zhou1,4, Lixin Zhang3, Michael Goodfellow7, Guo-Ping Zhao1,2,4,6.   

Abstract

The complete genome of methanol-utilizing Amycolatopsis methanolica strain 239T was generated, revealing a single 7,237,391 nucleotide circular chromosome with 7074 annotated protein-coding sequences (CDSs). Comparative analyses against the complete genome sequences of Amycolatopsis japonica strain MG417-CF17T, Amycolatopsis mediterranei strain U32 and Amycolatopsis orientalis strain HCCB10007 revealed a broad spectrum of genomic structures, including various genome sizes, core/quasi-core/non-core configurations and different kinds of episomes. Although polyketide synthase gene clusters were absent from the A. methanolica genome, 12 gene clusters related to the biosynthesis of other specialized (secondary) metabolites were identified. Complete pathways attributable to the facultative methylotrophic physiology of A. methanolica strain 239T, including both the mdo/mscR encoded methanol oxidation and the hps/hpi encoded formaldehyde assimilation via the ribulose monophosphate cycle, were identified together with evidence that the latter might be the result of horizontal gene transfer. Phylogenetic analyses based on 16S rDNA or orthologues of AMETH_3452, a novel actinobacterial class-specific conserved gene, against 62 or 18 Amycolatopsis type strains, respectively, revealed three major phyletic lineages, namely the mesophilic or moderately thermophilic A. orientalis subclade (AOS), the mesophilic Amycolatopsis taiwanensis subclade (ATS) and the thermophilic A. methanolica subclade (AMS). The distinct growth temperatures of members of the subclades correlated with corresponding genetic variations in their encoded compatible solutes. This study shows the value of integrating conventional taxonomic data with whole genome sequence data.

Keywords:  AMS; AOS; ATS; Amycolatopsis methanolica; Complete genome sequence; One carbon metabolism; Sub-generic phyletic clades

Year:  2016        PMID: 29062941      PMCID: PMC5640789          DOI: 10.1016/j.synbio.2016.05.001

Source DB:  PubMed          Journal:  Synth Syst Biotechnol        ISSN: 2405-805X


Introduction

The genus Amycolatopsis [1] belongs to the family Pseudonocardiaceae [2], [3] which is a member of the order Pseudonocardiales [4] in the class Actinobacteria [5]. This genus currently encompasses 65 validly published species (http://www.bacterio.net/amycolatopsis.html) including the type and sole representative of Amycolatopsis methanolica, a facultative methylotrophic actinobacterium with a tortuous taxonomic pedigree. The organism was initially classified as Streptomyces sp. strain 239 [6], [7], but was moved first to the genus Nocardia [8] and then to the family Pseudonocardiaceae [9] prior to its assignment to the genus Amycolatopsis based on a combination of genotypic and phenotypic criteria [10]. Like most Amycolatopsis strains, A. methanolica strain 239T is Gram-positive, non-acid-fast, and forms aerial and substrate hyphae that fragment into squarish elements. It contains meso-diaminopimelic acid (meso-A2pm), arabinose and galactose in the wall peptidoglycan, major amounts of di- and tetrahydrogenated menaquinones, phosphatidylethanolamine as the diagnostic phospholipid and iso-C16:0 as the predominant fatty acid, but lacks mycolic acids [11].

The genus Amycolatopsis can be distinguished from other genera classified in the family Pseudonocardiaceae using genus-specific oligonucleotide primers [12] and a broad range of genotypic and phenotypic markers [2], [3]. The genus contains alkaliphilic, endophytic, halophilic, mesophilic, pathogenic and thermophilic strains [11], [13], [14], [15]. Over the years, a gradually increasing number of Amycolatopsis species has been assigned to several multi- and single-membered subclades based on gyrB, recN and 16S rRNA gene sequence analyses [11], [16], [17], though most species belong to two major subclades represented by the earliest described species, namely A. methanolica (AMS) and Amycolatopsis orientalis (AOS). Several AOS species have been shown to synthesize antibiotics, notably A. orientalis, the type species of the genus, which produces vancomycin [1], [18] and Amycolatopsis mediterranei which produces rifamycin [1], [19]. In contrast, A. methanolica seems to have a markedly different metabolism as it is a facultative methylotroph that synthesizes few, if any, antibiotics. It is difficult to distinguish between AMS and AOS species using phenotypic properties, though strains in the former group grow well at 45 °C [10], [11], [20], [21], [22], [23], [24], [25]; hence species classified in the AMS can be considered as thermophilic actinobacteria [26], [27]. AMS strains are of potential value in biotechnology as vehicles for fermentative overproduction of aromatic amino acids [10], [28] and as agents of bioremediation [25], [29].

Data derived from whole-genome sequences are being used increasingly to clarify relationships between prokaryotes, including filamentous actinobacteria which are difficult to resolve using established procedures [30], [31]. Complete genome sequences are available for Amycolatopsis japonica strain MG417-CF17T, which produces (S,S)-N,N'-ethylenediaminedisuccinic acid, a potential phospholipase inhibitor [32]; for the rifamycin SV-producer, A. mediterranei strain U32 [33]; and for the vancomycin-producing A. orientalis strain HCCB10007 [34].

Here, we present the complete genome sequence of A. methanolica strain 239T. We show that it contains genes encoding one-carbon metabolic pathways and 12 potential specialized (secondary) metabolic biosynthetic gene clusters, thereby providing a further insight into the taxonomy of the genus Amycolatopsis. Comparative genomic analyses undertaken between the A. methanolica genome and those of the AOS strains, namely A. japonica, A. mediterranei and A. orientalis, revealed a broad spectrum of genomic structures. In addition, when the phylogenetic structure of the genus Amycolatopsis was revisited based on in silico 16S rDNA and orthologues of the actinobacterial class-specific conserved gene (AMETH_3452), a third phyletic branch, the Amycolatopsis taiwanensis subclade (ATS), and some yet-to-be clearly delineated minor groups were highlighted.

Results and discussion

The complete genome of A. methanolica strain 239T provides a high quality alternative genetic blueprint for the genus Amycolatopsis

The A. methanolica chromosome is circular (Fig. 1A) and can thereby be distinguished from the linear chromosomes of Streptomyces species, but not from those of representatives of the family Pseudonocardiaceae (Actinosynnema mirum, A. japonica, A. mediterranei, A. orientalis, Pseudonocardia dioxanivorans, Saccharomonospora viridis and Saccharopolyspora erythraea) as shown in Table 1. The genome of A. methanolica strain 239T (CP009110) consists of 7,237,391 bp and is much smaller than those of the A. japonica (8.9 Mbp), A. mediterranei (10.2 Mbp) and A. orientalis (8.9 Mbp) strains; its G + C content is 71.53 mol%, which is comparable to experimentally determined values (Fig. 1A) and is in the same range as those of other Amycolatopsis strains [32], [33], [34] (Table 1).
Fig. 1

Genome atlas of A. methanolica strain 239T with two characteristic episomes enlarged (A) and the gene clusters encoding biosynthetic functions for specialized metabolites (B). (A) The large circle represents the chromosome: the outer scale is numbered in megabases and indicates the core (blue) and non-core (yellow) regions. The circles are numbered from the outside in. The genes in circles 1 and 2 (forward and reverse strands, respectively) are color-coded according to COG functional categories. Circle 3 shows selected essential genes (cell division, replication, transcription, translation, and amino-acid metabolism; the paralogs of essential genes in the non-core regions are not included), circle 4 the phage (red), specialized metabolite clusters (orange) and integrated plasmid pMEA300 (pink), circle 5 the mobile genetic elements (transposases), circle 6 the RNAs, circle 7 the GC content with a calculated ratio of 71.53 mol% and circle 8 the GC bias (blue, values > 0; red, values < 0). The right panel of (A) shows the two characteristic episomes. (B) The gene clusters encoding biosynthetic functions for specialized metabolites. The corresponding biosynthesis-related genes are dark colored.
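
Circles 7 and 8 of the atlas are windowed sequence statistics. The following is a minimal sketch of how such tracks can be computed, assuming that "GC bias" refers to the usual GC skew, (G − C)/(G + C). The function name, window size and toy sequence are illustrative assumptions and are not taken from the study.

```python
from typing import List, Tuple


def gc_tracks(seq: str, window: int, step: int) -> List[Tuple[int, float, float]]:
    """Return (window_start, gc_fraction, gc_skew) for each sliding window.

    gc_fraction = (G + C) / window length
    gc_skew     = (G - C) / (G + C), set to 0.0 when the window has no G or C
    """
    seq = seq.upper()
    tracks = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_fraction = (g + c) / window
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        tracks.append((start, gc_fraction, gc_skew))
    return tracks


# Toy sequence (hypothetical): a G-rich half followed by a C-rich half,
# so the skew changes sign between the two windows.
toy = "GGGCGGATGG" + "CCCGCCATCC"
for start, gc, skew in gc_tracks(toy, window=10, step=10):
    print(start, gc, skew)
```

On a real chromosome the sign change of the skew is what makes such a track useful for locating the replication origin and terminus; the window and step sizes would need to be chosen to suit the genome length.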

Table 1

General features of Pseudonocardiaceae genomes.

Species/Strains (accession number) | A. orientalis HCCB10007 (CP003410) | A. mediterranei U32 (CP002000) | A. japonica MG417-CF17T (CP008953) | A. methanolica 239T (CP009110) | S. viridis DSM 43017T (CP001683) | S. erythraea NRRL 2338T (NC_009142) | A. mirum DSM 43827T (CP001630) | P. dioxanivorans CB1190T (CP002593)
--- | --- | --- | --- | --- | --- | --- | --- | ---
Length (bp) | 8,948,591 | 10,236,715 | 8,961,318 | 7,237,391 | 4,308,349 | 8,212,805 | 8,248,144 | 7,096,571
GC content | 69.01% | 71.30% | 68.89% | 71.53% | 67.32% | 71.15% | 73.71% | 73.31%
Total proteins | 8168 | 9228 | 8298 | 7074 | 3828 | 7197 | 6916 | 6495
Proteins with function | 5512 | 6441 | 4227 | 4830 | 2719 | 4979 | 4423 | 4642
Proteins with function (%) | 67.50% | 69.85% | 50.94% | 68.28% | 71.02% | 69.18% | 63.95% | 71.47%
Average ORF size (bp) | 992.3 | 990.1 | 981.8 | 897.3 | 978.1 | 969.2 | 1039.7 | 967.6
Average intergenic region size (bp) | 103.3 | 119.2 | 98.38 | 115.1 | 148.4 | 172.9 | 153.8 | 126
Coding density | 90.57% | 89.30% | 90.89% | 88.71% | 86.9% | 84.9% | 87.2% | 88.6%
rRNA operons | 4 | 4 | 4 | 3 | 3 | 4 | 5 | 3
tRNA genes | 50 | 52 | 55 | 51 | 49 | 50 | 60 | 50

T: type strain.
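
The "Proteins with function %" row of Table 1 is the ratio of the two rows above it. As a quick consistency check, the short sketch below recomputes that ratio from the counts in the table. The dictionary layout and function name are illustrative assumptions, and small differences from the printed row are possible if the published values were derived from slightly different counts.

```python
# (total proteins, proteins with function), copied from Table 1.
TABLE_1 = {
    "A. orientalis HCCB10007": (8168, 5512),
    "A. mediterranei U32": (9228, 6441),
    "A. japonica MG417-CF17T": (8298, 4227),
    "A. methanolica 239T": (7074, 4830),
    "S. viridis DSM 43017T": (3828, 2719),
    "S. erythraea NRRL 2338T": (7197, 4979),
    "A. mirum DSM 43827T": (6916, 4423),
    "P. dioxanivorans CB1190T": (6495, 4642),
}


def annotated_percent(total: int, with_function: int) -> float:
    """Percentage of predicted proteins that carry a functional annotation."""
    return 100.0 * with_function / total


for strain, (total, with_function) in TABLE_1.items():
    print(f"{strain}: {annotated_percent(total, with_function):.2f}%")
```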

The A. methanolica genome contains three rRNA operons as opposed to the four found in the A. japonica, A. mediterranei and A. orientalis genomes. The positions of all of the copies of the 16S rRNA genes in the genomes of these strains together with those in the genomes of S. viridis DSM 43017T and S. erythraea NRRL 2338T, representative species of the two most closely related genera, are listed in Table S1 (Supplementary Table 1). All three rRNA operons found in the A. methanolica genome share common genomic positions with those of the S. viridis strain with respect to the consistency of their neighbouring genes, though only two of them are in the same positions as those in the A. japonica, A. mediterranei, A. orientalis and S. erythraea genomes.

The genome of A. methanolica strain 239T has 51 tRNA genes including a tRNASec for selenocysteine (Sec). This tRNASec gene is located immediately downstream of selBA and upstream of selD. It is transcribed from the same DNA strand as selBA but from the strand opposite to selD. We also found a gene in the A. methanolica genome, namely AMETH_3396, that encodes a formate dehydrogenase α subunit equipped with a Sec-encoding UGA codon, a Sec insertion sequence (SECIS) element and a stem-loop structure required for the incorporation of Sec into proteins [33]. This gene is similar to that found in the genome of A. mediterranei (Table S2).

Similar to the two pMEA100-like integrated plasmids found in the A. mediterranei genome, one integrated pMEA300-like plasmid of 13,285 bp was found at the 10 kb position of the A. methanolica genome (Fig. 1A). As previously reported [35], [36], this pMEA300-like plasmid might be a conjugative element present mostly in an integrated state at a single site in the chromosome, though it can also replicate autonomously. The attB site for the integration of the plasmid was located in the Ile-tRNA region. This type of integrated plasmid is not found in the A. orientalis genome, though the latter does have a free plasmid, pXL100, which has not been detected in any of the other sequenced Amycolatopsis strains. Interestingly, a predicted prophage of 37,822 bp encoding 52 ORFs (AMETH_6251 to AMETH_6302), most of which were annotated as hypothetical proteins, is present in the A. methanolica 239T genome as a tandem repeat positioned around genome coordinate 6.5 Mbp (Fig. 1A). Prophages have not been found in the genomes of A. mediterranei, A. orientalis or A. japonica.

By employing the three criteria defined in the study of the A. mediterranei genome [33], genome-wide comparisons were made against the complete genome sequences of several Pseudonocardiaceae strains, namely A. mirum DSM 43827T, A. orientalis HCCB10007, A. mediterranei U32, P. dioxanivorans CB1190T, S. viridis DSM 43017T and S. erythraea NRRL 2338T [37]. These results revealed a highly conserved core region ranging from 0 to 2.1 Mb (left arm) and 4.8–7.2 Mb (right arm) in the A. methanolica chromosome (total of ∼4.5 Mb, Fig. 1, Fig. 2, Fig. 3). Although the ortholog ordering of quite a few consecutive genomic segments in the non-core region of A. methanolica is apparently co-linear with the quasi-core or core regions of the A. mediterranei and A. orientalis genomes (Fig. 3A and B), there is no obvious “quasi-core” region identified within the non-core region ranging from 2.1 to 4.8 Mb (total of ∼2.7 Mb) of its genome. This situation is quite different from that found in the genomes of A. mediterranei, A. orientalis and S. viridis (Fig. 2). On the other hand, this co-linearity analysis shows that A. methanolica strain 239T and S. viridis DSM 43017T share the highest consistency throughout the whole genome, even more so than against the A. mediterranei and A. orientalis genomes. This phenomenon may simply reflect the smaller genome size of S. viridis (∼4.2 Mb) compared to the genomes of the two other Amycolatopsis strains, especially in the non-core region (∼1 Mb in S. viridis) (Fig. 3C). In contrast, extremely low co-linearity was found between the genomes of the A. methanolica and S. erythraea strains (Fig. 3D).

Consequently, by increasing the number of genomes of various sizes from different phyletic lineages within the genus Amycolatopsis, the genome configuration aspect of “core” may be considered as the large consecutive genomic segment extending from oriC in both directions, a segment that is relatively stable and hence has fewer inserted sequences compared with the “non-core” region. The concept of “quasi-core” may be seen as part of the core near the replication terminus in ancestral species, but more accessible to multiple horizontal gene transfer (HGT) and chromosomal rearrangement events during evolution. Further, the boundaries of core and quasi-core regions against the non-core region are “relative”, i.e., they are more or less artificial and apparently variable depending on the pair of genomes being compared.
Fig. 2

Comparative analyses of 7 Pseudonocardiaceae genomes. In panel A, parallel straight lines represent the genomes of A. orientalis HCCB10007 (CP003410.1), A. mediterranei U32 (CP002000.1), A. methanolica 239T (CP009110), S. viridis DSM 43017T (CP001683.1), S. erythraea NRRL 2338T (AM420293.1), A. mirum DSM 43827T (CP001630.1) and P. dioxanivorans CB1190T (CP002593.1), and are drawn to scale with oriC located at the very end of the lines. Vertical short bars representing different conserved genes are marked with distinct colors to demonstrate their genomic loci; the latter bars are connected by corresponding colored thin lines. The segregation of core, non-core and quasi-core regions is shown as clusters of conserved orthologous genes. Black arrows highlight the genomic loci of the highly conserved AMETH_3452 analogs located in the middle of the 7 genomes and are connected by thick black lines. In panel B, the distribution of chromosomal loci of the highly conserved genes orthologous to AMETH_3452 among species of the class Actinobacteria with circular genomes is plotted against their corresponding genome sizes. The horizontal coordinate represents the relative chromosomal positions of the genomes normalized to 0.0–1.0 with oriC located at both ends; the genomic loci of the conserved genes orthologous to AMETH_3452 are shown as scattered points. This conserved gene is located in the middle of the circular chromosomes (i.e., close to the 0.5 locus) in species that mainly belong to four taxa, namely the families Pseudonocardiaceae (12 genomes, dot, red), Corynebacteriaceae (50 genomes, dot, green), Micrococcaceae (9 genomes, asterisk, blue) and Mycobacteriaceae (59 genomes, dot, cyan). In contrast, the gene is located close to oriC in the species belonging to the families Acidimicrobiaceae (rhombus, gray), Catenulisporaceae (square, pink), Geodermatophilaceae (triangle, gray), Micromonosporaceae (triangle, purple), Nocardiopsaceae (triangle, pink) and Streptomycetaceae (only Streptomyces violaceusniger Tu 4113) (square, purple). This gene is located between these two positions in species belonging to the families Frankiaceae (square, gray), Microbacteriaceae (rhombus, gold), Nocardiaceae (rhombus, brown), Propionibacteriaceae (rhombus, plum), Actinomycetaceae, Beutenbergiaceae, Cellulomonadaceae, Dermabacteraceae, Dermacoccaceae, Glycomycetaceae, Intrasporangiaceae, Jonesiaceae, Nakamurellaceae, Nocardioidaceae, Promicromonosporaceae, Sanguibacteraceae, Segniliparaceae, Thermomonosporaceae, Tsukamurellaceae and in Thermobispora bispora, a member of the genus Thermobispora that was once classified in the family Pseudonocardiaceae but is now known to form a deep lineage in the 16S rRNA actinobacterial gene tree, hence its reclassification as an order incertae sedis in the current edition of Bergey's Manual of Systematic Bacteriology [109] (small dot, black). Detailed information on all of these genomes is given in Table S4.

Fig. 3

Collinearity analyses of the genome of A. methanolica against the genomes of A. mediterranei (A), A. orientalis (B), S. viridis (C), and S. erythraea (D), respectively. The ortholog ordering in the defined core region of the A. methanolica genome is highlighted in light red while the corresponding consecutive genomic segments in the non-core region, which are matched with the quasi-core regions of the corresponding genomes, are shown in light green.

The replication origin (oriC) of the A. methanolica chromosome was established based on GC skew [38], and its adjacent dnaA gene was chosen as the starting point for numbering the 7074 predicted protein-coding sequences (CDSs) (Fig. 1A). Of these CDSs, 68.28% were assigned to a functional category in the Clusters of Orthologous Groups (COGs, Table S3). The numbers of orthologues shared between the A. methanolica genome and those of A. japonica, A. mediterranei and A. orientalis are 4107 (49.5% of all proteins of A. japonica), 4158 (51.4% of all proteins of A. mediterranei) and 4168 (45.1% of all proteins of A. orientalis), respectively. The proportions of genes encoding amino acid transport and metabolism, as well as energy production and conversion, are higher in the A. methanolica genome than in the A. mediterranei and A. orientalis genomes.

In addition to a comparison of chromosomal configurations of the closely related genomes, a conserved gene (AMETH_3452) spanning nucleotides 3,578,958 to 3,579,767 in the A. methanolica genome was found to encode a hypothetical protein of 269 amino acid residues (Fig. 2A, Fig. S1). Notably, orthologues of this gene are present in all of the genomes belonging to the family Pseudonocardiaceae (Fig. S1, e.g., AMED_4666, AORI_3898, AJAP_19695, SVIR_19800, SACE_3783, AMIR_3419, and PSED_3357). When all of the sequenced genomes of bacteria were analyzed, genes orthologous to AMETH_3452 were identified in almost all of the sequences from species classified in the class Actinobacteria but not in those of other classes of prokaryotes (see Fig. S1). Furthermore, it is particularly interesting to note that most of these sequences are located in the middle of the circular chromosomes opposite to oriC (Fig. 2), especially in the families Corynebacteriaceae, Micrococcaceae, Mycobacteriaceae and Pseudonocardiaceae, regardless of significant variations in the genome sizes of their constituent species (Fig. 2B, Table S4). This hypothetical protein is predicted to be soluble and cytoplasmic based on comparison of sequence similarity and secondary structural content (Fig. S1). Although the function of this CDS is unknown, it turned out to be a precise molecular marker for intra-generic species delineation of members of the family Pseudonocardiaceae as shown in the next section.

A draft A. methanolica genome (AQUL00000000.1) consisting of two contigs, of 6,804,661 bp and 392,199 bp, was released in 2013. We have determined the differences between the sequences of the two versions (AQUL00000000.1 and CP009110) (Fig. S2) and found that there are 47 single nucleotide polymorphisms (SNPs), 30 indels and a large insertion of 37,822 bp located around genomic coordinate 6.5 Mb in genome CP009110, which is actually the prophage mentioned above, shown in the draft as a single copy rather than a tandem repeat. In addition, contig 1 in AQUL00000000.1 shows inversions of two fragments compared to that of genome CP009110.
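
The coordinates quoted for AMETH_3452 can be checked with a few lines of arithmetic. The sketch below assumes that the coordinates are 1-based and inclusive, that the span includes the stop codon, and that coordinate 0 lies at oriC (the text states only that numbering starts at dnaA, adjacent to oriC). The function names are illustrative.

```python
GENOME_LENGTH_BP = 7_237_391   # A. methanolica 239T chromosome (CP009110)
CDS_START = 3_578_958          # AMETH_3452, first nucleotide
CDS_END = 3_579_767            # AMETH_3452, last nucleotide


def cds_length(start: int, end: int) -> int:
    """Length in bp of a CDS given 1-based inclusive coordinates (assumed)."""
    return end - start + 1


def encoded_residues(length_bp: int) -> int:
    """Amino acid residues encoded, assuming the span ends with a stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length_bp // 3 - 1


def relative_position(coordinate: int, genome_length: int) -> float:
    """Chromosomal position normalized to 0.0-1.0, with oriC assumed at 0."""
    return coordinate / genome_length


length_bp = cds_length(CDS_START, CDS_END)
print(length_bp)                                   # 810
print(encoded_residues(length_bp))                 # 269
print(round(relative_position(CDS_START, GENOME_LENGTH_BP), 3))
```

Under these assumptions the 810 bp span encodes 269 residues, in agreement with the text, and the gene sits at a relative position just below 0.5, that is, roughly opposite oriC on the circular chromosome.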

Amycolatopsis species may be classified into three major subclades and some minor groups based on phylogenetic analyses of 16S rDNA, AMETH_3452 orthologous sequences, growth physiology and genomic characteristics

Amycolatopsis species can be assigned to several multimembered and single-membered subclades/groups based on 16S rDNA sequences, as well as on gyrB and recN sequences, two highly conserved eubacterial orthologous genes [11], [16], [17]. Most of the species fall into the A. methanolica (AMS) and the A. orientalis (AOS) subclades, the members of which share similar taxonomic properties though they have been reported to have different temperature profiles [10], [23], [24]. We generated an Amycolatopsis phylogenetic tree based on 16S rRNA genes (data from EzTaxon [39]) of the type strains of 62 Amycolatopsis species using S. viridis DSM 43017T as the outgroup strain (Fig. 4A). The initial AOS subclade has grown since its inception and now comprises 36 mesophilic species and 13 moderately thermophilic species which can grow at 45 °C. This subclade is generally stable and apparently accommodates groups A-G [16], [17] and groups H, I, and J from this study (Fig. 4A). The remaining taxon, the F group, can be divided readily into two well delineated taxa: phyletic group F1, composed of Amycolatopsis helveola TT-99-32T [40], Amycolatopsis pigmentata TT00-43T [40], and A. taiwanensis 0345M-7T [41], and phyletic group F2, composed of A. methanolica and related species (Fig. 4A). It can be seen from Fig. 4B that, in contrast to group F1, only members of group F2 grow at 45 °C and in the presence of 5% (w/v) NaCl; these properties have yet to be determined for Amycolatopsis thermophila GY088T (see Fig. 4B). Consequently, another independent sub-generic lineage, the A. taiwanensis subclade (ATS/F1), is proposed to cover these three species, thereby emphasizing their phylogenetic and physiological differences from members of the AMS/F2 subclade, which is represented by A. methanolica [10] and comprises 10 thermophilic species capable of growing at temperatures above 50 °C except for Amycolatopsis endophytica KLBMP 1221T [42], which grows at 45 °C.
Fig. 4

Neighbour-joining phylogenetic tree based on 16S rDNA of 62 Amycolatopsis type strains with S. viridis DSM 43017T as the outgroup (A) together with corresponding growth and temperature tolerance properties (B). In panel A, numbers at the nodes are percentage bootstrap values based on 1000 resampled datasets. The scale bar indicates 5 nucleotide substitutions per 1000 nucleotides. The subclades are designated AOS, ATS and AMS, respectively. Subclade groups (A–G) defined by Everest and Meyers [16] and Everest et al. [17] are shown; the F group is divided into F1 and F2. The groups H-J were defined in this study; the J group lies at the periphery of the AOS subclade, hence the corresponding vertical bar is dashed. In panel B, the phenotypic characteristics of the 62 Amycolatopsis type strains are listed according to their phylogenetic locations. All of the growth condition data were taken from the initial published descriptions of the species. Abbreviations: +, positive growth; w, weak growth; -, no growth; ND, no data available.

It is well known that compatible solutes can stabilize biological membranes and protect cells and cell components from freezing, desiccation, high temperature and oxygen radicals [43], [44], [45], [46], [47]. Consequently, genes related to the synthesis of several compatible solutes, such as ectoine/hydroxyectoine, glycine betaine and trehalose, were sought in the AMS genomes of A. methanolica, Amycolatopsis sp. ATCC 39116 (AFWY00000000) and Amycolatopsis thermoflava N1165T (NZ_AXBH00000000) versus the ATS genome of A. taiwanensis DSM 45107T (NZ_JAFB00000000) (Table S5). It is significant that the OtsA-OtsB [48], TreY-TreZ [49], TreS [48] and TreP [50], [51], [52] pathways for trehalose biosynthesis were found in the three AMS genomes while the ATS genome has only the OtsA-OtsB and TreP pathways, suggesting that the absence of the TreY-TreZ and TreS pathways may contribute to the reduction of trehalose accumulation in ATS strains.
In addition, several transporters encoded by ssuB (AMETH_2747), proP (AMETH_5526, AMETH_6721) and betP (AMETH_3300, AMETH_4479) and genes encoding five betaine aldehyde dehydrogenases (AMETH_0085, AMETH_2645, AMETH_5372, AMETH_5920, AMETH_5579) were missing in the ATS strains thereby accounting for their salt and heat intolerance. The whole genome sequences of A. japonica strain MG417-CF17T, A. mediterranei strain U32, A. methanolica strain 239T and A. orientalis strain HCCB10007 have multiple copies of 16S rRNA genes with various levels of identity between selected pairs of rDNAs. Specifically, the sequences of the three copies of 16S rDNAs in the A. methanolica genome are identical but distinct from those in the genomes of the other strains. The sequences of the 16S rRNA genes of the A. japonica, A. mediterranei and A. orientalis strains share identities ranging from 98.61 to 100% compared to other 16S rDNAs in the same genome (Fig. 5A). In particular, the sequence identity for some inter-species pairs of 16S rDNA between the A. orientalis and A. japonica strains are higher than those of the corresponding intra-species pairs. Specifically, HCCB10007_R022 is much closer to MG417-CF17_AJAP r14255 (99.52%) and MG417-CF17_AJAP r30385 (99.52%) than to HCCB10007_R016 (98.91%); while MG417-CF17_AJAP r33495 is much closer to HCCB10007_R045 (99.66%) and HCCB10007_R041 (99.66%) than to MG417-CF17_AJAP r10595 (99.2%) (Fig. 5A). This variation needs to be addressed as it may influence the structure of Amycolatopsis 16S rRNA gene trees. We conducted a phylogenetic analysis using all 15 copies of the 16S rDNAs encoded in the genomes of the A. japonica, A. methanolica, A. mediterranei and A. orientalis strains (Fig. 5B) and found that certain copies of the 16S rDNAs from the genomes of the A. japonica and A. orientalis strains were cross-clustered while those of the A. mediterranei and A. methanolica strains clearly clustered independently. 
This result is not surprising and can be attributed to the different levels of sequence similarities between pairs of the 16S rDNAs from the different species versus sequence diversities between pairs of 16S rDNAs within a species (Fig. 5A).
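The comparison described above, in which an inter-species pair of 16S rDNA copies is more similar than an intra-species pair, can be sketched in a few lines of Python. This is only an illustration built from the six identity values quoted in the text for Fig. 5A; the dictionary layout and the function names (genome_of, cross_clustering_pairs) are assumptions made for this example and are not the authors' code.

```python
# Percentage identities quoted in the text for Fig. 5A.
# The data layout and function names are illustrative assumptions.
IDENTITY = {
    ("HCCB10007_R022", "MG417-CF17_AJAP r14255"): 99.52,
    ("HCCB10007_R022", "MG417-CF17_AJAP r30385"): 99.52,
    ("HCCB10007_R022", "HCCB10007_R016"): 98.91,
    ("MG417-CF17_AJAP r33495", "HCCB10007_R045"): 99.66,
    ("MG417-CF17_AJAP r33495", "HCCB10007_R041"): 99.66,
    ("MG417-CF17_AJAP r33495", "MG417-CF17_AJAP r10595"): 99.2,
}


def genome_of(copy_id):
    """Return the strain prefix of a 16S rDNA copy identifier."""
    return copy_id.split("_")[0]


def cross_clustering_pairs(identity):
    """List inter-genome pairs whose identity exceeds the lowest
    intra-genome identity recorded for the first copy of the pair."""
    flagged = []
    for (a, b), inter in identity.items():
        if genome_of(a) == genome_of(b):
            continue  # skip intra-genome pairs
        intra = [
            value
            for (x, y), value in identity.items()
            if a in (x, y) and genome_of(x) == genome_of(y)
        ]
        if intra and inter > min(intra):
            flagged.append((a, b, inter, min(intra)))
    return flagged


for a, b, inter, intra in cross_clustering_pairs(IDENTITY):
    print(f"{a} vs {b}: inter {inter}% > intra {intra}%")
```

Pairs flagged in this way are the ones that would be expected to cross-cluster in a tree built from individual 16S rDNA copies.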
Fig. 5

Pairwise comparison of 16S rDNA sequences from the four Amycolatopsis genomes (A) and their overall phylogenetic relationships (B). In panel A, the percentage of sequence identity is shown for each pair of the different copies of 16S rDNAs. The brackets on the diagonal line of the table represent comparisons between the same 16S rDNAs. The pairs enclosed in triangles represent intra-species comparisons, while those enclosed in squares represent inter-species comparisons whose identities are comparable to those of the corresponding intra-species comparisons, which are shown as square-enclosed data inside the triangles. Panel B shows how the small differences between the intra- and inter-species 16S rDNA comparisons account for the abnormal clustering of the corresponding species in the phylogenetic tree.

We also analyzed the phylogeny of the genus Amycolatopsis using the fifteen 16S rDNA sequences of the four Amycolatopsis species and 16S rRNA gene sequences from another 55 Amycolatopsis species, as shown in Fig. S3. Although the different copies of the 16S rDNAs selected from the A. japonica and A. orientalis genomes affected the positions of these strains in the phylogenetic tree, they had no effect on the assignment of strains to the three Amycolatopsis subclades. This result provides further evidence that phylogenetic relationships between strains at the sub-generic level can be influenced by the presence of multiple copies of diverse 16S rRNA genes in the constituent species [53], [54], [55].

The latest (September 2015) version of the 16S rDNA based All-Species Living Tree accommodated the 62 Amycolatopsis species mentioned above and divided them into two lineages separated by some species of Prauserella and Saccharomonospora (http://www.arb-silva.de/fileadmin/silva_databases/living_tree/LTP_release_123/LTPs123_SSU_tree.pdf). The smaller lineage encompassed 13 species, including A. methanolica and A. taiwanensis, i.e., the representatives of the AMS and ATS subclades, respectively. 
The larger phyletic line contained 52 species, including A. orientalis, the representative of the AOS subclade. This tree is consistent with our division of the genus into three major subclades (Fig. 4A) though the intrusion of the Prauserella and Saccharomonospora species needs an explanation. The All-Species Living Tree [56] based on the Silva database [57] was designed to include all of the sequenced type strains of hitherto published species of Archaea and Bacteria based on their 16S rRNA gene sequences. To encompass such a broad range of bacterial species into a single tree to capture the higher taxonomic groups and clusters, different sequence filters were used to exclude some highly variable regions of the aligned 16S rRNA gene sequences [56]. However, exclusion of too many variable regions may reduce the phylogenetic resolution at lower taxonomic ranks, such as at generic and species levels. In order to construct a more reliable phylogenetic tree of the targeted bacterial group(s) instead of considering all of the taxa in the All-Species Living Tree, we performed a phylogenetic analysis focused on the relevant groups (the three genera in this case) using complete 16S rRNA gene sequences without using the sequence filters from the Silva database. This approach enabled us to retain all of the informative aligned sequences needed to construct a better phylogenetic tree of the targeted bacterial taxa (Amycolatopsis, Prauserella and Saccharomonospora in this case) as shown in Fig. S4. Currently, complete and draft genomes of 18 Amycolatopsis strains have been deposited in the NCBI database. We generated phylogenetic trees based on these strains using either gyrB or recN genes alone or in combination with 16S rDNA genes and used corresponding data from species representing four other genera classified in the family Pseudonocardiaceae as the outgroup, an approach that has rarely been applied [16], [17]. 
The results were completely unexpected: Amycolatopsis halophila YIM 93223T did not cluster with the other Amycolatopsis species but with Saccharomonospora and Saccharopolyspora species (Fig. S5). The same problem was encountered when the phylogenetic tree was constructed from the concatenated (head-to-tail linked) sequences of 95 single-copy orthologous genes encoding conserved functional proteins annotated from a total of 123 actinobacterial genomes (Table S6, Fig. S6). However, this approach worked well when restricted to the three representative Amycolatopsis species and representative species of five genera belonging to the family Pseudonocardiaceae (Fig. S7). We speculate that this phenomenon is due to the gyrB and recN sequences not being sufficiently variable to discriminate between species of closely related genera, notably genera belonging to the same family.

Consequently, phylogenetic trees based on the newly identified, conserved class-specific AMETH_3452 orthologues (Fig. 2, Fig. S1, Table S4), with or without the corresponding 16S rDNA sequences from the complete or draft genomes of the 18 Amycolatopsis species, were constructed and compared to the corresponding 16S rDNA gene tree (Fig. 6). In all three trees, the species representing the five genera classified in the family Pseudonocardiaceae clustered as expected. It was particularly interesting that all but one of the 18 Amycolatopsis species were segregated into AOS (13 species, covering groups A, B, C, D and E), AMS (3 species of group F2) and ATS (1 species of group F1) lineages; the exception was A. halophila (Fig. 4A).
Fig. 6

Neighbour-joining trees based on AMETH_3452 orthologues, with or without the corresponding 16S rDNA sequences, and on 16S rDNA sequences alone. The AOS/ATS/AMS classification and the A-F and J groupings are shown. Half square brackets in red denote the AMS branches, those in green the ATS branch and those in blue the AOS branches. Numbers at the nodes are bootstrap values based on 1000 replicates. The scale bar indicates 0.02, 0.05 or 0.005 nucleotide substitutions per site.

It is particularly interesting that the type strain of A. halophila formed a distinct phyletic line that was more closely related to the A. methanolica and A. taiwanensis strains than to representatives of the A. orientalis subclade (Fig. 6). However, this result is not so surprising, as the A. halophila and Amycolatopsis salitolerans strains formed a distinct group (designated J), supported by a 100% bootstrap value, towards the periphery of the A. orientalis 16S rRNA subclade and hence quite close to the group F strains (Fig. 4). The A. halophila and A. salitolerans strains also share distinctive physiological features, as they are thermophilic and grow at high salt concentrations [58], [59]. The A. halophila strain has a particularly small genome (5.55 Mb) with very few annotated gene clusters that are likely to encode specialized metabolites. These genomic characteristics are more similar to those of F group species than to those of AOS species; the former have smaller genomes (8.78–7.24 Mbp against 10.86–8.53 Mbp) and fewer gene clusters (16–12 against 55–17) than the latter (Table S7 and Fig. 6). These data clearly show that the A. halophila and A. salitolerans strains belong to a subclade that lies closer to the F group than to the AOS subclade (Fig. S5 and Fig. S6).

The systematic analyses illustrated above underpin the long-standing view that 16S rDNA-based phylogenetic analysis is reliable at and above the generic level, but this is not always so at the species level, where relationships may not be in sync with corresponding phenotypic data. By the same token, gyrB and recN sequence-based phylogenies [16], [17] failed to distinguish between species from closely related taxa among the genera of the family Pseudonocardiaceae considered here (Fig. S5). On the other hand, the AMETH_3452-based tree in this study provides an alternative perspective by employing a class-specific orthologous gene that encodes a hypothetical CDS; in this instance, good correlation was found between the phylogenetic and corresponding physiological data (Fig. 6).

Whole genome sequence of A. methanolica strain 239T reveals the genetic basis for its special primary and specialized metabolic characteristics

Although A. methanolica strain 239T has been reported to produce few, if any, specialized metabolites [60], a total of 12 gene clusters putatively encoding such metabolites were identified in its genome (Fig. 1B) using antiSMASH3 [61] and BAGEL3 [62] software. It is interesting that all 12 gene clusters are highly conserved among the three AMS species with known complete or draft genome sequences (A. methanolica strain 239T, Amycolatopsis sp. ATCC 39116 and Amycolatopsis thermoflava strain N1165T), whereas, except for oth and ectABC (see later), neither the gene clusters per se nor their chromosomal loci are conserved in the three completely sequenced and well annotated representative genomes of AOS species [32], [33], [34]. These findings can be attributed to the close phyletic relationships found between the AMS species, while further analysis shows that species in the AOS generally have a greater potential for the synthesis of specialized metabolites than the ATS and AMS strains (Table S7). In particular, few polyketide synthases (PKSs) and/or non-ribosomal peptide synthetases (NRPSs) were identified in the three AMS genomes (1 NRPS in A. methanolica strain 239T, 1 PKS and 1 NRPS in Amycolatopsis sp. ATCC 39116, 3 PKSs and 1 NRPS in A. thermoflava strain N1165T), which is significantly different from corresponding data in the AOS genomes.

Among the 12 gene clusters for potential specialized metabolites, two classes of siderophore biosynthetic gene clusters were identified. The NIS (NRPS-independent siderophore) gene cluster encodes the key enzyme NIS synthase, which is derived from esterified or amidated derivatives of carboxylic acid and belongs to group type C [63] based on phylogenetic analysis. The only NRPS cluster in the genome of A. methanolica strain 239T (Amys, AMETH_0591 - AMETH_0609), which spans 39,563 bp, shows 96.5% and 97.0% identity with a similar cluster identified in Amycolatopsis sp. ATCC 39116 and A. thermoflava strain N1165T, respectively. 
The genes encoded in this cluster are complete and the NRPS products have recently been identified and characterized as amychelins [60]. Three ribosomally synthesized and post-translationally modified peptide (RiPP) clusters, namely Lasso, Lan1 and Lan2, have low homology to known sequences, indicating that their products are likely to be novel. Terp2 and Terp3 were recognized as gene clusters responsible for isorenieratene and hopene biosynthesis, respectively, based on antiSMASH data; Terp2 probably accounts for the yellow colonies of A. methanolica strain 239T, while Terp1 and Terp4 may encode new types of terpenes. The bacteriocin biosynthetic gene cluster in A. methanolica strain 239T seems to be conserved in Amycolatopsis sp. ATCC 39116 [64], Amycolatopsis sp. AA4 (NZ_ACEV00000000), Amycolatopsis sp. MJM2582 (JPLW00000000) [65] and in the type strains of Amycolatopsis azurea DSM 43854T (NZ_ANMG00000000), Amycolatopsis decaplanina DSM 44594T (NZ_AOHO00000000) [66], Amycolatopsis lurida NRRL 2430T (CP007219) [67], Amycolatopsis japonica MG417-CF17T [32] and Amycolatopsis nigrescens CSC17Ta-90T (NZ_ARVW00000000).

The ectABC gene cluster involved in the synthesis of ectoine has been isolated and characterized for a number of strains belonging to the Actinobacteria, Firmicutes and Proteobacteria. The ectoine hydroxylase gene (ectD) is responsible for the conversion of ectoine to hydroxyectoine; both ectoine and hydroxyectoine are involved in thermoprotection [68], [69]. The ectoine gene cluster has been detected in the genomes of many Amycolatopsis species, including A. taiwanensis DSM 45107T (ATS), A. japonica (AJAP_02100_409 - AJAP_02085_406), A. mediterranei (AMED_8594 - AMED_8597) and A. orientalis (AORI_7380 - AORI_7383) (AOS). For the AMS species, the EctABCD proteins of A. methanolica strain 239T have 99% similarity to those of the other two AMS species, i.e., A. thermoflava strain N1165T and Amycolatopsis sp. ATCC 39116; associated CDSs are also highly conserved in these strains (Fig. S8).

One-carbon metabolism was recognized as a key characteristic of A. methanolica strain 239T [10]. However, KEGG mapping of the whole genome sequence of this strain showed that it lacked the gene encoding the well-known methanol dehydrogenase (e.g., cytochrome c-dependent or NAD-dependent) used to oxidize methanol [70], [71]. Instead, an mdo gene (AMETH_5577) was identified in A. methanolica strain 239T, as previously described [72], [73]. This gene encodes methanol:N,N-dimethyl-4-nitrosoaniline oxidoreductase (EC: 1.1.99.37), which may act on the CH-OH group of donors. The subsequent oxidation of formaldehyde [74], [75], [76] and formate is likely to be catalyzed by NAD/mycothiol-dependent formaldehyde dehydrogenase (FD-FA1DH/MscR, EC: 1.1.1.306) encoded by AMETH_3767 and formate dehydrogenase (EC: 1.2.1.2) encoded by AMETH_3397 (fdh) and AMETH_3398 (fdoH), respectively (Fig. 7A). Though a glutathione-independent formaldehyde dehydrogenase (GD-FA1DH, EC: 1.2.1.46) is encoded by AMETH_1319 in the genome of A. methanolica strain 239T, its activity in this strain is doubtful [74], [76] as glutathione has not been detected in actinobacteria [77], [78]. In this study, we confirmed the proposition that glutathione is unlikely to be synthesized in A. methanolica strain 239T due to the lack of the gene gshB, which encodes one of the two key enzymes, i.e., glutathione synthetase [79].
Fig. 7

Schematic representation of the pathways for methanol, glucose and gluconate metabolism and for the biosynthesis of aromatic amino acids in A. methanolica strain 239T (A) and comparison of the corresponding genes against those of other Amycolatopsis strains (B). AMETH_5577 represents methanol:N,N-dimethyl-4-nitrosoaniline oxidoreductase, K17067, EC:1.2.99.4/EC:1.1.99.37; AMETH_3767, NAD/mycothiol-dependent formaldehyde dehydrogenase, K00153, EC:1.1.1.306; AMETH_3397, formate dehydrogenase, K00123, EC:1.2.1.2; and AMETH_3398, formate dehydrogenase iron-sulfur subunit, K00124, EC:1.2.1.2. In panel B, Y denotes the identified genes, N denotes the unidentified genes in complete genomes, and ND denotes the genes that are not detected in the incomplete genomes.

Here we propose that A. methanolica strain 239T assimilates formaldehyde via the ribulose monophosphate (RuMP) cycle rather than by the serine pathway [76], as indicated by KEGG mapping of its whole genome sequence. In this pathway, formaldehyde is condensed with ribulose 5-phosphate (Ru5P) to form D-arabino-3-hexulose-6-phosphate (Hu6P) in a reaction catalyzed by 3-hexulose-6-phosphate synthase (HPS); Hu6P is then isomerized to fructose 6-phosphate (F6P) by 3-hexulose-6-phosphate isomerase (HPI) (Fig. 7A). The A. methanolica genome has one gene (AMETH_4517) encoding HPI and two identical genes (AMETH_4519, AMETH_4538) encoding HPS (Fig. 7B), indicating an efficient RuMP pathway for F6P supply. The genome of A. mediterranei strain U32 does not encode any of the above-mentioned key enzymes for C1 utilization (Mdo, FD-FA1DH/MscR, HPS or HPI), while the genomes of A. japonica strain MG417-CF17T and A. orientalis strain HCCB10007 encode only FD-FA1DH/MscR. The genomes of the other AMS strains, namely A. thermoflava N1165T and Amycolatopsis sp. ATCC 39116 [64], contain the two key genes (mdo and mscR) for methanol oxidation, but neither hps nor hpi genes have been found in their partial genome sequences (Fig. 7B). 
Therefore, it seems unlikely that other members of the thermophilic AMS will be methylotrophic, mainly due to the absence of the RuMP cycle for assimilation of C1 intermediates. In contrast with ATS, where none of the four genes has been detected in the genome of its representative species, A. taiwanensis, the mscR gene has been detected in the genomes of a few species of AOS, though not in that of A. mediterranei, as mentioned above (Fig. 7B). It is particularly interesting that all four genes are found in the partially sequenced genome of Amycolatopsis benzoatilytica AK 16/65T (ARPK00000000.1), a member of the AOS subclade (Fig. 7B), a result which suggests it may be a facultative methylotroph.

Genomic annotation confirmed that, as a facultative methylotroph, A. methanolica strain 239T is able to generate F6P not only from C1 substrates via the RuMP cycle but also from other carbon sources, such as glucose via glycolysis and the KDPGA (Entner-Doudoroff) and TA (pentose phosphate) pathways (Fig. 7A). For the downstream assimilation reactions that occur in the TCA cycle, it is significant that the A. methanolica strain lacks 2-ketoglutarate dehydrogenase but, as in other Amycolatopsis strains, has 2-oxoglutarate synthase encoded by korA (AMETH_0502) and korB (AMETH_0501) instead (Fig. 7A).

The hps and hpi genes of A. methanolica strain 239T are located in a cluster that extends from AMETH_4515 to AMETH_4540 and which encodes a few genes for the TA and glycolysis pathways. An analysis of the phylogeny of the protein sequences of HPS and HPI orthologs, when present in the same strain, shows that while these genes have been evolving more or less concomitantly (Fig. 8A and B), similar clusters are absent in most of the genomes, as exemplified by actinomycetes such as Arthrobacter aurescens strain TC1 (CP000474.1) [80] and Rhodococcus jostii strain RHA1 (CP000431.1) [81], [82] and by the firmicute Bacillus subtilis strain BSn5 (CP002468.1) [83] (Fig. 8C). 
However, in the incompletely sequenced genomes of two representatives of the family Pseudonocardiaceae, namely A. benzoatilytica strain AK 16/65T (ARPK00000000.1) and Saccharomonospora marina strain XMU15T (CM001439.1), we identified the conserved clusters, albeit with evidence of considerable rearrangements (Fig. 8C). In addition, although neither the hps and hpi genes nor this hps-hpi related cluster are found in the three complete genomes of the AOS strains, orthologous genes, gene operons and gene clusters encoding RuMP-related carbon metabolic pathways for F6P generation are well conserved (Table 2). However, as previously noted [84], sequence divergence (35.4% identity) and conditional expression differences between the A. methanolica ATP-dependent 6-phosphofructokinase (ATP-PFK, AMETH_4515) and its isogenic PPi-dependent 6-phosphofructokinase (PPi-PFK, AMETH_4897) imply an HGT event for the acquisition of the ATP-PFK. Meanwhile, genes of this RuMP cluster encoding fructose 1,6-bisphosphatase II (glpX, AMETH_4516 and AMETH_4530) and fructose-bisphosphate aldolase (ALDO, AMETH_4531) are also specific for A. methanolica strain 239T (i.e., <40% identity compared to the corresponding orthologs of all four completely sequenced Amycolatopsis strains, data not shown), thereby supporting the case for HGT.
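The presence and absence of the four C1-utilization genes discussed above (mdo and mscR for methanol oxidation, hps and hpi for formaldehyde assimilation via the RuMP cycle) can be summarized as a small lookup. The sketch below encodes only what the text reports for Fig. 7B; the data structure, the function name c1_profile and the category labels are assumptions made for illustration, and a gene that is not detected in a draft genome may simply be unsequenced.

```python
# Gene presence as reported in the text for Fig. 7B.
# The structure and labels are illustrative assumptions, not the authors' code.
GENE_PRESENCE = {
    "A. methanolica 239T": {"mdo", "mscR", "hps", "hpi"},
    "A. benzoatilytica AK 16/65T": {"mdo", "mscR", "hps", "hpi"},
    "A. thermoflava N1165T": {"mdo", "mscR"},
    "Amycolatopsis sp. ATCC 39116": {"mdo", "mscR"},
    "A. japonica MG417-CF17T": {"mscR"},
    "A. orientalis HCCB10007": {"mscR"},
    "A. mediterranei U32": set(),
    "A. taiwanensis DSM 45107T": set(),
}


def c1_profile(genes):
    """Classify a genome by which C1-utilization gene pairs it carries."""
    has_oxidation = {"mdo", "mscR"} <= genes   # methanol -> formaldehyde -> formate
    has_rump = {"hps", "hpi"} <= genes         # formaldehyde assimilation (RuMP)
    if has_oxidation and has_rump:
        return "candidate methylotroph"
    if has_oxidation:
        return "methanol oxidation genes only"
    return "incomplete gene set"


for strain, genes in GENE_PRESENCE.items():
    print(f"{strain}: {c1_profile(genes)}")
```

Such a classification is a genomic prediction only; the text itself notes that phenotypic verification is still required for A. benzoatilytica.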
Fig. 8

Phylogenetic trees of HPI (A) and HPS (B) protein sequences and comparison of their encoding cluster in A. methanolica strain 239T against those in other strains (C). Maximum-likelihood trees (A and B) showing 6-phospho-3-hexuloisomerase (HPI) and 3-hexulose-6-phosphate synthase (HPS) in relation to the orthologous proteins selected from organisms which have available whole genome sequences and the CDSs for both HPS and HPI. The protein sequences of HPS and HPI from A. methanolica strain 239T are marked by solid triangles. Numbers at the nodes are percentage bootstrap values based on a maximum-likelihood analysis of 1000 resampled datasets. Bar indicates 0.1 substitutions per site. (C) Comparison of chromosomal loci of genes involved in RuMP and related carbon metabolism from A. methanolica strain 239T, A. benzoatilytica AK 16/65T, S. marina XMU15, R. jostii RHA1, Arthrobacter aurescens TC1 and B. subtilis BSn5. The clusters from the A. methanolica, A. benzoatilytica and S. marina strains are conserved. Genes marked in red are specific for the corresponding strains, those in green have corresponding orthologs in most of the actinobacteria, and the hps and hpi genes are marked in black.

Table 2

Comparison of genes involved in the pentose phosphate pathway and glycolysis/gluconeogenesis in Amycolatopsis genomes.

| Pathway | Gene | A. methanolica (in RuMP cluster) | A. methanolica (not in RuMP cluster) | A. mediterranei | A. orientalis | A. japonica |
| Pentose phosphate pathway (complete) | pfk | | AMETH_4897 (PPi-pfk) | AMED_2460 | AORI_2423 | AJAP_26710 |
| | | AMETH_4515 (ATP-pfk) | | | | |
| | tkt | | AMETH_2245, AMETH_2246 | | | |
| | | | | AMED_0651, AMED_0652 | | |
| | | | AMETH_5084, AMETH_5085 | AMED_2262, AMED_2263 | AORI_5659, AORI_5660 | AJAP_11180, AJAP_11185 |
| | | AMETH_4518 | AMETH_4092 | AMED_2809 | AORI_2796 | AJAP_24810 |
| | rpe | AMETH_4520 | AMETH_4130 | AMED_2771 | AORI_2761 | AJAP_25030 |
| | rpi | AMETH_4529 | AMETH_1699 | AMED_6811 | AORI_1928 | AJAP_29895 |
| | | | | | AORI_6417 | AJAP_06835 |
| | | | AMETH_2243, AMETH_6524 | | AORI_7455 | AJAP_38990 |
| | gnd | | AMETH_0006 | AMED_0003 | AORI_0003 | AJAP_00020 |
| | talA | | AMETH_4093 | AMED_2808 | AORI_2795 | AJAP_24815 |
| | pgi | | AMETH_4094 | AMED_2807 | AORI_2794 | AJAP_24820 |
| | zwf | | AMETH_4095 | AMED_2806 | AORI_2793 | AJAP_24825 |
| | pgl | | AMETH_4097 | AMED_2804 | AORI_2791 | AJAP_24835 |
| Glycolysis/gluconeogenesis | glpX | | AMETH_5852 | AMED_8059 | AORI_6881 | AJAP_04695 |
| | | AMETH_4516 | | | | |
| | | AMETH_4530 | | | | |
| | fbaA | | AMETH_6838 | AMED_9110 | AORI_7850 | AJAP_40945 |
| | | | AMETH_4655 | | | |
| | | | AMETH_6121 | AMED_3923 | AORI_5186 | AJAP_13540 |
| | | AMETH_4531 | | | | |

Each line is the corresponding ortholog.

On the other hand, in the same RuMP cluster, certain genes, such as those encoding transketolase (tkt, AMETH_4518), ribulose-phosphate 3-epimerase (rpe, AMETH_4520) and ribose 5-phosphate isomerase (rpi, AMETH_4529), have corresponding isoenzymes with similar functions and high similarities (>60% identity, Table S9) in other parts of the genomes of both A. methanolica strain 239T and the three AOS strains. These genes are highly similar to corresponding orthologs in other strains belonging to the class Actinobacteria (Fig. 8C) and hence are likely to be specific to this taxon and to have been integrated into the hps-hpi RuMP cluster in ancestral strain(s) of the family Pseudonocardiaceae.

Utilization of methanol, formaldehyde and formic acid as sole one-carbon substrates by A. methanolica strain 239T was retested after whole genome annotation and KEGG mapping (see Materials and Methods). After incubation for 14 days, the bacterium grew well in liquid synthetic media (HMM) supplemented with 0.1%, 0.2% or 0.4% methanol as the sole carbon source. In contrast, the strain failed to grow in the same basal media supplemented with the same concentrations of either formaldehyde or formic acid as the sole carbon source (Fig. S9, Fig. S10). Consequently, we hypothesized that A. methanolica strain 239T might not be able to tolerate the toxicity caused by environmental formaldehyde or formic acid even though they are intermediates of methanol metabolism. To test this hypothesis, the non-growing cultures were re-spread on GT plates with starch as the carbon source, followed by incubation at 28 °C for 3 days. Two non-growing cultures, i.e., A. methanolica in HMM media without carbon source additives or with formic acid as the sole carbon source, grew well on the GT plates, while the formaldehyde culture failed to grow (Table S8). These findings indicate that A. methanolica strain 239T is unlikely to tolerate the toxic effect of formaldehyde, while its ability to metabolize formic acid may be too weak to support healthy growth.

Conclusions and perspectives

A. methanolica strain 239T (CP009110) is the first member of the AMS species to be completely sequenced, thereby providing a high quality alternative genetic blueprint for the taxonomically diverse genus Amycolatopsis. Comparative and evolutionary genomic analyses based on this study allowed several major conclusions to be drawn:

1. The genomes of Amycolatopsis species are circular, ranging in size from 7 Mbp to 10 Mbp (see Table S7). In addition to a highly conserved and consistent core region with few inserted sequences extending from oriC in both directions of the A. methanolica genome, certain consecutive segments of the non-core region are generally co-linear with those from other species of the genus. The AMETH_3452 orthologous genes identified in all of the sequenced genomes of actinobacteria were mainly located in the middle of the circular chromosomes opposite to the oriC (Fig. 2), while the so-called “quasi-core” regions are proposed to be part of the core in ancestral actinobacterial species near the replication termini (see Fig. 2, Fig. 3). These regions are likely to be more accessible to multiple HGT and chromosomal rearrangement events and to adaptation under different environmental conditions, thereby ensuring that a few segments of the core genome are relatively stable.

2. In light of the systematic search for molecular markers capable of delineating between the 62 Amycolatopsis species and representatives of 5 other genera classified in the family Pseudonocardiaceae, three major Amycolatopsis subclades (AMS, AOS and ATS) and some minor taxa, such as the J group, were proposed based on phylogenetic analyses of 16S rDNA and AMETH_3452 orthologues (against the 18 Amycolatopsis species with available genomic sequences) as well as on their genome characteristics and growth physiology (Fig. 4, Fig. 6). The constituent species of the subclades share distinctive temperature profiles and some genomic features. Thus, the 10 AMS species are thermophilic with smaller genome sizes, as are the 2 species of group J; the 3 ATS species are mesophilic, while the AOS species encompass 36 mesophilic and 11 moderately thermophilic strains, many of which have a large number of gene clusters that may produce various kinds of specialized metabolites (Fig. 6, Table S7). The separation of the F group into ATS/F1 and AMS/F2 subclades is supported by differences in either the numbers or types of their transporters and biosynthetic genes involved in compatible solutes, which are likely to be an expression of their adaptation to hot saline environments.

3. This genomic study revealed the complete pathways for the utilization of methanol as a sole carbon source in A. methanolica strain 239T, including both mdo and mscR encoded methanol oxidation and hps and hpi encoded formaldehyde assimilation via the RuMP cycle (Fig. 7). Although phenotypic verification is still required, it seems likely that A. benzoatilytica AK 16/65T, a member of the AOS subclade, may prove to be a facultative methylotroph, as all four of the essential genes required for methanol utilization are encoded in its partially sequenced genome (ARPK00000000.1) (Fig. 7B). However, methylotrophy is unlikely to be a common trait within the genus Amycolatopsis, as none of the other available Amycolatopsis genomes have the complete set of the four genes required for methanol utilization. Indeed, the RuMP cluster including hps and hpi defined in the genomes of the A. methanolica, A. benzoatilytica and S. marina strains seems to have been acquired by HGT, an event that probably occurred in ancestral strain(s) of the family Pseudonocardiaceae (Fig. 7, Fig. 8).

4. It was also shown that A. methanolica strain 239T is potentially able to produce 12 different kinds of specialized metabolites, though not polyketides (Fig. 1B). These gene clusters are highly conserved among the three AMS genomes, while, except for oth and ectABC, none of them are found in the three completely sequenced and well annotated representative genomes of the AOS species.

In summary, the complete sequencing, de novo assembly and comparative annotation of the A. methanolica genome greatly improve our systematic understanding of the genus Amycolatopsis with respect to the spectrum of genomic structures and configurations, genus-focused phylogeny and sub-generic classification, as well as to its primary and specialized metabolism. It is, once again, evident that systems biology guided genomic analyses based on the complete genetic information of strains representing diverse prokaryotic species are the key to revitalizing and even to revolutionizing the taxonomy of microorganisms in the genomic era [85].

Materials and methods

Genome sequencing and assembly: Amycolatopsis methanolica strain 239T was sequenced using a whole genome shotgun strategy with the Roche 454 GS FLX Titanium System [86]. A total of 64 contigs (length ≥ 500 bp) with a total size of 7.23 Mb were assembled from 561,423 reads (average length 430 bp) by the Newbler V2.3 program of the 454 suite package, providing 28-fold coverage of the whole genome. The relationships between the contigs were determined by using the ContigScape plugin [87] and by reference to the A. mediterranei U32 genome. The gaps were filled by sequencing PCR products. The final sequence assembly was performed using the phred/phrap/consed package [88], [89], [90]. Sanger-based sequencing was employed to facilitate gap closure and to amend low-quality regions (score < 40). The final consensus sequence contains 7,237,391 bp with an estimated error rate of 0.8 per 100,000 bases. The complete sequence of the chromosome was deposited in the GenBank database under accession number CP009110.

Genome annotation and analysis: Putative protein-coding sequences were predicted based on results from Glimmer 3.02 [91], Genemark [92] and Z-Curve [93]. The CDS annotation was based on the BLASTP results obtained with the CDD and the databases of KEGG [94] and NR; manual corrections were also implemented. The tRNA genes were predicted directly with tRNAscan-SE v1.23 [95]. The genome-wide colinearity analysis was performed using the MUMmer 3.22 package [96]. The gene clusters for specialized metabolism were predicted using antiSMASH 3 [61]. The critical genes that determined the existence, conformation, and chain lengths of compounds were further elucidated through literature consultation, sequence alignment, domain comparison, and/or phylogenetic analysis. All of the BLASTP analyses used a threshold E value of 1e-5, and protein length diversity was not less than 0.5. 
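As a quick arithmetic check on the sequencing figures reported above, the raw read depth and the expected number of residual errors can be estimated directly. The variable names below are illustrative. The raw-read estimate comes out at roughly 33-fold, which is an upper bound because it counts every read at its average length; the reported 28-fold coverage is lower, as would be expected if it reflects only bases incorporated into the assembly, although the source does not state how it was calculated.

```python
# Figures taken from the text; variable names are illustrative.
READS = 561_423            # number of 454 reads
MEAN_READ_LEN = 430        # average read length (bp)
GENOME_LEN = 7_237_391     # final consensus length (bp)
ERRORS_PER_100KB = 0.8     # estimated error rate per 100,000 bases

# Upper-bound depth if every read contributed its full average length.
raw_depth = READS * MEAN_READ_LEN / GENOME_LEN

# Expected number of erroneous bases across the whole chromosome.
expected_errors = ERRORS_PER_100KB * GENOME_LEN / 100_000

print(f"raw read depth: {raw_depth:.1f}x")
print(f"expected erroneous bases: {expected_errors:.0f}")
```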
Potential specialized metabolite gene clusters in all 18 Amycolatopsis genomes were predicted using antiSMASH 3, and RiPPs were predicted using BAGEL3 [62].

Circular map, comparative analysis and assignment of common orthologues: Genome circular maps were generated using GenomeViz [97] and DNAplotter [98]. Comparative analyses of genomes from strains classified in the family Pseudonocardiaceae were performed using Mauve 2.3.1 genome alignment software [99]; the genoPlotR package [100] was used in post-processing. Orthologous analysis data were submitted to the MBGD platform, a database for comparative analysis of complete microbial genomes [101], with default parameters (http://mbgd.genome.ad.jp). Further, all of the annotated proteins of the 123 genomes belonging to the phylum Actinobacteria, including the genera Acidimicrobium, Conexibacter and Rubrobacter, which were assigned to independent classes in the phylum in the most recent edition of Bergey's Manual of Systematic Bacteriology (Fig. S7), were downloaded from ftp.ncbi.nlm.nih.gov/genomes/Bacteria and clustered using the BLASTCLUST [102] program with a minimum of 30% identity and 70% length coverage. After a further strict screen, 95 conserved and single-copy orthologous genes were selected; neither gyrB nor recN, both of which have been used in phylogenetic analyses of Amycolatopsis species [16], [17], was included in the list shown in Table S6. These genes were not considered because there are two gyrB genes in Gardnerella vaginalis ATCC 14019 (NC_014644; HMPREF0421_20130 and HMPREF0421_20607), while the recN gene of Mycobacterium smegmatis mc2 155 (MSMEG_3749) contains a frameshift mutation within its coding sequence that is not the result of a sequencing error (CP000480).
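The selection of conserved, single-copy orthologues can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: proteins are grouped by single-linkage over pairwise hits that meet the 30% identity and 70% coverage thresholds, and a cluster is kept only if it contains exactly one protein from every genome. The "further strict screen" is not described in the text, so that criterion, the function names and the toy data are hypothetical.

```python
from collections import defaultdict

MIN_IDENTITY = 30.0   # percent identity threshold from the text
MIN_COVERAGE = 0.70   # length coverage threshold from the text


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def single_linkage(proteins, hits):
    """Group proteins linked by at least one hit passing both thresholds."""
    parent = {p: p for p in proteins}
    for a, b, identity, coverage in hits:
        if identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE:
            root_a, root_b = find(parent, a), find(parent, b)
            if root_a != root_b:
                parent[root_b] = root_a
    clusters = defaultdict(set)
    for p in proteins:
        clusters[find(parent, p)].add(p)
    return list(clusters.values())


def single_copy_core(clusters, genome_of, genomes):
    """Keep clusters with exactly one member from every genome (ASSUMED screen)."""
    core = []
    for cluster in clusters:
        counts = defaultdict(int)
        for protein in cluster:
            counts[genome_of[protein]] += 1
        if set(counts) == set(genomes) and all(n == 1 for n in counts.values()):
            core.append(cluster)
    return core


# Toy data: three genomes; family "b" is duplicated in g2, family "c" is absent from g3.
proteins = ["g1_a", "g2_a", "g3_a", "g1_b", "g2_b", "g2_b2", "g3_b", "g1_c", "g2_c"]
genome_of = {p: p.split("_")[0] for p in proteins}
hits = [
    ("g1_a", "g2_a", 60.0, 0.90), ("g2_a", "g3_a", 55.0, 0.80),
    ("g1_b", "g2_b", 70.0, 0.95), ("g2_b", "g2_b2", 65.0, 0.90),
    ("g2_b2", "g3_b", 50.0, 0.85), ("g1_c", "g2_c", 80.0, 0.90),
    ("g1_a", "g1_c", 25.0, 0.90),  # below the identity threshold, ignored
]
clusters = single_linkage(proteins, hits)
core = single_copy_core(clusters, genome_of, ["g1", "g2", "g3"])
print(core)
```

In this toy set, family "b" is rejected for the same kind of reason given for gyrB (two copies in one genome). The recN exclusion rests on a frameshift, which a copy-number screen like this one would not detect.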

Phylogenetic analyses

Phylogenetic analysis based on 16S rDNA of 62 Amycolatopsis type strains

The 16S rRNA gene sequences of the 62 Amycolatopsis type strains were obtained from the EzTaxon server. The MUSCLE algorithm [103] from MEGA5 [104] was used for sequence alignments. The neighbour-joining (NJ) [105] method in the MEGA5 software was used to construct the phylogenetic trees with S. viridis DSM 43017T as an outgroup. The topology of the phylogenetic trees was evaluated using the bootstrap resampling method with 1000 repeats from the MEGA5 package.
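The bootstrap step resamples alignment columns with replacement and asks how often a grouping recurs. The sketch below shows only that resampling idea. It does not build a neighbour-joining tree and is not what MEGA5 does internally: as a simple stand-in for a clade, it records how often the closest pair of sequences (by p-distance) in the original alignment is also the closest pair in each replicate. The toy alignment and all function names are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of aligned sites that differ between two sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_pair(alignment):
    """Return the pair of taxa with the smallest p-distance."""
    names = sorted(alignment)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return min(pairs, key=lambda p: p_distance(alignment[p[0]], alignment[p[1]]))


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (same columns for all taxa)."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_support(alignment, replicates=1000, seed=0):
    """Fraction of replicates that recover the original closest pair."""
    rng = random.Random(seed)
    reference = closest_pair(alignment)
    agree = sum(
        closest_pair(bootstrap_alignment(alignment, rng)) == reference
        for _ in range(replicates)
    )
    return reference, agree / replicates


# Made-up ten-site alignment for illustration only.
toy_alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTAT",
    "taxon_C": "ACGAACCTTT",
    "outgroup": "TCGAAGCTTA",
}
reference, support = bootstrap_support(toy_alignment, replicates=1000, seed=0)
print(reference, support)
```

In a real analysis, each replicate alignment would be passed to the tree-building method, and support would be counted per branch rather than per closest pair.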

Phylogenetic tree of 13 AOS, 1 ATS and 3 AMS and group J based on gyrB, 16S rDNA-gyrB, recN, 16S rDNA-recN gene sequences

Concatenated 16S rDNA-gyrB and 16S rDNA-recN gene sequences were generated by joining individual gyrB or recN gene sequences to the end of corresponding 16S rDNA sequences. The gyrB and recN gene sequences were full-length and were extracted from the genomes. The phylogenetic tree was constructed as described above.
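The head-to-tail joining described above amounts to appending each strain's second gene to the end of its 16S rDNA sequence. A minimal sketch, with made-up strain names and sequences and a hypothetical `concatenate` helper, might look like this:

```python
def concatenate(first, second):
    """Join second[strain] to the end of first[strain] for strains present in both."""
    shared = sorted(set(first) & set(second))
    return {strain: first[strain] + second[strain] for strain in shared}


# Toy sequences for illustration only; real inputs would be full-length genes.
rrna_16s = {"strain_1": "ACGT", "strain_2": "ACGA", "strain_3": "ACGG"}
gyrb = {"strain_1": "TTGA", "strain_2": "TTGC"}

joined = concatenate(rrna_16s, gyrb)
print(joined)
```

Strains lacking either gene are dropped here rather than padded; that is a design choice of this sketch and is not stated in the text.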

Phylogenetic trees of 13 AOS, 1 ATS and 3 AMS and group J based on head-tail linked 95 core protein sequences

Phylogenetic trees were based on the 95 orthologous protein sequences (Table S6). Each orthologous protein was aligned using MUSCLE 3.5 [103]. All of the alignment files were joined in series and the result was processed with Gblocks 0.9b [106]. The alignment format was converted using Clustal X2.1 [107]. The phylogenetic trees were constructed using the maximum-likelihood (ML) method from the RAxML package [108], and the reliability of each branch was tested by 1000 bootstrap replications. Finally, the tree file was processed using the MEGA5 package [104]. The accession numbers of the genomes used in this tree are JNYY00000000, JMQI00000000, ARBH00000000, CP002000, ARPK00000000, JNYZ00000000, ARAF00000000, ANMG00000000, CP007219, AOHO00000000, CP008953, CP003410, JAFB00000000, AFWY00000000, CP009110, AXBH00000000, CP001683, AM420293, CP001630 and CP002593.

Experimental analysis for one-carbon source utilization by A. methanolica: The A. methanolica strain was grown on a GT plate (20 g/L starch, 0.5 g/L L-asparagine, 1 g/L KNO3, 0.5 g/L K2HPO4·3H2O, 0.5 g/L NaCl, 0.5 g/L MgSO4·7H2O, 1 g/L CaCO3, and 2 g/L agar, pH 7.5) at 28 °C for 3 days. A single colony from the plate was inoculated into 40 mL of ISP2 broth (0.4% yeast extract, 1% malt extract, and 0.4% glucose) and incubated at 28 °C for 4 days. HMM medium was prepared by mixing (1) 100 mL of phosphate salts solution (25.3 g/L K2HPO4 and 22.5 g/L Na2HPO4), (2) 100 mL of sulfate salts solution (5 g/L (NH4)2SO4 and 2 g/L MgSO4·7H2O), (3) 799 mL of deionized water, and (4) 1 mL of trace metal solution (0.177 g/L ZnSO4·7H2O, 1.466 g/L CaCl2·2H2O, 0.107 g/L MnCl2·4H2O, 2.496 g/L FeSO4·7H2O, 0.177 g/L (NH4)6Mo7O24·4H2O, 0.374 g/L CuSO4·5H2O, 0.238 g/L CoCl2·6H2O and 0.1 g/L Na2WO4·2H2O). To test for the survival of the strain, 2 mL of the seed culture was transferred to Eppendorf tubes, washed twice in HMM medium (with centrifugation before each resuspension) and then resuspended in 1 mL of HMM.
Next, 40 μL of the resultant suspension was inoculated into 15 mL of HMM broth supplemented with methanol, formaldehyde or formic acid (pH adjusted to 7.5 before use) as the sole carbon source. Each carbon source was tested at three concentrations, namely 0.1%, 0.2% and 0.4%, and HMM without any additives was used as the control. Because methanol is volatile, fresh methanol was added every two days. Optical density at 600 nm (OD600) was measured every second day. After 14 days of incubation, 10 μL of each broth was spread over GT plates to evaluate the survival of A. methanolica strain 239T in HMM broth supplemented with methanol, formaldehyde or formic acid as the sole carbon source, and in the HMM broth without any additives that served as the control.
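The medium recipe and inoculation volumes lend themselves to a short worked calculation. The sketch below derives final salt concentrations in HMM from the stock volumes given in the text (100 + 100 + 799 + 1 mL), the inoculum dilution, and the volume of carbon source per 15 mL of broth. The text does not say whether the percentages are v/v or w/v, so v/v is assumed for illustration. Only a few salts are listed, and the function names are hypothetical.

```python
# Stock volumes (mL) and stock concentrations (g/L) taken from the recipe in the text.
STOCKS = {
    "phosphate salts": (100, {"K2HPO4": 25.3, "Na2HPO4": 22.5}),
    "sulfate salts": (100, {"(NH4)2SO4": 5.0, "MgSO4·7H2O": 2.0}),
    "trace metals": (1, {"FeSO4·7H2O": 2.496, "CaCl2·2H2O": 1.466}),
}
WATER_ML = 799

# Total volume of the mixed medium: 100 + 100 + 1 + 799 mL.
total_ml = WATER_ML + sum(volume for volume, _ in STOCKS.values())


def final_g_per_l(stock_g_per_l, stock_ml, total_volume_ml):
    """Concentration after diluting stock_ml of stock into total_volume_ml."""
    return stock_g_per_l * stock_ml / total_volume_ml


final = {
    salt: final_g_per_l(conc, volume, total_ml)
    for volume, salts in STOCKS.values()
    for salt, conc in salts.items()
}


def carbon_volume_ul(percent, broth_ml=15.0):
    """Microlitres of carbon source for a given percentage, ASSUMING % v/v."""
    return percent / 100 * broth_ml * 1000


# 40 µL of washed cell suspension into 15 mL (15,000 µL) of broth.
inoculum_dilution = 15_000 / 40

print(total_ml, final, inoculum_dilution)
print([carbon_volume_ul(p) for p in (0.1, 0.2, 0.4)])
```

Under the v/v assumption, the three tested concentrations correspond to roughly 15, 30 and 60 µL of carbon source per 15 mL of broth. A w/v reading would instead require converting through each compound's density or stock concentration.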

Statistical analysis

The boxplot chart and scatter diagram were prepared using R 3.2.1 (i386 build).

Accession numbers

The genome sequence has been deposited at NCBI under the GenBank accession number CP009110.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

MG proposed the research project and GPZ conceived and designed the research strategy. BT and JW were responsible for sequencing, finishing and the annotations. BT performed and contributed to the annotation data processing and analysis. BT, FX, WZ, SWD, HCL, XMD, YCY, XFC, YZ and HKZ carried out the experiments and data analyses. BT, FX, MG and WZ were involved in drafting the manuscript. GPZ and LXZ supervised the experimental work and participated in preparing the manuscript. MG revised the article. All authors read and approved the final manuscript.