José L Steffani-Vallejo1, Marion E Brunck2, Erika Y Acosta-Cruz1,3,4, Rafael Montiel3, Francisco Barona-Gómez1.
Abstract
Mycobacterium simiae (Karassova V, Weissfeiler J, Kraszanay E, Acta Microbiol Acad Sci Hung 12:275-82, 1965) is a slow-growing nontuberculous Mycobacterium species found in environmental niches and recently recognized as an opportunistic human pathogen. We report here the genome of a clinical isolate of M. simiae (MsiGto) obtained from a patient in Guanajuato, Mexico. With a size of 6,684,413 bp, the genomic sequence of strain MsiGto is the largest of the three M. simiae genomes reported to date. Gene prediction revealed 6409 CDSs in total, including 6354 protein-coding genes and 52 RNA genes. Comparative genomic analysis identified features shared between strain MsiGto and the other two reported M. simiae genomes, as well as unique genes. Our data reveal that M. simiae MsiGto harbors virulence-related genes, such as arcD, ESAT-6, and those belonging to the antigen 85 complex and mce clusters, which may explain its successful transition to the human host. We expect that the genome information of strain MsiGto will provide a better understanding of the infective mechanisms and virulence of this emergent pathogen.
Keywords: Mycobacterium simiae; Nontuberculous mycobacteria; Opportunistic pathogen
Year: 2018 PMID: 29340007 PMCID: PMC5759803 DOI: 10.1186/s40793-017-0291-x
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Fig. 1 Phylogenetic tree showing the relationship of MsiGto with selected species members of the M. simiae complex, including 2 relevant strains. Phylogenetic reconstruction was obtained using Bayesian inference. Numbers at the nodes are posterior probabilities. The tree was obtained after 1 million generations with a mixed model. Sequence data from M. tuberculosis H37Rv were used as an outgroup
Average Nucleotide Identity (ANI) and Average Amino acid Identity (AAI) between M. simiae MsiGto and the other M. simiae strains sequenced to date. Each cell gives ANI/AAI in %
| Organism | M. simiae MsiGto | M. simiae DSM 44165 | M. simiae MO323 |
|---|---|---|---|
| M. simiae MsiGto | – | 97.25/97.51 | 98.99/98.95 |
| M. simiae DSM 44165 | 97.25/97.51 | – | 97.26/97.59 |
| M. simiae MO323 | 98.99/98.95 | 97.26/97.59 | – |
Classification and general features of Mycobacterium simiae MsiGto [51]
| MIGS ID | Property | Term | Evidence code^a |
|---|---|---|---|
| | Classification | Domain | TAS |
| | | Phylum | TAS |
| | | Class | TAS |
| | | Order | TAS |
| | | Family | |
| | | Genus | TAS |
| | | Species | IDA |
| | Gram stain | Weakly positive | IDA |
| | Cell shape | Irregular rods | IDA |
| | Motility | Non-motile | IDA |
| | Sporulation | Nonsporulating | NAS |
| | Temperature range | Mesophile | NAS |
| | Optimum temperature | 37 °C | NAS |
| | pH range; Optimum | 5.5–8; 7 | IDA |
| | Carbon source | Starch | IDA |
| MIGS-6 | Habitat | Human associated | NAS |
| MIGS-6.3 | Salinity | Normal | NAS |
| MIGS-22 | Oxygen requirement | Aerobic | NAS |
| MIGS-15 | Biotic relationship | Parasitic | IDA |
| MIGS-14 | Pathogenicity | Pathogenic | NAS |
| MIGS-4 | Geographic location | Mexico/Guanajuato | NAS |
| MIGS-5 | Sample collection | 2014 | NAS |
| MIGS-4.1 | Latitude | Not reported | NAS |
| MIGS-4.2 | Longitude | Not reported | NAS |
| MIGS-4.4 | Altitude | Not reported | NAS |
^a Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [60]
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Draft |
| MIGS-28 | Libraries used | Paired-end Illumina |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2000 |
| MIGS-31.2 | Fold coverage | 216 |
| MIGS-30 | Assemblers | SOAPdenovo |
| MIGS-32 | Gene calling method | RAST |
| | Locus tag | B5M45 |
| | GenBank ID | |
| | GenBank date of release | April 17, 2017 |
| | GOLD ID | Ga0183212 |
| | BioProject | |
| MIGS-13 | Source material identifier | |
| | Project relevance | Medical, Evolutionary |
Summary of genome: one chromosome, no plasmids
| Label | Size (bp) | Topology | INSDC identifier | RefSeq ID |
|---|---|---|---|---|
| Chromosome | 6,684,413 | Circular | GenBank | |
Genome statistics
| Attribute | Value | % of Total |
|---|---|---|
| Genome size (bp) | 6,684,413 | 100 |
| DNA coding (bp) | 5,978,008 | 89.43 |
| DNA G + C (bp) | 4,416,391 | 66.07 |
| DNA scaffolds | 15 | 100 |
| Total genes | 6369 | 100 |
| Protein coding genes | 6299 | 98.90 |
| RNA genes | 70 | 1.10 |
| Pseudo genes | 160 | 2.51 |
| Genes in internal clusters | 579 | 9.09 |
| Genes with function prediction | 4713 | 74.00 |
| Genes assigned to COGs | 5272 | 82.29 |
| Genes with Pfam domains | 5009 | 78.19 |
| Genes with signal peptides | 260 | 4.08 |
| Genes with transmembrane helices | 1292 | 20.29 |
| CRISPR repeats | 15 | |
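The percentages in the base-pair rows follow directly from the counts. The snippet below is a minimal Python sketch of that arithmetic, not part of the original report. The `percent` helper and the variable names are illustrative assumptions; only the three base-pair counts are taken from the table.

```python
# Base-pair counts copied from the genome statistics table.
GENOME_SIZE_BP = 6_684_413
CODING_BP = 5_978_008
GC_BP = 4_416_391


def percent(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`, rounded to two decimals."""
    return round(100 * part / total, 2)


# Share of the genome covered by coding sequence.
coding_density = percent(CODING_BP, GENOME_SIZE_BP)

# Share of G and C bases in the genome.
gc_content = percent(GC_BP, GENOME_SIZE_BP)

print(f"Coding density: {coding_density}%")
print(f"G + C content:  {gc_content}%")
```

Both results agree with the values listed for "DNA coding (bp)" and "DNA G + C (bp)".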
Number of genes associated with general COG functional categories
| Code | Value | % of total^a | Description |
|---|---|---|---|
| J | 161 | 2.53 | Translation, ribosomal structure and biogenesis |
| A | 21 | 0.33 | RNA processing and modification |
| K | 448 | 7.05 | Transcription |
| L | 192 | 3.02 | Replication, recombination and repair |
| B | 1 | 0.01 | Chromatin structure and dynamics |
| D | 55 | 0.86 | Cell cycle control, Cell division, chromosome partitioning |
| V | 47 | 0.73 | Defense mechanisms |
| T | 218 | 3.43 | Signal transduction mechanisms |
| M | 159 | 2.50 | Cell wall/membrane biogenesis |
| N | 57 | 0.89 | Cell motility |
| U | 36 | 0.56 | Intracellular trafficking and secretion |
| O | 153 | 2.41 | Posttranslational modification, protein turnover, chaperones |
| C | 447 | 7.03 | Energy production and conversion |
| G | 243 | 3.82 | Carbohydrate transport and metabolism |
| E | 298 | 4.69 | Amino acid transport and metabolism |
| F | 82 | 1.29 | Nucleotide transport and metabolism |
| H | 210 | 3.30 | Coenzyme transport and metabolism |
| I | 558 | 8.78 | Lipid transport and metabolism |
| P | 239 | 3.76 | Inorganic ion transport and metabolism |
| Q | 494 | 7.77 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 795 | 12.51 | General function prediction only |
| S | 358 | 5.63 | Function unknown |
| – | 1082 | 17.03 | Not in COGs |
^a Percentages are based on the total number of protein-coding genes in the genome
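The COG percentages can be checked against the counts in the same way. The following Python sketch is an illustration only and is not taken from the original analysis. It assumes that the denominator is the sum of all rows (each protein-coding gene counted in exactly one row), and the names `cog_table`, `recomputed`, and `largest_gap` are hypothetical.

```python
# (gene count, percentage as listed) for each COG code, copied from the table.
# The "-" entry is the "Not in COGs" row.
cog_table = {
    "J": (161, 2.53), "A": (21, 0.33), "K": (448, 7.05), "L": (192, 3.02),
    "B": (1, 0.01), "D": (55, 0.86), "V": (47, 0.73), "T": (218, 3.43),
    "M": (159, 2.50), "N": (57, 0.89), "U": (36, 0.56), "O": (153, 2.41),
    "C": (447, 7.03), "G": (243, 3.82), "E": (298, 4.69), "F": (82, 1.29),
    "H": (210, 3.30), "I": (558, 8.78), "P": (239, 3.76), "Q": (494, 7.77),
    "R": (795, 12.51), "S": (358, 5.63), "-": (1082, 17.03),
}

# Assumption: every protein-coding gene falls in exactly one row,
# so the denominator is simply the sum of the counts.
total = sum(count for count, _ in cog_table.values())
assigned_to_cogs = total - cog_table["-"][0]

# Recompute each percentage from its count.
recomputed = {
    code: round(100 * count / total, 2)
    for code, (count, _) in cog_table.items()
}

# Largest difference between a recomputed and a listed percentage.
largest_gap = max(
    abs(recomputed[code] - listed) for code, (_, listed) in cog_table.items()
)

print(f"Total genes across rows: {total}")
print(f"Genes assigned to COGs:  {assigned_to_cogs}")
print(f"Largest gap vs. table:   {largest_gap:.2f}")
```

Under this assumption the rows sum to 6354 genes, the number of protein-coding genes given in the abstract, and to 5272 once the "Not in COGs" row is excluded, the number of genes assigned to COGs in the genome statistics. The recomputed percentages differ from the listed ones by at most about 0.01, which is consistent with rounding.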
Fig. 2 A graphical circular map of the MsiGto genome keyed to the COG functional categories. The circular map was generated using the BASys web server [61]
Fig. 3 Venn diagram showing the number of unique and shared protein families, as identified using PATRIC, among the M. simiae strains MO323, DSM 44165 and MsiGto