Li Xiao, Travis Ptacek, John D Osborne, Donna M Crabb, Warren L Simmons, Elliot J Lefkowitz, Ken B Waites, T Prescott Atkinson, Kevin Dybvig.
Abstract
BACKGROUND: Mycoplasma pneumoniae is a common pathogen that causes upper and lower respiratory tract infections in people of all ages and is responsible for up to 40% of community-acquired pneumonias. It also causes a wide array of extrapulmonary infections and autoimmune phenomena. Phylogenetic studies of the organism have generally been restricted to specific genes or regions of the genome, because whole genome sequencing has been completed for only 4 strains. To better understand the physiology and pathogenicity of this important human pathogen, we performed comparative genomic analysis of 15 strains of M. pneumoniae that were isolated between the 1940s and 2009 from respiratory specimens and cerebrospinal fluid originating from the USA, China and England.
Year: 2015 PMID: 26275904 PMCID: PMC4537597 DOI: 10.1186/s12864-015-1801-0
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of the sequenced strains
| Type | Strain | Description | Isolation source | Date collected | Site of origin |
|---|---|---|---|---|---|
| Type 1 | M129 | ATCC 29342 (Reference Strain) | Patient with pneumonia | 1968 | USA/NC |
| Type 1 | 142.8 | ATCC 29085 | Throat | 1960 | USA/MD |
| Type 1 | 51494 | Clinical isolate | Cerebrospinal fluid | 2006 | USA/CO |
| Type 1 | 54089 | Macrolide-resistant isolate | Throat | 2009 | USA/AL |
| Type 1 | 54524 | Low-passage isolate | Throat | 2009 | USA/AL |
| Type 1 | 85084 | | Respiratory specimen | Prior to 1985 | China |
| Type 1 | 85138 | | Respiratory specimen | Prior to 1985 | China |
| Type 2 | FH | ATCC 15531 (Reference Strain) | Sputum | 1954 | USA/MA |
| Type 2 | 19294 | Low-passage isolate | Throat | 1994 | USA/OH |
| Type 2 | 39443 | Low-passage isolate | Throat | 1999 | USA/AL |
| Type 2 | M1139 | | Respiratory specimen | 1981 | England |
| Type 2 | M2192 | | Respiratory specimen | 1982 | England |
| Type 2 | M2592 | | Respiratory specimen | 1982 | England |
| Type 2 | MAC | ATCC 15492 | Human lung tissue | 1944 | USA/CA |
| Type 2 | UAB PO1 | Low-passage isolate | Throat | 1980 | USA/AL |
Characteristics of the completed genome assemblies
| Type | Strain | Accession | Length (bp) | GC content (fraction) | CDS | rRNA | tRNA | Total genes |
|---|---|---|---|---|---|---|---|---|
| Type 1 | M129 | CP003913 | 816451 | 0.40038 | 790 | 6 | 74 | 870 |
| Type 1 | 142.8 | CP010538 | 816496 | 0.40011 | 790 | 6 | 74 | 870 |
| Type 1 | 51494 | CP010541 | 816404 | 0.40005 | 781 | 6 | 74 | 861 |
| Type 1 | 54089 | CP010542 | 816565 | 0.40010 | 784 | 6 | 74 | 864 |
| Type 1 | 54524 | CP010543 | 816583 | 0.40009 | 780 | 6 | 74 | 860 |
| Type 1 | 85084 | CP010544 | 816404 | 0.40011 | 788 | 6 | 74 | 868 |
| Type 1 | 85138 | CP010545 | 816402 | 0.40011 | 788 | 6 | 74 | 868 |
| Type 2 | FH | CP010546 | 817207 | 0.39981 | 786 | 6 | 72 | 864 |
| Type 2 | 19294 | CP010539 | 818633 | 0.40001 | 780 | 6 | 72 | 858 |
| Type 2 | 39443 | CP010540 | 817184 | 0.39979 | 782 | 6 | 72 | 860 |
| Type 2 | M1139 | CP010547 | 817045 | 0.39977 | 789 | 6 | 72 | 867 |
| Type 2 | M2192 | CP010548 | 817169 | 0.39978 | 783 | 6 | 72 | 861 |
| Type 2 | M2592 | CP010549 | 817198 | 0.39981 | 783 | 6 | 72 | 861 |
| Type 2 | MAC | CP010550 | 817156 | 0.39983 | 788 | 6 | 72 | 866 |
| Type 2 | PO1 | CP010551 | 817216 | 0.39982 | 788 | 6 | 72 | 866 |
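The assembly table can be summarized by strain type with a few lines of Python. The following is a minimal sketch, not part of the original analysis: the values are transcribed from the table above, while the names `ASSEMBLIES`, `totals_consistent` and `summarize` are illustrative. It checks that each gene total equals CDS + rRNA + tRNA, then reports mean genome length, mean GC content (converted from a fraction to a percentage) and the tRNA counts seen within each type.

```python
from statistics import mean

# (type, strain, length_bp, gc_fraction, cds, rrna, trna, total_genes)
# Values transcribed from the assembly table; names are illustrative.
ASSEMBLIES = [
    ("Type 1", "M129", 816451, 0.40038, 790, 6, 74, 870),
    ("Type 1", "142.8", 816496, 0.40011, 790, 6, 74, 870),
    ("Type 1", "51494", 816404, 0.40005, 781, 6, 74, 861),
    ("Type 1", "54089", 816565, 0.40010, 784, 6, 74, 864),
    ("Type 1", "54524", 816583, 0.40009, 780, 6, 74, 860),
    ("Type 1", "85084", 816404, 0.40011, 788, 6, 74, 868),
    ("Type 1", "85138", 816402, 0.40011, 788, 6, 74, 868),
    ("Type 2", "FH", 817207, 0.39981, 786, 6, 72, 864),
    ("Type 2", "19294", 818633, 0.40001, 780, 6, 72, 858),
    ("Type 2", "39443", 817184, 0.39979, 782, 6, 72, 860),
    ("Type 2", "M1139", 817045, 0.39977, 789, 6, 72, 867),
    ("Type 2", "M2192", 817169, 0.39978, 783, 6, 72, 861),
    ("Type 2", "M2592", 817198, 0.39981, 783, 6, 72, 861),
    ("Type 2", "MAC", 817156, 0.39983, 788, 6, 72, 866),
    ("Type 2", "PO1", 817216, 0.39982, 788, 6, 72, 866),
]


def totals_consistent(rows):
    """True if CDS + rRNA + tRNA equals the reported gene total in every row."""
    return all(cds + rrna + trna == total
               for *_, cds, rrna, trna, total in rows)


def summarize(rows, strain_type):
    """Mean length, mean GC percentage and distinct tRNA counts for one type."""
    subset = [r for r in rows if r[0] == strain_type]
    return {
        "n": len(subset),
        "mean_length_bp": mean(r[2] for r in subset),
        "mean_gc_percent": 100 * mean(r[3] for r in subset),
        "trna_counts": sorted({r[6] for r in subset}),
    }


if __name__ == "__main__":
    print("totals consistent:", totals_consistent(ASSEMBLIES))
    for strain_type in ("Type 1", "Type 2"):
        print(strain_type, summarize(ASSEMBLIES, strain_type))
```

On the tabulated values, the type 2 assemblies are on average somewhat longer than the type 1 assemblies, and every type 2 strain carries 72 annotated tRNA genes against 74 in type 1.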
Fig. 1 Overall sequence identity of the 15 sequenced strains with the reference M129 genome. BLAST-based similarity of a given strain versus the M129 reference is represented as a colored ring. Colors by strain are indicated to the right. Solid coloration indicates >99 % identity and transparent grey indicates approximately 95 % identity. Location in the reference genome is indicated by numeration on the inside of the ring. GC content in the reference genome is indicated by the black bar graphs between the genomic coordinates and the colored rings (bars pointing toward the outside of the circle indicate high GC content). Note that genomic structural alterations are not visible using this method.
Fig. 2 Whole genome alignment of the 15 sequenced strains using MAUVE. Regions colored in mauve are conserved across all strains. Differently colored blocks are conserved in some strains. Blocks that are lower are inverted relative to the other strains. Open boxes indicate the location of genes. tRNA genes are shaded in green and rRNA genes are shaded in red. Genes affected by the indicated variants are labeled. Numbers above intervals indicate locations relative to the M129 strain. a Alignment showing all 15 strains. b Close-up of the type 2-specific insertion. M129 and FH are shown and are typical of the other type 1 and 2 strains, respectively. Lines indicate relative point of insertion. c Close-up of the type 1-specific insertion. M129 and FH are shown and are typical of the other type 1 and 2 strains, respectively. Lines indicate relative point of insertion.
Fig. 3 Phylogenetic tree based on whole genome alignment of the 15 sequenced strains. The 15 sequenced M. pneumoniae strains and M. hominis (included as an outgroup) were aligned, and a tree was generated using the alignment. Confidence values, represented as the percentage of supporting bootstrap iterations, are shown for each node. Scale, in differences per site, is indicated at the bottom. The branches between M. hominis and the M. pneumoniae strains have been truncated (indicated by double slashes), and the branch length (in differences per site) is indicated above the branch. The inset shows the same tree (rescaled, note the new scale bar) without any branches truncated.
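To make the "differences per site" scale concrete, the sketch below computes the simplest such measure, the uncorrected proportion of mismatched columns (p-distance) between two aligned sequences. This is an assumption for illustration only: the caption does not say which distance or substitution model the tree-building software used, and both the function name `p_distance` and the toy sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped, so the result is
    differences / compared sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not taken from the genomes).
    strain_x = "ACGTACGTAC-GT"
    strain_y = "ACGTACGAACTGT"
    print(p_distance(strain_x, strain_y))
```

A branch length of 0.001 on such a scale would correspond to roughly one difference per 1000 aligned sites, although model-corrected distances generally exceed the raw proportion for more divergent sequences.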
Variants relative to the M129 reference strain
| Type | Strain | Total variants | Non-synonymous genic SNPs: coding | Non-synonymous genic SNPs: start lost | Non-synonymous genic SNPs: stop gain | Non-synonymous genic SNPs: stop lost | Synonymous genic SNPs | Genic indels: in frame | Genic indels: frameshift | Intergenic SNPs and indels |
|---|---|---|---|---|---|---|---|---|---|---|
| Type 1 | M129 | 235 | 113 | 0 | 3 | 0 | 41 | 10 | 36 | 32 |
| Type 1 | 142.8 | 334 | 174 | 1 | 2 | 1 | 67 | 4 | 22 | 63 |
| Type 1 | 51494 | 382 | 202 | 1 | 3 | 1 | 77 | 7 | 24 | 67 |
| Type 1 | 54089 | 431 | 223 | 1 | 3 | 1 | 90 | 8 | 36 | 69 |
| Type 1 | 54524 | 385 | 208 | 1 | 2 | 1 | 76 | 5 | 22 | 70 |
| Type 1 | 85084 | 404 | 199 | 1 | 2 | 1 | 88 | 8 | 38 | 67 |
| Type 1 | 85138 | 365 | 183 | 1 | 1 | 1 | 72 | 2 | 25 | 80 |
| Type 2 | FH | 1581 | 705 | 0 | 8 | 6 | 474 | 35 | 109 | 244 |
| Type 2 | 19294 | 1606 | 705 | 0 | 11 | 6 | 490 | 33 | 110 | 251 |
| Type 2 | 39443 | 1615 | 713 | 0 | 10 | 6 | 489 | 36 | 110 | 251 |
| Type 2 | M1139 | 1581 | 687 | 0 | 8 | 6 | 485 | 33 | 114 | 248 |
| Type 2 | M2192 | 1604 | 705 | 0 | 11 | 6 | 490 | 35 | 109 | 248 |
| Type 2 | M2592 | 1588 | 697 | 0 | 8 | 6 | 484 | 33 | 111 | 249 |
| Type 2 | MAC | 1569 | 677 | 0 | 8 | 7 | 477 | 34 | 114 | 252 |
| Type 2 | PO1 | 1588 | 700 | 0 | 8 | 6 | 482 | 34 | 116 | 242 |
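In the variant table, the eight category columns (from coding non-synonymous SNPs through the final SNPs-and-indels column) partition each strain's total. A minimal sketch of that consistency check follows, using values transcribed from the table; the names `VARIANTS` and `inconsistent_rows` are illustrative and not from the original analysis.

```python
# Columns: strain, total, coding, start_lost, stop_gain, stop_lost,
#          synonymous, in_frame_indel, frameshift_indel, final_column
# Values transcribed from the variant table; names are illustrative.
VARIANTS = [
    ("M129", 235, 113, 0, 3, 0, 41, 10, 36, 32),
    ("142.8", 334, 174, 1, 2, 1, 67, 4, 22, 63),
    ("51494", 382, 202, 1, 3, 1, 77, 7, 24, 67),
    ("54089", 431, 223, 1, 3, 1, 90, 8, 36, 69),
    ("54524", 385, 208, 1, 2, 1, 76, 5, 22, 70),
    ("85084", 404, 199, 1, 2, 1, 88, 8, 38, 67),
    ("85138", 365, 183, 1, 1, 1, 72, 2, 25, 80),
    ("FH", 1581, 705, 0, 8, 6, 474, 35, 109, 244),
    ("19294", 1606, 705, 0, 11, 6, 490, 33, 110, 251),
    ("39443", 1615, 713, 0, 10, 6, 489, 36, 110, 251),
    ("M1139", 1581, 687, 0, 8, 6, 485, 33, 114, 248),
    ("M2192", 1604, 705, 0, 11, 6, 490, 35, 109, 248),
    ("M2592", 1588, 697, 0, 8, 6, 484, 33, 111, 249),
    ("MAC", 1569, 677, 0, 8, 7, 477, 34, 114, 252),
    ("PO1", 1588, 700, 0, 8, 6, 482, 34, 116, 242),
]


def inconsistent_rows(rows):
    """Return strains whose category counts do not sum to the reported total."""
    return [strain for strain, total, *categories in rows
            if sum(categories) != total]


if __name__ == "__main__":
    bad = inconsistent_rows(VARIANTS)
    print("rows that fail the sum check:", bad)
```

Because every row passes, the last column must count variants that fall outside the genic categories.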
Type 2 genes with the most variants as compared to M129 (type 1)
| Gene | Function | FH | 19294 | 39443 | M1139 | M2192 | M2592 | MAC | PO1 |
|---|---|---|---|---|---|---|---|---|---|
| MPN457 | Unknown | 165 | 164 | 165 | 165 | 164 | 165 | 165 | 165 |
| MPN141 | P1 | 52 | 67 | 68 | 65 | 66 | 67 | 67 | 66 |
| MPN142 | ORF6 | 17 | 16 | 16 | 16 | 16 | 16 | 16 | 16 |
| MPN286 | Unknown | 13 | 14 | 13 | 13 | 14 | 13 | 13 | 13 |
| MPN205 | Unknown | 12 | 12 | 12 | 12 | 12 | 12 | 13 | 12 |
| MPN503 | Unknown | 10 | 10 | 10 | 11 | 10 | 10 | 12 | 10 |
| MPN439 | Unknown | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| MPN489 | Unknown | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| MPN370 | Unknown | 9 | 11 | 10 | 9 | 10 | 9 | 9 | 9 |
| MPN048 | Unknown | 10 | 9 | 9 | 10 | 9 | 9 | 10 | 10 |
Functional annotation of genes with no variants
| KEGG BRITE ID | Hierarchy | Count | Genes |
|---|---|---|---|
| mpn03100 | Non-coding RNA | 37 | tRNAs, 5S rRNA, 4.5S rRNA, RNaseP RNA |
| mpn01000 | Enzymes | 28 | def, deoC, dhfr, gatB, gcp, gidB, grs1, lacA, lgt, lip2, lspA, nox, pheS, polA, ppnK, pstB, rimK, rnc, udk, upp, yaaC, yacA, ygiH, yjeQ, yjfU, yjfW, MPN047, MPN479 |
| mpn03011 | Ribosome | 25 | 50S and 30S ribosomal proteins, 5S rRNA |
| mpn02000 | Transporters | 9 | amiD, permease, glnQ, dnaK, pstA, pstB, oppB, yjfU |
| mpn03029 | Mitochondrial biogenesis | 9 | grs1, gatB, dnaJ, dnaK, groEL, groES, YidC, rpsP, ssb |
| mpn03009 | Ribosome Biogenesis | 5 | gidB, rnc, rimK, yjeQ, rbfA, spg |
| mpn03016 | Transfer RNA biogenesis | 5 | gidA, grs1, gcp, RNaseP RNA, pheS, yacA |
| mpn03110 | Chaperones and folding catalysts | 4 | dnaJ, dnaK, groEL, groES, trx |
| mpn03036 | Chromosome | 4 | gidA, gidB, rnc, scpB, soj |
| mpn02044 | Secretion system | 4 | yidC, secE, secG, MPN680 |
| mpn03400 | DNA repair and recombination | 4 | polA, recA, rpoE, ssb |
| mpn00194 | Photosynthesis proteins | 3 | atpG, atpF, atpE |
| mpn03012 | Translation factors | 2 | infA, efp |
| mpn01007 | Amino acid related enzymes | 2 | grs1, pheS |
| mpn03032 | DNA replication | 2 | polA, ssb |
| mpn04147 | Exosome | 2 | groEL, dnaK |
| mpn01002 | Peptidases | 2 | lspA, gcp |
| mpn03021 | Transcription machinery | 1 | rpoE |
| mpn01004 | Lipid biosynthesis proteins | 1 | ygiH |
| mpn04090 | Cellular antigens | 1 | cdd |
| mpn04812 | Cytoskeleton proteins | 1 | soj |
Fig. 4 Multiple protein sequence alignments showing the differences in P1 and ORF6 between type 1 and type 2 strains. a The large region of variation in P1. Type1 is the representative sequence for all type 1 strains and Type2 is representative of all type 2 strains. b The large region of variation in ORF6. Type1 is representative of all type 1 strains, except M129, which is also shown (differences in M129 highlighted in red). Type2 is representative of all type 2 strains except for MAC, which is also shown (differences in MAC highlighted in red).
Fig. 5 Multiple protein sequence alignments showing strain-specific differences in P1 and ORF6. Type 1 and type 2 strain names are highlighted in blue and green, respectively. a A poly-serine repeat in P1 with varying lengths in various strains. 19294 has a uniquely long allele of the poly-serine repeat, and the repeat length in the other strains does not strictly correspond to strain type. b A stop-loss mutation in MAC results in an additional 7 amino acids added to the protein sequence. c A pair of frameshifts results in the truncation of the beginning of ORF6 in MAC. Type1 and Type2 are representative sequences for type 1 and other type 2 strains, respectively. The new protein is predicted by RAST to use an alternate start codon: the starting methionine in MAC is the same codon as that which codes for the leucine in other type 1 and type 2 strains.
Fig. 6 Phylogenetic tree of ArcA. The two ArcA protein sequences from M. pneumoniae (MPN304 and MPN560), ArcA sequences from other Mycoplasma species, and the protein sequence of arginine deiminase from Streptococcus pneumoniae strain R6 (spr0822_Spn_R6) were aligned and a tree was generated from the alignment. Confidence values, represented as the percentage of supporting bootstrap iterations, are shown for each node. Scale, in differences per site, is indicated at the bottom.
hsdS genes in M. pneumoniae strain M129
| Gene | Location (M129) | TR sequence | Amino acid repeat |
|---|---|---|---|
| MPN089 | 111610–112617 | CCGAGCTAAGCG | AELS |
| MPN201 | 244484–245569 | CCGAGCTAAG | AEL |
| MPN285 | 340244–341533 | CCGAGCTAAGTG(A) | A(T)ELS |
| MPN289 | 347169–347732 | CCGAGCTAAGCG | AELS |
| MPN290 | 347871–348308 | | |
| MPN343 | 409562–410863 | CCGAACTAAGCG | AELS |
| MPN365 | 435618–436730 | CCGAGCTAAGCG | AELS |
| MPN507 | 617366–618457 | | |
| MPN615 | 738245–739351 | CCGAGCTAAGCG | AELS |
| MPN638 | 764400–765527 | | |
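The link between the 12-nucleotide tandem repeat (TR) units and the four-residue amino acid repeats can be checked by translating a few concatenated copies of a unit in each reading frame. The sketch below is illustrative rather than the authors' method. It assumes the listed TR sequences are given on the coding strand, reads `CCGAGCTAAGTG(A)` as a unit whose final base is either G or A, and uses NCBI translation table 4 (UGA read as tryptophan), the genetic code used by mycoplasmas. The names `CODON_TABLE`, `translate` and `repeat_frames` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
CODON_TABLE["TGA"] = "W"  # translation table 4: TGA encodes Trp, not stop


def translate(dna: str, frame: int = 0) -> str:
    """Translate dna starting at offset `frame`, ignoring a trailing partial codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(frame, len(dna) - 2, 3))


def repeat_frames(unit: str, copies: int = 3) -> dict:
    """Translate `copies` tandem copies of a repeat unit in all three frames."""
    tandem = unit * copies
    return {frame: translate(tandem, frame) for frame in range(3)}


if __name__ == "__main__":
    units = {
        "MPN089": "CCGAGCTAAGCG",
        "MPN285 (final base A)": "CCGAGCTAAGTA",
        "MPN343": "CCGAACTAAGCG",
    }
    for gene, unit in units.items():
        print(gene, repeat_frames(unit))
```

Under these assumptions only the third frame avoids a stop codon within the repeat: it reads the MPN089 and MPN343 units as repeating AELS, and the A-terminated MPN285 variant as TELS, matching the amino acid repeats listed in the table.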
Tandem repeat copy numbers of each hsdS gene in 15 M. pneumoniae strains
| Type | Strain | MPN089 | MPN201 | MPN285 | MPN289 | MPN290 | MPN343 | MPN365 | MPN507 | MPN615 | MPN638 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Type 1 | M129 | | 0 | | | 0 | | 1 | 0 | 1 | 0 |
| Type 1 | M129 rs | | 0 | | | 0 | | 1 | 0 | 1 | 0 |
| Type 1 | 142.8 | 3 | 0 | 21 | 5 | 0 | 12 | 1 | 0 | 5 | 0 |
| Type 1 | 51494 | 3 | 0 | 24 | 1 | 0 | 10 | 1 | 0 | 4 | 0 |
| Type 1 | 54089 | 2 | 0 | 33 | 4 | 0 | 13 | 1 | 0 | 4 | 0 |
| Type 1 | 54524 | 3 | 0 | 31 | 6 | 0 | 11 | 1 | 0 | 5 | 0 |
| Type 1 | 85084 | 6 | 0 | 16 | 3 | 0 | 13 | 1 | 0 | 4 | 0 |
| Type 1 | 85138 | 5 | 0 | 14 | 4 | 0 | 13 | 1 | 0 | 5 | 0 |
| Type 2 | FH | | 0 | | | 0 | 1 | | 0 | | 0 |
| Type 2 | FH rs | | 0 | | | 0 | 1 | | 0 | | 0 |
| Type 2 | 19294 | 3 | 0 | 26 | 4 | 0 | 1 | 1 | 0 | 4 | 0 |
| Type 2 | 39443 | 4 | 0 | 15 | 6 | 0 | 1 | 3 | 0 | 1 | 0 |
| Type 2 | M1139 | 4 | 0 | 7 | 2 | 0 | 1 | 1 | 0 | 2 | 0 |
| Type 2 | M2192 | 3 | 0 | 13 | 6 | 0 | 1 | 1 | 0 | 4 | 0 |
| Type 2 | M2592 | 2 | 0 | 20 | 5 | 0 | 1 | 1 | 0 | 3 | 0 |
| Type 2 | MAC | 4 | 0 | 15 | 3 | 0 | 2 | 3 | 0 | 4 | 0 |
| Type 2 | PO1 | 3 | 0 | 16 | 6 | 0 | 1 | 4 | 0 | 4 | 0 |
Notes: Strains FH rs and M129 rs are our resequenced strains. Numbers in bold font indicate variations between the published and resequenced genomes.
Fig. 7 Multiple protein sequence alignment of the variable regions in the hsdS genes. Both copies of the hsdS gene had a repetitive region of varying length consisting of TELS and AELS units (highlighted in orange and yellow, respectively). Note that in both copies, the length and composition of the repeat do not correspond to strain subtype. Strain names are to the right and highlighted in blue for type 1 and green for type 2. a Repeat region in the MPN089 copy of the hsdS gene. This is part of the variation in the 108000–126000 region shown in Fig. 2c. b Repeat region in the MPN343 copy of the hsdS gene. This is the variation in the 409700–410900 region shown in Fig. 2c.