Joon Kee Lee1,2, Moon-Woo Seong3,4, Dongjin Shin5, Jong-Il Kim5,6,7, Mi Seon Han8, Youbin Yeon4, Sung Im Cho4, Sung Sup Park3,4, Eun Hwa Choi9,10.
Abstract
BACKGROUND: Mycoplasma pneumoniae is a common cause of respiratory tract infections in children and adults. This study applied high-throughput whole genome sequencing (WGS) to analyze the genomes of 30 M. pneumoniae strains isolated from children with pneumonia in South Korea during the two epidemics from 2010 to 2016, and compared them with a global collection of 48 M. pneumoniae strains from seven countries collected between 1944 and 2017.
Keywords: Comparative genomics; Mycoplasma pneumoniae; Whole genome analysis
Year: 2019 PMID: 31783732 PMCID: PMC6884898 DOI: 10.1186/s12864-019-6306-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Genome lengths and contigs determined from the initial assembly with complete genome structures annotated by RAST
| Strain | Contigs | L50 | N50 (bp) | Min length (bp) | Max length (bp) | Total length (bp) | %GC | Genes: CDS | Genes: RNA | Genes: total |
|---|---|---|---|---|---|---|---|---|---|---|
| 10–980 | 6 | 2 | 152,732 | 14,538 | 390,907 | 816,424 | 40.0 | 776 | 40 | 816 |
| 10–1048 | 6 | 2 | 152,735 | 14,538 | 392,185 | 816,465 | 40.0 | 777 | 40 | 817 |
| 10–1059 | 7 | 2 | 98,837 | 14,538 | 392,164 | 816,681 | 40.0 | 776 | 40 | 816 |
| 10–1110 | 8 | 2 | 152,733 | 20,993 | 388,970 | 816,522 | 40.0 | 775 | 40 | 815 |
| 10–1213 | 5 | 1 | 451,397 | 14,538 | 451,397 | 816,521 | 40.0 | 772 | 40 | 812 |
| 10–1257 | 3 | 1 | 702,439 | 14,562 | 702,439 | 816,333 | 40.0 | 776 | 40 | 816 |
| 10–1385 | 9 | 3 | 95,255 | 14,577 | 297,117 | 817,191 | 40.0 | 780 | 39 | 819 |
| 11–107 | 5 | 2 | 249,794 | 14,538 | 389,683 | 816,346 | 40.0 | 773 | 40 | 813 |
| 11–129 | 6 | 2 | 152,693 | 14,538 | 392,172 | 816,432 | 40.0 | 775 | 40 | 815 |
| 11–174 | 6 | 2 | 258,682 | 13,367 | 282,196 | 815,686 | 40.0 | 776 | 39 | 815 |
| 11–212 | 7 | 2 | 152,734 | 14,538 | 389,655 | 816,503 | 40.0 | 778 | 40 | 818 |
| 11–473 | 6 | 2 | 152,734 | 14,538 | 389,647 | 816,518 | 40.0 | 778 | 40 | 818 |
| 11–634 | 7 | 2 | 152,735 | 14,775 | 391,525 | 816,551 | 40.0 | 777 | 40 | 817 |
| 11–949 | 6 | 2 | 258,658 | 13,367 | 283,608 | 817,102 | 40.0 | 784 | 39 | 823 |
| 11–994 | 5 | 2 | 249,776 | 14,538 | 389,685 | 816,304 | 40.0 | 776 | 40 | 816 |
| 11–1384 | 6 | 2 | 258,694 | 13,367 | 283,575 | 818,669 | 40.0 | 787 | 39 | 826 |
| 12–060 | 6 | 2 | 152,734 | 14,538 | 392,205 | 816,506 | 40.0 | 775 | 40 | 815 |
| 12–091 | 6 | 2 | 152,734 | 14,538 | 391,968 | 816,510 | 40.0 | 777 | 40 | 817 |
| 14–637 | 6 | 2 | 156,124 | 60,136 | 298,090 | 818,560 | 40.0 | 789 | 39 | 828 |
| 15–215 | 6 | 2 | 152,734 | 14,561 | 392,183 | 816,388 | 40.0 | 775 | 40 | 815 |
| 15–885 | 6 | 2 | 152,734 | 14,561 | 389,671 | 816,420 | 40.0 | 776 | 40 | 816 |
| 15–969 | 6 | 2 | 152,735 | 14,538 | 392,144 | 816,389 | 40.0 | 780 | 40 | 820 |
| 15–982 | 5 | 2 | 156,554 | 14,538 | 390,947 | 816,495 | 40.0 | 769 | 40 | 809 |
| 16–002 | 6 | 2 | 152,736 | 14,538 | 389,658 | 816,530 | 40.0 | 773 | 40 | 813 |
| 16–004 | 6 | 2 | 152,736 | 14,538 | 392,133 | 816,561 | 40.0 | 777 | 40 | 817 |
| 16–032 | 6 | 2 | 152,734 | 14,538 | 392,119 | 816,471 | 40.0 | 772 | 40 | 812 |
| 16–118 | 5 | 1 | 443,549 | 14,538 | 443,549 | 816,467 | 40.0 | 775 | 40 | 815 |
| 16–462 | 5 | 2 | 152,735 | 57,889 | 392,162 | 816,525 | 40.0 | 776 | 40 | 816 |
| 16–710 | 7 | 2 | 152,734 | 14,538 | 392,162 | 816,537 | 40.0 | 773 | 40 | 813 |
| 16–734 | 6 | 2 | 258,694 | 13,367 | 283,522 | 818,445 | 40.0 | 784 | 39 | 823 |
L50, smallest number of contigs whose length sum makes up half of genome size; N50, sequence length of the shortest contig at 50% of the total genome length; CDS, coding sequence
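The N50 and L50 definitions in the footnote can be illustrated with a minimal Python sketch. The function name `n50_l50` is hypothetical, not something from the study. The worked example uses strain 10–1257, which has three contigs: the largest (702,439 bp) and smallest (14,562 bp) are listed in the table, and the middle length (99,332 bp) is inferred here by subtracting them from the total length (816,333 bp).

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    Contigs are sorted from longest to shortest and accumulated until the
    running sum reaches half of the total assembly length. N50 is the length
    of the contig that crosses that point; L50 is how many contigs were needed.
    """
    ordered = sorted(contig_lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Strain 10-1257: max and min lengths come from the table; the middle
# contig length is an inference (total minus the other two).
contigs_10_1257 = [702_439, 99_332, 14_562]
n50, l50 = n50_l50(contigs_10_1257)
print(n50, l50)  # 702439 1
```

Because the largest contig alone covers more than half of the assembly, L50 is 1 and N50 equals the maximum contig length, matching the table row for this strain.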
Fig. 1 Overall sequence identity of the 30 sequenced strains with the reference M129 genome. Solid coloration indicates > 99% identity and transparent grey indicates approximately 95% identity. Location in the reference genome is indicated by numeration on the inside of the ring. GC content in the reference genome is indicated by the black bar graphs between the genomic coordinates and the colored rings (bars pointing toward the outside of the circle indicate high GC content)
Fig. 2 Whole genome alignment of the 30 sequenced strains with 6 reference sequences using MAUVE. Regions colored in MAUVE are conserved across all strains. a Two 1 Kbp (approximate) insertions are noticed in the P1 type 1 groups at 169–170 Kb and 178–179 Kb. b A 2 Kbp (approximate) insertion is noticed in the P1 type 1 groups at 558–560 Kb. c A 6 Kbp (approximate) insertion is noticed in the P1 type 2 groups at 708 Kb. All positions are based on M129 reference strain
Variant patterns relative to the nucleotide and amino acid structure of M129 reference strain
| Strain | Upstream | Synonymous | Missense | Splice | Start/stop | In-frame | Frameshift | Total |
|---|---|---|---|---|---|---|---|---|
| 10–980 | 37 | 32 | 48 | | 4 | 3 | 16 | 140 |
| 10–1048 | 89 | 105 | 153 | | 13 | 6 | 25 | 391 |
| 10–1059 | 93 | 100 | 149 | | 11 | 7 | 29 | 389 |
| 10–1110 | 56 | 31 | 49 | | 5 | 2 | 16 | 159 |
| 10–1213 | 93 | 102 | 154 | | 16 | 7 | 25 | 397 |
| 10–1257 | 92 | 95 | 151 | | 15 | 5 | 25 | 383 |
| 10–1385 | 518 | 480 | 659 | 1 | 56 | 9 | 55 | 1778 |
| 11–107 | 114 | 107 | 172 | | 15 | 9 | 23 | 440 |
| 11–129 | 96 | 113 | 160 | | 13 | 6 | 28 | 416 |
| 11–174 | 518 | 479 | 658 | 1 | 57 | 11 | 54 | 1778 |
| 11–212 | 118 | 108 | 154 | | 13 | 7 | 25 | 425 |
| 11–473 | 116 | 97 | 141 | | 15 | 5 | 25 | 399 |
| 11–634 | 110 | 103 | 154 | | 16 | 6 | 25 | 414 |
| 11–949 | 521 | 489 | 665 | 1 | 53 | 9 | 55 | 1793 |
| 11–994 | 92 | 99 | 151 | | 12 | 7 | 24 | 385 |
| 11–1384 | 519 | 490 | 668 | 1 | 53 | 9 | 56 | 1796 |
| 12–060 | 119 | 104 | 160 | | 15 | 7 | 25 | 430 |
| 12–091 | 130 | 104 | 162 | | 16 | 7 | 27 | 446 |
| 14–637 | 518 | 483 | 657 | 1 | 51 | 11 | 59 | 1782 |
| 15–215 | 95 | 106 | 155 | | 13 | 7 | 27 | 403 |
| 15–885 | 130 | 108 | 170 | | 15 | 7 | 25 | 455 |
| 15–969 | 114 | 104 | 157 | | 14 | 8 | 25 | 422 |
| 15–982 | 142 | 108 | 157 | | 14 | 8 | 25 | 454 |
| 16–002 | 92 | 104 | 156 | | 12 | 8 | 25 | 397 |
| 16–004 | 116 | 114 | 163 | | 14 | 8 | 27 | 442 |
| 16–032 | 121 | 106 | 166 | | 17 | 6 | 25 | 441 |
| 16–118 | 126 | 100 | 156 | | 14 | 7 | 25 | 428 |
| 16–462 | 128 | 101 | 159 | | 14 | 7 | 25 | 434 |
| 16–710 | 115 | 100 | 158 | | 14 | 7 | 25 | 419 |
| 16–734 | 519 | 486 | 660 | 1 | 54 | 10 | 55 | 1785 |
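The Total column can be checked against the individual variant classes. The following Python sketch is one way to do that, not part of the original analysis. It assumes the classes are, in order, upstream, synonymous, missense, splice, start/stop, in-frame and frameshift, and that rows listing only six counts (the P1 type 1 strains) have no splice entry. The helper name `variant_total` is hypothetical.

```python
# Counts copied from three rows of the variant table, in the assumed order:
# upstream, synonymous, missense, splice, start/stop, in-frame, frameshift.
# None marks a splice cell with no value in the table (an assumption about
# how the six-count rows align with the eight headers).
rows = {
    "10–980": ([37, 32, 48, None, 4, 3, 16], 140),
    "10–1385": ([518, 480, 659, 1, 56, 9, 55], 1778),
    "14–637": ([518, 483, 657, 1, 51, 11, 59], 1782),
}


def variant_total(counts):
    """Sum the per-class variant counts, skipping empty cells."""
    return sum(c for c in counts if c is not None)


for strain, (counts, reported) in rows.items():
    computed = variant_total(counts)
    status = "matches" if computed == reported else "differs from"
    print(f"{strain}: computed {computed} {status} reported {reported}")
```

Under this alignment the class counts for 10–980 and 10–1385 add up to the reported totals. For 14–637 they add up to 1780 against a reported 1782, a difference the table itself does not explain.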
Fig. 3 Heatmap of protein families of the 30 sequenced genomes with the reference genome M. pneumoniae M129. Cell color represents the number of proteins from a specific genome in a given protein family. Note that the P1 type 2 strains (10–1385, 11–174, 11–949, 11–1384, 14–637 and 16–734) are distinguishable from the P1 type 1 strains
Fig. 4 Phylogenetic tree based on whole genome alignment of the 30 sequenced strains with 48 M. pneumoniae genomes accessed from NCBI. The tree was built with the neighbor-joining method using the maximum composite likelihood approach and 500 bootstrap replicates. Branch length designates actual distance. Bootstrap values over 50 are shown on the tree. Strains in blue are from this study and strains in red are the 6 references. Strains are grouped into four distinct clades. ST, sequence type
Fig. 5 Mycoplasma pneumoniae sequence type (ST) relationship by eBURST analysis including 30 strains from this study, 48 strains from NCBI, and previously reported STs from PubMLST (http://pubmlst.org/mpneumoniae/). Two main CCs were defined, with two singletons (ST12 and ST22). ST3 and ST2 were the predicted founders of the two CCs. The size of each circle correlates with the number of isolates of each ST. STs in gray were previously reported but were not included in this study. CC, clonal complex
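The grouping idea behind an eBURST-style analysis can be sketched in a few lines of Python. This is a simplified illustration, not the eBURST program or the study's actual procedure: STs whose allelic profiles differ at exactly one locus (single-locus variants) are linked, linked STs form a clonal complex, the ST with the most single-locus variants is taken as the predicted founder, and unlinked STs are singletons. The ST labels, the allele numbers and the choice of eight loci below are all hypothetical.

```python
from itertools import combinations


def shared_alleles(a, b):
    """Count loci at which two allelic profiles carry the same allele."""
    return sum(x == y for x, y in zip(a, b))


def clonal_complexes(profiles):
    """Group STs linked by single-locus variants (SLVs).

    Returns a list of (sorted member STs, predicted founder) tuples.
    The founder is the member with the most SLVs; singletons get None.
    """
    n_loci = len(next(iter(profiles.values())))
    parent = {st: st for st in profiles}  # union-find forest

    def find(st):
        while parent[st] != st:
            parent[st] = parent[parent[st]]  # path compression
            st = parent[st]
        return st

    slv_count = {st: 0 for st in profiles}
    for a, b in combinations(profiles, 2):
        if shared_alleles(profiles[a], profiles[b]) == n_loci - 1:
            slv_count[a] += 1
            slv_count[b] += 1
            parent[find(a)] = find(b)

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), []).append(st)

    result = []
    for members in groups.values():
        founder = max(members, key=slv_count.get) if len(members) > 1 else None
        result.append((sorted(members), founder))
    return result


# Hypothetical allelic profiles (not real M. pneumoniae STs).
profiles = {
    "ST-a": (1, 1, 1, 1, 1, 1, 1, 1),
    "ST-b": (1, 1, 1, 1, 1, 1, 1, 2),
    "ST-c": (1, 1, 1, 1, 1, 1, 2, 1),
    "ST-d": (1, 1, 1, 2, 1, 1, 1, 1),
    "ST-x": (3, 3, 3, 3, 3, 3, 3, 3),
}
for members, founder in clonal_complexes(profiles):
    print(members, "founder:", founder)
```

In this toy data set ST-a is a single-locus variant of three other STs and is therefore the predicted founder of its complex, while ST-x shares no near-identical profile with any other ST and remains a singleton.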
Reference genomes included in the analysis
| NCBI Accession | Organism | Length (bp) | P1 type | Year Collected | Origin | Description |
|---|---|---|---|---|---|---|
| NC_000912.1 | | 816,394 | 1 | 1968 | USA/NC | ATCC 29342 (Reference) |
| CP_010546.1 | | 817,207 | 2 | 1954 | USA/MA | ATCC 15531 (Reference) |
| NC_016807.1 | | 817,176 | 2a | 2011 | Japan | |
| AP_017318.1 | | 817,074 | 2b | 2017 | Japan | |
| AP_017319.1 | | 817,099 | 2c | 2017 | Japan | |
| CP_013829.1 | | 801,203 | 1 | 2016 | China | Macrolide resistant |