Marco Galardini, Francesco Pini, Marco Bazzicalupo, Emanuele G Biondi, Alessio Mengoni.
Abstract
Many bacterial species, such as the alphaproteobacterium Sinorhizobium meliloti, are characterized by open pangenomes and contain multipartite genomes consisting of a chromosome and other large replicons, such as chromids, megaplasmids, and plasmids. The evolutionary forces that shape the functional and structural aspects of the pangenome in species with multipartite genomes are still poorly understood. Therefore, we sequenced the genomes of 10 new S. meliloti strains and analyzed them together with four publicly available genome sequences. The results indicate that the three main replicons present in these strains (a chromosome, a chromid, and a megaplasmid) partly show replicon-specific behaviors related to strain differentiation. In particular, the pSymB chromid was shown to be a hot spot for positively selected genes, and, unexpectedly, genes resident in the pSymB chromid were also found to be more widespread in distant taxa than those located in the other replicons. Moreover, analysis of a DNA proximity network revealed a series of conserved "DNA backbones" that shape the evolution of the genome structure, with the rest of the genome experiencing rearrangements. The data depict a scenario in which the pSymB chromid has a distinctive role in intraspecies differentiation and in evolution through positive selection, whereas the pSymA megaplasmid mostly contributes to structural fluidity and to the emergence of new functions, indicating a specific evolutionary role for each replicon in pangenome evolution.
Year: 2013 PMID: 23431003 PMCID: PMC3622305 DOI: 10.1093/gbe/evt027
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Main Features of the 14 Sinorhizobium meliloti Genomes
| Strains | 1A42 | 5A14 | A0641M | A0643DD | AE608H | AK11 | AK75 | C0431A | C0438LL | H1 | Rm1021 | AK83 | BL225C | SM11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| General stats | ||||||||||||||
| Length (bp) | 7,162,724 | 8,942,552 | 7,953,713 | 7,351,905 | 7,347,181 | 6,843,938 | 6,992,595 | 7,086,830 | 7,064,773 | 6,923,151 | 6,691,694 | 7,140,471 | 6,978,785 | 7,499,157 |
| G + C content (%) | 62.02 | 61.98 | 61.88 | 61.84 | 62.00 | 62.03 | 61.86 | 61.96 | 61.96 | 61.96 | 61 | 62 | 62 | 61.91 |
| Coding % | 85.25 | 85.57 | 85.17 | 85.28 | 85.46 | 85.75 | 85.28 | 85.20 | 85.44 | 85.11 | 86.13 | 82.53 | 84.56 | 86.40 |
| Coding (bp) | 6,106,284 | 7,652,148 | 6,774,171 | 6,269,850 | 6,279,117 | 5,868,819 | 5,963,196 | 6,037,653 | 6,035,811 | 5,892,396 | 5,763,546 | 5,893,086 | 5,901,528 | 6,479,490 |
| ORFs | 7,374 | 8,735 | 8,411 | 7,771 | 7,197 | 6,895 | 7,555 | 7,386 | 7,242 | 6,993 | 6,218 | 6,518 | 6,359 | 7,428 |
| rRNA | 6 | 24 | 9 | 4 | 18 | 2 | 5 | 2 | 5 | 9 | 9 | 9 | 9 | 9 |
| tRNA | 56 | 78 | 63 | 51 | 63 | 47 | 56 | 48 | 52 | 52 | 54 | 56 | 55 | 56 |
| Annotation stats (%) | ||||||||||||||
| No function | 30.47 | 30.02 | 34.70 | 34.15 | 31.22 | 28.66 | 31.41 | 31.02 | 31.61 | 29.73 | 23.72 | 28.91 | 25.93 | 32.77 |
| ORFans | 3.38 | 3.57 | 3.67 | 3.35 | 1.92 | 3.16 | 3.45 | 2.40 | 3.02 | 2.89 | 1.74 | 2.75 | 0.30 | 0.70 |
| COG | 69.53 | 69.98 | 65.30 | 65.85 | 68.78 | 71.34 | 68.59 | 68.98 | 68.39 | 70.27 | 76.28 | 71.09 | 74.07 | 67.23 |
| Interpro | 87.86 | 86.14 | 84.65 | 85.28 | 85.24 | 88.69 | 87.66 | 85.96 | 86.33 | 88.15 | 88.68 | 84.84 | 87.42 | 81.35 |
| GO | 62.19 | 63.18 | 59.09 | 59.85 | 62.05 | 63.90 | 61.69 | 61.72 | 61.39 | 62.81 | 67.88 | 64.32 | 66.06 | 61.16 |
| KEGG | 44.36 | 45.20 | 41.31 | 41.56 | 44.69 | 45.82 | 43.35 | 43.39 | 43.84 | 44.96 | 50.63 | 46.59 | 48.51 | 43.09 |
| Rhizobase | 87.32 | 78.95 | 83.21 | 84.00 | 84.80 | 88.70 | 87.40 | 86.79 | 85.97 | 88.07 | 92.23 | 87.48 | 90.64 | 84.05 |
| Replicon sizes (bp) | ||||||||||||||
| Chromosome | 3,731,100 | 4,990,772 | 4,063,405 | 3,643,054 | 3,891,677 | 3,572,765 | 3,447,294 | 3,613,238 | 3,588,400 | 3,558,302 | 3,650,000 | 3,820,000 | 3,670,000 | 3,908,022 |
| Chromid pSymB | 1,588,418 | 1,878,443 | 1,754,039 | 1,593,848 | 1,630,437 | 1,565,308 | 1,647,631 | 1,558,480 | 1,626,313 | 1,603,718 | 1,680,000 | 1,680,000 | 1,690,000 | 1,632,395 |
| Megaplasmid pSymA | 1,396,116 | 1,506,164 | 1,457,555 | 1,307,958 | 1,285,253 | 1,387,542 | 1,518,412 | 1,389,865 | 1,313,050 | 1,374,799 | 1,350,000 | 1,310,000 | 1,610,000 | 1,633,319 |
| pSINME01 | 209,195 | 232,672 | 111,645 | 136,588 | 125,681 | 109,990 | 76,265 | 112,338 | 94,450 | 127,226 | | 260,000 | | |
| pSINME02 | | | | 11,278 | | 25,905 | | | 7,018 | | | 70,000 | | |
| pSmeSM11b | 140,640 | 178,756 | 63,273 | 122,560 | 132,879 | 11,263 | 33,722 | | 71,081 | 13,413 | | | | 181,251 |
| pSmeSM11a | 14,949 | | 3,811 | 232,455 | | 19,702 | | 14,801 | 3,810 | | | | | 144,170 |
| Not mapped | 82,306 | 155,746 | 499,985 | 304,164 | 281,255 | 151,462 | 269,272 | 398,109 | 360,652 | 245,693 | ||||
a. The general statistics and the putative mapping of reads to the six replicons present in the completely sequenced genomes (Rm1021, AK83, BL225C, and SM11) are reported.
b. Percentage of total ORFs.
c. Rm1021 is excluded.
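The relation between the Length, Coding, and Coding % rows can be checked with a short calculation. The sketch below assumes that Coding % is simply coding bases divided by genome length; the values are copied from the table for three strains, and the function name `coding_percent` is introduced here for illustration only.

```python
# Length and coding bases (bp) copied from the table for three strains.
GENOMES = {
    "1A42": {"length": 7_162_724, "coding": 6_106_284},
    "Rm1021": {"length": 6_691_694, "coding": 5_763_546},
    "SM11": {"length": 7_499_157, "coding": 6_479_490},
}


def coding_percent(length_bp: int, coding_bp: int) -> float:
    """Coding bases as a percentage of genome length, rounded to 2 decimals."""
    return round(100.0 * coding_bp / length_bp, 2)


for strain, genome in GENOMES.items():
    # Compare each printed value with the Coding % row of the table.
    print(strain, coding_percent(genome["length"], genome["coding"]))
```

Under this assumption the computed values agree with the tabulated Coding % for these three strains.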
Fig. 1. Sinorhizobium meliloti pangenome permutations statistics. Each point indicates the number of orthologs that are found in each pangenomic fraction. The trend lines on the median values are shown.
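Permutation statistics of this kind are commonly obtained by adding genomes in random order and recording the size of each pangenomic fraction at every step. The following is a minimal sketch of that general idea, not the authors' pipeline: the input format (a dict mapping strain names to sets of ortholog IDs), the toy data, and the function name `accumulation_curves` are all assumptions made for illustration.

```python
import random
from statistics import median


def accumulation_curves(presence, n_permutations=100, seed=0):
    """Median core and pangenome sizes as genomes are added in random order.

    presence: dict mapping strain name -> set of ortholog IDs (hypothetical format).
    Returns (core_medians, pan_medians), one value per number of genomes added.
    """
    rng = random.Random(seed)
    strains = sorted(presence)
    core_runs = [[] for _ in strains]
    pan_runs = [[] for _ in strains]
    for _ in range(n_permutations):
        order = strains[:]
        rng.shuffle(order)
        core = set(presence[order[0]])
        pan = set(presence[order[0]])
        for i, strain in enumerate(order):
            core &= presence[strain]  # orthologs shared by all genomes so far
            pan |= presence[strain]   # orthologs seen in any genome so far
            core_runs[i].append(len(core))
            pan_runs[i].append(len(pan))
    return [median(r) for r in core_runs], [median(r) for r in pan_runs]


# Toy presence/absence data, invented for the example.
toy = {
    "s1": {"g1", "g2", "g3"},
    "s2": {"g1", "g2", "g4"},
    "s3": {"g1", "g5"},
}
core_curve, pan_curve = accumulation_curves(toy)
print(core_curve, pan_curve)
```

The last point of each curve does not depend on the permutation: it equals the size of the intersection (core) or union (pangenome) over all strains.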
Fig. 2. Dendrograms of 14 Sinorhizobium meliloti strains based on pangenome content. Gray shades indicate the dendrogram's clusters, and strain names and branches are colored according to the geographical origin of each strain (black: reference strain Rm1021; red: Germany; yellow: Iran; blue: Kazakhstan; and green: Italy). (a) Bayesian consensus dendrogram of S. meliloti strains from core genome sequence alignments; dendrograms for both coding and noncoding sequences are reported. All represented nodes have a posterior probability equal to 1. (b) Neighbor-joining dendrograms with respect to the pattern of occurrence of 4,602 accessory orthologs in the accessory genome. The concatemer for the dendrograms of coding sequences is formed by 883,803 sites (7,921 polymorphic, 4,626 parsimony informative) from the whole core genome; 629,418 sites (2,498 polymorphic, 1,104 parsimony informative) from the chromosomal genes; 178,901 sites (4,023 polymorphic, 2,513 parsimony informative) from the chromid pSymB; and 75,291 sites (1,400 polymorphic, 391 parsimony informative) from the megaplasmid pSymA. For the dendrograms of noncoding sequences, the concatemer is formed by 210,215 sites (6,215 polymorphic, 3,608 parsimony informative) from the whole core genome; 295,554 sites (1,537 polymorphic, 414 parsimony informative) from the chromosomal genes; 109,256 sites (3,160 polymorphic, 2,083 parsimony informative) from the chromid pSymB; and 71,405 sites (1,518 polymorphic, 1,111 parsimony informative) from the megaplasmid pSymA.
Fig. 3. Detection of positively selected sites in the three main S. meliloti replicons. (a) The proportion of positively selected sites detected with respect to the total number of genes present in each replicon is reported. Asterisks indicate significant enrichment (P < 0.05 with a Fisher's exact test) of positively selected genes. (b) Difference between relative proportions of each COG category between selected and nonselected core genes. Solid borders indicate a significant difference (P < 0.05 with a Fisher's exact test). See http://www.ncbi.nlm.nih.gov/COG/grace/fiew.cgi for the list of COG codes.
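An enrichment test of this kind compares the count of positively selected genes inside one replicon with the count in the remaining replicons. The sketch below implements a one-sided Fisher's exact test with the standard library; whether the original analysis was one- or two-sided is not stated here, so the one-sided form is an assumption, and the gene counts in the usage line are hypothetical, not taken from the study.

```python
from math import comb


def fisher_enrichment_p(selected_in, total_in, selected_out, total_out):
    """One-sided Fisher's exact test for enrichment.

    Returns P(X >= selected_in) under the hypergeometric null in which
    selected genes are distributed at random between the replicon of
    interest (total_in genes) and all other replicons (total_out genes).
    """
    n_total = total_in + total_out
    n_selected = selected_in + selected_out
    denominator = comb(n_total, total_in)
    upper = min(total_in, n_selected)
    numerator = sum(
        comb(n_selected, k) * comb(n_total - n_selected, total_in - k)
        for k in range(selected_in, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 30 selected genes among 400 on one replicon,
# 40 selected genes among 1,600 elsewhere.
p_value = fisher_enrichment_p(30, 400, 40, 1600)
print(f"one-sided P = {p_value:.3g}")
```

With real data, one call per replicon would be made and the resulting P values compared against the 0.05 threshold quoted in the caption.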
Fig. 4. Taxonomic distribution of the S. meliloti pangenome. (a) Taxonomic distribution of the OGs mapped to each replicon: For each taxonomic group, the proportion of the orthologs for each replicon having a significant hit is reported. (b) Difference between relative proportions of each COG category between proteobacterial hits and nonproteobacterial hits. Solid histograms mark categories with significant differences (P < 0.05 with a Fisher's exact test).
Fig. 5. Structural alignments of the 14 Sinorhizobium meliloti genomes. (a) Alignments between genomes are reported following the order of the overall Bayesian dendrogram (fig. 2); core, accessory, and unique contiguous regions of orthologs longer than 10 kb are indicated. Whole replicon inversions (as in the chromid pSymB and megaplasmid pSymA between strains 1A42 and SM11) and translocations spanning the starting point of a replicon (as in the chromid pSymB between strains AK75 and BL225C) are artifacts dependent on the specific orientation and starting point of the nucleotide sequences. (b) Proportion of contiguous regions for each pangenomic category in each replicon.
Fig. 6. Proximity network construction and statistics. (a) Explanatory proximity network construction details. (b) Simplified block model version of the proximity network, obtained by dividing the nodes according to their replicon of origin; node and edge sizes are proportional to the number of orthologs and the number of links observed, respectively.
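A block model like the one in panel (b) can be derived from any edge list once each node is labeled with its replicon of origin. The sketch below is illustrative only: the toy edges, the ortholog names, and the functions `block_model` and `average_degree` are assumptions, and the way the original network links orthologs is not reproduced here.

```python
from collections import Counter


def block_model(edges, replicon_of):
    """Collapse a network into replicon blocks.

    edges: iterable of (node_u, node_v) pairs.
    replicon_of: dict mapping node -> replicon label.
    Returns (node_sizes, link_sizes): orthologs per replicon and links per
    unordered replicon pair.
    """
    node_sizes = Counter(replicon_of.values())
    link_sizes = Counter()
    for u, v in edges:
        pair = tuple(sorted((replicon_of[u], replicon_of[v])))
        link_sizes[pair] += 1
    return node_sizes, link_sizes


def average_degree(edges, nodes):
    """Mean number of links per node over the given node set."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(degree[n] for n in nodes) / len(nodes)


# Toy network invented for the example: five orthologs on two replicons.
replicon_of = {
    "og1": "chromosome", "og2": "chromosome", "og3": "chromosome",
    "og4": "pSymB", "og5": "pSymB",
}
edges = [("og1", "og2"), ("og2", "og3"), ("og3", "og4"), ("og4", "og5")]

node_sizes, link_sizes = block_model(edges, replicon_of)
print(dict(node_sizes))
print(dict(link_sizes))
print(average_degree(edges, list(replicon_of)))
```

Links whose two ends carry different labels, such as the chromosome to pSymB edge in the toy data, are the between-replicon edges of the block model.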
DNA Proximity Network Statistics
| DNA Proximity Network Cluster | All | Mapped | Shuffled | Chromosome | pSymB | pSymA | pSINME01 | pSmeSM11b | pSINME02 | pSmeSM11a |
|---|---|---|---|---|---|---|---|---|---|---|
| Average degree | 3.00 | 2.91 | 2.63 | 2.98 | 2.88 | 2.77 | 2.43 | 2.79 | NA | 2.42 |
| Std-dev degree | 1.00 | 0.97 | 0.81 | 1.00 | 0.97 | 0.89 | 0.73 | 0.89 | 0.00 | 0.72 |
| Replicon assortativity | 0.67 | 1.00 | NA | NA | NA | NA | NA | NA | NA | NA |
| Major component weighted size | 1.00 | 0.21 | 0.19 | 0.43 | 0.17 | 0.29 | 0.13 | 0.56 | 0.04 | 0.13 |
| Boundary weighted size | NA | 0.34 | NA | 0.27 | 0.34 | 0.44 | 0.94 | 0.68 | 1.00 | 0.70 |
Note.—Std-dev, standard deviation; NA, not applicable.
a. Considering nodes with degree > 1.
b. Nodes having at least one link to the shuffled cluster.
DNA Backbones Statistics
| | All | Chromosome | pSymB | pSymA |
|---|---|---|---|---|
| Number of chains | 1,035 | 349 | 118 | 60 |
| Total length (Mb) | 2.68 | 2.03 | 0.51 | 0.27 |
| Length proportion | Nd | 0.54 | 0.31 | 0.18 |
| Average length (bp) | 5,391.9 | 5,819.7 | 4,342.4 | 4,502.2 |
| Std-dev length (bp) | 6,561.6 | 6,281.8 | 6,257.8 | 5,522.5 |
Note.—Nd, not determined.
a. Replicon size is the average replicon length in the four complete Sinorhizobium meliloti genomes.
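The Length proportion row can be reproduced from the total backbone length and the mean replicon size in the four complete genomes (Rm1021, AK83, BL225C, and SM11). The sketch below takes the replicon sizes from the genome features table and the rounded totals (in Mb) from the backbone table, so the result is approximate; the variable and function names are introduced here for illustration.

```python
# Replicon sizes (bp) in the four complete genomes: Rm1021, AK83, BL225C, SM11.
COMPLETE_REPLICON_SIZES = {
    "Chromosome": [3_650_000, 3_820_000, 3_670_000, 3_908_022],
    "pSymB": [1_680_000, 1_680_000, 1_690_000, 1_632_395],
    "pSymA": [1_350_000, 1_310_000, 1_610_000, 1_633_319],
}

# Total DNA backbone length per replicon (Mb), as tabulated (rounded values).
BACKBONE_TOTAL_MB = {"Chromosome": 2.03, "pSymB": 0.51, "pSymA": 0.27}


def length_proportion(replicon: str) -> float:
    """Backbone length divided by the mean replicon size, rounded to 2 decimals."""
    sizes = COMPLETE_REPLICON_SIZES[replicon]
    mean_size = sum(sizes) / len(sizes)
    return round(BACKBONE_TOTAL_MB[replicon] * 1e6 / mean_size, 2)


for name in COMPLETE_REPLICON_SIZES:
    print(name, length_proportion(name))
```

With these inputs the computed proportions match the tabulated 0.54, 0.31, and 0.18.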
Fig. 7. Tasks and evolutionary differences of replicons in S. meliloti. Pie charts indicate the proportion of each replicon that is present in the DNA backbones. The arrows indicate the transmission mechanism: vertical inheritance (two arrows) or horizontal gene transfer (HGT; radial arrows).