Kui Han, Zhi-feng Li, Ran Peng, Li-ping Zhu, Tao Zhou, Lu-guang Wang, Shu-guang Li, Xiao-bo Zhang, Wei Hu, Zhi-hong Wu, Nan Qin, Yue-zhong Li.
Abstract
Complex environmental conditions can significantly affect bacterial genome size by unknown mechanisms. The So0157-2 strain of Sorangium cellulosum is an alkaline-adaptive epothilone producer that grows across a wide pH range. Here, we show that the genome of this strain is 14,782,125 base pairs, 1.75 megabases larger than the largest bacterial genome previously reported from S. cellulosum. The 11,599 coding sequences (CDSs) include massive duplications and horizontally transferred genes, regulated by numerous protein kinases, sigma factors and related transcriptional regulation co-factors, which give the So0157-2 strain abundant resources and flexibility for ecological adaptation. The comparative transcriptomics approach, which detected 90.7% of the total CDSs, not only demonstrates complex expression patterns under varying environmental conditions but also suggests an alkaline-improved pathway of insertion and duplication in this strain, which has been verified genetically. These results provide insights into, and a paradigm for, how environmental conditions can affect bacterial genome expansion.
Year: 2013 PMID: 23812535 PMCID: PMC3696898 DOI: 10.1038/srep02101
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Colony morphologies of S. cellulosum strain So0157-2 on CNST medium with different pH values.
(a) pH 6.0; (b) pH 9.0; (c) pH 11.0. Bar = 100 μm.
Figure 2. Genomic features of S. cellulosum So0157-2.
(a) The genomic organization of the Sorangium cellulosum So0157-2 strain. Circle 1, genome positions in kb (from dnaA); Circles 2 and 4, predicted protein coding sequences (CDSs) on the forward (outer wheel) and the reverse (inner wheel) strands, colored according to COG class (leading strand, 5,825 CDSs, 49.9% of the total CDSs; lagging strand, 5,848 CDSs, 50.1% of the total CDSs); Circle 3, GC skew; Circles 5 and 6, putative ICE (integrative conjugative element)-derived CDSs (leading strand, 456 CDSs; lagging strand, 485 CDSs, 8.06% of the total CDSs); Circles 7 and 8, putative plasmid-derived CDSs (leading strand, 2,434 CDSs; lagging strand, 2,355 CDSs, 41% of the total CDSs); Circle 9, GC content showing deviations from the average (72.1%); Circles 10 and 11, putative HGT (horizontal gene transfer)-related genes (leading strand, 630 CDSs; lagging strand, 613 CDSs, 9.86% of the total CDSs); Circle 12, CDSs with regions showing high identity to virus genes; Circle 13, CDSs with regions showing high identity to prophage genes; Circle 14, putative restriction and modification system genes; Circle 15, two-component system genes in the genome (leading strand, cyan; lagging strand, purple); Circle 16, 109 sigma factor genes and 347 related transcription factors in the genome (leading strand, yellow; lagging strand, green); Circle 17, 55 CDSs with DNA-binding regions (green); Circle 18, secondary metabolite biosynthesis genes (dark purple), 10.6% of the whole genome; Innermost circle, putative paralogous genes in the genome. (b) Syntenic map between S. cellulosum So0157-2 and So ce56. (c) HGT (blue), ICE (red), and plasmid, prophage and virus (green) are the three main mechanisms that have introduced alien genetic material into the Sorangium genome. Most ICEs fall into the green circle (932/941), whereas approximately 3/4 of the HGT genes lie in the green circle (908/1,268). A total of 197 genes are shared between the ICE and HGT groups, and 193 genes are common to all three groups. About half of the genes could be designated as alien genetic material (5,129/11,599).
Comparison of COG assignments between Sorangium cellulosum So0157-2 and So ce56
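| COG category | All features: So0157-2 | All features: So ce56 | All features: P value | Homologous genes: So0157-2 | Homologous genes: So ce56 | Homologous genes: P value | Strain-specific genes: So0157-2 | Strain-specific genes: So ce56 | Strain-specific genes: P value |
|---|---|---|---|---|---|---|---|---|---|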
| RNA processing and modification | 2 | 3 | 0.812 | 2 | 3 | 0.6631 | 0 | 0 | - |
| Chromatin Structure and dynamics | 2 | 2 | 0.8315 | 2 | 2 | 0.9917 | 0 | 0 | - |
| Energy production and conversion | 281 | 264 | 0.0835 | 222 | 219 | 0.8352 | 59 | 45 | 0.057 |
| Cell cycle control and mitosis | 47 | 40 | 0.8967 | 35 | 34 | 0.9654 | 12 | 6 | 0.9514 |
| Amino acid metabolism and transport | 365 | 310 | 0.5447 | 287 | 280 | 0.7009 | 78 | 30 | 0.202 |
| Nucleotide metabolism and transport | 89 | 83 | 0.3888 | 78 | 76 | 0.8839 | 11 | 7 | 0.8507 |
| Carbohydrate metabolism and transport | 324 | 233 | 0.1794 | 233 | 202 | 0.1159 | 91 | 31 | 0.054 |
| Coenzyme metabolism | 205 | 185 | 0.2979 | 174 | 164 | 0.5534 | 31 | 21 | 0.4085 |
| Lipid metabolism | 218 | 172 | 0.8472 | 167 | 148 | 0.2629 | 51 | 24 | 0.8047 |
| Translation | 183 | 184 | 0.0398* | 171 | 173 | 0.991 | 12 | 11 | 0.2379 |
| Transcription | 427 | 309 | 0.1396 | 283 | 251 | 0.1359 | 144 | 58 | 0.1242 |
| Replication and repair | 196 | 170 | 0.5345 | 125 | 136 | 0.5893 | 71 | 34 | 0.8012 |
| Cell wall/membrane/envelope biogenesis | 331 | 277 | 0.7 | 259 | 238 | 0.302 | 72 | 39 | 0.8803 |
| Cell motility | 61 | 53 | 0.7727 | 50 | 47 | 0.7985 | 11 | 6 | 0.9111 |
| Post-translational modification, protein turnover, chaperone functions | 231 | 197 | 0.614 | 176 | 170 | 0.7111 | 55 | 27 | 0.9271 |
| Inorganic ion transport and metabolism | 228 | 192 | 0.7129 | 175 | 160 | 0.3838 | 53 | 32 | 0.5509 |
| Secondary Structure | 184 | 139 | 0.5791 | 117 | 107 | 0.4934 | 67 | 32 | 0.8022 |
| General function prediction only | 961 | 802 | 0.5076 | 727 | 675 | 0.1004 | 234 | 127 | 0.6719 |
| Signal Transduction | 460 | 405 | 0.2152 | 347 | 343 | 0.7967 | 113 | 62 | 0.7493 |
| Intracellular trafficking and secretion | 41 | 37 | 0.7108 | 34 | 36 | 0.9393 | 7 | 1 | 0.3916 |
| Defense mechanisms | 123 | 70 | 0.0216* | 67 | 61 | 0.6148 | 56 | 9 | 0.0009*** |
| Function Unknown | 582 | 464 | 0.8392 | 413 | 356 | 0.0260* | 169 | 108 | 0.0857 |
| Total | 5541 | 4591 | 0.0936 | 4144 | 3881 | <0.0001*** | 1397 | 710 | 0.0030** |
Two-tailed statistical analysis was conducted using the chi-square test with Yates' correction.
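To make the test named in the table note concrete, here is a minimal Python sketch of a chi-square test with Yates' continuity correction on a single 2x2 table. This section does not say how the authors built their contingency tables, so the layout below (genes inside versus outside one COG category, per strain, with the table's Total row as the denominator) is an assumption, and the resulting P value should not be expected to match the tabulated one. The function name `yates_chi2` is hypothetical.

```python
import math


def yates_chi2(a, b, c, d):
    """Chi-square test with Yates' continuity correction for [[a, b], [c, d]].

    Returns (chi2, p) where p is the upper-tail probability with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Yates' correction shrinks |ad - bc| by n/2, floored at zero.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# ASSUMED table layout: "Defense mechanisms" counts under "All features",
# compared against the remaining COG-assigned genes of each strain.
in_a, total_a = 123, 5541  # So0157-2
in_b, total_b = 70, 4591   # So ce56
chi2, p = yates_chi2(in_a, total_a - in_a, in_b, total_b - in_b)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Swapping in a different denominator, such as the total CDS count of each genome, changes the off-category cells and therefore the P value, which is one reason the exact table construction matters when trying to reproduce the reported figures.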
Figure 3. Transcriptomic analysis of S. cellulosum So0157-2 at pH 7.0 and pH 9.0.
(a) Categories of genes that are differentially expressed at pH 7.0 and pH 9.0. (b) Categories of significantly differentially expressed genes at pH 7.0 and pH 9.0. All detected transcripts were characterized by clusters of orthologous groups (COG) categories. (c) Statistical analysis of gene expression. Plots of the log2 ratio (fold-change) vs. the mean log expression values under pH 7.0 and pH 9.0 conditions. Red dots indicate the differentially expressed genes at a 5% false discovery rate. The yellow and red dots in the upper left corners of the two panels indicate the genes with the largest log fold changes.
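The quantities plotted in panel (c) can be illustrated with a short Python sketch: a log2 fold change and a mean log expression value per gene, plus a 5% false discovery rate cutoff. The caption does not name the software or the multiple-testing procedure the authors used, so the Benjamini-Hochberg step, the pseudocount, and the function names `ma_values` and `benjamini_hochberg` are all assumptions chosen for illustration, and the input numbers are made up.

```python
import math


def ma_values(expr_a, expr_b, pseudocount=1.0):
    """Return (log2 ratio of b over a, mean of the two log2 expression values)."""
    log_a = math.log2(expr_a + pseudocount)
    log_b = math.log2(expr_b + pseudocount)
    return log_b - log_a, (log_a + log_b) / 2


def benjamini_hochberg(pvalues, alpha=0.05):
    """Flag p-values that pass the Benjamini-Hochberg step-up rule at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value sits under its threshold.
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical expression values at pH 7.0 and pH 9.0 for three genes,
# with hypothetical per-gene p-values from some differential expression test.
ph7 = [120.0, 15.0, 300.0]
ph9 = [480.0, 14.0, 310.0]
pvals = [0.001, 0.60, 0.45]

flags = benjamini_hochberg(pvals, alpha=0.05)
for x, y, flag in zip(ph7, ph9, flags):
    m_val, a_val = ma_values(x, y)
    print(f"log2 ratio = {m_val:+.2f}, mean log2 expression = {a_val:.2f}, significant = {flag}")
```

Plotting the first value against the second for every gene, and coloring the flagged genes, gives a plot of the kind the caption describes.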