Matteo Cossu, Violette Da Cunha, Claire Toffano-Nioche, Patrick Forterre, Jacques Oberto.
Abstract
The genomes of the 21 completely sequenced Thermococcales display a characteristically high level of rearrangement. As a result, their origin and terminus of replication cannot be predicted from chromosomal DNA composition or skew alone. Using a different approach based on biologically relevant sequences, we were able to determine the position of oriC in all 21 genomes. The position of dif, the site where chromosome dimers are resolved before DNA segregation, could be predicted in 19 genomes. Computation of the core genome uncovered a number of essential gene clusters with a remarkably stable chromosomal position across species, in sharp contrast with the scrambled nature of their genomes. The active chromosomal reorganization of numerous genes acquired by horizontal transfer, mainly from mobile elements, could explain this phenomenon.
Keywords: Archaea; Bioinformatics; Chromosomal landmarks; Genome evolution; Mobile elements; Thermococcales
Year: 2015 PMID: 26166067 PMCID: PMC4640148 DOI: 10.1016/j.biochi.2015.07.008
Source DB: PubMed Journal: Biochimie ISSN: 0300-9084 Impact factor: 4.079
List of Thermococcales species with a complete genome sequence available.
| Species | Bioproject | GI | Genes | Size (Mb) | GC% | Optimum temperature | Habitat | Reference |
|---|---|---|---|---|---|---|---|---|
| | | 664800204 | 2046 | 1.86 | 43.0 | 80 °C | Aquatic | |
| | | 14518450 | 1875 | 1.77 | 44.71 | 103 °C/90 °C | Aquatic | |
| | | 18976372 | 2225 | 1.90 | 40.77 | 100 °C/90 °C | Aquatic | |
| | | 397650687 | 2113 | 1.91 | 40.79 | 100 °C | Aquatic | |
| | | 14589963 | 2000 | 1.73 | 41.88 | 98 °C/95 °C | Aquatic | |
| | | 332157643 | 2028 | 1.86 | 42.74 | 93 °C | Aquatic | |
| | | 389851449 | 1839 | 1.73 | 42.30 | 95 °C | Aquatic | |
| | | 337283511 | 1952 | 1.72 | 51.64 | 98 °C | Aquatic | |
| | | 315229765 | 2257 | 2.01 | 41.76 | 85 °C | Aquatic | |
| | | 700302025 | 2183 | 2.12 | 53.47 | 85 °C | Aquatic | |
| | | 240102057 | 2210 | 2.05 | 53.56 | 88 °C | Aquatic | |
| | | 744793172 | 2170 | 1.92 | 52.86 | 88 °C | Aquatic | Zhang, X. et al., 2015 |
| | | 57639935 | 2358 | 2.09 | 52.00 | 85 °C | Aquatic | |
| | | 530547444 | 2575 | 2.22 | 43.09 | 83 °C | Aquatic | |
| | | 589908590 | 2288 | 1.97 | 54.84 | 87.5 °C | Aquatic | |
| | | 212223144 | 2026 | 1.85 | 51.27 | 80 °C | Terrestrial | |
| | | 242397997 | 2107 | 1.85 | 40.20 | 78 °C | Oil | |
| | | 341581088 | 2181 | 2.01 | 56.08 | ND | Aquatic | |
| | | 350525682 | 2279 | 2.08 | 54.78 | 80 °C | Aquatic | |
| | | 390960176 | 2090 | 1.95 | 55.82 | 85 °C | Aquatic | |
| | | 573023865 | 2090 | 1.95 | 40.30 | 82 °C | Aquatic | |
Fig. 1. Phylogenetic tree of the 21 sequenced Thermococcales. The phylogeny of the Thermococcales dataset was calculated with PhyML using the 16S ribosomal RNA genes, as described in Materials and Methods.
Prediction of oriC and dif in Thermococcales.
| Species | Putative oriC: position on chromosome (Orb cluster coord.) | Putative oriC: Cdc6 coord. | Putative dif: sequence (28 bp; left arm, spacer, right arm) | Putative dif: position on chromosome | Putative dif: intergenic location |
|---|---|---|---|---|---|
| | 1858353..0 | 583..1839 | T | 1158048 | Yes |
| | 122701..123499 | 121402..122700 | A | 1220264 | Yes |
| | 15355..16235 | 16236..17498 | | 659548 | Yes |
| | 1479769..1480649 | 1478506..1479768 | | 462638 | Yes |
| | 110790..111561 | 109476..110789 | | 736581 | Yes |
| | 579324..580109 | 578064..579323 | | ND | |
| | 227904..228761 | 228762..230021 | | ND | |
| | 1426398..1427171 | 1427172..1428431 | | 1058381 | Yes |
| | 1672620..1673707 | 1670448..1671713 | T | 880625 | Yes |
| | 425720..426421 | 423614..424867 | | 1862025 | Yes |
| | 126739..127591 | 125431..126738 | T | 1457065 | Yes |
| | 813701..814368 | 1594403..1595665 | T | 100930 | Yes |
| | 1711251..1712157 | 1712158..1713405 | T | 483614 | Yes |
| | 974680..975085 | 1594403..1595665 | T | 1867166 | No |
| | 1603522..1604207 | 1605068..1606321 | T | 772784 | Yes |
| | 1510250..1510926 | 1508116..1509363 | | 854799 | Yes |
| | 1783451..1784177 | 1434100..1435362 | T | 689121 | No |
| | 1373703..1374410 | 1376165..1377412 | T | 97343 | Yes |
| | 1530315..1531266 | 1529070..1530314 | | 849102 | Yes |
| | 1018000..1018309 | 1020367..1021614 | | 1704316 | Yes |
| | 1754560..1755481 | 1752377..1753639 | T | 1028150 | Yes |
| | | | W | | |
Fig. 2. Venn diagram of core and genus-specific protein counts. Core proteins, genus-specific proteins and their combinations were computed as described in Materials and Methods.
ArCOG assignment of the Thermococcales core genes.
| ArCOG class | Function | 790 core | 668 core |
|---|---|---|---|
| Information storage and processing | Translation, ribosomal structure and biogenesis | | 140 |
| | RNA processing and modification | | 0 |
| | Transcription | | 43 |
| | Replication, recombination and repair | | 45 |
| | Chromatin structure and dynamics | | 0 |
| Cellular processes and signaling | Cell cycle control, cell division, chromosome partitioning | | 8 |
| | Nuclear structure | | 0 |
| | Defense mechanisms | | 8 |
| | Signal transduction mechanisms | | 4 |
| | Cell wall/membrane/envelope biogenesis | | 12 |
| | Cell motility | | 5 |
| | Cytoskeleton | | 0 |
| | Extracellular structures | | 0 |
| | Intracellular trafficking, secretion, and vesicular transport | | 8 |
| | Posttranslational modification, protein turnover, chaperones | | 22 |
| | Mobilome: prophages, transposons | | 0 |
| Metabolism | Energy production and conversion | | 28 |
| | Carbohydrate transport and metabolism | | 30 |
| | Amino acid transport and metabolism | | 36 |
| | Nucleotide transport and metabolism | | 25 |
| | Coenzyme transport and metabolism | | 36 |
| | Lipid transport and metabolism | | 12 |
| | Inorganic ion transport and metabolism | | 11 |
| | Secondary metabolites biosynthesis, transport and catabolism | | 4 |
| Poorly characterized | General function prediction only | | 115 |
| | Function unknown | | 76 |
Bold numbers in columns 1 & 3 refer to 790 core genes.
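As a quick arithmetic check on the ArCOG table, the per-category counts can be tallied by class. The snippet below is only an illustrative sketch: the names `arcog_counts`, `class_totals` and `grand_total` are assumptions introduced here, and the values are copied from the table as listed. The total comes to 668, which matches the size of the smaller core set.

```python
# Per-category counts copied from the ArCOG table, grouped by ArCOG class.
# The variable names are hypothetical and used only for this illustration.
arcog_counts = {
    "Information storage and processing": [140, 0, 43, 45, 0],
    "Cellular processes and signaling": [8, 0, 8, 4, 12, 5, 0, 0, 8, 22, 0],
    "Metabolism": [28, 30, 36, 25, 36, 12, 11, 4],
    "Poorly characterized": [115, 76],
}

# Sum the categories within each class, then across all classes.
class_totals = {name: sum(values) for name, values in arcog_counts.items()}
grand_total = sum(class_totals.values())

for name, total in class_totals.items():
    print(f"{name}: {total}")
print(f"Total: {grand_total}")
```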
Fig. 3. Graphical correlation between core-free genomic regions and integration of mobile elements in Thermococcus kodakarensis. The physical map of the Thermococcus kodakarensis chromosome was drawn to scale. The outermost numbered cyan bars indicate the clusters of core genes. Each black bar marks the position of a single gene of the entire genome: the outer bars correspond to genes transcribed in the same polarity as DNA replication; the inner bars refer to the opposite orientation. Similarly, red bars correspond to single 'core genes', with the same orientation convention as above. Bright green bars indicate the location of clusters of species-specific genes (integrated mobile elements). Purple and green bars correspond to GC skew values calculated in windows of 1000 bp, shifted by 500 bp, with purple and green indicating values below and above the average genomic GC skew, respectively. Predicted origins of replication and dif sites are shown as green circles and red squares, respectively. The positions of the four integrated elements (TKV1 to TKV4) as well as the predicted dark matter islands are shown in blue.
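The windowed GC skew described in the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual definition of GC skew, (G - C)/(G + C), takes the "average genomic GC skew" to be the skew of the whole sequence, and does not wrap windows around the circular chromosome. The function names `gc_skew`, `gc_skew_windows` and `classify_windows` are hypothetical.

```python
def gc_skew(seq):
    """GC skew of a sequence, (G - C) / (G + C); 0.0 if it has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def gc_skew_windows(seq, window=1000, step=500):
    """Return (start, skew) pairs for windows of `window` bp shifted by `step` bp."""
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def classify_windows(seq, window=1000, step=500):
    """Label each window 'above' or 'below' relative to the whole-sequence skew.

    Assumption: windows exactly equal to the average are labelled 'below'.
    """
    average = gc_skew(seq)
    return [
        (start, skew, "above" if skew > average else "below")
        for start, skew in gc_skew_windows(seq, window, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    toy = "G" * 1000 + "C" * 1000
    for start, skew, label in classify_windows(toy):
        print(start, round(skew, 3), label)
```

On a real genome, the sequence string would first be read from a FASTA or GenBank record; that step is omitted here to keep the sketch self-contained.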
Thermococcales conserved clusters characteristics.
| Cluster | oriC distance, mean (%) | oriC distance, standard deviation (%) | Number of genes | Mean expression level | Relevant encoded protein(s) |
|---|---|---|---|---|---|
| 01 | 0.33 | 0.44 | 3 | 478.9 | Hypothetical |
| 02 | 2.69 | 1.91 | 2 | 221.1 | Molybdopterin converting factor, subunit 2 |
| 03 | 5.17 | 3.42 | 2 | 2551.7 | Hypothetical |
| 04 | 5.39 | 3.23 | 3 | 557.2 | KEOPS complex KAE1 |
| 05 | 7.36 | 4.34 | 7 | 877.6 | V-type ATP synthase, 7 subunits |
| 06 | 8.25 | 3.41 | 3 | 268.2 | Preprotein translocase |
| 07 | 9.14 | 4.67 | 2 | 357.5 | Oligopeptide transporters |
| 08 | 12.94 | 5.18 | 5 | 2926.0 | RNA polymerase |
| 09 | 17.76 | 3.90 | 27 | 3626.6 | Ribosomal proteins |
| 10 | 20.89 | 3.63 | 10 | 2234.8 | Ribosomal proteins – RNA polymerase |
| 11 | 22.40 | 5.77 | 5 | 482.4 | Thymidylate kinase |
| 12 | 23.46 | 4.47 | 3 | 1011.2 | DNA primase |
| 13 | 24.62 | 5.45 | 3 | 234.9 | Mevalonate kinase |
| 14 | 26.50 | 5.92 | 7 | 1535.2 | Ribosomal proteins – RNA polymerase |
| 15 | 33.34 | 6.01 | 2 | 486.7 | Glutamyl-tRNA(Gln) amidotransferase |
| 16 | 34.14 | 5.44 | 2 | 840.6 | Translation initiation factor IF-2 |
| 17 | 38.58 | 5.63 | 2 | 1685.0 | Ribosomal protein |
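One way the oriC distance columns could be derived is sketched below. The table does not state the normalization, so this is an assumption: the distance is taken along the shorter arc of the circular chromosome and expressed as a percentage of chromosome length, then summarized across genomes with the mean and the sample standard deviation. The function names and the example coordinates are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def oric_distance_percent(position, oric, genome_size):
    """Shortest circular distance from oriC, as a percentage of chromosome length."""
    d = abs(position - oric) % genome_size
    d = min(d, genome_size - d)  # take the shorter arc around the circle
    return 100.0 * d / genome_size


def summarize(distances):
    """Mean and sample standard deviation of per-genome distances (in %)."""
    return mean(distances), stdev(distances)


if __name__ == "__main__":
    # Hypothetical (cluster position, oriC position, genome size) triples in bp.
    examples = [
        (500_000, 0, 2_000_000),
        (100, 1_900_000, 2_000_000),
        (300_000, 100_000, 1_800_000),
    ]
    distances = [oric_distance_percent(p, o, n) for p, o, n in examples]
    avg, sd = summarize(distances)
    print([round(d, 2) for d in distances], round(avg, 2), round(sd, 2))
```

Under this definition no value can exceed 50%, which is at least consistent with the range of means listed in the table.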