Thomas H A Haverkamp, Claire Geslin, Julien Lossouarn, Olga A Podosokorskaya, Ilya Kublanov, Camilla L Nesbø.
Abstract
Thermosipho species inhabit thermal environments such as marine hydrothermal vents, petroleum reservoirs, and terrestrial hot springs. A 16S rRNA phylogeny of available Thermosipho spp. sequences suggested the existence of both habitat specialists, adapted to living only in hydrothermal vents, and habitat generalists, inhabiting oil reservoirs, hydrothermal vents, and hot springs. Comparative genomics of 15 Thermosipho genomes separated them into three distinct species with different habitat distributions: the widely distributed T. africanus and the more specialized T. melanesiensis and T. affectus. Moreover, the species can be differentiated on the basis of genome size (GS), genome content, and immune system composition. For instance, the T. africanus genomes are the largest and contain the most carbohydrate metabolism genes, which could explain why these isolates were obtained from ecologically more divergent habitats. Nonetheless, all the Thermosipho genomes, like other Thermotogae genomes, show evidence of genome streamlining. GS differences between the species could further be correlated with differences in defense capacities against foreign DNA, which influence recombination via horizontal gene transfer (HGT). The smallest genomes are found in T. affectus; they contain both Type I and Type III CRISPR-cas systems but no restriction-modification (RM) system genes. We suggest that this has caused these genomes to be almost devoid of mobile elements, in contrast to the genomes of the other two species, which contain a higher abundance of mobile elements combined with different immune system configurations. Taken together, the comparative genomic analyses of Thermosipho spp. revealed genetic variation allowing habitat differentiation within the genus as well as differentiation with respect to invading mobile DNA.
Year: 2018 PMID: 30239713 PMCID: PMC6211235 DOI: 10.1093/gbe/evy202
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Overview of Thermosipho Genomes Used in This Study
| Genome | NCBI Accession Number | No. of Contigs | Genome Size (bp) | GC Content (%) | Intergenic Region Size (% GS) | Sequencing Technology | Isolation Source | References |
|---|---|---|---|---|---|---|---|---|
| | NC_009616 | 1 | 1,915,238 | 31.4 | 7.31 | 454/Sanger | Gills of mussel, deep-sea H. vent, Pacific Ocean, Lau Basin | |
| | JYCX00000000 | 4 | 1,906,053 | 31.5 | 5.98 | MiSeq | Gills of gastropod, deep-sea H. vent, Pacific Ocean, Lau Basin | |
| | CP007389 | 1 | 1,915,344 | 31.4 | 6.06 | PacBio/MiSeq | Sample of active chimney | This study |
| | JYCY00000000 | 21 | 1,890,294 | 32.6 | 6.70 | MiSeq | Sample of active chimney | This study |
| | JYCZ00000000 | 21 | 1,890,575 | 32.6 | 6.51 | MiSeq | Gills of a mussel | |
| | JYDA00000000 | 22 | 1,892,506 | 32.7 | 6.31 | MiSeq | | |
| | JYDB00000000 | 21 | 1,890,479 | 32.6 | 6.39 | MiSeq | Sample of active chimney | |
| T. | CP007223 | 1 | 1,766,633 | 31.5 | 7.30 | PacBio/MiSeq | Deep-sea H. vent, Atlantic Ocean | |
| T. | CP007121 | 1 | 1,781,851 | 31.5 | 6.60 | PacBio/MiSeq | Menez-Gwen hydrothermal field | This study |
| T. | LBEX00000000 | 20 | 1,788,307 | 33.3 | 7.02 | MiSeq | Menez-Gwen hydrothermal field | This study |
| T. | LBFC00000000 | 27 | 1,771,018 | 32.6 | 6.80 | MiSeq | Rainbow H. vent field | |
| T. | LBEY00000000 | 39 | 1,767,555 | 32.1 | 6.88 | MiSeq | Rainbow H. vent field | This study |
| | NKRG00000000 | 23 | 1,933,435 | 30.6 | 8.25 | Ion Torrent | Tadjoura Gulf hydrothermal springs, Djibouti | |
| | AJIP0100000 | 49 | 2,083,551 | 29.9 | 7.64 | 454/Sanger | Produced water from Hibernia oil platform, North-west Atlantic Ocean | This study |
| | NC_011653 | 1 | 2,016,657 | 30.8 | 8.98 | Sanger | Produced water from Troll C oil platform, North Sea | |
Not sequenced during this study.
Strains 429–487 were isolated during the oceanographic cruise Biolau: Pacific Ocean, Lau Basin, deep-sea hydrothermal vent.
Strains 1063–1074 were isolated during the oceanographic cruise Marvel: Atlantic Ocean, Menez Gwen, deep-sea hydrothermal vent.
Isolated during the oceanographic cruise Atos: Atlantic Ocean, Rainbow, deep-sea hydrothermal vent.
Genomes with 1 contig were closed.
Fig. 1. Maximum Likelihood phylogeny of Thermosipho 16S rRNA sequences. The tree was constructed in MEGA6 (Tamura et al. 2013) using the General Time Reversible (GTR) model (Γ + I, four categories, Nei and Kumar 2000). The tree with the highest log likelihood is shown. The percentage of trees in 100 bootstrap replicates in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained by applying the Neighbor-Joining method to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site; the bar indicates 0.05 substitutions per site. The analysis involved 61 nucleotide sequences. Only alignment positions with fewer than 5% gaps, missing data, and ambiguous bases were included, resulting in a total of 1,245 positions in the final data set. The phylogeny was rooted using 16S rRNA gene sequences of representatives from other Thermotogae genera (Defluviitoga, Fervidobacterium, Geotoga, Kosmotoga, Marinitoga, Mesoaciditoga, Mesotoga, Oceanotoga, Petrotoga, Pseudothermotoga, and Thermotoga). All non-Thermosipho Thermotogales formed a distinct clade and were collapsed (see supplementary fig. S1, Supplementary Material online, for the full phylogeny). Thermosipho sequences are colored based on the environment of isolation. Sequences in bold are from whole genome sequences. Triangles after the sequence ID indicate genomes not from this study. Circles after the sequence ID indicate genomes from this study.
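The column filter described in the caption (keep only alignment positions with fewer than 5% gaps, missing data, and ambiguous bases) can be sketched as follows. This is a minimal illustration under stated assumptions, not the MEGA6 implementation: the function name `filter_sites`, the treatment of every character outside A/C/G/T/U as "gap, missing, or ambiguous", and the toy alignment are all hypothetical.

```python
from typing import List

# Assumption: any character other than an unambiguous nucleotide counts as
# a gap, missing datum, or ambiguous base.
UNAMBIGUOUS = set("ACGTU")


def filter_sites(alignment: List[str], max_fraction: float = 0.05) -> List[str]:
    """Keep only columns whose fraction of gap/missing/ambiguous characters
    is strictly below max_fraction ("fewer than 5%" by default)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(alignment)
    kept_columns = []
    for col in range(length):
        n_bad = sum(1 for seq in alignment if seq[col].upper() not in UNAMBIGUOUS)
        if n_bad / n_seqs < max_fraction:
            kept_columns.append(col)

    # Rebuild each sequence from the retained columns only.
    return ["".join(seq[c] for c in kept_columns) for seq in alignment]


# Toy alignment (hypothetical data): columns 2 and 3 each contain one
# gap or ambiguous base out of four sequences, so they are dropped.
toy = ["ACGT", "A-GT", "ACNT", "ACGT"]
filtered = filter_sites(toy)
```

With 61 sequences, as in the caption, a 5% threshold means a column is kept only if at most three sequences carry a gap, missing datum, or ambiguous base at that position (3/61 ≈ 4.9%).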
Fig. 2. Phylogenomic comparison of 15 Thermosipho genomes using core SNPs, ANIb, and the number of shared genes. (A) Neighbor network of the 15 Thermosipho strains. The network is based on SNPs in core genome fragments that were present in all genomes with a minimum of 70% similarity. The network was visualized in SplitsTree using the NeighborNet algorithm from uncorrected distances (Huson and Bryant 2006). (B) Heatmap visualization of pairwise ANIb distances. (C) Fraction of genes shared between 15 Thermosipho genomes. Data were generated at the IMG database using pairwise Bidirectional Best nSimScan Hits, with genes sharing 70% sequence identity and at least 70% coverage. Percentages were calculated by dividing the number of shared genes by the total number of genes for the genome on the y-axis. The colored lines (black, green, and blue) indicate which strains belong to which Thermosipho lineage/species.
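The percentage calculation in panel C (shared genes divided by the total gene count of the genome on the y-axis) can be sketched as below. The sketch only covers the final division; it does not reproduce the IMG bidirectional-best-hit search. The function name, the genome labels `A` and `B`, and all counts are hypothetical.

```python
from typing import Dict, Tuple


def shared_gene_percentages(
    shared_counts: Dict[Tuple[str, str], int],
    gene_totals: Dict[str, int],
) -> Dict[Tuple[str, str], float]:
    """For each (row_genome, column_genome) pair, express the number of
    shared genes as a percentage of the row genome's total gene count."""
    percentages = {}
    for (row, col), shared in shared_counts.items():
        total = gene_totals[row]
        if total <= 0:
            raise ValueError(f"genome {row!r} has no genes to normalise by")
        percentages[(row, col)] = 100.0 * shared / total
    return percentages


# Hypothetical counts: two genomes sharing 1,200 genes.
totals = {"A": 2000, "B": 1600}
shared = {("A", "B"): 1200, ("B", "A"): 1200}
result = shared_gene_percentages(shared, totals)
```

Because the denominator is the row genome, the resulting matrix is asymmetric: the same 1,200 shared genes are 60% of genome `A` but 75% of the smaller genome `B`.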
Fig. 3. Genomic properties of the Thermosipho genomes. (A) Genome size versus optimal growth temperature (OGT) for all Thermosipho genomes compared with all Thermotogales genomes for which the OGT is known. (B) %GC versus relative intergenic region size (% of total GS). (C and D) Boxplot comparison of N-ARSC and C-ARSC values per gene found in Thermosipho genomes, one representative genome of each Thermotogales species, and Prochlorococcus marinus MED4.
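The caption does not define N-ARSC and C-ARSC. Assuming the common definition (the average number of nitrogen or carbon atoms per amino-acid residue side chain of a protein), a per-gene value could be computed as in the sketch below. The function name `arsc` and the handling of non-standard residues are assumptions, and this is not necessarily the authors' procedure.

```python
from typing import Dict

# Nitrogen atoms in each amino-acid side chain (residues not listed have none).
N_SIDE_CHAIN: Dict[str, int] = {"R": 3, "H": 2, "K": 1, "N": 1, "Q": 1, "W": 1}

# Carbon atoms in each amino-acid side chain for the 20 standard residues.
C_SIDE_CHAIN: Dict[str, int] = {
    "A": 1, "R": 4, "N": 2, "D": 2, "C": 1, "Q": 3, "E": 3, "G": 0,
    "H": 4, "I": 4, "L": 4, "K": 4, "M": 3, "F": 7, "P": 3, "S": 1,
    "T": 2, "W": 9, "Y": 7, "V": 3,
}


def arsc(protein: str, atoms_per_side_chain: Dict[str, int]) -> float:
    """Average number of atoms (N or C, depending on the table passed in)
    per residue side chain of one protein sequence.

    Assumption: characters that are not one of the 20 standard residues
    (e.g. 'X' or a stop '*') are ignored rather than counted as zero.
    """
    residues = [aa for aa in protein.upper() if aa in C_SIDE_CHAIN]
    if not residues:
        raise ValueError("sequence contains no standard amino-acid residues")
    return sum(atoms_per_side_chain.get(aa, 0) for aa in residues) / len(residues)


# Toy peptide (hypothetical): one arginine and one lysine.
n_arsc = arsc("RK", N_SIDE_CHAIN)  # (3 + 1) / 2
c_arsc = arsc("RK", C_SIDE_CHAIN)  # (4 + 4) / 2
```

Applying `arsc` to every predicted protein in a genome yields the per-gene distributions that a boxplot like panels C and D would summarize.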
Overview of Genome Content with a Focus on Mobile DNA Defence Systems and Mobile Elements
| Genome | Closed | ORFs | Species-specific ORFs | Strain-specific ORFs | Putative HGT ORFs | Defence Genes | CRISPR Arrays | Transposase Genes | Prophage Present | Prophage ORFs |
|---|---|---|---|---|---|---|---|---|---|---|
| | Y | 1,879 | 424 | 33 | 80 | 34 | 5 | 1 | Yes | 49 |
| | Y | 1,868 | 422 | 4 | 83 | 37 | 5 | 2 (2) | Yes | 53 |
| T. sp. 1063 | Y | 1,706 | 349 | 11 | 66 | 31 | 4 | 0 (1) | | |
| T. sp. 1070 | Y | 1,765 | 351 | 43 | 71 | 31 | 4 | 0 (1) | | |
| T. sp. 1074 | | 1,756 | 353 | 47 | 73 | 31 | 5 | 1 | Yes | 77 |
| | | 1,720 | 348 | 17 | 71 | 33 | 5 | 1 (1) | | |
| T. sp. 1223 | | 1,726 | 349 | 68 | 76 | 30 | 5 | 0 (1) | | |
| | | 1,902 | 638 | 63 | 83 | 24 | 6 | 10 | | |
| | | 1,982 | 656 | 169 | 116 | 19 | 6 | 0 (4) | Yes | 48 |
| | Y | 1,954 | 657 | 64 | 77 | 32 | 12 | 43 | | |
Each genome is used as a reference; therefore, the number of species-specific ORFs can vary between genomes of the same species.
Number in brackets indicates genes with disrupted open reading frames.
Fig. 4. Comparison of COG functional annotations for three Thermosipho species. (A) Complete genome COG category annotations averaged for the three Thermosipho species. COG category counts are shown as relative values to indicate proportional differences in gene content between the three species. The three species are indicated in the text box in the figure. (B) Mobile DNA defense-related COG annotation counts for 10 Thermosipho isolates. A total of 38 COGs were identified using a list of COGs related to defense genes (Makarova et al. 2011). The counts of the identified COGs were summarized into three groups: restriction–modification system COGs, CRISPR-cas-associated COGs, and other COGs. The strains are grouped by Thermosipho species.