Haohui Zhong, Hao Sun, Ronghua Liu, Yuanchao Zhan, Xinyu Huang, Feng Ju, Xiao-Hua Zhang.
Abstract
Hadal zones are marine environments deeper than 6,000 m, most of which comprise oceanic trenches. Microbes thriving at such depths experience high hydrostatic pressure and low temperature. The genomic potential of these microbes to adapt to such extreme environments is largely unknown. Here, we compare five complete genomes of bacterial strains belonging to Labrenzia aggregata (Alphaproteobacteria), including four from the Mariana Trench at depths up to 9,600 m and one reference from surface seawater of the East China Sea, to uncover the genomic potential of this species. Genomic investigation suggests that all five strains of L. aggregata participate in the nitrogen and sulfur cycles, including denitrification, dissimilatory nitrate reduction to ammonium (DNRA), thiosulfate oxidation, and dimethylsulfoniopropionate (DMSP) biosynthesis and degradation. Further comparisons show that, among the five strains, 85% of gene functions are similar, with 96.7% of them encoded on the chromosomes, whereas function-specific genes related to osmoregulation, antibiotic resistance, viral infection, and secondary metabolite biosynthesis are contributed mainly by the differing plasmids. A subsequent analysis suggests that the number of plasmid genes increases with isolation depth and that most plasmids are dissimilar among the five strains. These findings provide a better understanding of the genomic potential within a single species throughout a deep-sea water column and highlight the importance of externally originated plasmid genes putatively shaped by the deep-sea environment.
Keywords: Labrenzia aggregata; comparative genomics; hadal zone; metabolic capacity; nitrogen cycle; sulfur cycle; the Mariana Trench
Year: 2021 PMID: 34970235 PMCID: PMC8712697 DOI: 10.3389/fmicb.2021.770370
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. Phylogenetic inference of the Labrenzia aggregata strains. This is a maximum-likelihood codon tree built under the KOSI07 + F + R3 model, selected by ModelFinder in IQ-TREE 2 (Minh et al., 2020), based on concatenated coding sequences of 52 ribosomal proteins totaling 22,932 nucleotides. The tree was tested with 1,000 ultrafast bootstrap replicates, and the support percentage for each branch is labeled near its node.
Table 1. Genomic information of the five completely sequenced Labrenzia aggregata strains.
| Strain | Isolation Location | Isolation Depth (m) | Isolation Source | Genome (Mbp) | Chromosome (Mbp) | Plasmids Total (Mbp) | Each Plasmid (Mbp) | G + C (%) | CDS | Unique Gene Annotations | Hypothetical Proteins | tRNA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LZB033 | East China Sea | 0 | Water | 6.49 | 5.85 | 0.64 | 0.45, 0.19 | 59 | 6,248 | 3,052 | 1,863 | 52 |
| SDL044 | Mariana Trench | 0 | Water | 6.55 | 5.89 | 0.66 | 0.46, 0.10, 0.07, 0.04 | 59.1 | 6,312 | 3,067 | 1,913 | 53 |
| RF14 | Mariana Trench | 4,000 | Water | 6.62 | 5.87 | 0.75 | 0.27, 0.25, 0.13, 0.10 | 59 | 6,421 | 3,048 | 1,952 | 52 |
| ZYF703 | Mariana Trench | 9,600 | Water | 6.69 | 5.88 | 0.81 | 0.45, 0.19, 0.17 | 59 | 6,456 | 3,088 | 1,967 | 51 |
| ZYF612 | Mariana Trench | 9,600 | Water | 6.96 | 5.83 | 1.13 | 0.47, 0.25, 0.22, 0.19 | 58.8 | 6,825 | 3,121 | 2,215 | 54 |
The plasmids are sorted in descending order of their sizes and named after this order herein (LZB033P1, LZB033P2; SDL044P1 ~ SDL044P4; RF14P1 ~ RF14P4; ZYF703P1 ~ ZYF703P3; ZYF612P1 ~ ZYF612P4; and P1 = largest in size).
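As a quick consistency check on Table 1, the following minimal sketch recomputes the plasmid share of each genome from the tabulated sizes. It uses plasmid size (Mbp) as a rough proxy for plasmid gene content, which is an assumption of this sketch rather than the authors' analysis; the names `strains` and `plasmid_share` are illustrative only.

```python
# Values copied from Table 1 (all sizes in Mbp).
# Tuple layout: (strain, isolation depth in m, genome, chromosome, plasmids total)
strains = [
    ("LZB033", 0,    6.49, 5.85, 0.64),
    ("SDL044", 0,    6.55, 5.89, 0.66),
    ("RF14",   4000, 6.62, 5.87, 0.75),
    ("ZYF703", 9600, 6.69, 5.88, 0.81),
    ("ZYF612", 9600, 6.96, 5.83, 1.13),
]


def plasmid_share(genome_mbp, plasmid_mbp):
    """Return the plasmid fraction of the genome as a percentage."""
    return 100.0 * plasmid_mbp / genome_mbp


# Plasmid share per strain, keyed by strain name (table order is preserved).
shares = {name: plasmid_share(genome, plasmid)
          for name, _, genome, _, plasmid in strains}

for name, depth, genome, chromosome, plasmid in strains:
    # Chromosome + plasmids should reproduce the genome size up to rounding.
    residual = genome - (chromosome + plasmid)
    print(f"{name}\t{depth:>5} m\tplasmid share {shares[name]:.1f}%"
          f"\trounding residual {residual:+.2f} Mbp")
```

With the tabulated values, the plasmid share rises from roughly 10% in the surface strain LZB033 to roughly 16% in ZYF612 from 9,600 m, while chromosome size stays nearly constant.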
Figure 2. Venn diagrams of the gene function numbers according to RAST annotations (Brettin et al., 2015) in the (A) chromosomes, (B) plasmids, and (C) whole genomes of the five L. aggregata strains. Genes with the same annotation are counted only once regardless of their identity or coverage. Hypothetical protein genes are excluded. The percentage of each part is labeled under each count. Chromosomal genes are functionally highly similar among the five strains, while most function-specific genes are contributed by the plasmids. (D) Predicted curve fittings of the pan-genome and core-genome of the five strains according to gene clustering at 95% nucleotide identity by CD-HIT (Fu et al., 2012), including hypothetical proteins.
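The counting rule behind panels A to C (each annotation counted once per strain, hypothetical proteins excluded) can be sketched with plain set operations. This is a minimal illustration, not the authors' pipeline: the function names and the toy annotations below are hypothetical, and real input would be the per-strain RAST annotation strings.

```python
from collections import Counter


def function_sets(annotations_by_strain):
    """Collapse each strain's annotations to a set of unique function names,
    dropping anything annotated as a hypothetical protein."""
    return {
        strain: {a for a in annotations if "hypothetical protein" not in a.lower()}
        for strain, annotations in annotations_by_strain.items()
    }


def core_and_specific(sets_by_strain):
    """Return (functions present in every strain, functions unique to each strain)."""
    core = set.intersection(*sets_by_strain.values())
    # Count in how many strains each function occurs.
    occurrence = Counter(a for s in sets_by_strain.values() for a in s)
    specific = {
        strain: {a for a in s if occurrence[a] == 1}
        for strain, s in sets_by_strain.items()
    }
    return core, specific


# Toy input (hypothetical, for illustration only).
toy = {
    "strainA": ["nitrate reductase", "nitrate reductase",
                "hypothetical protein", "DMSP lyase"],
    "strainB": ["nitrate reductase", "Hypothetical protein",
                "phage tail protein"],
    "strainC": ["nitrate reductase", "DMSP lyase"],
}

core, specific = core_and_specific(function_sets(toy))
print("core functions:", sorted(core))
print("strain-specific functions:", {k: sorted(v) for k, v in specific.items()})
```

Because the comparison is made on annotation strings, two genes with the same label count as one function even if their sequences differ, which matches the caveat stated in the caption.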
Figure 3. Metabolic reconstruction of L. aggregata. The gene locations and presence in the five strains are depicted as divided circles and pentagrams. G3P, glyceraldehyde 3-phosphate; F6P, fructose 6-phosphate; 6P-gluconate, 6-phospho-gluconate; TCA, tricarboxylic acid; PPP, pentose phosphate pathway; DMSP, dimethylsulfoniopropionate; DMS, dimethyl sulfide; OPGs, osmoregulated periplasmic glucans; and NAGGN, N-acetylglutaminylglutamine amide.
Figure 4. Network of identical plasmid genes (nucleotide identity >95% is considered identical) in the five L. aggregata strains. Each node represents a plasmid, and its size reflects the number of genes after removing multiple copies. The widths of the edges indicate the numbers of shared genes. Unlike Figure 2B, this network is based on sequence identity regardless of annotation; thus, the results also include the previously ignored hypothetical proteins. Most genes are shared by the longest plasmids of the five strains, and the other plasmids encode more specific genes. The plasmids are named after their lengths as mentioned in Table 1. Each of these plasmids encodes fewer than 20 genes identical to chromosomal genes (mainly in P1; data not shown).
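The node sizes and edge widths described in this caption can be derived from identity-based gene clusters in a few lines. The sketch below assumes the clustering (for example at 95% nucleotide identity) has already been done and is available as (cluster, plasmid) pairs; the function name, cluster IDs, and plasmid labels are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict
from itertools import combinations


def shared_gene_network(gene_clusters):
    """Build node sizes and edge weights from (cluster_id, plasmid) pairs.

    Each pair represents one gene. Genes in the same cluster are treated
    as identical, so multiple copies on one plasmid collapse to one entry.
    """
    clusters_per_plasmid = defaultdict(set)
    for cluster_id, plasmid in gene_clusters:
        clusters_per_plasmid[plasmid].add(cluster_id)

    # Node size: number of distinct gene clusters on each plasmid.
    node_size = {p: len(c) for p, c in clusters_per_plasmid.items()}

    # Edge weight: number of clusters shared by a pair of plasmids.
    edges = {}
    for a, b in combinations(sorted(clusters_per_plasmid), 2):
        shared = len(clusters_per_plasmid[a] & clusters_per_plasmid[b])
        if shared:
            edges[(a, b)] = shared
    return node_size, edges


# Toy input (hypothetical): pA carries two copies of cluster c1.
toy_genes = [
    ("c1", "pA"), ("c1", "pA"), ("c2", "pA"),
    ("c1", "pB"), ("c3", "pB"),
    ("c2", "pC"), ("c4", "pC"),
]

node_size, edges = shared_gene_network(toy_genes)
print(node_size)
print(edges)
```

Plasmid pairs with no shared cluster get no edge, so plasmids that carry mostly strain-specific genes end up weakly connected or isolated in the resulting network.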