Chandni Talwar1, Shekhar Nagar1, Roshan Kumar2, Joy Scaria3,4, Rup Lal5, Ram Krishan Negi6.
Abstract
Devosia are well known for their dominance in soil habitats contaminated with various toxins and are best characterized for their bioremediation potential. In this study, we compared the genomes of 27 strains of Devosia with the aim of understanding their metabolic abilities. The analysis revealed their adaptive gene repertoire, which was evident from the 52% unique pan-gene content. A striking feature of all genomes was the abundance of oligo- and di-peptide permeases (oppABCDF and dppABCDF), with each genome harboring an average of 60.7 ± 19.1 and 36.5 ± 10.6 operon-associated genes, respectively. Apart from their primary role in nutrition, these permeases may help Devosia sense environmental signals and perform chemotaxis in stressed habitats. Through sequence similarity network analyses, we identified 29 Opp and 19 Dpp sequences that shared very little homology with any other sequence, suggesting an expansive short-peptide transport system within Devosia. The substrate-determining components of these permeases, viz. OppA and DppA, further displayed a large diversity that separated into 12 and 9 homologous clusters, respectively, in addition to a large number of isolated nodes. We also examined genome-scale positive selection and found that genes associated with growth (exopolyphosphatase, HesB_IscA_SufA family protein), detoxification (moeB, nifU-like domain protein, alpha/beta hydrolase), chemotaxis (cheB, luxR) and stress response (phoQ, uspA, luxR, sufE) were positively selected. The study highlights the genomic plasticity of Devosia spp. in conferring adaptation, bioremediation and the potential to utilize a wide range of substrates. The widespread toxin-antitoxin loci and 'open' state of the pangenome provided evidence of plastic genomes and a much larger genetic repertoire of the genus that is yet to be uncovered.
Year: 2020 PMID: 31980727 PMCID: PMC6981132 DOI: 10.1038/s41598-020-58163-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
General attributes of the Devosia genomes analyzed in this study.
| Strain | Genome Size (bp) | No. of Contigs | GC Content (%) | CDS | rRNAs (5S, 16S, 23S) | tRNAs | CRISPRs | Source of Isolation | Accession Number | Reference |
|---|---|---|---|---|---|---|---|---|---|---|
| | 5,750,119 | 410 | 65.3 | 5632 | 1,1,1 | 50 | — | Soil sample South Korea: Dokdo Island, East Sea of Korea | NZ_LAJE00000000.2 | [ |
| | 4,297,227 | 25 | 62.7 | 4183 | 2,1,1 | 48 | — | Nitrifying inoculum of activated sludge in Gent, Belgium | NZ_FQVC00000000.1 | [ |
| | 4,136,371 | 48 | 61 | 4183 | 3,1,1 | 48 | — | Greenhouse soil used to cultivate lettuce in Daejeon City, Korea | NZ_LAJG00000000.1 | [ |
| | 3,859,784 | 47 | 61.1 | 3745 | 2,2,2 | 49 | — | Skin of medical leech | NZ_LANJ00000000.1 | Unpublished data |
| | 5,052,234 | 113 | 61.8 | 5042 | 1,1,1 | 52 | — | Riboflavin-rich soil in Rahway, New Jersey | NZ_JQGC00000000.1 | [ |
| | 3,497,719 | 98 | 62.3 | 3437 | 2,2,2 | 48 | — | Soil samples from an India Pesticide Limited plant at hexachlorocyclohexane (HCH) dump site, Lucknow, India | NZ_JZEY00000000.1 | [ |
| | 4,465,063 | 207 | 65.9 | 4432 | 1,1,1 | 49 | — | Diesel-contaminated soil in Geoje, Korea | NZ_JZEX00000000.1 | [ |
| | 3,723,990 | 7 | 61.3 | 3706 | 1,1,1 | 45 | 1 | Hexachlorocyclohexane (HCH)-contaminated site in Chinhat, Lucknow, India | NZ_FPCK00000000.1 | This study |
| | 4,328,275 | 85 | 61.2 | 4353 | 1,1,1 | 49 | — | Alpine glacier cryoconite, Tyrol, Austria | FOMB00000000.1 | Unpublished data |
| | 4,220,684 | 5 | 65.6 | 4107 | 2,1,2 | 48 | 1 | Freshwater from the Putah Creek overflow in Davis, California | NZ_FPKU00000000.1 | Unpublished data |
| | 3,719,665 | 3 | 62.9 | 3722 | 1,1,1 | 46 | 1 | HCH-contaminated pond soil in Ummari village, Lucknow, India | NZ_FXWK00000000.1 | This study |
| | 4,123,118 | 20 | 60.9 | 4165 | 3,1,1 | 48 | — | Sediment sample from Hwasun Beach in Jeju, Republic of Korea | IMG Genome ID 2654587640 | Unpublished data |
| | 4,202,858 | 47 | 62.3 | 4217 | 2,2,2 | 48 | — | Limestone Capitan Formation at −347 m in Lechuguilla Cave, New Mexico, U.S.A. | JNNO00000000.1 | [ |
| | 4,594,249 | 1 | 64.8 | 4574 | 2,2,2 | 51 | — | Human cerebrospinal fluid | NZ_CP011300.1 | [ |
| | 3,919,001 | 16 | 63.8 | 3890 | 1,1,1 | 46 | 1 | Root of | LMEM00000000.1 | [ |
| | 4,397,456 | 5 | 61.5 | 4228 | 1,1,1 | 48 | — | Root of | LMHK00000000.1 | [ |
| | 5,032,994 | 1 | 65.8 | 4992 | 2,2,2 | 57 | — | Wheat field, China; Nanjing | NZ_CP012945.1 | [ |
| | 4,684,238 | 124 | 64 | 4584 | 2,1,1 | 49 | — | Alfalfa soil sample that was enriched with | JQGB00000000.1 | [ |
| | 5,850,117 | 21 | 65.4 | 5737 | 1,1,1 | 51 | — | Root of | LMCR00000000.1 | [ |
| | 5,851,361 | 14 | 65.4 | 5716 | 1,1,1 | 50 | — | Root of | LMEA00000000.1 | [ |
| | 3,816,628 | 24 | 64.1 | 3748 | 1,1,1 | 48 | 1 | Root of | LMGZ00000000.1 | [ |
| | 4,669,456 | 95 | 64 | 4578 | 1,1,1 | 49 | — | Mycotoxin-contaminated wheat field soil in Nanyang, China | CCAO000000000.1 | [ |
| | 3,878,148 | 151 | 64.1 | 3878 | 1,1,1 | 55 | — | Oil palm rhizospheric soil, Temerloh, Pahang, Malaysia | LVVY00000000.1 | Unpublished data |
| | 4,244,488 | 24 | 60.5 | 4206 | 1,1,1 | 48 | — | | LMLO00000000.1 | [ |
| | 4,219,583 | 16 | 60.7 | 4128 | 1,1,1 | 50 | — | | LMQU00000000.1 | [ |
| | 3,831,215 | 11 | 62.5 | 3755 | 2,2,2 | 51 | — | | FOFL00000000.1 | Unpublished data |
| | 4,005,916 | 1 | 61.9 | 4021 | 2,2,2 | 48 | — | Pit mud, Indian Ocean | NZ_CP026747.1 | Unpublished data |
Figure 1. Phylogenomic analysis. The tree is based on 400 conserved bacterial marker gene sequences and was constructed using the maximum likelihood method with 1000 bootstrap replications. The innermost ring represents the three major groups of strains thus formed, denoted Group I, II and III. The colors in the middle ring represent the habitat of each strain and the outermost ring represents their geographic origin. The tree was constructed using iTOL (https://itol.embl.de/) [84].
Figure 2. Phylogenomic analyses. (A) Maximum likelihood tree based on the single-copy core genetic content of the 27 analyzed members of the genus Devosia. Bootstrap values calculated from 100 bootstrap repetitions are shown. (B) Correlation between the genomes on the basis of BLAST-based average nucleotide identity (ANIb) values. The blue and pink squares denote high and low correlation values for a pair of genomes, and the corresponding values of the predicted Pearson correlation coefficients (-1.0 to 1.0) are shown in the adjacent bar.
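The genome-to-genome correlation described for Figure 2B can be illustrated with a short sketch. It assumes that each genome is represented by its row of ANIb values against all genomes and that pairs of rows are compared with the Pearson coefficient; the caption does not spell out this procedure. The genome names and ANIb values below are hypothetical, not data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical ANIb values (%): each row is one genome's identity to
# genomes A, B, C and D, in that order.
ani = {
    "genome_A": [100.0, 92.0, 78.0, 77.0],
    "genome_B": [92.0, 100.0, 79.0, 78.0],
    "genome_C": [78.0, 79.0, 100.0, 91.0],
}

r_ab = pearson(ani["genome_A"], ani["genome_B"])
r_ac = pearson(ani["genome_A"], ani["genome_C"])
print(f"A vs B: {r_ab:.2f}")
print(f"A vs C: {r_ac:.2f}")
```

In this toy input, genomes A and B have similar ANIb profiles and therefore correlate more strongly with each other than A does with C.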
Figure 3. Pangenome analysis. Clustering of genomes based on the presence/absence patterns of 23,421 pangenomic clusters. The genomes are organized in radial layers as core, unique and accessory gene clusters [Euclidean distance; Ward linkage], which are defined by the gene tree in the center. The clades are colored based on the shared gene clusters as shown in the tree at the top right, above the heat maps, and the phylogenomic groups of the strains are denoted by the corresponding colors in the pangenome tree as in Fig. 1. Heat maps denote the functions enriched in the core (below) and strain-specific (top) gene contents based on annotated clusters of orthologous groups (COG) categories. The core and strain-specific gene clusters are highlighted to distinguish them from the dispensable genome. The figure was constructed using the Anvi’o pangenomics workflow (http://merenlab.org/software/anvio/) [33].
Figure 4. Comparative metabolic pathway analysis. The top metabolic pathways within each genome are compared based on their percentage reconstruction. A dendrogram constructed from the metabolic profiles is shown at the top and the different phylogenetic groups are shown with corresponding colors. The heat map was constructed using pheatmap [92] in R (R Development Core Team, 2015).
Figure 5. Sequence similarity network analyses. Diversity of (A) oligopeptide (Opp) and (B) dipeptide (Dpp) permeases in the analysed genomes. The nodes represent sequences, which are connected through edges if the similarity exceeds the cutoff score. The networks are thresholded at e-value cutoffs of 1e-30 and 1e-25, respectively. The ABCDF components of the permeases are represented by different colors. The clusters are ranked in order of decreasing number of nodes. Clusters with more than 10 nodes are numbered. (C) Topological properties of the similarity networks: degree distribution, average clustering coefficient, average neighborhood connectivity and closeness centrality are plotted against the number of neighbors. The power law fit curves are shown within each graph.
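The network construction described in the Figure 5 caption (sequences as nodes, joined by an edge when their similarity passes an e-value cutoff) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the sequence names, the pairwise `hits` list and the helper functions are hypothetical, and only the 1e-30 cutoff is taken from the caption. Treating single-node components as "isolated nodes" is also an assumption.

```python
def build_network(nodes, hits, evalue_cutoff):
    """Return adjacency sets; an edge is kept when its e-value is <= cutoff."""
    adjacency = {node: set() for node in nodes}
    for a, b, evalue in hits:
        if a != b and evalue <= evalue_cutoff:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

def connected_components(adjacency):
    """Depth-first search over the graph; returns components, largest first."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)

# Hypothetical sequences and pairwise hits: (query, subject, e-value).
nodes = ["oppA_1", "oppA_2", "oppB_1", "oppC_1", "oppD_1"]
hits = [
    ("oppA_1", "oppA_2", 1e-80),
    ("oppA_2", "oppB_1", 1e-35),
    ("oppC_1", "oppD_1", 1e-12),  # too weak for the 1e-30 cutoff
]

network = build_network(nodes, hits, evalue_cutoff=1e-30)
components = connected_components(network)
clusters = [c for c in components if len(c) > 1]
isolated = sorted(n for c in components if len(c) == 1 for n in c)
print("clusters:", clusters)
print("isolated nodes:", isolated)
```

A stricter (smaller) cutoff removes weak edges, so it tends to split clusters and increase the number of isolated nodes.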
Parameters of the sequence similarity networks.
| Network Parameters | Opp | OppA | Dpp | DppA |
|---|---|---|---|---|
| No. of Nodes | 1529 | 343 | 919 | 192 |
| No. of Edges | 239,422 | — | 23,141 | — |
| Average degree | 313.17 | 21.24 | 50.36 | 15.94 |
| Connected components | 65 | — | 55 | — |
| Isolated nodes | 29 | 12 | 19 | 9 |
| Network Density | 0.20 | 0.06 | 0.05 | 0.08 |
| Characteristic path length | 1.92 | — | 2.02 | — |
| Shortest path | 38% | — | 18% | — |
| Network centralization | 0.298 | 0.096 | 0.19 | 0.12 |
| Clustering coefficient | 0.87 | 0.9 | 0.8 | 0.85 |
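As a consistency check on the table above, the average degree and network density of the Opp and Dpp networks can be recomputed from the node and edge counts. The sketch assumes the standard definitions for an undirected simple graph, average degree 2E/N and density 2E/(N(N-1)), and reads the Opp edge count as 239,422. The source does not state which definitions the analysis tool applied.

```python
def average_degree(n_nodes, n_edges):
    """Mean number of neighbors per node in an undirected graph: 2E / N."""
    return 2 * n_edges / n_nodes

def density(n_nodes, n_edges):
    """Fraction of possible edges that are present: 2E / (N * (N - 1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

# Node and edge counts taken from the table of network parameters.
networks = {"Opp": (1529, 239422), "Dpp": (919, 23141)}

for name, (n_nodes, n_edges) in networks.items():
    print(
        f"{name}: average degree = {average_degree(n_nodes, n_edges):.2f}, "
        f"density = {density(n_nodes, n_edges):.2f}"
    )
```

Under these definitions the computed values agree with the tabulated average degrees (313.17 and 50.36) and densities (0.20 and 0.05).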
List of genes identified to be under positive selection across the genus.
| Gene | Function | ω | p-value | q-value |
|---|---|---|---|---|
| Pyrroline-5-carboxylate | Proline synthesis and osmotic stress | 15.385168 | 0.000456 | 0.004044 |
| Alpha/beta hydrolase | Hydrolysis | 13.417266 | 0.00122 | 0.008221 |
| LamB | Lactam utilization | 12.54433 | 0.001888 | 0.012231 |
| Response regulator in two-component regulatory system with PhoQ | Response to divalent cation starvation; Resistance to antimicrobial peptides | 21.08824 | 0.000026 | 0.000738 |
| Translation initiation factor 3 | Translation | 14.490274 | 0.000714 | 0.005226 |
| Acetyl-coenzyme A carboxyl transferase alpha chain | Membrane lipid synthesis | 17.42453 | 0.000165 | 0.001848 |
| probable iron binding protein from the HesB_IscA_SufA family | Iron starvation | 20.783648 | 0.000031 | 0.000738 |
| Exopolyphosphatase (EC 3.6.1.11) | Inorganic polyphosphate utilization, adaptation to amino acid starvation | 17.073976 | 0.000196 | 0.002064 |
| NifU-like domain protein | Maturation of nitrogenase; scaffold for Fe-S cluster assembly | 11.071378 | 0.003943 | 0.02372 |
| DNA-directed RNA polymerase omega subunit (EC 2.7.7.6) | Transcription | 11.501884 | 0.00318 | 0.019835 |
| Molybdopterin biosynthesis protein MoeB | Cofactor for detoxifying enzymes | 9.045598 | 0.010859 | 0.053791 |
| Transcriptional regulator, LuxR family | Quorum sensing, motility | 19.678534 | 0.000053 | 0.000999 |
| Glutamate methylesterase CheB (EC 3.1.1.61) | Chemotaxis | 14.858994 | 0.000593 | 0.00476 |
| MutT/nudix family protein | Housekeeping enzyme | 10.824536 | 0.004462 | 0.025911 |
| hypothetical protein | — | 18.851394 | 0.000081 | 0.001132 |
| FtsZ (EC 3.4.24.-) | Cell division | 33.57748 | 0 | 0.000009 |
| SSU ribosomal protein S6p | Ribosomal protein | 17.630638 | 0.000148 | 0.001786 |
| Scaffold protein for [4Fe-4S] cluster assembly ApbC, MRP-like | Fe-S cluster assembly; Probable Iron binding protein | 24.659618 | 0.000004 | 0.000227 |
| PetP | HTH-type transcriptional regulator | 9.34578 | 0.009345 | 0.049185 |
| 3-isopropylmalate dehydratase small subunit (EC 4.2.1.33) | Biosynthesis of leucine and lysine | 9.057016 | 0.010797 | 0.053791 |
| Ribonuclease PH (EC 2.7.7.56) | tRNA processing | 18.16992 | 0.000113 | 0.001469 |
| Hypothetical protein | — | 23.900184 | 0.000006 | 0.000227 |
| Universal stress protein UspA and related nucleotide-binding proteins | Response to various stressors | 14.555118 | 0.000691 | 0.005226 |
| Sulfur acceptor protein SufE for iron-sulfur cluster assembly | Oxidative stress and iron starvation | 19.676042 | 0.000053 | 0.000999 |
Figure 6. (A) Positively selected genes in genome pairs of strains isolated from hexachlorocyclohexane (HCH) contaminated sites. dN/dS values are plotted against dS values. The total number of predicted orthologs subjected to the analysis is shown for each pair. The positively evolving proteins with dN/dS values > 1 are labelled. Hypothetical proteins are denoted as hp. (B) Presence/absence pattern of the genes involved in the biosynthesis of the osmolytes glycine betaine, ectoine and hydroxyectoine in response to osmotic stress.
Figure 7. Biodegradation of organic compounds. Clustering of genomes based on the ability to degrade (A) alkylphosphonates and alkanesulphonates and (B) aromatic and xenobiotic compounds. The genomes are colored according to their original phylogenetic clustering at the tip of each branch in the tree.
Figure 8. Metabolic versatility of urea decomposition. (A) The two different metabolic routes of urea decomposition, catalyzed by the enzymes urease and urea carboxylase. (B) A phylogram based on the genes involved in the urease pathway and their organization into operons within genomes. The phylogenetic clades are shown with the colored boxes in front of each genome name in the tree.
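The dN/dS > 1 criterion in the Figure 6A caption can be illustrated with a small sketch. The ortholog names and the dN and dS values are hypothetical, and skipping pairs with dS = 0 is an assumption of this example; the source does not describe how such cases were treated.

```python
def dn_ds(dn, ds):
    """Ratio of non-synonymous to synonymous substitution rates.

    Returns None when dS is zero, because the ratio is undefined there.
    """
    if ds == 0:
        return None
    return dn / ds

# Hypothetical ortholog pairs: name -> (dN, dS).
orthologs = {
    "hp_1": (0.030, 0.020),   # dN exceeds dS
    "geneX": (0.005, 0.050),  # dN well below dS
    "geneY": (0.010, 0.0),    # ratio undefined
}

positively_selected = []
for name, (dn, ds) in orthologs.items():
    omega = dn_ds(dn, ds)
    if omega is not None and omega > 1:
        positively_selected.append(name)

print("dN/dS > 1:", positively_selected)
```

A ratio above 1 means non-synonymous changes accumulate faster than synonymous ones, which is the signal the caption labels as positive evolution.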
Various toxin-antitoxin (TA) systems identified within Devosia genomes.
| Genomes | RelB/StbD | RelE/StbE | ParE | ParD | HigB | HigA | VapC | VapB | VapB1 | YoeB | YefM | YafQ | DinJ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DDB001 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| 17-2-E-8 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| L15 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| GH2-10 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| HST3-14 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| S37 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| DSM17137 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| Root685 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| IPL20 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| BD-c194 | 0 | 0 | 1 | 1 | 3 | 2 | 4 | 5 | 1 | 1 | 1 | 0 | 0 |
| Root635 | 0 | 0 | 0 | 1 | 2 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
| E84 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| A16 | 0 | 0 | 1 | 1 | 0 | 3 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Root105 | 0 | 0 | 1 | 1 | 1 | 1 | 2 | 2 | 0 | 0 | 0 | 0 | 1 |
| CGMCC 1.10210 | 0 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0 | 0 | 1 | 0 | 0 |
| IFO13584 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 2 | 0 | 0 | 0 | 0 | 0 |
| LC5 | 0 | 0 | 1 | 2 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 |
| YR412 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 |
| Root413-D1 | 0 | 0 | 1 | 1 | 1 | 1 | 2 | 2 | 0 | 0 | 0 | 0 | 1 |
| H5989 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Leaf420 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ATCC 23634 | 1 | 0 | 0 | 2 | 0 | 1 | 2 | 2 | 0 | 0 | 0 | 0 | 1 |
| Leaf64 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Root436 | 0 | 0 | 0 | 1 | 2 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
| DS-56 | 1 | 1 | 3 | 2 | 0 | 0 | 5 | 2 | 0 | 1 | 1 | 1 | 0 |