Aregu Amsalu Aserse1, Tanja Woyke2, Nikos C Kyrpides2, William B Whitman3, Kristina Lindström1.
Abstract
The type strain of the prospective Bradyrhizobium shewense sp. nov., ERR11T, was isolated from a nodule of the leguminous tree Erythrina brucei, native to Ethiopia. The type strain Bradyrhizobium yuanmingense CCBAU 10071T was isolated from the nodules of Lespedeza cuneata in Beijing, China. The genomes of ERR11T and CCBAU 10071T were sequenced by DOE-JGI and deposited at the DOE-JGI genome portal as well as at the European Nucleotide Archive. The genome of ERR11T is 9,163,226 bp in length and has 102 scaffolds, containing 8548 protein-coding and 86 RNA genes. The CCBAU 10071T genome is 8,201,522 bp long, is arranged in 108 scaffolds, and contains 7776 protein-coding and 85 RNA genes. Both genomes contain symbiotic genes that are homologous to the genes found in the complete genome sequence of Bradyrhizobium diazoefficiens USDA 110T. The genes encoding nodulation and nitrogen fixation functions in ERR11T showed high sequence similarity with homologous genes found in the draft genome of peanut-nodulating Bradyrhizobium arachidis LMG 26795T. The nodulation genes nolYA-nodD2D1YABCSUIJ-nolO-nodZ of ERR11T and CCBAU 10071T are organized in a similar way to the homologous genes identified in the genomes of USDA 110T, Bradyrhizobium ottawaense USDA 4 and Bradyrhizobium liaoningense CCBAU 05525. The genomes harbor hupSLCFHK and hypBFDE genes that encode hydrogenase, an enzyme that helps rhizobia take up hydrogen released by the N2-fixation process, as well as the denitrification genes napEDABC and norCBQD for nitrate and nitric oxide reduction, respectively. The genome of ERR11T also contains nosRZDFYLX genes encoding nitrous oxide reductase.
Based on multilocus sequence analysis of housekeeping genes, the novel species, which contains eight strains, formed a unique group close to the Bradyrhizobium ottawaense branch. Genome Average Nucleotide Identity (ANI) calculated between the genome sequence of ERR11T and closely related sequences revealed that strains belonging to the Bradyrhizobium ottawaense branch (USDA 4 and CCBAU 15615) were the closest to strain ERR11T, with 95.2% ANI. Type strain ERR11T showed the highest predicted DDH value with CCBAU 15615 (58.5%), followed by USDA 4 (53.1%). Nevertheless, the ANI and DDH values obtained between ERR11T and CCBAU 15615 or USDA 4 were below the cutoff values (ANI ≥ 96.5%; DDH ≥ 70%) for strains belonging to the same species, suggesting that ERR11T is a new species. Therefore, based on the phylogenetic analysis and the ANI and DDH values, we formally propose the creation of Bradyrhizobium shewense sp. nov. with strain ERR11T (HAMBI 3532T = LMG 30162T) as the type strain.
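The species-delineation argument above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the cutoffs and the ANI/DDH values are those quoted in the abstract, while the function name `same_species` and the choice to require both cutoffs at once are assumptions made here for clarity.

```python
# Cutoffs quoted in the abstract for strains of the same species.
ANI_CUTOFF = 96.5  # percent
DDH_CUTOFF = 70.0  # percent


def same_species(ani, ddh):
    """Hypothetical helper: True only if BOTH values reach their cutoffs.

    Requiring both criteria is an assumption of this sketch.
    """
    return ani >= ANI_CUTOFF and ddh >= DDH_CUTOFF


# (ANI %, predicted DDH %) of ERR11T against its two closest relatives,
# as reported in the abstract.
comparisons = {
    "CCBAU 15615": (95.2, 58.5),
    "USDA 4": (95.2, 53.1),
}

for strain, (ani, ddh) in comparisons.items():
    verdict = "same species" if same_species(ani, ddh) else "different species"
    print(f"ERR11T vs {strain}: ANI={ani}%, DDH={ddh}% -> {verdict}")
```

Both comparisons fall below the two cutoffs, which is the basis for treating ERR11T as a separate species.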
Keywords: Bradyrhizobium shewense sp. nov. ERR11T; Bradyrhizobium yuanmingense CCBAU 10071T; Digital DNA-DNA hybridization; Erythrina brucei; Ethiopia; Genome average nucleotide identity; Symbiotic
Year: 2017 PMID: 29225730 PMCID: PMC5717998 DOI: 10.1186/s40793-017-0283-x
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Fig. 1 Maximum Likelihood phylogenetic tree reconstructed from recA-glnII-rpoB concatenated nucleotide sequences, showing the relationships between Bradyrhizobium shewense sp. nov. (in green) and recognized species of the genus Bradyrhizobium, as well as the position of the type strain Bradyrhizobium yuanmingense CCBAU 10071T. The tree was constructed with the General Time Reversible model in MEGA version 7. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories; +G, parameter = 0.2999). The rate variation model allowed some sites to be evolutionarily invariable ([+I], 31.7544% sites). Bootstrap values (100 replicates) are indicated at the branching points. Reference type strains are indicated with superscript ‘T’. Bar, % estimated substitutions. GenBank accession numbers of the sequences (recA, glnII, rpoB, in that order) are listed in parentheses next to the strain codes. The accession numbers of whole-genome-sequenced strains are indicated in bold with an asterisk (*). Abbreviations: B, Bradyrhizobium; R, Rhizobium; sp., species
ANI and DDH genomic comparison between Bradyrhizobium shewense sp. nov. ERR11T and reference Bradyrhizobium species
| No. | Genome name | NCBI/ENA accession number | MLSA | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ERR11T | FMAI01000000 | | - | 95.2 | 95.2 | 89.6 | 89.3 | 89.3 | 89.2 | 89.1 | 89.0 | 89.0 | 89.0 | 89.0 | 86.9 | 89.6 |
| 2 | | AXAF00000000 | 96.0 | 53.1 | - | 99.9 | 90.0 | 90.3 | 90.3 | 89.1 | 90.2 | 89.1 | 89.1 | 90.0 | 89.1 | 87.1 | 90.1 |
| 3 | | AJQG0100000 | 96.0 | 58.3 | 99.0 | - | 90.2 | 90.3 | 90.4 | 89.2 | 90.3 | 89.2 | 89.1 | 90.0 | 89.2 | 87.1 | 90.3 |
| 4 | | AJQD00000000 | 95.0 | 36.6 | 38.0 | 38.7 | - | 89.7 | 89.6 | 88.8 | 89.2 | 90.4 | 90.4 | 89.6 | 90.4 | 87.0 | 99.9 |
| 5 | | PRJNA255602 | 94.0 | 35.6 | 39.2 | 39.4 | 37.4 | - | 91.2 | 90.0 | 90.3 | 89.3 | 89.2 | 89.8 | 89.2 | 87.6 | 89.7 |
| 6 | | CP011360 | 94.0 | 35.7 | 39.3 | 39.6 | 37.2 | 42.1 | - | 89.4 | 91.0 | 88.8 | 88.8 | 89.7 | 88.8 | 87.7 | 89.6 |
| 7 | | FPBQ01000000 | 94.0 | 35.4 | 35.1 | 35.2 | 34.8 | 37.3 | 35.9 | - | 89.5 | 88.6 | 88.6 | 89.6 | 88.6 | 87.7 | 88.8 |
| 8 | | AP012206 | 94.0 | 35.4 | 39.0 | 39.3 | 36.3 | 39.3 | 41.0 | 36.4 | - | 88.4 | 88.4 | 89.3 | 88.5 | 87.5 | 89.2 |
| 9 | | FMAE00000000 | 94.0 | 34.7 | 34.7 | 35.1 | 38.3 | 35.3 | 34.5 | 34.0 | 33.6 | - | 100.0 | 89.1 | 98.2 | 86.8 | 90.3 |
| 10 | | PRJNA255601 | 94.0 | 34.7 | 34.7 | 35.0 | 38.3 | 35.5 | 34.5 | 33.8 | 33.6 | 100.0 | - | 90.0 | 98.2 | 86.8 | 90.3 |
| 11 | | PRJNA255603 | 94.0 | 34.9 | 38.3 | 38.5 | 37.0 | 37.8 | 37.7 | 34.0 | 36.7 | 35.1 | 35.1 | - | 89.1 | 86.8 | 89.7 |
| 12 | | AJQL00000000 | 94.0 | 34.7 | 35.0 | 35.1 | 38.4 | 35.5 | 34.5 | 34.0 | 33.7 | 82.5 | 82.5 | 35.0 | - | 86.8 | 90.4 |
| 13 | | LJYG00000000 | 94.0 | 31.1 | 58.1 | 31.6 | 31.5 | 32.4 | 32.3 | 32.5 | 32.2 | 31.1 | 31.1 | 30.9 | 31.0 | - | 87.0 |
| 14 | | AJQC00000000 | 95.0 | 36.6 | 38.3 | 39.1 | 99.5 | 37.7 | 37.6 | 34.8 | 36.7 | 38.2 | 38.2 | 37.3 | 38.3 | 31.6 | - |

ANI was computed from protein-coding genes of the genomes using the MiSI program. DDH values were predicted by the Genome-to-Genome Distance Calculator 2.0, formula 2.
The numbers in the MLSA column indicate recA, glnII, rpoB concatenated gene sequence similarities between ERR11T and the reference strains. The numbers below the diagonal are DDH values predicted between pairs of genomes. The numbers above the diagonal are ANI values between genomes; in all ANI calculations AF was ≥ 60%. Reference type strains are indicated with superscript ‘T’; B, Bradyrhizobium
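Because the table stores two different measures in one square matrix, reading a pair correctly depends on which side of the diagonal is consulted. The following minimal sketch shows the lookup convention described in the footnote. It uses only the first three rows and columns of the table as an excerpt; the function name `ani_and_ddh` and the 0-based indexing are assumptions made for illustration.

```python
# Excerpt of the combined matrix: genomes 1-3 only (0-based indices 0-2).
# None marks the diagonal. ANI sits above the diagonal, DDH below it.
matrix = [
    [None, 95.2, 95.2],
    [53.1, None, 99.9],
    [58.3, 99.0, None],
]


def ani_and_ddh(matrix, i, j):
    """Hypothetical helper: return (ANI, DDH) for genomes i and j, i != j."""
    if i == j:
        raise ValueError("a genome is not compared with itself")
    lo, hi = min(i, j), max(i, j)
    ani = matrix[lo][hi]  # row index < column index: above the diagonal
    ddh = matrix[hi][lo]  # row index > column index: below the diagonal
    return ani, ddh


# Genome 1 versus genome 3; the argument order does not matter.
print(ani_and_ddh(matrix, 0, 2))
print(ani_and_ddh(matrix, 2, 0))
```

Both calls return the same pair, 95.2 (ANI) and 58.3 (DDH), since the lookup normalizes the index order before reading either triangle.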
Classification and general features of Bradyrhizobium shewense sp. nov. ERR11T and B. yuanmingense CCBAU 10071T [94]
| MIGS ID | Property | ERR11T: Term | ERR11T: Evidence code | CCBAU 10071T: Term | CCBAU 10071T: Evidence code |
|---|---|---|---|---|---|
| | Classification | Domain Bacteria | TAS [ | Domain Bacteria | TAS [ |
| | | Phylum Proteobacteria | TAS [ | Phylum Proteobacteria | TAS [ |
| | | Class Alphaproteobacteria | TAS [ | Class Alphaproteobacteria | TAS [ |
| | | Order Rhizobiales | TAS [ | Order Rhizobiales | TAS [ |
| | | Family Bradyrhizobiaceae | TAS [ | Family Bradyrhizobiaceae | TAS [ |
| | | Genus Bradyrhizobium | TAS [ | Genus Bradyrhizobium | TAS [ |
| | | Species Bradyrhizobium shewense sp. nov. | IDA | Species Bradyrhizobium yuanmingense | TAS [ |
| | | Type strain ERR11T | IDA | Type strain CCBAU 10071T | TAS [ |
| | Gram stain | Negative | IDA | Negative | IDA |
| | Cell shape | Rod | IDA | Rod | IDA |
| | Motility | Motile | IDA | Motile | IDA |
| | Sporulation | Non-sporulating | IDA | Non-sporulating | IDA |
| | Temperature range | Mesophile | IDA | Mesophile | TAS [ |
| | Optimum temperature | 28 °C | IDA | 28 °C | TAS [ |
| | pH range; Optimum | 5–10; 7 | IDA | 6.5–7.5; 7 | TAS [ |
| | Carbon source | Varied (Additional file | IDA | Varied | TAS [ |
| MIGS-6 | Habitat | Soil, root nodule | [ | Soil, root nodule | TAS [ |
| MIGS-6.3 | Salinity | Non-halophile | IDA | Non-halophile | TAS [ |
| MIGS-22 | Oxygen requirement | Aerobic | IDA | Aerobic | TAS [ |
| MIGS-15 | Biotic relationship | Free living, symbiotic | IDA | Free living, symbiotic | TAS [ |
| MIGS-14 | Pathogenicity | Non-pathogenic | NAS | Non-pathogenic | NAS |
| MIGS-4 | Geographic location | Central Ethiopia | [ | Beijing, China | TAS [ |
| MIGS-5 | Sample collection | September, 2007 | [ | 1995 | TAS [ |
| MIGS-4.1 | Latitude | 08° 59' 38" | [ | Not reported | TAS [ |
| MIGS-4.2 | Longitude | 038° 4' 18.5" | [ | Not reported | TAS [ |
| MIGS-4.4 | Altitude | 2327 | [ | Not reported | TAS [ |
Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [101]
Fig. 2 Gram stain and dimensions of Bradyrhizobium shewense sp. nov. ERR11T and Bradyrhizobium yuanmingense CCBAU 10071T
Project information
| MIGS ID | Property | Term, ERR11T | Term, CCBAU 10071T |
|---|---|---|---|
| MIGS-31 | Finishing quality | High-quality draft | High-quality draft |
| MIGS-28 | Libraries used | Illumina std. shotgun library | Illumina std. shotgun library |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2500, Illumina HiSeq 2500-1TB | Illumina HiSeq 2500, Illumina HiSeq 2500-1TB |
| MIGS-31.2 | Fold coverage | 225.2× | 279.9× |
| MIGS-30 | Assemblers | Velvet (version 1.2.07), ALLPATHS-LG (version r46652) | Velvet (version 1.2.07), ALLPATHS-LG (version r46652) |
| MIGS-32 | Gene calling method | Prodigal | Prodigal |
| | Locus Tag | ATF67 | ATF66 |
| | GenBank ID | | |
| | GenBank Date of Release | 01-AUG-2016 | 01-AUG-2016 |
| | GOLD ID | Gp0108279 | Gp0108280 |
| | BIOPROJECT | | |
| MIGS-13 | Source Material Identifier | ERR11 | CCBAU 10071 |
| | Project relevance | Symbiotic N2 fixation, agriculture | Symbiotic N2 fixation, agriculture |
Genome statistics
| Attribute | ERR11T: Value | ERR11T: % of Total | CCBAU 10071T: Value | CCBAU 10071T: % of Total |
|---|---|---|---|---|
| Genome size (bp) | 9,163,226 | 100% | 8,201,522 | 100% |
| DNA coding (bp) | 8548 | 99% | 6,928,453 | 84.48% |
| DNA G + C (bp) | 5,792,812 | 63.22% | 5,230,108 | 63.77% |
| DNA scaffolds | 102 | 100% | 108 | 100% |
| Total genes | 8634 | 100% | 7861 | 100% |
| Protein coding genes | 8548 | 99% | 7776 | 98.92% |
| RNA genes | 86 | 1% | 85 | 1.08% |
| Pseudo genes | not determined | | not determined | |
| Genes in internal clusters | 1889 | 21.88% | 1457 | 18.53% |
| Genes with function prediction | 6282 | 72.76% | 5703 | 72.55% |
| Genes assigned to COGs | 5346 | 61.92% | 4913 | 62.50% |
| Genes with Pfam domains | 6555 | 75.92% | 6014 | 76.50% |
| Genes with signal peptides | 924 | 10.70% | 812 | 10.33% |
| Genes with transmembrane helices | 1956 | 22.65% | 1772 | 22.54% |
| CRISPR repeats | 3 | | 1 | |
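The "% of Total" columns follow from simple ratios: base-pair counts are divided by genome size and gene counts by total genes. The sketch below recomputes a few of them from the values in the table. The dictionary layout and the helper name `percent` are assumptions made for illustration, and rounding to two decimals is a choice of this sketch rather than a documented convention of the source.

```python
def percent(part, total):
    """Hypothetical helper: share of `part` in `total`, in percent."""
    return round(100.0 * part / total, 2)


# Values copied from the genome statistics table.
genomes = {
    "ERR11T": {
        "size_bp": 9_163_226, "gc_bp": 5_792_812,
        "total_genes": 8634, "protein_coding": 8548, "rna": 86,
    },
    "CCBAU 10071T": {
        "size_bp": 8_201_522, "gc_bp": 5_230_108,
        "total_genes": 7861, "protein_coding": 7776, "rna": 85,
    },
}

for name, g in genomes.items():
    gc = percent(g["gc_bp"], g["size_bp"])            # bp-based share
    cds = percent(g["protein_coding"], g["total_genes"])  # gene-based share
    rna = percent(g["rna"], g["total_genes"])
    print(f"{name}: G+C {gc}%, protein-coding {cds}%, RNA {rna}%")
```

Recomputing the ratios this way is a quick consistency check on the transcribed table: the G+C, protein-coding and RNA shares agree with the listed percentages to the precision shown.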
Number of genes associated with general COG functional categories
| Code | ERR11T: Value | ERR11T: %age | CCBAU 10071T: Value | CCBAU 10071T: %age | Description |
|---|---|---|---|---|---|
| J | 225 | 3.65% | 231 | 4.08% | Translation, ribosomal structure and biogenesis |
| A | 0 | 0% | 0 | 0% | RNA processing and modification |
| K | 458 | 7.44% | 392 | 6.93% | Transcription |
| L | 135 | 2.19% | 143 | 2.53% | Replication, recombination and repair |
| B | 2 | 0.03% | 2 | 0.04% | Chromatin structure and dynamics |
| D | 36 | 0.58% | 39 | 0.69% | Cell cycle control, Cell division, chromosome partitioning |
| V | 162 | 2.63% | 134 | 2.37% | Defense mechanisms |
| T | 288 | 4.68% | 263 | 4.47% | Signal transduction mechanisms |
| M | 316 | 5.13% | 300 | 5.3% | Cell wall/membrane biogenesis |
| N | 106 | 1.72% | 109 | 1.93% | Cell motility |
| U | 85 | 1.38% | 113 | 2% | Intracellular trafficking and secretion |
| O | 245 | 3.98% | 221 | 3.9% | Posttranslational modification, protein turnover, chaperones |
| C | 440 | 7.15% | 378 | 6.68% | Energy production and conversion |
| G | 438 | 7.11% | 339 | 5.97% | Carbohydrate transport and metabolism |
| E | 665 | 10.8% | 623 | 11.01% | Amino acid transport and metabolism |
| F | 98 | 1.59% | 94 | 1.66% | Nucleotide transport and metabolism |
| H | 309 | 5.02% | 271 | 4.79% | Coenzyme transport and metabolism |
| I | 413 | 6.71% | 398 | 7.03% | Lipid transport and metabolism |
| P | 358 | 5.81% | 311 | 5.49% | Inorganic ion transport and metabolism |
| Q | 266 | 4.32% | 278 | 4.91% | Secondary metabolites biosynthesis, transport and catabolism |
| R | 684 | 11.11% | 626 | 11.06% | General function prediction only |
| S | 353 | 5.73% | 333 | 5.83% | Function unknown |
| – | 3288 | 38.08% | 2948 | 37.5% | Not in COGs |
The total is based on the total number of protein coding genes in the genome
Fig. 3 Venn diagram (panel a) plotted with the OrthoVenn program, showing the orthologous protein clusters shared (in the center) between three genomes: Bradyrhizobium shewense sp. nov. ERR11T, Bradyrhizobium ottawaense USDA 4 and Bradyrhizobium liaoningense CCBAU 83689. The total number of protein sequences, the number of protein clusters comprising multiple protein families, and the number of singletons (i.e., proteins with no orthologs) are summarized for each genome in panel b
Fig. 4 Mauve alignment comparing the genome of ERR11T with the genomes of USDA 110T, USDA 4, CCBAU 05525 and CCBAU 10071T. The nod genes nolY-nolA-nodD2-nodD1YABCSUIJ-nolO-nodZ, indicated by the arrows, are homologous and similarly organized across the genomes