
Genome sequence of Ensifer sp. TW10; a Tephrosia wallichii (Biyani) microsymbiont native to the Indian Thar Desert.

Nisha Tak1, Hukam S Gehlot1, Muskan Kaushik1, Sunil Choudhary1, Ravi Tiwari2, Rui Tian2, Yvette Hill2, Lambert Bräu3, Lynne Goodwin4, James Han5, Konstantinos Liolios5, Marcel Huntemann5, Krishna Palaniappan6, Amrita Pati5, Konstantinos Mavromatis5, Natalia Ivanova5, Victor Markowitz6, Tanja Woyke5, Nikos Kyrpides5, Wayne Reeve2.   

Abstract

Ensifer sp. TW10 is a novel N2-fixing bacterium isolated from a root nodule of the perennial legume Tephrosia wallichii Graham (known locally as Biyani) found in the Great Indian (or Thar) Desert, a large arid region in the northwestern part of the Indian subcontinent. Strain TW10 is a Gram-negative, rod-shaped, aerobic, motile, non-spore-forming species of root nodule bacteria (RNB) that promiscuously nodulates legumes in Thar Desert alkaline soil. It is fast growing and acid-producing, tolerates up to 2% NaCl and is capable of growth at 40°C. In this report we describe for the first time the primary features of this Thar Desert soil saprophyte together with genome sequence information and annotation. The 6,802,256 bp genome has a GC content of 62% and is arranged into 57 scaffolds containing 6,473 protein-coding genes, 73 RNA genes and a single rRNA operon. This genome is one of 100 RNB genomes sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project.


Keywords:  Alphaproteobacteria; nitrogen fixation; rhizobia; root-nodule bacteria

Year:  2013        PMID: 24976887      PMCID: PMC4062627          DOI: 10.4056/sigs.4598281

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

The Great Indian (or Thar) Desert is a large, hot, arid region in the northwestern part of the Indian subcontinent. It is the 18th largest desert in the world, covering 200,000 square km, with 61% of its landmass occupying Western Rajasthan. The landscape occurs at low altitude (<1500 m above sea level) and extends from India into the neighboring country of Pakistan [1]. The Thar Desert region is characterized by low annual precipitation (50 to 300 mm), high thermal load and alkaline soils that are poor in texture and fertility [2]. Despite these harsh conditions, the Thar Desert has very rich plant diversity in comparison to other desert landscapes [3]. Approximately a quarter of the plants in the Thar Desert are used to provide animal fodder or food, fuel, medicine or shelter for local inhabitants [4]. The Thar Desert harbors several native and exotic plants of the Leguminosae family [2], including native legume members of the sub-families Caesalpinioideae, Mimosoideae and Papilionoideae that have adapted to the harsh desert environment [5]. The Papilionoid genus Tephrosia can be found throughout this semi-arid to arid environment and these plants are among the first to grow after monsoonal rains. The generic name is derived from the Greek word “tephros” meaning “ash-gray”, since dense trichomes on the leaves provide a greyish tint to the plant. Many species within this genus produce the potent toxin rotenone, which historically has been used to poison fish. Tephrosia wallichii is a perennial shrub that has adapted to the harsh desert conditions by producing a long tap root system and dormant axillary shoot buds. Recently, the root nodule bacteria (RNB) microsymbionts capable of fixing nitrogen in symbiotic associations with Tephrosia have been characterized [5]. Both Bradyrhizobium and Ensifer were present within nodules, but a particularly high incidence of Ensifer was noted [5]. Ensifer was found to occupy the nodules of all four species of Tephrosia examined [5].
Here we present a preliminary description of the general features of the T. wallichii (Biyani) microsymbiont Ensifer sp. TW10 together with its genome sequence and annotation. Minimum Information about the Genome Sequence (MIGS) is provided in Table 1. Figure 1 shows the phylogenetic neighborhood of Ensifer sp. strain TW10 in a 16S rRNA sequence-based tree. This strain has 99% sequence identity at the 16S rRNA sequence level to E. kostiense LMG 19227 and 100% 16S rRNA sequence identity to other Indian Thar Desert Ensifer strains (JNVU IC18 from a nodule of Indigofera, and JNVU TF7, JNVU TP6 and TW8 from nodules of Tephrosia).
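The identity percentages quoted above are pairwise comparisons of aligned 16S rRNA sequences. The report does not state exactly how identity was calculated, so the following is only a minimal sketch of one common convention: count matching bases over alignment columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical and are not taken from the TW10 data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gap-containing columns
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, NOT real 16S rRNA sequences).
toy_query = "ACGTACGTAC-GTACGTACGT"
toy_ref = "ACGTACGTACAGTACGTACGA"
identity = percent_identity(toy_query, toy_ref)
print(f"identity over gap-free columns: {identity:.1f}%")
```

Other conventions (for example, counting gaps as mismatches) give slightly different values, which is one reason identity figures are usually reported together with the alignment length.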
Table 1

Classification and general features of Ensifer sp. TW10 according to the MIGS recommendations [6]

MIGS ID    Property    Term   Evidence code
    Current classification    Domain Bacteria   TAS [7]
    Phylum Proteobacteria   TAS [8]
    Class Alphaproteobacteria   TAS [9,10]
    Order Rhizobiales   TAS [10,11]
    Family Rhizobiaceae   TAS [12,13]
    Genus Ensifer   TAS [14-16]
    Species Ensifer sp.   IDA
    Gram stain    Negative   IDA
    Cell shape    Rod   IDA
    Motility    Motile   IDA
    Sporulation    Non-sporulating   NAS
    Temperature range    Mesophile   NAS
    Optimum temperature    28°C   NAS
    Salinity    Non-halophile   NAS
MIGS-22    Oxygen requirement    Aerobic   TAS [5]
    Carbon source    Varied   NAS
    Energy source    Chemoorganotroph   NAS
MIGS-6    Habitat    Soil, root nodule, on host   TAS [5]
MIGS-15    Biotic relationship    Free living, symbiotic   TAS [5]
MIGS-14    Pathogenicity    Non-pathogenic   NAS
    Biosafety level    1   TAS [17]
    Isolation    Root nodule of Tephrosia wallichii   TAS [5]
MIGS-4    Geographic location    Jodhpur, Indian Thar Desert   TAS [5]
MIGS-5    Soil collection date    Oct, 2009   IDA
MIGS-4.1     Longitude    73.021177   IDA
MIGS-4.2    Latitude    26.27061   IDA
MIGS-4.3    Depth    15 cm
MIGS-4.4    Altitude    Not recorded

Evidence codes – IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [18].

Figure 1

Phylogenetic tree showing the relationship of Ensifer sp. TW10 (shown in bold print) to other Ensifer spp. in the order Rhizobiales based on aligned sequences of the 16S rRNA gene (1,290 bp internal region). All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5 [19]. The tree was built using the Maximum-Likelihood method with the General Time Reversible model [20]. Bootstrap analysis [21] with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Brackets after the strain name contain a DNA database accession number and/or a GOLD ID (beginning with the prefix G) for a sequencing project registered in GOLD [22]. Published genomes are indicated with an asterisk.


Classification and general features

Ensifer sp. strain TW10 is a Gram-negative rod (Figure 2 and Figure 3) in the order Rhizobiales of the class Alphaproteobacteria. It is fast growing, forming white-opaque, slightly domed and moderately mucoid colonies with smooth margins within 3-4 days at 28°C when grown on YMA [23].
Figure 2

Image of Ensifer sp. TW10 using scanning electron microscopy.

Figure 3

Image of Ensifer sp. TW10 using transmission electron microscopy.


Symbiotaxonomy

Ensifer sp. TW10 has the ability to nodulate (Nod+) and fix nitrogen (Fix+) effectively with a wide range of perennial native (wild) legumes of Thar Desert origin and with species of crop legumes (Table 2). Ensifer sp. TW10 is symbiotically competent with these species when grown in alkaline soils. TW10 can nodulate the wild tree legume Prosopis cineraria of the Mimosoideae subfamily. However, it does not form nodules on the Mimosoid hosts Mimosa hamata and M. himalayana, even though these hosts are known to be nodulated by Ensifer species [5,24]. TW10 was not compatible with the host Phaseolus vulgaris, a legume of the Phaseoleae tribe.
Table 2

Compatibility of Ensifer sp. TW10 with different wild and cultivated legume species

Species Name   Subfamily   Wild/Cultivar   Common Name   Habit/Growth Type   Nod   Fix
Tephrosia falciformis Ramaswami   Papilionoideae   Wild   Rati biyani   Under-shrub Perennial   +   +
Tephrosia purpurea (L.) Pers. subsp. leptostachya DC.   Papilionoideae   Wild   -   Herb Annual/Perennial   +   +
Tephrosia purpurea (L.) Pers. subsp. purpurea (L.) Pers.   Papilionoideae   Wild   Biyani, Sarphanko   Herb Annual/Perennial   +   +
Tephrosia villosa (Linn.) Pers.   Papilionoideae   Wild   Ruvali-biyani   Herb Annual/Perennial   +   +
Prosopis cineraria (Linn.) Druce.   Mimosoideae   Wild/Cultivar   Khejari   Tree Perennial   +   +
Mimosa hamata Willd.   Mimosoideae   Wild   Jinjani, Jinjanio   Shrub Perennial   -   -
M. himalayana Gamble   Mimosoideae   Wild   Hajeru   Shrub Perennial   -   -
Vigna radiata (L.) Wilczek   Papilionoideae   Cultivar   Moong bean   Annual   +   +
Vigna aconitifolia (Jacq.) Marechal   Papilionoideae   Cultivar   Moth bean   Annual   +   +
Vigna unguiculata (L.) Walp.   Papilionoideae   Cultivar   Cowpea   Annual   +   +
Macroptilium atropurpureum (DC.) Urb.   Papilionoideae   Cultivar   Siratro   Annual   +   +
Phaseolus vulgaris L.   Papilionoideae   Cultivar   Common bean   Annual   -   -

Nod: “+” means nodulation observed, “-” means no nodulation

Fix: “+” means fixation observed, “-” means no fixation


Genome sequencing and annotation

Genome project history

This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Community Sequencing Program at the U.S. Department of Energy, Joint Genome Institute (JGI) for projects of relevance to agency missions. The genome project is deposited in the Genomes OnLine Database [22] and a standard draft genome sequence is available in IMG. Sequencing, finishing and annotation were performed by the JGI. A summary of the project information is shown in Table 3.
Table 3

Genome sequencing project information for Ensifer sp. strain TW10.

MIGS ID    Property    Term
MIGS-31    Finishing quality    Standard draft
MIGS-28    Libraries used    1× Illumina library
MIGS-29    Sequencing platforms    Illumina HiSeq2000
MIGS-31.2    Sequencing coverage    330× Illumina
MIGS-30    Assemblers    Allpaths-LG version r42328, Velvet 1.1.04
MIGS-32    Gene calling methods    Prodigal 1.4,
    GenBank    pending
    GenBank Date of Release    pending
    GOLD ID    Gi08835
    NCBI project ID    210334
    Database: IMG    2509276019
    Project relevance    Symbiotic N2 fixation, agriculture

Growth conditions and DNA isolation

Ensifer sp. TW10 was cultured to mid-logarithmic phase in 60 ml of TY rich medium [25] on a gyratory shaker at 28°C. DNA was isolated from the cells using a CTAB (cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method [26].

Genome sequencing and assembly

The genome of Ensifer sp. TW10 was generated at the Joint Genome Institute (JGI) using Illumina [27] technology. An Illumina standard shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which generated 14,938,244 reads totaling 2,241 Mbp. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website [26]. All raw Illumina sequence data were passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts (Mingkun L, Copeland, A, and Han, J, unpublished). The following steps were then performed for assembly: (1) filtered Illumina reads were assembled using Velvet [28] (version 1.1.04), (2) 1–3 kb simulated paired-end reads were created from Velvet contigs using wgsim (https://github.com/lh3/wgsim), and (3) Illumina reads were assembled with simulated read pairs using Allpaths-LG (version r42328) [29]. Parameters for the assembly steps were: (1) Velvet (velveth: 63 -shortPaired; velvetg: -very_clean yes -exportFiltered yes -min_contig_lgth 500 -scaffolding no -cov_cutoff 10), (2) wgsim (-e 0 -1 100 -2 100 -r 0 -R 0 -X 0), and (3) Allpaths-LG (PrepareAllpathsInputs: PHRED_64=1 PLOIDY=1 FRAG_COVERAGE=125 JUMP_COVERAGE=25 LONG_JUMP_COV=50; RunAllpathsLG: THREADS=8 RUN=std_shredpairs TARGETS=standard VAPI_WARN_ONLY=True OVERWRITE=True). The final draft assembly contained 57 contigs in 57 scaffolds. The total size of the genome is 6.8 Mbp and the final assembly is based on 2,241 Mbp of Illumina data, which provides an average 330× coverage of the genome.
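The read count, total yield and coverage quoted above are related by simple arithmetic, which makes for a quick consistency check. The sketch below uses only numbers stated in this report; the variable names are illustrative, and it is an assumption that the reported 330× figure was derived from the rounded 6.8 Mbp genome size.

```python
# Figures quoted in the text.
reads = 14_938_244
total_bases = 2_241_000_000        # 2,241 Mbp of Illumina data
genome_size_exact = 6_802_256      # bp, from the genome statistics
genome_size_rounded = 6_800_000    # bp, the "6.8 Mbp" quoted in the text

# Mean read length implied by yield and read count.
mean_read_length = total_bases / reads

# Average depth of coverage = sequenced bases / genome size.
coverage_exact = total_bases / genome_size_exact
coverage_rounded = total_bases / genome_size_rounded

print(f"mean read length: {mean_read_length:.1f} bp")
print(f"coverage (exact genome size):   {coverage_exact:.1f}x")
print(f"coverage (rounded genome size): {coverage_rounded:.1f}x")
```

Both denominators give a depth between 329× and 330×, in line with the reported average, and the implied mean read length of about 150 bp is consistent with a HiSeq 2000 run.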

Genome annotation

Genes were identified using Prodigal [30] as part of the DOE-JGI annotation pipeline [31]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. The tRNAScanSE tool [7] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [32]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and the RNase P, were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [33]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG) platform [34,35].

Genome properties

The genome is 6,802,256 nucleotides long with 61.56% GC content (Table 4) and comprises 57 scaffolds (Figure 4) of 57 contigs. Of a total of 6,546 genes, 6,473 were protein-coding and 73 were RNA-only encoding genes. The majority of genes (77.44%) were assigned a putative function while the remaining genes were annotated as hypothetical. The distribution of genes into COGs functional categories is presented in Table 5.
Table 4

Genome Statistics for Ensifer sp. TW10

Attribute   Value   % of Total
Genome size (bp)   6,802,256   100.00
DNA coding region (bp)   5,800,968   85.28
DNA G+C content (bp)   4,187,461   61.56
Number of scaffolds   57
Number of contigs   57
Total genes   6,546   100.00
RNA genes   73   1.12
rRNA operons   1
Protein-coding genes   6,473   98.88
Genes with function prediction   5,069   77.44
Genes assigned to COGs   5,069   77.44
Genes assigned Pfam domains   5,282   80.69
Genes with signal peptides   539   8.23
Genes with transmembrane helices   1,419   21.68
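The "% of Total" column in Table 4 can be reproduced from the raw counts. As far as the numbers indicate, the base-pair rows are expressed relative to genome size and the gene rows relative to the total gene count. The short sketch below recomputes them; the dictionary layout and the helper name `pct` are illustrative choices, not part of the original analysis.

```python
genome_size_bp = 6_802_256
total_genes = 6_546

# Rows measured in base pairs (denominator: genome size).
bp_rows = {
    "DNA coding region": 5_800_968,
    "DNA G+C content": 4_187_461,
}

# Rows measured in genes (denominator: total genes).
gene_rows = {
    "RNA genes": 73,
    "Protein-coding genes": 6_473,
    "Genes with function prediction": 5_069,
    "Genes assigned to COGs": 5_069,
    "Genes assigned Pfam domains": 5_282,
    "Genes with signal peptides": 539,
    "Genes with transmembrane helices": 1_419,
}


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


table4_pct = {name: pct(value, genome_size_bp) for name, value in bp_rows.items()}
table4_pct.update({name: pct(value, total_genes) for name, value in gene_rows.items()})

for name, value in table4_pct.items():
    print(f"{name:35s} {value:6.2f}")
```

The check also confirms that protein-coding and RNA genes sum to the total gene count (6,473 + 73 = 6,546).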
Figure 4

Graphical map of five of the largest scaffolds from the genome of Ensifer sp. TW10. From bottom to the top of each scaffold: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Table 5

Number of protein-coding genes of Ensifer sp. TW10 associated with the general COG functional categories.

Code   Value   %age    Description
J   198    3.55    Translation, ribosomal structure and biogenesis
A   0    0.00    RNA processing and modification
K   481    8.61    Transcription
L   237    4.24    Replication, recombination and repair
B   3    0.05    Chromatin structure and dynamics
D   37    0.66    Cell cycle control, mitosis and meiosis
Y   0    0.00    Nuclear structure
V   66    1.18    Defense mechanisms
T   262    4.69    Signal transduction mechanisms
M   298    5.34    Cell wall/membrane biogenesis
N   77    1.38    Cell motility
Z   0    0.00    Cytoskeleton
W   1    0.02    Extracellular structures
U   132    2.36    Intracellular trafficking and secretion
O   192    3.44    Posttranslational modification, protein turnover, chaperones
C   322    5.77    Energy production and conversion
G   538    9.63    Carbohydrate transport and metabolism
E   606    10.85    Amino acid transport and metabolism
F   96    1.72    Nucleotide transport and metabolism
H   194    3.47    Coenzyme transport and metabolism
I   199    3.56    Lipid transport and metabolism
P   251    4.49    Inorganic ion transport and metabolism
Q   139    2.49    Secondary metabolite biosynthesis, transport and catabolism
R   678    12.14    General function prediction only
S   578    10.35    Function unknown
-   1,477    22.56    Not in COGs
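The table does not state which denominator the percentage column uses. Working back from the counts, the per-category values appear to be relative to the sum of all category assignments (5,585), whereas the "Not in COGs" row appears to be relative to the total gene count (6,546). This is an inference from the arithmetic rather than something the source states, and the sketch below simply reproduces it; the variable names are illustrative.

```python
# Counts per COG category, copied from Table 5.
cog_counts = {
    "J": 198, "A": 0, "K": 481, "L": 237, "B": 3, "D": 37, "Y": 0,
    "V": 66, "T": 262, "M": 298, "N": 77, "Z": 0, "W": 1, "U": 132,
    "O": 192, "C": 322, "G": 538, "E": 606, "F": 96, "H": 194,
    "I": 199, "P": 251, "Q": 139, "R": 678, "S": 578,
}
not_in_cogs = 1_477
total_genes = 6_546

# Assumed denominator for category rows: the sum of all category assignments.
total_assignments = sum(cog_counts.values())
cog_pct = {
    code: round(100.0 * count / total_assignments, 2)
    for code, count in cog_counts.items()
}

# Assumed denominator for the last row: the total gene count.
not_in_cogs_pct = round(100.0 * not_in_cogs / total_genes, 2)

print(f"total category assignments: {total_assignments}")
for code, value in cog_pct.items():
    print(f"{code}  {cog_counts[code]:4d}  {value:5.2f}")
print(f"-  {not_in_cogs:4d}  {not_in_cogs_pct:5.2f}")
```

The assignment total exceeds the 5,069 genes assigned to COGs in Table 4, presumably because some genes are placed in more than one category.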

1.  Solexa Ltd.

Authors:  Simon Bennett
Journal:  Pharmacogenomics       Date:  2004-06       Impact factor: 2.533

2.  High-quality draft assemblies of mammalian genomes from massively parallel sequence data.

Authors:  Sante Gnerre; Iain Maccallum; Dariusz Przybylski; Filipe J Ribeiro; Joshua N Burton; Bruce J Walker; Ted Sharpe; Giles Hall; Terrance P Shea; Sean Sykes; Aaron M Berlin; Daniel Aird; Maura Costello; Riza Daza; Louise Williams; Robert Nicol; Andreas Gnirke; Chad Nusbaum; Eric S Lander; David B Jaffe
Journal:  Proc Natl Acad Sci U S A       Date:  2010-12-27       Impact factor: 11.205

3.  List of new names and new combinations previously effectively, but not validly, published.

Authors: 
Journal:  Int J Syst Evol Microbiol       Date:  2006-01       Impact factor: 2.747

4.  MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods.

Authors:  Koichiro Tamura; Daniel Peterson; Nicholas Peterson; Glen Stecher; Masatoshi Nei; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2011-05-04       Impact factor: 16.240

5.  Constructs for insertional mutagenesis, transcriptional signal localization and gene regulation studies in root nodule and other bacteria.

Authors:  Wayne G Reeve; Ravi P Tiwari; Penelope S Worsley; Michael J Dilworth; Andrew R Glenn; John G Howieson
Journal:  Microbiology       Date:  1999-06       Impact factor: 2.777

6.  The genus name Sinorhizobium Chen et al. 1988 is a later synonym of Ensifer Casida 1982 and is not conserved over the latter genus name, and the species name 'Sinorhizobium adhaerens' is not validly published. Opinion 84.

Authors: 
Journal:  Int J Syst Evol Microbiol       Date:  2008-08       Impact factor: 2.747

7.  Prodigal: prokaryotic gene recognition and translation initiation site identification.

Authors:  Doug Hyatt; Gwo-Liang Chen; Philip F Locascio; Miriam L Land; Frank W Larimer; Loren J Hauser
Journal:  BMC Bioinformatics       Date:  2010-03-08       Impact factor: 3.169

8.  The minimum information about a genome sequence (MIGS) specification.

Authors:  Dawn Field; George Garrity; Tanya Gray; Norman Morrison; Jeremy Selengut; Peter Sterk; Tatiana Tatusova; Nicholas Thomson; Michael J Allen; Samuel V Angiuoli; Michael Ashburner; Nelson Axelrod; Sandra Baldauf; Stuart Ballard; Jeffrey Boore; Guy Cochrane; James Cole; Peter Dawyndt; Paul De Vos; Claude DePamphilis; Robert Edwards; Nadeem Faruque; Robert Feldman; Jack Gilbert; Paul Gilna; Frank Oliver Glöckner; Philip Goldstein; Robert Guralnick; Dan Haft; David Hancock; Henning Hermjakob; Christiane Hertz-Fowler; Phil Hugenholtz; Ian Joint; Leonid Kagan; Matthew Kane; Jessie Kennedy; George Kowalchuk; Renzo Kottmann; Eugene Kolker; Saul Kravitz; Nikos Kyrpides; Jim Leebens-Mack; Suzanna E Lewis; Kelvin Li; Allyson L Lister; Phillip Lord; Natalia Maltsev; Victor Markowitz; Jennifer Martiny; Barbara Methe; Ilene Mizrachi; Richard Moxon; Karen Nelson; Julian Parkhill; Lita Proctor; Owen White; Susanna-Assunta Sansone; Andrew Spiers; Robert Stevens; Paul Swift; Chris Taylor; Yoshio Tateno; Adrian Tett; Sarah Turner; David Ussery; Bob Vaughan; Naomi Ward; Trish Whetzel; Ingio San Gil; Gareth Wilson; Anil Wipat
Journal:  Nat Biotechnol       Date:  2008-05       Impact factor: 54.908

9.  SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB.

Authors:  Elmar Pruesse; Christian Quast; Katrin Knittel; Bernhard M Fuchs; Wolfgang Ludwig; Jörg Peplies; Frank Oliver Glöckner
Journal:  Nucleic Acids Res       Date:  2007-10-18       Impact factor: 16.971

10.  The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata.

Authors:  Konstantinos Liolios; Konstantinos Mavromatis; Nektarios Tavernarakis; Nikos C Kyrpides
Journal:  Nucleic Acids Res       Date:  2007-11-02       Impact factor: 16.971

