Luis E Moraes, Matthew J Blow, Erik R Hawley, Hailan Piao, Rita Kuo, Jennifer Chiniquy, Nicole Shapiro, Tanja Woyke, James G Fadel, Matthias Hess.
Abstract
Cyanobacteria have the potential to produce bulk and fine chemicals, and Nostoc species have received particular attention because of their relatively fast growth rate and the relative ease with which they can be harvested. Nostoc punctiforme is an aerobic, motile, Gram-negative, filamentous cyanobacterium that has been studied intensively to enhance our understanding of microbial carbon and nitrogen fixation. The genome of the type strain N. punctiforme ATCC 29133 was sequenced in 2001, and the scientific community has used these genome data extensively since then. Advances in bioinformatics tools for sequence annotation and the importance of this organism prompted us to resequence and reanalyze its genome and to make both the initial and the improved annotation available to the scientific community. The new draft genome has a total size of 9.1 Mbp and consists of 65 contiguous pieces of DNA with a GC content of 41.38% and 7664 protein-coding genes. The resequenced genome is 5152 bp larger than the previously published version and contains 987 more genes with a functional prediction. We deposited the annotation of both genomes in the Department of Energy's IMG database so that the scientific community can explore the genomes without in-depth bioinformatics skills. We expect that easier access to the N. punctiforme ATCC 29133 genome, and the ability to search it for genes of interest, will significantly facilitate metabolic engineering and genome prospecting efforts and ultimately the synthesis of biofuels and natural products from this keystone organism and closely related cyanobacteria.
Keywords: Carbon cycle; Cyanobacteria; Natural product synthesis; Nitrogen cycle; Nostoc punctiforme; Single molecule real-time sequencing
Year: 2017 PMID: 28211005 PMCID: PMC5313495 DOI: 10.1186/s13568-017-0338-9
Source DB: PubMed Journal: AMB Express ISSN: 2191-0855 Impact factor: 3.298
Classification and general features of Nostoc punctiforme ATCC 29133
| Property | Term | Evidence codeᵃ |
|---|---|---|
| Classification | Domain | TAS (Woese et al. |
| | Phylum | TAS (Castenholz |
| | Class | |
| | Order | TAS (Rippka et al. |
| | Family | TAS (Whitman |
| | Genus | TAS (Herdman et al. |
| | Species | TAS (Herdman et al. |
| | Strain ATCC 29133/PCC 73102 | |
| Gram stain | Negative | TAS (Hoiczyk and Hansel |
| Cell shape | Filamentous | TAS (Herdman et al. |
| Motility | Motile | TAS (Lehner et al. |
| Growth temperature | 26 °C | IDA |
| pH | 7.1 | IDA |
| Habitat | Fresh water, Soil | TAS (Herdman et al. |
| Oxygen requirement | Aerobic | TAS (Herdman et al. |
| Biotic relationship | Symbiotic | TAS (Herdman et al. |
| Pathogenicity | Non-pathogen | NAS |
| Geographic location | USA/Washington | |
| Sample collection | October 10th 2014 | |
| Latitude | 46.3119 | |
| Longitude | −119.263 | |
ᵃ Evidence codes: IDA, inferred from direct assay; TAS, traceable author statement (i.e., a direct report exists in the literature); NAS, non-traceable author statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). Evidence codes are from the Gene Ontology project (Ashburner et al. 2000).
Sequencing and assembly information
| MIGS IDᵃ | Property | Term |
|---|---|---|
| MIGS 31 | Finishing quality | High quality draft |
| MIGS 28 | Libraries used | >10 kbp PacBio SMRTbell |
| MIGS 29 | Sequencing platform | PacBio SMRT RS |
| MIGS 31.2 | Fold coverage | 96.8-fold |
| MIGS 30 | Assembler | HGAP 2.3.0 |
ᵃ Field et al. 2008
Genome statistics for Nostoc punctiforme ATCC 29133
| Attribute | Meeks et al. (2001), value | Meeks et al. (2001), % of total | This study, value | This study, % of total |
|---|---|---|---|---|
| Genome size (bp) | 9059191 | 100 | 9064343 | 100 |
| DNA coding (bp) | 7015747 | 77.44 | 7393120 | 81.56 |
| DNA G+C (bp) | 3746385 | 41.35 | 3751137 | 41.38 |
| DNA scaffolds | 6 | 100 | 65 | 100 |
| Total genes | 6791 | 100 | 7775 | 100 |
| Protein coding genes | 6690 | 98.51 | 7664 | 98.57 |
| RNA genes | 101 | 1.49 | 111 | 1.43 |
| Genes with function prediction | 4089 | 60.21 | 5076 | 65.29 |
| Genes without assigned function | 2601 | 38.3 | 2588 | 33.29 |
| Genes assigned to COGs | 3432 | 50.54 | 3598 | 46.28 |
| Genes with Pfam domains | 5010 | 73.77 | 5381 | 69.21 |
| Genes with signal peptides | 274 | 4.03 | 297 | 3.82 |
| Genes with transmembrane helices | 1525 | 22.46 | 1635 | 21.03 |
| Genes in biosynthetic clusters | 602 | 8.86 | 1080 | 13.89 |
| CRISPR repeats | 8 | | 10 | |
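
The derived figures in this table (percentages of the total and the differences between the two genome versions) can be recomputed from the raw counts as a consistency check. The following is a minimal Python sketch. The dictionary keys and the helper `percent` are illustrative names that do not come from the publication, and rounding to two decimal places is an assumption about how the table was prepared.

```python
# Raw counts copied from the genome statistics table.
# Key names are hypothetical labels chosen for this sketch.
meeks_2001 = {
    "genome_bp": 9059191,
    "coding_bp": 7015747,
    "gc_bp": 3746385,
    "total_genes": 6791,
    "genes_with_function": 4089,
}
this_study = {
    "genome_bp": 9064343,
    "coding_bp": 7393120,
    "gc_bp": 3751137,
    "total_genes": 7775,
    "genes_with_function": 5076,
}


def percent(part, whole):
    """Return part/whole as a percentage, rounded to two decimals (assumed convention)."""
    return round(100 * part / whole, 2)


# Differences between the resequenced genome and the earlier version.
size_gain_bp = this_study["genome_bp"] - meeks_2001["genome_bp"]  # 5152
extra_function_genes = (
    this_study["genes_with_function"] - meeks_2001["genes_with_function"]
)  # 987

for label, stats in (("Meeks et al. (2001)", meeks_2001), ("This study", this_study)):
    print(label)
    print("  coding fraction (%):", percent(stats["coding_bp"], stats["genome_bp"]))
    print("  G+C content (%):", percent(stats["gc_bp"], stats["genome_bp"]))
    print(
        "  genes with function prediction (%):",
        percent(stats["genes_with_function"], stats["total_genes"]),
    )

print("genome size difference (bp):", size_gain_bp)
print("additional genes with function prediction:", extra_function_genes)
```

The base-pair rows are divided by genome size and the gene rows by total gene count, which reproduces the tabulated percentages for the rows included here.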
Number of genes associated with general COG functional categories
| Code | Meeks et al. (2001), count | Meeks et al. (2001), %ᵃ | This study, count | This study, %ᵃ | Description |
|---|---|---|---|---|---|
| E | 246 | 6.23 | 245 | 6.04 | Amino acid transport and metabolism |
| G | 185 | 4.69 | 192 | 4.73 | Carbohydrate transport and metabolism |
| D | 43 | 1.09 | 39 | 0.96 | Cell cycle control, cell division, chromosome partitioning |
| N | 63 | 1.6 | 67 | 1.65 | Cell motility |
| M | 276 | 6.99 | 277 | 6.83 | Cell wall/membrane/envelope biogenesis |
| B | 2 | 0.05 | 2 | 0.05 | Chromatin structure and dynamics |
| H | 240 | 6.08 | 240 | 5.91 | Coenzyme transport and metabolism |
| V | 144 | 3.65 | 153 | 3.77 | Defense mechanisms |
| C | 207 | 5.24 | 210 | 5.17 | Energy production and conversion |
| W | 20 | 0.51 | 23 | 0.57 | Extracellular structures |
| S | 242 | 6.13 | 270 | 6.65 | Function unknown |
| R | 553 | 14.01 | 564 | 13.9 | General function prediction only |
| P | 233 | 5.9 | 237 | 5.84 | Inorganic ion transport and metabolism |
| U | 37 | 0.94 | 38 | 0.94 | Intracellular trafficking, secretion, and vesicular transport |
| I | 143 | 3.62 | 144 | 3.55 | Lipid transport and metabolism |
| – | 43 | 1.09 | 68 | 1.68 | Mobilome: prophages, transposons |
| F | 76 | 1.93 | 74 | 1.82 | Nucleotide transport and metabolism |
| O | 198 | 5.02 | 204 | 5.03 | Posttranslational modification, protein turnover, chaperones |
| A | 1 | 0.03 | 1 | 0.02 | RNA processing and modification |
| L | 138 | 3.5 | 140 | 3.45 | Replication, recombination and repair |
| Q | 180 | 4.56 | 194 | 4.78 | Secondary metabolites biosynthesis, transport and catabolism |
| T | 318 | 8.05 | 313 | 7.71 | Signal transduction mechanisms |
| K | 155 | 3.93 | 159 | 3.92 | Transcription |
| J | 205 | 5.19 | 204 | 5.03 | Translation, ribosomal structure and biogenesis |
| – | 3359 | 49.46 | 4177 | 53.72 | Not in COG |
ᵃ Based on the total number of protein coding genes
Fig. 1 Alignment plot of Nostoc punctiforme ATCC 29133 genomes