Naoyuki Tajima, Shusei Sato, Fumito Maruyama, Takakazu Kaneko, Naobumi V Sasaki, Ken Kurokawa, Hiroyuki Ohta, Yu Kanesaki, Hirofumi Yoshikawa, Satoshi Tabata, Masahiko Ikeuchi, Naoki Sato.
Abstract
Synechocystis sp. PCC 6803 is the most popular cyanobacterial strain, serving as a standard in the research fields of photosynthesis, stress response, metabolism and so on. A glucose-tolerant (GT) derivative of this strain was used for genome sequencing at Kazusa DNA Research Institute in 1996, which established a hallmark in the study of cyanobacteria. However, apparent differences in sequences deviating from the database have been noticed among different strain stocks. For this reason, we analysed the genomic sequence of another GT strain (GT-S) by 454 and partial Sanger sequencing. We found 22 putative single nucleotide polymorphisms (SNPs) in comparison to the published sequence of the Kazusa strain. However, Sanger sequencing of 36 direct PCR products of the Kazusa strains stored in small aliquots resulted in their identity with the GT-S sequence at 21 of the 22 sites, excluding the possibility of their being SNPs. In addition, we were able to combine five split open reading frames present in the database sequence, and to remove the C-terminus of an ORF. Aside from these, two of the Insertion Sequence elements were not present in the GT-S strain. We have thus become able to provide an accurate genomic sequence of Synechocystis sp. PCC 6803 for future studies on this important cyanobacterial strain.
Year: 2011 PMID: 21803841 PMCID: PMC3190959 DOI: 10.1093/dnares/dsr026
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
List of putative SNPs
| No. | Site | Gene | CyanoClust cluster no. | Database | GT-Kazusa | GT-S | Amino acid change | Annotation | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 943495 | | 16 | G | A | A | V→I | P700 apoprotein subunit Ia | |
| 2 | 1012958 | No gene | — | G | T | T | N/A | — | |
| 3 | 1364187 | | 784 | A | G | G | None | Orotidine 5′-monophosphate decarboxylase | |
| 4 | 1819782 | | 18 | A | G | G | None | Photosystem II D1 protein | |
| 5 | 1819788 | | 18 | A | G | G | None | Photosystem II D1 protein | |
| 6 | 2092571 | | 1760 | A | T | T | L→ter | Asparaginase | |
| 7 | 2198893 | | 15 | T | C | C | None | Cation or drug efflux system protein | |
| 8 | 2204584 | | 917+7792 | G | G | — | Frame shift | Pilin biogenesis protein | |
| 9 | 2301721 | | 6624 | A | G | G | K→E | Hypothetical protein | |
| 10 | 2350285.5 | No gene | — | — | A | A | N/A | — | |
| 11 | 2360245.5 | | 26765+19649 | — | C | C | Frame shift | Hypothetical protein | |
| 12 | 2409244 | | 2611 | C | — | — | Frame shift | Hypothetical protein | |
| 13 | 2419399 | | 779 | T | — | — | Frame shift | Hypothetical protein | |
| 14 | 2544044.5 | | 2596 | — | C | C | Frame shift | Hypothetical protein | |
| 15 | 2602717 | | 31358 | C | A | A | H→Q | Hypothetical protein | |
| 16 | 2602734 | | 31358 | T | A | A | I→N | Hypothetical protein | |
| 17 | 2748897 | No gene | — | C | T | T | N/A | — | |
| 18 | 3096187 | | 796 | T | C | C | I→T | Transposase | |
| 19 | 3110189 | No gene | — | G | A | A | N/A | — | |
| 20 | 3110343 | | 1448 | G | T | T | P→Q | Transposase | |
| 21 | 3142651 | | 2831 | A | G | G | None | Sucrose phosphate synthase | |
| 22 | 3260096 | No gene | — | C | — | — | N/A | — | |
GT-Kazusa and GT-S are the Synechocystis sp. PCC 6803 GT strains maintained at the Kazusa DNA Research Institute and in the Sato Laboratory, respectively. ‘Site’ and ‘Database’ refer to the sequences in BA000022 or NC_000911. Insertion site numbers represent the last position of the insertion site + 0.5. N/A indicates that the amino acid change is not applicable because the SNP site is not in an ORF.
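The conventions of this table can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' pipeline (they report using inGAP and Clustal X). The base calls are transcribed from the Database, GT-Kazusa and GT-S columns, with `-` standing in for the table's gap symbol. The function names `classify`, `insertion_anchor` and `summarize` are hypothetical, and reading a `.5` site as "the database position immediately before the inserted base" is an assumption based on the footnote.

```python
from collections import Counter

GAP = "-"  # stands in for the table's gap symbol (no base at this position)

# (no., site, database, GT-Kazusa, GT-S), transcribed from the table above.
ROWS = [
    (1, 943495, "G", "A", "A"), (2, 1012958, "G", "T", "T"),
    (3, 1364187, "A", "G", "G"), (4, 1819782, "A", "G", "G"),
    (5, 1819788, "A", "G", "G"), (6, 2092571, "A", "T", "T"),
    (7, 2198893, "T", "C", "C"), (8, 2204584, "G", "G", GAP),
    (9, 2301721, "A", "G", "G"), (10, 2350285.5, GAP, "A", "A"),
    (11, 2360245.5, GAP, "C", "C"), (12, 2409244, "C", GAP, GAP),
    (13, 2419399, "T", GAP, GAP), (14, 2544044.5, GAP, "C", "C"),
    (15, 2602717, "C", "A", "A"), (16, 2602734, "T", "A", "A"),
    (17, 2748897, "C", "T", "T"), (18, 3096187, "T", "C", "C"),
    (19, 3110189, "G", "A", "A"), (20, 3110343, "G", "T", "T"),
    (21, 3142651, "A", "G", "G"), (22, 3260096, "C", GAP, GAP),
]


def classify(database: str, strain: str) -> str:
    """Classify a strain base call relative to the database sequence."""
    if database == strain:
        return "identical"
    if database == GAP:
        return "insertion"   # base present in the strain, absent from the database
    if strain == GAP:
        return "deletion"    # base present in the database, absent from the strain
    return "substitution"


def insertion_anchor(site: float) -> int:
    """Return the database position preceding an insertion (assumed reading of 'site + 0.5')."""
    if site % 1 != 0.5:
        raise ValueError(f"{site} is not a half-integer insertion site")
    return int(site - 0.5)


def summarize(rows):
    """Count GT-S variant types and the sites where GT-Kazusa agrees with GT-S."""
    kinds = Counter(classify(db, gt_s) for _, _, db, _, gt_s in rows)
    agree = sum(1 for _, _, _, kazusa, gt_s in rows if kazusa == gt_s)
    return kinds, agree


if __name__ == "__main__":
    kinds, agree = summarize(ROWS)
    print(dict(kinds))
    print(f"GT-Kazusa matches GT-S at {agree} of {len(ROWS)} sites")
```

Run on the transcribed rows, the sketch finds that GT-Kazusa and GT-S agree at 21 of the 22 sites, as stated in the abstract; the only disagreement is site 8 (2204584). The three half-integer sites are exactly the rows where the database column holds a gap.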
Figure 1. Correction of ORFs due to a frame shift in the sll0762–sll0763 region. (A) Output of an SNP site in the reference sequence of the Kazusa strain by the inGAP software. The upper DNA sequence is the reference sequence of the Kazusa strain (GenBank and RefSeq accession numbers: BA000022 and NC_000911), and the lower DNA sequence is the sequence of the GT-S strain. Each arrow represents a gene. Each arrowhead indicates an SNP site. (B) New alignment with a corrected sequence. Homology of the affected ORFs with corresponding sequences in other cyanobacteria was analysed using the CyanoClust database version 4, and cluster 2611 was found. Sequences were retrieved and a new alignment was obtained with the Clustal X software. ‘New_Sequence’ indicates the corrected sequence. The arrowhead indicates the nucleotide variation detected as a putative SNP site.
List of ISY203s detected in GT strains
| IS name | Transposase gene | Database | GT-Kazusa | GT-S |
|---|---|---|---|---|
| ISY | | Yes | Yes | No |
| ISY | | Yes | Yes | Yes |
| ISY | | Yes | Yes | No |