Katelyn M McKindles1,2, R Michael McKay1,3, George S Bullerjahn3,4.
Abstract
Planktothrix agardhii is a filamentous cyanobacterial species that dominates harmful algal blooms in Sandusky Bay, Lake Erie and other freshwater basins across the world. P. agardhii isolates were obtained from early (June) blooms via single filament isolation; eight were characterized from 2016 and 12 additional isolates from 2018, for a total of 20 new cultures. These novel isolates were processed for genomic sequencing, and the reads were used to generate scaffolds and contigs, which were annotated with DIAMOND BLAST hits, Pfam, and GO. Analyses included whole genome alignment to generate phylogenetic trees and comparison of genetic rearrangements between isolates. Nitrogen acquisition and metabolism were compared across isolates. Secondary metabolite production was explored genetically, including microcystins, two types of aeruginosin clusters, anabaenopeptins, cyanopeptolins, microviridins, and prenylagaramides. Two common and four unique CRISPR-Cas islands were analyzed for similar sequences across all isolates and against the known Planktothrix-specific cyanophage, PaV-LD. Overall, the uniqueness of each genome from Planktothrix blooms sampled from the same site and at similar times belies the unexplored diversity of this genus.
Year: 2022 PMID: 35998200 PMCID: PMC9398003 DOI: 10.1371/journal.pone.0273454
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Genome characteristics for Sandusky Bay Planktothrix agardhii isolates and reference sequence Planktothrix agardhii NIVA_CYA 126/8.
| Strain | Total length (kbp) | No. contigs and scaffolds | G+C content (%) | N50 (kbp) | No. protein-coding sequences | No. of coding sequences attributed to non-cyanobacteria |
|---|---|---|---|---|---|---|
| NIVA_CYA 126/8 | 5045.9 | 6 | 39.6 | 4785.6 | 4532 | 23 |
| Plk1025 | 4974.0 | 18 | 39.6 | 4291.3 | 4533 | 35 |
| Plk1026 | 5422.1 | 74 | 39.5 | 4662.3 | 5387 | 47 |
| Plk1027 | 5152.6 | 23 | 39.7 | 4046.3 | 5176 | 34 |
| Plk1029 | 5147.2 | 8 | 39.6 | 4508.1 | 5133 | 41 |
| Plk1030 | 5114.1 | 37 | 39.6 | 4710.1 | 5099 | 41 |
| Plk1031 | 5046.1 | 31 | 39.6 | 4696.5 | 4571 | 32 |
| Plk1032 | 4991.8 | 13 | 39.6 | 4684.6 | 4985 | 29 |
| Plk1033 | 5349.1 | 191 | 39.4 | 4058.0 | 5537 | 66 |
| Plk1801 | 4856.2 | 18 | 39.7 | 4235.1 | 4912 | 43 |
| Plk1803 | 4991.9 | 22 | 39.7 | 3052.9 | 4981 | 33 |
| Plk1804 | 4869.8 | 8 | 39.6 | 4539.4 | 4868 | 33 |
| Plk1805 | 5039.6 | 12 | 39.6 | 4104.9 | 5055 | 36 |
| Plk1806 | 4970.4 | 9 | 39.6 | 4590.0 | 4972 | 33 |
| Plk1807 | 5429.1 | 72 | 39.5 | 4511.2 | 5360 | 42 |
| Plk1808 | 4965.4 | 9 | 39.6 | 4701.5 | 4475 | 30 |
| Plk1809 | 5656.3 | 20 | 39.6 | 4804.8 | 5114 | 28 |
| Plk1810 | 4890.6 | 11 | 39.6 | 4267.2 | 4451 | 18 |
| Plk1811 | 5092.8 | 16 | 39.9 | 3879.8 | 5347 | 105 |
| Plk1812 | 5908.4 | 160 | 39.4 | 4397.2 | 5948 | 65 |
| Plk1813 | 4957.6 | 15 | 39.6 | 4586.0 | 4502 | 34 |
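The N50 column above is conventionally read as the length of the shortest contig or scaffold in the set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that standard definition; the source does not state which tool the authors used to compute it, and the contig lengths in the example are hypothetical, not taken from any isolate.

```python
def n50(lengths):
    """Return the N50 of a list of contig/scaffold lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembly length.
    """
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be a non-empty list of positive numbers")


# Hypothetical contig lengths in kbp (illustrative only).
example_contigs = [4000, 500, 300, 150, 50]
print(n50(example_contigs))
```

Because a single very long scaffold can hold more than half of the assembly, N50 can stay high even when the assembly contains many small contigs, which is consistent with the pattern in the table.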
Fig 1. Relatedness of whole genome alignment of 20 P. agardhii isolates from Sandusky Bay, Lake Erie.
The top of the matrix is the average nucleotide identity (ANI) common between two isolates. The bottom of the matrix is the alignment percentage (AP) common between two isolates. The lowest AP value suggests a common genome core of 45%.
Fig 2. Whole genome phylogenetic tree based on (AP/ANI) reveals distinct grouping of P. agardhii isolates.
Since the grouping is the same using either AP or ANI, only the tree generated using ANI and the UPGMA method is shown here. The bar represents the horizontal distance matrix used to scale the branch length as a function of substitutions per site.
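The caption names UPGMA as the clustering method applied to the ANI matrix. The sketch below illustrates the general UPGMA (average linkage) procedure on a small, entirely hypothetical ANI matrix. The labels `A` to `D`, the ANI values, and the conversion of ANI to distance as `1 - ANI/100` are assumptions for illustration; the source does not describe the software or distance transform the authors used, and this sketch returns only the tree topology, not branch lengths.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Return a nested-tuple tree topology built by UPGMA (average linkage).

    labels: list of names; matrix: symmetric list-of-lists of distances.
    """
    # Each active cluster id maps to (subtree, number_of_leaves).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    dist = {(i, j): matrix[i][j]
            for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    while len(clusters) > 1:
        # Merge the closest pair of active clusters.
        i, j = min(dist, key=dist.get)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance to the merged cluster is the leaf-count-weighted mean.
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        # Drop every distance that involved the two merged clusters.
        dist = {pair: d for pair, d in dist.items()
                if i not in pair and j not in pair}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical pairwise ANI values (%), not taken from Fig 1.
labels = ["A", "B", "C", "D"]
ani = [
    [100.0, 99.5, 97.0, 96.8],
    [99.5, 100.0, 97.1, 96.9],
    [97.0, 97.1, 100.0, 99.0],
    [96.8, 96.9, 99.0, 100.0],
]
# Assumed transform: higher identity means smaller distance.
distances = [[1 - value / 100 for value in row] for row in ani]
print(upgma(labels, distances))
```

In this toy matrix the two most similar pairs are joined first and then merged with each other, which is the kind of grouping the caption describes for the isolates.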
Group differential gene function table.
| Group | GO function ID and name | Log₂ fold change | Fold change | P-value | Bonferroni |
|---|---|---|---|---|---|
| Group 1 | 0005536 // glucose binding | 1.63 | 3.08 | 1.9E-06 | 3.0E-03 |
| | 0051156 // glucose 6-phosphate metabolic process | 1.15 | 2.21 | 2.3E-06 | 3.7E-03 |
| | 0034061 // DNA polymerase activity | 0.98 | 1.97 | 0.0E+00 | 0.0E+00 |
| | 1990234 // transferase complex | 0.82 | 1.77 | 0.0E+00 | 0.0E+00 |
| | 0016042 // lipid catabolic process | 0.81 | 1.75 | 2.7E-05 | 4.0E-02 |
| | 0004527 // exonuclease activity | 0.64 | 1.56 | 2.7E-11 | 4.2E-08 |
| | 0015666 // restriction endodeoxyribonuclease activity | 0.56 | 1.47 | 1.9E-05 | 3.0E-02 |
| | 0006260 // DNA replication | 0.4 | 1.32 | 2.2E-12 | 3.4E-09 |
| | 1902494 // catalytic complex | 0.37 | 1.29 | 2.0E-10 | 3.2E-07 |
| | 0030234 // enzyme regulator activity | 0.32 | 1.25 | 4.0E-08 | 6.3E-05 |
| | 0046983 // protein dimerization activity | 0.29 | 1.23 | 2.8E-05 | 4.0E-02 |
| | 0016779 // nucleotidyltransferase activity | 0.19 | 1.14 | 7.2E-07 | 1.1E-03 |
| Group 2 | 0016705 // oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen | 1.66 | 3.16 | 0.0E+00 | 0.0E+00 |
| | 0008171 // O-methyltransferase activity | 1.54 | 2.92 | 5.7E-10 | 9.0E-07 |
| | 0043571 // maintenance of CRISPR repeat elements | 1.11 | 2.16 | 2.8E-10 | 4.5E-07 |
| | 0005506 // iron ion binding | 1.03 | 2.04 | 0.0E+00 | 0.0E+00 |
| | 0009605 // response to external stimulus | 0.91 | 1.88 | 7.4E-07 | 1.2E-03 |
| | 0051704 // multi-organism process | 0.73 | 1.65 | 8.4E-06 | 1.0E-02 |
| | 0020037 // heme binding | 0.66 | 1.58 | 1.7E-12 | 2.6E-09 |
| | 0046906 // tetrapyrrole binding | 0.47 | 1.39 | 8.7E-12 | 1.4E-08 |
| | 0004519 // endonuclease activity | 0.4 | 1.32 | 1.5E-06 | 2.4E-03 |
| | 0006304 // DNA modification | 0.38 | 1.3 | 1.6E-10 | 2.6E-07 |
| | 0008170 // N-methyltransferase activity | 0.34 | 1.27 | 3.0E-06 | 4.8E-03 |
| | 0046914 // transition metal ion binding | 0.33 | 1.26 | 0.0E+00 | 0.0E+00 |
| | 0043414 // macromolecule methylation | 0.33 | 1.25 | 6.3E-06 | 1.0E-02 |
| | 0006259 // DNA metabolic process | 0.3 | 1.24 | 1.1E-14 | 1.8E-11 |
| | 0016758 // transferase activity, transferring hexosyl groups | 0.28 | 1.22 | 2.3E-06 | 3.6E-03 |
| | 0016757 // transferase activity, transferring glycosyl groups | 0.28 | 1.21 | 1.3E-12 | 2.0E-09 |
| | 0071840 // cellular component organization or biogenesis | 0.19 | 1.14 | 2.7E-06 | 4.2E-03 |
| | 0008168 // methyltransferase activity | 0.17 | 1.13 | 1.6E-05 | 3.0E-02 |
| Group 3 | 0016832 // aldehyde-lyase activity | 1.93 | 3.8 | 2.0E-08 | 3.2E-05 |
| | 0016884 // carbon-nitrogen ligase activity, with glutamine as amido-N-donor | 1.17 | 2.25 | 1.9E-05 | 3.0E-02 |
| | 0016830 // carbon-carbon lyase activity | 0.78 | 1.72 | 4.8E-06 | 7.6E-03 |
| | 0009067 // aspartate family amino acid biosynthetic process | 0.61 | 1.53 | 1.7E-06 | 2.6E-03 |
| | 0072330 // monocarboxylic acid biosynthetic process | 0.51 | 1.42 | 2.7E-05 | 4.0E-02 |
| | 0030976 // thiamine pyrophosphate binding | 0.49 | 1.4 | 1.5E-05 | 2.0E-02 |
| | 0034655 // nucleobase-containing compound catabolic process | 0.48 | 1.39 | 4.2E-09 | 6.7E-06 |
| | 1901361 // organic cyclic compound catabolic process | 0.41 | 1.33 | 1.4E-05 | 2.0E-02 |
| | 0046700 // heterocycle catabolic process | 0.39 | 1.31 | 1.0E-05 | 2.0E-02 |
| | 0030259 // lipid glycosylation | 0.38 | 1.3 | 7.3E-08 | 1.2E-04 |
| | 0016879 // ligase activity, forming carbon-nitrogen bonds | 0.23 | 1.17 | 5.8E-07 | 9.2E-04 |
| Group 4 | 0070069 // cytochrome complex | 2.17 | 4.51 | 0.0E+00 | 0.0E+00 |
| | 0043565 // sequence-specific DNA binding | 1.14 | 2.21 | 8.9E-16 | 1.4E-12 |
| | 0043531 // ADP binding | 0.83 | 1.78 | 1.5E-08 | 2.4E-05 |
| | 0016763 // transferase activity, transferring pentosyl groups | 0.59 | 1.5 | 9.0E-07 | 1.4E-03 |
| | 0045333 // cellular respiration | 0.52 | 1.44 | 0.0E+00 | 0.0E+00 |
| | 0006400 // tRNA modification | 0.51 | 1.42 | 2.3E-06 | 3.6E-03 |
| | 0004518 // nuclease activity | 0.34 | 1.27 | 8.7E-06 | 1.0E-02 |
| | 0016788 // hydrolase activity, acting on ester bonds | 0.32 | 1.25 | 3.6E-10 | 5.7E-07 |
| | 0006733 // oxidoreduction coenzyme metabolic process | 0.3 | 1.23 | 6.4E-09 | 1.0E-05 |
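The two fold-change columns in the table above are related by fold change = 2^(log₂ fold change), and a Bonferroni-adjusted value is conventionally the raw P-value multiplied by the number of tests (capped at 1). The following is a minimal sketch of both conversions. The function names are illustrative, and `N_TESTS_ASSUMED` is a hypothetical count chosen only because it roughly reproduces the first row; the source does not state how many GO terms were tested.

```python
def fold_change(log2_fc):
    """Convert a log2 fold change to a linear fold change."""
    return 2.0 ** log2_fc


def bonferroni(p_value, n_tests):
    """Bonferroni-adjust a P-value for n_tests comparisons (capped at 1)."""
    return min(1.0, p_value * n_tests)


# Log2 values from the first rows of Group 1 and Group 4 in the table.
for log2_fc in (1.63, 2.17):
    print(log2_fc, round(fold_change(log2_fc), 2))

# Hypothetical number of tests; NOT reported in the source.
N_TESTS_ASSUMED = 1580
print(bonferroni(1.9e-06, N_TESTS_ASSUMED))
```

The converted values agree with the tabulated fold changes only to within rounding, since the table reports log₂ values to two decimal places.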
Fig 3. Concatenated conserved gene phylogenetic tree of P. agardhii isolates.
Tree generated by concatenating the alignments of all Sandusky Bay isolates alongside two P. agardhii and two P. rubescens reference sequences. Genes included in the concatenation are ftsZ, gyrB, ntcA, rpoB, and rpoC1. The bar represents the horizontal distance matrix used to scale the branch length as a function of substitutions per site.
Fig 4. Alignments of unique secondary metabolite clusters as references for the relatedness of sequences between isolates.
Reference sequence is highlighted in yellow and includes gene annotations for the clusters. Black segments in the non-highlighted sequences indicate points of difference, grey segments indicate similar regions, and the lines indicate regions of no coverage. A. Microcystin (mcy) cluster. B. Aeruginosin (aer) cluster. C. Anabaenopeptin (apn) cluster. D. Cyanopeptolin (oci) cluster. E. Microviridin (mvd) cluster. F. Prenylagaramide (pag) cluster. For which isolates were collapsed into each head sequence, see S4 Table.
Fig 5. Oligotype phylogenetic tree, generated by the concatenation of the alignments for mcy, oci, aer, apn, mvd, and pag.
The table relates presence and absence of specific secondary metabolite gene clusters to understand the relatedness of each isolate. The bar represents the horizontal distance matrix used to scale the branch length as a function of substitutions per site.
Fig 6. Common and unique CRISPR-Cas systems found in P. agardhii isolates of Sandusky Bay.
Table of CRISPR spacer sequences with matching PaV-LD ORF and function.
| PaV-LD ORF | PaV-LD function | Lowest E-value | Greatest % Identity | Greatest Bit Score | CRISPR spacer |
|---|---|---|---|---|---|
| PaVLD_ | replicative DNA helicase | 5.95E-08 | 93.182 | 67.1 | 1025_III-B_24, 1026_III-B_17, 1027_III-B_24, 1029_III-B_32, 1031_III-B_30, 1032_III-B_32, 1807_III-B_32, 1808_III-B_30, 1809_III-B_30, 1813_III-A_48 |
| PaVLD_ | tail tape measure protein | 8.26E-14 | 100 | 86 | 1029_III-B_31, 1030_III-B_26, 1031_III-B_29, 1032_III-B_31, 1801_III-B_41 |
| PaVLD_ | hypothetical protein | 2.81E-10 | 100 | 73.4 | 1029_III-B_22, 1030_III-B_19, 1031_III-B_20, 1032_III-B_22, 1807_III-B_22, 1808_III-B_20, 1809_III-B_20, 1811_III-B_41 |
| PaVLD_ | hypothetical protein | 9.19E-11 | 100 | 75.2 | 1029_III-B_26, 1030_III-B_23, 1031_III-B_25, 1032_III-B_26, 1807_III-B_26, 1808_III-B_24, 1809_III-B_25, 1813_I-D_6 |
| PaVLD_ | crossover junction endo-deoxyribonuclease | 8.31E-08 | 100 | 64.4 | 1029_III-B_24, 1030_III-B_21, 1031_III-B_23, 1032_III-B_24, 1807_III-B_24, 1808_III-B_22, 1809_III-B_23, 1813_III-A_41 |
| PaVLD_ | integrase | 2.38E-08 | 100 | 66.2 | 1029_I-D_9, 1030_I-D_9, 1031_I-D_9, 1032_I-D_9, 1807_I-D_9, 1808_I-D_9, 1809_I-D_11 |
| PaVLD_ | capsid protein | 0.002 | 93.75 | 50 | 1801_III-B_37 |
| PaVLD_ | replication-related protein | 4.35E-07 | 94.872 | 62.6 | 1813_I-D_16 |
| PaVLD_ | hypothetical protein | 2.98E-08 | 100 | 66.2 | 1813_I-D_1 |
| PaVLD_ | site-specific DNA methylase | 2.38E-08 | 100 | 66.2 | 1813_I-D_22 |
| PaVLD_ | hypothetical protein | 6.23E-08 | 100 | 64.4 | 1813_I-D_17 |
| PaVLD_ | anti-repressor protein | 5.25E-04 | 91.667 | 52.7 | 1813_I-D_15 |
| PaVLD_ | hypothetical protein | 2.90E-07 | 97.297 | 63.5 | 1813_I-D_13 |
*Denotes sequences with minor deviations from the other sequences for that PaV-LD ORF.
Table of common CRISPR spacer elements across a majority of isolates (≥ 10).
| CRISPR Spacer sequence: | Found in isolates: | Reference sequences (E-value) |
|---|---|---|
| | 1025, 1026, 1027, 1029, 1031, 1032, 1033, 1803, 1804, 1805, 1806, 1807, 1808, 1809, 1810, 1813 | P. agardhii NIES-204 |
| | 1025, 1026, 1027, 1029, 1031, 1032, 1807, 1808, 1809, 1813 | P. agardhii str. 7805 |
| | 1025, 1026, 1027, 1029, 1031, 1032, 1807, 1808, 1809, 1813 | None |
| | 1025, 1027, 1029, 1030, 1031 | None |
*Denotes the presence of more than one copy of this spacer in different CRISPR segments.
Fig 7. Nitrogen acquisition and storage genes found in P. agardhii.
A. Sequence alignment of the nrtABCD cluster in reference NIES-204 and the P. agardhii isolates from Sandusky Bay. B. Sequence alignment of cyanophycin synthetase cphA1. C. Partial sequence alignment of the cyanophycinase (cphB) and cyanophycin synthetase cphA2 operon.
Sequence similarity of important nutrient acquisition genes for Planktothrix agardhii.
Ammonium transporter genes are linked in the genome and were analyzed as a gene set.
| Strain | ABC-type nitrate/sulfonate/bicarbonate transporter | Ammonium transporters (amt1, amt3) | Carbonic anhydrase 1 (beta) | Carbonic anhydrase 2 (beta) | Carbonic anhydrase 3 (beta) | Carbonate dehydratase (beta) |
|---|---|---|---|---|---|---|
| NIES-204 | Ref. (BBD53028.1) | - | Ref. (BBD56413.1) | Ref. (BBD55070.1) | - | Ref. (BBD56294.1) |
| NIVA-CYA 126/8 | - | Ref. (WP_042151837.1, WP_072005174.1) | - | - | Ref. (WP_042155137.1) | - |
| 1025 | 100 | 99.44 | 100 | 99.72 | N/A | 100 |
| 1026 | 100 | 99.42 | 100 | 99.72 | N/A | 100 |
| 1027 | 100 | 99.44 | 100 | 99.72 | N/A | 100 |
| 1029 | N/A | 99.93 | 99.66 | 99.86 | 100 | N/A |
| 1030 | N/A | 99.93 | 99.66 | 99.86 | 100 | N/A |
| 1031 | N/A | 99.93 | 99.83 | 99.72 | 100 | N/A |
| 1032 | N/A | 99.93 | 99.83 | 99.72 | 100 | N/A |
| 1033 | N/A | 96.11 | 99.83 | 99.72 | 99.85 | N/A |
| 1801 | N/A | 99.46 | 99.49 | 98.44 | 99.56 | N/A |
| 1803 | N/A | 99.46 | 99.83 | 98.3 | 100 | N/A |
| 1804 | N/A | 99.44 | 99.83 | 98.3 | 100 | N/A |
| 1805 | N/A | 99.46 | 99.83 | 98.3 | 100 | N/A |
| 1806 | N/A | 99.44 | 99.83 | 98.3 | 100 | N/A |
| 1807 | N/A | 99.1 | 99.83 | 99.86 | 100 | N/A |
| 1808 | N/A | 99.11 | 99.83 | 99.86 | 100 | N/A |
| 1809 | N/A | 99.11 | 99.83 | 99.86 | 100 | N/A |
| 1810 | N/A | 99.51 | 99.83 | 98.44 | 99.41 | N/A |
| 1811 | N/A | 99.13 | 99.49 | 98.3 | 99.56 | N/A |
| 1812 | N/A | 93.58 | 99.49 | N/A | 99.56 | N/A |
| 1813 | N/A | 99.42 | 99.83 | 99.57 | 100 | N/A |