Matthew B Sullivan, Maureen L Coleman, Vanessa Quinlivan, Jessica E Rosenkrantz, Alicia S Defrancesco, G Tan, Ross Fu, Jessica A Lee, John B Waterbury, Joseph P Bielawski, Sallie W Chisholm.
Abstract
Oceanic phages are critical components of the global ecosystem, where they play a role in microbial mortality and evolution. Our understanding of phage diversity is greatly limited by the lack of useful genetic diversity measures. Previous studies, focusing on myophages that infect the marine cyanobacterium Synechococcus, have used the coliphage T4 portal-protein-encoding homologue, gene 20 (g20), as a diversity marker. These studies revealed 10 sequence clusters, 9 oceanic and 1 freshwater, where only 3 contained cultured representatives. We sequenced g20 from 38 marine myophages isolated using a diversity of Synechococcus and Prochlorococcus hosts to see if any would fall into the clusters that lacked cultured representatives. On the contrary, all fell into the three clusters that already contained sequences from cultured phages. Further, there was no obvious relationship between host of isolation, or host range, and g20 sequence similarity. We next expanded our analyses to all available g20 sequences (769 sequences), which include PCR amplicons from wild uncultured phages, non-PCR amplified sequences identified in the Global Ocean Survey (GOS) metagenomic database, as well as sequences from cultured phages, to evaluate the relationship between g20 sequence clusters and habitat features from which the phage sequences were isolated. Even in this meta-data set, very few sequences fell into the sequence clusters without cultured representatives, suggesting that the latter are very rare, or sequencing artefacts. In contrast, sequences most similar to the culture-containing clusters, the freshwater cluster and two novel clusters, were more highly represented, with one particular culture-containing cluster representing the dominant g20 genotype in the unamplified GOS sequence data. 
Finally, while some g20 sequences were non-randomly distributed with respect to habitat, there were always numerous exceptions to general patterns, indicating that phage portal proteins are not good predictors of a phage's host or the habitat in which a particular phage may thrive.
Year: 2008 PMID: 18673386 PMCID: PMC2657995 DOI: 10.1111/j.1462-2920.2008.01702.x
Source DB: PubMed Journal: Environ Microbiol ISSN: 1462-2912 Impact factor: 5.491
Efficacy of three different primer sets at amplifying the g20 gene from cultured cyanophage.
| Phage strain | Original host strain isolated on | Site of isolation | Depth (m) | Date isolated | Family | g20 primer set CPS4GC/5 | g20 primer set CPS1/8 | g20 primer set CPS1.1/8.1 | Refs |
|---|---|---|---|---|---|---|---|---|---|
| P-SSP1 | MIT 9215 | BATS/31°48′N, 64°16′W | 100 | 6 June 2000 | P | – | – | – | 1 |
| P-RSP1 | MIT 9215 | Red Sea/29°28′N, 34°53′E | 0 | 15 July 2000 | P | – | – | – | 1 |
| P-RSP2 | MIT 9302 | Red Sea/29°28′N, 34°53′E | 0 | 15 July 2000 | P | – | – | – | 1 |
| P-SSP2 | MIT 9312 | BATS/31°48′N, 64°16′W | 120 | 29 September 1999 | P | – | – | – | 1 |
| P-SSP3 | MIT 9312 | BATS/31°48′N, 64°16′W | 100 | 29 September 1999 | P | – | – | – | 1 |
| P-SSP4 | MIT 9312 | BATS/31°48′N, 64°16′W | 70 | 26 September 1999 | P | – | – | – | 1 |
| P-SSP5 | MIT 9515 | BATS/31°48′N, 64°16′W | 120 | 29 September 1999 | P | – | – | – | 1 |
| P-SSP6 | MIT 9515 | BATS/31°48′N, 64°16′W | 100 | 26 September 1999 | P | – | – | – | 1 |
| P-SSP7 | MED4 | BATS/31°48′N, 64°16′W | 100 | 26 September 1999 | P | – | – | – | 1 |
| P-GSP1 | MED4 | Gulf Stream/38°21′N, 66°49′W | 40 | 6 October 1999 | P | – | – | – | 1 |
| P-SSP8 | NATL2A | BATS/31°48′N, 64°16′W | 100 | 26 September 1999 | P | – | – | – | 1 |
| P-RSP3 | NATL2A | Red Sea/29°28′N, 34°55′E | 50 | 13 September 2000 | P | – | – | – | 1 |
| P-SP1 | SS120 | Slope/38°10′N, 73°09′W | 83 | 17 September 2001 | P | – | – | – | 1 |
| Syn5 | WH 8109 | Sargasso Sea/36°58′N, 73°42′W | 0 | December 1990 | P | – | – | – | 1 |
| Syn12 | WH 8017 | Gulf Stream/34°06′N, 61°01′W | 0 | July 1990 | P | – | – | – | 1 |
| S-RIM3 | WH 8018 | Mt. Hope Bay, RI/41°39′N, 71°15′W | 0 | September 1999 | M? | + | – | + | 4 |
| S-PM2 | WH 7803 | English Channel/50°18′N, 4°12′W | 0 | 23 September 1992 | M | + | + | + | 5 |
| S-WHM1 | WH 7803 | Woods Hole/41°31′N, 71°40′W | 0 | 11 August 1992 | M | + | + | + | 5 |
| S-RIM9 | WH 7803 | Mt. Hope Bay, RI/41°39′N, 71°15′W | 0 | May 2000 | M? | + | – | + | 4 |
| S-RIM17 | WH 7803 | Mt. Hope Bay, RI/41°39′N, 71°15′W | 0 | July 2001 | M? | + | – | + | 4 |
| S-RIM24 | WH 7803 | Mt. Hope Bay, RI/41°39′N, 71°15′W | 0 | December 2001 | M? | + | – | + | 4 |
| S-RIM30 | WH 7803 | Mt. Hope Bay, RI/41°39′N, 71°15′W | 0 | June 2002 | M? | + | – | + | 4 |
| Other phages | |||||||||
| IH6-φ1 | IH6 | Inner Harbor, Baltimore, MD | 0 | 17 November 2000 | M | – | – | – | 6 |
| IH6-φ7 | IH6 | Inner Harbor, Baltimore, MD | 0 | 17 November 2000 | P | – | – | – | 6 |
| IH11-φ2 | | Inner Harbor, Baltimore, MD | 0 | 17 November 2000 | M | – | – | – | 6 |
| IH11-φ5 | | Inner Harbor, Baltimore, MD | 0 | 17 November 2000 | P | – | – | – | 6 |
| CB8-φ2 | CB8 | Chesapeake Bay, MD | 0 | 17 November 2000 | M | – | – | – | 6 |
| CB8-φ6 | CB8 | Chesapeake Bay, MD | 0 | 17 November 2000 | M | – | – | – | 6 |
| CB-φ8 | | Chesapeake Bay, MD | 0 | 17 November 2000 | M | – | – | – | 6 |
| HER320 | H7 | Helgoland, North Sea | 0 | 1976−1978 | M | – | – | – | 7 |
| HER321 | H100 | Helgoland, North Sea | 0 | 1976−1978 | P | – | – | – | 7 |
| HER322 | H100 | Helgoland, North Sea | 0 | 1976−1978 | M | – | – | – | 7 |
| HER327 | 11–68 | Helgoland, North Sea | 0 | 1976−1978 | S | – | – | – | 7 |
| HER328 | H105 | Helgoland, North Sea | 0 | 1976−1978 | S | – | – | – | 7 |
M, P and S represent the virus families Myoviridae, Podoviridae and Siphoviridae respectively, as determined by morphology. ‘M?’ indicates that the assignment is based solely on amplification and sequencing of a g20 PCR product and has not been confirmed with electron microscopy.
Reference where cultured isolate was originally described: 1, Sullivan and colleagues (2003); 2, this study; 3, Waterbury and Valois (1993); 4, Marston and Sallee (2003); 5, Wilson and colleagues (1993); 6, Zhong and colleagues (2002); 7, Wichels and colleagues (1998).
‘+’ indicates positive PCR amplification; ‘−’ indicates that there was no PCR product of the expected size. The new g20 sequences contributed in this study are shown in bold letters. CPS1.1/8.1 is the new primer set designed for this study, while CPS4GC/5 and CPS1/8 were published previously.
Fig. 1. Evolutionary relationships determined using 183 amino acids of the portal protein gene (g20) amplified from cultured phage isolates (names begin with ‘S-’ or ‘P-’ and are coloured orange or green for Synechococcus or Prochlorococcus phages respectively) from this study (italicized), as well as previous studies (non-italicized), and environmental g20 sequences (names in black) (Zhong et al., 2002; Marston and Sallee, 2003). Clusters defined by Zhong and colleagues (2002) are as follows: clusters I–III contain g20 sequences from cultured phage isolates, while clusters A–F represent only environmental g20 sequences. Clusters containing identical g20 protein sequences are numbered (1–13). For cultured phages, the phage isolate names are followed by black lettering that indicates the original host strain used for isolation, while the phage host range is indicated as high light-adapted Prochlorococcus (green circle or dash), low light-adapted Prochlorococcus (blue circle or dash) or Synechococcus (orange circle or dash). A circle indicates that cross-infection was observed within that group of hosts tested, whereas a dash indicates that no cross-infection was observed. Isolates not available for host range testing have no indication of their host range. The tree shown was inferred by neighbour-joining as described in the Experimental procedures. Support values shown at the nodes are neighbour-joining bootstrap/maximum parsimony bootstrap/maximum likelihood quartet puzzling support (only values > 50 are shown). Well-supported nodes (as defined in Experimental procedures) are designated by italicized support values, including six nodes that represent subclusters within the culture-containing clusters I–III. The g20 sequence from the non-cyanomyophage isolate T4 was used as an outgroup to root this tree.
Fig. 2. Evolutionary relationships determined using 554 base pairs of the portal protein gene (g20) from 769 available g20 sequences. Clusters defined by Zhong and colleagues (2002) are identified as culture-based clusters I–III and environmental-sequence-only clusters A–F. New clusters defined since Zhong and colleagues (2002) are indicated with the preface ‘new cluster’, a number and a brief description. The tree shown is the consensus (majority rules) tree from 11 GARLI iterations inferred using the maximum likelihood criterion (see Experimental procedures), with the Aeromonas phage Aeh1 g20 sequence used as an outgroup to root the tree. Three colour rings reflect the habitat type from which the g20 sequence originated. For most of these sequences (GOS sequences), there is ribotype dot-blot and metagenomic information about the microbial community structure at the site, while for non-GOS sequences such information was assumed where reasonable to do so (see Table 3 legend). The inner ring is the microbial community structure information listed as Rusch and colleagues (2007)-defined environmental categories, while the other two rings reflect the temperature and salinity of the original sampling site.
Origins of the g20 sequences used in ‘meta’ phylogenetic analyses shown in Fig. 2.
| # Sequences | Description | PCR-based? | Sequence label in | Refs |
|---|---|---|---|---|
| 512 | Environmental sequences from 42 oceanic sample sites from the GOS | N | JC# | 1 |
| 56 | Environmental sequences from 19 globally distributed freshwater and marine sites | Y | AY705# | 2 |
| 25 | Environmental sequences from Rhode Island coastal waters, USA | Y | AY259# | 3 |
| 43 | Environmental sequences from Lake Erie, USA | Y | DQ318# | 4 |
| 47 | Environmental sequences from Lake Bourget, France | Y | AY426# | 5 |
| 27 | Environmental sequences and mixed lysates from coastal north-western Atlantic Ocean | Y | Variable | 6 |
| 51 | Cultured marine cyanomyophages of variable coastal and open ocean origins | N/A | Variable | 3, 7 |
| 8 | Cultured non-cyanomyophages from sewage | N/A | Variable | 8 |
The ‘PCR-based’ column indicates whether the environmental sequence was obtained by PCR or metagenomic approaches (N/A indicates that this is not applicable for sequences from cultured phage isolates). Reference code: 1, Rusch and colleagues (2007); 2, Short and Suttle (2005); 3, Marston and Sallee (2003); 4, Wilhelm and colleagues (2006); 5, Dorigo and colleagues (2004); 6, Zhong and colleagues (2002); 7, this study; 8, T4-like phage genomes website http://phage.bioc.tulane.edu/
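As a quick consistency check, the per-source counts in the table above can be summed to confirm that they account for the 769 sequences quoted in the abstract and in the Fig. 2 legend. The short Python sketch below copies the counts and PCR flags from the table; the source labels and variable names are shorthand introduced here for illustration only.

```python
# Sequence counts and PCR flags copied from the table above.
# The short source labels are illustrative shorthand, not the paper's terms.
sources = [
    ("GOS metagenome", 512, "N"),
    ("Global freshwater and marine sites", 56, "Y"),
    ("Rhode Island coastal waters", 25, "Y"),
    ("Lake Erie", 43, "Y"),
    ("Lake Bourget", 47, "Y"),
    ("Coastal north-western Atlantic", 27, "Y"),
    ("Cultured marine cyanomyophages", 51, "N/A"),
    ("Cultured non-cyanomyophages", 8, "N/A"),
]

total = sum(count for _, count, _ in sources)
pcr_based = sum(count for _, count, flag in sources if flag == "Y")
metagenomic = sum(count for _, count, flag in sources if flag == "N")
cultured = sum(count for _, count, flag in sources if flag == "N/A")

print(f"total sequences: {total}")
print(f"PCR amplicons:   {pcr_based} ({pcr_based / total:.1%})")
print(f"metagenomic:     {metagenomic} ({metagenomic / total:.1%})")
print(f"cultured phages: {cultured} ({cultured / total:.1%})")
```

The tally shows that roughly two-thirds of the meta-data set comes from unamplified GOS reads, which is worth keeping in mind when comparing cluster abundances across PCR-based and non-PCR sources.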
Relationship between g20 sequence clusters and the microbial community types of the original habitats from which they were collected.
Probability that g20 sequence clusters are non-random with respect to the salinity at the site from which they were collected.
| Environmental category | Salinity (ppt) | # Sequences | Unifrac P-value |
|---|---|---|---|
| Brackish | 18–32.99 | 183 | 0.1456 |
The Unifrac distance metric (Lozupone ) was used for the analysis. Salinity values, when not available from the published work, were obtained from the communicating author of the paper in which the g20 sequence was first reported. All freshwater samples were assumed to have a salinity of < 0.50 ppt. All but the sequences from brackish waters clustered non-randomly (P < 0.05) with respect to the habitat type as defined by salinity.
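To make the idea behind this test concrete, the following is a minimal Python sketch of an unweighted UniFrac-style calculation: the fraction of total branch length that leads to sequences from only one of two habitats, with a label-permutation test for significance. It is not the authors' pipeline. The four-leaf tree, the leaf names, the habitat labels and the function names are all made up for illustration, and the permutation scheme is an assumption about how such P-values are commonly obtained.

```python
import random

# Hypothetical toy tree ((a:1,b:1):1,(c:1,d:1):1), written as a list of
# (branch length, set of leaves below that branch). Not data from the paper.
BRANCHES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"c"}), (1.0, {"d"}),
    (1.0, {"a", "b"}), (1.0, {"c", "d"}),
]

def unifrac(branches, habitat_of, hab_a, hab_b):
    """Fraction of branch length leading to leaves from only one habitat."""
    unique = shared = 0.0
    for length, leaves in branches:
        habitats = {habitat_of[leaf] for leaf in leaves} & {hab_a, hab_b}
        if len(habitats) == 1:
            unique += length      # branch seen in one habitat only
        elif len(habitats) == 2:
            shared += length      # branch seen in both habitats
    return unique / (unique + shared)

def permutation_p(branches, habitat_of, hab_a, hab_b, n_perm=999, seed=0):
    """Share of label shufflings with a distance at least as large as observed."""
    rng = random.Random(seed)
    observed = unifrac(branches, habitat_of, hab_a, hab_b)
    leaves = list(habitat_of)
    labels = [habitat_of[leaf] for leaf in leaves]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        shuffled = dict(zip(leaves, labels))
        if unifrac(branches, shuffled, hab_a, hab_b) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Habitats perfectly separated by clade versus interleaved across clades.
SEGREGATED = {"a": "marine", "b": "marine", "c": "fresh", "d": "fresh"}
MIXED = {"a": "marine", "b": "fresh", "c": "marine", "d": "fresh"}

print(unifrac(BRANCHES, SEGREGATED, "marine", "fresh"))
print(unifrac(BRANCHES, MIXED, "marine", "fresh"))
print(permutation_p(BRANCHES, SEGREGATED, "marine", "fresh"))
```

In this toy case the clade-separated labelling gives the maximum distance, because every branch is exclusive to one habitat, while the interleaved labelling shares its internal branches and scores lower. A small P-value from the permutation step would indicate that sequences cluster on the tree by habitat more strongly than expected by chance.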
Probability that g20 sequence clusters are non-random with respect to the temperature at the site from which they were collected.
| Environment | Temperature (°C) | # Sequences | Unifrac P-value |
|---|---|---|---|
| Medium | 15–21.99 | 141 | 0.2296 |
The Unifrac distance metric (Lozupone ) was used for the analysis. Temperature values, when not available from the published work, were obtained from the communicating author of the paper in which the g20 sequence was first reported. All but the sequences from medium temperatures clustered non-randomly (P < 0.05) with respect to the habitat type as defined by temperature.