Luke R Thompson, Chris Field, Tamara Romanuk, David Ngugi, Rania Siam, Hamza El Dorry, Ulrich Stingl.
Abstract
Large swaths of the nutrient-poor surface ocean are dominated numerically by cyanobacteria (Prochlorococcus), cyanobacterial viruses (cyanophage), and alphaproteobacteria (SAR11). How these groups thrive in the diverse physicochemical environments of different oceanic regions remains poorly understood. Comparative metagenomics can reveal adaptive responses linked to ecosystem-specific selective pressures. The Red Sea is well-suited for studying adaptation of pelagic microbes, with salinities, temperatures, and light levels at the extreme end for the surface ocean, and low nutrient concentrations, yet no metagenomic studies have been done there. The Red Sea (high salinity, high light, low N and P) compares favorably with the Mediterranean Sea (high salinity, low P), Sargasso Sea (low P), and North Pacific Subtropical Gyre (high light, low N). We quantified the relative abundance of genetic functions among Prochlorococcus, cyanophage, and SAR11 from these four regions. Gene frequencies indicate selection for phosphorus acquisition (Mediterranean/Sargasso), DNA repair and high-light responses (Red Sea/Pacific Prochlorococcus), and osmolyte C1 oxidation (Red Sea/Mediterranean SAR11). The unexpected connection between salinity-dependent osmolyte production and SAR11 C1 metabolism represents a potentially major coevolutionary adaptation and biogeochemical flux. Among Prochlorococcus and cyanophage, genes enriched in specific environments had ecotype distributions similar to nonenriched genes, suggesting that inter-ecotype gene transfer is not a major source of environment-specific adaptation. Clustering of metagenomes using gene frequencies shows similarities in populations (Red Sea with Pacific, Mediterranean with Sargasso) that belie their geographic distances. Taken together, the genetic functions enriched in specific environments indicate competitive strategies for maintaining carrying capacity in the face of physical stressors and low nutrient availability.
Keywords: Cyanophage; Pelagibacter; Prochlorococcus; SAR11; metagenomics; osmolyte; population genomics
Year: 2013 PMID: 23789085 PMCID: PMC3686209 DOI: 10.1002/ece3.593
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Table 1. Database and source water properties for the four metagenomic datasets included in this study, including estimated nutrient concentrations and physical properties
| Property | RS 50 m | MED 50 m | BATS 20 m | BATS 50 m | BATS 100 m | HOT 25 m | HOT 75 m | HOT 110 m |
|---|---|---|---|---|---|---|---|---|
| Database properties | ||||||||
| Sequence reads | 1,177,603 | 1,204,381 | 357,881 | 464,651 | 525,605 | 623,558 | 673,673 | 473,165 |
| Total base pairs (Mbp) | 365 | 318 | 88 | 102 | 120 | 136 | 139 | 110 |
| Mean read length (bp) | 310 | 264 | 246 | 220 | 228 | 218 | 206 | 232 |
| Median read length (bp) | 327 | 273 | 263 | 247 | 251 | 249 | 223 | 254 |
| Sample properties | ||||||||
| Site/cruise | Atlantis II | MedDCM | BATS216 | BATS216 | BATS216 | HOT186 | HOT186 | HOT186 |
| Date collected | Oct. 2008 | Oct. 2007 | Oct. 2006 | Oct. 2006 | Oct. 2006 | Oct. 2006 | Oct. 2006 | Oct. 2006 |
| Latitude | 21.217°N | 38.068°N | 31.667°N | 31.667°N | 31.667°N | 22.733°N | 22.733°N | 22.733°N |
| Longitude | 37.967°E | 0.232°E | 64.167°W | 64.167°W | 64.167°W | 158.033°W | 158.033°W | 158.033°W |
| Mixed layer depth (m) | 45 | n.a. | 45 | 45 | 45 | 70 | 70 | 70 |
| Deep chl max depth (m) | 80 | 50 | 100 | 100 | 100 | 110 | 110 | 110 |
| Size fraction (μm) | 0.1–0.8 | 0.2–5.0 | 0.2–1.6 | 0.2–1.6 | 0.2–1.6 | 0.2–1.6 | 0.2–1.6 | 0.2–1.6 |
| Physicochemical properties | ||||||||
| Nitrate+Nitrite (μM) | 0.21 | 0.50 | <d.l. | <d.l. | 0.12 | 0.06 ± 0.01 | 0.08 | 0.07 |
| Nitrite (μM) | 0.04 | n.a. | 0.01 | 0.01 | 0.02 | n.a. | n.a. | n.a. |
| Phosphate (μM) | 0.11 | ∼0.1 | <d.l. | <d.l. | <d.l. | 0.02 ± 0.01 | 0.02 | 0.05 |
| Salinity (psu) | 39.67 ± 0.01 | ∼38 | 36.44 ± 0.08 | 36.74 ± 0.08 | 36.69 ± 0.07 | 35.12 ± 0.05 | 35.20 ± 0.07 | 35.30 ± 0.01 |
| Temperature (°C) | 29.1 ± 0.2 | ∼16 | 26.7 ± 0.3 | 24.0 ± 1.2 | 19.6 ± 0.9 | 26.20 ± 0.05 | 23.5 ± 1.0 | 22.06 ± 0.06 |
| Monthly mean solar downward flux (W m⁻²) | ||||||||
| Yearly mean | 244.2 | 201.1 | 190.4 | 190.4 | 190.4 | 240.0 | 240.0 | 240.0 |
| Brightest month mean | 307.5 | 315.0 | 285.2 | 285.2 | 285.2 | 309.4 | 309.4 | 309.4 |
| Dimmest month mean | 173.6 | 89.4 | 94.1 | 94.1 | 94.1 | 157.3 | 157.3 | 157.3 |
Where multiple data points were available, ranges of values (midpoint, minimum, maximum) are reported. See Methods for more information. Sequence read archive accession numbers for pyrosequencing reads: RS: SRX253027; MED: SRX017111; BATS: SRX008032, SRX008033, SRX008035; HOT: SRX007369, SRX007370, SRX007372. n.a., not available; d.l., detection limit.
Figure 2. Ecotype distributions of gene clusters from Prochlorococcus and cyanophage. For each sea, ecotype frequencies for all gene clusters are plotted as box and whisker plots, with median, interquartile range, whiskers (whisker length w = 1.5), and outliers (outside of whiskers as defined) indicated. Colored boxes to the right of the box plots are gene clusters over-represented in that sea (Table 2), colored by metabolic function or phage distribution, with those gene clusters among the outliers labeled with the gene cluster number.
Table 2. Gene clusters over-represented in RS, MED, BATS, or HOT
| Gene cluster | RS | MED | BATS | HOT | Entropy | Reads | Function | ProPortal | Distribution |
|---|---|---|---|---|---|---|---|---|---|
| Prochlorococcus genes over-represented in RS | |||||||||
| PRO2654 | 0.540 | 0.000 | 0.197 | 0.263 | 1.004 | 108 | Hypothetical protein | 3504 | Core HLII |
| PRO2267 | 0.488 | 0.032 | 0.118 | 0.362 | 1.081 | 111 | 2OG-FeII oxygenase superfamily | 4466 | All except MED4 |
| PRO2760 | 0.397 | 0.037 | 0.289 | 0.277 | 1.204 | 368 | Deoxyribodipyrimidine photolyase | 7370 | 4/5 HLII, 1/2 HLI |
| PRO2575 | 0.465 | 0.102 | 0.167 | 0.265 | 1.240 | 93 | Carboxylesterase | 3327 | Core HL |
| PRO2420 | 0.445 | 0.077 | 0.246 | 0.231 | 1.242 | 122 | MnII/FeII transporter | 6754 | Core HLII |
| PRO2498 | 0.363 | 0.057 | 0.309 | 0.271 | 1.248 | 119 | LEM domain-containing protein | 3045 | Core HL |
| PRO1012 | 0.423 | 0.115 | 0.142 | 0.320 | 1.254 | 116 | Carbohydrate-selective porin OprB family | 4464 | All except MED4 |
| PRO2504 | 0.405 | 0.072 | 0.246 | 0.277 | 1.257 | 138 | SMC domain-containing protein | 3321 | Core HL |
| Prochlorococcus genes over-represented in MED | |||||||||
| PRO2832 | 0.063 | 0.462 | 0.466 | 0.009 | 0.929 | 121 | Arsenite efflux pump ACR3 family | 3136 | 2/7 HL, 3/6 LL |
| PRO2983 | 0.055 | 0.449 | 0.480 | 0.016 | 0.937 | 325 | Alkaline phosphatase PhoA | 3127 | 2/7 HL, 2/6 LL |
| PRO2362 | 0.075 | 0.637 | 0.164 | 0.124 | 1.037 | 97 | 4-amino-4-deoxy-L-arabinose transferase | 7522 | Core HL |
| PRO2369 | 0.101 | 0.555 | 0.267 | 0.077 | 1.110 | 94 | Hypothetical protein | 3087 | Core HLI, core LL |
| PRO2623 | 0.198 | 0.342 | 0.433 | 0.026 | 1.146 | 188 | Two-component sensor kinase P-sensing PhoR | 3125 | 3/7 HL, 4/6 LL |
| PRO2683 | 0.203 | 0.388 | 0.346 | 0.063 | 1.232 | 218 | Chromate transporter | 3130 | 3/7 HL, 3/6 LL |
| PRO3097 | 0.195 | 0.470 | 0.132 | 0.203 | 1.264 | 117 | Peroxiredoxin DsrE family | 2737 | 3/7 HL |
| Prochlorococcus genes over-represented in BATS | |||||||||
| PRO2832 | 0.063 | 0.462 | 0.466 | 0.009 | 0.929 | 121 | Arsenite efflux pump ACR3 family | 3136 | 2/7 HL, 3/6 LL |
| PRO2983 | 0.055 | 0.449 | 0.480 | 0.016 | 0.937 | 325 | Alkaline phosphatase PhoA | 3127 | 2/7 HL, 2/6 LL |
| PRO2524 | 0.287 | 0.000 | 0.406 | 0.308 | 1.087 | 119 | Cytochrome c class I | 4564 | Core LL |
| PRO2623 | 0.198 | 0.342 | 0.433 | 0.026 | 1.146 | 188 | Two-component sensor kinase P-sensing PhoR | 3125 | 3/7 HL, 4/6 LL |
| PRO2684 | 0.215 | 0.200 | 0.515 | 0.070 | 1.181 | 122 | Two-component response regulator P PhoB | 3124 | 3/7 HL, 3/6 LL |
| PRO2683 | 0.203 | 0.388 | 0.346 | 0.063 | 1.232 | 218 | Chromate transporter | 3130 | 3/7 HL, 3/6 LL |
| PRO2216 | 0.313 | 0.064 | 0.347 | 0.276 | 1.262 | 163 | Rhodanese-like protein | 2514 | All except MIT9202 |
| Prochlorococcus genes over-represented in HOT | |||||||||
| PRO2267 | 0.488 | 0.032 | 0.118 | 0.362 | 1.081 | 111 | 2OG-FeII oxygenase superfamily | 4466 | All except MED4 |
| PRO1312 | 0.310 | 0.043 | 0.319 | 0.328 | 1.228 | 258 | Abortive infection protein | 2716 | Core |
| PRO2365 | 0.308 | 0.048 | 0.314 | 0.330 | 1.239 | 151 | Hypothetical protein | 5598 | 5/7 HL, core LL |
| Cyanophage genes over-represented in RS | |||||||||
| PH1590 | 0.551 | 0.034 | 0.076 | 0.340 | 1.004 | 100 | Baseplate wedge initiator | 93 | P-HM1, P-HM2 only |
| PH1063 | 0.526 | 0.282 | 0.000 | 0.192 | 1.012 | 40 | Plasmid stability protein | 166 | All T4-like except S-PM2 |
| PH1210 | 0.599 | 0.056 | 0.255 | 0.090 | 1.034 | 75 | Hypothetical protein | 108 | 5/17 T4-like |
| PH1309 | 0.435 | 0.328 | 0.000 | 0.236 | 1.069 | 114 | Hypothetical protein | 373 | 3/17 T4-like |
| Cyanophage genes over-represented in MED | |||||||||
| PH1105 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 40 | Hypothetical cyanophage protein | 258 | Syn T4-like only (10/17) |
| PH1135 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 54 | 6-phosphogluconate dehydrogenase Gnd | 964 | Syn T4-like only (8/17) |
| PH1180 | 0.000 | 0.968 | 0.000 | 0.032 | 0.142 | 38 | Glucose-6-phosphate dehydrogenase Zwf | 969 | Syn T4-like only (6/17) |
| PH1046 | 0.095 | 0.519 | 0.000 | 0.386 | 0.931 | 52 | Terminase DNA packaging enzyme small subunit | 106 | Core T4-like |
| PH1144 | 0.192 | 0.469 | 0.000 | 0.339 | 1.039 | 42 | Precursor of major head subunit | 1074 | 8/17 T4-like |
| PH1009 | 0.365 | 0.445 | 0.000 | 0.190 | 1.043 | 46 | Hypothetical protein | 233 | Core T4-like |
| Cyanophage genes over-represented in BATS | |||||||||
| PH1168 | 0.016 | 0.359 | 0.607 | 0.018 | 0.807 | 37 | DUF680 domain-containing protein | 173 | 7/17 T4-like |
| PH1434 | 0.000 | 0.310 | 0.515 | 0.175 | 1.010 | 44 | Phage tail fiber-like protein | 93 | P-SSM2, S-SSM7 only |
| PH1133 | 0.068 | 0.259 | 0.577 | 0.096 | 1.076 | 223 | Phosphate transporter PstS | 174 | 9/17 T4-like |
| Cyanophage genes over-represented in HOT | |||||||||
| PH1145 | 0.241 | 0.084 | 0.000 | 0.675 | 0.816 | 40 | Hypothetical protein | 336 | 8/17 T4-like |
| PH1574 | 0.393 | 0.047 | 0.000 | 0.560 | 0.835 | 37 | Hypothetical protein | 2051 | P-HM1, P-HM2 only |
| PH1376 | 0.000 | 0.096 | 0.290 | 0.614 | 0.884 | 37 | Phage tail fiber-like protein | 564 | P-SSM2 only (2 copies) |
| PH1606 | 0.387 | 0.148 | 0.000 | 0.465 | 1.006 | 169 | Glycine dehydrogenase | 2105 | P-HM1, P-HM2 only |
| PH1158 | 0.309 | 0.212 | 0.000 | 0.479 | 1.044 | 38 | Hypothetical protein | 1048 | 7/17 T4-like |
| PH1033 | 0.244 | 0.337 | 0.000 | 0.419 | 1.075 | 103 | Recombination endonuclease subunit | 138 | Core T4-like |
For each gene cluster, relative normalized abundance in each of the four seas, entropy, number of reads mapping, proposed function, cross-referenced ProPortal CyCOG (Prochlorococcus) and PhCOG (cyanophage) numbers (http://proportal.mit.edu/), and distribution among the genomes are given. Data for BATS and HOT were summed over three depths (Methods). Genome information for distributions can be found in Table S1.
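The Entropy column can be related to the four abundance columns with a short calculation. The sketch below assumes that entropy is the Shannon entropy of the four relative normalized abundances computed with the natural logarithm; the source does not state the base, and this is inferred from the tabulated values. The function name `shannon_entropy` is illustrative only.

```python
import math


def shannon_entropy(abundances):
    """Shannon entropy (natural log) of abundances across seas.

    Assumption: values are renormalized to sum to 1 and zero entries
    contribute nothing to the sum.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must have a positive sum")
    probs = [a / total for a in abundances]
    return sum(p * math.log(1.0 / p) for p in probs if p > 0)


# Abundances (RS, MED, BATS, HOT) copied from three rows of Table 2.
rows = {
    "PRO2654": (0.540, 0.000, 0.197, 0.263),
    "PH1180": (0.000, 0.968, 0.000, 0.032),
    "PH1105": (0.000, 1.000, 0.000, 0.000),
}

for name, rna in rows.items():
    print(name, round(shannon_entropy(rna), 3))
```

Under this assumption the three rows give values close to the tabulated entropies (1.004, 0.142, and 0.000). Small differences in the last digit are possible for other rows because the abundances in the table are themselves rounded. A gene cluster spread evenly over the four seas would reach the maximum of ln 4 ≈ 1.386, so low entropy marks clusters concentrated in one or two seas.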
Figure 1. Stacked bar graphs showing relative normalized abundances of gene clusters over-represented in one or more of the four seas. Gene clusters implicated in selected metabolic processes are shown. Data shown are for all depths summed for each sea (solid colors), or for mixed layer depths only (diagonal shading), sub-mixed layer depths only (cross-hatched shading), or deep chlorophyll maximum depths only (horizontal shading). Bars are sorted by size from left (largest) to right (smallest). Tick marks indicate 25% subdivisions.
Figure 3. Relative normalized abundance and entropy of gene clusters versus position in reference genomes. Gene clusters are plotted at their corresponding positions in the reference genomes Prochlorococcus MIT9301, cyanophage S-SM2, and SAR11 HTCC7211, which were the most represented genomes based on top BLASTX hits (Methods). Only gene clusters with hit counts in the top 75% across the four seas are shown. Solid black lines indicate gene clusters with entropy in the bottom 15% (Prochlorococcus, SAR11) or 25% (cyanophage) and r.n.a. for one sea in the top or bottom 10%. Gray boxes indicate HVRs (Supporting Information). Dashed lines indicate equal normalized abundance across the four seas.
Figure 4. Hierarchical clustering of microbial populations from RS, MED, BATS, and HOT based on relative normalized abundances of gene clusters. Separate clustering patterns are shown for Prochlorococcus, cyanophage, and SAR11. The AGNES agglomerative coefficient measures separation between clusters, ranging from 0 (no structure found) to 1 (clear structure found) (Kaufman and Rousseeuw 2005). Resulting values range from 0.50 to 0.55 for the three groups, indicating that moderately clear structuring is detected.
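To make the agglomerative coefficient concrete, the following is a minimal sketch of agglomerative clustering of the four seas, not a reproduction of the published analysis. It assumes Euclidean distance and average linkage (neither is stated here), and it uses only five Prochlorococcus rows copied from Table 2 rather than the full set of gene clusters, so its output should not be compared with the reported 0.50 to 0.55. The names `sea_profiles` and `agnes_average` are hypothetical. The coefficient is computed as usually defined for AGNES: for each item, one minus the ratio of the height of its first merge to the height of the final merge, averaged over all items.

```python
import math
from itertools import combinations

SEAS = ("RS", "MED", "BATS", "HOT")

# Illustrative subset of Table 2 (RS, MED, BATS, HOT); NOT the full dataset.
RNA = {
    "PRO2654": (0.540, 0.000, 0.197, 0.263),
    "PRO2760": (0.397, 0.037, 0.289, 0.277),
    "PRO2983": (0.055, 0.449, 0.480, 0.016),
    "PRO2623": (0.198, 0.342, 0.433, 0.026),
    "PRO1312": (0.310, 0.043, 0.319, 0.328),
}


def sea_profiles(rna):
    """Turn per-gene rows into one abundance vector per sea."""
    return {sea: [row[i] for row in rna.values()] for i, sea in enumerate(SEAS)}


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agnes_average(profiles):
    """Average-linkage agglomeration; returns (merges, agglomerative coefficient)."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): euclidean(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def avg_link(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [frozenset([n]) for n in names]
    first_merge = {}  # height at which each item first joins a cluster
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg_link(*pair))
        height = avg_link(c1, c2)
        for c in (c1, c2):
            if len(c) == 1:
                first_merge[next(iter(c))] = height
        merges.append((sorted(c1), sorted(c2), height))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]

    final_height = merges[-1][2]
    coeff = sum(1 - first_merge[n] / final_height for n in names) / len(names)
    return merges, coeff


merges, coeff = agnes_average(sea_profiles(RNA))
for left, right, height in merges:
    print(left, "+", right, "at", round(height, 3))
print("agglomerative coefficient:", round(coeff, 3))
```

A coefficient near 1 arises when items join tight clusters early relative to the final merge, and a value near 0 when all merges occur at similar heights. A different linkage rule or distance measure would change both the tree and the coefficient.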