Ana-Belen Martín-Cuadrado, Purificación López-García, Juan-Carlos Alba, David Moreira, Luis Monticelli, Axel Strittmatter, Gerhard Gottschalk, Francisco Rodríguez-Valera.
Abstract
BACKGROUND: Metagenomics is emerging as a powerful method to study the function and physiology of the unexplored microbial biosphere, and is causing us to re-evaluate basic precepts of microbial ecology and evolution. Most marine metagenomic analyses have been devoted almost exclusively to photic waters. METHODOLOGY/PRINCIPAL
Year: 2007 PMID: 17878949 PMCID: PMC1976395 DOI: 10.1371/journal.pone.0000914
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
General features of the Km3 metagenome.
| Major feature | Subcategory | Value |
| Number of fosmid clones | | 20,767 |
| Total archived sequence | | 725 Mbp |
| Number of 16S rRNA genes amplified by PCR | Archaeal | 28 |
| | Bacterial | 16 |
| Number of high-quality fosmid-end sequences | | 9,048 |
| Average length of sequence reads | | 794 bp |
| Total generated sequence length | | 7,184 kbp |
| Average GC content | | 50.1% |
| Fosmid-end sequences | belonging to defined taxonomic groups (<1e-50) | 23.4% |
| | of ambiguous taxonomic ascription (1e-7 to 1e-50) | 53.7% |
| | with homologues only in GOS | 11.4% |
| | without homologues in any database | 11.4% |
| Distribution of sequences in major COG categories | | |
| | Metabolism | 50.4% |
| | Information storage and processing | 17.1% |
| | Cellular Processes and Signaling | 16.0% |
| | Poorly characterized | 16.5% |
| Distribution of sequences in major KEGG categories | | |
| | Metabolism | 70.6% |
| | Genetic Information Processing | 17.4% |
| | Environmental Information Processes | 10.2% |
| | Cellular Processes | 1.7% |
BLAST cut-off values used in each case are shown in brackets.
Assuming an average insert size of 35 kbp.
Only fosmids containing bacterial ITS regions that differ in size from that of E. coli were identified.
GOS, Global Ocean Sampling database (Rusch et al., 2007).
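The two sequence totals in the table can be checked against the other rows with simple arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes that the archived total is clones times average insert size and that the generated total is reads times average read length.

```python
# Values copied from the table and its footnote; names are illustrative.
FOSMID_CLONES = 20_767
AVG_INSERT_KBP = 35            # average insert size stated in the footnote
HIGH_QUALITY_READS = 9_048
AVG_READ_LENGTH_BP = 794


def archived_sequence_mbp(clones: int, insert_kbp: float) -> float:
    """Estimated total archived sequence in Mbp (clones x insert size)."""
    return clones * insert_kbp / 1_000


def generated_sequence_kbp(reads: int, avg_read_bp: float) -> float:
    """Estimated total sequenced length in kbp (reads x average read length)."""
    return reads * avg_read_bp / 1_000


if __name__ == "__main__":
    print(f"{archived_sequence_mbp(FOSMID_CLONES, AVG_INSERT_KBP):.0f} Mbp")
    print(f"{generated_sequence_kbp(HIGH_QUALITY_READS, AVG_READ_LENGTH_BP):.0f} kbp")
```

Under these assumptions the estimates are about 727 Mbp and 7,184 kbp, close to the 725 Mbp and 7,184 kbp reported in the table.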
Figure 1. Prokaryotic taxa identified in Km3 3,000 m-deep plankton inferred from fosmid-ends and 16S rRNA detection in environmental and metagenomic libraries.
Relative abundances of PCR-amplified 16S rRNA genes in environmental libraries are from Ref. 4. For bacteria in the Km3 metagenomic library (central panel), only 16S rRNA genes whose adjacent ITS can be distinguished in size from that of E. coli were detected.
Figure 2. Taxonomy of archaeal fosmids.
A, Phylogenetic tree of 16S rRNA genes amplified from the Km3 metagenomic library. The tree was constructed by maximum likelihood using PhyML and a total of 704 unambiguously aligned positions. Non-parametric bootstrapping was performed with 1,000 replicates. Only bootstrap values above 50 are shown. Km3 and ALOHA water column sequences are indicated in red and blue, respectively. B, Comparative taxonomic distribution, obtained by best BLAST hit (see Methods), of archaeal fosmid-ends in the Km3 and ALOHA deep-sea libraries [14]. Marine group I and environmental samples are non-taxonomic designations as used in databases.
Figure 3. Taxonomy of bacterial fosmids.
A, Phylogenetic tree of 16S rRNA genes amplified from the Km3 metagenomic library whose adjacent intergenic spacers differ in size from those of Escherichia coli. The tree was constructed by maximum likelihood using PhyML and a total of 1,128 unambiguously aligned positions. Non-parametric bootstrapping was performed with 1,000 replicates. Only bootstrap values above 50 are shown. Km3 and ALOHA water column sequences are indicated in red and blue, respectively. Unc., uncultured. [*], AA0D00000000 genome sequence underway. B, Comparative taxonomic distribution of bacterial fosmid-ends by best BLAST hits in the Km3 and ALOHA deep-sea libraries [14].
Hallmark genomes recruited by Km3 fosmid-end sequences.
| Taxonomic group | Species genome | Number of hits | % identity | %GC genome | %GC Km3 seq. | O2 requirement | Habitat | Metabolism |
| Alphaproteobacteria Rhizobiales | Rhizobiales (19 genomes) | 112 | 56.53 | _ | _ | aerobic | soil | heterotrophy |
| | | 27 | 60.54 | _ | _ | aerobic | | |
| | | 23 | 56.31 | 62.3 | 53.08 | aerobic | | |
| Crenarchaeota | | 102 | 64.47 | 57.4 | 36.06 | aerobic | marine | autotrophy? |
| Alphaproteobacteria Rickettsiales | | 84 | 69.81 | 29.8/29.7 | 34.16/33.64 | aerobic | marine | oligotrophy |
| Planctomycetales | | 78 | 57.23 | 57 | 58.71 | aerobic | marine | oligotrophy |
| Planctomycetales | | 58 | 56.65 | 55.4 | 58.22 | aerobic | marine | heterotrophy |
| Acidobacteria | | 56 | 55.67 | 61.9 | 54.64 | aerobic | soil | oligotrophy |
| Chloroflexi | | 42 | 56.58 | _ | _ | facultative an. | | chemolithotrophy |
| Alphaproteobacteria Rhodospirillales | | 36 | 61.81 | 65.1/66.4 | 55.44/56.88 | aerobic | | heterotrophy |
| Gammaproteobacteria Pseudomonadales | | 36 | 58.56 | _ | _ | aerobic | | heterotrophy |
| Alphaproteobacteria Rhodobacterales | | 33 | 61.31 | _ | _ | aerobic | | heterotrophy |
| Acidobacteria | Bacterium Ellin345 | 32 | 57.59 | 58.4 | 52.43 | aerobic | soil | oligotrophy |
| Betaproteobacteria Burkholderiales | | 32 | 52.33 | _ | _ | aerobic | | heterotrophy |
| Chloroflexi | | 26 | 52.86 | 60.4 | 50.61 | facultative an. | | photoautotrophy |
| Deltaproteobacteria Desulfuromonadales | | 25 | 55.73 | _ | _ | anaerobic | soil / sediment | chemolithotrophy |
| Actinobacteria | | 23 | 56.54 | 70.5 | 53.21 | aerobic | soil | heterotrophy |
| Gammaproteobacteria | | 21 | 59.61 | | 48.61 | aerobic | marine | oligotrophy |
| Gammaproteobacteria Chromatiales | | 19 | 62.71 | 67.5 | 52.94 | anaerobic | | chemolithotrophy |
| Bacteroidetes Sphingobacteria | | 18 | 51.55 | 66.1 | 44.88 | aerobic | hypersaline | heterotrophy |
| Gammaproteobacteria Oceanospirillales | | 16 | 67.65 | 54.7 | 52.68 | aerobic | | heterotrophy |
| Gammaproteobacteria Chromatiales | | 16 | 62.2 | 50.3 | 50.25 | aerobic | marine | chemolithotrophy |
| Betaproteobacteria Burkholderiales | | 16 | 62.11 | _ | _ | aerobic | soil | heterotrophy |
| Firmicutes Clostridia | | 16 | 53.94 | 42 | 51.88 | facultative an. | | chemolithotrophy |
| Alphaproteobacteria Rhodospirillales | | 15 | 64.42 | 65.4 | 55.33 | facultative an. | | |
| Gammaproteobacteria Methylococcales | | 15 | 62.49 | 63.6 | 52 | aerobic | | methylotrophy |
| Deltaproteobacteria Myxococcales | | 15 | 57 | 74.9 | 56.86 | facultative an. | soil / sediments | heterotrophy |
| Gammaproteobacteria Alteromonadales | | 13 | 62.63 | 56.9 | 48.38 | aerobic | marine | heterotrophy |
| Bacteroidetes Flavobacteria | | 13 | 59.93 | 37 | 48.23 | aerobic | marine | heterotrophy |
| Gammaproteobacteria Chromatiales | | 13 | 57.72 | 60 | 51 | aerobic | | chemolithotrophy |
| Actinobacteria | | 13 | 55.84 | _ | _ | aerobic | soil | heterotrophy |
| Gammaproteobacteria | | 12 | 60.98 | 49.4 | 49.25 | aerobic | marine | heterotrophy |
| Cyanobacteria Chroococcales | | 11 | 61.28 | _ | _ | | | photoautotrophy |
| Deltaproteobacteria | | 11 | 58.1 | 60.1 | 52.54 | anaerobic | | chemolithotrophy |
| Gammaproteobacteria Alteromonadales | | 10 | 69.33 | 44.9 | 42.3 | aerobic | marine | heterotrophy |
| Betaproteobacteria Burkholderiales | | 10 | 51.66 | _ | _ | aerobic | | heterotrophy |
BLASTX cut-off value: 1e-50.
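A recruitment tally of this kind can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: the `Hit` record and `recruit` function are hypothetical, each fosmid-end is assumed to contribute one best BLASTX hit, and the "% identity" column is assumed to be a mean over the retained hits.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    query: str       # fosmid-end identifier (hypothetical)
    genome: str      # genome of the best BLASTX hit (hypothetical label)
    evalue: float    # BLASTX E-value of the hit
    identity: float  # percent identity of the hit


def recruit(hits: Iterable[Hit], cutoff: float = 1e-50) -> dict[str, tuple[int, float]]:
    """Return {genome: (number of hits, mean % identity)} for hits at or below the cutoff."""
    identities: dict[str, list[float]] = defaultdict(list)
    for hit in hits:
        if hit.evalue <= cutoff:
            identities[hit.genome].append(hit.identity)
    return {genome: (len(vals), sum(vals) / len(vals)) for genome, vals in identities.items()}
```

Sorting the resulting dictionary by hit count in descending order would give the row order used in the table.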
Figure 4. Genome recruitment by Cenarchaeum symbiosum A.
Individual fosmid-end sequences were aligned to the genome of the sequenced strain, and sequence conservation along the alignment was visualized as a percent identity plot. Each dot represents an individual fosmid-end sequence aligned along its homologous region in the C. symbiosum A genome. The y axis shows its nucleotide percent identity to the syntenic region. Both the Km3 and ALOHA water column datasets were used.
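The coordinates of one dot in such a plot can be sketched as below. The caption does not say how gaps were treated, so this example assumes identity is computed over aligned columns where neither sequence has a gap; the function names and the gap character `-` are assumptions.

```python
def percent_identity(aligned_read: str, aligned_ref: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_read.upper(), aligned_ref.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def plot_point(ref_start: int, aligned_read: str, aligned_ref: str) -> tuple[int, float]:
    """(x, y) for one dot: start position on the reference genome, percent identity."""
    return ref_start, percent_identity(aligned_read, aligned_ref)
```

Other conventions, such as counting gap columns as mismatches, would lower the reported identity for gapped alignments.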
Hallmark proteins in the Km3 metagenome compared to the deep ALOHA water column.
| BLASTX cut-off value | Protein / Protein class | Number of genes/proteins per Mbp | | | |
| | | ALOHA 500 m | ALOHA 770 m | ALOHA 4000 m | Km3 |
| 1e-50 | RecA | 0.3 | 0.3 | 0.3 | 0.1 |
| | DnaK | 0.5 | 0.7 | 0.6 | 1.5 |
| | RpoB | 0.3 | 0.3 | 0.3 | 0.7 |
| | Pgm | 0.0 | 0.2 | 0.1 | 0.4 |
| | PycA | 0.5 | 0.6 | 1.0 | 0.7 |
| | GyrB | 0.4 | 0.7 | 0.7 | 0.5 |
| | Mdh | 1.1 | 1.0 | 1.0 | 1.1 |
| 1e-20 | Dehydrogenases | 40.3 | 38.1 | 43.7 | 59.7 |
| | Carbon-monoxide-dehydrogenase | 1.0 | 1.7 | 1.0 | 2.5 |
| | Luciferase-like genes | 0.4 | 1.4 | 1.5 | 2.9 |
| | Transposase | 1.6 | 1.7 | 8.1 | 4.6 |
| | Phage-related | 2.0 | 1.3 | 2.5 | 3.6 |
| | Chaperones | 2.1 | 2.4 | 2.5 | 3.7 |
| | Dehalogenases | 0.5 | 0.2 | 0.6 | 1.4 |
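Expressing counts per Mbp lets libraries of different sizes be compared directly. A minimal sketch of that normalization follows; the function name is hypothetical, and it assumes that the denominator is the total sequenced length of each library.

```python
def per_mbp(count: int, total_bp: int) -> float:
    """Number of genes/proteins per Mbp of sequenced DNA."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return count / (total_bp / 1_000_000)


def implied_count(density_per_mbp: float, total_bp: int) -> float:
    """Inverse of per_mbp: the hit count implied by a reported density."""
    return density_per_mbp * total_bp / 1_000_000
```

If the Km3 densities are based on the 7,184 kbp of generated sequence, a value of 1.5 per Mbp would correspond to roughly 11 hits.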
Figure 5. Comparison of COG distribution of fosmid-ends in Km3 and ALOHA water column.
Fosmid-ends were classified according to the COG database; both the Km3 and ALOHA [14] datasets were analyzed (see Methods).
Figure 6. Distribution of Km3 fosmid-ends in KEGG categories.
A, Detailed KEGG categories. B, Major KEGG categories and classification by type of substrate of Km3 fosmid-ends identified as transporters. * Other transporters.
Figure 7. Normalized metagenome comparison of 3,000 m-deep Mediterranean Km3 and Pacific ALOHA water column.
For normalization, a total of 6,853 sequences (the size of the smallest library compared, ALOHA 130 m) were randomly selected from each library and compared. A, Neighbour-joining analysis of fosmid-end sequences in Km3 and different depths in ALOHA. Temperature, salinity, and the total number of sequences available for each library are shown on the right; jackknife values are shown at nodes. B, Normalized MUMmer plots showing the number of maximal unique matches (MUMs) shared by the 3,000 m-deep Km3 and the different ALOHA metagenomic libraries. MUMs are distributed as a function of their identity (ordinate) and the type of COG to which they belong (abscissa). Average identity values are indicated for each pair of libraries compared. The number of MUMs with more than 80% identity is given to the right of each panel.
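The normalization step can be illustrated with a short sketch: every library is randomly subsampled, without replacement, down to the size of the smallest one. The function names, the dictionary layout, and the fixed seed are assumptions for illustration; the caption does not describe how the random selection was implemented.

```python
import random
from typing import Sequence


def subsample(sequences: Sequence[str], n: int, seed: int = 0) -> list[str]:
    """Randomly select n sequences without replacement (seeded for reproducibility)."""
    if n > len(sequences):
        raise ValueError("library is smaller than the requested sample size")
    rng = random.Random(seed)
    return rng.sample(list(sequences), n)


def normalize(libraries: dict[str, Sequence[str]], seed: int = 0) -> dict[str, list[str]]:
    """Subsample every library to the size of the smallest library."""
    n = min(len(seqs) for seqs in libraries.values())
    return {name: subsample(seqs, n, seed) for name, seqs in libraries.items()}
```

With the libraries described in the caption, `n` would be 6,853, the size of the ALOHA 130 m library.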
ALOHA and Km3 oceanographic data.
| | North Pacific ALOHA | | | | | | | Mediterranean Km3 |
| Coordinates | 22°45′N, 158°W | | | | | | | 36°30′N, 15°40′E |
| Max. depth (m) | 4800 | | | | | | | 3243 |
| Sampling Time | Oct. 7th 2002 | Oct. 7th 2002 | Oct. 6th 2002 | Oct. 6th 2002 | Oct. 6th 2002 | Dec. 21st 2003 | Dec. 21st 2003 | Nov. 17th 2004 |
| Temp. (°C) | 26.40 (24.83±1.27)* | 24.93 (23.58±1.00)* | 22.19 (21.37±0.96)* | 18.53 (18.39±1.29)* | 7.25 (7.22±0.44)* | 4.78 (4.86±0.21)* | 1.46 (1.46±0.01)* | 13.93 (13.80±0.05)* |
| Salinity (PSU) | 35.08 (35.05±0.21)* | 35.21 (35.17±0.16)* | 35.31 (35.20±0.10)* | 35.04 (34.96±0.18)* | 34.07 (34.06±0.03)* | 34.32 (34.32±0.04)* | 34.69 (34.69±0.00)* | 38.74 (38.69±0.03)* |
| Chl (μg/Kg) | 0.08 (0.08±0.03)* | 0.18 (0.15±0.05)* | 0.10 (0.15±0.06)* | 0.02 (0.02±0.02)* | ND | ND | ND | ND |
| DOC (μM/Kg) | 78 (90.6±14.3)* | 79 (81.4±11.3)* | 69 (75.2±9.1)* | 63 (60.4±9.8)* | 47 (47.8±6.3)* | 39.9 (41.5±4.4)* | 37.5 (42.3±4.9)* | 54.2±5.85* |
| Oxygen (μM/Kg) | 204.6 (209.3±4.5)* | 217.4 (215.8±5.4)* | 204.9 (206.6±6.2)* | 198.8 (197.6±7.1)* | 118.0 (120.5±18.3)* | 32.2 (27.9±4.1)* | 147.8 (147.8±1.3)* | 203.7 (202.66±1.2)* |
| DIP (nmol/Kg) | 41.0 (56.0±33.7)* | 16 (43.1±25.1)* | 66.2 (106.0±49.7)* | 274.2±109.1* | 2153 (2051±175.7)* | 3070 (3000±47.1)* | 2558 (2507±19)* | 159.0±22.6* |
| N+N (nmol/Kg) | 1.0 (2.6±3.7)* | 1.3 (14.7±60.3)* | 284.8 (282.9±270.2)* | 1161.9±762.5* | 28850 (28460±2210)* | 41890 (40940±500)* | 36560 (35970±290)* | 4706±133.3* |
| SLCA (μl/Kg) | 1.30±0.37* | 1.34±0.37* | 1.72±0.56* | 5.31±0.74* | 45.37±5.75* | 92.06±4.08* | 160* | 8.32±0.24* |
| HPP (cells ×10^4 ml^−1) | 30.2±16.2* | 25.2±9.9* | 19.9±6.9* | 13.03±2.5* | 5.19±1.5* | 3.15±0.7* | 0.55±0.06* | 3.1±1.73* |
| POC (μM C / Kg) | 2.16±0.54* | 1.97±0.35* | 1.29±0.36* | 0.55±0.15* | 0.39±0.13* | 0.30±0.13* | - | 1.925±0.56* |
Values shown are those from the same CTD casts as the samples (DeLong et al. 2006 and this work). *Archival data are from the ALOHA HOT-DOGS© database (http://hahana.soest.hawaii.edu/hot/hot-dogs/) or, in the case of Km3, from the ICES oceanographic database (http://www.ices.dk/ocean), and correspond to several datasets collected at the same depth and approximate location (less than 50 nautical miles away) as the samples. Values in parentheses are the average ± standard deviation. Abbreviations: Temp, temperature; Chl, chlorophyll; DOC, dissolved organic carbon; DIP, dissolved inorganic phosphate; N+N, nitrate plus nitrite; SLCA, silicate; HPP, heterotrophic picoplankton (DAPI counts); POC, particulate organic carbon; ND, not determined.