Daniel Vaulot, Cécile Lepère, Eve Toulza, Rodrigo De la Iglesia, Julie Poulain, Frédéric Gaboyer, Hervé Moreau, Klaas Vandepoele, Osvaldo Ulloa, Frederick Gavory, Gwenael Piganeau.
Abstract
Among small photosynthetic eukaryotes that play a key role in oceanic food webs, picoplanktonic Mamiellophyceae such as Bathycoccus, Micromonas, and Ostreococcus are particularly important in coastal regions. Using a combination of cell sorting by flow cytometry, whole genome amplification (WGA), and 454 pyrosequencing, we obtained metagenomic data for two natural picophytoplankton populations from the coastal upwelling waters off central Chile. About 60% of the reads of each sample could be mapped to the genome of a Bathycoccus strain from the Mediterranean Sea (RCC1105), representing a total of 9 Mbp (sample T142) and 13 Mbp (sample T149) of non-redundant Bathycoccus genome sequences. WGA did not amplify all regions uniformly, resulting in unequal coverage along a given chromosome and between chromosomes. The identity at the DNA level between the metagenomes and the cultured genome was very high (96.3% identical bases for the three largest chromosomes over a 360 kbp alignment). Based on read mapping to the Bathycoccus RCC1105 genome, at least two to three different genotypes seemed to be present in each natural sample.
Year: 2012 PMID: 22745802 PMCID: PMC3382182 DOI: 10.1371/journal.pone.0039648
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of metagenomic sequences and assemblies for eastern South Pacific picoeukaryote samples T142 and T149.
| Sample Name | T142 | T149 |
| Cruise | BIOSOPE | BIOSOPE |
| Date | 6 Dec 2004 | 8 Dec 2004 |
| Station | UPW1 | UPW3 |
| Longitude | 73° 22.177 W | 73° 20.413 W |
| Latitude | 33° 59.779 S | 33° 51.630 S |
| Depth (m) | 5 | 30 |
| Population | Photosynthetic picoeukaryotes | Photosynthetic picoeukaryotes |
| Number of cells sorted | 104 000 | 233 000 |
| Reads | | |
| Number | 671 249 | 671 832 |
| Total length (bp) | 287 619 051 | 279 858 614 |
| Mean length (bp) | 429 | 417 |
| Range (bp) | 40-2015 | 40-2044 |
| GC mean | 46.7% | 46.5% |
| Contigs (first assembly) | | |
| Number | 17 633 | 28 262 |
| Total length (bp) | 16 984 438 | 24 845 872 |
| Number ≥ 500 bp | 9 010 | 13 213 |
| Largest (bp) | 35 494 | 43 276 |
| Contigs (second assembly) | | |
| Number | 23 187 | 34 839 |
| Total length (bp) | 22 907 873 | 34 947 661 |
| Number ≥ 500 bp | 15 074 | 22 219 |
| Largest (bp) | 28 395 | 36 675 |
| Number of reads used for contigs | 633 780 | 607 236 |
Assignment of reads from samples T142 and T149 to individual chromosomes of B. prasinos RCC1105 using Geneious Assembler (see Materials and Methods for details).
| Chromosome (final) | Chromosome (draft) | Length (bp) | GC (%) | T142 reads (#) | T142 coverage (bp) | T142 coverage (%) | T142 coverage depth (×) | T142 identical sites (%) | T149 reads (#) | T149 coverage (bp) | T149 coverage (%) | T149 coverage depth (×) | T149 identical sites (%) |
| chromosome_1 | Bathy_chrom000 | 1 352 574 | 48.87% | 17 507 | 887 310 | 64.40% | 4.61 | 95.50% | 44 115 | 1 260 242 | 90.60% | 11.26 | 90.50% |
| chromosome_2 | Bathy_chrom001 | 1 122 692 | 48.55% | 76 131 | 763 755 | 65.50% | 23.96 | 91.70% | 45 448 | 1 045 505 | 89.90% | 13.84 | 88.90% |
| chromosome_3 | Bathy_chrom002 | 1 089 374 | 48.62% | 23 634 | 747 266 | 67.00% | 7.68 | 94.50% | 36 961 | 1 032 050 | 91.80% | 11.66 | 90.00% |
| chromosome_4 | Bathy_chrom003 | 1 037 991 | 48.29% | 14 217 | 613 146 | 58.00% | 4.76 | 95.70% | 19 099 | 914 632 | 86.10% | 6.22 | 91.90% |
| chromosome_5a | Bathy_chrom012 | 550 167 | 48.34% | 4 559 | 299 250 | 53.40% | 2.95 | 96.50% | 13 535 | 504 509 | 89.20% | 8.44 | 90.90% |
| chromosome_5b | Bathy_chrom016 | 467 783 | 48.58% | 2 152 | 222 904 | 47.00% | 1.58 | 97.20% | 13 712 | 437 480 | 90.60% | 10.02 | 90.00% |
| chromosome_6 | Bathy_chrom004 | 989 707 | 48.30% | 17 733 | 677 434 | 66.90% | 6.25 | 94.30% | 29 292 | 915 584 | 90.10% | 10.06 | 90.60% |
| chromosome_7 | Bathy_chrom005 | 955 054 | 48.42% | 49 892 | 608 532 | 61.60% | 17.54 | 93.30% | 30 116 | 845 293 | 85.80% | 10.63 | 90.30% |
| chromosome_8 | Bathy_chrom006 | 937 610 | 48.54% | 10 892 | 545 919 | 57.00% | 4.09 | 95.70% | 29 489 | 841 544 | 86.80% | 10.70 | 89.90% |
| chromosome_9 | Bathy_chrom007 | 895 347 | 48.51% | 17 995 | 636 008 | 69.30% | 7.07 | 93.80% | 23 672 | 806 272 | 87.50% | 9.05 | 90.70% |
| chromosome_10 | Bathy_chrom008 | 794 148 | 48.38% | 20 350 | 533 074 | 65.30% | 9.03 | 93.70% | 19 996 | 743 420 | 90.90% | 8.66 | 90.20% |
| chromosome_11 | Bathy_chrom009 | 741 502 | 48.62% | 26 119 | 516 527 | 67.50% | 12.50 | 92.40% | 12 368 | 634 120 | 83.70% | 5.70 | 92.50% |
| chromosome_12a | Bathy_chrom019 | 201 229 | 47.52% | 2 066 | 106 614 | 52.00% | 3.66 | 96.30% | 3 707 | 180 588 | 87.40% | 6.31 | 91.40% |
| chromosome_12b | Bathy_chrom014 | 511 334 | 48.55% | 19 985 | 391 273 | 74.20% | 13.84 | 92.10% | 15 530 | 483 876 | 91.70% | 10.45 | 89.30% |
| chromosome_13 | Bathy_chrom010 | 706 576 | 48.54% | 90 889 | 363 302 | 48.00% | 45.29 | 91.30% | 17 849 | 584 838 | 80.20% | 8.54 | 91.00% |
| chromosome_14 | Bathy_chrom011-50 | 662 304 | 42.24% | 1 873 | 264 505 | 39.30% | 0.93 | 97.70% | 12 822 | 433 085 | 63.70% | 6.41 | 93.90% |
| chromosome_15 | Bathy_chrom013 | 519 535 | 48.13% | 6 320 | 324 598 | 61.10% | 4.30 | 94.80% | 10 865 | 473 213 | 88.80% | 7.14 | 91.60% |
| chromosome_16 | Bathy_chrom015 | 481 036 | 48.08% | 14 765 | 268 376 | 52.30% | 10.67 | 94.00% | 14 526 | 445 984 | 87.50% | 10.16 | 89.90% |
| chromosome_17 | Bathy_chrom017-28 | 465 570 | 47.66% | 18 609 | 283 891 | 58.50% | 14.21 | 91.80% | 10 433 | 390 682 | 81.30% | 7.63 | 90.60% |
| chromosome_18 | Bathy_chrom018 | 310 170 | 46.97% | 9 003 | 226 193 | 70.50% | 10.30 | 91.20% | 5 203 | 269 409 | 84.60% | 5.67 | 91.00% |
| chromosome_19 | Bathy_chrom020 | 146 238 | 41.65% | 73 | 10 779 | 7.20% | 0.13 | 99.30% | 129 | 11 504 | 7.70% | 0.21 | 99.10% |
| Chloroplast | Bathy_chrom021 | 54 761 | 41.12% | 21 | 5 029 | 6.90% | 0.09 | 99.70% | 46 | 12 675 | 17.30% | 0.20 | 99.40% |
| Mitochondrion | Bathy_chrom024 | 42 168 | 39.97% | 261 | 30 825 | 72.20% | 1.74 | 97.00% | 3 470 | 43 650 | 99.90% | 21.90 | 91.70% |
| Total | | 15 034 870 | | 445 046 | 9 326 510 | 62.03% | 10.38 | | 412 383 | 13 310 155 | 88.53% | 9.34 | |
| % of reads | | | | 66.3% | | | | | 61.4% | | | | |
Chromosome numbering is given both for the draft version of the genome (used in this work) and for the final version of the genome [37].
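The summary percentages in the last two rows of the table can be rechecked from the read counts and covered lengths reported above. The following is a minimal sketch of that arithmetic; the dictionary layout and the function names `percent` and `summarize` are illustrative choices, not part of the original analysis.

```python
# Values copied from the tables above (samples T142 and T149).
GENOME_LENGTH_BP = 15_034_870  # summed length of all reference sequences

samples = {
    "T142": {"total_reads": 671_249, "mapped_reads": 445_046, "covered_bp": 9_326_510},
    "T149": {"total_reads": 671_832, "mapped_reads": 412_383, "covered_bp": 13_310_155},
}


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


def summarize(stats: dict) -> dict:
    """Compute the share of reads mapped and the share of the genome covered."""
    return {
        "pct_reads_mapped": percent(stats["mapped_reads"], stats["total_reads"]),
        "pct_genome_covered": percent(stats["covered_bp"], GENOME_LENGTH_BP),
    }


for name, stats in samples.items():
    s = summarize(stats)
    print(f"{name}: {s['pct_reads_mapped']:.1f}% of reads mapped, "
          f"{s['pct_genome_covered']:.2f}% of reference covered")
```

Under these inputs the script reproduces the 66.3% and 61.4% read fractions and the 62.03% and 88.53% coverage fractions listed in the table.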
Figure 1. Assembly of metagenomic reads obtained from flow cytometry sorted picoeukaryote samples T142 and T149 from the Chile upwelling to the Bathycoccus prasinos RCC1105 genome.
(A) Relationship between average coverage fraction (expressed as the percentage of the chromosome length covered by at least one read) and GC content for samples T142 and T149, for the 21 draft nuclear chromosomes of B. prasinos RCC1105 as well as the mitochondrial and plastid genomes. (B) Same as (A) for average coverage depth (number of reads at each position).
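The two quantities plotted in this figure, coverage fraction and coverage depth, can be illustrated with a short sketch. It assumes that read alignments are available as 0-based, half-open `(start, end)` intervals on one chromosome and that depth is averaged over the full chromosome length; both are assumptions made for illustration, and the function name `coverage_stats` is hypothetical. The published values were produced by the assembler, whose exact conventions may differ.

```python
from typing import Iterable, Tuple


def coverage_stats(length: int, alignments: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (percent of positions covered by >= 1 read, mean depth).

    `alignments` holds 0-based, half-open (start, end) intervals.
    Mean depth is averaged over the whole chromosome (an assumption).
    """
    if length <= 0:
        raise ValueError("length must be positive")

    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in alignments:
        start, end = max(0, start), min(length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    covered = 0      # positions with at least one read
    total_depth = 0  # sum of per-position depths
    depth = 0        # running depth at the current position
    for pos in range(length):
        depth += diff[pos]
        if depth > 0:
            covered += 1
        total_depth += depth

    return 100.0 * covered / length, total_depth / length


# Toy example: two overlapping reads on a 10 bp sequence.
fraction, mean_depth = coverage_stats(10, [(0, 4), (2, 6)])
print(fraction, mean_depth)
```

In the toy example six of ten positions are covered and eight aligned bases are spread over ten positions, so a chromosome can show a high depth and a low coverage fraction at the same time when amplification is uneven.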
Figure 2. GC content and coverage depth by reads from the eastern South Pacific picoeukaryote samples T142 and T149 of individual chromosomes of B. prasinos RCC1105.
(A) Chromosome 1. (B) Chromosome 14.
Similarity between the B. prasinos RCC1105 genome and the T142 and T149 assemblies for the three largest chromosomes.
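| Chromosome | Alignment length, CDS + non CDS (bp) | Alignment length, CDS (bp) | Alignment length, non CDS (bp) | % Identical sites, CDS + non CDS | % Identical sites, CDS | % Identical sites, non CDS | % Identical sites (CDS + non CDS), Bathy vs T142 | % Identical sites (CDS + non CDS), Bathy vs T149 | % Identical sites (CDS + non CDS), T142 vs T149 |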
| chromosome_1 | 78 022 | 70 644 | 7 376 | 96.7% | 97.5% | 89.5% | 97.8% | 97.3% | 98.3% |
| chromosome_2 | 158 544 | 142 659 | 19 681 | 95.8% | 97.0% | 87.2% | 97.0% | 96.6% | 97.4% |
| chromosome_3 | 123 912 | 115 661 | 12 823 | 96.7% | 97.4% | 89.5% | 97.8% | 97.4% | 97.8% |
| Total | 360 478 | 329 964 | 39 880 | 96.3% | 97.2% | 88.5% | 97.4% | 97.0% | 97.7% |
Only regions with coverage in excess of 10× for both samples were considered. Total genome and non-CDS regions were analyzed separately (see Materials and Methods for details).
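The filtering rule stated in this note (keep only positions covered more than 10× in both samples, then count identical sites) can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the inputs are two already aligned sequences of equal length plus per-position depths for each sample, gap columns are skipped, and the function name `percent_identical` is hypothetical.

```python
from typing import Sequence


def percent_identical(ref: str, query: str,
                      depth_a: Sequence[int], depth_b: Sequence[int],
                      min_depth: int = 10) -> float:
    """Percent identical sites over columns with depth > min_depth in both samples.

    Assumptions: `ref` and `query` are aligned (same length, '-' for gaps),
    and columns containing a gap are excluded from the comparison.
    """
    if not (len(ref) == len(query) == len(depth_a) == len(depth_b)):
        raise ValueError("all inputs must have the same length")

    compared = 0
    identical = 0
    for r, q, da, db in zip(ref, query, depth_a, depth_b):
        if da <= min_depth or db <= min_depth:
            continue  # coverage not in excess of the threshold in both samples
        if r == "-" or q == "-":
            continue  # gap column, skipped by assumption
        compared += 1
        if r == q:
            identical += 1

    if compared == 0:
        raise ValueError("no column passes the coverage filter")
    return 100.0 * identical / compared


# Toy example: one mismatch, and one column dropped for low coverage.
identity = percent_identical(
    "ACGTACGT", "ACGAACGT",
    depth_a=[12] * 8,
    depth_b=[12, 12, 12, 12, 5, 12, 12, 12],
)
print(f"{identity:.1f}% identical sites")
```

Restricting the comparison to well covered positions reduces the weight of sequencing and amplification errors, at the cost of comparing only a fraction of each chromosome, which is consistent with the alignment lengths in the table being much shorter than the chromosomes themselves.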
Figure 3. Genotypes observed for B. prasinos RCC1105 gene Bathy02g01050 (pigment synthesis protein, BOGAS annotation code) for samples T142 (left) and T149 (right).
At least two different sequences appear to be present in both samples, differing by one and three positions, respectively, from the reference B. prasinos RCC1105 sequence. Only 23 representative reads are shown for each sample although more reads covered this region (see Table S6).
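One simple way to tally candidate genotypes from reads covering a short region is to count identical read sequences and report how many positions each differs from the reference. The sketch below uses invented toy sequences, not data from the study, and assumes reads have already been trimmed to the same aligned window; the function name `tally_genotypes` and the `min_reads` threshold are hypothetical choices meant to set aside sequences supported by a single, possibly erroneous, read.

```python
from collections import Counter
from typing import Dict, List, Tuple


def tally_genotypes(reference: str, reads: List[str],
                    min_reads: int = 2) -> Dict[str, Tuple[int, int]]:
    """Map each distinct sequence seen in >= min_reads reads to
    (number of supporting reads, number of mismatches to the reference)."""
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be trimmed to the reference window")

    result = {}
    for seq, n in Counter(reads).items():
        if n < min_reads:
            continue  # too little support; may be a sequencing error
        mismatches = sum(1 for a, b in zip(reference, seq) if a != b)
        result[seq] = (n, mismatches)
    return result


# Toy data (hypothetical): two supported variants and one singleton read.
reference = "ACGTACGTAC"
reads = ["ACGAACGTAC"] * 4 + ["TCGTAGGTAA"] * 3 + ["ACGTACGTAG"]
for seq, (n, mism) in tally_genotypes(reference, reads).items():
    print(seq, n, mism)
```

With the toy input, two sequences pass the support threshold, one differing from the reference at one position and the other at three, while the singleton read is discarded.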