Espen Mikal Robertsen, Hubert Denise, Alex Mitchell, Robert D Finn, Lars Ailo Bongo, Nils Peder Willassen.
Abstract
Metagenomics, the study of genetic material recovered directly from environmental samples, has the potential to provide insight into the structure and function of heterogeneous microbial communities. There has been an increased use of metagenomics to discover and understand the diverse biosynthetic capacities of marine microbes, thereby allowing them to be exploited for industrial, food, and health care products. This ELIXIR pilot action was motivated by the need to establish dedicated data resources and harmonized metagenomics pipelines for the marine domain, in order to enhance the exploration and exploitation of marine genetic resources. In this paper, we summarize some of the results from the ELIXIR pilot action "Marine metagenomics - towards user centric services".
Keywords: Marine; gap analysis; metagenomics; pipelines
Year: 2017 PMID: 28620454 PMCID: PMC5461914 DOI: 10.12688/f1000research.10443.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. Tools and steps in EMG and META-pipe.
Effect on assembly with rRNA filtering.
| | Muddy | Muddy | Sandy | Sandy | Moose | Moose | Sea Urchin | Sea Urchin |
|---|---|---|---|---|---|---|---|---|
| | 267 433 | 266 814 | 148 228 | 147 928 | 973 097 | 972 462 | 1 010 649 | 1 010 610 |
| | 25 581 | 25 302 | 25 294 | 25 138 | 211 333 | 210 348 | 114 307 | 114 189 |
| | 5 659 | 5 572 | 6 213 | 6 118 | 57 779 | 57 147 | 32 593 | 32 433 |
| | 25 155 475 | 24 822 906 | 25 301 011 | 25 038 101 | 248 491 504 | 246 880 376 | 143 551 468 | 143 216 805 |
| | 291 296 | 181 694 | 444 398 | 286 608 | 4 266 146 | 3 504 213 | 11 923 718 | 11 615 439 |
| | 931 | 930 | 976 | 972 | 1 214 | 1 211 | 1 287 | 1 285 |
| | 0 | 0 | 4 | 1 | 10 | 4 | 38 | 20 |
| | 4 | 2 | 8 | 1 | 23 | 5 | 89 | 68 |
| | 918 | 214 | 2 836 | 1 093 | 5 609 | 4 202 | 15 674 | 14 132 |
| | 83 | 19 | 277 | 96 | 278 | 167 | 1 003 | 814 |
¹ MarRef database length: 1 135 Mb. ² MetaQUAST downloaded reference database length: 262 Mb.
Number of identified unique taxa.
Numbers in parentheses include eukaryotic hits classified by META-pipe.
| Dataset | Muddy | Muddy | Sandy | Sandy | Moose | Moose | Sea urchin | Sea urchin |
|---|---|---|---|---|---|---|---|---|
| Pipeline | META-pipe | EMG | META-pipe | EMG | META-pipe | EMG | META-pipe | EMG |
| | 41 (57) | 40 | 22 (38) | 16 | 16 (26) | 17 | 18 (54) | 13 |
| | 67 (88) | 83 | 38 (62) | 36 | 22 (39) | 33 | 38 (97) | 30 |
| | 126 (150) | 111 | 69 (98) | 61 | 32 (51) | 42 | 72 (135) | 68 |
| | 113 (138) | 97 | 93 (115) | 84 | 44 (62) | 61 | 88 (157) | 107 |
| | 111 (129) | 79 | 138 (155) | 120 | 79 (92) | 85 | 160 (227) | 170 |
| | 6 (11) | 19 | 31 (40) | 38 | 5 (14) | 30 | 61 (102) | 73 |
Figure 2. Krona chart representation of taxonomic classification of the “Muddy” dataset from META-pipe (A) and EBI Metagenomics Portal (B) pipelines.
Figure 3. Krona chart representation of taxonomic classification of the “Moose” dataset from META-pipe (A) and EBI Metagenomics Portal (B) pipelines.
Figure 4. Predicted gene length distribution from META-pipe and EMG pipelines.
EMG, EBI Metagenomics Portal.
Figure 5. Comparison of counted GO-slim annotations from META-pipe and the EMG pipeline.
Thickness of bars corresponds to fraction size of accumulated GO-slim annotations for each pipeline. GO, Gene Ontology; EMG, EBI Metagenomics Portal.