Philippe Vandenkoornhuyse, Alexis Dufresne, Achim Quaiser, Gwenola Gouesbet, Françoise Binet, André-Jean Francez, Stéphane Mahé, Myriam Bormans, Yvan Lagadeuc, Ivan Couée.
Abstract
Environmental genomics and genome-wide expression approaches deal with large-scale sequence-based information obtained from environmental samples, at organismal, population or community levels. To date, environmental genomics, transcriptomics and proteomics are arguably the most powerful approaches to discover completely novel ecological functions and to link organismal capabilities, organism-environment interactions, functional diversity, ecosystem processes, evolution and Earth history. Thus, environmental genomics is not merely a toolbox of new technologies but also a source of novel ecological concepts and hypotheses. By removing previous dichotomies between ecophysiology, population ecology, community ecology and ecosystem functioning, environmental genomics enables the integration of sequence-based information into higher ecological and evolutionary levels. However, environmental genomics, along with transcriptomics and proteomics, must involve pluridisciplinary research, such as new developments in bioinformatics, in order to integrate high-throughput molecular biology techniques into ecology. In this review, the validity of environmental genomics and post-genomics for studying ecosystem functioning is discussed in terms of major advances and expectations, as well as in terms of potential hurdles and limitations. Novel avenues for improving the use of these approaches to test theory-driven ecological hypotheses are also explored.
Year: 2010 PMID: 20426792 PMCID: PMC2901524 DOI: 10.1111/j.1461-0248.2010.01464.x
Source DB: PubMed Journal: Ecol Lett ISSN: 1461-023X Impact factor: 9.492
Figure 1. Real-life and ideal fluxes of analysis and information in environmental genomics. Current throughputs of analysis and information-processing are given as black arrows, whereas the ideal throughputs to be achieved are shown as white arrows. Arrow thickness reflects the efficiency of the analyses.
Advantages and limitations in environmental genomics and post-genomics
| Stage of analysis | Advantages | Limitations |
|---|---|---|
| Sampling | No culture- or growth-related bias | Spatio-temporal heterogeneity |
| | Direct environmental sampling; large multi-species sampling; large multi-tissue sampling | Cost of representative or exhaustive sampling |
| | Analysis of complex experimental designs involving populations and communities | Careful ecological assessment of environmental sampling and of experimental designs |
| | Possible long-term storage of DNA, RNA, or protein samples | Availability of reliable protocols for the extractions of nucleic acids and proteins |
| Sequencing | High-throughput technologies for DNA, RNA and proteins | Possibilities of sequencing bias; poor sequencing of less-represented genomes |
| | Decreasing cost of sequencing and mass spectrometry | Cost of sequencing for large sample collections, in relation to the exhaustiveness of sampling |
| | Long-term public databases | Exponential increase of the amount of sequence data; cost and maintenance of database infrastructure |
| Information processing and functional analysis of organisms, communities and ecosystems | Biodiversity and phylogenetic analysis | Taxonomic bias in databases |
| | Functional profiling of naturally occurring organisms and communities | Assembly of short genomic fragments giving a partial view of organismal functional capacities |
| | Link function and diversity and answer the question ‘who is doing what?’ | Functional bias in database; computational demand for bioinformatics analyses; poor quality of annotations and amplification of annotation errors |
| | Discovery of novel ecologically relevant functions | Functional inferences from genomics data in the absence of transcriptomic and/or proteomic data; biased conclusions on the basis of apparent absence of function |
| | Identifying links between diversity, functional changes and environmental variables | Experimental bottleneck of functional characterization of new genes |
| | Evolvability of genomics data analysis through improvement of annotations | Computational cost of re-annotating sequences |
| | Re-analysis of genomics data in the light of novel environmental data | Comprehensive environment variable surveys; environment variable databases; environment-dedicated bioinformatics tools; exponential increase of environmental data; increased complexity of the comparison between environmental data and genomics data |
| | Comparison of present-day ecosystem functioning with earth history and paleo-ecosystem functioning; combination of synchronic and diachronic analysis | |
| | Identifying links between diversity, functional changes and environmental variables | Confusing the reality of ecosystem functioning with the reconstructed image from environmental genomics |
Figure 2. Mathematical modelling in environmental genomics analysis. Reconstructed networks from environmental genomics data (Box S2) can be analysed by various methods of mathematical modelling (Getz 2003; Feist ; Westerhoff & Palsson 2008; Fuhrman 2009), which can assess and quantify their dynamic properties and generate hypotheses on community and ecosystem functioning. Hypothesis testing can then be carried out by experimental and environmental verification approaches, with the subsequent possibility of iterations between the different steps of the process. The main steps in this flowchart are derived from the description of the systems biology paradigm by Palsson (2006).
Figure 3. Spatio-temporal three-dimensional organisation of sequence-derived datasets. The set of environmental genomic, cDNA, or protein sequences (grey bars) is ascribed to a set of i Species (S), thus resulting in species-labelled sequences (colour bars). The aim of functional analysis and profiling is to ascribe species-labelled sequences to a set of j functional categories (F), thus resulting in a ‘potential function × species’ understanding of the ecosystem. The third dimension of the matrix corresponds to spatio-temporally replicated samples, such as samples subjected to various environmental constraints, or samples at different points in time. This kind of dataset can be analysed not only to understand the mechanisms induced by a forcing variable, but also to select and parameterize the components that have to be included in a model.
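The three-dimensional organisation described in the Figure 3 caption can be illustrated with a small sketch. This is not the authors' implementation; it is a minimal example assuming each sequence has already been assigned a sample, a species label and a functional category. The record values (`t0`, `S1`, `F1`, and so on) and the helper names `build_tensor` and `function_profile` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy records: (sample, species, functional category).
# Each tuple stands for one annotated sequence; all labels are illustrative.
records = [
    ("t0", "S1", "F1"),
    ("t0", "S1", "F2"),
    ("t0", "S2", "F1"),
    ("t1", "S1", "F1"),
    ("t1", "S2", "F2"),
    ("t1", "S2", "F2"),
]


def build_tensor(records):
    """Count sequences per (species, function, sample) cell."""
    species = sorted({sp for _, sp, _ in records})
    functions = sorted({fn for _, _, fn in records})
    samples = sorted({sa for sa, _, _ in records})
    counts = Counter((sp, fn, sa) for sa, sp, fn in records)
    # tensor[i][j][k]: species i, functional category j, sample k
    tensor = [
        [[counts[(sp, fn, sa)] for sa in samples] for fn in functions]
        for sp in species
    ]
    return species, functions, samples, tensor


def function_profile(tensor, k):
    """Sum over species to get per-function sequence counts in sample k."""
    n_functions = len(tensor[0])
    return [sum(row[j][k] for row in tensor) for j in range(n_functions)]


species, functions, samples, tensor = build_tensor(records)
for k, sample in enumerate(samples):
    print(sample, dict(zip(functions, function_profile(tensor, k))))
```

Summing over the species axis gives a per-sample functional profile, while fixing a function and comparing across samples shows which species carry it under each condition. Real datasets would need normalisation for sequencing depth, which this sketch omits.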