Carlotta De Filippo, Matteo Ramazzotti, Paolo Fontana, Duccio Cavalieri.
Abstract
Metagenomic approaches are increasingly recognized as a baseline for understanding the ecology and evolution of microbial ecosystems. The development of methods for pathway inference from metagenomics data is of paramount importance for linking a phenotype to a cascade of events stemming from a series of connected sets of genes or proteins. Until recently, biochemical and regulatory pathways have been conceived and modelled within one cell type, one organism, one species. This vision is being dramatically changed by the advent of whole-microbiome sequencing studies, which reveal the role of symbiotic microbial populations in fundamental biochemical functions. This new landscape requires a clear picture of the potential of existing tools, and the development of new tools, to characterize, reconstruct and model biochemical and regulatory pathways as the result of the integration of function in complex symbiotic interactions of ontologically and evolutionarily distinct cell types.
Year: 2012 PMID: 23175748 PMCID: PMC3505041 DOI: 10.1093/bib/bbs070
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. Flowchart of the main steps and bioinformatics tools required for pathway reconstruction from metagenomics surveys. Numbers in circles correspond to specific tools and programs developed for the corresponding steps and listed in the right part of the figure (links are listed in Table 1). Curly brackets point to application-specific databanks. The analytic procedure ideally bifurcates at the starting point according to the investigation strategy: DNA can undergo a PCR-based amplification step to increase the amount of a specific marker gene (e.g. ribosomal RNA) and then be subjected to Roche 454 sequencing, or it can be fragmented and prepared into libraries for metagenomic Illumina/SOLiD sequencing. Both techniques generate a huge number of short reads, whose handling and processing require care and powerful instrumentation. The simplest analytic choice is to map short reads onto reference databases, such as that maintained by the Ribosomal Database Project for taxonomy surveys via 16S sequencing (1), NCBI non-redundant (nr/nt) for environmental microbiomes or, in the case of gut microbiome surveys, the better-scoped MetaHIT (2). Another possibility is to assemble the short reads into longer contigs using new-generation assemblers designed for the unevenly distributed reads deriving from the multitude of different microbes represented in the community (3). Assembly improves the efficiency of gene-finding programs which, although applicable directly to reads, then have more information available to ensure more confident gene identification (4). Once coding sequences have been obtained, their corresponding proteins can be searched against reference functional databases that encode information in the form of HMMs or PSSMs built from multiple sequence alignments (5), or directly against reference protein sets derived from primary databanks or from genome-derived collections.
The first approach leads to a direct identification of associated functions that can be used to identify and score pathways (6) and, in the end, to apply a battery of statistical techniques for sample characterization (7). The second approach can be used to obtain taxonomic and functional distributions (8) and can directly feed metabolic pathway identification (9), which in turn can be converted into stoichiometric models (10) for simulating the behaviour of single organisms or the relationships within a community, with the potential of predicting their response to changing environmental conditions.
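The pathway identification and scoring step (6) can be illustrated with a minimal sketch. It assumes that annotation has already assigned one function identifier to each predicted gene, and it scores each pathway by coverage (the fraction of its functions observed) and by mean hit count. The function name `score_pathways`, the pathway definitions and the identifiers `F1`, `F2`, ... are hypothetical placeholders, not the scheme used by any of the tools listed in the figure.

```python
from collections import Counter


def score_pathways(annotations, pathways):
    """Score pathways from per-gene function annotations.

    annotations: iterable of function identifiers, one per annotated gene.
    pathways: dict mapping a pathway name to the set of functions it requires.
    Returns {pathway: (coverage, mean_abundance)}.
    """
    counts = Counter(annotations)  # how many genes hit each function
    scores = {}
    for name, functions in pathways.items():
        if not functions:
            continue  # skip empty definitions to avoid dividing by zero
        observed = sum(1 for f in functions if counts[f] > 0)
        coverage = observed / len(functions)
        mean_abundance = sum(counts[f] for f in functions) / len(functions)
        scores[name] = (coverage, mean_abundance)
    return scores


# Toy input with made-up identifiers (assumption, for illustration only).
pathways = {
    "pathway_A": {"F1", "F2", "F3", "F4"},
    "pathway_B": {"F5", "F6"},
}
annotations = ["F1", "F1", "F2", "F5", "F6", "F6", "F6", "F9"]

scores = score_pathways(annotations, pathways)
for name, (coverage, abundance) in sorted(scores.items()):
    print(f"{name}: coverage={coverage:.2f}, mean abundance={abundance:.2f}")
```

In this toy input, `pathway_B` is fully covered while `pathway_A` is only half covered. Real scoring schemes typically also normalize for gene length and sequencing depth, which this sketch omits.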
Table 1. Tools for metagenomic analysis indexed by scope
| Scope | Name | Link to program |
|---|---|---|
| Recruitment | BWA | bio-bwa.sourceforge.net |
| | Bowtie | bowtie-bio.sourceforge.net |
| | FR-HIT | weizhong-lab.ucsd.edu/frhit |
| Assembly | Meta-Velvet | metavelvet.dna.bio.keio.ac.jp |
| | META-IDBA | i.cs.hku.hk/~alse/hkubrg/projects/metaidba/ |
| | IDBA-UD | i.cs.hku.hk/~alse/hkubrg/projects/idba_ud/ |
| | Genovo | cs.stanford.edu/group/genovo/ |
| Genes | FragGeneScan | omics.informatics.indiana.edu/FragGeneScan |
| | MGA | whale.bio.titech.ac.jp/metagene |
| | Glimmer-MG | |
| | GeneMark | exon.gatech.edu/metagenome |
| Annotation | RPSBlast | |
| | HMMer3 | hmmer.janelia.org |
| | BLAST | blast.ncbi.nlm.nih.gov |
| | RAPSearch2 | omics.informatics.indiana.edu/mg/RAPSearch2 |
| | RAST | rast.nmpdr.org/ |
| Taxonomy | RDPclassifier | rdp.cme.msu.edu |
| | NBC | nbc.ece.drexel.edu |
| | CARMA3 | webcarma.cebitec.uni-bielefeld.de |
| | MEGAN | ab.inf.uni-tuebingen.de/software/megan |
| | SOrt-ITEMS | metagenomics.atc.tcs.com/binning/SOrt-ITEMS |
| Servers | MG-RAST | metagenomics.anl.gov |
| | IMG/M | img.jgi.doe.gov/ |
| | EBI metagenomics | |
| Models | PathwayTools | bioinformatics.ai.sri.com/ptools/ |
| | Model SEED | seed-viewer.theseed.org/seedviewer.cgi?page=ModelView |
| Analysis | GSEA | |
| | ShotgunFunctionalizeR | |
| | MetaPath | |
| | STAMP | |
| | HUMAnN | uttenhower.sph.harvard.edu/humann |
Figure 2. Example of a ‘meta-pathway’: amino acid biosynthesis in the Acyrthosiphon pisum/Buchnera aphidicola symbiosis. Amino acids in square boxes are non-essential; methionine, in the round box, is essential. Solid lines and gene names refer to Buchnera; dashed lines and EC codes refer to Acyrthosiphon. The non-essential amino acid cysteine (Cys) is synthesized by Buchnera aphidicola from sulphate provisioned by phloem sap and from serine (non-essential) synthesized by A. pisum. Adapted from [11].
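The idea of a meta-pathway can be sketched as a reachability check over the pooled reactions of host and symbiont. The example below is a deliberately simplified toy: each organism's contribution is collapsed into a single lumped reaction, transport between partners is assumed to be free, and the metabolite `host_precursor` and the function `reachable` are hypothetical names introduced only for illustration. Only the overall logic (cysteine requires sulphate from phloem sap plus serine made by the host) is taken from the figure.

```python
def reachable(seeds, reactions):
    """Return every metabolite producible from `seeds` using `reactions`.

    reactions: list of (substrates, products) pairs of frozensets.
    A reaction fires once all of its substrates are available; the loop
    repeats until no reaction adds a new metabolite.
    """
    known = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= known and not products <= known:
                known |= products
                changed = True
    return known


# Lumped toy reactions (assumption: intermediate steps are collapsed).
host = [(frozenset({"host_precursor"}), frozenset({"Ser"}))]
symbiont = [(frozenset({"sulphate", "Ser"}), frozenset({"Cys"}))]

# Sulphate comes from phloem sap; host_precursor is a hypothetical stand-in.
seeds = {"sulphate", "host_precursor"}

host_only = reachable(seeds, host)
symbiont_only = reachable(seeds, symbiont)
combined = reachable(seeds, host + symbiont)

print("Cys from host alone:", "Cys" in host_only)
print("Cys from symbiont alone:", "Cys" in symbiont_only)
print("Cys from the meta-pathway:", "Cys" in combined)
```

Neither partner reaches cysteine on its own in this toy network, whereas the pooled reaction set does, which is the property that makes the pathway a meta-pathway.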