MAPT and PAICE: Tools for time series and single time point transcriptomics visualization and knowledge discovery.

Parsa Hosseini, Arianne Tremblay, Benjamin F Matthews, Nadim W Alkharouf.   

Abstract

With the advent of next-generation sequencing, -omics fields such as transcriptomics have seen data throughput increase by orders of magnitude. For analyzing and visually representing these huge datasets, an intuitive and computationally tractable approach is to map quantified transcript expression onto biochemical pathways while employing data-mining and visualization principles to accelerate knowledge discovery. We present two cross-platform tools, MAPT (Mapping and Analysis of Pathways through Time) and PAICE (Pathway Analysis and Integrated Coloring of Experiments), which together form an easy-to-use suite for time series and single time point transcriptomics analysis. In unison, MAPT and PAICE serve as a visual workbench for transcriptomics knowledge discovery, data-mining and functional annotation. PAICE and MAPT are distinct yet closely linked tools. PAICE is specifically designed to map EC accessions onto KEGG pathways while handling multiple gene copies, detection-call analysis, and annotated or unannotated EC accessions lacking quantifiable expression. MAPT integrates PAICE datasets to drive visualization, annotation, and data-mining. AVAILABILITY: Both tools are freely available at http://sourceforge.net/projects/paice/ and http://sourceforge.net/projects/mapt/

Year:  2012        PMID: 22493539      PMCID: PMC3321241          DOI: 10.6026/97320630008287

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

With next-generation sequencing becoming a mainstay in molecular biology, transcriptomics research will continue to advance rapidly. Genomic coverage, along with detailed measurements of gene expression and gene copies, is now at our fingertips. Just as our knowledge of high-throughput experiments continues to progress, so too will our understanding of annotated biochemical pathways. Databases such as KEGG [1] and Reactome provide a visual means of exploring functional enzyme activity within biological pathways. Numerous tools in active use interface -omics data with KEGG: Paintomics [2], Genoscape [3], and KEGGanim [4]. The Caleydo software [5] uses KEGG to visualize gene expression in 3D, with capabilities such as hierarchical clustering and a user-driven GUI to assist pathway exploration and analysis. These tools provide useful features and solid capabilities. However, we found that they are either organism dependent or have minimal features for processing time series data and handling multiple gene copies. We present MAPT and PAICE, tools that provide an organism-independent transcriptomics workbench. Equipped with time series analysis, visualization and data-mining capabilities, both tools provide a low-resource and user-friendly environment to drive knowledge discovery.

Software input/output

PAICE and MAPT are cross-platform standalone applications built using Python 2.7. PAICE requires the Python ‘suds’ SOAP client to query KEGG pathways, while MAPT requires ‘PyQt’ and ‘matplotlib’ for its GUI and graphing capabilities respectively. Running PAICE is the first step of an analysis within this suite. It requires a populated four-column tab-delimited text file as input. Each row in this file supplies a value for each of the four columns: an EC accession, a numerical experimental expression value, a numerical control expression value, and a unique reference identifier (e.g. gene locus or chromosomal coordinates). PAICE uses the KEGG web-service to map EC accessions onto biochemical pathways, a service that has been studied heavily and has given rise to numerous manuscripts and tools. PAICE, however, introduces additional features designed to deal with the complexities of today's -omics datasets. First is its handling of multiple EC gene copies: if a set of isoforms differ in expression such that some copies are induced while others are suppressed, each member of the set is flagged. This feature provides insight into individual isoform quantification, which is useful when investigating gene duplication or alternative splicing, as some copies may differ in expression more than others. Secondly, rather than adopting a static coloring scheme in which green and red represent induced and suppressed respectively, isoform expression is statistically stratified (lightly expressed, moderately expressed, heavily expressed). This stratification translates to color gradients in which each stratum has a unique color. Lastly, two additional strata are allocated: one for accessions failing to pass a user-defined fold-change cutoff, and another for annotated accessions that lack expression. This latter stratum highlights accessions which are annotated but do not have quantifiable expression and hence fail to map onto any pathway.
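The input format and the flagging and stratification steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not PAICE's actual code: it is written for Python 3 rather than Python 2.7, all function names and stratum labels are hypothetical, the column order follows the description above, expression change is measured as log2(experimental/control), the three expression strata are assumed to be rank-based tertiles of absolute log2 fold change (the source does not state which statistic PAICE uses), and rows with non-positive values are treated as lacking expression.

```python
import csv
import io
import math
from bisect import bisect_left
from collections import defaultdict


def read_rows(handle):
    """Parse four tab-delimited columns: EC, experimental, control, reference ID."""
    rows = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        ec, exp, ctrl, ref = row
        rows.append((ec, float(exp), float(ctrl), ref))
    return rows


def log2_fold_change(exp, ctrl):
    """Return log2(exp / ctrl), or None when either value is not positive."""
    if exp <= 0 or ctrl <= 0:
        return None
    return math.log2(exp / ctrl)


def flag_discordant(rows):
    """Return EC accessions having both induced and suppressed copies."""
    directions = defaultdict(set)
    for ec, exp, ctrl, _ref in rows:
        lfc = log2_fold_change(exp, ctrl)
        if lfc is not None and lfc != 0:
            directions[ec].add(lfc > 0)
    return {ec for ec, seen in directions.items() if len(seen) == 2}


def stratify(rows, cutoff=1.0):
    """Label each reference ID with a stratum (assumed tertile scheme)."""
    lfcs = {ref: log2_fold_change(exp, ctrl) for _ec, exp, ctrl, ref in rows}
    passing = sorted(abs(v) for v in lfcs.values()
                     if v is not None and abs(v) >= cutoff)
    names = ("light", "moderate", "heavy")
    labels = {}
    for ref, lfc in lfcs.items():
        if lfc is None:
            labels[ref] = "no_expression"
        elif abs(lfc) < cutoff:
            labels[ref] = "below_cutoff"
        else:
            # Rank among passing values, scaled onto three strata.
            rank = bisect_left(passing, abs(lfc))
            labels[ref] = names[rank * 3 // len(passing)]
    return labels


# Hypothetical example input; identifiers and values are made up.
SAMPLE = (
    "1.1.1.1\t8.0\t2.0\tlocus_A\n"
    "1.1.1.1\t1.0\t4.0\tlocus_B\n"
    "2.7.1.1\t3.0\t2.0\tlocus_C\n"
    "2.7.1.1\t0.0\t0.0\tlocus_D\n"
    "4.2.1.11\t32.0\t2.0\tlocus_E\n"
    "4.2.1.11\t16.0\t2.0\tlocus_F\n"
)

rows = read_rows(io.StringIO(SAMPLE))
flagged = flag_discordant(rows)
strata = stratify(rows, cutoff=1.0)
```

In this sketch each stratum label would then be mapped to its own color before the pathway is drawn; the choice of colors and the KEGG query itself are left out.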
Upon PAICE completion, a collection of KEGG pathways is generated in which all mapped EC accessions are colored according to their stratum. These pathways are then fed into MAPT, a graphical interface for sifting through expression-overlaid pathways. Numerous analytical tools like MAPT have been developed: CPTRA [6], GeneVestigator [7], and TRAM [8]. MAPT differs from these tools by bundling biological pathways with quantified expression while providing an organism-independent data-mining and transcriptomics analysis platform. Two analytical views make such analysis possible: a single time point view and a multiple time point view. The single time point view is suited to analyzing a single time point or PAICE dataset, and is equipped with features such as functional annotation, k-means clustering and pathway similarity analysis. In contrast, the multiple time point view (Figure 1) visualizes gene copy expression per time point and provides additional analyses of gene copy expression levels, which is useful in cases where X copies are induced but Y copies are suppressed across differing loci.
Figure 1

MAPT time series analysis and viewer. The three panels show isoform expression levels, minimum and maximum expression levels per isoform, and an image viewer that displays all pathways and their expression side by side, driven by PAICE-generated KEGG pathways. Any individual time point can be honed in on and analyzed independently in conjunction with additional built-in data-mining tools.
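The per-isoform summaries shown in the multiple time point view can be illustrated with a short Python sketch. This is a hedged example rather than MAPT's implementation: the function names, the time point labels, and the nested-dictionary layout (time point, then isoform identifier, then log2 fold change) are all assumptions made for illustration.

```python
def expression_range(series):
    """Map each isoform ID to its (min, max) value across all time points."""
    ranges = {}
    for values in series.values():
        for ref, value in values.items():
            lo, hi = ranges.get(ref, (value, value))
            ranges[ref] = (min(lo, value), max(hi, value))
    return ranges


def copy_counts(values):
    """Count induced (> 0) and suppressed (< 0) isoforms at one time point."""
    induced = sum(1 for v in values.values() if v > 0)
    suppressed = sum(1 for v in values.values() if v < 0)
    return induced, suppressed


# Hypothetical log2 fold changes for two copies of one EC accession.
SERIES = {
    "6h": {"locus_A": 2.0, "locus_B": -2.0},
    "12h": {"locus_A": 0.5, "locus_B": -3.0},
    "24h": {"locus_A": 3.5, "locus_B": 1.0},
}

ranges = expression_range(SERIES)
counts = {label: copy_counts(values) for label, values in SERIES.items()}
```

Here `counts` captures the "X copies induced, Y copies suppressed" situation per time point, and `ranges` corresponds to the minimum and maximum expression per isoform.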

Conclusions

MAPT and PAICE are two tools designed for visualization and analysis of transcriptomics datasets. PAICE uses the proven KEGG web-service to map numerical expression onto biochemical pathways, while MAPT provides an analytical framework to dissect such datasets and ultimately accelerate knowledge discovery through visualization and data-mining. Both MAPT and PAICE are actively in use in numerous research projects, e.g. in understanding host-pathogen interactions within soybean (Glycine max).

Future Improvement

PAICE and MAPT are under continuous development. We welcome user feedback and suggestions as we strive to make them easier to use and more intuitive.

1.  KEGGanim: pathway animations for high-throughput data.

Authors:  Priit Adler; Jüri Reimand; Jürgen Jänes; Raivo Kolde; Hedi Peterson; Jaak Vilo
Journal:  Bioinformatics       Date:  2007-12-01       Impact factor: 6.937

2.  Genoscape: a Cytoscape plug-in to automate the retrieval and integration of gene expression data and molecular networks.

Authors:  Mathieu Clément-Ziza; Christophe Malabat; Christian Weber; Ivan Moszer; Tero Aittokallio; Catherine Letondal; Sandrine Rousseau
Journal:  Bioinformatics       Date:  2009-08-03       Impact factor: 6.937

3.  Caleydo: connecting pathways and gene expression.

Authors:  Marc Streit; Alexander Lex; Michael Kalkusch; Kurt Zatloukal; Dieter Schmalstieg
Journal:  Bioinformatics       Date:  2009-07-20       Impact factor: 6.937

4.  TRAM (Transcriptome Mapper): database-driven creation and analysis of transcriptome maps from multiple sources.

Authors:  Luca Lenzi; Federica Facchin; Francesco Piva; Matteo Giulietti; Maria Chiara Pelleri; Flavia Frabetti; Lorenza Vitale; Raffaella Casadei; Silvia Canaider; Stefania Bortoluzzi; Alessandro Coppe; Gian Antonio Danieli; Giovanni Principato; Sergio Ferrari; Pierluigi Strippoli
Journal:  BMC Genomics       Date:  2011-02-18       Impact factor: 3.969

5.  Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes.

Authors:  Tomas Hruz; Oliver Laule; Gabor Szabo; Frans Wessendorp; Stefan Bleuler; Lukas Oertle; Peter Widmayer; Wilhelm Gruissem; Philip Zimmermann
Journal:  Adv Bioinformatics       Date:  2008-07-08

6.  Novel software package for cross-platform transcriptome analysis (CPTRA).

Authors:  Xin Zhou; Zhen Su; R Douglas Sammons; Yanhui Peng; Patrick J Tranel; C Neal Stewart; Joshua S Yuan
Journal:  BMC Bioinformatics       Date:  2009-10-08       Impact factor: 3.169

7.  KEGG for linking genomes to life and the environment.

Authors:  Minoru Kanehisa; Michihiro Araki; Susumu Goto; Masahiro Hattori; Mika Hirakawa; Masumi Itoh; Toshiaki Katayama; Shuichi Kawashima; Shujiro Okuda; Toshiaki Tokimatsu; Yoshihiro Yamanishi
Journal:  Nucleic Acids Res       Date:  2007-12-12       Impact factor: 16.971
