Alejandra González-Beltrán, Steffen Neumann, Eamonn Maguire, Susanna-Assunta Sansone, Philippe Rocca-Serra.
Abstract
BACKGROUND: The ISA-Tab format and software suite have been developed to break the silo effect induced by technology-specific formats for a variety of data types and to better support experimental metadata tracking. Experimentalists seldom use a single technique to monitor biological signals. Providing a multi-purpose, pragmatic and accessible format that abstracts away common constructs for describing Investigations, Studies and Assays, ISA is increasingly popular. To attract further interest towards the format and extend support to ensure reproducible research and reusable data, we present the Risa package, which delivers a central component to support the ISA format by enabling effortless integration with R, the popular, open source data crunching environment.
Year: 2014 PMID: 24564732 PMCID: PMC4015122 DOI: 10.1186/1471-2105-15-S1-S11
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Standards used in 'omics public repositories
| Technology Type | Minimum Information Guidelines | Metadata Format | Ontology or Controlled Vocabulary | Public Repositories |
|---|---|---|---|---|
| DNA microarray | MIAME | MAGE-Tab | MGED | ArrayExpress |
| next generation sequencing | MIMARKS, MIxS | SRA-XML | internal to SRA-schema | SRA, ENA |
| mass spectrometry | MIAPE | PRIDE-XML | MS | PRIDE |
| mass spectrometry, NMR spectroscopy | CIMR | ISA-Tab | OBI | MetaboLights |
Some examples of formats, data models, terminologies and public repositories for omics experiments
Both repositories, ArrayExpress and GEO, were originally designed to store DNA microarray data. They now also accept next generation sequencing data, which is handled through underlying submissions to ENA (see http://www.ebi.ac.uk/training/online/course/arrayexpress-submitting-data-using-mage-tab/submission-hts-data) and SRA.
GEO does not support the MAGE-Tab format, but GEO data can be accessed through ArrayExpress, where it is exposed as MAGE-Tab.
PRIDE submission guidelines: http://www.ebi.ac.uk/pride/submissionGuidelines.do
CIMR: http://msi-workgroups.sourceforge.net/
Figure 1. ISA-guided domain-specific workflows. A possible workflow for an ISA-Tab augmented experiment design and execution. ① The experiment is designed with, e.g., the ISAcreator to define the samples. ② The experiment is performed and samples are collected. ③ The sample names are transferred to the machine, e.g. a mass spectrometer (MS), to run the assays. ④ Assays are performed. In the MS example, the instrument software often allows sample names to be copied and pasted into its sample table, and creates a report (including MS filenames) that can go into the ISA-Tab assay information. ⑤ Domain-specific R packages, such as xcms for MS, process the raw data. ⑥ The Risa objects are augmented with the results of the assay and the completed ISA-Tab dataset can be written back to disk.
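Steps ③ and ④ amount to joining the planned sample names with the file names reported by the instrument and storing the result as a tab-delimited assay table. A minimal sketch in Python, assuming the instrument report is available as a mapping from sample name to raw file name. The function name `fill_assay_table`, the sample names and the file names are hypothetical; the column headers `Sample Name` and `Raw Spectral Data File` follow the ISA-Tab layout for mass spectrometry assays. This is an illustration only, not part of Risa or ISAcreator.

```python
import csv
import io


def fill_assay_table(sample_names, instrument_report):
    """Build a tab-delimited assay table from planned samples and reported files.

    sample_names: sample names defined at design time (hypothetical input).
    instrument_report: dict mapping sample name -> raw data file name.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sample Name", "Raw Spectral Data File"])
    for name in sample_names:
        # Samples the instrument has not measured yet keep an empty file cell.
        writer.writerow([name, instrument_report.get(name, "")])
    return buffer.getvalue()


# Hypothetical example data: two planned samples, one already measured.
samples = ["sample_01", "sample_02"]
report = {"sample_01": "run_001.CDF"}
table = fill_assay_table(samples, report)
```

Keeping the planned sample list as the driver of the loop means that unmeasured samples stay visible in the table instead of silently disappearing.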
Figure 2. Code for the ISATab-class.
Figure 3. Code for the AssayTab-class.
Mapping ISA configurations to BiocViews
| ISA Configuration File | ISA measurement | BiocViews (measurement) | ISA technology | BiocViews (technology) |
|---|---|---|---|---|
| cellcount_flowcytometry | cell counting | | flow cytometry | FlowCytometry |
| cellsorting_flowcyt | cell sorting | | flow cytometry | FlowCytometry |
| clinical_chemistry | clinical chemistry analysis | | | |
| copynumvariation_micro | copy number variation profiling | CopyNumberVariants | DNA microarray | Microarray, aCGH |
| dnamethylation_micro | DNA methylation profiling | DNAMethylation | DNA microarray | Microarray, ChIPchip, CpGIsland, Methylseq |
| dnamethylation_seq | DNA methylation profiling | DNAMethylation | nucleotide sequencing | Sequencing, ChIPseq, CpGIsland, Methylseq |
| envgen_survey_seq | environmental gene survey | | nucleotide sequencing | Sequencing |
| genome_seq | genome sequencing | | nucleotide sequencing | Sequencing |
| hematology | hematology | | | |
| heterozygosity_micro | loss of heterozygosity profiling | SNP, CopyNumberVariants | DNA microarray | Microarray |
| histology | histology | | | |
| histonemodification_seq | histone modification profiling | Regulation | nucleotide sequencing | Sequencing, ChIPseq |
| metaboliteprofiling_ms | metabolite profiling | Metabolomics | mass spectrometry | MassSpectrometry |
| metaboliteprofiling_nmr | metabolite profiling | Metabolomics | NMR spectroscopy | |
| metagenome_seq | metagenome sequencing | | nucleotide sequencing | Sequencing |
| ppi_detection_micro | protein-protein interaction detection | | protein microarray | Microarray |
| protein_dna_binding_ident_micro | protein-DNA binding site identification | Regulation | DNA microarray | Microarray, ChIPchip |
| protein_dna_binding_ident_seq | protein-DNA binding site identification | Regulation | nucleotide sequencing | Sequencing, ChIPseq |
| proteinexpression_ge | protein expression profiling | Proteomics | gel electrophoresis | |
| proteinexpression_micro | protein expression profiling | Proteomics | protein microarray | Microarray |
| proteinexpression_ms | protein expression profiling | Proteomics | mass spectrometry | MassSpectrometry, Proteomics |
| proteinident_ms | protein identification | | mass spectrometry | MassSpectrometry, Proteomics |
| snpanalysis_micro | SNP analysis | SNP | DNA microarray | Microarray, GeneticVariability |
| studySample | | | | |
| tfbsident_micro | transcription factor binding site identification | Regulation | DNA microarray | Microarray, ChIPchip |
| tfbsident_seq | transcription factor binding site identification | Regulation | nucleotide sequencing | Sequencing, ChIPseq |
| transcription_micro | transcription profiling | Transcription, GeneExpression | DNA microarray | Microarray, DifferentialExpression, ExonArray |
| transcription_rtpcr | transcription profiling | Transcription, GeneExpression, DifferentialExpression | real time PCR | qPCR |
| transcription_seq | transcription profiling | Transcription, GeneExpression | nucleotide sequencing | Sequencing, DifferentialExpression, RNAseq |
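The mapping above can be read as a lookup from an assay's measurement and technology types to candidate BiocViews terms. A minimal sketch in Python, using a handful of rows from the table. The names `BIOCVIEWS` and `suggest_biocviews` are hypothetical, and this is not a description of how Risa implements `suggestBiocPackage`; it only illustrates the lookup step.

```python
# Illustrative subset of the mapping table, keyed by
# (ISA measurement, ISA technology). Each value holds the BiocViews terms
# for the measurement and for the technology. Not the Risa implementation.
BIOCVIEWS = {
    ("metabolite profiling", "mass spectrometry"):
        (["Metabolomics"], ["MassSpectrometry"]),
    ("metabolite profiling", "NMR spectroscopy"):
        (["Metabolomics"], []),
    ("protein expression profiling", "mass spectrometry"):
        (["Proteomics"], ["MassSpectrometry", "Proteomics"]),
    ("transcription profiling", "DNA microarray"):
        (["Transcription", "GeneExpression"],
         ["Microarray", "DifferentialExpression", "ExonArray"]),
    ("cell counting", "flow cytometry"):
        ([], ["FlowCytometry"]),
}


def suggest_biocviews(measurement, technology):
    """Return de-duplicated BiocViews terms for one assay, in table order."""
    measurement_views, technology_views = BIOCVIEWS.get(
        (measurement, technology), ([], [])
    )
    terms = []
    for term in measurement_views + technology_views:
        if term not in terms:  # e.g. Proteomics appears in both columns
            terms.append(term)
    return terms


views = suggest_biocviews("protein expression profiling", "mass spectrometry")
```

Keying on the pair rather than on each column separately matters because the same technology maps to different terms depending on the measurement: mass spectrometry yields `MassSpectrometry` alone for metabolite profiling, but `MassSpectrometry, Proteomics` for protein expression profiling.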
Figure 4. Output of suggestBiocPackage(faahkoISA).
Figure 5. Risa usage. Download statistics for the Risa package in Bioconductor, retrieved on 24 June 2013; the latest data are available at http://bioconductor.org/packages/stats/bioc/Risa.html.