Carles Hernandez-Ferrer, Carlos Ruiz-Arenas, Alba Beltran-Gomila, Juan R González.
Abstract
BACKGROUND: Reduction in the cost of genomic assays has generated large amounts of biomedical data. As a result, current studies perform multiple experiments on the same subjects. While Bioconductor's methods and classes, implemented in different packages, manage individual experiments, there is no standard class to properly manage different omic datasets from the same subjects. In addition, most R/Bioconductor packages designed to integrate and visualize biological data often use basic data structures that lack clear general methods, such as subsetting or selecting samples.
Keywords: Data infrastructure; Data integration; Data organization; Omics data; R
Year: 2017 PMID: 28095799 PMCID: PMC5240259 DOI: 10.1186/s12859-016-1455-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. This schema shows how the information is stored in the five attributes of a MultiDataSet and how the different parts are linked. The phenoData and assayData attributes share the dimension corresponding to samples. The featureData, rowRanges and assayData attributes share the dimension corresponding to features. All the attributes are linked through the dataset name.
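The linkage described in the Fig. 1 caption can be sketched with a toy model. This is a hypothetical Python illustration, not the MultiDataSet R API: the class name `ToyMultiDataSet`, its methods, and the example data are assumptions, and only the four attributes named in the caption are modelled. It shows one way the shared sample and feature dimensions could be enforced when every attribute is keyed by the dataset name.

```python
class ToyMultiDataSet:
    """Hypothetical stand-in for the linked attributes named in Fig. 1."""

    def __init__(self):
        # Each attribute is a dict keyed by the dataset name (the common link).
        self.assay_data = {}    # name -> {feature: {sample: value}}
        self.pheno_data = {}    # name -> {sample: {annotation: value}}
        self.feature_data = {}  # name -> {feature: {annotation: value}}
        self.row_ranges = {}    # name -> {feature: (chromosome, start, end)}

    def add(self, name, assay, pheno, features, ranges):
        if name in self.assay_data:
            raise ValueError(f"a dataset named {name!r} already exists")
        feature_ids = set(assay)
        sample_ids = set()
        for row in assay.values():
            sample_ids.update(row)
        # phenoData and assayData must share the sample dimension.
        if set(pheno) != sample_ids:
            raise ValueError("sample identifiers differ between assay and pheno")
        # featureData, rowRanges and assayData must share the feature dimension.
        if set(features) != feature_ids or set(ranges) != feature_ids:
            raise ValueError("feature identifiers differ between attributes")
        self.assay_data[name] = assay
        self.pheno_data[name] = pheno
        self.feature_data[name] = features
        self.row_ranges[name] = ranges

    def dims(self, name):
        """Return (number of features, number of samples) for one dataset."""
        return len(self.feature_data[name]), len(self.pheno_data[name])


# Made-up example data, for illustration only.
mds = ToyMultiDataSet()
mds.add(
    "expression",
    assay={"geneA": {"s1": 1.0, "s2": 2.5}, "geneB": {"s1": 0.3, "s2": 0.9}},
    pheno={"s1": {"sex": "F"}, "s2": {"sex": "M"}},
    features={"geneA": {"symbol": "A"}, "geneB": {"symbol": "B"}},
    ranges={"geneA": ("chr1", 100, 200), "geneB": ("chr2", 50, 80)},
)
```

Keying every attribute by the dataset name means a single name is enough to retrieve all linked parts of one dataset.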
Fig. 2. This figure represents the organization of the specific and basic functions that add datasets to a MultiDataSet object. The basic functions receive generic eSet and SummarizedExperiment objects and interact directly with the MultiDataSet object; developers should use them to extend the functionality of MultiDataSet. The specific functions receive more specific datasets, check the structure of the dataset, and interact with the MultiDataSet object through the basic functions; users should use them.
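The two-layer design described in the Fig. 2 caption can be illustrated with a short sketch. This is a hypothetical Python analogy, not the package's R functions: the names `add_dataset` and `add_expression`, the plain-dict store, and the particular structural check are all assumptions chosen for illustration.

```python
def add_dataset(store, name, dataset):
    """Basic function: stores any generic dataset under a unique name."""
    if name in store:
        raise ValueError(f"a dataset named {name!r} already exists")
    store[name] = dataset
    return store


def add_expression(store, dataset):
    """Specific function: checks the dataset structure, then delegates."""
    # Hypothetical check: every assayed feature needs a genomic range.
    missing = [f for f in dataset["assay"] if f not in dataset["ranges"]]
    if missing:
        raise ValueError(f"features without genomic ranges: {missing}")
    # The specific function never touches the store directly.
    return add_dataset(store, "expression", dataset)


# Made-up example data, for illustration only.
store = {}
add_expression(
    store,
    {"assay": {"geneA": {"s1": 1.0}}, "ranges": {"geneA": ("chr1", 100, 200)}},
)
```

Under this layout, support for a new data type only requires a new specific function with its own checks, while the storage logic stays in one place.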