Robert M Flight, Benjamin J Harrison, Fahim Mohammad, Mary B Bunge, Lawrence D F Moon, Jeffrey C Petruska, Eric C Rouchka.
Abstract
Assessment of high-throughput -omics data initially focuses on relative or raw levels of a particular feature, such as an expression value for a transcript, protein, or metabolite. At a second level, analyses of annotations, including known or predicted functions and associations of each individual feature, attempt to distill biological context. Most currently available comparative and meta-analysis methods depend on the availability of identical features across data sets, and concentrate on determining features that are differentially expressed across experiments, some of which may be considered "biomarkers." The heterogeneity of measurement platforms and the inherent variability of biological systems confound the search for robust biomarkers indicative of a particular condition. In many instances, however, multiple data sets show involvement of common biological processes or signaling pathways, even though individual features are not commonly measured or differentially expressed between them. We developed a methodology, categoryCompare, for cross-platform and cross-sample comparison of high-throughput data at the annotation level. We assessed the utility of the approach using hypothetical data, and by determining similarities and differences in the set of processes in two instances: (1) denervated skin vs. denervated muscle, and (2) colon from Crohn's disease vs. colon from ulcerative colitis (UC). The hypothetical data showed that in many cases comparing annotations gave superior results to comparing only at the gene level. Improved analytical results also depended on the number of genes included in the annotation term, the amount of noise in relation to the number of genes expressing in unenriched annotation categories, and the specific method by which samples are combined. In the skin vs. muscle denervation comparison, the tissues demonstrated markedly different responses. The Crohn's vs. UC comparison showed gross similarities in inflammatory response in the two diseases, with particular processes specific to each disease.
Keywords: comparative analysis; meta-analysis; metabolomics; proteomics; transcriptomics
Year: 2014 PMID: 24808906 PMCID: PMC4010757 DOI: 10.3389/fgene.2014.00098
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 2. Results of SBTB hypothetical data analysis. (A) Distribution of number of genes annotated to GO::BP terms in human. (B) Difference in transformed p-values comparing maximum p-value from sample calculations and p-value from combined sample as a function of the fraction of genes in the sample annotated to the GO term. Plot is divided by the size classification of the GO term. (C) Average and standard deviations of performing the same calculations as in (B) using 100 independent sample generations. (D) Contour plot after using 100 different GO::BP sets to perform the same calculations as in (B). (E) Median p-value differences (taking the median using size as the grouping) as a function of the number of noise genes added, colored by the fraction of noise genes shared between two samples.
Figure 1. Flow diagram of comparative-analysis approaches. F, feature list; A, annotation list; FA, feature to annotation relationship data; 1,2, data set origination. “Start” denotes initial data availability, in this case from two different data sets, F1 and F2. The “Traditional Analysis” considers feature intersection and set-differences, possibly combined with direct comparison of the features (shown as a heatmap); whereas categoryCompare uses the enriched annotations of the feature list from each data set derived independently, combined with the feature-annotation relationships. See Results and Methods for more details.
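The contrast drawn in Figure 1 can be illustrated with a minimal Python sketch. It is not the categoryCompare implementation: the gene identifiers, the three annotation terms, the 0.05 cutoff, and the helper names `hypergeom_sf` and `enriched_annotations` are all hypothetical, and a plain hypergeometric over-representation test without multiple-testing correction is assumed as the enrichment method. The two feature lists share no genes, yet both are enriched for the same annotation.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N items from M, of which n carry the annotation."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def enriched_annotations(features, annotation_map, universe, alpha=0.05):
    """Return {term: p-value} for terms over-represented in `features`.

    Assumption: uncorrected hypergeometric test; a real analysis would
    also correct for multiple testing.
    """
    universe = set(universe)
    features = set(features) & universe
    hits = {}
    for term, members in annotation_map.items():
        members = set(members) & universe
        k = len(features & members)
        if k == 0:
            continue
        p = hypergeom_sf(k, len(universe), len(members), len(features))
        if p <= alpha:
            hits[term] = p
    return hits


# Hypothetical toy data: 30 measured genes and three annotation terms (FA).
universe = [f"g{i}" for i in range(30)]
fa = {
    "muscle contraction": [f"g{i}" for i in range(0, 6)],
    "cell cycle": [f"g{i}" for i in range(6, 12)],
    "transport": [f"g{i}" for i in range(12, 18)],
}

# Two feature lists (F1, F2) from different data sets, with no gene in common.
f1 = {"g0", "g1", "g2", "g15"}
f2 = {"g3", "g4", "g5", "g16"}

# Traditional analysis: intersect the feature lists directly.
shared_features = f1 & f2

# Annotation-level analysis: enrich each list independently, then compare terms.
enriched_1 = enriched_annotations(f1, fa, universe)
enriched_2 = enriched_annotations(f2, fa, universe)
shared_terms = set(enriched_1) & set(enriched_2)

print("shared features:", shared_features)
print("shared enriched terms:", shared_terms)
```

In this toy case the feature intersection is empty, while the annotation-level comparison still recovers the term that both lists have in common.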
Figure 3. Results of RBTF hypothetical data analysis. (A) Difference of transformed p-values comparing the maximum term p-value from sample calculations and p-value from a combined sample where samples were combined using averages of t-statistics, as a function of the proportion of genes in the top 50% of entries by rank. (B) As in (A), but the maximum p-value was used to combine samples gene-wise.
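Figure 3 contrasts two gene-wise rules for combining samples: averaging t-statistics and taking the maximum p-value. The sketch below, in Python, uses hypothetical gene names and made-up statistics, and the function names are illustrative rather than taken from categoryCompare. It only shows that the two rules can rank the same genes differently.

```python
def combine_by_mean_t(t1, t2):
    """Average the t-statistics of genes measured in both samples."""
    return {g: (t1[g] + t2[g]) / 2 for g in t1.keys() & t2.keys()}


def combine_by_max_p(p1, p2):
    """Keep the larger (less significant) p-value of genes measured in both samples."""
    return {g: max(p1[g], p2[g]) for g in p1.keys() & p2.keys()}


# Hypothetical per-gene statistics for two samples (illustrative values only).
t_sample_1 = {"geneA": 4.0, "geneB": 1.0, "geneC": -2.0}
t_sample_2 = {"geneA": 2.0, "geneB": 6.0, "geneC": -1.0}
p_sample_1 = {"geneA": 0.01, "geneB": 0.20, "geneC": 0.04}
p_sample_2 = {"geneA": 0.03, "geneB": 0.001, "geneC": 0.50}

mean_t = combine_by_mean_t(t_sample_1, t_sample_2)
max_p = combine_by_max_p(p_sample_1, p_sample_2)

# Rank genes: largest mean t first, smallest maximum p-value first.
rank_by_mean_t = sorted(mean_t, key=mean_t.get, reverse=True)
rank_by_max_p = sorted(max_p, key=max_p.get)

print(rank_by_mean_t)
print(rank_by_max_p)
```

With these values, geneB ranks first under the averaging rule because one sample gives it a very large t-statistic, whereas the maximum p-value rule ranks geneA first because it requires evidence in both samples.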
Number of significantly enriched annotations in each list following denervation.
| 263 | 343 | 247 | 259 | 151 | 130 |
Significant term annotation groups and the gene-list they appeared significant in for the Skin vs. Muscle comparison.
| ATP biosynthesis | X | X | X | |||
| Muscle contraction and development | X | X | X | |||
| Mitotic cell cycle checkpoint | X | X | X | |||
| pH and lysosome regulation | X | X | X | |||
| Neurotransmitter secretion and transport | X | X | ||||
| Epithelial tube branching | X | X | ||||
| Endocrine regulation of blood pressure | X | X | ||||
| Fluid transport | X | X | ||||
| Tube development | X | X | ||||
| Response to glucose | X | X | ||||
| Steroid biosynthesis | X | |||||
| Lung cell differentiation | X | |||||
| Neg regulation of astrocyte diff. | X | |||||
| Fear response | X | |||||
| Dopamine transport | X | |||||
| Collateral sprouting | X | |||||
| Response to osmotic stress | X | |||||
| Positive regulation of epidermal growth factor signaling | X | |||||
| Response to leptin | X | |||||
| Catecholamine biosynthesis | X | |||||
| Microtubule organization | X | X | ||||
| Blood vessel endothelial cell differentiation | X | X | ||||
| Negative regulation of protein transport | X | X | ||||
| Negative regulation of response to granulocyte/myeloid cell diff. | X | X | ||||
| Type II hypersensitivity | X | |||||
| Entry into host and movement | X | |||||
| Response to virus | X | |||||
| Mitotic spindle assembly | X | |||||
| N acetylglucosamine metabolism | X | |||||
| Proteoglycan biosynthesis | X | |||||
| Negative regulation of cell junction assembly | X | |||||
| Endocardial cell differentiation | X | |||||
| Sequestering of actin monomers | X | |||||
| Beta amyloid formation | X | |||||
| Response to platelet derived growth factor stimulus | X | |||||
| Lipopolysaccharide biosynthesis | X | |||||
| Aminoglycan metabolism | X | |||||
| Epithelial to mesenchymal transition, endocardial cushion formation | X | |||||
| Pulmonary valve dev. and morphogenesis | X | |||||
| Cellular respiration | X | X | ||||
| Acetyl-CoA biosynthesis | X | X | ||||
| Response to muscle activity | X | X | ||||
| Protein localization in mitochondrion | X | |||||
| Cation channel activity | X | |||||
| Interferon gamma response | X | |||||
| Calcineurin NFAT signaling cascade | X | |||||
| Histone demethylation | X | |||||
| Negative regulation of Ras signal transduction | X | |||||
| Positive regulation of metalloenzyme activity | X | |||||
| Glucocorticoid receptor signaling pathway | X | |||||
| Atrioventricular valve dev. and morphogenesis | X | X | ||||
| Vitamin metabolism | X | X | ||||
| Digestion | X | |||||
| Synapse assembly and nervous system development | X | |||||
| Folic acid compound metabolism | X | |||||
| Retinoic acid biosynthesis | X | |||||
| Urea cycle | X | |||||
| Response to prostaglandin | X | |||||
| Keratinocyte migration | X | |||||
| Amino acid transport | X | |||||
| Mesenchymal cell diff in kidney/Renal system dev. | X | |||||
| Axon extension | X | |||||
| Lung lobe dev. and morphogenesis | X | |||||
| Embryonic digestive tract dev. and morphogenesis | X | |||||
| Phosphate ion transport | X | |||||
| Fluid secretion | X | |||||
| Interleukin 13 production | X | |||||
| Glia guided migration | X | |||||
| Central nervous system maturation | X | |||||
| Bile acid transport | X | |||||
| Positive regulation of muscle cell apoptosis | X | |||||
| Purine metabolism | X | |||||
| Regulation of phospholipase C | X | |||||
| Response to VEGF | X | |||||
| Membrane repolarization | X | |||||
| Carbohydrate catabolism | X | |||||
| rRNA processing | X | |||||
| Response to indole 3 methanol | X | |||||
| DNA break repair | X |
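
A membership table of this kind, marking the gene lists in which each annotation group is significant, can be derived from per-list sets of significant terms. The following Python sketch assumes hypothetical list and term names and a made-up helper `membership_table`; it does not reproduce the results reported here.

```python
def membership_table(significant):
    """Build rows of (term, marks), one mark per gene list.

    `significant` maps a gene-list name to the set of terms found
    significant in that list. Terms shared by more lists are placed first.
    """
    lists = list(significant)
    all_terms = sorted(set().union(*significant.values()))
    rows = []
    for term in all_terms:
        marks = ["X" if term in significant[name] else "" for name in lists]
        rows.append((term, marks))
    # Stable sort: alphabetical order is kept within each level of sharing.
    rows.sort(key=lambda row: -row[1].count("X"))
    return lists, rows


# Hypothetical input: two gene lists and placeholder term names.
significant = {
    "list_1": {"term a", "term b"},
    "list_2": {"term a", "term c"},
}

lists, rows = membership_table(significant)
print("| term | " + " | ".join(lists) + " |")
for term, marks in rows:
    print("| " + term + " | " + " | ".join(marks) + " |")
```

Sorting by the number of marks places terms common to several lists at the top, followed by terms specific to a single list.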
Number of significant annotations for each comparison for Crohn's compared to UC using .
| 434 | 169 | 264 | 110 |
Significant term annotation groups and the gene-list they appeared significant in for the Crohn's vs. UC comparison.
| Hydrogen peroxide metabolism | X | X | ||
| Nucleotide and nucleoside metabolism | X | X | ||
| Amine metabolism | X | X | ||
| Extrinsic signal transduction | X | X | ||
| Regulation of nitric-oxide synthase | X | X | ||
| Fatty-acyl-CoA biosynthesis | X | X | ||
| Chemokine and cytokine production | X | X | ||
| ER unfolded protein response | X | X | ||
| Antigen processing and presentation | X | X | ||
| Response to lipopolysaccharide and bacterial | X | |||
| Regulation of inflammatory response | X | |||
| Regulation of cell cycle and DNA damage response | X | |||
| Regulation of ubiquitination and ligase activity | X | |||
| NIK/NF-kappaB cascade | X | |||
| Regulation of Ras/Rac/Rho GTPase activity | X | |||
| COPII vesicle coating and targeting | X | |||
| Negative regulation of peptidase activity | X | |||
| Response to type 1 interferon | X | |||
| Protein N-linked glycosylation | X | |||
| Glandular cell differentiation | X | |||
| Membrane biogenesis and assembly | X | |||
| Activin receptor signaling | X | |||
| Oligodendrocyte differentiation | X | X | ||
| Cellular pattern specification | X | X | ||
| NAD biosynthesis | X | |||
| Hormone metabolism | X | |||
| Response to growth hormone | X | |||
| Melanin metabolism | X | |||
| Protein dephosphorylation | X |
Figure 4. Examples of annotation groups and associated legends. (A) From left to right, visualization of annotation groups “regulation of ubiquitination and ligase activity,” “nucleoside and nucleotide metabolism,” and “amine metabolism,” with associated legend in (B). (C) Pie chart visualization of annotation group “muscle fiber development,” with legend in (D).