| Literature DB >> 24955372 |
Hui Zhao, Fenglin Cao, Yonghui Gong, Huafeng Xu, Yiping Fei, Longyue Wu, Xiangmei Ye, Dongguang Yang, Xiuhua Liu, Xia Li, Jin Zhou.
Abstract
RNA-Seq is emerging as an increasingly important tool in biological research, and it provides the most direct evidence of the relationship between the physiological state and molecular changes in cells. A large amount of RNA-Seq data across diverse experimental conditions has been generated and deposited in public databases. However, most existing approaches for coexpression analysis focus on mining coexpression patterns in the transcriptome and thereby ignore the magnitude of gene differences within one pattern. Furthermore, the functional relationships of genes within one pattern, and notably among patterns, were not always recognized. In this study, we developed an integrated strategy to identify differential coexpression patterns of genes and probed the functional mechanisms of the modules. Two real datasets were used to validate the method and allow comparisons with other methods. One of the datasets was selected to illustrate the flow of a typical analysis. In summary, we present an approach to robustly detect coexpression patterns in transcriptomes and to stratify patterns according to their relative differences. Furthermore, a global relationship between patterns and biological functions was constructed. In addition, a freely accessible web toolkit, "coexpression pattern mining and GO functional analysis" (COGO), was developed.
Year: 2014 PMID: 24955372 PMCID: PMC4052503 DOI: 10.1155/2014/969768
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. A schematic overview of COGO. A series of RNA-Seq data with three conditions was selected to illustrate the analysis process. The characteristic attributes "Der" and "SE" were extracted by a derivation method of polynomial curve fitting (DPCF) and by Shannon's Entropy (SE) models, respectively. Gene categories can then be established through clustering. A functional enrichment analysis was then performed for the categories to determine significant functions. Finally, the semantic similarity measurement was conducted to identify functional modules.
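The two attributes named in the caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `shannon_entropy` and `quadratic_derivatives` are hypothetical, normalizing a gene's expression values to proportions before computing the entropy is an assumption, and fitting an exact quadratic through three conditions placed at x = 0, 1, 2 is only one plausible reading of "derivation of polynomial curve fitting".

```python
import math


def shannon_entropy(expression):
    """Shannon entropy (in bits) of one gene's expression across conditions.

    Assumption: values are non-negative and are normalized to proportions.
    A gene expressed evenly everywhere has high entropy; a gene expressed
    in a single condition has entropy 0.
    """
    total = sum(expression)
    if total <= 0:
        return 0.0
    probs = [x / total for x in expression if x > 0]
    return -sum(p * math.log2(p) for p in probs)


def quadratic_derivatives(ys):
    """Slopes at x = 0, 1, 2 of the quadratic through (0, y0), (1, y1), (2, y2).

    Assumption: three conditions at equally spaced positions, so the
    degree-2 polynomial a*x**2 + b*x + c fits the points exactly.
    """
    y0, y1, y2 = ys
    a = (y0 - 2 * y1 + y2) / 2
    b = (-3 * y0 + 4 * y1 - y2) / 2
    # Derivative of a*x**2 + b*x + c is 2*a*x + b.
    return [b, 2 * a + b, 4 * a + b]


# Hypothetical expression values for one gene in three conditions.
profile = [2.0, 8.0, 6.0]
features = quadratic_derivatives(profile) + [shannon_entropy(profile)]
```

A feature vector of this kind (slopes describing the shape of the pattern, entropy describing how unevenly the gene is expressed) could then be passed to any standard clustering routine to form gene categories.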
Figure 2. Gene expression pattern classification results of the colon RNA-Seq dataset. (a) A chart showing gene expression patterns among different tissues for each cluster category. The y-axis is dimensionless and represents the mean gene relative expression level; error bars show the standard deviation. (b) The hollow dots represent the mean of SE for each category; error bars show the standard deviation. (c) The number of genes in each category and the cumulative percentage of the number of genes from C1 to C16.
Figure 3. The functional relationship network of categories and enriched GO terms for the biological process category. The enriched GO terms of C15 and C16 are indicated by blue circles, and the other categories are indicated by red circles. The bar charts represent the expression pattern of the category. This figure was constructed to show the overall relationship of GO functions to gene patterns and gene patterns to gene patterns. More detailed GO terms are presented in Table S2 in Supplementary Material available online at http://dx.doi.org/10.1155/2014/969768.
Figure 4. The functional similarity of the GO terms in the biological process category enriched in C1, C3, C8, C10, and C12 is displayed as a heatmap. The similarity scores are indicated by color intensity, with red representing high similarity and green representing low similarity (FDR* = −log10(FDR)).
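The FDR* transform in the caption can be sketched as follows. The helper name `neg_log10_fdr` and the example values are hypothetical and are not taken from the paper; they only illustrate that smaller FDR values map to larger FDR* scores.

```python
import math


def neg_log10_fdr(fdr):
    """Return FDR* = -log10(FDR) for an FDR value in the interval (0, 1]."""
    if not 0 < fdr <= 1:
        raise ValueError("FDR must be in the interval (0, 1]")
    return -math.log10(fdr)


# Hypothetical FDR values, for illustration only.
example_fdrs = [0.05, 0.01, 0.001]
scores = [neg_log10_fdr(f) for f in example_fdrs]
```

On this scale an FDR of 0.01 corresponds to a score of about 2, and each further tenfold drop in FDR adds about 1.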