Jincheol Park, Cenny Taslim, Shili Lin.
Abstract
BOG (Bacterium and virus analysis of Orthologous Groups) is a package for identifying groups of differentially regulated genes in light of gene functions for various viral and bacterial genomes. It is designed to identify Clusters of Orthologous Groups (COGs) that are enriched among genes that have undergone significant changes under different conditions. This contributes to the detection of pathogens, an important research area with relevance to, among other things, uncovering bioterrorism. The statistical analyses provided are the hypergeometric test, the Mann-Whitney rank sum test, and gene set enrichment analysis. Results are organized and presented in tabular and graphical forms for ease of understanding and dissemination. BOG is implemented as an R package, which is available from CRAN or can be downloaded from http://www.stat.osu.edu/~statgen/SOFTWARE/BOG/.
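As an illustration of the hypergeometric enrichment idea described above, the following is a minimal Python sketch and not BOG's actual R implementation. It computes the upper-tail probability of seeing at least the observed number of differentially expressed genes in a COG category, then adjusts the p-values across categories. The function names, the toy counts, and the choice of the Benjamini-Hochberg adjustment are assumptions made for illustration; the abstract does not say which multiple-testing correction BOG applies.

```python
from math import comb


def hypergeom_upper_tail(k, population, in_category, drawn):
    """P(X >= k) for X ~ Hypergeometric.

    population  : total number of genes in the genome
    in_category : number of genes annotated to the COG category
    drawn       : number of differentially expressed genes
    k           : differentially expressed genes observed in the category
    """
    total = comb(population, drawn)
    upper = min(in_category, drawn)
    lower = max(k, 0)
    favourable = sum(
        comb(in_category, i) * comb(population - in_category, drawn - i)
        for i in range(lower, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed correction, see lead-in)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical toy counts: (category size, DE genes observed in category).
    genome_size, n_de = 1000, 100
    toy_counts = {"P": (80, 20), "F": (50, 9), "E": (120, 12)}
    raw = [
        hypergeom_upper_tail(obs, genome_size, size, n_de)
        for size, obs in toy_counts.values()
    ]
    for name, p_raw, p_adj in zip(toy_counts, raw, benjamini_hochberg(raw)):
        print(name, p_raw, p_adj)
```

The expected count for a category under the null is `drawn * in_category / population`, which is the quantity a plot of observed versus expected counts would compare against.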
Keywords: Bacterium and virus analysis; Clusters of Orthologous Groups; Gene set enrichment analysis; Hypergeometric test; Mann–Whitney Rank Sum test; Tabular and graphical visualization
Year: 2015 PMID: 26106460 PMCID: PMC4475783 DOI: 10.1016/j.csbj.2015.05.002
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Flowchart and sample outputs. (a) The flowchart depicts the three sequential modules that make up BOG. (b) COGs with adjusted p-value < 0.1 from the hypergeometric test. For each COG (P, F and E), the left bar represents the observed number of differentially expressed genes, while the right bar represents the number expected given the size of the COG. The p-values shown are adjusted for multiple testing. (c) Tabular output from the Mann–Whitney rank sum test. The middle column gives raw p-values, while the last column gives p-values adjusted for multiple testing. (d) An example GSEA scoring path for the “P” category. The maximum score is reached at 358 genes, with the majority of the top 358 genes coming from the “P” category (in red). The p-value is adjusted for multiple testing. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
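The scoring path in panel (d) can be illustrated with a minimal Python sketch of an unweighted, Kolmogorov-Smirnov-style running sum, which is one common form of GSEA scoring. This is an assumption about the general technique, not a description of BOG's exact scoring scheme, and the function names and toy gene identifiers are hypothetical.

```python
def running_score(ranked_genes, gene_set):
    """Return the running-sum path over a ranked gene list.

    Each gene in the set raises the score by 1/n_hit; each gene outside
    the set lowers it by 1/n_miss, so the path ends near zero.
    """
    members = set(gene_set)
    n_hit = sum(1 for g in ranked_genes if g in members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("need genes both inside and outside the set")
    path = []
    score = 0.0
    for gene in ranked_genes:
        score += 1.0 / n_hit if gene in members else -1.0 / n_miss
        path.append(score)
    return path


def peak(path):
    """Return (1-based position, score) where the path is highest."""
    best = max(range(len(path)), key=lambda i: path[i])
    return best + 1, path[best]


if __name__ == "__main__":
    # Hypothetical ranked list, most strongly changed gene first.
    ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
    category_p = {"g1", "g2", "g3"}
    position, score = peak(running_score(ranked, category_p))
    print(position, score)
```

The position of the peak plays the role of the "358 genes" mentioned in the legend: it marks how far down the ranked list the category's members are concentrated. A significance level for the peak would typically come from permutations, which this sketch omits.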