Improving the sensitivity of sample clustering by leveraging gene co-expression networks in variable selection.

Zixing Wang, F Anthony San Lucas, Peng Qiu, Yin Liu.

Abstract

BACKGROUND: Many variable selection techniques have been proposed for the clustering of gene expression data. While these methods tend to filter out irrelevant genes and identify informative genes that contribute to a clustering solution, they rely on criteria that do not consider potential interactions among individual genes. Motivated by ensemble clustering, there is strong interest in leveraging the structure of gene networks for gene selection, so that the relationship information between genes can be used effectively while the selected genes still preserve all the possible clustering structures in the data.
RESULTS: We present a new filter method that uses gene connectivity in the gene co-expression network as the evaluation criterion for variable selection. Gene connectivity measures the importance of a gene in terms of its expression similarity with the other genes in the co-expression network. Hard threshold and soft threshold transformations are employed to construct the gene co-expression networks. Both simulation studies and real data analysis show that the network based on soft thresholding is more effective in selecting relevant variables and provides better clustering results than the hard thresholding transformation and two other canonical filter methods for variable selection. Furthermore, a new module analysis approach is proposed to reveal the higher-order organization of the gene space, where the genes of a module share significant topological similarity and are associated with a consensus partition of the sample space. We demonstrate that the identified modules can lead to biologically meaningful sample partitions that might be missed by other methods.
CONCLUSIONS: By leveraging the structure of the gene co-expression network, we first propose a variable selection method that selects the individual genes with the highest connectivity. Both simulation studies and real data applications demonstrate that our method performs better in terms of the reliability of the selected genes and the sample clustering results. In addition, we propose a module recovery method that can help discover novel sample partitions that might remain hidden when clustering analyses are performed using all available genes. The source code of our program is available at http://nba.uth.tmc.edu/homepage/liu/netVar/.
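
The connectivity-based filter described in the RESULTS paragraph can be illustrated with a minimal sketch. It is not the authors' netVar code. It assumes absolute Pearson correlation as the co-expression measure, a power-function soft threshold (a_ij = |cor_ij|^beta, as commonly used for weighted co-expression networks), and a cutoff-based hard threshold (a_ij = 1 if |cor_ij| >= tau). The function names, the parameters beta and tau, their default values, and the toy data are all hypothetical choices made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def connectivity(expr, mode="soft", beta=6, tau=0.8):
    """Per-gene connectivity: sum of adjacencies to all other genes.

    expr: list of gene profiles (one list of sample values per gene).
    soft: a_ij = |cor_ij| ** beta; hard: a_ij = 1 if |cor_ij| >= tau else 0.
    beta and tau are illustrative defaults, not values from the paper.
    """
    g = len(expr)
    k = [0.0] * g
    for i in range(g):
        for j in range(i + 1, g):
            r = abs(pearson(expr[i], expr[j]))
            a = r ** beta if mode == "soft" else float(r >= tau)
            k[i] += a  # adjacency is symmetric, so credit both genes
            k[j] += a
    return k


def select_top_genes(expr, n_top, **kwargs):
    """Indices of the n_top genes with the highest connectivity."""
    k = connectivity(expr, **kwargs)
    return sorted(range(len(expr)), key=lambda i: k[i], reverse=True)[:n_top]


# Toy data: 5 genes tracking a two-group sample signal, 15 pure-noise genes.
random.seed(0)
signal = [0.0] * 10 + [3.0] * 10
informative = [[s + random.gauss(0, 0.5) for s in signal] for _ in range(5)]
noise = [[random.gauss(0, 1) for _ in signal] for _ in signal[:15]]
expr = informative + noise

selected = select_top_genes(expr, n_top=5, mode="soft", beta=6)
print(sorted(selected))
```

In this toy setting the co-expressed informative genes reinforce one another's connectivity, while isolated noise genes accumulate very little, which is the intuition behind ranking genes by connectivity before clustering the samples. Passing mode="hard" switches to the binary network for comparison.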

Year: 2014    PMID: 24885641    PMCID: PMC4035826    DOI: 10.1186/1471-2105-15-153

Source DB: PubMed    Journal: BMC Bioinformatics    ISSN: 1471-2105    Impact factor: 3.169


  9 in total

1.  GCEN: An Easy-to-Use Toolkit for Gene Co-Expression Network Analysis and lncRNAs Annotation.

Authors:  Wen Chen; Jing Li; Shulan Huang; Xiaodeng Li; Xuan Zhang; Xiang Hu; Shuanglin Xiang; Changning Liu
Journal:  Curr Issues Mol Biol       Date:  2022-03-25       Impact factor: 2.976

2.  Synonymous codon usage bias in plant mitochondrial genes is associated with intron number and mirrors species evolution.

Authors:  Wenjing Xu; Tian Xing; Mingming Zhao; Xunhao Yin; Guangmin Xia; Mengcheng Wang
Journal:  PLoS One       Date:  2015-06-25       Impact factor: 3.240

3.  A Bayesian Framework to Improve MicroRNA Target Prediction by Incorporating External Information.

Authors:  Zixing Wang; Wenlong Xu; Haifeng Zhu; Yin Liu
Journal:  Cancer Inform       Date:  2014-11-18

4.  Synonymous Codon Usage Bias in the Plastid Genome is Unrelated to Gene Structure and Shows Evolutionary Heterogeneity.

Authors:  Yueying Qi; Wenjing Xu; Tian Xing; Mingming Zhao; Nana Li; Li Yan; Guangmin Xia; Mengcheng Wang
Journal:  Evol Bioinform Online       Date:  2015-04-07       Impact factor: 1.625

5.  FastGCN: a GPU accelerated tool for fast gene co-expression networks.

Authors:  Meimei Liang; Futao Zhang; Gulei Jin; Jun Zhu
Journal:  PLoS One       Date:  2015-01-20       Impact factor: 3.240

6.  Classifying mild traumatic brain injuries with functional network analysis.

Authors:  F Anthony San Lucas; John Redell; Dash Pramod; Yin Liu
Journal:  BMC Syst Biol       Date:  2018-12-21

7.  Comparison of Methods for Feature Selection in Clustering of High-Dimensional RNA-Sequencing Data to Identify Cancer Subtypes.

Authors:  David Källberg; Linda Vidman; Patrik Rydén
Journal:  Front Genet       Date:  2021-02-24       Impact factor: 4.599

8.  Asymmetric Somatic Hybridization Affects Synonymous Codon Usage Bias in Wheat.

Authors:  Wenjing Xu; Yingchun Li; Yajing Li; Chun Liu; Yanxia Wang; Guangmin Xia; Mengcheng Wang
Journal:  Front Genet       Date:  2021-06-11       Impact factor: 4.599

9.  Polyploidization is accompanied by synonymous codon usage bias in the chloroplast genomes of both cotton and wheat.

Authors:  Geng Tian; Guoqing Li; Yanling Liu; Qinghua Liu; Yanxia Wang; Guangmin Xia; Mengcheng Wang
Journal:  PLoS One       Date:  2020-11-19       Impact factor: 3.240
