
Predicting gene phenotype by multi-label multi-class model based on essential functional features.

Lei Chen, Zhandong Li, Tao Zeng, Yu-Hang Zhang, Hao Li, Tao Huang, Yu-Dong Cai.

Abstract

Phenotype, one of the most significant concepts in genetics, describes all the observable characteristics of a research object. Because a phenotype reflects the integrated effects of genotype and environmental factors, phenotype characteristics are hard to define, and unknown phenotypes are even harder to predict. With current biological techniques, obtaining sufficient structural information on large-scale phenotype-associated genes/proteins remains expensive and time-consuming. Various bioinformatics methods have been proposed to address this problem, and researchers have confirmed the efficacy and prediction accuracy of functional network-based prediction. However, general functional descriptions have highly complicated inner structures, which complicates their use in phenotype prediction. To address this issue and improve the efficacy of phenotype prediction on more than ten kinds of phenotypes, we first extract functional enrichment features from GO and KEGG, and then use node2vec to learn functional embedding features of genes from a gene-gene network. All these features are analyzed by feature selection methods (Boruta and minimum redundancy maximum relevance) to generate a feature list. This list is fed into incremental feature selection, which incorporates multi-label classifiers built with RAkEL and several classic base classifiers, to build an optimal multi-label multi-class classification model for phenotype prediction. According to recent research, our method has identified many literature-supported genes/proteins and their associated phenotypes, as well as some candidate genes with newly assigned phenotypes, and thus provides a new computational tool for accurate and effective phenotype prediction.
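
The last two steps of the pipeline (incremental feature selection wrapped around a RAkEL multi-label classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the base classifiers, so a nearest-centroid learner stands in for them here, and the leave-one-out exact-match score, the function names, and the defaults `k`, `n_models`, and `step` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def train_label_powerset(X, Y, labelset):
    """Fit a nearest-centroid classifier (an assumed base learner) whose
    classes are the label combinations, restricted to `labelset`, seen in Y."""
    groups = defaultdict(list)
    for x, y in zip(X, Y):
        groups[frozenset(y) & labelset].append(x)
    return {combo: [sum(col) / len(rows) for col in zip(*rows)]
            for combo, rows in groups.items()}


def predict_label_powerset(centroids, x):
    """Return the label combination whose centroid is closest to x."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)))


def train_rakel(X, Y, n_labels, k=2, n_models=5, seed=0):
    """RAkEL: train one label-powerset model per random k-labelset."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        labelset = frozenset(rng.sample(range(n_labels), k))
        models.append((labelset, train_label_powerset(X, Y, labelset)))
    return models


def predict_rakel(models, x, n_labels, threshold=0.5):
    """Average the per-label votes of all models that cover the label.
    Labels covered by no labelset are never predicted."""
    votes = [0] * n_labels
    seen = [0] * n_labels
    for labelset, centroids in models:
        predicted = predict_label_powerset(centroids, x)
        for label in labelset:
            seen[label] += 1
            votes[label] += label in predicted
    return {l for l in range(n_labels)
            if seen[l] and votes[l] / seen[l] > threshold}


def exact_match(X, Y, n_labels):
    """Leave-one-out fraction of samples whose label set is predicted exactly."""
    hits = 0
    for i in range(len(X)):
        models = train_rakel(X[:i] + X[i + 1:], Y[:i] + Y[i + 1:], n_labels)
        hits += predict_rakel(models, X[i], n_labels) == set(Y[i])
    return hits / len(X)


def incremental_feature_selection(X, Y, ranked_features, n_labels, step=1):
    """Evaluate the top-k ranked features for growing k; keep the best prefix.
    Ties go to the smaller feature set."""
    best_k, best_score = 0, -1.0
    for k in range(step, len(ranked_features) + 1, step):
        cols = ranked_features[:k]
        X_k = [[row[j] for j in cols] for row in X]
        score = exact_match(X_k, Y, n_labels)
        if score > best_score:
            best_k, best_score = k, score
    return ranked_features[:best_k], best_score


# Toy data: feature 0 separates the two labels, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -5.0], [1.0, -5.0], [1.1, 5.0]]
Y = [{0}, {0}, {1}, {1}]
selected, score = incremental_feature_selection(X, Y, [0, 1], n_labels=2)
print(selected, score)
```

In this sketch `ranked_features` plays the role of the feature list produced by Boruta and minimum redundancy maximum relevance. Each gene would be a row of `X`, and each entry of `Y` the set of phenotype indices annotated to that gene.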

Keywords:  Feature selection; Functional enrichment; Multi-label classification; Network embedding; Phenotype; RAkEL

Year: 2021   PMID: 33914130   DOI: 10.1007/s00438-021-01789-8

Source DB: PubMed   Journal: Mol Genet Genomics   ISSN: 1617-4623   Impact factor: 3.291


