
GIpred: a computational tool for prediction of GIGANTEA proteins using machine learning algorithm.

Sagarika Dash, Tanmaya Kumar Sahu, Subhrajit Satpathy, Prabina Kumar Meher, Sukanta Kumar Pradhan.

Abstract

In plants, the GIGANTEA (GI) protein performs diverse biological functions, including carbon and sucrose metabolism, cell wall deposition, transpiration and hypocotyl elongation, which suggests that GI is an important class of proteins. So far, GI proteins have mostly been identified with resource-intensive experimental methods. In this study, we therefore attempted to develop a computational model for fast and accurate prediction of GI proteins. Ten supervised learning algorithms, i.e., SVM, RF, JRIP, J48, LMT, IBK, NB, PART, BAGG and LGB, were employed for prediction, with amino acid composition (AAC), FASGAI features and physico-chemical (PHYC) properties used as numerical inputs for the learning algorithms. Higher accuracies, i.e., 96.75% AUC-ROC and 86.7% AUC-PR, were observed for SVM coupled with the AAC + PHYC feature combination when evaluated with five-fold cross validation. With leave-one-out cross validation, 97.29% AUC-ROC and 87.89% AUC-PR were achieved. When the model was evaluated on an independent dataset of 18 GI sequences, 17 were correctly predicted. We also performed proteome-wide identification of GI proteins in wheat, followed by functional annotation using Gene Ontology terms. The prediction server "GIpred" is freely accessible at http://cabgrid.res.in:8080/gipred/ for proteome-wide recognition of GI proteins. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s12298-022-01130-6. © Prof. H.S. Srivastava Foundation for Science and Society 2022.
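
As an illustration of the amino acid composition (AAC) feature named in the abstract, the following minimal Python sketch computes the standard 20-dimensional AAC vector, i.e., the fraction of each standard amino acid in a sequence. The abstract does not describe how the authors extracted their features, so the function name `amino_acid_composition`, the residue ordering and the toy sequence are assumptions made for this example only, not the GIpred implementation.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order
# (this ordering is an assumption for the example).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Non-standard symbols (e.g. X, B, Z, gaps) are ignored, so the
    returned 20 values always sum to 1.
    """
    seq = sequence.upper()
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not a real GI protein.
example = "MAGGIC"
features = amino_acid_composition(example)
```

A vector like `features` could then be concatenated with other descriptors (such as physico-chemical properties) before being passed to a classifier; how GIpred combines them is not specified in the abstract.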


Keywords:  Circadian gene; Computational biology; Machine learning; Proteome; Support vector machine

Year:  2022        PMID: 35221569      PMCID: PMC8847649          DOI: 10.1007/s12298-022-01130-6

Source DB:  PubMed          Journal:  Physiol Mol Biol Plants        ISSN: 0974-0430


