
Mining SARS-CoV protease cleavage data using non-orthogonal decision trees: a novel method for decisive template selection.

Zheng Rong Yang

Abstract

MOTIVATION: Although the outbreak of severe acute respiratory syndrome (SARS) is currently over, it is expected to return. A critical challenge for scientists across disciplines worldwide is to study the cleavage specificity of SARS-related coronavirus (SARS-CoV) and to use that knowledge for effective inhibitor design to fight the disease. The most commonly used inductive programming methods for knowledge discovery from data assume that the elements of input patterns are orthogonal to each other. If a sub-sequence is denoted as P2-P1-P1'-P2', a conventional inductive programming method may produce a rule such as 'if P1 = Q, then the sub-sequence is cleaved, otherwise non-cleaved'. If the site P1 is not orthogonal to the others (for instance, P2, P1' and P2'), the predictive power of these kinds of rules may be limited. This study therefore aims to develop a novel method for constructing non-orthogonal decision trees for mining protease data.

RESULT: Eighteen coronavirus polyprotein sequences were downloaded from NCBI (http://www.ncbi.nlm.nih.gov). Among these sequences, 252 cleavage sites were experimentally determined. The sequences were scanned using a sliding window of size k to generate about 50,000 k-mer sub-sequences (k-mers for short). The value of k varies from 4 to 12 in steps of two. The bio-basis function proposed by Thomson et al. is used to transform the k-mers into a high-dimensional numerical space, to which an inductive programming method is applied to derive a decision tree. This transformation is referred to as bio-mapping. The constructed decision trees select about 10 of the 50,000 k-mers, and this small set is regarded as the set of decisive templates. Non-orthogonal decision trees are thus constructed using the selected templates, and the prediction accuracy is significantly improved.
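The sliding-window scan and the bio-mapping step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation. The abstract does not give the bio-basis formula, so the code assumes the form exp(gamma * (s(x, z) - s(z, z)) / s(z, z)), and it replaces the amino acid substitution matrix with a toy position-match score. The function names, the gamma default, the example peptide, and the two templates are all hypothetical.

```python
import math


def sliding_kmers(sequence, k):
    """Return every length-k window of `sequence`, left to right."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def toy_similarity(a, b):
    """Assumed stand-in for a substitution-matrix score: count matching positions."""
    return sum(1 for x, y in zip(a, b) if x == y)


def bio_basis(kmer, template, gamma=1.0, similarity=toy_similarity):
    """Assumed bio-basis form: exp(gamma * (s(x, z) - s(z, z)) / s(z, z))."""
    best = similarity(template, template)  # score of the template against itself
    return math.exp(gamma * (similarity(kmer, template) - best) / best)


def bio_map(kmers, templates, gamma=1.0):
    """Map each k-mer to one numerical feature per template."""
    return [[bio_basis(x, z, gamma) for z in templates] for x in kmers]


# Illustrative peptide only; the study scanned full polyprotein sequences.
peptide = "TSAVLQSGFRK"

# k runs over 4, 6, 8, 10, 12, as in the abstract.
for k in range(4, 13, 2):
    print(k, len(sliding_kmers(peptide, k)))

# Hypothetical templates; in the study the decision tree selects them.
templates = ["AVLQ", "SGFR"]
features = bio_map(sliding_kmers(peptide, 4), templates)
```

Each row of `features` is a numerical vector that a decision-tree learner could split on. Because every feature measures similarity to a whole template, a split on one feature depends on all positions of the k-mer at once, which is the sense in which the resulting tree is non-orthogonal.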


Year:  2005        PMID: 15797903      PMCID: PMC7197706          DOI: 10.1093/bioinformatics/bti404

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  45 in total

1.  Infectious RNA transcribed in vitro from a cDNA copy of the human coronavirus genome cloned in vaccinia virus.

Authors:  Volker Thiel; Jens Herold; Barbara Schelle; Stuart G Siddell
Journal:  J Gen Virol       Date:  2001-06       Impact factor: 3.891

2.  Characterizing proteolytic cleavage site activity using bio-basis function neural networks.

Authors:  Rebecca Thomson; T Charles Hodgman; Zheng Rong Yang; Austin K Doyle
Journal:  Bioinformatics       Date:  2003-09-22       Impact factor: 6.937

3.  The compositional adjustment of amino acid substitution matrices.

Authors:  Yi-Kuo Yu; John C Wootton; Stephen F Altschul
Journal:  Proc Natl Acad Sci U S A       Date:  2003-12-08       Impact factor: 11.205

4.  Comparison of the predicted and observed secondary structure of T4 phage lysozyme.

Authors:  B W Matthews
Journal:  Biochim Biophys Acta       Date:  1975-10-20

5.  Discovery of significant rules for classifying cancer diagnosis data.

Authors:  Jinyan Li; Huiqing Liu; See-Kiong Ng; Limsoon Wong
Journal:  Bioinformatics       Date:  2003-10       Impact factor: 6.937

6.  Reduced bio-basis function neural networks for protease cleavage site prediction.

Authors:  Zheng Rong Yang; Emily A Berry
Journal:  J Bioinform Comput Biol       Date:  2004-09       Impact factor: 1.122

7.  Viral replicase gene products suffice for coronavirus discontinuous transcription.

Authors:  V Thiel; J Herold; B Schelle; S G Siddell
Journal:  J Virol       Date:  2001-07       Impact factor: 5.103

8.  Characterization of a second cleavage site and demonstration of activity in trans by the papain-like proteinase of the murine coronavirus mouse hepatitis virus strain A59.

Authors:  P J Bonilla; S A Hughes; S R Weiss
Journal:  J Virol       Date:  1997-02       Impact factor: 5.103

9.  The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor.

Authors:  Haitao Yang; Maojun Yang; Yi Ding; Yiwei Liu; Zhiyong Lou; Zhe Zhou; Lei Sun; Lijuan Mo; Sheng Ye; Hai Pang; George F Gao; Kanchan Anand; Mark Bartlam; Rolf Hilgenfeld; Zihe Rao
Journal:  Proc Natl Acad Sci U S A       Date:  2003-10-29       Impact factor: 11.205

10.  Putative papain-related thiol proteases of positive-strand RNA viruses. Identification of rubi- and aphthovirus proteases and delineation of a novel conserved domain associated with proteases of rubi-, alpha- and coronaviruses.

Authors:  A E Gorbalenya; E V Koonin; M M Lai
Journal:  FEBS Lett       Date:  1991-08-19       Impact factor: 4.124

  5 in total

Review 1.  Peptide bioinformatics: peptide classification using peptide machines.

Authors:  Zheng Rong Yang
Journal:  Methods Mol Biol       Date:  2008

2.  Prediction and biochemical analysis of putative cleavage sites of the 3C-like protease of Middle East respiratory syndrome coronavirus.

Authors:  Andong Wu; Yi Wang; Cong Zeng; Xingyu Huang; Shan Xu; Ceyang Su; Min Wang; Yu Chen; Deyin Guo
Journal:  Virus Res       Date:  2015-05-31       Impact factor: 3.303

3.  SARS-CoV-2 3CLpro whole human proteome cleavage prediction and enrichment/depletion analysis.

Authors:  Lucas Prescott
Journal:  Comput Biol Chem       Date:  2022-03-28       Impact factor: 3.737

4.  Big data analytics for preventive medicine.

Authors:  Muhammad Imran Razzak; Muhammad Imran; Guandong Xu
Journal:  Neural Comput Appl       Date:  2019-03-16       Impact factor: 5.102

5.  In silico prediction of SARS protease inhibitors by virtual high throughput screening.

Authors:  Dariusz Plewczynski; Marcin Hoffmann; Marcin von Grotthuss; Krzysztof Ginalski; Leszek Rychewski
Journal:  Chem Biol Drug Des       Date:  2007-04       Impact factor: 2.817
