Literature DB >> 24857907

iHyd-PseAAC: predicting hydroxyproline and hydroxylysine in proteins by incorporating dipeptide position-specific propensity into pseudo amino acid composition.

Yan Xu¹, Xin Wen², Xiao-Jian Shao³, Nai-Yang Deng⁴, Kuo-Chen Chou⁵.

Abstract

Post-translational modifications (PTMs) play crucial roles in various cell functions and biological processes. Protein hydroxylation is one type of PTM that usually occurs at the sites of proline and lysine. Given an uncharacterized protein sequence, which site of its Pro (or Lys) can be hydroxylated and which site cannot? This is a challenging problem, not only for in-depth understanding of the hydroxylation mechanism, but also for drug development, because protein hydroxylation is closely relevant to major diseases, such as stomach and lung cancers. With the avalanche of protein sequences generated in the post-genomic age, it is highly desired to develop computational methods to address this problem. In view of this, a new predictor called "iHyd-PseAAC" (identify hydroxylation by pseudo amino acid composition) was proposed by incorporating the dipeptide position-specific propensity into the general form of pseudo amino acid composition. It was demonstrated by rigorous cross-validation tests on stringent benchmark datasets that the new predictor is quite promising and may become a useful high throughput tool in this area. A user-friendly web-server for iHyd-PseAAC is accessible at http://app.aporc.org/iHyd-PseAAC/. Furthermore, for the convenience of the majority of experimental scientists, a step-by-step guide on how to use the web-server is given. Users can easily obtain their desired results by following these steps without the need of understanding the complicated mathematical equations presented in this paper just for its integrity.

Entities: CellLine Chemical Disease Gene Species

Mesh：

Substances：

Year: 2014 PMID： 24857907 PMCID： PMC4057693 DOI： 10.3390/ijms15057594

Source DB: PubMed Journal: Int J Mol Sci ISSN： 1422-0067 Impact factor: 5.923

Introduction

Most proteins perform their functions after post-translational modifications (PTMs). Protein hydroxylation is one type of PTM that involves the conversion of a CH group into a COH group (Figure 1) and is closely relevant to the regulation of the transcription factor (hypoxia-inducible factor) [1]. Both the proline and lysine residues in proteins can be hydroxylated, forming hydroxyproline (Figure 1a) or HyP and hydroxylysine (Figure 1b) or HyL, respectively. However, the former is more common than the latter [2,3]. Furthermore, HyP is the key factor in stabilizing collagens [4,5], whose instability or abnormal activity may cause stomach cancer [6] and lung cancer [7,8]. Therefore, identifying the HyP and HyL sites in proteins may provide useful information for both biomedical research and drug development.

Figure 1.

Schematic drawing to show protein hydroxylation occurring at (a) proline and (b) lysine to form hydroxyproline (HyP) and hydroxylysine (HyL), respectively.

Identification of hydroxylation residues with experiments was mainly done by means of mass spectrometry [1,9], which was expensive and laborious. Facing the avalanche of protein sequences generated in the post genomic age, it is highly demanded to develop a computational method for timely and effectively identifying the hydroxylation residues in proteins. However, to our best knowledge, so far, only two papers have been published in this regard [10,11]. Additionally, further development in this important area is definitely needed for the following reasons. First, with a rapidly growing database in protein hydroxylation, the benchmark datasets used in the two methods definitely need to be updated; Second, some sequence order effects were missed, which would certainly affect their prediction quality; Third, none of them provided a publicly accessible web-server, and hence, their practical usage value is substantially limited. The present study was devoted to develop a new predictor for identifying hydroxyproline and hydroxylysine in proteins by considering the above three aspects. The principle was based on a window sliding strategy, quite similar to the popular approach developed by Garnier and Robson [12] for predicting the secondary structure of globular proteins. As demonstrated by a series of recent publications [13-20] and summarized in a comprehensive review [21], to develop a really useful predictor for a protein or peptide system, we need to go through the following five steps: (1) select or construct a valid benchmark dataset to train and test the predictor; (2) represent the protein or peptide samples with an effective formulation that can truly reflect their intrinsic correlation with the target to be predicted; (3) introduce or develop a powerful algorithm or operation engine to conduct the prediction; (4) properly perform cross-validation tests to objectively evaluate the anticipated prediction accuracy; (5) establish a user-friendly web-server for the predictor that is accessible to the public. Below, let us elaborate on how to deal with these five steps.

Results and Discussion

Benchmark Dataset

In this study, the benchmark dataset was derived from dbPTM 3.0 [22] at http://dbptm.mbc.nctu.edu.tw/, a protein post-translational modifications database. For facilitating the description later, let us adopt Chou’s peptide formulation, which was used for investigating the HIV protease cleavage sites [23,24], the specificity of GalNAc-transferase [25], as well as signal peptide cleavage sites [26-29]. According to Chou’s scheme, a peptide with Pro (namely P in its single-letter code) or Lys (namely K) located at its center (Figure 2) can be expressed as:

Figure 2.

An illustration to show Chou’s scheme for peptides with (2ξ + 1) residues and their centers being (a) proline and (b) lysine. Adapted from Chou [27,29] with permission.

where the subscript, ξ, is an integer (cf. Figure 2), R−ξ represents the ξ-th downstream amino acid residue from the center, Rξ the ξ-th upstream amino acid residue, and so forth. Peptides P(ℙ) and P( ) with the profile of Equation (1) can be further classified into the following categories: and: where ∈ represents “a member of” in the set theory. As pointed out by a comprehensive review [30], there is no need to separate a benchmark dataset into a training dataset and a testing dataset for examining the performance of a prediction method if it is tested by the jackknife test or subsampling cross-validation test. Thus, the benchmark dataset for the current study can be formulated as: where is the benchmark dataset for studying hydroxyproline residues, the benchmark dataset for studying hydroxylysine residues, ⋃ the symbol for “union” in the set theory, contains the samples for the Pro-hydroxylated peptide only, contains the non-Pro-hydroxylated peptide only (cf. Equation (2)), contains the samples for the Lys-hydroxylated peptide only and contains the non-Lys-hydroxylated peptide only (cf. Equation (3)). After some preliminary trials, we found that ξ = 6 was a good choice for the current study. Accordingly, each of the samples extracted from proteins in this study is actually a 2ξ +1 = 13 tuple peptide. If the upstream or downstream in a peptide sample was 3≤ ξ < 6, the lacking residues were filled with the dummy code, @. Furthermore, to reduce the redundancy and to avoid homology bias, those peptides were excluded from the benchmark datasets that had ≥ 80% pairwise sequence identity to any other in a same subset. Finally, we obtained that the benchmark dataset, , contained 636 + 2699 = 3338 peptide samples, of which 636 were Pro-hydroxylated peptides belonging the positive subset , and 2669 were non-Pro-hydroxylated peptides belonging to the negative subset, ; and that the benchmark dataset, , contained 107 + 836 = 943 peptide samples, of which 107 were Lys-hydroxylated peptides belonging to the positive subset, , and 836 were non-Lys-hydroxylated peptides belonging to the negative subset, . For the reader’s convenience, the peptide sequences, as well as their hydroxylation or non-hydroxylation sites in proteins are given in the Supplementary Information, S1 and S2, for and , respectively.

Feature Vector Construction

To develop a statistical method for predicting the attribute of peptides in proteins, one of the fundamental procedures was to formulate the peptide samples with an effective mathematical expression that could really reflect the intrinsic correlation with the desired target. To realize this, various feature vectors (see, e.g., [17,31-36]) were proposed to express peptides by extracting their different features into the pseudo amino acid composition [37,38] or Chou’s pseudo amino acid composition [39-41] or Chou’s PseAAC (pseudo amino acid composition) [42,43]. According to [21], the general form of PseAAC for a protein or peptide, P, can be formulated by: where T is the transpose operator, while Ω is an integer to reflect the vector’s dimension. The value of Ω, as well as the components ψu (u = 1, 2, ···, Ω) in Equation (5) will depend on how to extract the desired information from the protein or peptide sequence. Below, let us describe how to extract the useful information from the benchmark datasets, and , to define the peptide samples via Equation (5). Since each of the samples concerned is a 13-tuple peptide, Equation (1) can be simplified to a more convenient form given by: where R7 = P or K, and R (i = 1, 2, ··· , 13; i ≠ 7) can be any of the 20 native amino acids or the dummy code @, as defined above. Hereafter, let us use the numerical codes 1, 2, 3, ···, 20 to represent the 20 native amino acids according to the alphabetic order of their single letter codes and use 21 to represent the dummy amino acid, @. Accordingly, the number of possible different dipeptides will be 21 × 21 = 441, and the number of dipeptide subsite positions on the sequence of Equation (6) will be (13 − 2 + 1) = 12. Now, let us introduce the following 441 × 12 matrix, ℤ, the so-called PSDP (position-specific dipeptide propensity) matrix [36], to define the component of Equation (5): where the element: and: In Equation (5)F+ (D | j) is the occurrence frequency of the i-th dipeptide (i = 1,2, ···, 441) at the j-th subsite on the sequence of Equation (6) that can be easily derived from the positive dataset in the Supplementary Information S1 or S2; while F− (D | j) is the corresponding occurrence frequency, but derived from the negative dataset. Thus, the peptide, P, of Equation (6) can be uniquely defined via the general form of PseAAC (cf. Equation (5)) with its dimension Ω = 12 and its u-th component given by:

Prediction Algorithm

Suppose ℙ+ are the standard vectors or norms for the peptide sequences in the positive benchmark dataset, or , and ℙ− are those in the negative benchmark dataset, or . Additionally, they are defined by: where: where N+ is the total number of Pro-hydroxylated peptides or Lys- hydroxylated peptides in the benchmark dataset, or , and the u-th component for the k-th Pro-hydroxylated peptide or Lys-hydroxylated peptide in the PseAAC space (see Equations (5) and (10)); whereas N− and have the same meanings, but are for the K-th non-Pro-hydroxylated peptide or non-Lys-hydroxylated peptide. For a query peptide, P, as formulated by Equation (5), suppose (P, ℙ+) is its similarity to the norm of hydroxylated peptides and (P, ℙ−) its similarity to the norm of non-hydroxylated peptides, as formulated by: Thus, according to the discriminant function algorithm [24,44], the attribute of the query peptide, P, can be formulated as: If there was a tie between (P, ℙ+) and (P, ℙ−), the query peptide would be randomly assigned between the hydroxylated peptide and non-hydroxylated peptide categories. However, this kind of tie case rarely happened and actually never happened in our study. The predictor established via the above procedures is called iHyd-PseAAC, where “i” stands for the first character of “identify”, “Hyd” for “hydroxylation” and “PseAAC” for the general form of the pseudo amino acid composition [21] used to formulate the peptide sequences. A flowchart of the predictor is given in Figure 3 to illustrate how iHyd-PseAAC was working during the process of prediction.

Figure 3.

Flowchart to show the process of how the iHyd-PseAAC (identify hydroxylation pseudo amino acid composition) predictor works in identifying the hydroxylated sites in proteins. PSDP, position-specific dipeptide propensity.

Experimental Section

A Set of Metrics for Measuring Prediction Quality

To provide a more intuitive and easier-to-understand method to measure the prediction quality, the following set of four metrics based on the formulation used by Chou [26-28] in predicting signal peptides was adopted. According to Chou’s formulation, the sensitivity, specificity, overall accuracy, and Matthews correlation coefficient can be respectively expressed as [18,33,36,45]: where N+ is the total number of the hydroxylated Pro-peptides (or Lys-peptides) investigated, while is the number of hydroxylated Pro-peptides (or Lys-peptides) incorrectly predicted as the non-hydroxylated Pro-peptides (or Lys-peptides); N− is the total number of the non-hydroxylated Pro-peptides (or Lys-peptides) investigated, while is the number of the non-hydroxylated Pro-peptides (or Lys-peptides) incorrectly predicted as the hydroxylated Pro-peptides (or Lys-peptides). According to Equation (15), we can easily see the following. When , meaning none of the hydroxylated Pro-peptides (or Lys-peptides) was mispredicted to be a non-hydroxylated Pro-peptide (or Lys-peptides), we have the sensitivity Sn = 1; while , meaning that all the hydroxylated Pro-peptides (or Lys-peptides) were mispredicted to be the non-hydroxylated Pro-peptides (or Lys-peptides), we have the sensitivity Sn = 0. Likewise, when , meaning none of the non-hydroxylated Pro-peptides (or Lys-peptides) was mispredicted, we have the specificity Sn = 1; while , meaning all the non-hydroxylated Pro-peptides (or Lys-peptides) were incorrectly predicted as hydroxylated Pro-peptides (or Lys-peptides), we have the specificity Sn = 0. When , meaning that none of the hydroxylated Pro-peptides (or Lys-peptides) in the dataset (or ) and none of the hydroxylated Pro-peptides (or Lys-peptides) in (or ) was incorrectly predicted, we have the overall accuracy Acc = 1; while and , meaning that all the hydroxylated Pro-peptides (or Lys-peptides) in the dataset (or ) and all the non-hydroxylated Pro-peptides (or Lys-peptides) in (or ) were mispredicted, we have the overall accuracy Acc = 0. The Matthews correlation coefficient (MCC) is usually used for measuring the quality of binary (two-class) classifications. When , meaning that none of the hydroxylated Pro-peptides (or Lys-peptides) in the dataset (or ) and none of the non-hydroxylated Pro-peptides (or Lys-peptides) in (or ) was mispredicted, we have MCC = 1; when and , we have MCC = 0, meaning no better than random prediction; when and , we have MCC = 0, meaning total disagreement between prediction and observation. As we can see from the above discussion, it is much more intuitive and easier-to-understand when using Equation (15) to examine a predictor for its four metrics, particularly for its Mathew’s correlation coefficient. It is instructive to point out that the metrics as defined in Equation (15) are valid for single-label systems; for multi-label systems [34,46-48], a set of more complicated metrics should be used, as given in [49].

Jackknife Cross-Validation

How to properly test a predictor for its anticipated success rates is very important in objectively evaluating its quality and potential application value. Generally speaking, the following three cross-validation methods are often used to examine the quality of a predictor and its effectiveness in practical application: the independent dataset test, the subsampling or the K-fold (such as 5-, 7- or 10-fold) crossover test and the jackknife test [50]. However, as elaborated by a penetrating analysis in [51], considerable arbitrariness exists in the independent dataset test. Furthermore, as demonstrated in [52], the subsampling (or K-fold crossover validation) test cannot avoid arbitrariness either. The jackknife test is the least arbitrary, which can always yield a unique result for a given benchmark dataset. Therefore, the jackknife test has been widely recognized and increasingly utilized by investigators to examine the quality of various predictors (see, e.g., [32,53-62]). Accordingly, in this study, the jackknife test was also adopted to evaluate the accuracy of the current predictor. Listed in Table 1 are the jackknife test results obtained by iHyd-PseAAC on the benchmark datasets of Supplementary Information S1 and the benchmark dataset of Supplementary Information S2, respectively.

Table 1.

The jackknife test results by the new predictor on the benchmark datasets in the Supplementary Information S1 and S2. HyP, hydroxyproline; HyL, hydroxylysine; Sn, sensitivity; Sp, specificity; Acc, accuracy; MCC, Matthews correlation coefficient.

Benchmark dataset a	Sn (%)	Sp (%)	Acc (%)	MCC
Supplementary Information S1 for HyP	80.66	80.54	80.57	0.51
Supplementary Information S2 for HyL	87.85	83.01	83.56	0.50

None of the sequences included has more than 80% pairwise sequence identity with any other.

To further demonstrate our predictor, the jackknife test was conducted on two more stringent benchmark datasets given in Supplementary Information S3 and S4, where none of the included sequences has more than 40% pairwise sequence identity with any other. The results thus obtained are listed in Table 2.

Table 2.

The jackknife test results by the iHyd-PseAAC predictor on the benchmark datasets in Supplementary Information S3 and S4.

Benchmark dataset a	Sn (%)	Sp (%)	Acc (%)	MCC
Supplementary Information S1 for HyP	70.68	89.03	78.42	0.52
Supplementary Information S2 for HyL	79.04	86.37	83.12	0.51

None of sequences included has more than 40% pairwise sequence identity with any other.

It is interesting to see by comparing the two tables that the rates of Acc and MCC are about the same in both cases. Although the rates of Sn in Table 2 are somewhat lower than those in Table 1, the rates of Sp in Table 2 are higher than those in Table 1. Accordingly, the success rates as measured by the four metrics in Equation (15) are basically about the same without dropping down significantly from using an 80% cutoff benchmark dataset to a 40% cutoff one, clearly indicating that iHyd-PseAAC is a useful predictor validated by rigorous cross-validation.

Test by Public Database

Moreover, from the Swiss-Prot database, we retrieved all those proteins whose hydroxylated sites were experimentally validated. After excluding those with a length less than 50 amino acids, we obtained 156 hydroxyproline proteins and 31 hydroxylysine proteins, respectively. Their codes and hydroxylated sites are given in Supplementary Information S5 and S6, respectively. The predicted results by iHyd-PseAAC on these real proteins are given in Table 3, from which we can see that the overall success rates thus obtained are quite consistent with those derived by the cross-validation on the benchmark datasets, as shown in Tables 1 and 2, fully indicating that iHyd-PseAAC is not only a valid predictor, but also may become a very useful high throughput toll for practical applications in this area.

Table 3.

The overall success rates in identifying hydroxylated sites for the proteins retrieved from the Swiss-Prot database.

Hydroxylated type	Sn (%)	Sp (%)	Acc (%)
Proline	71.2	79.3	75.3
Lysine	72.7	80.6	76.8

Conclusions

As we can see from Table 1, the overall accuracies for the hydroxyproline and hydroxylysine cases are 80.57% and 83.56%, which are higher than 76.0% and 82.1%, the corresponding rates reported by Hu et al. [11]. At first glance, the value of MCC seems relatively low. Actually, as mentioned in Section 3.1, different from Acc, whose score is between 100% and 0%, the score for MCC is between one and −1, with zero meaning no better than random prediction. Accordingly, the MCC rate of 0.50–0.51 is generally deemed as a quite decent result. Particularly, the benchmark dataset in the current system is very imbalanced, which contains 636 hydroxylated peptides and 2669 non-hydroxylated peptides for proline, which also may lower the MCC rate. The same is true for the case of hydroxylation. Particularly, no web-server was provided for the method in [11], and hence, its application value is quite limited. Actually, so far, no web-server whatsoever has been provided in this area. As pointed out in [63] and emphasized in a series of recent publications (see, e.g., [16-18,20,42,45]), one of the keys in developing a practically more useful prediction method is to establish a user-friendly and publicly accessible web-server. In view of this, the web-server for iHyd-PseAAC has been established, which can be freely accessed at http://app.aporc.org/iHyd-PseAAC/. Furthermore, for the convenience of the vast majority of biologists and pharmaceutical scientists, below, let us provide a step-by-step guide to show how the users can easily get the desired result by using iHyd-PseAAC without the need to follow the complicated mathematical equations presented in this paper just for its integrity. Step 1. Open the web-server at the site at http://app.aporc.org/iHyd-PseAAC/, and you will see the top page of the predictor on your computer screen, as shown in Figure 4. Click on the “Read Me” button to see a brief introduction about the iHyd-PseAAC predictor and the caveat when using it.

Figure 4.

The top-page of the web-server, iHud-PseAAC, at http://app.aporc.org/iHyd-PseAAC/.

Step 2. Either type or copy/paste the query protein sequences into the input box at the center of Figure 4. The protein sequences should be in FASTA format. The input examples can be seen by clicking on the “Example” button right above the input box. Step 4. Click on the “Citation” button to find the relevant paper that documents the detailed development and algorithm of iHyd-PseAAC. Step 5. Click on the “Data” button to download the benchmark dataset used to train and test the iHyd-PseAAC predictor.

58 in total

1. Using the augmented Chou's pseudo amino acid composition for predicting protein submitochondria locations based on auto covariance approach.

Authors: Yu-hong Zeng; Yan-zhi Guo; Rong-quan Xiao; Li Yang; Le-zheng Yu; Meng-long Li
Journal: J Theor Biol Date: 2009-03-31 Impact factor: 2.691

Review 2. The importance of proline residues in the structure, stability and susceptibility to proteolytic degradation of collagens.

Authors: Stephen M Krane
Journal: Amino Acids Date: 2008-04-23 Impact factor: 3.520

3. A sequence-coupled vector-projection model for predicting the specificity of GalNAc-transferase.

Authors: K C Chou
Journal: Protein Sci Date: 1995-07 Impact factor: 6.725

4. A vectorized sequence-coupling model for predicting HIV protease cleavage sites in proteins.

Authors: K C Chou
Journal: J Biol Chem Date: 1993-08-15 Impact factor: 5.157

5. Conformational implications of enzymatic proline hydroxylation in collagen.

Authors: R K Chopra; V S Ananthanarayanan
Journal: Proc Natl Acad Sci U S A Date: 1982-12 Impact factor: 11.205

6. Anti-metastatic effect of Biophytum sensitivum is exerted through its cytokine and immunomodulatory activity and its regulatory effect on the activation and nuclear translocation of transcription factors in B16F-10 melanoma cells.

Authors: Chandrasekharan Guruvayoorappan; Girija Kuttan
Journal: J Exp Ther Oncol Date: 2008

7. Prediction of G-protein-coupled receptor classes based on the concept of Chou's pseudo amino acid composition: an approach from discrete wavelet transform.

Authors: Jian-Ding Qiu; Jian-Hua Huang; Ru-Ping Liang; Xiao-Quan Lu
Journal: Anal Biochem Date: 2009-04-11 Impact factor: 3.365

8. Some remarks on protein attribute prediction and pseudo amino acid composition.

Authors: Kuo-Chen Chou
Journal: J Theor Biol Date: 2010-12-17 Impact factor: 2.691

9. iSNO-AAPair: incorporating amino acid pairwise coupling into PseAAC for predicting cysteine S-nitrosylation sites in proteins.

Authors: Yan Xu; Xiao-Jian Shao; Ling-Yun Wu; Nai-Yang Deng; Kuo-Chen Chou
Journal: PeerJ Date: 2013-10-03 Impact factor: 2.984

10. iEzy-drug: a web server for identifying the interaction between enzymes and drugs in cellular networking.

Authors: Jian-Liang Min; Xuan Xiao; Kuo-Chen Chou
Journal: Biomed Res Int Date: 2013-11-26 Impact factor: 3.411

41 in total

1. iPhosY-PseAAC: identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC.

Authors: Yaser Daanial Khan; Nouman Rasool; Waqar Hussain; Sher Afzal Khan; Kuo-Chen Chou
Journal: Mol Biol Rep Date: 2018-10-11 Impact factor: 2.316

Review 2. Structural Variability in the RLR-MAVS Pathway and Sensitive Detection of Viral RNAs.

Authors: Qiu-Xing Jiang
Journal: Med Chem Date: 2019 Impact factor: 2.745

3. Deep Learning-Based Advances In Protein Posttranslational Modification Site and Protein Cleavage Prediction.

Authors: Subash C Pakhrin; Suresh Pokharel; Hiroto Saigo; Dukka B Kc
Journal: Methods Mol Biol Date: 2022

4. RF-Hydroxysite: a random forest based predictor for hydroxylation sites.

Authors: Hamid D Ismail; Robert H Newman; Dukka B Kc
Journal: Mol Biosyst Date: 2016-07-19

5. Label-Free Method Development for Hydroxyproline PTM Mapping in Human Plasma Proteome.

Authors: Shakilur Rahman; Gourab Bhattacharje; Debabrata Dutta; Swarnendu Bag; Bidhan Chandra Sing; Jyotirmoy Chatterjee; Amit Basak; Amit Kumar Das
Journal: Protein J Date: 2021-04-11 Impact factor: 2.371

6. DNN-m6A: A Cross-Species Method for Identifying RNA N6-Methyladenosine Sites Based on Deep Neural Network with Multi-Information Fusion.

Authors: Lu Zhang; Xinyi Qin; Min Liu; Ziwei Xu; Guangzhong Liu
Journal: Genes (Basel) Date: 2021-02-28 Impact factor: 4.096

7. iNitro-Tyr: prediction of nitrotyrosine sites in proteins with general pseudo amino acid composition.

Authors: Yan Xu; Xin Wen; Li-Shu Wen; Ling-Yun Wu; Nai-Yang Deng; Kuo-Chen Chou
Journal: PLoS One Date: 2014-08-14 Impact factor: 3.240

8. Sequence-based identification of recombination spots using pseudo nucleic acid representation and recursive feature extraction by linear kernel SVM.

Authors: Liqi Li; Sanjiu Yu; Weidong Xiao; Yongsheng Li; Lan Huang; Xiaoqi Zheng; Shiwen Zhou; Hua Yang
Journal: BMC Bioinformatics Date: 2014-11-20 Impact factor: 3.169

9. Identifying new targets in leukemogenesis using computational approaches.

Authors: Archana Jayaraman; Kaiser Jamil; Haseeb A Khan
Journal: Saudi J Biol Sci Date: 2015-01-20 Impact factor: 4.219

10. iCTX-type: a sequence-based predictor for identifying the types of conotoxins in targeting ion channels.

Authors: Hui Ding; En-Ze Deng; Lu-Feng Yuan; Li Liu; Hao Lin; Wei Chen; Kuo-Chen Chou
Journal: Biomed Res Int Date: 2014-06-01 Impact factor: 3.411