Self-interacting proteins (SIPs) play crucial roles in the biological activities of organisms. Many high-throughput methods can be used to identify SIPs; however, these methods are both time-consuming and expensive, and developing effective computational approaches for identifying SIPs remains a challenging task. In this article, we present a novel computational method called RNN-SIFT, which combines the recurrent neural network (RNN) with scale invariant feature transform (SIFT) to predict SIPs based on protein evolutionary information. The main advantage of the proposed RNN-SIFT model is that it uses SIFT to extract key features from the evolutionary information embedded in the position-specific scoring matrix (PSSM) constructed by Position-Specific Iterated BLAST (PSI-BLAST) and employs an RNN classifier to perform classification based on the extracted features. Extensive experiments show that RNN-SIFT obtained average accuracies of 94.34% and 97.12% on the yeast and human datasets, respectively. We also compared its performance with that of the back propagation neural network (BPNN), the state-of-the-art support vector machine (SVM), and other existing methods. In these comparisons, RNN-SIFT performed significantly better than the BPNN, SVM, and other previous methods in the domain. Therefore, we conclude that the proposed RNN-SIFT model is a useful tool for predicting SIPs, as well as for other bioinformatics tasks. To facilitate wider study and encourage future proteomics research, a freely available web server called RNN-SIFT-SIPs was developed at http://219.219.62.123:8888/RNNSIFT/ and includes the source code and the SIP datasets.
Protein-protein interactions (PPIs) play multiple roles in many
important biological activities. An interesting related research problem is
whether a protein can interact with copies of itself. Self-interacting proteins (SIPs)
are considered a special type of PPI in which multiple copies of
the same protein, encoded by the same gene, interact with each other.
This can lead to the formation of
homo-oligomers. Many recent studies have shown that SIPs play a vital role in
various cellular physiological functions and the evolution process of
protein-protein interaction networks.[1,2] Therefore, knowing whether a protein can
self-interact is very important for interpreting its functions. Research on
SIPs can provide a better understanding of the regulation of protein function and
the molecular mechanisms involved in biological activity and the underlying cellular
and genetic disease mechanisms. Many studies have examined
homo-oligomerization, which is vital for biological activity and plays an
essential role in a wide range of biological processes, such as signal transduction,
gene expression regulation, enzyme activation, and immune response.[3-7] In addition,
many previous studies have demonstrated that SIPs can extend the functional diversity
of proteins without increasing the length of the genome. Self-interacting proteins
can also help improve protein
stability and prevent protein denaturation by reducing surface
area.[8,9] Therefore, it is
becoming more important to develop reliable and effective computational approaches
based on protein sequences for predicting SIPs.
Much research has also been devoted to developing computational methods to predict
PPIs. Gao et al[10] proposed a novel computational method called RF-AC, which combined the Random
Forest (RF) classifier with an Autocovariance (AC) approach based on the position-specific
scoring matrix (PSSM). Huang et al[11] presented a new computational approach, which used weighted sparse
representation as the classifier and global encoding as the feature extraction
method for predicting PPIs. Pan et al[12] proposed a novel latent Dirichlet allocation-random forest (LDA-RF) model for
predicting human PPIs based on protein primary sequences, which has a strong ability
to process large-scale datasets. Zhang[13] proposed a novel approach based on protein sequence that used random tree and
genetic algorithm for predicting PPIs. Yang et al[14] presented a new approach that used local descriptors to represent protein
sequence and employed the k-nearest neighbors for performing
classification. Guo et al[15] adopted autocorrelation feature extraction technique for generating feature
vectors and used the support vector machine (SVM) classifier to identify PPIs. An et al[16] proposed a classification algorithm that uses an RVM with a compound kernel function, based on the
gray wolf optimization algorithm and k-fold cross-validation, which
fully considers both the local and global features of PPI positions. An et al[17] proposed a feature extraction approach based on local protein sequence PSSM
matrix coding and serial multifeature fusion. The local PSSM coding captures both
continuous and discontinuous PPI information in a protein sequence, and serial
multifeature fusion integrates much of the key feature information contained in protein
sequences. These
methods usually explore correlational information between protein pairs, such as
coevolution, colocalization, and coexpression. However, this information is not
enough for predicting SIPs. In addition, the PPI datasets do not contain PPIs
between the same protein partners. For all these reasons, these computational
approaches are not adequate for
predicting SIPs. In a previous study, Liu et al[18] proposed a method integrating multiple representative known properties to
create a prediction model called SLIPPER to predict SIPs. As far as we know, a
number of recent studies on PPIs have been reported that may also be related to
SIPs.[19-24] However, these methods have an obvious
drawback: they cannot deal with proteins that are not covered by the current human
interactome. For all of these reasons, the
development of efficient computational methods for predicting SIPs is
necessary.
In this study, we propose a new computational approach called RNN-SIFT, which
combines the recurrent neural network (RNN) with scale invariant feature transform
(SIFT) for predicting SIPs based on protein evolutionary information. The proposed
method uses SIFT to extract key features from the PSSM, which is constructed by using the
Position-Specific Iterated BLAST (PSI-BLAST) tool and contains protein evolutionary
information. The RNN classifier then performs classification based on the
extracted features. The RNN-SIFT model obtained average accuracies of 94.34% and
97.12% on the yeast and human datasets,
respectively. Compared with the back propagation neural network (BPNN), the
state-of-the-art SVM, and previous computational models, our method takes full
advantage of RNN and SIFT, thereby improving the prediction accuracy. The
experimental results demonstrate that the proposed RNN-SIFT model is a useful tool
for predicting SIPs and is also suitable for other bioinformatics tasks.
Materials and Methods
Dataset
The UniProt database contains 20 199 curated human protein sequences.[25] The PPI datasets can be downloaded from different databases, including DIP,[26] BioGRID,[27] IntAct,[28] InnateDB,[29] and MatrixDB.[30] In this article, we constructed PPI data that contain only interactions between 2 identical
protein sequences and whose interaction type was defined as “direct
interaction” in the relevant databases. As a result, we acquired 2994
human self-interacting protein sequences. To verify the
performance of the RNN-SIFT model, we constructed the experimental datasets by
using the following 3 steps: (1) protein sequences shorter than 50
residues or longer than 5000 residues were removed from the whole human
proteome; (2) to create the positive dataset, we selected SIPs that
satisfy at least 1 of the following conditions: (a) self-interaction has been detected
by at least 2 kinds of large-scale experiments or 1 small-scale
experiment, (b) the protein has been defined as a homo-oligomer (including
homodimer and homotrimer) in UniProt, or (c) self-interaction has been reported by at least 2
publications; and (3) to construct the negative
dataset, we removed all types of SIPs from the whole human proteome (including
proteins annotated as “direct interaction” and the more extensive “physical
association”) and the UniProt database. Consequently, we selected 15 938 non-SIPs as
negative samples and 1441 SIPs as positive samples for creating the
human dataset.[31] In addition, we used the same strategy to construct the
yeast dataset, which contains 5511 negative and 710 positive samples.[31]
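The 3 construction steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Protein` record and all of its field names are hypothetical stand-ins for the evidence annotations described in the text, not the authors' actual data format.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    # All field names are hypothetical; they mirror the criteria in the text.
    accession: str
    sequence: str
    large_scale_hits: int       # kinds of large-scale experiments reporting self-interaction
    small_scale_hits: int       # small-scale experiments reporting self-interaction
    is_homo_oligomer: bool      # annotated as homo-oligomer in UniProt
    publications: int           # publications reporting self-interaction
    any_self_interaction: bool  # any SIP evidence, including "physical association"


def length_ok(p, min_len=50, max_len=5000):
    # Step 1: drop sequences shorter than 50 or longer than 5000 residues.
    return min_len <= len(p.sequence) <= max_len


def is_positive(p):
    # Step 2: a SIP must satisfy at least one of conditions (a), (b), (c).
    return (p.large_scale_hits >= 2 or p.small_scale_hits >= 1
            or p.is_homo_oligomer or p.publications >= 2)


def build_dataset(proteins):
    kept = [p for p in proteins if length_ok(p)]
    positives = [p for p in kept if is_positive(p)]
    # Step 3: negatives are proteins with no self-interaction evidence at all.
    negatives = [p for p in kept
                 if not p.any_self_interaction and not is_positive(p)]
    return positives, negatives
```

Note that a protein with weak self-interaction evidence (for example, a single large-scale experiment) ends up in neither set under this reading of the criteria.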
Feature extraction method
Position-specific scoring matrix
Position-specific scoring matrix contains not only the position information
but also the evolution information of protein sequence. As a result, the
PSSM is used to extract the evolutionary information in the article.
Position-Specific Iterated BLAST[32] is used to convert each sequence into a PSSM. Assuming the length of
a given protein sequence is L, its PSSM can be expressed as
an L × 20 matrix. Figure 1 shows the schematic of a
PSSM.
Figure 1.
The schematic of a PSSM.
PSSM indicates position-specific scoring matrix.
In the figure, L represents the length of a given sequence,
20 is the number of standard amino acids, and $P_{i,j}$ represents the score of the $j$th amino acid at the $i$th position of the query sequence. $P_{i,j}$ is a real value: if $P_{i,j}$ is greater than 0, the residue at position $i$ is easily mutated into amino acid $j$ during the evolution process, and a larger
value indicates a higher mutation probability. Conversely, if
$P_{i,j}$ is less than 0, the position is conserved and the
probability of mutation is small; smaller values indicate stronger conservation. To extract evolutionary information
from protein sequences, each SIP’s sequence was converted into a PSSM by
using the PSI-BLAST tool. To obtain highly and widely homologous sequences,
PSI-BLAST’s e-value parameter was set to 0.001 and 3 iterations were
selected.
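As an illustration of how a PSSM becomes an L × 20 matrix in practice, the sketch below parses the plain-text PSSM layout that PSI-BLAST can write. It assumes each data row starts with a position index and a one-letter residue code followed by 20 integer scores; the function name `parse_ascii_pssm` is hypothetical and the parser is deliberately lenient, skipping any line that does not fit this shape.

```python
def parse_ascii_pssm(lines):
    """Return (residues, matrix); matrix has one row of 20 ints per residue."""
    residues, matrix = [], []
    for line in lines:
        parts = line.split()
        # Data rows start with a position index and a one-letter residue code;
        # header and footer lines fail this check and are skipped.
        if len(parts) < 22 or not parts[0].isdigit() or len(parts[1]) != 1:
            continue
        try:
            row = [int(value) for value in parts[2:22]]
        except ValueError:
            continue
        residues.append(parts[1])
        matrix.append(row)
    return residues, matrix


def read_pssm(path):
    # Convenience wrapper for a file on disk (path is supplied by the caller).
    with open(path) as handle:
        return parse_ascii_pssm(handle)
```

For a protein of length L, `matrix` then has L rows and 20 columns, matching the description above.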
Scale invariant feature transform
Scale invariant feature transform is an image descriptor for image-based
matching and recognition developed by Lowe.[33,34] The original SIFT
descriptor was calculated from the image intensities around interesting
locations in the image domain which can be named interest points or key
points. These interest points are obtained from scale-space extrema of
differences of Gaussians (DOG) within a DOG pyramid. As described by Lindeberg,[35,36]
interest-point detection in SIFT
can be viewed as a variation of a scale-adaptive blob
detection approach, where blobs with associated scale levels are detected
from scale-space extrema of the scale-normalized Laplacian. The
scale-normalized Laplacian is normalized with respect to the scale level in
scale space and is defined as
\[ \nabla^2_{norm} L(x, y; s) = s \, (L_{xx} + L_{yy}) = s \, \nabla^2 \big( G(x, y; s) * f(x, y) \big) \]
To obtain the extrema of the DOG image at different scale
magnifications, the SIFT algorithm computes the smoothed image values $L(x, y; s)$
by convolving a given original image $f(x, y)$
with Gaussian kernels of different widths $s = \sigma^2$, where $\sigma$ denotes the
standard deviation and $s$ the variance of the Gaussian kernel. The scale-variable
Gaussian function is defined as follows
\[ G(x, y; s) = \frac{1}{2 \pi s} \, e^{-(x^2 + y^2)/(2s)} \]
These Gaussian blurred images are grouped according to their scale
magnification, so the number of Gaussian blurred images processed in each
group is the same. The DOG image is then obtained by
subtracting 2 adjacent Gaussian blurred images in the same group. The DOG
operator, which constitutes an approximation of the
Laplacian operator, is defined as follows
\[ DOG(x, y; s) = L(x, y; s + \Delta s) - L(x, y; s) \approx \frac{\Delta s}{2} \, \nabla^2 L(x, y; s) \]
By the implicit normalization of the DOG responses, as obtained by the
self-similar distribution of scale levels $\sigma_{i+1} = k \sigma_i$ used by Lowe, the DOG operator also constitutes an approximation of the
scale-normalized Laplacian with $\Delta s / 2 = (k^2 - 1) s / 2$, thus implying
\[ DOG(x, y; s) \approx \frac{k^2 - 1}{2} \, \nabla^2_{norm} L(x, y; s) \]
After the DOG images are obtained, their local maxima and minima are found
and are referred to as key points. To quickly find the key
points, each pixel of a DOG image is compared with the 8 pixels around
it at the same scale and the 9 pixels at the corresponding positions in each of the
2 adjacent scales in the same group, 26 neighbors in total. A pixel that is the maximum or minimum among these neighbors is
called a key point. As a result, the key point detection of the SIFT
algorithm is actually a variant of blob detection, which uses the Laplacian to
compute the extrema in each magnification space, with the DOG
serving as an approximation of the Laplacian operator.
The SIFT employs the concept of “scale space” to capture features at
multiple scale levels or image resolutions, which not only increases the
number of available features but also makes the method highly tolerant to
scale changes.
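The key point search described above can be sketched with the standard library alone. This is a minimal illustration, not the implementation used in the study: the base scale `sigma0 = 1.6`, the ratio `k = sqrt(2)`, the number of levels, and the edge-replication border handling are all assumptions, and a full SIFT pipeline adds steps (sub-pixel refinement, edge rejection, orientation assignment, descriptors) that are omitted here.

```python
import math


def gaussian_kernel_1d(sigma):
    # Normalized 1D Gaussian weights truncated at about 3 sigma.
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian blur; borders are handled by edge replication."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    rows, cols = len(image), len(image[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    horiz = [[sum(kernel[k + radius] * image[r][clamp(c + k, cols)]
                  for k in range(-radius, radius + 1))
              for c in range(cols)] for r in range(rows)]
    return [[sum(kernel[k + radius] * horiz[clamp(r + k, rows)][c]
                 for k in range(-radius, radius + 1))
             for c in range(cols)] for r in range(rows)]


def dog_stack(image, sigma0=1.6, k=2 ** 0.5, levels=4):
    # Blur at self-similar scales sigma0 * k**i, then subtract adjacent levels.
    blurred = [blur(image, sigma0 * k ** i) for i in range(levels)]
    return [[[b[r][c] - a[r][c] for c in range(len(a[0]))]
             for r in range(len(a))]
            for a, b in zip(blurred, blurred[1:])]


def is_extremum(dogs, s, r, c):
    # Compare with 8 neighbors at the same scale and 9 in each adjacent scale.
    center = dogs[s][r][c]
    neighbours = [dogs[s + ds][r + dr][c + dc]
                  for ds in (-1, 0, 1) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                  if (ds, dr, dc) != (0, 0, 0)]
    return center > max(neighbours) or center < min(neighbours)


def find_keypoints(dogs):
    # Scan interior pixels of every DOG level that has a level above and below.
    found = []
    for s in range(1, len(dogs) - 1):
        for r in range(1, len(dogs[s]) - 1):
            for c in range(1, len(dogs[s][0]) - 1):
                if is_extremum(dogs, s, r, c):
                    found.append((s, r, c))
    return found
```

With 4 blurred levels the stack holds 3 DOG images, so only the middle one can contain key points; real implementations use more levels per group for this reason.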
In this article, we treated each PSSM as an image matrix and
used the SIFT feature extraction method to generate feature vectors of
dimension 128. The technology roadmap of the proposed method is shown
in Figure 2.
Figure 2.
The technology roadmap of the proposed method.
BPNN indicates back propagation neural network; PSI-BLAST,
Position-Specific Iterated BLAST; PSSM, position-specific scoring
matrix; RNN, recurrent neural network; SVM, support vector
machine.
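Treating a PSSM as an image requires mapping its integer scores to pixel intensities. The text does not say how this mapping was done, nor how the per-key-point SIFT descriptors were combined into a single 128-dimensional vector, so the sketch below shows only one plausible first step: min-max scaling to the 0 to 255 grayscale range. The function name `pssm_to_grayscale` and the scaling choice are assumptions.

```python
def pssm_to_grayscale(matrix):
    """Min-max scale PSSM scores to integer intensities in [0, 255]."""
    flat = [value for row in matrix for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # A constant matrix carries no contrast; map it to a black image.
        return [[0 for _ in row] for row in matrix]
    return [[round(255 * (value - lowest) / span) for value in row]
            for row in matrix]
```

The resulting L × 20 grayscale array could then be passed to any SIFT implementation that accepts 8-bit images.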
Recurrent neural network
The recurrent neural network is a machine learning method based on deep learning
that can be used to solve binary or multiclass classification problems. For tasks
that involve sequential inputs, such as speech and language, it is often better
to use RNNs. RNNs process an input sequence one element at a time, maintaining
in their hidden units a “state vector” that implicitly contains information
about the history of all the past elements of the sequence. The final output of
the RNN model is the classification label of each feature vector.
The recurrent neural network is designed for problems in which the input training
sample is a continuous sequence and the sequences differ in length,
such as problems based on time series. A basic neural network only
establishes weight connections between layers. The biggest difference of the RNN is
that weight connections are also established between neurons within the same
layer.[37-39] The
structure of the RNN is shown in Figure 3.
It can be seen from Figure
3 that the output of the RNN at any moment is related to the current
input and the previous hidden state. The RNN’s forward propagation is a combination of
multiplication, addition, and set operations. For an ordered sequence with
t moments, the hidden layer is computed t times. The current state of the
hidden layer is determined by the current input and the output of the hidden layer at the previous moment. The mathematical description is as
follows
\[ s_t = f(U x_t + W s_{t-1}) \]
where $x_t$ is the input at moment $t$, $s_t$ is the state of the hidden layer, $U$ and $W$ are the input-to-hidden and hidden-to-hidden weight matrices, and $f$ represents the activation function. The output of the current
hidden layer can be calculated by using the following function
\[ o_t = V s_t \]
where $V$ is the hidden-to-output weight matrix. The softmax function can be used to perform classification and output the final
prediction probability value, which is shown as follows
\[ \hat{y}_t = \mathrm{softmax}(o_t) \]
Here, the predicted output $\hat{y}_t$ generally differs from the true label $y_t$, and the loss function measures this difference. In practice, we can select different loss functions according
to the needs of the problem, such as the log loss function, the square
loss function, and so on. The loss function of the RNN model at moment
t can be expressed as follows
\[ L_t = \mathrm{loss}(\hat{y}_t, y_t) \]
The loss function (global loss) of the RNN model over all N moments
can be expressed as follows
\[ L = \sum_{t=1}^{N} L_t \]
The gradients of the global loss with respect to the 3 parameters U, V, and W
can be defined as follows
\[ \frac{\partial L}{\partial U} = \sum_{t=1}^{N} \frac{\partial L_t}{\partial U}, \quad \frac{\partial L}{\partial V} = \sum_{t=1}^{N} \frac{\partial L_t}{\partial V}, \quad \frac{\partial L}{\partial W} = \sum_{t=1}^{N} \frac{\partial L_t}{\partial W} \]
The most commonly used method for optimization problems is gradient descent.
In this article, the gradient update for the 3 parameters can be expressed as
follows
\[ U \leftarrow U - \eta \frac{\partial L}{\partial U}, \quad V \leftarrow V - \eta \frac{\partial L}{\partial V}, \quad W \leftarrow W - \eta \frac{\partial L}{\partial W} \]
where $\eta$ is the learning rate.
Figure 3.
The structure of RNN.
RNN indicates recurrent neural network.
The major advantage of the RNN model in learning nonlinear sequential data is
well known, and the model has been used in language modeling and sequential labeling. Because
the SIP dataset is also a kind of nonlinear sequence data, we
used the RNN model to predict SIPs in this study. The prediction flowchart of the
RNN-SIFT model is displayed in Figure 4.
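To make the recurrence concrete, here is a minimal forward pass of a plain RNN in standard-library Python. It is a sketch under assumptions, not the authors' network: the activation is taken to be tanh, U, W, and V are assumed to be the input-to-hidden, hidden-to-hidden, and hidden-to-output weights, and the loss is the log loss. How the 128-dimensional SIFT vector is presented to the network as a sequence is not specified in the text, so the input here is simply a list of vectors.

```python
import math


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_forward(xs, U, W, V, h0=None):
    """Run a plain RNN over xs; return per-step class probabilities and the final state."""
    hidden_size = len(W)
    h = list(h0) if h0 is not None else [0.0] * hidden_size
    outputs = []
    for x in xs:
        # New state depends on the current input and the previous state.
        pre = [a + b for a, b in zip(matvec(U, x), matvec(W, h))]
        h = [math.tanh(p) for p in pre]
        outputs.append(softmax(matvec(V, h)))
    return outputs, h


def log_loss(probabilities, label):
    # Per-step loss for the true class index `label`.
    return -math.log(probabilities[label])


def global_loss(outputs, labels):
    # Sum of the per-step losses over all moments.
    return sum(log_loss(p, y) for p, y in zip(outputs, labels))
```

Training would then adjust U, V, and W by gradient descent on `global_loss`; the backward pass is omitted to keep the sketch short.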
In this article, we employed the following measures to assess the performance of
RNN-SIFT
\[ Ac = \frac{TP + TN}{TP + FP + TN + FN} \]
\[ Sn = \frac{TP}{TP + FN} \]
\[ Sp = \frac{TN}{TN + FP} \]
\[ Pe = \frac{TP}{TP + FP} \]
\[ Mcc = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FN)(TN + FP)(TP + FP)(TN + FN)}} \]
where Ac is the accuracy, Sn represents the sensitivity, Sp is the specificity,
Pe represents the precision, and Mcc is Matthews’s correlation coefficient.
TP and TN represent the numbers of true
interacting and true noninteracting pairs that were correctly predicted,
respectively. FP and FN are the numbers of true
noninteracting pairs and true interacting pairs that were falsely predicted, respectively.
In addition, we used the receiver operating characteristic (ROC) curve to further evaluate the
performance of RNN-SIFT in the experiment.
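The five measures follow directly from the confusion-matrix counts. The sketch below uses the standard definitions of accuracy, sensitivity, specificity, precision, and Matthews's correlation coefficient; the function names are illustrative, and returning 0.0 when a denominator is zero is a convention chosen here, not something stated in the text.

```python
import math


def confusion_counts(y_true, y_pred):
    # Labels are 1 for self-interacting and 0 for non-self-interacting.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    return numerator / denominator if denominator else 0.0


def evaluate(tp, tn, fp, fn):
    mcc_denominator = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    return {
        "Ac": safe_div(tp + tn, tp + tn + fp + fn),
        "Sn": safe_div(tp, tp + fn),
        "Sp": safe_div(tn, tn + fp),
        "Pe": safe_div(tp, tp + fp),
        "Mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }
```

Because negatives heavily outnumber positives in both datasets, accuracy can be high even when sensitivity is moderate, which is why Sn, Pe, and Mcc are reported alongside Ac.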
Results and Discussion
Performance of the proposed RNN-SIFT model
In the experiments, we used the yeast and human
datasets to evaluate the performance of the proposed RNN-SIFT model. Because
overfitting can bias experimental results, we divided each
dataset into a training set and an independent test set.
Specifically, we split the yeast dataset
into 6 parts, selected 5 of them as the training set, and used the remaining
part as the independent test set. The human dataset
was processed by using the same strategy. In addition, 5-fold
cross-validation tests were employed to evaluate the prediction ability of
RNN-SIFT, and, for fair comparison, several parameters of the RNN model were
optimized by grid search. Here, we set
the learning rate to 0.001, the number of training steps to 1000, and the number of hidden units to 200. Tables 1 and 2 show the
experimental results of the proposed RNN-SIFT model on the
yeast and human datasets.
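The splitting scheme can be sketched as follows. The shuffling, the fixed seed, and the choice to run the 5-fold cross-validation over the indices passed in are assumptions made for illustration; the text does not state whether cross-validation was run on the full dataset or only on the training portion.

```python
import random


def shuffled_parts(indices, n_parts, seed=0):
    # Shuffle once, then deal the indices round-robin into n_parts groups.
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]


def holdout_split(n_samples, n_parts=6, seed=0):
    """Use 5 of 6 parts for training and the remaining part as the test set."""
    parts = shuffled_parts(range(n_samples), n_parts, seed)
    test = parts[-1]
    train = [i for part in parts[:-1] for i in part]
    return train, test


def cross_validation_splits(indices, n_folds=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = shuffled_parts(indices, n_folds, seed)
    for k in range(n_folds):
        validation = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, validation


# Example with the yeast dataset size from the text (5511 negative + 710 positive).
train_idx, test_idx = holdout_split(5511 + 710)
```

A stratified split that preserves the positive-to-negative ratio in every part would be a natural refinement for datasets this imbalanced.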
Table 1.
Fivefold cross-validation results shown using the RNN-SIFT model on
yeast.
Abbreviations: Ac, accuracy; Mcc, Matthews’s correlation coefficient;
Pe, precision; RNN, recurrent neural network; SIFT, scale invariant
feature transform; Sn, sensitivity.
Table 2.
Fivefold cross-validation results shown using the RNN-SIFT model on
human.
Abbreviations: Ac, accuracy; Mcc, Matthews’s correlation coefficient;
Pe, precision; RNN, recurrent neural network; SIFT, scale invariant
feature transform; Sn, sensitivity.
As can be seen from Table
1, the proposed RNN-SIFT model obtained good experimental results on the
yeast dataset: an average accuracy of 94.34%,
average sensitivity of 67.12%, average precision of 79.79%, and average Mcc of 71.61% were
achieved in the 5-fold cross-validation tests. Similarly,
Table 2 shows that RNN-SIFT achieved even better prediction
results on the human dataset, with average accuracy, sensitivity,
precision, and Mcc of 97.12%, 83.70%, 85.24%, and 79.35%, respectively. These
results indicate that the proposed RNN-SIFT model is valuable for SIP research.
The good experimental results for predicting SIPs are mainly attributed to the use of
the SIFT feature extraction method and the RNN classifier. The main advantage of the
RNN-SIFT model is that the SIFT method can extract key evolutionary features from the
PSSM, and the RNN classifier has the advantage of processing sequence data.
This is mainly due to the following 3 reasons: (1) PSSM contains not
only the position information but also the evolution information of the protein
sequence and retains plenty of prior information, so it
contains a number of key features that can be extracted. (2) SIFT uses the
concept of “scale space” to capture features at multiple scale levels, which not
only increases the number of available features but also makes the method highly
tolerant to scale changes. This makes it possible to extract the
evolutionary information embedded in PSSM and capture SIP information. (3)
The recurrent neural network has the characteristics of memory, parameter sharing,
and Turing completeness, which provide an advantage for learning the
nonlinear characteristics of sequences. Therefore, RNN is used to perform
classification for predicting SIPs. The results demonstrate 2 things. First, the
SIFT method is very suitable for extracting SIP features. Second, the RNN
classifier performs well for predicting SIPs.
Comparison with BPNN-based and SVM-based methods
The results above show that the RNN-SIFT model is well suited to predicting
SIPs and can obtain good prediction results. To further evaluate the
performance of the RNN-SIFT model, we compared the RNN classifier with the BPNN
classifier and the SVM classifier by using the same SIFT features on the
yeast and human datasets. To
ensure fair comparison, several parameter settings of BPNN were optimized by
grid search. Specifically, the epochs (the number of training passes),
the eta (learning rate), the BS (the batch size of each training), and the WS
(weights) of BPNN were set to 100, 0.006, 0.5, and 0.7, respectively. Similarly, by using the
same strategy, the RBF kernel parameters of the SVM were
optimized, with c = 0.5 and g = 10.8, and
the other parameters took their default values. The SVM
classifier was implemented with the LIBSVM tool.[40]
Tables 3 to 6 show the
experimental results of BPNN-SIFT and SVM-SIFT on the yeast and
human datasets. The comparison of
ROC curves on the yeast and human datasets
between RNN, BPNN, and SVM is shown in Figures 5 and 6, respectively. As outlined in
Tables 3 and
4, the BPNN-SIFT
model achieved 91.31% average accuracy and the SVM-SIFT model obtained 89.58%
average accuracy on the yeast dataset. Similarly, as can be seen
from Tables 5 and
6, average accuracies
of 93.84% and 91.79% were obtained by the BPNN-SIFT model and
the SVM-SIFT model on the human dataset, respectively. Compared with
the 94.34% and 97.12% obtained by RNN-SIFT on the same datasets, these results show
that the performance of the RNN classifier is significantly better than that of the
other 2 classifiers. Likewise, in Figures 5 and 6, the ROC curves of the RNN classifier are
significantly better than those of the other 2 classifiers. A major reason
for the good prediction results is that SIP sequences are nonlinear sequence data, and
the RNN classifier has the characteristics of memory, parameter sharing, and Turing
completeness, which provide an advantage for learning the nonlinear
characteristics of sequences. From the above analysis, we conclude that the
proposed RNN-SIFT model is a useful tool for identifying SIPs, as well as for other
bioinformatics tasks.
Table 3.
Fivefold cross-validation results shown by using the BPNN-SIFT model on
yeast.
Abbreviations: Ac, accuracy; BPNN, back propagation neural network;
Mcc, Matthews’s correlation coefficient; Pe, precision; SIFT, scale
invariant feature transform; Sn, sensitivity.
Table 4.
Fivefold cross-validation results shown by using the SVM-SIFT model on
yeast.
Abbreviations: Ac, accuracy; Mcc, Matthews’s correlation coefficient;
Pe, precision; SIFT, scale invariant feature transform; Sn,
sensitivity; SVM, support vector machine.
Table 5.
Fivefold cross-validation results shown by using the BPNN-SIFT model on
human.
Abbreviations: Ac, accuracy; BPNN, back propagation neural network;
Mcc, Matthews’s correlation coefficient; Pe, precision; SIFT, scale
invariant feature transform; Sn, sensitivity.
Table 6.
Fivefold cross-validation results shown by using the SVM-SIFT model on
human.
Abbreviations: Ac, accuracy; Mcc, Matthews’s correlation coefficient;
Pe, precision; SIFT, scale invariant feature transform; Sn,
sensitivity; SVM, support vector machine.
Figure 5.
Comparison of ROC curves between RNN, BPNN, and SVM on
yeast dataset.
BPNN indicates back propagation neural network; RNN, recurrent neural
network; ROC, receiver operating characteristic; SIFT, scale invariant feature
transform; SVM, support vector machine.
Figure 6.
Comparison of ROC curves between RNN, BPNN, and SVM on
human dataset.
BPNN indicates back propagation neural network; RNN, recurrent neural
network; ROC, receiver operating characteristic; SIFT, scale invariant feature
transform; SVM, support vector machine.
Comparison with other methods
To further validate the performance of the proposed RNN-SIFT model,
we compared the prediction results of the RNN-SIFT model with those of
previous methods, namely SLIPPER,[18] CRS,[31] SPAR,[31] DXECPPI, PPIevo,[41] and LocFuse.[42]
Tables 7 and 8 show the detailed
comparison results on the yeast and human
datasets. It can be seen from Table 7 that the average accuracy of
RNN-SIFT is clearly higher than that of the other 6 approaches on the
yeast dataset. Similarly, Table 8 shows that the prediction
accuracy obtained by the RNN-SIFT model is also significantly better than that
of the other 6 methods on the human dataset. Taken together,
Tables 7 and 8 indicate that the proposed RNN-SIFT model has
an excellent prediction capability and can be used for predicting
SIPs. This is a result of using a robust RNN classifier and an effective SIFT
feature extraction technique. These comparison results are further evidence that
RNN-SIFT is suitable for predicting SIPs.
Table 7.
Comparison results between RNN-SIFT and other methods on
yeast dataset.
Abbreviations: Ac, accuracy; Mcc, Matthews’s correlation coefficient;
RNN, recurrent neural network; SIFT, scale invariant feature
transform; Sn, sensitivity; Sp, specificity.
Table 8.
Comparison results between RNN-SIFT and other methods on
human dataset.
Abbreviations: Ac, accuracy; Mcc, Matthews’s correlation coefficient;
RNN, recurrent neural network; SIFT, scale invariant feature
transform; Sn, sensitivity; Sp, specificity.
Conclusions
In this study, we proposed a novel computational method called RNN-SIFT, which
combines the RNN with SIFT for predicting SIPs based on protein evolutionary
information. Extensive experiments show that RNN-SIFT obtained average
accuracies of 94.34% and 97.12% on the yeast and
human datasets, respectively. We also compared its performance
with that of BPNN, the state-of-the-art SVM, and other existing methods. In these
comparisons, the performance of RNN-SIFT was significantly better
than that of the BPNN, SVM, and other previous methods in the domain. This is mainly
due to the following 3 reasons: (1) PSSM contains not only the position information
but also the evolution information of protein sequence and retains plenty of prior
information. This makes it possible to contain a number of key features that can be
extracted. (2) Scale invariant feature transform uses the concept of “scale space”
to capture features at multiple scale levels, which not only increases the number of
available features but also makes the method highly tolerant to scale changes. This
makes it possible for extracting the evolutionary information embedded in PSSM and
capturing self-protein interaction information. (3) Self-interacting protein
sequence is nonlinear sequence data, and RNN has some characteristics in memory,
parameter sharing, and Turing completeness and can provide an advantage for learning
based on the nonlinear characteristics of sequences. Therefore, we conclude that the
proposed RNN-SIFT model is a useful tool for predicting SIPs, as well as for
other bioinformatics tasks.
Authors: Lukasz Salwinski; Christopher S Miller; Adam J Smith; Frank K Pettit; James U Bowie; David Eisenberg Journal: Nucleic Acids Res Date: 2004-01-01
Authors: P Katsamba; K Carroll; G Ahlsen; F Bahna; J Vendome; S Posy; M Rajebhosale; S Price; T M Jessell; A Ben-Shaul; L Shapiro; Barry H Honig Journal: Proc Natl Acad Sci U S A Date: 2009-06-24
Authors: Karin Breuer; Amir K Foroushani; Matthew R Laird; Carol Chen; Anastasia Sribnaia; Raymond Lo; Geoffrey L Winsor; Robert E W Hancock; Fiona S L Brinkman; David J Lynn Journal: Nucleic Acids Res Date: 2012-11-24