Hidden Markov models incorporating fuzzy measures and integrals for protein sequence identification and alignment.

Niranjan P Bidargaddi, Madhu Chetty, Joarder Kamruzzaman.

Abstract

Profile hidden Markov models (HMMs) based on classical HMMs have been widely applied to protein sequence identification. The forward and backward variables in profile HMMs are formulated under the statistical independence assumption of probability theory. We propose a fuzzy profile HMM to overcome the limitations of that assumption and to achieve an improved alignment for protein sequences belonging to a given family. The proposed model fuzzifies the forward and backward variables by incorporating Sugeno fuzzy measures and Choquet integrals, thereby further extending the generalized HMM. Based on the fuzzified forward and backward variables, we propose a fuzzy Baum-Welch parameter estimation algorithm for profiles. The strong correlations and the sequence preferences involved in protein structures make this fuzzy-architecture-based model a suitable candidate for building profiles of a given family, since fuzzy sets can handle uncertainties better than classical methods.

Substances：
Globins
Protein Kinases

Year: 2008 PMID: 18973866 PMCID: PMC5054101 DOI: 10.1016/S1672-0229(08)60025-X

Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691

Introduction

Hidden Markov models (HMMs) are probabilistic models that have been applied to various biological problems. For example, profile HMMs [1-4] are used for aligning protein sequences of the same family based on homology through the Viterbi algorithm. The parameters of profile HMMs are estimated by the MAP and Baum-Welch algorithms. However, classical HMMs have several limitations. First, HMMs do not capture any higher-order correlations among the amino acids in protein sequences. An HMM assumes that the identity of an amino acid at a particular position is independent of the identities at all other positions. Second, HMMs are constrained by the statistical independence assumptions made when formulating the forward and backward variables, which are used to compute the matching scores of an unknown sequence to a known family. Because of these assumptions, the joint measure variables (forward and backward) are decomposed as a combination of two measures defined on amino acid emission probabilities and state probabilities. To relax such assumptions and achieve improved performance and flexibility, Mohamed and Gader proposed a fuzzy HMM based on fuzzy measures and integrals. Fuzzy measures are an extension of the classical additive measures, obtained by replacing the additivity requirement of classical measures with the weaker properties of monotonicity, continuity, and semi-continuity. Integrals are used to aggregate the fuzzy measures by combining the partial support for a hypothesis from the viewpoint of each information source with the importance of various subsets of sources. This model does not require the measures to be decomposed. Unlike classical HMMs, it also does not require fixed sequence lengths or large training datasets to learn the transition parameters, and thus offers more flexibility and robustness.
The fuzzy HMM has been successfully applied in various domains such as speech processing and image processing [9-11]. The fuzzy measure concept for protein sequence analysis was first introduced into profile HMMs in our previous studies [12, 13], which showed improved performance over classical profile HMMs. In this paper, we propose a fuzzy Viterbi search algorithm based on Choquet integrals and fuzzy measures to overcome the limitations of the classical Viterbi search algorithm traditionally used to align a query sequence to a profile model. It incorporates the ascending values of the scores of the neighboring states when calculating the score for a given state, and hence provides better alignments and more accurate scores. We also propose a fuzzy Baum-Welch algorithm to relax the statistical independence assumption in the classical Baum-Welch algorithm. Evaluation results obtained with protein sequences from globin and kinase databases demonstrate the superiority of the fuzzy profile HMM over classical models in terms of matching scores and alignments.

Model

Classical profile HMM

Profiles, introduced by Gribskov, are statistical descriptions of the consensus of a multiple sequence alignment. They use position-specific scores for amino acids and position-specific penalties for opening and extending an insertion or deletion. Figure 1 shows the Plan 7 architecture of the profile HMM used in the software HMMER 2. This architecture differs from the original (Plan 9) Krogh/Haussler architecture used in earlier versions of HMMER by reducing the number of transitions from 9 to 7, omitting the D → I and I → D transitions. Profile HMMs capture position-specific information such as how conserved each column of the alignment is and which residues are likely to occur in each column. They are capable of modeling gapped alignments including insertions and deletions, which allows modeling of a complete conserved domain (rather than just a small ungapped motif). If a trusted alignment is not yet known, profile HMMs can be trained from unaligned sequences using Baum-Welch expectation maximization. The profile HMM architecture shown in Figure 1 is characterized as follows. It consists of a linear set of match (M), insert (I), and delete (D) states. There is one M state per consensus column in the multiple alignment. Each M state carries a vector of 20 probabilities for scoring the 20 amino acids. Each M state is associated with an I and a D state. The group of three states (M/D/I) at the same consensus position in the alignment is called a node. The states are interconnected by arrows as shown in Figure 1, representing state transition probabilities. The transitions are arranged so that at each node, either an M state is triggered (a residue is aligned and scored) or a D state is triggered (no residue is aligned, resulting in a deletion-gap character “–”). Insertions occur between nodes, and an I state can have a self-transition, allowing one or more inserted residues to occur between consensus columns.
The transition to an I state for the first inserted residue, followed by zero or more I → I self-transitions for each subsequent inserted residue, is the probabilistic equivalent of the familiar gap-open and gap-extend affine gap penalty system. Like all HMMs, profile HMMs have emission and transition probabilities that define a probability distribution over the whole space of sequences. This distribution is parameterized using the Baum-Welch re-estimation formulas so that it peaks around the members of the family. The forward algorithm, backward algorithm, re-estimation algorithm, and Viterbi algorithm are the four main components of the profile HMM.
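The correspondence with affine gap penalties can be illustrated numerically. The following is a minimal sketch, not taken from HMMER: the function name `insertion_cost` and all probability values are hypothetical, and the cost is simply the negative log-probability of entering an insert state, looping in it, and returning to a match state.

```python
import math

# Plan 9 allows all nine transitions among M, I, and D;
# Plan 7 drops the D->I and I->D transitions.
PLAN9 = {(a, b) for a in "MID" for b in "MID"}
PLAN7 = PLAN9 - {("D", "I"), ("I", "D")}


def insertion_cost(n, a_mi, a_ii, a_im):
    """Negative log-probability of n inserted residues between two match states.

    a_mi: M->I probability (plays the role of gap open)
    a_ii: I->I probability (plays the role of gap extend)
    a_im: I->M probability (return to the consensus)
    All values are illustrative assumptions.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return -(math.log(a_mi) + (n - 1) * math.log(a_ii) + math.log(a_im))


# Each extra inserted residue adds the same constant, -log(a_ii).
for n in (1, 2, 3):
    print(n, round(insertion_cost(n, 0.05, 0.4, 0.6), 4))
```

The cost grows linearly in the insertion length after a fixed opening cost, which is the affine form described above.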

Fig. 1

Profile HMM architecture based on Plan 7 (HMMER 2).

Forward algorithm

The forward algorithm is used to calculate the log-odds scores of a protein sequence. The forward variables of the classical profile HMM for the kth match, insert, and delete states are estimated using Equations 1-3, respectively.
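For orientation, the forward recursion of a Plan 9 profile HMM is commonly written as follows. This is the standard textbook form in probability space and is not necessarily the notation of Equations 1-3; the symbols f (forward variable), a (transition probability), and e (emission probability) are assumptions introduced here.

```latex
\begin{aligned}
f_{M_k}(t) &= e_{M_k}(O_t)\,\bigl[\, f_{M_{k-1}}(t-1)\,a_{M_{k-1}M_k} + f_{I_{k-1}}(t-1)\,a_{I_{k-1}M_k} + f_{D_{k-1}}(t-1)\,a_{D_{k-1}M_k} \bigr] \\
f_{I_k}(t) &= e_{I_k}(O_t)\,\bigl[\, f_{M_k}(t-1)\,a_{M_k I_k} + f_{I_k}(t-1)\,a_{I_k I_k} + f_{D_k}(t-1)\,a_{D_k I_k} \bigr] \\
f_{D_k}(t) &= f_{M_{k-1}}(t)\,a_{M_{k-1}D_k} + f_{I_{k-1}}(t)\,a_{I_{k-1}D_k} + f_{D_{k-1}}(t)\,a_{D_{k-1}D_k}
\end{aligned}
```

Delete states emit no residue, so the position index t does not advance in the third line.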

Backward algorithm

The backward algorithm is used for parameter estimation. In the classical profile HMM, the backward variables for the kth match, insert, and delete states are calculated as shown in Equations 4-6, respectively.
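A common textbook form of the backward recursion, written for any state Z in {M, I, D} of node k, is sketched below. The symbols b (backward variable), a, and e are assumptions introduced here and may differ from those of Equations 4-6.

```latex
b_{Z_k}(t) = b_{M_{k+1}}(t+1)\,a_{Z_k M_{k+1}}\,e_{M_{k+1}}(O_{t+1})
           + b_{I_k}(t+1)\,a_{Z_k I_k}\,e_{I_k}(O_{t+1})
           + b_{D_{k+1}}(t)\,a_{Z_k D_{k+1}},
\qquad Z \in \{M, I, D\}
```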

Re-estimation algorithm

The emission and transition matrices of the classical profile HMM are re-estimated by computing all of their elements as shown in Equations 7-11, where P(O) represents the probability of the sequence.

Viterbi algorithm

The classical Viterbi algorithm is used to compute the negative logarithm of the probability of the single most likely path for a given sequence O (Equation 12). This path contains the sequence of states (M, I, and D) that emitted the amino acid residues in sequence O for the given model. In the classical profile HMM, the Viterbi variables for the kth match, insert, and delete states are calculated as shown in Equations 13-15, respectively. The equations formulated above are based on the Plan 9 architecture. They can be easily extended to the Plan 7 architecture by setting the D → I and I → D transition parameters to zero. Since protein sequences have high degrees of interdependency, the additive hypothesis of the probability measure is not well suited to them. The classical model (HMMER), based on probability theory, assigns the same level of importance to every source, that is, to each state in the profile HMM. Fuzzy measures and integrals provide a more flexible way to overcome this limitation. They take into account the relative importance of each source along with the information it provides.
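The idea of minimizing a negative log-probability along a path can be sketched on a generic discrete HMM. The example below is a simplification under stated assumptions: it uses an ordinary fully connected HMM rather than the M/I/D profile topology, and the function name `viterbi_neg_log` and the toy parameters are hypothetical.

```python
import math


def neg_log(p):
    """Negative logarithm, with zero probability mapped to infinite cost."""
    return -math.log(p) if p > 0 else math.inf


def viterbi_neg_log(obs, states, start_p, trans_p, emit_p):
    """Return (cost, path) for the single most likely state path.

    cost is the negative log-probability of that path jointly with obs.
    """
    cost = [{s: neg_log(start_p[s]) + neg_log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        cost.append({})
        back.append({})
        for s in states:
            # Cheapest predecessor of state s at position t.
            prev = min(states, key=lambda r: cost[t - 1][r] + neg_log(trans_p[r][s]))
            cost[t][s] = (cost[t - 1][prev] + neg_log(trans_p[prev][s])
                          + neg_log(emit_p[s][obs[t]]))
            back[t][s] = prev
    # Termination and backtracking.
    last = min(states, key=lambda s: cost[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return cost[-1][last], path
```

Setting a transition probability to zero gives it infinite cost, which is how forbidden transitions (such as D → I in Plan 7) drop out of the minimization.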

Fuzzy measures and integrals

Probability measure theory obeys the additivity of classical measure theory and assigns one to the universal set. Fuzzy measures are an extension of the classical additive theory. They are obtained by replacing the additivity requirement of classical measures with the weaker properties of monotonicity, continuity, and semi-continuity. Fuzzy measures are aggregated using Choquet or Sugeno integrals.

Fuzzy measure

Let Ω be the power set of a universal set X. A set function g : Ω → [0, 1] that satisfies the boundary, monotonicity, and continuity conditions in Equations 16-18 is called a fuzzy measure. It maps the crisp power set of a universal set to the unit interval, representing evidence.

Boundary: g(∅) = 0 and g(X) = 1.
Monotonicity: for A, B ∈ Ω, if A ⊆ B, then g(A) ≤ g(B).
Continuity: for any increasing sequence of sets A_1 ⊆ A_2 ⊆ ··· in Ω, if ∪_i A_i ∈ Ω, then lim_{i→∞} g(A_i) = g(∪_i A_i).

It follows from this definition that the measure of the union of two disjoint subsets cannot be directly computed from the component measures. The possibility measure based on a t-conorm S is one of the most widely used fuzzy measures. The t-conorm S is an operation on the unit interval [0, 1] that satisfies the following conditions for elements a, b, and c:

Boundary: S(a, 0) = a.
Monotonicity: if b ≤ c, then S(a, b) ≤ S(a, c).
Commutativity: S(a, b) = S(b, a).
Associativity: S(a, S(b, c)) = S(S(a, b), c).

Two types of S operations, the maximum and drastic t-conorm operators, are shown in Equations 23 and 24, respectively:

S_max(a, b) = max(a, b)
S_drastic(a, b) = a if b = 0; b if a = 0; 1 otherwise.

The possibility measure is based on the above definition of the max t-conorm operation. If X is a universal set with Ω consisting of all the subsets of X, then the possibility measure g satisfies g(A ∪ B) = max(g(A), g(B)) for A, B ∈ Ω, where g(∅) = 0 and g(X) = 1. It satisfies the constraints shown in Equation 26 along with the ones defined above for fuzzy measures. The possibility measures on each element of the set X, denoted by g({x}), are called the possibility-density measures. Using these density measures, we can calculate the possibility measures for all the sets in Ω: g(A) = max_{x ∈ A} g({x}).
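The construction of a possibility measure from its densities can be sketched in a few lines. This is an illustrative example only: the function names and the density values assigned to the states M, I, and D are hypothetical.

```python
from itertools import combinations


def possibility_measure(densities, subset):
    """g(A): the largest density g({x}) over x in A, with g(empty set) = 0."""
    return max((densities[x] for x in subset), default=0.0)


def all_subsets(universe):
    """Yield every subset of the universe as a frozenset (the power set)."""
    items = sorted(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


# Hypothetical densities; the largest one must equal 1 so that g(X) = 1.
g = {"M": 1.0, "I": 0.4, "D": 0.2}
for a in all_subsets(g):
    print(sorted(a), possibility_measure(g, a))
```

Because the measure of a union is the maximum of the parts rather than their sum, only the N densities need to be stored, not all 2^N subset values.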

Fuzzy integrals

Fuzzy integrals, defined with respect to fuzzy measures, are nonlinear functions that combine multiple sources of uncertain information. They use information about the importance of individual sources as well as subsets of sources to derive a reasonable numerical confidence value for the particular hypothesis under consideration. Here, we give a brief description of the Choquet integral, which is one of the most commonly used fuzzy integrals. Let (X, Ω) be a measurable space and let h : X → [0, 1] be an Ω-measurable function. The Choquet integral over A ⊆ X of the function h with respect to a fuzzy measure g is defined by:

∫_A h(x) ∘ g(·) = ∫_0^1 g(A ∩ H_α) dα,

where H_α = {x | h(x) ≥ α}. For a discrete set X with N elements, the Choquet integral (e) can be computed as follows:

e = Σ_{i=1}^{N} [h(x_i) − h(x_{i−1})] g(A_i),

where the elements are ordered so that h(x_1) ≤ h(x_2) ≤ ··· ≤ h(x_N), h(x_0) = 0, and A_i = {x_i, x_{i+1}, ···, x_N}.
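A discrete Choquet integral can be sketched as follows, assuming the ascending-order convention in which each increment of h is weighted by the measure of the set of elements at or above that level. The function name `choquet_integral` and the example values are hypothetical. With an additive measure the result reduces to an ordinary expectation, which gives a convenient sanity check.

```python
def choquet_integral(h, measure):
    """Discrete Choquet integral of h with respect to a fuzzy measure.

    h: dict mapping each element to a value in [0, 1]
    measure: callable taking a frozenset of elements and returning g(set)
    """
    order = sorted(h, key=h.get)      # elements by ascending h
    total, previous = 0.0, 0.0
    for i, x in enumerate(order):
        upper = frozenset(order[i:])  # elements whose h is at least h(x)
        total += (h[x] - previous) * measure(upper)
        previous = h[x]
    return total


h = {"a": 0.2, "b": 0.5, "c": 0.9}

# Additive (probability) measure: the integral equals the expectation of h.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
print(choquet_integral(h, lambda s: sum(p[x] for x in s)))

# Possibility measure built from hypothetical densities.
dens = {"a": 1.0, "b": 0.6, "c": 0.3}
print(choquet_integral(h, lambda s: max((dens[x] for x in s), default=0.0)))
```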

Fuzzy profile HMM

In the classical profile HMM, the joint probability measure P(O_1, O_2, ···, O_t, O_{t+1}, q_{t+1} = Z) is written as the product P(O_1, O_2, ···, O_t, O_{t+1} | q_{t+1} = Z) · P(q_{t+1} = Z), which makes the following two assumptions of statistical independence. First, the amino acid O_{t+1} emitted by the HMM at position t + 1 in state Z is independent of the previously emitted amino acid sequence (O_1, O_2, ···, O_t). Second, the active state Z at position t + 1 is independent of the previously observed subsequence of amino acids (O_1, O_2, ···, O_t). These assumptions are not realistic for the homologous sequences of a family, since neighboring residues are highly correlated. Improved results for building profiles can be expected through the relaxation permitted by fuzzy measures, which leads to the fuzzy profile HMM. As mentioned above, in the fuzzy profile HMM, the additive property of probability measures is replaced with the weaker condition of monotonicity by using fuzzy measures and integrals. The Choquet integral, used to aggregate the fuzzy measures, takes into account the importance of individual sources and of subsets of sources (states and subsets of states). The fuzzy forward and backward variables form the basis of the fuzzy Viterbi algorithm used for alignments and of the fuzzy Baum-Welch algorithm used for parameter estimation. The fuzzy profile HMM is characterized by the following parameters: the initial state fuzzy measure π̂_Z(·), the symbol fuzzy density b̂_j(O_t), and the transition fuzzy measure â_y(·|X).

Fuzzy forward algorithm

We formulate the fuzzy forward variable for the fuzzy profile HMM, which can be reduced to the combination of two measures: one defined on the observed amino acids and one defined on the states y = (M, I, D). This avoids the assumption of decomposition of measures made in classical HMMs. At any time, the fuzzy measure on the states can be constructed from its constituent forward variables through recursion, after integrating with the Choquet integral and using multiplication as the intersection operator. Here ∧ denotes the fuzzy intersection operator, which is taken to be multiplication. The elements of the transition matrix, containing the probability values for the transition to state y from state x, are assigned to the function h(x, y). All the values h(x, y) representing the transition probabilities to state y are sorted as in Equation 33. Based on this sorting, a set k(y) is obtained (Equation 34), where x is the state number at the ith position according to the ordering in Equation 33, based on the transitions to the yth state from all other states. According to the definitions of fuzzy measures and fuzzy integrals, f(t + 1) is given by Equation 35, which satisfies the constraints in Equations 33 and 34. In Equation 35, d(y) represents the difference between successive fuzzy measures and g(k(y)) represents the fuzzy measure. Normalizing the difference between successive fuzzy measures with respect to the fuzzy density yields Equation 36. Based on Equations 35 and 36, the forward variable for the yth state at position t + 1 can be reformulated. Accordingly, we reformulate the forward variables for the kth match, insert, and delete states using the possibility measure. The term ρ in these equations represents the fuzzy measure difference density, which is calculated using the Choquet integral as shown in Algorithm 1.

Algorithm 1. Calculation of the fuzzy measure difference density ρ
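One way to read the steps described above (sort the transition values, take differences of successive fuzzy measures, normalize by the fuzzy density) is sketched below for a single target state. This is an interpretation under stated assumptions, not the authors' Algorithm 1: the function name `fuzzy_forward_step` is hypothetical, the previous forward values are used directly as possibility densities without rescaling, and the sets are taken in ascending order of h.

```python
def fuzzy_forward_step(alpha_prev, trans_to_y, emit_y):
    """One fuzzy forward update for a single target state y (illustrative).

    alpha_prev: dict state -> forward value at position t (used as densities)
    trans_to_y: dict state -> h(x, y), the transition value from x to y
    emit_y:     emission value of y for the residue at position t + 1
    Returns (forward value at t + 1, dict of rho per source state).
    """
    order = sorted(alpha_prev, key=lambda x: trans_to_y[x])  # ascending h(x, y)
    rho, total = {}, 0.0
    for i, x in enumerate(order):
        # Possibility measure of the sets K_i and K_{i+1} (max of densities).
        g_here = max(alpha_prev[s] for s in order[i:])
        g_next = max((alpha_prev[s] for s in order[i + 1:]), default=0.0)
        d = g_here - g_next                                   # measure difference
        rho[x] = d / alpha_prev[x] if alpha_prev[x] > 0 else 0.0
        total += alpha_prev[x] * rho[x] * trans_to_y[x]
    return total * emit_y, rho


alpha_t = {"M": 0.1, "I": 0.3, "D": 0.2}   # hypothetical forward values at t
h_to_y = {"M": 0.7, "I": 0.2, "D": 0.1}    # hypothetical transitions into y
print(fuzzy_forward_step(alpha_t, h_to_y, emit_y=0.5))
```

Setting every ρ to 1 would recover the classical sum over predecessor states; here ρ instead reweights each predecessor according to how much it adds to the measure of the states ranked at or above it.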

Fuzzy backward algorithm

The fuzzy backward variable is a conditional fuzzy measure that captures the fuzziness of the observation resulting from a visit to state Z. Equation 42 gives the fuzzy backward variable obtained by integrating with the Choquet integral with respect to any fuzzy measure, with multiplication as the intersection operator. We reformulate the backward variables for the kth match, insert, and delete states using the possibility measure (Equations 43-45).

Fuzzy Baum-Welch re-estimation algorithm

After formulating the forward and backward variables, we extend fuzzification to the parameter estimation methods for the profile HMM. The emission and transition matrices of the fuzzy profile HMM are re-estimated by computing all of their elements as given by Equations 46-50. The transition parameters for insert and delete states can be calculated similarly.

Fuzzy Viterbi algorithm

The classical Viterbi algorithm can be modified using fuzzy measures to compute the Viterbi variable at position t for state Z, where q_t represents the state visited at position t, emitting amino acid residue O_t; this state can be a match, insert, or delete state, represented by Z. The maximization is modified using the fuzzy measure difference density ρ, together with the initial state fuzzy density, to obtain the fuzzy Viterbi algorithm. The fuzzy Viterbi variable is computed recursively over the entire length of the sequence as shown in Algorithm 2. Based on Equation 1 and Algorithm 1, the fuzzy Viterbi variables for the kth match, insert, and delete states can be formulated in the same way.

Algorithm 2. Fuzzy Viterbi algorithm

1. Initialization for the N match, insert, and delete states.
2. Recursion through the length of the sequence T for the N states.
3. Termination conditions.
4. Backtracking for the optimal path.

Computational complexity analysis

In a classical (HMMER) profile model with N states, the forward variables and the Viterbi algorithm have a computational complexity of the order of O(N²T) in time for a protein sequence of length T. At any instant of time, transitions occur to the kth M state only from the (k−1)th M, I, and D states. Similarly, D and I states also have only three incoming transitions, which reduces the computational complexity to O(3T). In the fuzzy profile model, the computational complexity is of the order of O((2)T) since 2 subsets are computed at each state during the forward variable calculation. The computational complexity for the fuzzy profile model can be reduced to O(7T). The fuzzy model is computationally more expensive than the classical model, but this cost is offset by improved accuracy of family identification and biologically more significant alignments. As the primary goal is to improve accuracy, computational complexity is a secondary issue, since these computations are carried out offline.

Evaluation

We evaluated the performance of the fuzzy profile HMM using sequences of widely studied globin and kinase families, and compared the results with those of the HMMER profile model.

Evaluation on globins

Globins are part of a large family of heme-containing proteins involved in the storage and transport of oxygen that have different oligomeric states and overall architectures [18, 19]. They are responsible for binding and/or transporting oxygen. The major groups of globins are hemoglobins and myoglobins from vertebrates and invertebrates, leghemoglobins from plants, and flavohemoglobins from bacteria. Hemoglobin is a protein responsible for transporting oxygen from the lungs to other tissues, and is a tetramer of two α chains and two β chains. We extracted the globin sequences from the SWISS-PROT database by searching for the keyword “globin”. The globin dataset used in the evaluation consists of 625 different globin sequences. These sequences also belong to the Pfam [21, 22] domains with accession numbers PF00042, PF0152, PF01099, and PF06438. The sequences vary in length from 109 to 428 amino acids. The globin dataset was divided into 12 random folds. The model parameters were trained and optimized using one of the folds, and the remaining folds were used as the test dataset. To incorporate noise, 1,953 non-globin sequences were added to the test dataset. The non-globin sequences varied in length from 25 to 350 amino acids and were obtained from the SWISS-PROT database. The match of sequences to the classical and fuzzy profile HMMs was scored using log-odds scores (defined later). The alignments for globin sequences were obtained through the fuzzy and classical Viterbi algorithms. The classical Viterbi algorithm was used to align the sequences to the globin profile model based on HMMER. The estimation of both fuzzy and classical model parameters was done 12 times, and the models with the highest overall log-likelihood scores were selected. Similar cross-fold validations have been carried out in earlier studies. The performance of the fuzzy model was evaluated and compared with that of the HMMER profile model based on estimation comparison and Z-score plots.
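The fold scheme described above, in which a single fold is used for training and the remaining folds for testing, can be sketched as follows. This is a minimal illustration: the function name `single_fold_splits`, the seed, and the use of integer identifiers in place of sequences are assumptions, and the decoy (non-globin) sequences would be appended to each test set separately.

```python
import random


def single_fold_splits(items, k, seed=0):
    """Yield (train, test) pairs where one of k random folds is the training set."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]   # k nearly equal folds
    for i, train in enumerate(folds):
        test = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# 625 sequence identifiers in 12 folds give training sets of 52 or 53 sequences.
for train, test in single_fold_splits(range(625), 12):
    print(len(train), len(test))
```

Note that this inverts the usual k-fold convention, where k − 1 folds are used for training; here the small training set probes how well each model learns from limited data.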

Estimation comparison

The transition parameters were obtained using the classical and fuzzy Baum-Welch re-estimation methods on the 50 globin sequences used for training. Figures 2 and 3 depict the converged transition probabilities for all match states of the classical and fuzzy profile HMMs, respectively. The transition probabilities from the kth match state to the (k + 1)th match state, the kth insert state, and the (k + 1)th delete state are shown in the top, middle, and bottom subgraphs, respectively. The transition diagrams show that the classical and fuzzy HMMs learn different values at some specific points, indicating a difference in their behavior. We further observed the following behaviors of the transition matrices in the classical and fuzzy profile HMMs.

Fig. 2

Distribution of match (M) state transitions for the classical profile HMM trained by the classical Baum-Welch estimation algorithm.

Fig. 3

Distribution of match (M) state transitions for the fuzzy profile HMM trained by the fuzzy Baum-Welch estimation algorithm.

The classical profile HMM has 5 transition probabilities from the kth match state to the kth insert state with values greater than 0.05 before state number 60 is reached (Figure 2). This indicates that the classical HMM has more insertions than the fuzzy profile model, which has only 3 transition probabilities greater than 0.05 in the same region (Figure 3). The larger number of insert transitions in the classical HMM also indicates that it has more transition probabilities from the kth insert state to the (k + 1)th match state. The transitions from delete states behave similarly in the classical and fuzzy profile models. The emission parameters were also obtained using the classical and fuzzy Baum-Welch re-estimation methods on the 50 globin sequences used for training. Figures 4 and 5 show the emission probability distributions of the 20 amino acids at different match states (25, 50, 100, and 125) obtained by the classical and fuzzy models, respectively. At match state 125, histidine has the highest emission probability in the classical model, while serine has the highest in the fuzzy profile model. This difference in emission distributions contributes to a different consensus alignment.

Fig. 4

Emission distribution of 20 amino acids for match states (25, 50, 100, 125) of the classical profile HMM trained by the classical Baum-Welch estimation algorithm.

Fig. 5

Emission distribution of 20 amino acids for match states (25, 50, 100, 125) of the fuzzy profile HMM trained by the fuzzy Baum-Welch estimation algorithm.

Z-score plot

The log-odds score is the logarithm of the ratio of the probability of the most probable alignment Π* of the sequence R under a given model λ to the probability of R under a null random model (ϒ). The probability of R under the random model, which assumes that the underlying sequences are unrelated, is given by the simple product of the frequencies of its residues. This ratio provides a significance assessment of log-odds scores. The amino acid frequencies of the sequences in the training dataset are used for ϒ. To calculate the Z-score, a smooth curve is fitted to the plot of log-odds score against sequence length using the local window technique. A standard deviation is estimated for each length, and the Z-score of each score is its distance from the curve in units of standard deviation. The normalized Z-score plots for the classical (HMMER) and fuzzy profile models are shown in Figures 6 and 7, respectively. For the fuzzy model, all the globin member sequences (from both training and test datasets) are clustered with normalized Z-scores between 2.0 and 8.0. This is mainly because of the max operation performed by the possibility measure. For the HMMER profile model, the member sequences are scattered sparsely, with normalized Z-scores varying from 1.0 to 8.0. There is also a greater overlap between globins and non-globins in the HMMER profile model than in the fuzzy model. In the HMMER profile model, 3 globins overlap with non-globins at Z-scores between 1.0 and 2.0. In contrast, the fuzzy model has no globins in this range. Figures 8 and 9 show the plots of sensitivity and specificity with respect to Z-scores for both the classical and fuzzy profile models. The plots demonstrate that the fuzzy model performs better than the classical model.
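The scoring steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the window half-width, and the use of a plain local mean as the "smooth curve" are hypothetical choices, and the model log-probability is taken as an input rather than computed from a profile HMM.

```python
import math
from statistics import mean, pstdev


def null_log_prob(seq, freqs):
    """Log-probability of seq under the null model: product of residue frequencies."""
    return sum(math.log(freqs[r]) for r in seq)


def log_odds(model_log_prob, seq, freqs):
    """Log of the ratio between the model probability and the null probability."""
    return model_log_prob - null_log_prob(seq, freqs)


def z_scores(lengths, scores, half_window=20):
    """Distance of each score from a local mean, in local standard deviations.

    The local window contains every sequence whose length lies within
    half_window residues of the current sequence.
    """
    result = []
    for length, score in zip(lengths, scores):
        local = [s for l, s in zip(lengths, scores) if abs(l - length) <= half_window]
        mu, sd = mean(local), pstdev(local)
        result.append((score - mu) / sd if sd > 0 else 0.0)
    return result


freqs = {"A": 0.5, "C": 0.5}                 # hypothetical background frequencies
print(log_odds(-1.0, "AC", freqs))
print(z_scores([100, 105, 110, 102], [1.0, 2.0, 3.0, 10.0]))
```

In practice the curve would usually be fitted to the bulk of non-member scores, so that true family members stand out as large positive Z-scores.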

Fig. 6

Normalized Z-score obtained for globin and non-globin sequences from the classical profile HMM (HMMER) against protein chain length.

Fig. 7

Normalized Z-score obtained for globin and non-globin sequences from the fuzzy profile HMM against protein chain length.

Fig. 8

Sensitivity of globin and non-globin sequences against Z-score for the fuzzy and classical profile HMMs.

Fig. 9

Specificity of globin and non-globin sequences against Z-score for the fuzzy and classical profile HMMs.

Evaluation on kinases

We repeated the evaluation on the kinase family. Kinases are enzymes belonging to a very extensive family of proteins that share a conserved catalytic core common to both serine/threonine and tyrosine kinases. They are responsible for transferring a phosphate group from a phosphate donor onto an acceptor amino acid in a substrate protein. Kinases have been extensively studied by Taylor and by Krogh. Complete protein kinase catalytic domains range from 250 to 300 residues. The kinase dataset used in this study consists of 126 sequences with 72 representative sequences. A total of 1,141 non-kinase sequences extracted from the SWISS-PROT database were included in the test dataset. The kinase sequences range in length from 100 to 800 amino acids, and the profile model was built using 5-fold cross validation. The Z-score plots in Figures 10 and 11 for the classical and fuzzy kinase profile models, respectively, show trends similar to those observed in the globin family.

Fig. 10

Normalized Z-score obtained for kinase and non-kinase sequences from the classical profile HMM (HMMER) against protein chain length.

Fig. 11

Normalized Z-score obtained for kinase and non-kinase sequences from the fuzzy profile HMM against protein chain length.

Conclusion

We have proposed a fuzzy profile HMM based on Choquet integrals and Sugeno fuzzy measures to overcome the limitations of statistical independence in classical HMMs and to achieve improved alignments and better log-odds scores for the sequences belonging to a given family. The estimation of Choquet integrals takes into account the ascending values of the scores of the neighboring states when calculating the score for a given state, and hence provides better alignments and improved log-odds scores and Z-scores. The proposed fuzzy profile HMM was tested on the globin and kinase families and compared with the classical profile model. The results obtained establish the superiority of the fuzzy profile HMM. In addition, the fuzzy profile model produces more accurate, biologically significant alignments than the classical model because it relaxes the statistical independence assumption. In future work, we will try to make the fuzzy measures more effective by taking into account the relative importance of the biological and physicochemical factors of the family.

Authors’ contributions

NPB conceived the initial idea of the proposed approach, collected the datasets, conducted the experiments, and prepared the manuscript. MC and JK refined the idea, supervised the project, and revised the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors have declared that no competing interests exist.

O	protein sequence
T	length of the protein sequence
N	profile model length
Ω	set of protein sequences of the family
X	finite set of states at position t
Y	finite set of states at position t + 1
Z	states {Z₁, Z₂, ···, Z_N}
π̂_Z(·)	initial state fuzzy measure
π̂_Z({Z_i})	initial state fuzzy density
b̂_j(O_t)	symbol fuzzy density
â_y(·|X)	transition fuzzy measure
â_y(y_j|x_i)	transition fuzzy density
q_t	state visited at position t

10 in total

1. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003.

Authors: Brigitte Boeckmann; Amos Bairoch; Rolf Apweiler; Marie-Claude Blatter; Anne Estreicher; Elisabeth Gasteiger; Maria J Martin; Karine Michoud; Claire O'Donovan; Isabelle Phan; Sandrine Pilbout; Michel Schneider
Journal: Nucleic Acids Res Date: 2003-01-01 Impact factor: 16.971

2. Amino acid substitution matrices from protein blocks.

Authors: S Henikoff; J G Henikoff
Journal: Proc Natl Acad Sci U S A Date: 1992-11-15 Impact factor: 11.205

3. Profile hidden Markov models.

Authors: S R Eddy
Journal: Bioinformatics Date: 1998 Impact factor: 6.937

4. Pfam: a comprehensive database of protein domain families based on seed alignments.

Authors: E L Sonnhammer; S R Eddy; R Durbin
Journal: Proteins Date: 1997-07

5. Identification of protein sequence homology by consensus template alignment.

Authors: W R Taylor
Journal: J Mol Biol Date: 1986-03-20 Impact factor: 5.469

6. Stochastic models for heterogeneous DNA sequences.

Authors: G A Churchill
Journal: Bull Math Biol Date: 1989 Impact factor: 1.758

7. Profile analysis: detection of distantly related proteins.

Authors: M Gribskov; A D McLachlan; D Eisenberg
Journal: Proc Natl Acad Sci U S A Date: 1987-07 Impact factor: 11.205

8. Determinants of a protein fold. Unique features of the globin amino acid sequences.

Authors: D Bashford; C Chothia; A M Lesk
Journal: J Mol Biol Date: 1987-07-05 Impact factor: 5.469

9. Hidden Markov models in computational biology. Applications to protein modeling.

Authors: A Krogh; M Brown; I S Mian; K Sjölander; D Haussler
Journal: J Mol Biol Date: 1994-02-04 Impact factor: 5.469

10. The Pfam protein families database.

Authors: Alex Bateman; Ewan Birney; Lorenzo Cerruti; Richard Durbin; Laurence Etwiller; Sean R Eddy; Sam Griffiths-Jones; Kevin L Howe; Mhairi Marshall; Erik L L Sonnhammer
Journal: Nucleic Acids Res Date: 2002-01-01 Impact factor: 16.971
