Consensus protein design.

Abstract

A popular and successful strategy in the semi-rational design of protein stability is the use of evolutionary information encapsulated in homologous protein sequences. Consensus design is based on the hypothesis that, at a given position, the consensus amino acid contributes more to the stability of the protein than non-consensus amino acids do. Here, we review the consensus design approach, its theoretical underpinnings, successes, limitations and challenges, and provide a detailed guide to its application in protein engineering.

Keywords: consensus design; multiple sequence alignment; protein stability; semi-rational design; statistical sequence analysis; thermostability

Substances: Proteins

Year: 2016 PMID: 27274091 PMCID: PMC4917058 DOI: 10.1093/protein/gzw015

Source DB: PubMed Journal: Protein Eng Des Sel ISSN: 1741-0126 Impact factor: 1.650

Introduction

Directed evolution and informatics-based rational design are transforming the field of protein engineering (Arnold and Volkov, 1999; Jiang ; Lutz, 2010; Brustad and Arnold, 2011; Bornscheuer ; Joh ; Woolfson ). Semi-rational or knowledge-based hybrid approaches, which mix rational design with directed evolution schemes, to create small libraries of very high quality, have gained substantial momentum (Patrick and Firth, 2005; Lutz, 2010; Wijma , 2014; Magliery, 2015). Typically, information from protein structure, function, sequence homology and predictive computational algorithms are combined to preselect sites for focussed mutagenesis with limited amino acid diversity. This focus translates into dramatically reduced library sizes with a major increase in functional content, allowing for a more efficient sampling of sequence space. A popular strategy in semi-rational design of stability is the use of evolutionary information encapsulated in homologous protein sequences. Multiple sequence alignments (MSAs) and phylogenetic analyses have become standard tools for exploring sequence conservation (Steipe ) and ancestral relationships (Pauling ; Yang ; Thornton ; Thornton, 2004) amongst protein homologues. Such sequences and alignments can be acquired from natural sequence databases (UniProt Consortium, 2008; NCBI Resource Coordinators, 2014), curated alignment databases (Sigrist , 2013; Wilson ; Finn ) and neutral drift experiments (Bershtein ; Jäckel ). Consensus design, like ancestral sequence reconstruction, utilises evolutionary history; however, rather than inferring phylogenetic hierarchy, all sequences are aligned, and the most frequently observed amino acid is identified at each position in the alignment (Fig. 1) (Steipe ). 
The consensus design approach has been widely successful in improving the stabilities of functional and non-functional proteins, for example increasing melting temperatures by 10–32°C (Wirtz and Steipe, 1999; Lehmann and Wyss, 2001; Lehmann ; Dai ; Lutz, 2010; Magliery ; Magliery, 2015; Porebski ; Paatero ). However, only ∼50% of conserved residues are associated with improved stability, with ∼10% being stability neutral and ∼40% being destabilising, leading to challenges and trade-offs during implementation (Steipe ; Nikolova ; Wang ; Lehmann , 2002; Lehmann and Wyss, 2001; Polizzi ; Khersonsky ).

Fig. 1

Sequence alignment of 12 WW domains across several species and parent proteins. In the consensus, a ‘−’ is a gap, whilst a ‘+’ is an ambiguous position with no consensus. The most conserved residues are highlighted.

Consensus design involves the following four steps: (1) identification of a domain to be targeted (for example, boundaries within a larger sequence context), (2) acquisition and pre-processing of homologous sequences, (3) iterative assessment of several MSA regimes and removal of disruptive sequences, and (4) calculation of sequence conservation. Application of sequence conservation is typically performed in one of three ways. First, single- or multiple-point mutations of the most conserved amino acid positions can be made to a target protein (Schreiber ; Nikolova ; Wang ; Polizzi ; Ferreiro ), and these mutations may further be filtered or weighted by other statistical or computational methods (Socolich ; Polizzi ). Second, full-length sequences can be created de novo, avoiding the problem of identifying residues that are truly stabilising (Pantoliano ; Blatt ; Lehmann ; Dai ; Vazquez-Figueroa ; Sullivan ; Jacobs ; Porebski ). Third, conserved residues and positions can be spiked or targeted in directed evolution studies to increase sampling of functionally relevant sequence space (Amin ; Socolich ; Bershtein ; Case and Hackel, 2016). The strategy of implementation is highly dependent on requirements and available resources; however, all approaches have seen impressive results, with an exhaustive catalogue of consensus-designed proteins shown in Supplementary Table S1.
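The core calculation in step (4), taking the most frequently observed amino acid at each alignment position, can be illustrated with a minimal Python sketch. It follows the Fig. 1 convention of ‘-’ for a gap and ‘+’ for a position with no consensus. The function name `consensus`, the optional `min_fraction` threshold, the rule that ties are ambiguous and the toy sequences are all assumptions made for illustration; they are not taken from any published protocol.

```python
from collections import Counter

def consensus(msa, min_fraction=0.0, gap="-", ambiguous="+"):
    """Return a consensus string for a list of equal-length aligned sequences.

    A column is reported as `ambiguous` when the top two symbols tie, or when
    the most common symbol falls below `min_fraction` (an assumed, optional
    cut-off). A column dominated by gaps yields the gap character.
    """
    if not msa:
        raise ValueError("alignment is empty")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    result = []
    for column in zip(*msa):
        ranked = Counter(column).most_common()
        top, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if tied or top_count / len(column) < min_fraction:
            result.append(ambiguous)
        else:
            result.append(top)
    return "".join(result)

# Toy alignment (made-up sequences, not real WW domains).
toy_msa = ["GLPPGWEE", "PLPPGWEK", "GLPAGWE-", "-LPEGWEM"]
print(consensus(toy_msa))  # -> GLPPGWE+
```

The last column has four different symbols, so it is reported as ‘+’. In practice a threshold or a weighting scheme would be chosen per project.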

Factors to consider during consensus design

Acquisition of homologous sequences

The most efficient way to acquire sequences is via sequence alignment databases such as Pfam (Sonnhammer ; Finn ), Prosite (Sigrist ), SMART (Letunic ) and Superfamily (Wilson ). These databases contain small, manually curated seed alignments for the development of hidden Markov model (HMM) profiles (Finn ) or motif-specific rules and patterns (Sigrist ), which can then be applied to larger collections such as the UniProtKB/Swiss-Prot (UniProt Consortium, 2008), Protein Data Bank (PDB) and NCBI (NCBI Resource Coordinators, 2014) sequence databases. If a protein target is not well represented in existing alignment databases, the best approach is to query the UniProtKB or NCBI sequence databases for a small number (<20) of the most homologous sequences, which can be used to curate a representative alignment that can be subjected to HMM profiling with the HMMER suite (Finn ), and subsequently align to more distant homologues. In the unfortunate instance that the target protein is minimally or not represented in the UniProtKB or NCBI sequence databases, the next option is to generate diversity through the use of neutral drift studies (Bershtein ; Jäckel ). In neutral drift experiments, a target protein is subjected to rounds of random mutagenesis and selected purely on whether it is folded and/or functional, and is then sequenced. This approach was used as a means of generating unbiased sequence diversity in the consensus design of a chorismate mutase, showing the method to be successful with fewer than 30 selected sequences (Jäckel ).

Homology

The effect of sequence homology on consensus design is poorly understood and highly likely to be a function of the target protein's biophysical properties, evolutionary history and the taxonomic representation in sequencing databases. Theoretically, inclusion of evolutionarily distant or diverse sequences should improve the probability of identifying more conserved features, as increased distance may imply increased sampling of sequence space. Although there are reports that too little (Jacobs ) and too much diversity of the input MSA is problematic (Parmeggiani ; Jäckel ; Sullivan ), this area has not been thoroughly explored. Determining the right amount of diversity is challenging. Sullivan et al. noted this in the consensus design of a triosephosphate isomerase (TIM) using comparisons of two Pfam alignments from database versions 18 and 22 (Sullivan ). The input sequences of version 18 were a roughly even mixture of bacterial and eukaryotic sequences, resulting in a weakly active and poorly folded consensus protein. However, the version 22 alignment was composed of predominantly bacterial sequences and resulted in a well-folded and fully active protein (Sullivan ). To further complicate matters, the version 22 sequences were filtered to be roughly the same length, and duplicate entries were removed, which may have had other effects and therefore reduced the general applicability of this approach. Highly successful designs such as FN3con (Porebski ) and cLRRTM2 (Paatero ) used sequences that were predominantly or exclusively from higher-order eukaryotes without the need to filter based on sequence length, suggesting that spanning the MSA over taxonomic domains or kingdoms may negatively affect results. Parmeggiani also observed a similar problem as a result of too broad protein family selections of armadillo repeat proteins (Parmeggiani ). 
However, rather than filtering or removing sequences, they subclassified their MSA into closer taxonomic groups and combined conserved residues from each subclassification, resulting in a well-expressed and stable protein. Extending sequence homology too far may result in poor conservation, which can prevent accurate alignment and lead to design failure. For example, sequence conservation within the β-defensin family is <33%, even though structural similarity is very high (Bauer ). Here, alignment within a specific species is challenging (Rost, 1999), and consensus design would likely be impossible using natural sequences—leaving neutral drift studies as the only solution for generating homology and a chance of successful alignment (Jäckel ). Managing homology is therefore a balance between sequence similarity, which is good for computing a MSA, and sequence diversity, which provides a greater coverage of sequence space that can be sampled during design.
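One simple way to put a number on the similarity side of this balance is the mean pairwise identity of the alignment. The sketch below is a hedged illustration only: the helper names are hypothetical, and identity is counted over columns where neither sequence has a gap, which is one of several common conventions.

```python
from itertools import combinations

def pairwise_identity(a, b, gap="-"):
    """Fraction of identical residues over columns where neither sequence is gapped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)

def mean_pairwise_identity(msa):
    """Average identity over all sequence pairs in the alignment."""
    if len(msa) < 2:
        raise ValueError("need at least two aligned sequences")
    scores = [pairwise_identity(a, b) for a, b in combinations(msa, 2)]
    return sum(scores) / len(scores)

# Toy alignment (made-up sequences).
print(mean_pairwise_identity(["AAAA", "AAAT", "AATT"]))
```

A very low value hints that the family may be too diverse to align reliably, while a very high value suggests that little sequence space is being sampled. Where those limits lie is not established in the text above.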

Bias

In contrast to diversity, the weighting or skew of the MSA may bias consensus design towards a predominant clade, such as a taxon, species or protein classification (Jäckel ). This is typically the result of preferences in genome sequencing projects, which tend to over-represent particular species or proteins in sequence databases. Bias is more likely to be an issue for domains, motifs or repeat proteins that are found within larger proteins. In some instances, bias may be intentional, for example to preserve functional networks of a protein family from a single species or subclassification. In the interest of purely identifying robustness and stability, it is reasonable to assume that bias and over-representation should generally be avoided, as they may mask conserved and possibly stabilising features from other, less represented evolutionary lineages. Bias reduction of natural sequences can be performed with relative ease using the sequence clustering software CD-HIT (Huang ) or by using likelihood-based methods to account for phylogeny (Bloom and Glassman, 2009).
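The idea behind redundancy reduction can be sketched as a greedy filter that keeps a sequence only if it is sufficiently different from every sequence already kept. This is a simplified illustration on pre-aligned sequences and is not CD-HIT's implementation; the function names and the 90% threshold are assumptions.

```python
def identity(a, b, gap="-"):
    """Fraction of identical residues over columns where neither sequence is gapped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    return sum(x == y for x, y in pairs) / len(pairs) if pairs else 0.0

def reduce_redundancy(msa, threshold=0.9):
    """Greedy filter: keep a sequence only if its identity to every
    previously kept representative is below `threshold`."""
    representatives = []
    for seq in msa:
        if all(identity(seq, rep) < threshold for rep in representatives):
            representatives.append(seq)
    return representatives

# Toy input (made-up sequences): an exact duplicate and a 90% identical
# variant are dropped, the unrelated sequence is kept.
toy = ["ACDEFGHIKL", "ACDEFGHIKL", "ACDEFGHIKM", "MNPQRSTVWY"]
print(reduce_redundancy(toy))  # -> ['ACDEFGHIKL', 'MNPQRSTVWY']
```

The result depends on input order, so sorting the input (for example longest first) before filtering is a common design choice. For real data sets a dedicated clustering tool is the better option.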

Sequence count

One of the key advantages of consensus design over other sequence-based methods is its ability to identify stability-enhancing mutations from a MSA with as few as four members. Examples include Subtilisin BPN’ (from 4 members) (Pantoliano ) and FN3 repeats (15 members) (Jacobs ). In the latter study, the top 10 most stable sequences were less successful in promoting thermodynamic stability than all 15 members (Jacobs ), demonstrating that even the less stable sequences contribute to the overall stability of the resulting consensus design. In this case, more sequences provide greater diversity, thus improving the signal-to-noise ratio and therefore the detection of conserved residues in weakly conserved regions. This effect is exemplified by recent consensus designs using very large alignments, such as FN3con (2123 sequences, ΔTm of >27°C) (Porebski ) and cLRRTM2 (6271 sequences, ΔTm of 32°C) (Paatero ).

Quality of the sequence alignment

Difficulties arise with MSAs containing sequences of varying length, or when there are clusters of sequences that are locally, but not globally, homologous (Rost, 1999; Pearson, 2013). Large insertions and deletions between members can affect the identification of weakly conserved positions, for example resulting in the design of a weakly active and poorly folded protein (Sullivan ). In this specific case, filtering the homologous sequences to be roughly the same length and removing duplicate entries greatly improved the design, resulting in a well-folded and fully active protein. Interestingly, sequence differences between the ‘raw’ and the filtered design were predominantly in non-conserved stretches of the protein. By sequence assessment alone, there was no obvious reason why these differences resulted in vastly different biophysical properties. It is therefore possible that filtering sequences to those that are more homologous improved the alignment, allowing better identification of weakly conserved residues (Sullivan ). Generating a ‘good’ MSA can be difficult and may actually be considered more art than science (Morrison, 2015). Unfortunately, MSA methods tend to vary significantly, and there is currently no quantitative measure for the quality of alignment (Nuin ; Kemena and Notredame, 2009; Pearson, 2013). This is further compounded by homology, bias and sequence count, and their convoluted interplay with the particular evolutionary history of a target protein and its family. Therefore, it is highly recommended to carefully examine resulting alignments prior to consensus design, possibly with an overlay of secondary structure to gauge conservation boundaries and gaps (Durani and Magliery, 2013). 
Iterative rounds of phylogenetic assessment and sequence pruning can improve alignment quality, which should be inspected for aligned columns that correspond with structural motifs or secondary-structure elements that have few insertions, deletions and gaps.
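Two of the checks described above, filtering input sequences to a similar length with duplicates removed and inspecting alignment columns for gaps, can be sketched as follows. The function names, the median-based length rule and the 20% tolerance are illustrative assumptions, not the criteria used in the cited studies.

```python
from statistics import median

def filter_by_length(sequences, tolerance=0.2):
    """Drop exact duplicates, then drop sequences whose length differs from
    the median length by more than `tolerance` (as a fraction of the median)."""
    unique = list(dict.fromkeys(sequences))  # keeps first-seen order
    if not unique:
        return []
    mid = median(len(s) for s in unique)
    return [s for s in unique if abs(len(s) - mid) <= tolerance * mid]

def gap_fraction_per_column(msa, gap="-"):
    """Fraction of gap characters in each column of an alignment."""
    return [col.count(gap) / len(col) for col in zip(*msa)]

# Toy unaligned input (made-up): one duplicate, one fragment, one long outlier.
raw = ["ACDEFGHIKL", "ACDEFGHIKL", "ACDEFGHIK", "ACD", "ACDEFGHIKLMNPQRSTVWY"]
print(filter_by_length(raw))  # -> ['ACDEFGHIKL', 'ACDEFGHIK']

# Toy alignment (made-up): columns with many gaps deserve a closer look.
print(gap_fraction_per_column(["AC-E", "A--E", "ACDE", "ACDE"]))  # -> [0.0, 0.25, 0.5, 0.0]
```

Columns with a high gap fraction that fall inside a secondary-structure element are a reasonable place to start when deciding which sequences to prune in the next iteration.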

Statistical enhancements to consensus design

It is intriguing that consensus design is successful despite its assumption of amino acid independence, ignoring the known importance of cooperativity and coupling of amino acids (Horovitz and Fersht, 1992; Matthews, 1993). Furthermore, successes rival and often exceed those of rational design and directed evolution, which is impressive given the relative ease with which consensus design can be performed. Coupling manifests as simple pairwise interactions, through to dense and complex inter-atomic networks (LiCata and Ackers, 1995; Chen and Stites, 2001; Luque ). For consensus design to work, coupling must be encoded into the evolutionary history and represented by amino acid conservation to some extent, which might explain why ∼40% of reported consensus mutations are destabilising (Steipe ; Nikolova ; Wang ; Lehmann , 2002; Lehmann and Wyss, 2001; Polizzi ; Khersonsky ). Attempts to improve consensus design have typically utilised additional statistical analysis that identifies coupling or covariation (Göbel ; Lockless and Ranganathan, 1999; Atchley ; Socolich ; Talavera ) and have generally been very successful in the engineering of stability (Magliery and Regan, 2004; Ozer and Ray, 2006; Sullivan , 2012; Durani and Magliery, 2013). In the design of a WW domain by statistical coupling analysis (SCA), the inclusion of both conserved and coupled mutations was necessary, as consensus design alone was insufficient to create a protein that folded correctly (Socolich ). However, two previous studies had no difficulty in generating folded and stable WW domains (Macias ; Jiang ), suggesting that failure of consensus design may have been a result of the MSA composition rather than a limitation of the WW domain itself. 
Another approach used the mutual information method to calculate the pairwise statistical interactions between positions in the MSA and chose to avoid making mutations to those positions, thereby improving the accuracy of identifying stabilising mutations from ∼50 to 90% (Sullivan ). However, this approach may not always be necessary as the pairwise covariation within and between ankyrin repeat motifs was found to be well represented by consensus design alone (Mosavi ). The role of covarying residues is even less understood than those of consensus mutations, although it appears that in some instances conserved residues encode most, if not all cooperativity. Therefore, consensus design and its enhancement by filtering correlated residues are dependent on how well the cooperativity is encoded into conserved residues, and whether other such correlations are statistically discernible from the alignment. Consensus design also appears to suffer when there are incompatible conserved residues and couplings as a result of divergent evolution, although this can be corrected by covariation methods (Magliery and Regan, 2004; Socolich ; Talavera ). However, covariation methods may not work in all scenarios; they typically require large MSAs to discern mutual amino acid dependencies (Socolich ; Talavera ) and are not applicable in situations where neutral drift studies are required, due to the rare event of coevolution. Interestingly, covaried residues in many cases actually have no physiospatial interactions with one another, recently sparking debate over what these methods are actually measuring (Talavera ). Covarying substitutions are often found on different branches of the phylogenetic tree and are perhaps independent events that may or may not be attributable to molecular coevolution (Talavera ). 
In the case of consensus design, highly conserved residues tend to be found within the protein core, evolve slowly and are therefore unlikely to be detected by covariation analysis even in very large alignments (Zvelebil ; Bartlett ; Talavera ). Regardless, covariation methods overall do seem to have utility and appear to generally identify favourable pairs of residues that can be used on their own and in conjunction with consensus design.
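The mutual information between two alignment columns can be computed directly from observed symbol frequencies, as in the sketch below. This is the plain textbook estimate, without the sample-size or phylogeny corrections that practical covariation methods usually add; the function name and toy alignments are assumptions, and gaps are simply treated as another symbol.

```python
import math
from collections import Counter

def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment,
    estimated from raw observed frequencies."""
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi

# Toy alignments (made-up two-column sequences).
print(mutual_information(["AK", "AK", "DE", "DE"], 0, 1))  # perfectly covarying
print(mutual_information(["AK", "AE", "DK", "DE"], 0, 1))  # independent
print(mutual_information(["AK", "AE", "AK", "AE"], 0, 1))  # column 0 fully conserved
```

The third case gives zero because a fully conserved column carries no variation to correlate, which matches the point above that highly conserved core positions tend to be invisible to covariation analysis.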

Engineering thermodynamic stability

The origin of consensus mutant stabilisation is currently described as follows: at a given position in a MSA of homologous proteins, the consensus amino acid contributes, on average, more to the stability of the protein than non-consensus amino acids do (Fig. 1) (Steipe ; Nikolova ; Wang ; Lehmann , 2002; Lehmann and Wyss, 2001; Polizzi ; Khersonsky ). That is, a conserved residue is more likely to be stabilising than a random mutation at that same position (Polizzi ; Tokuriki ; Tokuriki and Tawfik, 2009a). However, this does not explain why conserved residues are more likely to be stabilising. A possible explanation is that as proteins evolved from a non-specialised but stable common ancestor, evolutionary drift allowed for the sampling of different stabilising mutations needed for adequate stability. Through the evolution of specialist function, many proteins now exist on a knife-edge of stability and function (Shoichet ; Tokuriki ; Tokuriki and Tawfik, 2009a, 2009b); for this reason, stabilising residues tend to be conserved. Consensus design is therefore able to leverage millions of years of evolution and identify stabilising features from numerous protein homologues, amalgamating mostly additive mutations that no single protein has needed to amass. Much of the discussion about consensus design focusses primarily on the general trend of improving thermostability (Lehmann and Wyss, 2001; Lutz, 2010; Magliery ; Magliery, 2015). Indeed, consensus design reports a wide range of improvements to melting temperature, from the modest increase of 6.1°C for the marginally stable antibody VH domain (Tm of 36.4°C) (Wirtz and Steipe, 1999) and of 5.5°C for the highly stable Azami green fluorescent protein (Tm of ∼90°C) (Dai ), through to the large increase of 32°C for the moderately stable mouse Leucine Rich Repeat Transmembrane Neuronal 2 (LRRTM2) (Tm of ∼50°C) (Paatero ). 
However, improvements to thermodynamic stability are not necessarily the only observed effects of consensus design.

Protein evolvability

Proteins are often mutationally robust, with more than half of random single-point mutants retaining native function (Bloom ; Tokuriki ; Tokuriki and Tawfik, 2009a). However, extra thermodynamic stability is known to increase the robustness of the native structure to random mutations by increasing the fraction of variants that continue to possess the minimal stability required to fold (Nikolova ; Bloom , 2006). The mechanism by which this occurs is not fully understood, although it is thought to involve a combination of raw stability and ‘global suppressor’ residues that buffer the effect of deleterious mutations (Poteete ; Steipe, 1999; Nikolova ; Tokuriki ). In the context of consensus design, raw stability is definitely observed (Supplementary Table S1); however, without extensive mutagenesis studies, it is unclear whether conserved residues confer global suppressor-like properties. Given that global suppressor residues appear to be transferable across protein homologues, as is the case in TEM β-lactamases (Brown ), it is reasonable to suggest that conserved residues which happen to be global suppressors will induce similar effects when introduced by consensus design. Consensus design has been used to enhance the evolvability of a computationally designed Kemp eliminase (KE59) (Khersonsky ). Optimisation of activity by directed evolution was initially desired; however, the stability of KE59 was insufficient to tolerate mutations, rapidly producing unfolded proteins, thereby trapping the evolutionary trajectory in a local minimum. To boost KE59's evolvability, conserved residues were spiked into the directed evolution library, thereby improving protein stability and allowing for fresh downhill evolution of function. A similar result was also reported for the directed evolution of a consensus-designed ankyrin repeat protein (DARPin) for binding to HER2 (Zahnd ). 
These studies therefore demonstrate the capacity for consensus design to provide stabilising features for downstream engineering studies.

Engineering the energy landscape

Protein folding and kinetic stability are often overlooked in protein design projects, because many proteins exhibit irreversible folding on denaturation and because of the complexities of studying multistate folding pathways (Sanchez-Ruiz, 2010). However, thermodynamic stability alone does not guarantee that the protein will fold or remain folded in the native state for extended periods of time under biological or arduous industrial conditions. In vivo, the biological function of many proteins requires a rugged energy landscape, which puts them at risk of misfolding and aggregation (Dinner ; Dobson, 2003; Ferreiro ; Sanchez-Ruiz, 2010; Gershenson ; Gianni ). The delicate balance between function and misfolding is exemplified by members of the serine protease inhibitor or serpin superfamily (Gettins, 2002; Lomas and Carrell, 2002; Law ; Krishnan and Gierasch, 2011). Inhibitory members fold to a metastable native state that undergoes a major conformational change in order to inhibit target proteases (Huntington ). As such, serpins have evolved a relatively complicated folding mechanism required for their function, with sequence and structural diversity within the superfamily reflecting specialised functional and regulatory requirements. We recently used consensus design to create a synthetic serpin, based on the hypothesis that a serpin molecule reflecting optimal sequence conservation may offer insight into the serpin folding-function trade-off (Porebski et al., unpublished). Remarkably, the consensus serpin uniquely exhibits reversible two-state folding, is functional, thermostable and resistant to polymerisation. Structural and biophysical analysis suggests that consensus design remodelled its folding landscape, thereby reducing the lifetime of aggregation-prone intermediates (Porebski et al., unpublished). 
We also observed similar, though less dramatic, effects with FN3con (Porebski ), where consensus design led to a large increase in the folding rate and a decrease in the unfolding rate (high kinetic stability), suggesting a smoother and more funnel-like energy landscape. Although consensus design nearly always modifies energy landscapes, improvements to folding and kinetic stability are unlikely to be universal (for example see Main ; Sullivan ; Parker ). Success likely depends on the specific functional requirements and evolutionary history of the MSA, as these will dictate the consensus energy landscape. In the case of serpins, off-pathway folding and polymerisation are probably the result of independent evolutionary fine-tuning of the energy landscape as a means for conformational control of function. As these independent pathways are not highly conserved in a sequence alignment, they are removed by consensus design, thus remodelling the energy landscape to be smoother and funnel-like, whilst still retaining conserved conformational properties necessary for function (Fig. 2). It is tempting to speculate that consensus design may prove to be a fruitful avenue for investigating and engineering the risky energy landscapes of functional proteins (Barrick ; Gershenson ).

Fig. 2

Smoothing of five hypothetical energy landscapes by consensus design. Five protein homologues exhibit differences in their energy landscapes, with three containing kinetic traps that present a propensity for misfolding. As the kinetic traps are not conserved across all five homologues, consensus design is capable of smoothing out the energy landscape to eliminate non-conserved features.

Function

The function of consensus-designed proteins can be preserved and is influenced by the implementation of design (Supplementary Table S1). In general, consensus mutations, especially those that are distal from the catalytic site, give the highest odds of completely preserving function (Polizzi ; Risso ), whilst full sequence (de novo) designs are likely to reduce catalytic rates and specificity, as can be seen in Supplementary Table S1. Although full sequence designs often reduce catalytic rates and specificity, they tend to retain function at elevated temperatures and wider ranges of pH (Lehmann ; Sullivan ; Stevens ). Full sequence designs likely yield these results in a similar manner to the proposed energy landscape smoothing, with the finer features of catalytic activity being less conserved across all homologues, and are therefore removed during design.
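For the point-mutation route, candidate consensus mutations can be listed by comparing an aligned target sequence with the consensus and skipping positions the designer wants to protect, such as residues near the catalytic site. The sketch below is illustrative only: the function name, the `exclude` set and the toy sequences are hypothetical, and in practice the protected positions would come from structural or functional data.

```python
def consensus_mutations(target, consensus, exclude=(), gap="-", ambiguous="+"):
    """List point mutations (e.g. 'S2L') that move an aligned target towards
    the consensus. Residues are numbered from 1 along the ungapped target.
    Positions in `exclude` (hypothetical, user-supplied) are left untouched."""
    if len(target) != len(consensus):
        raise ValueError("target and consensus must be aligned to the same length")
    excluded = set(exclude)
    mutations = []
    position = 0
    for t, c in zip(target, consensus):
        if t == gap:
            continue  # the target has no residue here, so no point mutation
        position += 1
        if c in (gap, ambiguous) or c == t or position in excluded:
            continue
        mutations.append(f"{t}{position}{c}")
    return mutations

# Toy example (made-up sequences); residue 7 is treated as functionally
# important and therefore excluded.
print(consensus_mutations("GSPAGWTK", "GLPPGWE+"))               # -> ['S2L', 'A4P', 'T7E']
print(consensus_mutations("GSPAGWTK", "GLPPGWE+", exclude={7}))  # -> ['S2L', 'A4P']
```

The resulting list is only a set of candidates. As noted earlier, a substantial fraction of consensus mutations are neutral or destabilising, so each would still need experimental testing or further filtering.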

Immunogenicity

The application of consensus design to the reduction of immunogenicity, an important factor in the design of protein therapeutics (Chirino ; De Groot and Scott, 2007; Jawa ), remains largely unexplored. Interestingly, what appears to be the first consensus-designed protein, alfacon-1 (Infergen), is less immunogenic and significantly more active than recombinant interferon-α (IFN-α) (Alton ; Blatt ). Created using a sequence alignment of 25 IFN-α subtypes, alfacon-1 is the only known consensus-designed protein that has been marketed as a therapeutic drug. Although alfacon-1 is the only reported experimental study of immunogenicity for a consensus-designed protein, computational predictions using FN3 domains suggest that Tencon is also of low immunogenic potential (Jacobs ). However, it is of course possible that the absence of reports may reflect a general failure in immunogenicity reduction by consensus design. Regardless, the possibility that consensus design can reduce immunogenicity warrants further investigation.

Concluding remarks

Consensus design is a proven and highly effective sequence-based method that is typically overlooked in protein engineering in favour of directed evolution and rational design methodologies. Given the challenges in computational modelling of entropy and non-native states, consensus design provides an additional tool for the protein engineer to not only stabilise the native state but also modify the folding landscape.

Author contributions

B.T.P. and A.M.B. wrote the paper.

Supplementary data

Supplementary data are available at

Funding

B.T.P. is a Medical Research Council Career Development Fellow. A.M.B. is a National Health and Medical Research Council Senior Research Fellow (1022688). Funding to pay the Open Access publication charges for this article was provided by Monash University.

104 in total

1. Analysis of catalytic residues in enzyme active sites.

Authors: Gail J Bartlett; Craig T Porter; Neera Borkakoti; Janet M Thornton
Journal: J Mol Biol Date: 2002-11-15 Impact factor: 5.469

2. Design of stable alpha-helical arrays from an idealized TPR motif.

Authors: Ewan R G Main; Yong Xiong; Melanie J Cocco; Luca D'Andrea; Lynne Regan
Journal: Structure Date: 2003-05 Impact factor: 5.006

Review 3. Protein folding and misfolding.

Authors: Christopher M Dobson
Journal: Nature Date: 2003-12-18 Impact factor: 49.962

4. PROSITE: a documented database using patterns and profiles as motif descriptors.

Authors: Christian J A Sigrist; Lorenzo Cerutti; Nicolas Hulo; Alexandre Gattiker; Laurent Falquet; Marco Pagni; Amos Bairoch; Philipp Bucher
Journal: Brief Bioinform Date: 2002-09 Impact factor: 11.622

5. Thermodynamic prediction of protein neutrality.

Authors: Jesse D Bloom; Jonathan J Silberg; Claus O Wilke; D Allan Drummond; Christoph Adami; Frances H Arnold
Journal: Proc Natl Acad Sci U S A Date: 2005-01-11 Impact factor: 11.205

Review 6. Immunogenicity of protein therapeutics.

Authors: Anne S De Groot; David W Scott
Journal: Trends Immunol Date: 2007-10-25 Impact factor: 16.687

Review 7. Structure- and sequence-analysis inspired engineering of proteins for enhanced thermostability.

Authors: Hein J Wijma; Robert J Floor; Dick B Janssen
Journal: Curr Opin Struct Biol Date: 2013-05-15 Impact factor: 6.809

8. A new method of inference of ancestral nucleotide and amino acid sequences.

Authors: Z Yang; S Kumar; M Nei
Journal: Genetics Date: 1995-12 Impact factor: 4.562

9. New and continuing developments at PROSITE.

Authors: Christian J A Sigrist; Edouard de Castro; Lorenzo Cerutti; Béatrice A Cuche; Nicolas Hulo; Alan Bridge; Lydie Bougueleret; Ioannis Xenarios
Journal: Nucleic Acids Res Date: 2012-11-17 Impact factor: 16.971

10. SMART: recent updates, new developments and status in 2015.

Authors: Ivica Letunic; Tobias Doerks; Peer Bork
Journal: Nucleic Acids Res Date: 2014-10-09 Impact factor: 16.971
