Functional characterization of 3D protein structures informed by human genetic diversity.

Michael Hicks, Istvan Bartha, Julia di Iulio, J Craig Venter, Amalio Telenti.

Abstract

Sequence variation data of the human proteome can be used to analyze 3D protein structures to derive functional insights. We used genetic variant data from nearly 140,000 individuals to analyze 3D positional conservation in 4,715 proteins and 3,951 homology models using 860,292 missense and 465,886 synonymous variants. Sixty percent of protein structures harbor at least one intolerant 3D site as defined by significant depletion of observed over expected missense variation. Structural intolerance data correlated with deep mutational scanning functional readouts for PPARG, MAPK1/ERK2, UBE2I, SUMO1, PTEN, CALM1, CALM2, and TPK1 and with shallow mutagenesis data for 1,026 proteins. The 3D structural intolerance analysis revealed different features for ligand binding pockets and orthosteric and allosteric sites. Large-scale data on human genetic variation support a definition of functional 3D sites proteome-wide.
Copyright © 2019 the Author(s). Published by PNAS.

Keywords:  deep mutational scanning; exome; genome constraint; protein structure

Year:  2019        PMID: 30988206      PMCID: PMC6500140          DOI: 10.1073/pnas.1820813116

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


Recent large-scale sequencing projects of the human genome and exome detail the extent of genetic diversity in the human population (1–3). To date, there are over 4.5 million amino acid-changing (missense) variants reported in the human exome. Much attention has been directed to the association of variants with disease (3, 4). However, these data also represent an unprecedented opportunity to characterize protein structure–function relationships in vivo. In particular, the pattern of distribution of genetic variants describes the functional limits to structural and functional modifications for a given protein. Inference of critical 3D sites could also be informative for drug development and mechanisms of action, including selectivity, lack of response, and toxicity.

Finding important sites within these structures has been done through a variety of methods. Genetics-based scoring metrics can measure the deleteriousness of genetic variants in a protein, a property that strongly correlates with both molecular functionality and pathogenicity (5, 6). Scores may also consider interspecies conservation (7) to discover “constrained elements” indicative of putative functional elements. Previous approaches have emphasized gene-level features (e.g., essentiality, burden of variation) and linear analyses of variation in a gene rather than the distribution of variants in 3D space. However, additional methods have been created in the field of cancer to assess the clustering of somatic variants in protein structures. Ryslik et al. (8–11) described Identification of Protein Amino acid Clustering (iPAC), Spatial Protein Amino acid Clustering (SpacePAC), Graph Protein Amino acid Clustering (GraphPAC), and Quaternary Protein Amino acid Clustering (QuartPAC). Fujimoto et al. (12), Tokheim et al. (13), and Meyer et al. (14) analyzed 3D position and clustering of mutations using exome sequence data from The Cancer Genome Atlas (TCGA) from up to 7,215 samples and 23 types of cancer and over 975,000 somatic mutations. A comparison of algorithms for the detection of cancer drivers at subgene resolution was just published (15). It should be noted that scoring methods in oncology emphasize mutational clustering, as critically relevant in cancer biology, and not intolerance to variation in the human proteome at large. Recent sequencing efforts of human genomes and exomes identify several hundreds of thousands of missense variants, which can be used to derive human-specific intolerant sites when aggregated in 3D space (3, 6, 16).

Most studies that analyze the relationship between point mutations and experimentally observed 3D protein structures published to date have been limited to individual proteins. Bhattacharya et al. (17) manually analyzed one single nucleotide variant in each of 374 human protein structures to assess the effects of genetic variation on structure, function, stability, and binding properties of the proteins. Arodź and Płonka (18) analyzed a limited set of pairs of proteins of the same length differing by a single amino acid. Recently, Sivley et al. (19) presented a comprehensive analysis of the spatial distribution of missense variants in the human proteome. They identified 215 proteins with significant spatial constraints on the distribution of disease-causing missense variants in protein structures. Glusman et al. (20) reported on a workshop titled “Gene Variation to 3D (GVto3D).” The overarching goal of the workshop was to provide the framework to advance the integration of genetic variants and 3D protein structures.

Tolerance to Amino Acid Changes in the 3D Space of the Human Proteome

A thorough analysis of the proteome requires a large study population to observe enough genetic variation to allow the detection of intolerance and tolerance to mutation of spatial neighborhoods. To advance this field, we initiated a study that uses human genetic variation from 138,632 human exomes and genomes and 31,116 X-ray protein structures (corresponding to 4,715 proteins) to model tolerance to amino acid changes in the 3D space. To understand variation in the structural proteome, we first identified structures that fulfilled our inclusion criteria: X-ray crystal structures with a defined resolution and a minimum chain length greater than 10 amino acids. In addition, we mapped 139,535 Uniprot features [a combination of “structure-based” features, composed of helices, strands, and turns, and “all” features, which includes a list of features from the UniProt Knowledgebase (UniprotKB) defined in ] to the structures and extracted a 3D context for each feature defined as the union of the 5-Å-radius spheres around every atom of a feature, hereafter referred to as a 3D site. We identified 860,292 missense variants for these proteins from the analysis of 138,632 individuals’ exomes. From these contextualized data, we constructed a model that describes functional constraints in 3D protein structures ( section and Fig. 1). The strength of intolerance to missense variation was summarized by the mean of a posterior distribution that accounts for both observed missense variation and expected missense variation at the level of 3D sites ( section), termed the three-dimensional tolerance score (3DTS). While we used a 5-Å-radius space to generalize the analysis proteome-wide, the same approach can be applied to scoring whole domains as well or to tailor to the protein of interest. Below, we show the impact of varying the radius space on functional prediction of selected proteins.
Fig. 1.

Three-dimensional tolerance to variation in the proteome. (A) Missense variation data from genome and exome sequencing projects are mapped to 3D protein structures. Features extracted from Uniprot are also mapped to the 3D structures. Using these features as reference points, a 3D context is constructed, and the corresponding genetic data are extracted. A 3DTS is generated from this information. The 3DTS values are projected back onto the 3D structure. (B) The distribution of tolerance values across the structural proteome for 139,535 3D sites for structures representing 4,715 proteins. The 3DTS value at the 20th percentile (3DTS < 0.14) is used to define intolerant sites. (C) Median 3DTS for a subset of feature types with the interquartile ranges (IQR). The number of each feature type with a 3DTS value is shown above each column. The overall median across the structural proteome is represented by a horizontal dashed line. Feature types are colored by subsections defined by Uniprot (https://www.uniprot.org/help/sequence_annotation).

We describe the distribution of 3DTS values in Fig. 1. In total, 3,097 (66%) proteins had at least one intolerant 3D site defined at the 20th percentile proteome-wide (3DTS = 0.14). The most intolerant 3D sites corresponded to DNA binding sites, zinc fingers, and intramembrane domains, while the most tolerant 3D sites included nonstandard residues (i.e., selenocysteines), glycosylation sites, and transit peptides. Structural features (helix, turn, strand) showed median 3DTS values close to the proteome-wide median (Fig. 1), which holds true for interspecies conservation (genomic evolutionary rate profiling, GERP++) as well (). The rank correlation of the medians of the different feature types between 3DTS and GERP++ is 0.45. The precise interpretation of 3DTS values requires the assessment of functional consequences of amino acid changes in intolerant versus tolerant 3D sites. However, a challenge of functional testing proteome-wide is the requirement for cellular assays that are disease and gene relevant, robust, and scalable. This serious limitation explains why, to date, the experimental characterization of all possible missense variants in a mammalian gene [deep mutational scanning (21, 22)] has been limited to a handful of proteins: PPARG (23); MAPK1/ERK2 (24); p53 (25); PTEN and TPMT (26); UBE2I, SUMO1, TPK1, CALM1, CALM2, and CALM3 (27); and two single-protein domains of BRCA1 (the RING domain) and YAP65 (the WW domain) (21, 28). We therefore sought to validate 3DTS against the available functional data for the complete human proteins for which there is comprehensive deep mutational scanning (nine proteins covering ∼2,300 amino acid positions and ∼40,000 mutants). In addition, we evaluated 1,026 proteins with shallow mutagenesis (approximately 2,100 individual experimental mutational data points from Uniprot) to show that 3DTS preferentially identifies functional mutations as intolerant.
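The percentile-based definition of intolerant sites used above can be illustrated with a minimal sketch. It assumes a hypothetical mapping from protein identifiers to lists of per-site scores; the function names, the linear-interpolation percentile, and the toy values are illustrative assumptions and are not taken from the published pipeline.

```python
import math

def percentile(values, q):
    """Percentile by linear interpolation for q in [0, 1] (an illustrative choice)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

def proteins_with_intolerant_site(site_scores, q=0.20):
    """Return the proteome-wide cutoff and the proteins with any site below it."""
    all_scores = [s for scores in site_scores.values() for s in scores]
    cutoff = percentile(all_scores, q)
    flagged = {p for p, scores in site_scores.items()
               if any(s < cutoff for s in scores)}
    return cutoff, flagged

# Toy scores (hypothetical), keyed by made-up protein identifiers.
toy_scores = {
    "P1": [0.05, 0.30, 0.50],
    "P2": [0.24, 0.40],
    "P3": [0.10, 0.14, 0.20, 0.35, 0.60, 0.70],
}
cutoff, flagged = proteins_with_intolerant_site(toy_scores)
print(cutoff, sorted(flagged))
```

The cutoff is computed once over all sites and then applied per protein, which mirrors the proteome-wide definition in the text rather than a per-protein threshold.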

Functional Readout of 3D Tolerance Scores

To introduce the approach, we first assessed the structure–function relationship for peroxisome proliferator-activated receptor gamma (PPARG). PPARG is a drug target for thiazolidinediones and newer partial PPARG modulators used in the treatment of diabetes (22). PPARG exemplifies the challenge of classifying newly identified variants even in a well-studied protein implicated in disease. In the original work (23), functional interpretation of PPARG variants required the construction of a cDNA library consisting of all possible amino acid substitutions in the protein. The library was introduced into human macrophages edited to lack the endogenous PPARG and stimulated with PPARG agonists to trigger the expression of CD36, a canonical target of PPARG. Sorted CD36+ and CD36− cell populations were sequenced to determine the distribution of each PPARG variant in relation to CD36 activity. We showed good correlation (r2 = 0.41, P = 2.6E-5) between the 3D sites defined by 3DTS on the structure [Protein Data Bank (PDB) ID code 3DZY] and the functional scores described in Majithia et al. (23). Specifically, both the in vitro and in silico scores identified the DNA-binding and ligand-binding sites as intolerant to missense variation, while the hinge domain reflected increased tolerance to missense variation (Fig. 2). Additionally, Majithia et al. (23) indicated that their transgene library may not have detected all possible functional effects of coding variation, suggesting that the concordance between in vitro and in silico readouts should be interpreted as conservative.
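The site-level comparison described above (per-residue functional scores averaged within each 3D site, then correlated with 3DTS) might be sketched as follows. The data layout, names, and toy numbers are assumptions made for illustration; this is not the authors' code.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def site_level_r2(site_3dts, site_residues, residue_function):
    """Average per-residue functional scores within each site, then correlate."""
    sites = sorted(site_3dts)
    tolerance = [site_3dts[s] for s in sites]
    function = [
        sum(residue_function[r] for r in site_residues[s]) / len(site_residues[s])
        for s in sites
    ]
    return pearson_r2(tolerance, function)

# Hypothetical toy inputs: three sites, four residues.
site_3dts = {"A": 0.1, "B": 0.2, "C": 0.4}
site_residues = {"A": [1, 2], "B": [3], "C": [4]}
residue_function = {1: 1.0, 2: 3.0, 3: 4.0, 4: 8.0}
r2 = site_level_r2(site_3dts, site_residues, residue_function)
```

In this toy case the site means (2.0, 4.0, 8.0) are an exact linear function of the tolerance values, so the squared correlation is 1 up to floating-point error; real data would of course be far noisier.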
Fig. 2.

Validation of 3DTS. (A) Comparison of deep mutational screen data and in silico 3DTS data for the DNA-binding and ligand-binding domains of PPARG. (Top) Projection of the functional scores described in Majithia et al. (23) for each amino acid and the scores averaged across the 3DTS-defined sites for the crystal structure 3dzy (32). The color scheme is chosen to match the one described in Majithia et al. (Bottom) A projection of 3DTS onto PPARG is seen on the Left, and the 3D site level correlation between 3DTS and the 3D site averaged in vitro functional scores is shown in the plot on the Right. (B) Comparison of deep mutational screen data and 3DTS under different modeling assumptions for all available PDB structures covering 70% of the canonical protein length for nine genes. “Structure” refers to 3D sites defined by secondary structure elements, and “Allfeatures” uses 3D sites defined by all Uniprot features as detailed in the . “Constant” and “heptamer” refer to the mutation rates as discussed in the . (C) Comparison of the optimal 3DTS model to 23 other scoring methods at the 3D site level for nine genes. Pearson r2 values for comparisons of deep mutational screen data and in silico data at the 3D site level for the nine genes are provided. “NaN” refers to methods with unavailable scores. (D) Shallow mutagenesis data proteome-wide. Here, 3DTS identifies functional sites (loss of function) as more constrained (lower 3DTS values) at all levels of global gene essentiality compared with the rest of the protein. pLI > 0.9 (essential gene) functional to background Kolmogorov–Smirnov two-sided test P value = 9.3E-31; 0.1 < pLI < 0.9 functional to background Kolmogorov–Smirnov two-sided test P value = 2.3E-20; pLI < 0.1 functional to background Kolmogorov–Smirnov two-sided test P value = 1.1E-18.

While we use PPARG as an example of the implementation of 3DTS, we also analyzed the other proteins with existing deep mutational scanning data. Fig. 2 shows the distributions of Pearson r2 values for all structures (ranging from 0 to 0.72 for CALM1, 0 to 0.54 for CALM2, 0.02 to 0.33 for ERK2, 0.17 to 0.41 for PPARG, 0.21 to 0.39 for PTEN, 0 to 0.83 for SUMO1, 0.13 to 0.22 for TPK1, 0.09 to 0.17 for TPMT, and 0 to 0.62 for UBE2I) that cover at least 70% of the canonical isoform under four different 3DTS conditions: two different sets of 3D features and two different models of rate variation. Precision–recall curves and average precision for the comparison of deep mutational screen data of 3DTS and the various in silico methods are shown in . EVmutation has the highest average precision (0.75). Importantly, different structures for the same protein differ in the correlation value; the median r2 and the distributions tend to be large both within and between conditions and genes. These variations could occur for a variety of reasons such as alternative protein interaction partners, different structural coverages of the protein, varied crystallization conditions, etc. We speculate that 3DTS might serve to identify functionally relevant conformations for a given protein; that is, for a protein with multiple available structures, the best correlations may represent the most parsimonious and functionally plausible structures. Data regarding the optimal structures are available in Dataset S1.

We compared the functional prediction of 3DTS with 23 published scores: CADD (5), SIFT (29), PROVEAN (30), FATHMM (31), MutationAssessor (32), fathmm-MKL (33), FitCons (34), DANN (35), MetaSVM/MetaLR (36), GenoCanyon (37), Eigen-PC (38), M-CAP (39), REVEL (40), PhyloP (41), PhastCons (42), GERP++ (7), SiPhy (43), Polyphen-2 (44), and EVmutation (45). Importantly, we bring these scores to the 3D environment, as the purpose of this analysis is the definition of functional regions and not the prediction of deleteriousness at single-amino acid level resolution. These various scores were trained under a range of assumptions, most commonly interspecies conservation, coevolution, and pathogenicity. Overall, 3DTS performs comparably to these other methods in the 3D space (Fig. 2). In the future, use of ensemble methods (modeling on multiple scores) is expected to perform better than single scores (for a comparison of all structures and methods, see and Dataset S2). The diversity and complementarity of the various methods suggest that users should analyze proteins under various assumptions and models. Here, 3DTS adds a dimension that has not been included in previous predictors. The availability of multiple proteins with deep mutational screening data also supported a more formal assessment of the effect of varying the size of the 3D sites and confirming the general validity of the use of the 5-Å radius ().

We then extended the evaluation to a large corpus of functional readouts for 1,026 proteins for which shallow mutational information was available. The median 3DTS score for 4,428 3D functional sites (those that carry an experimentally tested “loss of function” variant) is lower than the proteome background (Kolmogorov–Smirnov two-sided test P value = 3.7E-42), which may yet include undescribed functional sites. Importantly, at any level of global gene essentiality, functional sites are systematically more constrained than the rest of the protein (Fig. 2). In summary, the in silico 3DTS values may provide functional prediction without engaging in extensive and time-consuming in vitro assays and dedicated functional readouts; this is critical given the paucity of human proteins that have been subjected to deep mutational scanning and functional testing.
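The functional-versus-background comparison above relies on a two-sample Kolmogorov–Smirnov test. As a rough sketch of what that statistic measures, the following computes the maximum gap between two empirical cumulative distributions using only the standard library. The toy scores are invented for illustration; for P values one would normally call a library routine such as `scipy.stats.ks_2samp`.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)  # fraction of a <= value
        cdf_b = bisect_right(b, value) / len(b)  # fraction of b <= value
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap

# Hypothetical scores: functional sites skew lower than the background.
functional_sites = [0.05, 0.08, 0.12, 0.15]
background_sites = [0.10, 0.20, 0.30, 0.40]
d_stat = ks_statistic(functional_sites, background_sites)
print(d_stat)
```

A statistic near 1 means the two score distributions barely overlap; a statistic near 0 means they are nearly indistinguishable.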

Three-Dimensional Tolerance to Amino Acid Change of Drug Target Sites

One application of the present work could involve prioritization of drug target sites. Protein structure-based methods are now routinely used at all stages of drug development, from target identification to optimization (46). Central to all structure-based discovery approaches is the knowledge of the 3D structure of the target protein or complex because the structure and dynamics of the target determine which ligands it binds (46). The characterization of human-specific intolerant sites and tolerance to genetic variation can be used to parse structural information to define active sites and also to define functionally important topographically distinct sites that can support allosteric interactions for small molecules to modulate protein function (47). We analyzed the 3D intolerance characteristics for 97 proteins that included known drug targets with a bound ligand and proteins with known allosteric sites (Dataset S3). The corresponding proteins carried a median number of one unique nonoverlapping intolerant 3D site (range 0–7). Overall, 17 proteins lacked an intolerant site, while 26 had more than one unique intolerant site. In the most intolerant bin, active sites were most constrained, followed by allosteric, protein–protein interaction, and ligand-binding pockets (Fig. 3 and Dataset S3). The higher scores of allosteric sites (more tolerant) relative to their orthosteric counterparts are consistent with the existing knowledge indicating that these sites tend to be under lower evolutionary conservation pressure (47). We also observed an unequal distribution of tolerant and intolerant binding sites across therapeutic classes (Fig. 3 and Dataset S3). For example, antineoplastic and immunomodulating agents preferentially target intolerant sites. The identification of multiple intolerant 3D sites and domains in many drug targets could be exploited for rational drug design and for analysis of drug screening results.
Fig. 3.

Characteristics of druggable sites. (A) Binned 3DTS scores describing active sites, allosteric sites, protein–protein interaction sites, drug ligand-binding sites, and background. The sum of each site type is 1. Active-site background Kolmogorov–Smirnov two-sided test P value = 4.9E-110. Allosteric background Kolmogorov–Smirnov two-sided test P value = 1.1E-84. Protein–protein interactions background Kolmogorov–Smirnov two-sided test P value = 1.8E-89. Drug ligand-binding background Kolmogorov–Smirnov two-sided test P value = 3.0E-75. (B) Counts of tolerant and intolerant drug ligand-binding sites grouped by therapeutic area. Here, tolerant is defined as 3DTS > 0.24 (50th percentile of 3DTS), while intolerant is defined as described in the text (3DTS < 0.14; 20th percentile of 3DTS); drug binding sites between these 3DTS values are not included. See Dataset S3 for full details about this dataset.
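The tolerant/intolerant counts in panel B can be reproduced in spirit with a short sketch. The two thresholds are the ones stated in the caption; the input layout (pairs of therapeutic-area code and score), the function names, and the toy values are assumptions for illustration only.

```python
from collections import Counter

INTOLERANT_BELOW = 0.14  # 20th percentile of 3DTS, per the caption
TOLERANT_ABOVE = 0.24    # 50th percentile of 3DTS, per the caption

def classify_site(score):
    """Label a site, or return None for scores between the two thresholds."""
    if score < INTOLERANT_BELOW:
        return "intolerant"
    if score > TOLERANT_ABOVE:
        return "tolerant"
    return None

def count_by_area(binding_sites):
    """Count labeled sites per (therapeutic area, label); unlabeled sites are skipped."""
    counts = Counter()
    for area, score in binding_sites:
        label = classify_site(score)
        if label is not None:
            counts[(area, label)] += 1
    return counts

# Hypothetical binding sites keyed by top-level ATC code
# (L: antineoplastic and immunomodulating agents, C: cardiovascular system).
toy_sites = [("L", 0.05), ("L", 0.10), ("L", 0.30), ("C", 0.20), ("C", 0.50)]
counts = count_by_area(toy_sites)
```

Sites with scores between the two thresholds are dropped rather than forced into a class, matching the caption's statement that such drug binding sites are not included.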

Recently, we and others evaluated genome constraint based on depletion of human variation data in linearly defined regions in coding (48, 49) and in noncoding regions (16). The current study extends this approach to regions defined by tertiary structure. The increasing detail of the limits of protein diversity that can be gathered through large-scale sequencing of the human population and 3D protein structures offers additional data on orthosteric, allosteric, and additional functional sites that could be harnessed for drug development.

Materials and Methods

Detailed information is provided in .

Genomic and Variant Data.

We included a set of 123,136 exomes and 15,496 whole human genomes from gnomAD (https://gnomad.broadinstitute.org/). Feature annotations were taken from Uniprot text files that were cross-referenced from Gencode. We used pairwise global sequence alignment to align the Uniprot amino acid sequence to the Gencode transcript. X-ray structure data from the Protein Data Bank were used if they were linked within the Uniprot text files. The PyMOL molecular visualization system was used to identify any residue within 5 Å of a defined Uniprot feature (also referred to as a 3D site).
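The 5-Å neighborhood selection can be illustrated without a molecular viewer. The sketch below is a plain-Python stand-in, not the PyMOL procedure used in the study: it assumes atoms are given as hypothetical `(residue_id, (x, y, z))` tuples and returns every residue with at least one atom within the radius of any feature atom.

```python
import math

def residues_in_3d_site(atoms, feature_residues, radius=5.0):
    """Residues with any atom within `radius` angstroms of any feature atom.

    atoms: iterable of (residue_id, (x, y, z)) tuples (assumed layout).
    feature_residues: set of residue ids belonging to the annotated feature.
    """
    atoms = list(atoms)
    feature_coords = [xyz for res, xyz in atoms if res in feature_residues]
    site = set(feature_residues)  # the feature itself belongs to its 3D site
    for res, xyz in atoms:
        if res in site:
            continue
        if any(math.dist(xyz, f) <= radius for f in feature_coords):
            site.add(res)
    return site

# Toy coordinates (hypothetical): residue 2 is 3 Å away, residue 3 at least 7.5 Å.
toy_atoms = [
    (1, (0.0, 0.0, 0.0)),
    (2, (3.0, 0.0, 0.0)),
    (3, (9.0, 0.0, 0.0)),
    (3, (7.5, 0.0, 0.0)),
]
site = residues_in_3d_site(toy_atoms, {1})
wider_site = residues_in_3d_site(toy_atoms, {1}, radius=8.0)
```

The all-pairs distance check is quadratic in the number of atoms, which is fine for a sketch; a spatial index would be the usual choice for whole-proteome runs. Exposing `radius` as a parameter reflects the point made in the text that the neighborhood size can be tailored to the protein of interest.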

Creation of a 3D Tolerance Score.

We group variants based on their spatial proximity in 3D protein space and based on Uniprot feature annotation. We term these groups 3D sites. We calculate the expectation of the probability that the 3D site is intolerant to missense mutation using a model that accounts for the differences among loci in the rates of neutral missense variation due to the genetic code, differential sample availability, and regional mutation rates.
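The exact model is not spelled out in this section, so the following is only a stand-in to convey the idea of a posterior mean that combines observed and expected missense counts. It assumes a conjugate gamma-Poisson setup (observed count distributed as Poisson with mean equal to a depletion ratio times the expected count, with a gamma prior on the ratio). The prior parameters and names are hypothetical, and the resulting values are not comparable to published 3DTS values.

```python
def posterior_mean_ratio(observed, expected, prior_shape=1.0, prior_rate=1.0):
    """Posterior mean of an observed/expected ratio under a gamma-Poisson model.

    Assumed model (illustrative only):
        observed ~ Poisson(ratio * expected), ratio ~ Gamma(prior_shape, prior_rate)
    The posterior is Gamma(prior_shape + observed, prior_rate + expected).
    """
    if observed < 0 or expected < 0:
        raise ValueError("counts must be non-negative")
    return (prior_shape + observed) / (prior_rate + expected)

# A site with few observed variants relative to expectation scores lower
# (more depleted) than a site whose observed count is close to expectation.
depleted = posterior_mean_ratio(observed=2, expected=20.0)
near_neutral = posterior_mean_ratio(observed=18, expected=20.0)
print(depleted, near_neutral)
```

The prior acts as shrinkage: sites with little expected variation are pulled toward the prior mean instead of producing extreme ratios from a handful of counts.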

Functional Data and Pathogenicity Scores.

Deep mutational scanning data are available for PPARG (23); MAPK1/ERK2 (24); p53 (25); PTEN and TPMT (26); UBE2I, SUMO1, TPK1, CALM1, CALM2, and CALM3 (27); and two single-protein domains of BRCA1 (the RING domain) and YAP65 (the WW domain) (21, 28). For most scores, comparative method data were sourced from dbNSFPv3.5a (36, 50) except for EVmutation (45) data. Scores resulting in missense variants were averaged across a nucleotide (where applicable), then an amino acid position, and, last, a 3D site.
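The three-level averaging described above (nucleotide, then amino acid position, then 3D site) differs from a flat mean whenever positions carry unequal numbers of scores. A minimal sketch, assuming hypothetical records of the form `(site, aa_position, nt_position, score)`:

```python
from collections import defaultdict
from statistics import mean

def hierarchical_site_scores(records):
    """Average scores per nucleotide, then per amino acid position, then per site."""
    by_nucleotide = defaultdict(list)
    for site, aa_pos, nt_pos, score in records:
        by_nucleotide[(site, aa_pos, nt_pos)].append(score)

    by_position = defaultdict(list)
    for (site, aa_pos, _nt_pos), scores in by_nucleotide.items():
        by_position[(site, aa_pos)].append(mean(scores))

    by_site = defaultdict(list)
    for (site, _aa_pos), nucleotide_means in by_position.items():
        by_site[site].append(mean(nucleotide_means))

    return {site: mean(position_means) for site, position_means in by_site.items()}

# Toy records (hypothetical site name and coordinates).
toy_records = [
    ("helix1", 10, 28, 0.2),
    ("helix1", 10, 28, 0.4),
    ("helix1", 10, 29, 0.9),
    ("helix1", 11, 31, 0.1),
]
site_means = hierarchical_site_scores(toy_records)
flat_mean = mean(score for _, _, _, score in toy_records)
```

Here the hierarchical value for the site is 0.35, whereas the flat mean over all four scores is 0.4, because position 10 contributes three scores and would otherwise dominate.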

Drug Ligand Data Set and Analyses.

A set of structures defined as therapeutic targets of FDA-approved drugs was used. Therapeutic targets were taken from the supplementary information of Santos et al. (51). Ligand-binding sites were defined as those residues within 5 Å of any of the bound therapeutic molecule residues. Drug liganded molecules were assigned to their ATC codes using the supplementary information of Santos et al. (51). We used the Allosteric Database (release no. 3.06) (52). A nonredundant list of protein active sites was included for those structures found in the Drug Ligand Data Set and the Allosteric Data Set. Additionally, protein–protein interfaces were included if those structures were found in the Drug Ligand Data Set, Active Site Data Set, and the Allosteric Data Set.

Statistics.

Statistics were calculated using the NumPy (www.numpy.org) and SciPy (https://www.scipy.org) libraries in Python and in-house statistical software in Scala.

Public Resources.

We provide final scores and intermediate results from the genome to proteome mapping, including the UniProt–PDB pairwise alignments, at https://doi.org/10.5281/zenodo.1311198 (53). We provide the source code at doi.org/10.5281/zenodo.2628193 (54). There is an interactive browser at protc.labtelenti.org.
References (10 of 51 listed)

1.  Effects of point mutations on protein structure are nonexponentially distributed.

Authors:  Tomasz Arodź; Przemysław M Płonka
Journal:  Proteins       Date:  2012-04-26

2.  Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm.

Authors:  Prateek Kumar; Steven Henikoff; Pauline C Ng
Journal:  Nat Protoc       Date:  2009-06-25       Impact factor: 13.491

3.  Detection of nonneutral substitution rates on mammalian phylogenies.

Authors:  Katherine S Pollard; Melissa J Hubisz; Kate R Rosenbloom; Adam Siepel
Journal:  Genome Res       Date:  2009-10-26       Impact factor: 9.043

4.  Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes.

Authors:  Adam Siepel; Gill Bejerano; Jakob S Pedersen; Angie S Hinrichs; Minmei Hou; Kate Rosenbloom; Hiram Clawson; John Spieth; Ladeana W Hillier; Stephen Richards; George M Weinstock; Richard K Wilson; Richard A Gibbs; W James Kent; Webb Miller; David Haussler
Journal:  Genome Res       Date:  2005-07-15       Impact factor: 9.043

5.  A method and server for predicting damaging missense mutations.

Authors:  Ivan A Adzhubei; Steffen Schmidt; Leonid Peshkin; Vasily E Ramensky; Anna Gerasimova; Peer Bork; Alexey S Kondrashov; Shamil R Sunyaev
Journal:  Nat Methods       Date:  2010-04       Impact factor: 28.547

6.  Understanding the function-structure and function-mutation relationships of p53 tumor suppressor protein by high-resolution missense mutation analysis.

Authors:  Shunsuke Kato; Shuang-Yin Han; Wen Liu; Kazunori Otsuka; Hiroyuki Shibata; Ryunosuke Kanamaru; Chikashi Ishioka
Journal:  Proc Natl Acad Sci U S A       Date:  2003-06-25       Impact factor: 11.205

7.  Identifying a high fraction of the human genome to be under selective constraint using GERP++.

Authors:  Eugene V Davydov; David L Goode; Marina Sirota; Gregory M Cooper; Arend Sidow; Serafim Batzoglou
Journal:  PLoS Comput Biol       Date:  2010-12-02       Impact factor: 4.475

8.  Identifying novel constrained elements by exploiting biased substitution patterns.

Authors:  Manuel Garber; Mitchell Guttman; Michele Clamp; Michael C Zody; Nir Friedman; Xiaohui Xie
Journal:  Bioinformatics       Date:  2009-06-15       Impact factor: 6.937

9.  High-resolution mapping of protein sequence-function relationships.

Authors:  Douglas M Fowler; Carlos L Araya; Sarel J Fleishman; Elizabeth H Kellogg; Jason J Stephany; David Baker; Stanley Fields
Journal:  Nat Methods       Date:  2010-08-15       Impact factor: 28.547

10.  Determinants of protein function revealed by combinatorial entropy optimization.

Authors:  Boris Reva; Yevgeniy Antipin; Chris Sander
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583
