
In silico analysis of the conservation of human toxicity and endocrine disruption targets in aquatic species.

Fiona M McRobb, Virginia Sahagún, Irina Kufareva, Ruben Abagyan.

Abstract

Pharmaceuticals and industrial chemicals, both in the environment and in research settings, commonly interact with aquatic vertebrates. Due to their short life-cycles and the traits that can be generalized to other organisms, fish and amphibians are attractive models for the evaluation of toxicity caused by endocrine disrupting chemicals (EDCs) and adverse drug reactions. EDCs, such as pharmaceuticals or plasticizers, alter the normal function of the endocrine system and pose a significant hazard to human health and the environment. The selection of suitable animal models for toxicity testing is often reliant on high sequence identity between the human proteins and their animal orthologs. Herein, we compare in silico the ligand-binding sites of 28 human "side-effect" targets to their corresponding orthologs in Danio rerio, Pimephales promelas, Takifugu rubripes, Xenopus laevis, and Xenopus tropicalis, as well as subpockets involved in protein interactions with specific chemicals. We found that the ligand-binding pockets had much higher conservation than the full proteins, while the peroxisome proliferator-activated receptor γ and corticotropin-releasing factor receptor 1 were notable exceptions. Furthermore, we demonstrated that the conservation of subpockets may vary dramatically. Finally, we identified the aquatic model(s) with the highest binding site similarity, compared to the corresponding human toxicity target.

Year:  2014        PMID: 24392850      PMCID: PMC3951377          DOI: 10.1021/es404568a

Source DB:  PubMed          Journal:  Environ Sci Technol        ISSN: 0013-936X            Impact factor:   9.028


Introduction

Aquatic vertebrates are targeted by pharmaceutical and industrial chemicals, both intentionally and unintentionally, in a variety of research and environmental contexts. In the wild, these animals are exposed to the pharmaceuticals and industrial chemicals present in the surface waters. In research settings, aquatic vertebrates may be used to evaluate novel chemicals for toxicity, including the early identification of adverse drug reaction (ADR) or endocrine disruption (ED) potential of pharmaceutical candidates and industrial chemicals. Lower order vertebrates, such as amphibians and fish, are being increasingly viewed as a replacement for rodent models. They are convenient and cost-effective model organisms due to their short life-cycles and the presence of traits that can be generalized to other organisms.[1] Species that are commonly used for toxicological evaluations include Danio rerio (zebrafish), Pimephales promelas (fathead minnow), Takifugu rubripes (Japanese pufferfish), Xenopus laevis (African clawed frog), and Xenopus tropicalis (Western clawed frog).[1−4] Specifically, D. rerio has been widely used to study ADRs that include reproductive toxicity, cardiotoxicity, hepatotoxicity, and neurotoxicity,[5] as well as the evaluation of potential endocrine disrupting chemicals (EDCs; reviewed in ref (1)). P. promelas has been used to predict the aquatic toxicity of environmental chemicals,[2] and T. rubripes has been used to evaluate EDCs.[6,7] Amphibians are known to be good models for studying EDCs that interact with thyroid hormone receptors[8] and X. laevis has been used to study ADRs related to membrane transporters.[9] Toxicity, for chemicals with low concentrations in the target organisms, is most frequently caused by their specificity to particular proteins in the organism. Comparing the protein sequences and structures of human toxicity targets to their orthologs in aquatic species can assist in the identification of the most similar ortholog. 
For the reliable prediction of pharmaceutical or environmental toxicity, robust animal models are required whose proteins are highly similar to the orthologous human ADR and toxicity targets. Additionally, in the wild, these species are more vulnerable than others to pharmaceuticals present in the environment that have been specifically designed for high-affinity interactions with the designated proteins.[10] Typically in toxicity studies, one rodent model and one nonrodent model are employed.[11] However, depending on the target and the class of chemicals in question, some animal models may be more relevant than others. The ever-increasing number of species with fully sequenced genomes has begun to allow for druggable genome and proteome comparisons. Recently, the genomes of eight relevant toxicological species were compared to the human genome.[12] Target similarity has been assessed at the level of protein sequence, with the degree of conservation of specific drug targets in humans and model organisms evaluated by performing sequence-by-sequence alignments,[10] and limited studies have been conducted on the domain conservation for the androgen receptor (AR) and estrogen receptor α (ERα).[13] Nevertheless, the levels of conservation between orthologous sequences usually vary throughout the sequence (Figure 1). Thus, it is important to focus on the similarity of sections of the sequence that are most relevant to chemical interactions. The conservation of residues directly involved in ligand binding is a more relevant parameter for evaluation of aquatic species models than full sequence similarity. Interspecies variations in the amino-acid composition of the binding-pocket can sometimes have dramatic effects on the utility of species in pharmacological assays. 
For example, in the serotonin 6 receptor (5-HT6R), two residues in the ligand-binding pocket were found to significantly change the pharmacology of the mouse 5-HT6R (resulting in a systematic one log unit shift of the 5-HT6R ligands), compared to the human and rat 5-HT6R,[15,16] making the mouse model an unfavorable choice for testing 5-HT6R-targeting pharmaceuticals, while the rat 5-HT6R binding pocket is identical to that of humans. Similarly, two (out of 13) minor amino-acid substitutions (Thr to Ala and Ala to Val) in the binding pocket of the rat and mouse histamine H3 receptors (H3R), compared to the human H3R, lead to a systematic error in measured compound potency and limit the utility of both species in H3-related studies.[17]
Figure 1

Variations in sequence conservation across the sequence of the AR for D. rerio, P. promelas, T. rubripes, X. laevis, and X. tropicalis compared to the human AR (binding site residues highlighted in cyan). All sequences were window averaged across 25 residues. Abbreviations: AF1/2, activation function 1/2; DBD, DNA binding domain; LBD, ligand binding domain.[14]

Because orthologous proteins in different species typically bind the same or similar endogenous ligands,[8] the conservation of the binding pockets far exceeds the full length sequence conservation. They are also likely to bind the same exogenous chemicals. The aim of this research was to identify the aquatic organisms (from the set of D. rerio, P. promelas, T. rubripes, X. laevis, and X. tropicalis) that share the highest binding pocket similarity with humans in each of the 28 best-characterized toxicity targets. X-ray crystal structures were used to identify the amino-acid residues constituting the ligand-binding pockets, which were extrapolated to the aquatic orthologs. Sequence similarity and identity were calculated for the ligand-binding sites, and the most similar orthologs to the 28 human toxicity targets were identified.
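The window-averaged conservation profile described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes that per-residue conservation is scored as simple identity (1 or 0) along the human sequence and that the 25-residue window is centered and truncated at the sequence ends; the figure may have been generated from similarity scores and with different edge handling, so the function names and these choices are illustrative only.

```python
from typing import List


def per_position_identity(human_aln: str, ortholog_aln: str, gap: str = "-") -> List[float]:
    """Score each human residue 1.0 if the aligned ortholog residue is identical, else 0.0."""
    if len(human_aln) != len(ortholog_aln):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for h, o in zip(human_aln, ortholog_aln):
        if h == gap:  # insertion relative to the human sequence: no human position to score
            continue
        scores.append(1.0 if h == o else 0.0)
    return scores


def window_average(values: List[float], window: int = 25) -> List[float]:
    """Centered moving average; the window is truncated at both sequence ends (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Toy aligned pair (not real receptor sequences), smoothed with a 3-residue window.
profile = window_average(per_position_identity("MKTLLVA", "MRTL-VA"), window=3)
```

Plotting such a profile against the human residue number makes it easy to see which domains carry most of the divergence.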

Materials and Methods

Selection of Human EDC and ADR Targets

An initial set of 85 unique human proteins that have been previously characterized as side-effect and toxicity targets was compiled from the 73 protein assays listed in the Novartis in vitro safety panels (Table S1, Supporting Information), 11 targets from the VirtualToxLab,[18,19] and the Constitutive Androstane Receptor (CAR; NR1I3). All 85 proteins were used for sequence analyses. For binding pocket similarity analyses, the 85 targets were matched against the Pocketome encyclopedia (http://pocketome.org),[20] a collated set of annotated binding pocket structure ensembles from the Protein Data Bank (PDB).[21] At the time of this study, 28 out of the 85 targets had Pocketome entries for their ligand-binding pockets available (Table S1, Supporting Information) that contained at least one cocrystallized ligand, making it possible to precisely identify the binding site residues. These 28 targets were used for binding site similarity and identity comparisons.

Identification of Orthologs of Human EDC and ADR Targets in the Aquatic Species

The complete proteomes of D. rerio, Mus musculus (mouse), P. promelas, Rattus norvegicus (rat), T. rubripes, X. laevis, and X. tropicalis were downloaded in FASTA format from the UniProt Knowledgebase.[22] The M. musculus and R. norvegicus results have been included in all Supporting Information for comparison purposes. For each of the files, a BLAST search index was generated using the bioinformatics module of the Internal Coordinate Mechanics (ICM) software version 3.7-3a (Molsoft L.L.C., La Jolla, CA).[23,24] A BLAST search[25] was performed to identify orthologs of the 85 human proteins in the corresponding aquatic species. One hit per target per species was retained using the following prioritization rules: (i) manually annotated orthologs of the toxicity and side-effect targets were retained with the highest priority; (ii) for automatically annotated analogues, orthologs with the same gene name as the human protein and the highest probability score to the human protein were kept; (iii) if only sequence fragments were available, the longest fragment was retained.
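The three prioritization rules can be sketched as a small selection function. This is a hypothetical illustration, not the ICM workflow used in the study: the `Hit` record, its field names, the use of the score to break ties among manually annotated hits, and the fallback to all hits when no gene name matches are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    """One BLAST hit for a human query in one species (hypothetical record layout)."""
    accession: str
    gene_name: str
    score: float             # higher means a better match to the human protein
    length: int              # number of residues in the hit sequence
    manually_annotated: bool
    is_fragment: bool


def pick_ortholog(human_gene: str, hits: List[Hit]) -> Optional[Hit]:
    """Retain one hit per target per species following rules (i) to (iii)."""
    if not hits:
        return None
    # Rule (i): manually annotated orthologs take priority (ties broken by score: assumption).
    manual = [h for h in hits if h.manually_annotated]
    if manual:
        return max(manual, key=lambda h: h.score)
    # Rule (ii): among automatically annotated hits, prefer the same gene name and best score.
    same_name = [h for h in hits if h.gene_name.upper() == human_gene.upper()]
    pool = same_name or hits  # fallback to all hits is an assumption
    complete = [h for h in pool if not h.is_fragment]
    if complete:
        return max(complete, key=lambda h: h.score)
    # Rule (iii): only fragments are available, so keep the longest one.
    return max(pool, key=lambda h: h.length)
```

Encoding the rules as an ordered cascade keeps the selection reproducible when a proteome is re-downloaded and the hit list changes.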

Sequence Alignment and Analysis

Pairwise alignments were constructed between the full sequence of each human protein and the corresponding orthologs, and pairwise sequence scores were calculated with the Needleman and Wunsch algorithm[26] modified for the zero end-gap penalties (the ZEGA algorithm[27]) as implemented in the ICM program. We used gap opening and gap extension penalties of 2.4 and 0.15, respectively. Sequence identity was represented by the number of identical residues over the total number of aligned residues. Sequence similarity was calculated using the GONNET residue substitution comparison matrix.[28]
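The identity measure defined above (identical residues over aligned residues) can be computed directly from a gapped pairwise alignment. The sketch below assumes the alignment is supplied as two equal-length strings with `-` as the gap character; it does not reproduce the ZEGA alignment itself or the GONNET-based similarity score, and the function name is illustrative.

```python
from typing import Tuple


def sequence_identity(aln_a: str, aln_b: str, gap: str = "-") -> Tuple[int, int, float]:
    """Return (identical, aligned, identity) for two gapped, equal-length sequences."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    aligned = 0
    for a, b in zip(aln_a, aln_b):
        if a == gap or b == gap:  # gapped columns are not counted as aligned residues
            continue
        aligned += 1
        if a == b:
            identical += 1
    identity = identical / aligned if aligned else 0.0
    return identical, aligned, identity


# Toy example (not real protein sequences): 4 identical out of 5 aligned residues.
print(sequence_identity("MKT-LV", "MRTALV"))
```

Because gapped columns are excluded from the denominator, a short fragment that aligns well can score higher than a full-length ortholog, which is the length effect discussed later for the X. tropicalis AR.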

Binding Site Definition and Classification Using Ligand Contact Strength Fingerprints

For each ligand in the Pocketome entry and each non-hydrogen atom in the protein, distance-dependent contact strengths were calculated using the parameters developed in the context of the GPCR Dock 2010 evaluation.[29,30] The per-atom contact strengths were aggregated into per-residue contact strength values by taking the sum over all non-hydrogen atoms in the residue side-chain. Only residue side-chains were included in the calculation because, except for proline, ligand contacts with backbone atoms may not be affected by residue substitutions between species. If a ligand was cocrystallized in multiple structures, the vectors of per-residue contact strengths were averaged. To reduce noise and binding site definition artifacts associated with increased conformational variability of individual residues, the contact strength vector components were multiplied by a factor ranging from 0 to 1 and inversely proportional to the observed conformational variability of the corresponding residue in the Pocketome ensemble. Each unique ligand $L_i$ was characterized by a vector $FP_i$ of per-residue numbers ranging from 0 (no contact) to 32 (extensive close contact with Phe168 in the adenosine A2A receptor (A2AR); Table S1, Supporting Information). The normalized fingerprint distance between ligands $L_i$ and $L_j$ was calculated as $D_{ij} = 1 - \frac{\sum \min(FP_i, FP_j)}{\sum (FP_i + FP_j)/2}$, where $\min(FP_i, FP_j)$ and $(FP_i + FP_j)/2$ are vectors of element-wise minima and element-wise averages between vectors $FP_i$ and $FP_j$, respectively, and the sums run over the vector components.[30] When defined that way, ligand fingerprint distances range from 0 (for identical fingerprints) to 1 (for nonoverlapping fingerprints). Ligand interaction fingerprints were clustered at the distance cutoff of D = 0.35 to identify classes of ligands occupying distinct areas in the binding site. The cutoff of 0.35 was found to be the optimal trade-off between the excessive number of clusters and the unwanted aggregation of substantially different ligand chemotypes in multiple targets. 
This cutoff indicates that the ligands will be classified as belonging to different clusters if their fingerprints vary by one-third (or more) of the contacts. Next, clusters of unique crystallographic ligands were ordered by their size, starting with the most populated one and ending with singletons (i.e., clusters containing only a single ligand). Top clusters containing 80% of the ligands were combined to define the set of residues interacting with the majority of the ligands. The remaining 20% were disregarded in the pocket definition to ensure that it is not affected by occasional or spurious contacts.
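The fingerprint distance and the 80% pocket definition described above can be sketched as follows. The clustering algorithm itself is not specified in the text, so cluster membership is taken as an input here; the rule "add clusters in decreasing size until at least 80% of ligands are covered" is one reading of the description, and the function names are illustrative.

```python
from typing import List, Sequence


def fingerprint_distance(fp_i: Sequence[float], fp_j: Sequence[float]) -> float:
    """D = 1 - sum(min(FP_i, FP_j)) / sum((FP_i + FP_j) / 2), summed over residues."""
    if len(fp_i) != len(fp_j):
        raise ValueError("fingerprints must cover the same residues")
    shared = sum(min(a, b) for a, b in zip(fp_i, fp_j))
    mean_total = sum((a + b) / 2 for a, b in zip(fp_i, fp_j))
    if mean_total == 0:  # two empty fingerprints: treated as identical (assumption)
        return 0.0
    return 1.0 - shared / mean_total


def majority_clusters(clusters: List[List[str]], coverage: float = 0.8) -> List[List[str]]:
    """Keep the largest clusters until they contain at least `coverage` of all ligands."""
    ordered = sorted(clusters, key=len, reverse=True)
    total = sum(len(c) for c in ordered)
    kept: List[List[str]] = []
    covered = 0
    for cluster in ordered:
        if covered >= coverage * total:
            break
        kept.append(cluster)
        covered += len(cluster)
    return kept


# Toy fingerprints over two residues (not taken from any Pocketome entry).
d = fingerprint_distance([4.0, 2.0], [2.0, 2.0])  # shared = 4, mean total = 5
same_cluster = d < 0.35                            # cutoff used in the text
```

The residues contacted by ligands in the retained clusters would then form the pocket definition used for the similarity calculations.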

Binding Pocket Sequence Identity and Similarity Calculations

For each subpocket in the binding site, as determined by ligand contact strength fingerprint clustering, a subalignment was extracted by projecting the full sequence alignment between human and ortholog sequences onto the corresponding residue selection. Binding pocket/subpocket identity and similarity were calculated from these subalignments using the same parameters as the full sequence alignments. The same was done for the set of residues forming the interaction site(s) for at least 80% of the ligands, as described above, which thus represents the aggregation of the consistently populated regions of the pocket. The comparison of complete pockets (including interaction fingerprints of all crystallographic ligands) is available in Supporting Information.
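Projecting a full alignment onto a pocket residue selection can be sketched as below. The sketch assumes pocket residues are given as 1-based positions in the ungapped human sequence and that pocket positions aligned to a gap in the ortholog are excluded from the aligned count; both conventions, and the function names, are assumptions made for illustration.

```python
from typing import List, Set, Tuple


def project_alignment(human_aln: str, ortholog_aln: str,
                      pocket_positions: Set[int], gap: str = "-") -> List[Tuple[str, str]]:
    """Return (human, ortholog) residue pairs at the selected human positions."""
    if len(human_aln) != len(ortholog_aln):
        raise ValueError("aligned sequences must have equal length")
    pairs = []
    position = 0  # running 1-based index in the ungapped human sequence
    for h, o in zip(human_aln, ortholog_aln):
        if h == gap:
            continue
        position += 1
        if position in pocket_positions:
            pairs.append((h, o))
    return pairs


def pocket_identity(pairs: List[Tuple[str, str]], gap: str = "-") -> float:
    """Identical residues over aligned residues within the projected subalignment."""
    aligned = [(h, o) for h, o in pairs if o != gap]
    if not aligned:
        return 0.0
    return sum(1 for h, o in aligned if h == o) / len(aligned)


# Toy alignment (not a real receptor): pocket residues at human positions 2, 3, and 5.
pairs = project_alignment("MK-TLV", "MRATLI", {2, 3, 5})
```

Running the same projection with a different residue set gives the per-subpocket values, so one full alignment serves every pocket definition.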

Results

Orthologs of Human EDC and ADR Targets in Aquatic Vertebrates

Five fish and amphibians frequently used in toxicological evaluations were used in this study: D. rerio, P. promelas, T. rubripes, X. laevis, and X. tropicalis. In their proteomes, we identified the orthologs of the known human side-effect and environmental target proteins. In some cases, orthologs could not be found: 89% of the toxicity targets were identified in D. rerio, 20% in P. promelas, 84% in T. rubripes, 51% in X. laevis, and 85% in X. tropicalis (Table S1, Supporting Information). This may be explained by the fact that only the genomes of D. rerio,[31,32] T. rubripes,[33] and X. tropicalis[34] have been fully sequenced, while the remaining two genomes (P. promelas and X. laevis), and thus proteomes, are incomplete. Additionally, in some cases, only protein fragments of the toxicity target orthologs have been identified. The sequences of the human and orthologous toxicity proteins were aligned, and the full sequence similarity was calculated (Figure 2a, Table S1, Supporting Information).
Figure 2

Sequence similarity (percentage and color) and sequence identity (number of identical residues/number of aligned residues is shown in parentheses) for the 28 toxicity target proteins of (a) the full sequence and (b) the ligand-contact residues conserved for 80% of the cocrystallized ligands. White spaces indicate that no ortholog was identified (often due to an incomplete proteome).


Full Sequence Similarity between Human EDC/ADR Targets and Their Orthologs in Aquatic Vertebrates

The relevance of a model organism for prediction of toxicity in humans has previously been evaluated using the amino acid conservation across entire protein sequences, e.g., ref (10). In the present study, the majority of the human toxicity targets displayed 60–70% sequence similarity with their aquatic vertebrate orthologs (Figure 2a). The average full sequence similarity between the human proteins and the aquatic orthologs was 69% for D. rerio, 63% for P. promelas, 70% for T. rubripes, 71% for X. laevis, and 72% for X. tropicalis (Figure S1, Supporting Information). In some cases, the overall sequence similarity was relatively high. For example, X. tropicalis had the highest full sequence similarity for the androgen receptor (AR, 88%). However, the protein sequence for X. tropicalis was only a fragment of the full sequence that lacked the N-terminal domain of the protein compared to the other species, giving artificially higher sequence similarity. The corticotropin-releasing factor receptor 1 (CRF1R) is highly conserved in four species (∼85% sequence similarity). The interspecies variations in full sequence similarity were more informative for the estrogen receptors α and β (ERα and ERβ, respectively), and the glucocorticoid receptor (GR), where the full sequences were similar in length. For these receptors, X. laevis and X. tropicalis shared higher conservation with human (9–24% higher sequence similarity) than did D. rerio, P. promelas, and T. rubripes. The impact of the variability of the sequence length on the full sequence similarity demonstrates the difficulties with using the full protein sequence (or longest available sequence) in these calculations.

Ligand-Binding Pocket Similarity between Human EDC/ADR Targets and Their Orthologs in Aquatic Vertebrates

As expected, the ligand-binding pockets of the orthologous proteins generally shared higher sequence conservation with the human toxicity targets than the full protein sequences (Figure 2b). For example, the ligand-binding site of human AR shared ∼98% sequence similarity with all five species, whereas the full sequence similarity was only 47–88%. Likewise, the binding sites of ERα, ERβ, and GR are 92–100% conserved in all five aquatic species, while the highest full sequence conservation observed in X. laevis and X. tropicalis did not exceed 70–76%. The relative ranking of species by the full sequence similarity to humans often varies from that by binding pocket similarity. For example, on the basis of full sequence similarity, one would choose X. laevis or X. tropicalis as the most relevant model for testing ERα-targeting chemicals; however, our pocket similarity analysis indicates that all five species are almost equally good, with the fish species having a slight advantage over the frogs. Similarly, despite being most similar to human in terms of full β2 adrenergic receptor (β2AR) sequence, X. tropicalis is probably the least accurate of the five models for evaluation of β2AR ligand pharmacology, as it has as many as 5 residue substitutions in the binding pocket (Figure S3, Supporting Information). Surprisingly, two targets had lower sequence conservation in the binding site as compared to the full sequence. These were the obesity- and stress-related targets, peroxisome proliferator-activated receptor γ (PPARγ) and CRF1R. PPARγ displayed lower binding-site similarity (56–85%) than full sequence similarity (74–89%). CRF1R displayed higher sequence similarity across the full protein sequence (∼85%) than in the peptide-binding site in its extracellular domain (46–78%). 
However, GPCRs often have a greater degree of sequence variability in the extracellular domains; hence, the lower sequence similarity in the peptide-binding site of CRF1R is consistent with the nature of this receptor.

Ligand-Binding Pockets in ADR/EDC Targets: One Size Does Not Fit All

On closer inspection of the ligand-binding interactions in the X-ray crystal structures of the human EDC and ADR targets, there were often noticeably different residue interaction fingerprints for different ligand chemotypes. In some cases, different chemotypes can bind to distinct ligand-binding pockets or “sub-pockets” of the proteins. This is exemplified by the identification of three different subpockets of the adenosine A2A receptor (A2AR). Promisingly, the three subpockets identified for A2AR (Figure 3a) correspond to an agonist-bound structure (Figure 3b), the endogenous agonist-bound structure (Figure 3c), and the antagonist-bound structures (Figure 3d), respectively. All subpockets were fully conserved in X. laevis and X. tropicalis. Additionally, significant variations in the conservation of subpockets can be observed for the ortholog of β2AR in X. tropicalis (Figure S3, Supporting Information), where subpocket 1 displays 75% conservation, yet subpocket 2 has only 48% sequence similarity.
Figure 3

(a) Sequence similarity (percentage and color) and sequence identity (number of identical residues/number of aligned residues is shown in parentheses) for the three A2AR subpockets (white spaces indicate that no ortholog was identified). A2AR crystal structures (gray ribbons), all cocrystallized ligands (mesh), and subpocket (solid surface); (b) subpocket 1 (agonist-bound structures), (c) subpocket 2 (the endogenous agonist-bound structure), and (d) subpocket 3 (antagonist-bound structures).

Because the likelihood of a chemical interacting with an aquatic species ortholog of its target protein largely depends on the conservation of specific interacting residues and not the entire binding site, we sought to identify the individual subpockets in each of the target pockets and to separately evaluate their similarity to the corresponding subpockets in the studied aquatic organisms. Subpockets were identified by the clustering of contact-strength fingerprints (see Materials and Methods).

GPCR Subpocket Sequence Conservation

GPCRs are a superfamily of membrane bound proteins characterized by seven transmembrane (TM) helices and many have been implicated in ADRs, endocrine disruption, and reproductive toxicity.[35] The A2AR is implicated in a number of ADRs such as palpitations and angina.[18] ADRs for the β2 adrenergic receptor (β2AR) include tremor, cardiac failure, and angina;[18] it has also been implicated in ED in aquatic vertebrates.[4,36] The serotonin 2B receptor (5-HT2BR) is linked to valvular heart disease;[37] the histamine H1 receptor (H1R) is involved in sedation, and the human M2 muscarinic acetylcholine receptor (M2R) is associated with constipation.[18] The dopamine D3 receptor (D3R) is implicated in dyskinesia and Parkinsonism[18] and shown to bind the known endocrine disruptor BPA.[38] Two class B GPCRs were also evaluated: CRF1R, which is implicated in stress-related disorders,[39,40] and the gastric inhibitory polypeptide receptor (GIPR), which is implicated in diabetes and obesity.[41] Two subpockets were identified for β2AR (Figure S3, Supporting Information), the classical orthosteric site (subpocket 1) and the orthosteric site with some additional residues from the less conserved TM1/TM2/TM7 region (subpocket 2). Generally, X. tropicalis displayed poor ligand-binding pocket conservation to the human β2AR (75% and 48%, subpockets 1 and 2, respectively). Due to the scarcity of multiple crystal structures for many GPCRs, subpockets could not be explored for the 5-HT2BR, D3R, H1R, κ opioid receptor (κOR), and M2R; however, the binding pockets were generally well conserved (69–100%; Figures 2b and S4, Supporting Information). At the time of this study, crystal structures were only available for the extracellular domains of the GPCRs CRF1R and GIPR, which contain the peptide-binding sites. 
These peptide-binding sites were expected to have lower levels of conservation because it is well established that the extracellular domains of GPCRs have a large degree of sequence variability. Only X. tropicalis had a moderately conserved ortholog for GIPR (61%, Figure S4, Supporting Information), indicating that alternate animal models should also be investigated. For CRF1R, X. laevis and X. tropicalis displayed higher ligand-binding pocket similarity across both subpockets (60–78%, Figure S4, Supporting Information). However, it is unlikely that peptides in the environment would result in endocrine disruption via the peptide-binding site of CRF1R and GIPR in either humans or the fish and amphibians evaluated in this study, as potential ED peptides are unlikely to be readily absorbed. Consequently, this technique should also be applied to the small molecule binding site of GIPR when a structure becomes available and to the recently released structure of CRF1R.[42]

Nuclear Receptor Subpocket Conservation

Nuclear receptors are a superfamily of proteins that regulate development, growth, and homeostasis, and they are commonly implicated in endocrine disruption. Some classic examples of ED that occur via nuclear receptors include the weak agonistic activity of the plasticizer bisphenol A (BPA) against the ERα;[43] the feminization of fish by 17α-ethinylestradiol (EE2), a synthetic estrogen in human contraceptives;[44] and modulation of PPARγ by EDCs, which is implicated in obesity.[45] ERα subpockets were generally highly conserved across the aquatic species (94–100%, Figure S5, Supporting Information), with the exception of T. rubripes for subpocket 8, which is bound to a large estradiol metal chelate ligand (88%). The binding pocket of ERβ across the five species, compared to the human ERβ, was generally highly conserved (92–100%, Figure S4, Supporting Information). However, across all the subpockets, T. rubripes was slightly less conserved (92–95% vs. 98–100%). ERR1 has only been cocrystallized with two unique ligands in two unique subpockets (Figure S4, Supporting Information), with subpocket 2, cocrystallized with a thiazolidinedione, having higher sequence conservation (82–89% vs. 54–60%). The subpockets of the glucocorticoid receptor (GCR) were generally well conserved with the human receptor (91–98%, Figure S4, Supporting Information). The binding sites of the progesterone receptor (PR) for X. laevis and X. tropicalis shared slightly higher pocket conservation with the human receptor (98–100%, Figure S4, Supporting Information). The subpockets of the androgen receptor (AR) were highly conserved (96–100%, Figure S6, Supporting Information), and the subpockets of the thyroid hormone receptor β (TRβ) were fully conserved (100%, Figure S4, Supporting Information). Unlike TRβ, the thyroid hormone receptor α (TRα) did not show full sequence conservation across all species (86–100%; Figure S4, Supporting Information).
All subpockets across all species (except for P. promelas, for which no ortholog was identified) were fully conserved for the Liver X Receptor (LXR; Figure S4, Supporting Information). While no subpockets were identified for the mineralocorticoid receptor (MCR; Figure S4, Supporting Information), X. tropicalis had the lowest LBD similarity (81%). Of the five aquatic species, T. rubripes consistently displayed higher homology to the human Pregnane X receptor (PXR; 54–64%, Figure S7, Supporting Information). Despite this, the overall pocket similarity was relatively low (maximum 64%), indicating that PXR is not well conserved in these aquatic vertebrates and that other animal models with higher binding site conservation should also be investigated. Similarly, low binding-pocket conservation was observed for the Constitutive Androstane Receptor (CAR; 35–43%; Figure S4, Supporting Information). In 15 out of the 16 subpockets, X. tropicalis had the highest ligand-binding pocket sequence similarity to the human PPARγ (81–100%; Figure S8, Supporting Information). Interestingly, X. laevis, a close relative of X. tropicalis, had significantly lower ligand-binding pocket sequence similarity (50–80%).

Cytochrome P450 Subpocket Sequence Conservation

Cytochrome P450s (CYPs) are a superfamily of enzymes that catalyze the oxidation of a diverse range of organic compounds and are commonly involved in the metabolism of xenobiotic compounds. CYPs typically have large and conformationally flexible binding sites in order to accommodate a wide range of chemically dissimilar compounds,[46,47] which is supported by the diverse array of subpockets identified. There were closely related orthologs to the human CYP1A2, with D. rerio having the highest pocket similarity (96%, Figure S4, Supporting Information). Both D. rerio and P. promelas had closely related orthologs of CYP3A4 across five out of six subpockets (89–100%, Figure S9, Supporting Information). X. laevis and X. tropicalis had the highest subpocket similarities for CYP2C9 (60–78%); however, the ligand-binding pocket conservation was moderate (Figure S4, Supporting Information). Orthologs of CYP2D6 were only identified in X. laevis and X. tropicalis, which displayed good conservation to the human protein (Figure S4, Supporting Information).

Subpocket Sequence Conservation of Other Enzymes

Monoamine oxidase A (MAO-A) is involved in the catabolism of neurotransmitters and dietary amines; inhibition can lead to neuroendocrine disruption,[48] and it is implicated in ADRs including psychosis and hypertensive crisis.[49] ADRs associated with cAMP-specific 3′,5′-cyclic phosphodiesterase 4D (PDE4D) include diarrhea and nausea,[50] and due to its role in the endocrine system, PDE4D may also be a target for EDCs.[51] The binding site of PDE4D was fully conserved across the identified ortholog binding sites (100%, Figure S4, Supporting Information). The subpockets for MAO-A, however, displayed higher sequence similarity for D. rerio and T. rubripes (95%, Figure S4, Supporting Information).

Discussion

The present study performs a comparison of 28 human toxicity targets to their orthologs in five aquatic species, with the goal of identifying the aquatic organisms with the highest ligand-binding pocket sequence similarity to the human toxicity target. The comparison was performed not only at the level of full protein sequences but also, more relevantly, at the level of the ligand-binding sites. By using the X-ray crystal structures of human toxicity targets, residue-level interaction fingerprints were calculated for each unique cocrystallized ligand, and binding pockets and spatially distinct subpockets were identified, with each residue selection extrapolated onto the orthologous proteins in the five aquatic vertebrates. In some cases, the contact fingerprints could also separate the toxicity target crystal structures based on the mode of action of the cocrystallized ligands (such as A2AR; Figure 3), providing a basis for understanding the subpocket sequence conservation. We identified the aquatic vertebrate(s) that share the highest sequence similarity for the ligand-binding pockets (Table 1), compared to the human toxicity targets, as well as determined the sequence similarity of the spatially distinct subpockets. X. tropicalis had the largest number of orthologs that shared the highest conservation with the human toxicity targets (out of the five aquatic species), having the highest ligand-binding site similarity for 21 out of the 28 toxicity targets, closely followed by D. rerio (19), T. rubripes (19), and X. laevis (18). P. promelas had the lowest number of highly conserved ligand-binding pockets with only 7 ligand-binding sites with high similarity, which can be partially attributed to an incomplete genome.
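Table 1. Identification of the Aquatic Vertebrate Model(s) with the Highest Ligand-Binding Pocket Similarity (Denoted by X) Compared to the Corresponding Human Toxicity Target

| receptor | D. rerio | P. promelas | T. rubripes | X. laevis | X. tropicalis |
| --- | --- | --- | --- | --- | --- |
| 5-HT2BR |  |  |  | X | X |
| A2AR | X |  | X | X | X |
| M2R | X |  | X |  | X |
| β2AR | X | X | X | X |  |
| AR | X | X | X | X | X |
| MAO-A | X |  | X |  |  |
| CYP1A2 | X |  | X |  |  |
| CYP2C9* |  |  | X | X |  |
| CYP2D6 |  |  |  | X | X |
| CYP3A4 | X | X |  |  |  |
| CRF1R |  |  |  | X | X |
| D3R | X |  | X |  |  |
| ERR1 | X |  | X | X | X |
| ERα | X | X | X | X | X |
| ERβ | X | X | X | X | X |
| GCR | X | X | X | X | X |
| GIPR |  |  |  |  | X |
| H1R | X |  | X |  | X |
| MCR | X |  | X | X |  |
| LXR | X |  | X | X | X |
| PXR* |  |  | X | X | X |
| CAR* |  |  |  | X | X |
| κOR | X |  | X |  | X |
| PDE4D | X |  | X |  | X |
| PPARγ |  |  |  |  | X |
| PR |  |  |  | X | X |
| TRα | X |  |  | X | X |
| TRβ | X | X | X | X | X |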

∗ indicates targets where other species should be investigated due to orthologs with only low or moderate pocket similarity.
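The per-species totals quoted in the Discussion can be re-derived from Table 1 with a short tally. The sketch below is only a transcription of the table: the names `SPECIES`, `TABLE_1`, and `tally` are illustrative and not from the paper, and `-` stands for an empty cell.

```python
# Columns of Table 1, in the order they appear in the header.
SPECIES = ["D. rerio", "P. promelas", "T. rubripes", "X. laevis", "X. tropicalis"]

# One five-character string per receptor, transcribed from Table 1:
# "X" = species marked as having the highest pocket similarity, "-" = empty cell.
TABLE_1 = {
    "5-HT2BR": "---XX", "A2AR":   "X-XXX", "M2R":    "X-X-X", "β2AR":   "XXXX-",
    "AR":      "XXXXX", "MAO-A":  "X-X--", "CYP1A2": "X-X--", "CYP2C9": "--XX-",
    "CYP2D6":  "---XX", "CYP3A4": "XX---", "CRF1R":  "---XX", "D3R":    "X-X--",
    "ERR1":    "X-XXX", "ERα":    "XXXXX", "ERβ":    "XXXXX", "GCR":    "XXXXX",
    "GIPR":    "----X", "H1R":    "X-X-X", "MCR":    "X-XX-", "LXR":    "X-XXX",
    "PXR":     "--XXX", "CAR":    "---XX", "κOR":    "X-X-X", "PDE4D":  "X-X-X",
    "PPARγ":   "----X", "PR":     "---XX", "TRα":    "X--XX", "TRβ":    "XXXXX",
}


def tally(table):
    """Count, for each species, how many receptors carry an X in its column."""
    counts = {species: 0 for species in SPECIES}
    for marks in table.values():
        for species, mark in zip(SPECIES, marks):
            if mark == "X":
                counts[species] += 1
    return counts


counts = tally(TABLE_1)
for species, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{species}: {n} of {len(TABLE_1)}")
```

The tally gives 21 for X. tropicalis, 19 each for D. rerio and T. rubripes, 18 for X. laevis, and 7 for P. promelas, matching the counts stated in the text.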

In this study, we demonstrated that the major difficulty in using full sequence similarity to compare toxicity target orthologs to human proteins arises from variations in the length of the amino acid sequences. For example, while X. tropicalis has the highest full sequence similarity for AR (88%), the longest available sequence of the X. tropicalis AR was actually incomplete, lacking the N-terminal region of the protein including the DNA binding domain (393 residues vs. >729 residues), thus giving an artificially higher sequence similarity. This also occurred for some of the aquatic orthologs of MCR, PDE4D, PPARγ, PR, TRα, and TRβ. Additionally, we have shown that high full sequence similarity does not always correlate with high ligand-binding site conservation. For example, the sequence similarity for the extracellular domains of CRF1R is high for all species (∼85%), yet the peptide-binding sites have lower conservation (46–78%). Generally, we have demonstrated that the ligand-binding sites share higher conservation between orthologs than the full sequences do (Figure 2). Consequently, ligand-binding site similarity is the preferred measure for identifying the most conserved orthologs, because it is more informative than full sequence similarity and is not influenced by variations in the length of the longest available amino acid sequence of an ortholog. If full sequence similarity alone is to be considered, variations in the length of the full (or longest available) sequence should also be incorporated into these assessments.

There are a few caveats that need to be taken into consideration when using orthologous sequence comparisons to aid in the selection of animal models for the evaluation of toxicity. First, the provided principles only suggest toxicity target orthologs in aquatic species based on sequence similarity, without attention to possible variations in protein function or the downstream pathways.[10,52] The method does not provide any detail regarding the signaling pathways of the orthologous proteins and therefore requires a certain level of understanding of the animal model. Binding pocket similarity may be a necessary but not a sufficient condition for model utility, as exemplified by the pair of human and rat ARs: a large-scale study of interspecies variations in binding affinity of chemicals[17] identified this pair as having systematic one log unit differences in the potency of multiple diverse chemicals, despite the fact that not only the binding pockets but also the entire ligand binding domain of AR is strictly conserved between human and rat. Second, our method is reliant on the availability of the proteome of the organisms or, at the very least, the availability of sequences of the orthologs of the toxicity targets. Third, calculating ligand-binding site conservation requires X-ray crystal structures of the human toxicity targets, preferably in complex with a diverse range of chemicals. Both of the problems regarding the availability of full proteomes and crystal structures can be addressed in future studies, due to the increasing availability of these data. Thus, this study could be expanded to a wider range of toxicity targets and species, including toxicity targets that lack crystal structures, by using crystal structures of highly homologous proteins.

By calculating the amino acid similarity in the ligand-binding pockets, we have avoided the problem of full sequence length variability in sequence similarity calculations and determined the aquatic orthologs with the most similar ligand-binding pockets for 28 human toxicity targets. This method also allows for the calculation of binding site similarity for subpockets that are involved in specific chemical–protein interactions. We believe that this study will be a useful tool when designing target-specific assays for the assessment of ADRs and ED potential of chemicals.
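The sequence-length effect described in the Discussion can be illustrated with a minimal sketch. It is not the procedure used in this study: it scores plain percent identity rather than amino acid similarity, and the two aligned toy sequences, the pocket column indices, and the function name `percent_identity` are all hypothetical, chosen only for illustration.

```python
def percent_identity(ref_aln, ortho_aln, columns=None, skip_gaps=True):
    """Percent identity between two pre-aligned sequences of equal length.

    columns   : alignment columns to score (None scores every column).
    skip_gaps : if True, columns where the ortholog has a gap are ignored,
                which is how a truncated sequence can look artificially similar.
    """
    if len(ref_aln) != len(ortho_aln):
        raise ValueError("aligned sequences must have the same length")
    if columns is None:
        columns = range(len(ref_aln))
    matches = scored = 0
    for i in columns:
        a, b = ref_aln[i], ortho_aln[i]
        if a == "-":
            continue  # no reference residue in this column
        if b == "-" and skip_gaps:
            continue  # missing ortholog residue is not counted at all
        scored += 1
        matches += a == b
    return 100.0 * matches / scored if scored else 0.0


# Hypothetical alignment: the ortholog lacks the first 8 residues (N-terminal
# truncation) and differs from the reference at one aligned position.
human = "MKTAYLLGSWYFDEHR"
ortholog = "--------SWYFDQHR"
pocket_columns = [9, 10, 12, 14]  # hypothetical ligand-contact positions

full_skipping_gaps = percent_identity(human, ortholog)
full_counting_gaps = percent_identity(human, ortholog, skip_gaps=False)
pocket_only = percent_identity(human, ortholog, columns=pocket_columns)

print(full_skipping_gaps, full_counting_gaps, pocket_only)
```

In this toy case the full-sequence value changes from 43.75% to 87.5% depending only on how the missing N-terminal residues are handled, whereas the pocket-restricted value (100%) does not depend on sequence length as long as the pocket residues themselves are present in the ortholog sequence.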