
Toward structure-based drug design against the epidermal growth factor receptor (EGFR).

Yazan Haddad1, Marek Remes1, Vojtech Adam1, Zbynek Heger2.   

Abstract

Most of the available crystal structures of epidermal growth factor receptor (EGFR) kinase domain, bound to drug inhibitors, originated from ligand-based drug design studies. Here, we used variations in 110 crystal structures to assemble eight distinct families highlighting the C-helix orientation in the N-lobe of the EGFR kinase domain. The families shared similar mutational profiles and similarity in the ligand R-groups (chemical composition, geometry, and charge) facing the C-helix, mutation sites, and DFG domain. For structure-based drug design, we recommend a systematic decision-making process for choice of template, guided by appropriate pairwise fitting and clustering before the molecular docking step. Alternatively, the binding site shape/volume can be used to filter and select the compound libraries.
Copyright © 2020 Elsevier Ltd. All rights reserved.

Year:  2020        PMID: 33075469      PMCID: PMC7567673          DOI: 10.1016/j.drudis.2020.10.007

Source DB:  PubMed          Journal:  Drug Discov Today        ISSN: 1359-6446            Impact factor:   7.851


Drug design (ligand-based versus structure-based)

Over the past few decades, drug design has shifted from a computer-aided (i.e., computer graphics [1]) paradigm into a computer-directed or guided process. Depending on the elementary molecule used for computational design, two state-of-the-art strategies have advanced in this field, namely ligand-based and structure-based design. On the ligand-based side, quantitative structure–activity relationship (QSAR) was successfully used for nearly half a century to build reliable statistical models to predict the properties of new chemical compounds. The method is based on the study of quantitative relationships between physicochemical properties of chemical compounds and their biological activities [2]. Structure-based drug design is concerned with how ligands bind to the protein structure and methods of binding energy estimation. The historic analogy of protein and ligand with ‘lock and key’ was later replaced with the ‘hand and glove’ concept to mimic the flexibility of both the protein and ligand [3]. However, covalent inhibitors (in which the glove is permanently linked to the hand) need to first form a stable noncovalent complex in their search for optimal conditions to form covalent links in the final complex [4]. A structure-based virtual screening is performed via bottleneck rounds of high-throughput characterization, analysis, filtering, and selection. The protocol is summarized as follows: (i) selection of target protein structure(s) from crystal structure repositories (e.g., Protein Data Bank; PDB) or from computational models constructed via homology modeling or molecular dynamics (MD) simulations; (ii) characterization of the binding site; (iii) compound library construction, which involves several processes of characterization, filtering, and clustering; (iv) molecular docking of target with compounds supplemented with known actives and decoys followed by scoring; and (v) final evaluation and validation [5]. 
Given the dynamic nature of targets (which is required for them to function in physiological conditions), the binding site of the target–ligand complex is often adapted to new conformations involving either the protein side-chains or even the backbone [6]. Many specialized methods for optimizing and improving the accuracy of ligand-based and structure-based drug design have been developed, such as scaffold hopping and R-group replacement of lead compounds [7], the introduction of machine learning to QSAR for modeling of chemical features in compound libraries [8], and rotamer dynamics for monitoring side-chain flexibility in the molecular simulations of proteins [9]. The most accurate predictions can arise from using an MD simulation approach to screen libraries of compounds, such as the recent attempt to screen 8000 compounds against the viral spike protein in coronavirus disease 2019 (COVID-19) utilizing the SUMMIT supercomputer at Oak Ridge National Laboratory [10]. However, computational resources on this scale are not available to the average researcher. Docking software has recently improved in terms of its ability to accommodate side-chain flexibility [3]. Nevertheless, here we show that there are greater changes to be anticipated in the structure-based drug design targeting EGFR, a well-established receptor in many types of cancer.

EGFR tyrosine kinase inhibitors

EGFR (Fig. 1a) is the main druggable oncogenic target in nonsmall cell lung cancer (NSCLC), with nearly 50% of Asian patients and 15% of Caucasian patients presenting with growth-activating mutations. Most of these mutations are exon 19 deletions and L858R point mutations [11,12]. Almost 10 years have passed since tyrosine kinase inhibitors (TKIs) surpassed chemotherapy in treatment outcomes for metastatic NSCLC. Nevertheless, there have been many obstacles, including the development of cellular chemoresistance to TKI therapy. The first generation of TKIs [gefitinib (ZD1839; Iressa®; AstraZeneca, UK and Teva Pharmaceutical Industries, Israel) and erlotinib (Tarceva® by Astellas Pharma US, Inc., and Genentech, Inc., USA)], which were characterized by reversible binding to EGFR, was followed by second-generation TKIs [afatinib (BIBW2992; Gilotrif® by Boehringer Ingelheim, Germany)] that bind irreversibly to EGFR (and its HER2 homolog) at Cys797. Yet, resistance to therapy continued. A selective mutation causing a key substitution (T790M) in the kinase domain of EGFR contributed to nearly half of chemoresistant cases. This led to the development of the third-generation TKIs [osimertinib (AZD9291; Tagrisso® by AstraZeneca, UK); rociletinib (CO-1686); olmutinib (HM61713); and nazartinib (EGF816)], which are still challenged by the development of chemoresistance [11,13]. Approximately 10–26% of cases develop resistance to second-line osimertinib treatment because of the EGFR C797S mutation, making it the most common tertiary EGFR mutation. When osimertinib was used as a front-line treatment, C797S mutation was reported in 7% of cases, making it the second most frequent mechanism of drug resistance behind MET amplification. Furthermore, C797S mutation is predicted to interfere with binding of other third-generation TKIs, such as rociletinib, olmutinib, and nazartinib [12].
One recent alternative strategy was to develop C797-targeting covalent inhibitors that do not lose their potency (and affinity) in the presence of C797S mutants [14]; however, there was still a need to develop therapeutics using alternative mechanisms of action.
Figure 1

N-lobe structural variations in the epidermal growth factor receptor (EGFR) kinase domain. (a) 3D structure of the EGFR kinase domain complexed with third-generation drug osimertinib showing the C-lobe (blue) and N-lobe (green). Distinct mutation sites are highlighted in orange, whereas the DFG domain is highlighted in red. (b) Dendrogram showing weighted pair-group average clustering of the N-lobe of EGFR based on distance matrix. Structure–structure alignment was used to compare four wild-type apo EGFR with representative structures from classified families of co-complexed EGFR with ligands. Method: The structure–structure multiple-alignment using aln.malign3d() function in Modeller (9.17, r10881) was used to perform throughout (N−1) dynamic programming where each new structure is compared to the average of the previous structures in each new cycle. Default settings were used [off diagonal = 100, gap penalties 3d = (0.0, 1.75), fit atoms = ‘CA’]. Dendrogram was built in Modeller with a cluster cutoff of −1.0.

Fourth-generation TKIs were developed by targeting EGFR away from the ATP-binding site. The first allosteric inhibitor compound (called EAI001 or EGFR allosteric inhibitor-1) was discovered by screening a library of ∼2.5 million compounds against a mutant peptide derived from EGFR in the presence of ATP. Following optimization for selectivity, the new EAI045 inhibitor was found to have high potency and selectivity as an allosteric, non-ATP competitive inhibitor of L858R/T790M mutant EGFR [15]. The allosteric binding pocket is far from the C797S site and, although EAI045 is ineffective alone because of receptor dimerization, it is promising for use in combination with cetuximab against T790M and C797S mutants [16]. JBJ-04-125-02 is another example of fourth-generation allosteric TKIs that shows promising potency when used in combination with osimertinib [17]; however, when used alone, its efficiency drops because of EGFR dimerization.
C797S mutation is not the only reason for TKI drug resistance [12]. Resistance also arises from other EGFR mutations (e.g., G796X), and other acquired alterations, such as gene amplification (e.g., MET, HER2, etc.), oncogenic fusions, and MAPK-PI3K mutations. Unfortunately, there is no standard or systematic treatment following the failure of the four generations of TKIs. The navigation of this ‘no man's land’ is sometimes managed by lytic-inducing strategies or onco-immunologic methods; however, many alternative strategies are under investigation [18]. Recent clinical trials on the combination of chemotherapy and osimertinib have shown promising progress, possibly because of the better toxicological profile of osimertinib compared with previous generation TKIs [19]. A recent meta-analysis on the outcomes of ∼4465 patients with EGFR-mutant NSCLC receiving TKI treatment showed that particular cohorts can have more benefits than others; for example, females over males, nonsmokers over smokers, and exon 19 deletion over L858R mutation [11]. Whether it is a sign of TKI development slow down or a natural development, ‘precision medicine’ has become an important step for management of the disease by stratification of patient cohorts into optimized treatment regimens according to molecular, pathophysiological and ‘omics profiling [20]. In addition to conventional therapies targeting EGFR, such as TKIs (provisionally via ligand-based drug design) and monoclonal antibodies (e.g., cetuximab, panitumumab, and necitumumab); a third strategy is now targeting EGF directly via EGFR-derived peptide-based inhibitors, anti-EGF vaccines, and single-domain antibodies (nanobodies) [21]. Structure-based drug design of EGFR TKIs has been attracting global attention among researchers (Table S1 in the supplemental information online). NMR and X-ray crystals were used to validate structures of ligands only rather than complexes. 
Whereas most studies used a single structure (e.g., 1M17) for screening, a few studies used two models (wild-type versus mutant) for drug design [22,23]. Interestingly, Sun et al. [24] used clusters of 19 EGFR–TKI complex structures for docking in an effort to create an ensemble that can be applied in ligand-based drug design. In 40 surveyed studies (Table S1 in the supplemental information online), structure-based drug design was mainly used as a supporting method to estimate ligand pose or interaction energy. Consequently, the MD simulations used for validation were all too short to capture large movements in the EGFR protein.

Comparative overview of EGFR crystal structures

The development of EGFR TKIs is evolving quickly. At the time of writing, >71 900 ISI-impacted papers had been published on EGFR and over 200 X-ray crystalized protein 3D structures had been studied, mainly complexed with TKI ligands. Here, we analyzed a representative number of EGFR crystal structures to address the following questions: (i) how many different structural variations occur in the EGFR kinase-binding site? (ii) What are their causes based on TKIs and mutations? (iii) How big a role does the scaffold (i.e., core) of the TKIs have in binding compared with the R-groups? (iv) How can inhibitors be classified that reflects their effect? Nearly all the ∼40 publications of the 110 crystal structures described here were ligand-based drug design studies complemented with pharmacokinetics and X-ray crystallography to discover the mode of action. Nevertheless, it might be difficult to select which 3D structure to use. In other words, is there a subset of structures that are suitable for each TKI study? Despite being state-of-the-art, X-ray crystallography experiments take a relatively long time and are not expected to cover the most recent trends in EGFR TKIs. Therefore, it is likely that we have only mapped a small fraction of the conformational landscape of known EGFR TKIs. There are nearly 110 representative EGFR kinase 3D structures spanning the amino acid range 714–950 (numeration according to Uniprot ID P00533-1) and complexed with ligands. The kinase domain is divided into the N-lobe (714–795) and C-lobe (796–950), which clench the TKI ligands (in almost planar geometry) from both sides like a sandwich (Fig. 1a). 
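The lobe boundaries quoted above can be applied directly to coordinate files. The following is a minimal sketch, not the authors' workflow, that splits the C-alpha atoms of one chain into N-lobe and C-lobe using the fixed-column layout of PDB ATOM records. The function names `split_lobes` and `atom_line`, the `offset` parameter, and the synthetic records are all hypothetical; the offset exists because a deposited entry may number residues differently from UniProt P00533-1, which should be checked per entry.

```python
# Residue ranges from the text (UniProt P00533-1 numbering).
N_LOBE = range(714, 796)  # residues 714-795
C_LOBE = range(796, 951)  # residues 796-950


def split_lobes(pdb_text, chain="A", offset=0):
    """Return {'N': [...], 'C': [...]} with (residue number, (x, y, z)) per CA atom.

    offset is added to the residue number found in the file; it is an
    assumption for entries whose numbering differs from UniProt.
    """
    lobes = {"N": [], "C": []}
    for line in pdb_text.splitlines():
        # PDB ATOM record columns (1-based): atom name 13-16, altLoc 17,
        # chain 22, residue number 23-26, x/y/z 31-54.
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        if line[16] not in " A":  # keep only the first alternate location
            continue
        resnum = int(line[22:26]) + offset
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if resnum in N_LOBE:
            lobes["N"].append((resnum, xyz))
        elif resnum in C_LOBE:
            lobes["C"].append((resnum, xyz))
    return lobes


def atom_line(serial, resnum, chain="A", xyz=(0.0, 0.0, 0.0)):
    """Build a synthetic CA ATOM record (illustrative data only)."""
    fields = (serial, chain, resnum) + tuple(xyz)
    return "ATOM  %5d  CA  GLY %s%4d    %8.3f%8.3f%8.3f" % fields


# Synthetic demo: residues on both sides of each boundary, plus one chain B atom.
demo = "\n".join([
    atom_line(1, 713), atom_line(2, 714, xyz=(1.0, 2.0, 3.0)), atom_line(3, 795),
    atom_line(4, 796), atom_line(5, 950), atom_line(6, 951),
    atom_line(7, 750, chain="B"),
])
lobes = split_lobes(demo)
print([r for r, _ in lobes["N"]], [r for r, _ in lobes["C"]])
```

Restricting the comparison to chain A mirrors the choice described in the Figure 2 method; the per-lobe coordinate lists are the kind of input a separate N-lobe or C-lobe alignment would start from.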
There were 83 ligands retrieved from these structures, classified according to the core heterocycle of the scaffold into the following classes (numbers in brackets describe the represented PDB structures): one antibiotic (2), two benzimidazoles (3), six furopyrimidines (6), two indolocarbazoles (5), seven purines (11), one pyrazine (1), seven pyrazolopyrimidines (7), three pyridones (4), 34 pyrimidines (37), one pyrimidopyridone (1), eight pyrrolopyrimidines (13), seven quinazolines (16), one quinoline (2), and two thiazoles (2) (Table S2 in the supplemental information online). Pyrimidine compounds were further classified into the following groups based on their secondary heterocycles: 13 imidazopyridines, one imidazothiazole, two indazoles, two indoles, two pyrazoles, one pyrazolopyridine, two pyridones, four pyrrolopyridines, and seven without secondary heterocycles. Other compounds did not have classifiable heterocycles. This impressive diversity has been achieved in one decade, whereas previously kinase inhibitors were dominated by quinazoline-based TKIs [25]. Four apo-EGFR entries (resolution in Å) were without mutations or additional compounds: 1M14 (2.6 Å), 2GS2 (2.8 Å), 3GOP (2.8 Å), and 4TKS (3.2017 Å). The entry 3GOP was chosen as the reference structure because of its length coverage and good resolution. However, it is the only apo-EGFR entry with a C-helix-out (inactive kinase) conformation, whereas the remainder were presented in the C-helix-in (active kinase) conformation (Fig. 1b). The 110 EGFR 3D structures spanning the full range of the kinase domain were divided into five major family clusters at average weighted distances: 21.4069, 20.6099, 16.6741, and 9.4959 (Fig. S1 in the supplemental information online). There was no clear distribution of ligand classes among these families; however, EGFR mutations were relatively distributed among families in triple, double, single or no EGFR mutations.
To identify local structural regions of divergence, the 3D structures of the two lobes in EGFR kinase were compared separately. Surprisingly, the C-lobe displayed relatively low divergence compared with the full-range and the N-lobe structures, with a maximum weighted distance of 5.0684 (Fig. S2 in the supplemental information online). This led to the conclusion that major structural variations must reside in the N-lobe, which requires further investigation. Of the 110 EGFR kinase N-lobe crystal structures, two distinct clans (78 and 32 structures) with different C-helix orientations (Fig. 3) were separated at an average weighted distance of 20.83; the highest root-mean-square deviation (RMSD) against the reference was 4.242 Å (3LZB) and the lowest was 1.438 Å (2JIT) (Fig. 2; Table S3 in the supplemental information online). Nearly 50 structures within the largest clan were at a distance below 5.0, whereas the other clan showed higher divergence. At least eight distinct families were classified in clusters that were independent of the core of the ligand molecules but shared similar R-groups (in composition, geometry, and charge) facing the C-helix and P-loop. The influence of mutations and similarities in pharmacological properties was also observed in different families (Figs. 1b and 2). X-ray crystallography showed that the ATP-binding cleft between the N-lobe and C-lobe can be wider or narrower as a result of different crystal packing [26], which we assume can influence the flexibility of the glycine-rich P-loop. Our comparative analysis shows highly distinct conformational landscapes for each of the mutant combinations. The most evident cluster of mutants was the triple T790M/L858R/V948R mutant in Family D2 in the N-lobe (Fig. 2) and at average weighted distance of 20.6099 in the whole kinase (Fig. S2 in the supplemental information online).
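The clans and families above come from weighted pair-group average clustering of a pairwise distance matrix. As a rough illustration of that linkage rule (a plain-Python sketch, not Modeller's implementation), the code below merges the closest pair at each step and sets the distance from the merged cluster to every other cluster to the simple mean of the two old distances, regardless of cluster size. The labels `s1`–`s4`, the distance values, and the function names are invented for the example.

```python
def wpgma(labels, dist):
    """Weighted pair-group average clustering.

    dist maps frozenset({a, b}) -> distance for every pair of labels.
    Returns a list of merges (cluster_a, cluster_b, height).
    """
    d = dict(dist)
    clusters = list(labels)
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: d[frozenset(pair)],
        )
        height = d[frozenset((a, b))]
        merged = (a, b)
        rest = [c for c in clusters if c != a and c != b]
        for c in rest:
            # WPGMA update: unweighted mean of the two previous distances.
            d[frozenset((merged, c))] = (d[frozenset((a, c))] + d[frozenset((b, c))]) / 2
        clusters = rest + [merged]
        merges.append((a, b, height))
    return merges


def n_clusters(merges, n_items, cutoff):
    """Number of clusters left when the dendrogram is cut at the given height."""
    return n_items - sum(1 for _, _, h in merges if h <= cutoff)


# Hypothetical distances between four structures.
labels = ["s1", "s2", "s3", "s4"]
dist = {
    frozenset(("s1", "s2")): 2.0,
    frozenset(("s1", "s3")): 10.0,
    frozenset(("s2", "s3")): 14.0,
    frozenset(("s1", "s4")): 30.0,
    frozenset(("s2", "s4")): 30.0,
    frozenset(("s3", "s4")): 40.0,
}
merges = wpgma(labels, dist)
heights = [h for _, _, h in merges]
print(heights, n_clusters(merges, len(labels), cutoff=20.0))
```

In this toy case the last merge happens at 35.0, the mean of 30.0 and 40.0; a size-weighted (UPGMA) update would instead average all three member distances. Cutting at a chosen height is how a dendrogram of this kind yields a fixed number of families.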
Figure 2

Dendrogram showing weighted pair-group average clustering of the N-lobe of epidermal growth factor receptor (EGFR) based on a distance matrix. Structure–structure alignment was used to compare 110 3D structures and resulted in two distinct clans that can be divided further into two to three families each. The largest clan of 78 structures was highly similar and divided into family A (22 structures with one or no mutations in EGFR), family B (31 structures with mostly double EGFR mutations), and family C (25 structures with one or two EGFR mutations, divided into subfamily 1 with 21 structures and subfamily 2 with four structures). The other clan was more divergent and was divided into family D (17 structures divided into subfamily 1 with seven wild-type EGFR structures and subfamily 2 with ten triple-mutated EGFR structures) and family E (15 highly divergent structures divided into two subfamilies of ten and five structures). Method: Using the keyword ‘EGFR’, a RCSB protein databank (www.rcsb.org) database search resulted in 260 structures. Structures were ordered according to best resolution (Å), and then entries that did not cover the kinase domain of EGFR were excluded. Thirteen entries were without mutations and also without inhibitor compounds: 2RFD, 2RF9, 4R3R, 2J5E, 2ITX, 3GT8, 4R3P, 2RFE, 4ZJV, 3VJO, 2GS6, 2GS7, and 4WRG. Fifteen entries were with mutations and without inhibitor compounds: 2ITN, 3UG1, 3VJN, 5CNN, 2EB3, 2ITV, 5CZH, 5CZI, 4I21, 5SX5, 4RIW, 4RIX, 4RIY, 4ZSE, and 5CNO. Only chain A was retrieved (the number of chains ignored were 28 from all structures). The 50% viability, dissociation, and inhibition constants (IC50, Kd, and Ki) were acquired from RCSB database links to the binding databases: PDBBind (www.pdbbind-cn.org), BindingDB (www.bindingdb.org), and BindingMOAD (bindingmoad.org) databases. *Data for the same ligand but the sequence identity of crystal structure was <100% indicating other mutant variants. 
The structure–structure alignment was done by the same method described in Fig. 1 in the main text.

Figure 3

3D superposition of N-lobe of epidermal growth factor receptor (EGFR) kinase domain (gray ribbons) and complexed ligands (green wire with heteroatom coloring). Residues within a 3-Å distance are shown as gray sticks with heteroatom coloring. Subfamilies are colored accordingly. (a) Family A (22 structures with one or no mutations in EGFR) showing the C-helix in proximity to ligands (arrow) and contacts with Glu762. (b) Family B (31 structures with mostly double EGFR mutations) showing T790M mutants and the C-helix in proximity to ligands (arrow) by contacts with Glu762. (c) Family C1 in gray (21 structures with one or two EGFR mutations) showing T790M mutants and the C-helix in proximity to ligands by contacts with Glu762. Family C2 in purple (four structures) showing shifting of the C-helix away from ligands because of a steric effect. (d) Family D1 in brown (seven wild-type EGFR structures) showing steric effect of ligands (brown wire inside the red circle) and the C-helix shifting away with no contacts against Glu762. Family D2 in gray (ten triple-mutated EGFR structures) showing shifting of the C-helix away from ligands possibly because of effects of mutations. (e) Family E1 in gray (ten structures) showing the steric effect of ligands and the C-helix shifting away with no contacts against Glu762. (f) Family E1 and E2 superposed (in gray and gold, respectively). Family E2 (five highly divergent structures) showing a back-shift in the C-helix with ligand contacts against Glu762 and Ile759. Visualization of protein and ligand 3D structures was performed using UCSF Chimera (version 1.10.2). The MatchMaker tool was used for superposition of all heavy atoms via the BLOSUM-62 scoring matrix and Needleman-Wunsch alignment algorithm.

Most ligands in each family were characterized by a secondary amide (scaffold–NH–R1) group projecting from the scaffold and bent at the second bond at a torsional angle of 45–75° toward the base of the C-helix. This places the –R1 group in a new plane where its aromatic parts (if any) are sandwiched between Leu788 and Thr790 (or Met790 in T790M mutant). In families C, D, and E, the –R1 group is further extended to make more contacts with Leu777 and Leu788 (Fig. 3c–f), thus shifting the whole ligand molecule backwards (more precisely outwards) with the bent plane sandwiched between Thr790 and Lys745 (instead of Leu788). The narrow nature of that back-pocket has been a major challenge for R-group replacement in the past. Xu et al. [27] described a method where they first screened the –R1 group for potent ligands targeting the narrow pocket at the base of C-helix, and then screened for ligands selective for EGFR against ERBB2 by modifying the ligand from the opposite side (facing solvent). Several compounds with high potency against EGFR displayed low receptor selectivity. The shift toward mutation selectivity has been a game changer over the past decade. Structural selectivity of gefitinib and erlotinib for the L858R mutant is attributed to their specific recognition of the active kinase state and to weaker ATP binding by L858R EGFR (higher ATP Km value of the mutant EGFR relative to the wild-type). Therefore, the emergence of drug resistance because of the secondary mutation T790M is the result of the restoration of the ATP Km value of the double-mutant enzyme to wild-type levels [28]. An alternative strategy that targets the kinase entrance far from T790M focused on the hydrophobic clamp formed between N-lobe (L718 and V726) and C-lobe (L844). The insight behind this approach was from several EGFR-ligand co-crystals such as 5XDK, 3IKA, and 5X2C [29]. The latest generation of TKIs has no contact or effect on the C-helix.
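The 45–75° bend described at the start of this paragraph is a torsional (dihedral) angle defined by four consecutive atoms. A minimal sketch of how such an angle can be computed from coordinates is given below. The text does not state which four atoms define the torsion, so the points used here are synthetic and chosen to give a known answer; `dihedral` and the helper names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond (0 deg = cis)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    # atan2 form keeps the sign and is numerically stable near 0 and 180 degrees.
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Synthetic four-atom chain built so that the torsion is exactly 60 degrees.
theta = math.radians(60.0)
p0 = (1.0, 0.0, 0.0)
p1 = (0.0, 0.0, 0.0)
p2 = (0.0, 0.0, 1.0)
p3 = (math.cos(theta), math.sin(theta), 1.0)

angle = dihedral(p0, p1, p2, p3)
in_reported_range = 45.0 <= abs(angle) <= 75.0
print(round(angle, 3), in_reported_range)
```

Applied to the relevant ligand atoms in each crystal structure, a check of this kind would show whether a given pose falls inside the range reported for the families.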
EGFR is activated through the formation of an asymmetric dimer, where the C-lobe of an activator interacts with the N-lobe of a receiver, causing its C-helix to fold inwards (i.e., C-helix-in) [30]. EGFR inhibition via a phenomenon called ‘DFG-in-C-helix-out’ was observed in the families C2, D1, D2, and E1 and has been previously described particularly for irreversible TKIs [31,32]. The new conformation of a stable outward C-helix results in a larger and more hydrophobic pocket to accommodate an aromatic moiety from the TKI. In the crystal structures 3IKA and 5GNK, the C-helix is pushed farther from the pocket in the latter structure, but forms hydrogen bonds with Asp855, whereas the acrylamide group in the ligands forms a covalent bond with Cys797 [31]. Although the DFG-in-C-helix-out conformation can provide a selective advantage for EGFR kinases, mechanisms that stabilize the C-helix-in conformation would lead to drug resistance [30]. Indeed, oncogenic mutations causing resistant C-helix-in have been reported in HER2, BRAF, and EGFR exon 19 deletions, targeting particularly inhibitors of the C-helix-out families [33]. The role of C-helix rotation in forming inactive conformation has been well studied. The process is mediated by Leu858 (Fig. 1a), which forms a helical turn and hydrophobic interactions with other residues in the N-lobe, thus displacing the C-helix from the active site and rotating it outwards into an inactive conformation [34]. The DFG motif (Asp855–Gly857) in the C-lobe has been targeted by alternative TKIs in two different receptors. Covalent ligands were reported to target Cys797 of EGFR L858R in the DFG-in conformation and Cys477 of FGFR4 V550L in the DFG-out conformation [35]. Examples of DFG motif targeting include the crystal structures 4JQ7, 4JQ8, 4JR3, and 4JRV [36].

Concluding remarks and recommendations

In their review, Lionta et al. [5] proposed a protocol for drug design in cases where the target structure exhibits many conformations: (i) perform RMSD pairwise fitting comparison of the receptor conformations; (ii) perform clustering analysis of fitted models; and (iii) study the changes in the binding site regarding shape and volume and use it to filter out unsuitable ligands from the library. We suggest that it is important to clarify some peculiarities in using RMSD in comparative modeling. RMSD, which is commonly used among modelers, can provide a biased estimation of model similarity particularly when an entire subdomain in the structure (such as the case of C-helix in EGFR kinase) is shifted without complete changes inside that subdomain. To solve this bias, RMSD will have to be limited to isolated fragments (of local regions) of the structure [37], or instead, a score that considers local regions and includes local/global fitting can be used, such as the global distance test (GDT) [38]. Another widely accepted alternative is the template modeling scores (TM-score and TM-align), which, in addition to local/global fitting, take the length of polypeptide into consideration [39]. We provide the values for the TM-score and GDT scores of the N-lobe fittings of EGFR structures in Table S4 in the supplemental information online. Here, we have clarified some of the confusion regarding EGFR structure-based design and have provided a rationale for the applications for which these structure families can be used. For example, 3D structures of the A, B, and C1 families are C-helix-in with two or fewer mutations. They are suitable for all three generations of TKIs with minimum interactions with the C-helix. The remaining families are C-helix-out conformations with three or fewer mutations per structure, which are most suitable for targeting of the hydrophobic pocket formed by the C-helix-out conformation.
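The RMSD bias discussed above can be illustrated with a small numerical sketch. It assumes the per-residue C-alpha distances after one fixed superposition are already known. Real GDT and TM-score programs search over many superpositions, so this is a simplification rather than a reimplementation. The 82-residue length matches the N-lobe range 714–795; the 15-residue shifted segment and all distance values are invented.

```python
import math


def rmsd(dists):
    """Root-mean-square deviation over per-residue distances (Å)."""
    return math.sqrt(sum(d * d for d in dists) / len(dists))


def gdt_ts(dists, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-style score: mean percentage of residues within each cutoff."""
    n = len(dists)
    fractions = [sum(1 for d in dists if d <= c) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def tm_score(dists, target_len=None):
    """TM-score-style sum with the usual length-dependent scale d0."""
    length = target_len or len(dists)
    if length <= 15:
        raise ValueError("d0 formula is not defined for such short chains")
    d0 = 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in dists) / length


# Invented case: 67 residues fit well (0.5 Å), a 15-residue segment is shifted by 10 Å.
shifted = [0.5] * 67 + [10.0] * 15
# Reference case: every residue fits well.
rigid = [0.5] * 82

for name, dists in (("shifted segment", shifted), ("uniform fit", rigid)):
    print(name, round(rmsd(dists), 2), round(gdt_ts(dists), 1), round(tm_score(dists), 3))
```

In the shifted case the RMSD rises above 4 Å although more than 80% of the residues still superpose within 1 Å, which the GDT-style and TM-score-style values continue to reflect.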

Declaration of Competing Interest

The authors report no declarations of interest.