Evolutionary and Comparative Genomics to Drive Rational Drug Design, with Particular Focus on Neuropeptide Seven-Transmembrane Receptors.

Michael Furlong, Jae Young Seong.

Abstract

Seven transmembrane receptors (7TMRs), also known as G protein-coupled receptors, are popular targets of drug development, particularly 7TMR systems that are activated by peptide ligands. Although many pharmaceutical drugs have been discovered via conventional bulk analysis techniques, the increasing availability of structural and evolutionary data is facilitating a shift to rational, targeted drug design. This article discusses the appeal of neuropeptide-7TMR systems as drug targets and provides an overview of concepts in the evolution of vertebrate genomes and gene families. Subsequently, methods that use evolutionary concepts and comparative analysis techniques to aid in gene discovery, gene function identification, and novel drug design are presented along with case study examples.

Keywords:  7TMR; Coevolution; Evolutionary history; G protein-coupled receptor; Gene duplication; Neuropeptide; Whole genome duplication

Year:  2017        PMID: 28035082      PMCID: PMC5207463          DOI: 10.4062/biomolther.2016.199

Source DB:  PubMed          Journal:  Biomol Ther (Seoul)        ISSN: 1976-9148            Impact factor:   4.634


INTRODUCTION

Seven transmembrane receptors (7TMRs) represent the largest membrane-bound receptor superfamily in humans, with over 840 members (Oh ; Lagerstrom and Schioth, 2008). Genomic analysis of predictable pharmaceutical drug targets indicates that 7TMRs make up 19% of the druggable proteome, and that 36% of existing drugs target 7TMRs (Rask-Andersen ). Worldwide, as of 2014, 7TMR-targeting drugs have a market value of $100 billion, which is expected to grow to $115 billion by 2018 (Ufuk, 2014). 7TMRs are cylindrical receptor proteins usually found in the cellular membrane and are involved in signal transduction, by which chemical messengers found outside of the cell are able to alter intracellular protein activity and gene expression. 7TMRs possess seven hydrophobic transmembrane α-helices, which anchor the receptor into the membrane layer. The α-helices are connected by extra- and intracellular loops, in addition to an extracellular N-terminal strand and an intracellular C-terminal strand (Katritch ). Ligands bind to pockets formed by the extracellular domains and/or transmembrane α-helices and induce conformational changes which modulate the ability of the intracellular domains to interact with various intracellular messenger proteins such as G-proteins and β-arrestins (Conroy ; M’Kadmi ). Vertebrate 7TMRs can be categorised into the glutamate, rhodopsin, adhesion, frizzled/taste2, and secretin families based on the GRAFS system. The rhodopsin-like family can be further subdivided into α, β, γ, and δ subgroups (Fredriksson ). Endogenous ligands for 7TMRs include peptides, amines, lipids, nucleotides, ions, and even photons. 7TMRs that are targeted by the neuropeptide category of peptide ligands are grouped in the secretin-like family and in the β and γ groups of the rhodopsin-like family (Fredriksson ). 
The rhodopsin-like 7TMRs and cognate neuropeptides can then be further categorised into 5 clades (Yun ) and the secretin-like 7TMRs and cognate neuropeptides into 5 families (Hwang ), as displayed in Fig. 1. To date, over 30 7TMR-neuropeptide families have been identified in humans consisting of over 70 7TMR genes and over 60 neuropeptide genes, with a further 8 orphan 7TMRs in 6 families with no known ligands (Table 1).
Fig. 1.

A simplified schematic of a phylogenetic tree of the pre-2R progenitor genes of the 5 clades of rhodopsin peptide-interacting 7TMRs, with the inclusion of the MCR family, and the single clade of secretin peptide-interacting 7TMRs. The inner ring consists of the phylogenetic schematic of the 7TMR progenitors and their VAC placement, and the outer ring consists of the neuropeptide ligand progenitors and their VAC placement. Lines between 7TMRs and their ligands indicate pre-2R interaction between multiple progenitor 7TMRs and ligands. 7TMRs annotated in red indicate the absence of known ligands for these receptors. The VAC bars are colour coded green, pink, tan, yellow, blue, and grey to represent VACs D, E, I/B, C, F, and A, respectively.

Table 1.

Rhodopsin-like and Secretin-like 7TMR and their cognate neuropeptide gene families

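Clade | Family | 7TMR pre-2R progenitors | 7TMR post-2R members | Neuropeptide post-2R members | Neuropeptide pre-2R progenitors
1 | Cholecystokinin | CCKR A/B | CCKAR | CCK | CCK/GAST
 |  |  | CCKBR | GAST/2* | 
 |  | CCKR 3/4 | CCK3R |  | 
 |  |  | CCK4R |  | 
 | Neuropeptide FF | NPFFR 1/2/3 | NPFFR1 | NPFF | NPFF/NPVF
 |  |  | NPFFR2 | NPVF | 
 |  |  | NPFFR3 |  | 
 | Hypocretin | HCRTR 1/2 | HCRTR1 | HCRT | HCRT 1/2
 |  |  | HCRTR2 | HCRT2 | 
 | Tachykinin | TACR 1/2/3 | TACR1 | TAC1 | TAC 1/3/4
 |  |  | TACR2 | TAC3 | 
 |  |  | TACR3 | TAC4 | 
 |  | TACR 4/5 | TACR4/5* |  | 
 | Prokineticin | PROKR 1/2/3 | PROKR1/2* | PROK1 | PROK 1/2/3
 |  |  | PROKR3 | PROK2 | 
 |  |  |  | PROK3 | 
 | Orphan 83 | GPR83 1/2/3 | GPR83-1 | Unknown | Unknown
 |  |  | GPR83-2/3* |  | 
 | Prolactin releasing peptide | PRLHR 1 | PRLHR1 | PRLH1 | PRLH 1/2
 |  | PRLHR 2/3 | PRLHR2 |  | 
 |  |  | PRLHR3 |  | 
 |  | PRLHR 4/5 | PRLHR4 |  | 
 |  |  | PRLHR5 |  | 
 | Neuropeptide Y | NPYR 1/3/4/6 | NPY1R | NPY/2* | NPY/PYY/PPY
 |  |  | NPY3R | PYY/PYY* | 
 |  |  | NPY4R |  | 
 |  |  | NPY6R |  | 
 |  | NPYR 2/7 | NPY2R |  | 
 |  |  | NPY7R |  | 
 |  | NPYR 5 | NPY5R |  | 
 | Pyroglutamylated RFamide peptide | QRFPR 1/2/3 | QRFPR1/1ii* | QRFP | QRFP 1/2
 |  |  | QRFPR2 | QRFP2 | 
 |  |  | QRFPR3 |  | 
 |  | QRFPR 4 | QRFPR4 |  | 
2 | Galanin/Spexin | GALR1 | GALR1A | GAL | GAL
 |  |  | GALR1B | GALP | 
 |  | GALR2/3 | GALR2A | SPX1 | SPX
 |  |  | GALR2B | SPX2/2b* | 
 |  |  | GALR3 |  | 
 | Kisspeptin | KISSR | KISSR1 | KISS1 | KISS 1/2/3
 |  |  | KISSR2 | KISS2 | 
 |  |  | KISSR3 | KISS3 | 
 |  |  | KISSR4 |  | 
 | Urotensin-2 | UTS2R-1/4 | UTS2R-1 | URP/2* | URP/1/2, UTS2
 |  |  | UTS2R-4 | UTS2 | 
 |  |  |  | URP1 | 
 |  | UTS2R-2 | UTS2R-2 |  | 
 |  | UTS2R-3/5 | UTS2R-3 |  | 
 |  |  | UTS2R-5 |  | 
 | Melanin-concentrating hormone | MCHR 1/3 | MCHR1 | PMCH1 | PMCH1
 |  |  | MCHR3 |  | 
 |  | MCHR2 | MCHR2 |  | 
 |  | MCHR 4/5 | MCHR4 |  | 
 |  |  | MCHR5 |  | 
 |  | MCHR 6/7 | MCHR6 |  | 
 |  |  | MCHR7 |  | 
 | Somatostatin | SSTR 1/4/6 | SSTR1 | SST1/3/4* | SST 1-6
 |  |  | SSTR4 | SST2/6* | 
 |  |  | SSTR6 | SST5 | 
 |  | SSTR 2/3/5 | SSTR2 |  | 
 |  |  | SSTR3 |  | 
 |  |  | SSTR5 |  | 
 |  | SSTR 7 | SSTR7 |  | 
 | Neuropeptide-B/W | NPBWR 1/2 | NPBWR1 | NPW | NPB/NPW
 |  |  | NPBWR2 | NPB | 
 | Opioid | OPR D/K/L/M | OPRD | PENK | PDYN/PENK/PNOC/POMC
 |  |  | OPRK | PDYN | 
 |  |  | OPRL | PNOC/POMC* | 
 |  |  | OPRM |  | 
 | Melanocortin | MCR 1 | MC1R | POMC* | 
 |  | MCR 2 | MC2R |  | 
 |  | MCR 3/5 | MC3R |  | 
 |  |  | MC5R |  | 
 |  | MCR 4 | MC4R |  | 
3 | Neuromedin-B/Bombesin subtype 3/Gastrin-releasing peptide | NMBR/BRS3/GRPR | NMBR | GRP | GRP/NMB
 |  |  | BRS3 | NMB | 
 |  |  | GRPR |  | 
 | Endothelin | EDNR A/B/B2 | EDNRA | EDN1 | EDN
 |  |  | EDNRB | EDN2 | 
 |  |  | EDNRB2 | EDN3 | 
 |  |  |  | EDN4 | 
 | Orphan 37 | GPR37 /L1 | GPR37 | Unknown | Unknown
 |  |  | GPR37L1 |  | 
4 | Neuromedin-U | NMUR 1/2/3 | NMUR1 | NMU | NMU/NMS
 |  |  | NMUR2 | NMS | 
 |  |  | NMUR3 |  | 
 | Growth hormone secretagogue/Motilin | GHSR 1/2/3 MLNR | GHSR | GHS | GHS/MLN
 |  |  | GHSR2 | MLN | 
 |  |  | GHSR3 |  | 
 |  |  | MLNR |  | 
 | Orphan 39 | GPR39-1/2 | GPR39 | Unknown | Unknown
 |  |  | GPR39-2 |  | 
 | Neurotensin | NTSR 1/2 | NTSR1 | NTS | NTS
 |  |  | NTSR2 | NTS2 | 
 | Thyrotropin-releasing hormone | TRHR 1/2/3 | TRHR1 | TRH1/2* | TRH
 |  |  | TRHR2 |  | 
 |  |  | TRHR3 |  | 
 | NMUR4 | NMUR4 | NMUR4 | Unknown | Unknown
 | Orphan 139/142 | GPR 139/142 | GPR139 | Unknown | Unknown
 |  |  | GPR142 |  | 
 | NMOGPR** | Unclear | Unclear | Unknown | Unknown
 | Orphan 139-like** | Unclear | Unclear | Unknown | Unknown
5 | Gonadotropin-releasing hormone | GnRHR1A | GnRHR1A | GnRH1 | GnRH
 |  |  |  | GnRH2 | 
 |  |  |  | GnRH3 | 
 |  | GnRHR1 B/C | GnRHR1B |  | 
 |  |  | GnRHR1C |  | 
 |  | GnRHR2 A/B/C | GnRHR2A |  | 
 |  |  | GnRHR2B |  | 
 |  |  | GnRHR2C |  | 
 | Orphan 150 | GPR150-1/2 | GPR150 | Unknown | Unknown
 |  |  | GPR150-2 |  | 
 | Neuropeptide-S | NPSR | NPSR | NPS | NPS
 | Arginine vasopressin/Oxytocin | OTR/AVPR1 A/B | OTR | OXT/2* | OXT
 |  |  | AVPR1A |  | 
 |  |  | AVPR1B |  | 
 |  |  |  | AVP | AVP
 |  | AVPR 2 | AVPR2 |  | 
 |  | AVPR 3/4/5 | AVPR3 |  | 
 |  |  | AVPR4 |  | 
 |  |  | AVPR5 |  | 
 | Orphan 19 | GPR19 | GPR19/-2/-3* | Unknown | Unknown
Secretin-like | Corticotropin-releasing hormone/Urocortin | CRHR 1/2 | CRHR1 | CRH | CRH/UCN
 |  |  | CRHR2 | UCN | 
 |  |  |  | UCN2 | 
 |  |  |  | UCN3 | 
 | Calcitonin/Islet amyloid polypeptide/Adrenomedullin | CALCR /L | CALCR | CALCA/B* | CALC/IAPP
 |  |  | CALCRL | IAPP | 
 |  |  |  | ADM1 | ADM 1/2
 |  |  |  | ADM2 | 
 | Parathyroid hormone | PTHR 1/2/3 | PTH1R | PTH1 | PTH 1/2/LH
 |  |  | PTH2R | PTH2 | 
 |  |  | PTH3R | PTHLH | 
 | Glucagon/Glucose-dependent insulinotropic polypeptide/glucagon related peptide | GLP2R | GLP2R | GCG | GCG/GIP/GCRP
 |  | GLP1R | GLP1R | GIP | 
 |  | GCGR/GIPR/GCRPR** | GCGR | GCRP | 
 |  |  | GIPR |  | 
 |  |  | GCRPR |  | 
 | Growth hormone-releasing hormone/Secretin/Vasoactive intestinal peptide/pituitary adenylate cyclase-activating polypeptide | GHRHR1 | GHRHR1 | GHRH | GHRH/SCT/PACAP/VIP
 |  | SCTR | SCTR | SCT | 
 |  | ADCYAP1R/ADVCYAPR |  | PACAP | 
 |  | GHRHR2/3/VIPR1/2** | GHRHR2/3* VIPR1/2* | VIP | 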

This table lists the known gene families of neuropeptide-interacting 7TMRs, divided into the 5 clades of the rhodopsin-like 7TMRs and the single clade of the secretin-like 7TMRs. It lists the pre-2R progenitors of each family, as well as the post-2R products and subsequent duplications.

Genes which are absent, or present as pseudogenes, in human are noted in grey. Some families possess unclear relationships.

The endogenous ligand genes for the members of each 7TMR family are listed alongside that family.

Neuropeptide-interacting 7TMRs mediate a multitude of roles in the nervous system and peripheral organs and influence a number of physiological and psychological processes including reproduction, growth, homeostasis, metabolism, food intake, sleep, and social and sexual behaviours (Cho ; Mirabeau and Joly, 2013; Vaudry and Seong, 2014). Mature neuropeptides are produced by cleavage of larger precursor proteins which share typical motifs in their amino acid sequences. These motifs include a signal peptide sequence at the N-terminus and an evolutionarily conserved mature neuropeptide sequence, which is often flanked by cleavage sites (Steiner, 1998). As larger quantities of biological data become available, rational, specifically tailored techniques are emerging. For example, novel peptide genes have been identified by tailoring genomic-analysis algorithms to search for specific peptide motifs (Mirabeau ). In addition, comparative and evolutionary genomic analysis techniques are aiding in the delineation of the mechanisms that drive the emergence of gene families, permitting the determination of novel peptide-7TMR interactions (Hwang ; Park ; Kim ). Therefore, this article will discuss, using previously published data, how an understanding of gene family evolution, within the context of vertebrate genome evolution over the past 500 million years, can provide valuable insight for gene discovery, gene function identification, and drug design for peptide ligand-7TMR families.
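The motif-based search described above can be illustrated with a minimal Python sketch. It is not the algorithm used in the cited studies. It assumes, for illustration only, that cleavage sites can be modelled as dibasic residue pairs (KR, RR, KK, or RK), and it reports the segments that lie between two such sites as candidate mature peptides. The function name, the length limits, and the toy precursor sequence are all hypothetical.

```python
import re

# Assumption: a cleavage site is modelled as any dibasic pair (KR, RR, KK, RK).
CLEAVAGE_SITE = re.compile(r"[KR]{2}")


def candidate_peptides(precursor, min_len=5, max_len=40):
    """Return (1-based start, sequence) for segments flanked by two cleavage sites."""
    sites = list(CLEAVAGE_SITE.finditer(precursor))
    hits = []
    # Walk over consecutive pairs of sites and keep what lies between them.
    for left, right in zip(sites, sites[1:]):
        segment = precursor[left.end():right.start()]
        if min_len <= len(segment) <= max_len:
            hits.append((left.end() + 1, segment))
    return hits


# Hypothetical toy precursor: N-terminal stretch, KR, candidate peptide, RR, tail.
TOY_PRECURSOR = "MKLLAALLGASAEDSQ" + "KR" + "AWSPEHGLYMNGTQ" + "RR" + "ESTPE"

for start, peptide in candidate_peptides(TOY_PRECURSOR):
    print(start, peptide)
```

A realistic pipeline would additionally test for an N-terminal signal peptide and for conservation of the candidate segment across species; both steps are omitted here for brevity.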

GENE FAMILY EXPANSION VIA INDIVIDUAL GENE AND WHOLE GENOME DUPLICATION

Gene families are established through evolutionary processes such as gene/genome duplication followed by alterations to gene function or gene loss (Abi-Rached ; Larhammar ; Holland, 2003; Santini ; Vienne ; Kim ; Hwang ). In particular, the quadruplication of genes by two rounds (2R) of whole genome duplication during early vertebrate evolution facilitated the rapid proliferation of genes within vertebrate gene families. These families emerged and were populated via continuous local gene duplications prior to 2R (Hwang ). Subsequent to 2R, heavy gene loss and additional local gene duplications followed by differentiation and functionalisation have created the current variety of vertebrate gene families (Lundin, 1993; Holland ; Larhammar and Salaneck, 2004; Hwang , 2014; Kim ; Sefideh ). Imperfections in the DNA replication process on an evolutionary time scale result in divergence of the amino acid sequences of duplicated genes. Advantageous changes, which enhance gene function and organism reproduction, are more likely to be retained in the species while detrimental changes are more likely to be lost. Neutral changes, which have no overall effect on organism survivability, accumulate at a slower rate (Lynch ). Subsequent to duplication, daughter genes accumulate different mutations which result in a process of differentiation and functionalisation. There are numerous categories of functionalisation that are influenced by factors such as chromosomal location, the method of gene duplication, the location of the mutation, the gene type, and the replication rate of the species (Jensen and Bachtrog, 2011). However, the typical pattern is that one duplicate retains a larger proportion of the original functionality, which leaves less function-conservation pressure on the other duplicate(s). This allows greater freedom for mutations to accumulate subsequent to duplication (Assis and Bachtrog, 2013). 
If the duplicate gene(s) are not rendered non-functional, a process of sub-functionalisation and specialisation often occurs: the expression patterns of the genes differentiate and each gene retains part of the functionality of the pre-duplication gene. As the duplicates continue to diverge, neofunctionalisation may occur and novel functions that did not exist in the pre-duplication gene may be acquired (He and Zhang, 2005; Rastogi and Liberles, 2005; Gibson and Goldberg, 2009; Klingel ).

PHYLOGENY AND SYNTENY FOR DELINEATING GENE FAMILY EMERGENCE

Phylogenetic analysis, using the amino acid sequence of gene products, is an important tool to delineate evolutionary relationships among genes from different taxa. Together with recent advances in bioinformatics tools to discover novel genes, a large amount of protostomian and deuterostomian data is rapidly being accumulated (Mirabeau ). In addition, reverse pharmacological approaches in invertebrates (Hauser ; Lindemans ; Jiang ) and vertebrates (Civelli ) have facilitated discovery of a great number of peptide-7TMR families. Importantly, phylogenetic analyses of protostomian and deuterostomian sequences of peptide 7TMRs show a large number of family subtrees containing both protostomian and deuterostomian 7TMRs, indicating that many vertebrate 7TMR families originated prior to the divergence of deuterostomes and protostomes (Mirabeau and Joly, 2013). However, phylogenetic analysis without knowing the location of selected genes within the genomes of species often fails to correctly determine the evolutionary process for the establishment of a gene family (Abi-Rached ; Larhammar ). Syntenic analysis involves the comparison of the locations of orthologous or paralogous genes between chromosomes, within or among species, which facilitates more accurate analyses of the origins and relationships of individual peptide-7TMR families (Cerda-Reverter ; Lagerstrom ; Lee ; Kim , 2012; Dores, 2013; Osugi ). However, small scale synteny analysis is only suitable for determining the evolutionary history of closely related families. Delineating the evolutionary mechanisms for larger gene families with less closely related members is particularly difficult due to the scattered distribution of the genes on many different chromosomes. 
Thus, to elucidate the evolutionary history of a superfamily, such as 7TMRs and their neuropeptide ligands, large scale synteny, such as comparison of large segments of chromosomes among species that represent a wide selection of the vertebrate clade, is also required. Comparisons of entire genomes between evolutionarily distinct taxa have led to reconstructions of hypothetical ancestral chromosomes of early vertebrates or chordates (Nakatani ; Putnam ), which support the hypothesis that 2R occurred during early vertebrate emergence, approximately 500 million years ago. 2R produced, on average, four gnathostome ancestral chromosomes (GACs) from each pre-2R progenitor vertebrate ancestral chromosome (VAC); these GACs share related sets of genes, defined as ohnologs (Dehal and Boore, 2005; Meyer and Van de Peer, 2005; Nakatani ; Putnam ). Assigning GAC and VAC positions to members of a gene family provides a fast and relatively accurate tool to aid in tracing the origins of gene superfamilies (Yegorov and Good, 2012; Hwang ; Yun ). For instance, the neuropeptide Y (NPYR), prolactin-releasing peptide (PRLHR), orphan G protein-coupled receptor 83 (GPR83), prokineticin (PROKR), tachykinin (TACR), neuropeptide FF (NPFFR), hypocretin (HCRTR), cholecystokinin (CCKR), and pyroglutamylated RFamide peptide (QRFPR) families are phylogenetically grouped in clade 1 of the rhodopsin-like neuropeptide-interacting 7TMRs and are located on VAC_C, except for the HCRTR family (Fig. 1). The receptor families in clade 5, consisting of the orphan GPR19, arginine vasopressin (AVP)/oxytocin (OTR), neuropeptide-S (NPSR), orphan GPR150, and gonadotropin-releasing hormone (GnRHR) families, are mainly located on VAC_D or VAC_A (Yun ) (Fig. 1). 
The secretin-like 7TMR families, comprising the corticotropin-releasing hormone receptor (CRHR), calcitonin receptor (CALCR), parathyroid hormone receptor (PTHR), growth hormone-releasing hormone receptor (GHRHR, which also includes the secretin, vasoactive intestinal peptide, and pituitary adenylate cyclase-activating polypeptide receptors), and glucagon receptor (GCGR, which also includes the glucagon-like peptide 1, glucagon-like peptide 2, and glucose-dependent insulinotropic polypeptide receptors) families, are located on VAC_E, except for the GCGR family (Hwang ) (Fig. 1). Thus, in general, it can be postulated that extensive tandem local duplication within ancestral chromosomes, which occurred prior to 2R, drove the emergence of these gene families. The presence of members of a single clade on two or more chromosomes is likely due to chromosome translocation before 2R. The Nakatani model only rebuilds putative VACs dated shortly before 2R; therefore, this model does not account for prior translocation. Chromosomal translocations between the first and second whole genome duplications (WGDs) may also account for the spread of gene family members from a single VAC onto multiple distinct GACs. When performing evolutionary comparative analysis, either syntenic or phylogenetic, it is important to ensure that specific types of representative species are included. Within a lineage, some species will have particularly well-conserved genomes that have undergone lower rates of chromosomal change and retain a greater variety of genes produced by 2R (Yun ). Those with the most conserved genomes/gene families/morphology are referred to as ‘basal’ species. Therefore, a variety of the most basal species from across the desired spectrum of taxa should be selected as representative species. Furthermore, for comparative evolutionary analysis, the genomes of these species must be available and, to fully exploit their genomic data, their DNA must be allocated to chromosomes and their genes must be well annotated. 
Therefore, the following species provide good representatives: human, whose genome possesses unrivalled annotation and is better conserved than many other available mammal genomes (Burt ); chicken, whose genome has some of the best preserved chromosomes of the tetrapods (Nishida ); and spotted gar, which has recently been considered the best vertebrate representative as, unlike most teleost fish, it has not undergone a 3rd whole genome duplication and has undergone fewer translocations than other vertebrates with mapped genomes (Amores ). Furthermore, the inclusion of experimentally important species, such as mouse, zebrafish, and Xenopus species, helps to increase the usefulness of data and ensure a diverse selection of vertebrate gene samples. Unfortunately, the genomes of a number of important species such as coelacanth, a basal tetrapod (Amemiya ); elephant shark, the slowest known evolving vertebrate (Venkatesh ); and Branchiostoma floridae, a basal chordate and a useful outgroup for vertebrate analysis (Elphick and Mirabeau, 2014), have not yet been arranged into chromosomes and instead are available as short DNA sequences on scaffolds, which diminishes their usefulness for synteny analysis. Other species such as lamprey or hagfish, which could provide important perspectives on inter-2R genome evolution (Caputo Barucchi ; Mehta ), and Asymmetron lucayanum, which may be the most basal chordate discovered (Yue ), have only partial genomes available. The arrangement of genomic data onto chromosomes and annotation of the genes of these species would provide a large boon to vertebrate evolutionary research. In addition to using evolutionarily divergent species simply to put human gene families into perspective, analysing species with particularly interesting physiological attributes could help to design novel therapeutic treatments. 
For example, the naked mole rat shows incredible longevity and resistance to age-related disease and cancer (Lewis ), and the elephant shark possesses an adaptive immune system that lacks several constituent genes that are vital to the mammalian immune system, yet is capable of mounting an immune response (Venkatesh ). Further exploring how gene families have evolved in these species could bring novel insights into treating human medical issues.

DISCOVERY OF NOVEL PEPTIDE GENES AND THEIR EVOLUTIONARY DEVELOPMENT

The use of evolutionarily conserved sequences and motifs to bulk analyse genomes is an established approach to novel gene discovery and genome annotation (Mirabeau ). However, many species, including humans, have lost various ohnologs which may be retained in more basal species (Yun ) (Table 1). For instance, in human there is a single kisspeptin (KISS) gene and a single KISS 7TMR gene (KISSR), while spotted gar has three KISS genes and four KISSR genes (Lee ; Yun ). In human there are two GnRH genes and one GnRHR gene, but coelacanth has three GnRH genes and four GnRHR genes. The loss of multiple members of a family may leave the remaining members too divergent for analyses using conserved motifs to identify each other without related genes to span the evolutionary divide. Therefore, performing BLAST searches using the full repertoire of protein sequences from a gene family, especially if selected from more basal species, gives a higher likelihood of success. However, basal species are not guaranteed to possess more genes in every family. For instance, humans have a single NPS gene and a single NPSR gene while neither spotted gar nor coelacanth appears to have genes from this family (Yun ). In combination with phylogenetic analysis, small scale synteny (Yun ) and VAC/GAC models (Nakatani ), newly discovered genes can be correctly identified as orthologs (the same gene in different species), ohnologs (duplicates produced by WGD), or other paralogs (related genes created by other duplication events). For example, as displayed in Fig. 2, the single human KISS gene is located on a GAC_D2 linkage group while spotted gar possesses an additional two KISS genes on GAC_D0 and D3. Because these two genes are located on different GACs, we can hypothesise that they are separate ohnologs created by 2R from a single progenitor rather than products of an earlier or later local duplication. 
The use of synteny is particularly helpful when analysing peptide gene families because of the limited use of phylogeny to determine the exact relationships among peptide gene families. Therefore, it can be postulated that, because of the lower rates of change in species such as spotted gar, coelacanth and elephant shark, using gene identification algorithms in these basal species may return genes that have diverged too far to be detected in humans, and that by using VAC/GAC models, the regions of the genome that are most likely to harbour novel genes can be prioritised.
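The GAC-based reasoning used for the KISS genes can be sketched in Python as a simple decision rule. This is an illustrative simplification rather than the published procedure, which also draws on phylogeny and small scale synteny. It assumes that a GAC label consists of a single VAC letter followed by an index (for example D2 for GAC_D2), which would not hold for a combined label such as I/B. The gene labels for spotted gar are hypothetical placeholders, since the text does not state which gar gene sits on which GAC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    label: str    # illustrative label, not an official gene symbol
    species: str
    gac: str      # assumed format: VAC letter + GAC index, e.g. "D2"


def vac_of(gene: Gene) -> str:
    """Return the ancestral chromosome (VAC) letter implied by the GAC label."""
    return gene.gac[0]


def classify_pair(a: Gene, b: Gene) -> str:
    """Propose a relationship for two members of the same gene family."""
    if vac_of(a) != vac_of(b):
        return "unlinked"        # different VACs: not related through 2R
    if a.gac != b.gac:
        return "ohnolog"         # same VAC, different GACs: separated by 2R
    if a.species != b.species:
        return "ortholog"        # same GAC in two species
    return "local paralog"       # same GAC in one species: local duplication


human_kiss = Gene("human KISS", "human", "D2")
gar_kiss_d2 = Gene("gar KISS on D2", "spotted gar", "D2")
gar_kiss_d0 = Gene("gar KISS on D0", "spotted gar", "D0")
gar_kiss_d3 = Gene("gar KISS on D3", "spotted gar", "D3")

print(classify_pair(human_kiss, gar_kiss_d2))
print(classify_pair(gar_kiss_d0, gar_kiss_d3))
```

The labels returned are hypotheses to be checked against phylogenetic and syntenic evidence, because translocations between or after the two duplication rounds can move genes between linkage groups.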
Fig. 2.

A simple phylogenetic tree showing the development of the KISS and GAL branch of clade 2 via local duplication prior to 2R (red) and the ohnologs generated via 2R (blue). Genes present in humans are labelled in black, while those discovered in other vertebrate species are labelled in grey. Purple lines connect these 7TMRs to their ligand progenitors, with the dotted purple line between GAL and GALR3 indicating very low interaction. The ligand progenitors are placed on VAC_D and the red box below indicates how 2R expanded these peptide gene families from three progenitor genes into seven modern vertebrate genes, four of which are present in humans.

The evolutionary relationships among gene families can be examined by phylogenetic analysis. However, in the case of neuropeptide genes, signal peptide sequences are not conserved, and propeptide sequences, other than the mature peptide, are highly variable because these sequences are free from evolutionary conservation pressure (Lee ). Sequence comparison of the short, conserved mature peptides is often not sufficient to extrapolate reliable relational information, particularly if they emerged prior to 2R (Cardoso ; Hwang ). Duplications that have occurred more recently, especially those subsequent to 2R, are more likely to be found in the same linkage block and share a high degree of amino acid sequence similarity (Yun ). However, genes that emerged earlier, in pre-vertebrate evolution, or those that undergo particularly high rates of change accumulate mutations which lead to greater deviation in residue sequence and function. In contrast, 7TMR transmembrane domains are reasonably well conserved across vertebrate and invertebrate species, and the amino acid sequences are long enough to generate relatively reliable phylogenetic trees. Given the concept of co-evolution of peptides and their receptor genes, the evolutionary relationships among peptides can be extrapolated by matching them against their cognate 7TMR families. For instance, the amino acid sequences of even the mature neuropeptides that are cognate to the NPYR, PRLHR, PROKR, TACR, NPFFR, HCRTR, CCKR, and QRFPR families, which are phylogenetically grouped in clade 1 (Fig. 1), have diverged so far that phylogenetic analysis cannot be performed. However, when the genes for these neuropeptides are placed on VAC/GACs, 6 of the 8 neuropeptide gene families are found to be located on VAC_E (Yun ). Similarly, the AVP/OT, NPS, and GnRH neuropeptide gene families of clade 5 are on VAC_C. 
These results indicate that neuropeptide families also multiplied through local duplications prior to 2R by the same pattern as their cognate 7TMRs.

COEVOLUTION OF NEUROPEPTIDES AND THEIR RECEPTOR GENES

Every known ligand gene for the related GnRHR, NPSR, and AVPR groups of 7TMRs found in clade 5 (Yun ) can be found on a GAC_C linkage group (Fig. 1). Therefore, the ligands for the orphan GPR150 and GPR19 7TMRs, which are also present in clade 5, can be postulated to also be present on a GAC_C linkage group. Furthermore, because the closest relative of GPR150 is NPSR, the ligand for GPR150 can be postulated to share similarity with the NPSR ligand, NPS. By using these methods Kim were able to determine the receptor for the novel neuropeptide spexin (SPX) (Mirabeau ). Syntenic analysis and relocation of SPX genes and neighbouring genes on reconstructed VACs reveal that SPXs are located in the vicinity of KISS and galanin (GAL) family genes, suggesting that SPX, GAL, and KISS genes arose from a common ancestor through local duplications before 2R and that SPX may interact with receptors exhibiting similarity in amino acid sequence with those of GAL 7TMRs (GALRs) and KISSRs. KISS and GAL 7TMRs are phylogenetically closest among rhodopsin-like G protein-coupled receptors, and synteny revealed the presence of 3 distinct receptor progenitors, KISSR, GALR1, and GALR2/3, before 2R (Fig. 2). A ligand-receptor interaction study showed that SPX activates human, Xenopus, and zebrafish GALR2/3 but not GALR1, suggesting that SPX is a natural ligand for GALR2/3 (Kim ). Furthermore, linkage group analysis of the secretin neuropeptide and 7TMR family genes aided in the identification of a receptor for a novel GCRP neuropeptide (Hwang ; Park ). Gradual duplication, differentiation, subfunctionalisation, and neofunctionalisation form the basis of the model of slow, consistent genome evolution where largely self-contained gene families are inherited from parents and passed down to offspring in a vertical pattern of inheritance. 
This allows genes to be placed into related families and their relationships to be traced through evolutionary history, sometimes over a billion years (Nordstrom ). However, occasionally sudden changes in gene interaction or lateral gene transfer defy this trend. For example, the pro-opiomelanocortin (POMC) gene possesses two evolutionarily distinct subunits, the ACTH subunit and the opioid subunit, which are ligands for two completely unrelated 7TMR families. It appears that the ACTH subunit existed as an individual gene in the vertebrate ancestor and, by chance, its peptide product started to interact with the progenitor of the melanocortin family of 7TMRs (MCR) (Haitina ). The MCR family progenitors emerged from the wider MECA group of receptors, none of which interact with peptide ligands, which means the MCR family is an evolutionary anomaly (Fredriksson ). Subsequent to 2R, it appears that a duplication of an opioid gene, prepronociceptin (PNOC), placed an opioid coding region into the ACTH gene, resulting in a novel hybrid gene, POMC. Furthermore, intra-gene duplication of the ACTH subunit and proliferation of the MCR family have led to the development of an entirely novel system that has only been found in vertebrates (Harris ). Lateral gene transfer can also produce sudden changes that do not fit into the standard evolutionary model. For example, it has been noted that the syncytin gene, which is critical for placental development in mammals (Dupressoir ), sometimes has similar peptide sequences in distantly related mammals while closely related species sometimes have largely divergent syncytin peptide sequences (Redelsperger ). It appears that mammalian syncytin genes originate from a viral gene and that periodically, in a pattern that does not follow standard evolutionary patterns, the gene is updated via reinfection and adopted into evolving mammalian genomes (Cornelis ).

DRUG DESIGN USING EVOLUTIONARY COMPARATIVE ANALYSIS

Subsequent to gene duplication, subfunctionalisation and specialisation can result in related ligands with similar amino acid sequences which have greater or lesser ability to interact with 7TMR subtypes within a gene family (Kim ). The amino acid sequence of a peptide ligand defines the structural and chemical nature of that peptide, including the interactions that the peptide undergoes. Amino acids that are critical for protein function tend to be evolutionarily retained while those of decreasing importance will, on average, have increasing rates of variation. As the functions of related peptides diverge further, variation increases even within the best conserved amino acids. As noted by Yun , humans often possess fewer, and different, paralogs within a gene family compared to other species. Therefore, by using a variety of paralogs from within a gene family from an array of species, different peptide sequences that are capable of binding human 7TMRs of interest at varying potencies and affinities can be analysed (Kim ) and the function of amino acids in receptor binding ascertained (Reyes-Alcaraz ). Once the functions of individual amino acids within a peptide have been analysed, mutational experiments can be conducted to specifically alter the binding affinity of the peptide sequences to create novel ligands. Furthermore, alteration of the nature of the amino acids used in peptide synthesis can reduce susceptibility to proteases. The SPX and GAL neuropeptide genes emerged through a local duplication from a common ancestor gene and both interact with members of the GAL 7TMR family (GALR1, 2, and 3). GAL can activate GALR1 and 2 to a high degree with a much lower ability to activate GALR3. SPX can activate GALR2 and 3 to a high degree. The mature neuropeptides of SPX and GAL share several conserved residues, including Trp2, Thr3, Tyr9, Leu10, and Gly12 (Kim ) (Fig. 3). 
This indicates that these common residues may be required for activation of GALR2, while the SPX-specific residues are likely involved in retaining the agonist activity toward GALR3 and the GAL-specific residues may contribute to decreased affinity to GALR3. This observation leads to the postulation that the replacement of SPX-specific residues with those of GAL can produce a novel agonist acting only on GALR2 with no cross-reactivity with GALR3. Indeed, of the SPX-specific residues, Gln5, Met7, Lys11, and Ala13 were found to be critical for GALR3 activation. Replacement of these residues with GAL-specific residues (Gln5→Asn, Met7→Ala, Lys11→Phe, and Ala13→Pro) abolished the ability to activate GALR3 while retaining full activity to GALR2. This mutation study takes into account the evolutionary fates of duplicated neuropeptide ligand and receptor genes. The pre-2R local duplication followed by whole-genome duplication produced GALR1, GALR2 and GALR3. Likewise, pre-2R local duplication produced GAL and SPX progenitors and the following whole-genome duplication generated the GAL family (GAL and GALP) and SPX family (SPX1 and SPX2) (Fig. 2). During the divergence of the GAL/SPX and GALR1/2/3 system, GALR2 appears to have become an intermediate form as it responds to both SPX and GAL with high affinity, whereas GALR1 and GALR3 acquired significant preference to GAL and SPX, respectively (Reyes-Alcaraz ).
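The comparison and substitution steps described above can be sketched in Python. This is a minimal illustration, not the procedure used by the cited authors. The two peptide sequences are not given in this article and are assumptions: the 14-residue mature human SPX and the first 14 residues of human GAL, compared position by position without gaps. The function names are hypothetical, and the substitution list is taken directly from the text.

```python
# Assumed sequences (not stated in the text): mature human SPX and GAL(1-14).
SPX = "NWTPQAMLYLKGAQ"
GAL_N_TERMINUS = "GWTLNSAGYLLGPH"

THREE_TO_ONE = {"Gln": "Q", "Asn": "N", "Met": "M", "Ala": "A",
                "Lys": "K", "Phe": "F", "Pro": "P"}

# Substitutions reported in the text as (original, 1-based position, replacement).
GALR2_SELECTIVE_SUBSTITUTIONS = [("Gln", 5, "Asn"), ("Met", 7, "Ala"),
                                 ("Lys", 11, "Phe"), ("Ala", 13, "Pro")]


def shared_positions(a, b):
    """Return (1-based position, residue) where two ungapped peptides agree."""
    return [(i + 1, x) for i, (x, y) in enumerate(zip(a, b)) if x == y]


def substitute(peptide, substitutions):
    """Apply point substitutions, checking each original residue first."""
    residues = list(peptide)
    for old, position, new in substitutions:
        if residues[position - 1] != THREE_TO_ONE[old]:
            raise ValueError(f"expected {old} at position {position}")
        residues[position - 1] = THREE_TO_ONE[new]
    return "".join(residues)


print(shared_positions(SPX, GAL_N_TERMINUS))
print(substitute(SPX, GALR2_SELECTIVE_SUBSTITUTIONS))
```

Under these assumed sequences, the shared positions are exactly the five conserved residues named above, and the residue check in `substitute` guards against applying a substitution to the wrong position.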
Fig. 3.

Mature peptides of human GAL and SPX, followed by the hybrid peptide created via mutagenesis to specifically target GALR2 and, below that, the modified hybrid peptide with increased serum stability. Residues are colour coded to indicate those that are common to both GAL and SPX peptides (yellow), divergent residues that do not appear to alter receptor activity (grey), GAL-specific residues (pink), and SPX-specific residues (green). To provide serum stability, some residues were replaced with their D-amino acid isomers (orange) or N-terminal modifications were made (blue).

Based on this concept, Reyes-Alcaraz synthesised novel agonists that were capable of targeting and specifically activating GALR2. Furthermore, N-terminal modification, together with substitution of residues that were not found to alter GALR activity with their D-isomers, greatly increased the stability of the peptide in serum. The endogenous ligands, SPX and GAL, and synthetic ligands such as M1145 and M1153 have cross-reactivity with multiple GALRs to some extent. The ability to activate GALR2 specifically has interesting therapeutic possibilities, as each GALR subtype has been found to exhibit largely divergent physiological functions. For instance, a recent observation demonstrates that GALR1 and GALR2 mediate opposite anxiety-like effects in rats: GALR1 and GALR2 agonists exerted anxiogenic and anxiolytic-like effects, respectively (Morais ). GALR3 may also induce anxiogenic behaviour, as GALR3-specific antagonists decrease anxiety- and depression-like behaviour in rats (Swanson ). In addition, the actions of SPX and GAL on appetite behaviour appear to oppose each other as well: SPX is anorexigenic while GAL is orexigenic (Taylor ; Shiba ; Wong ). Thus, the design of an agonist that discriminates GALR2 from GALR1 or GALR3 is of particular importance from a therapeutic perspective. Conversely, Moon compared the amino acid sequences of two related secretin-like 7TMR-interacting neuropeptides: glucagon-like peptide-1 (GLP-1), which binds GLP1R, and glucose-dependent insulinotropic polypeptide (GIP), which binds GIPR. These neuropeptides and 7TMRs share high similarity in amino acid sequence, and both modulate insulin secretion from pancreatic β-cells, among other functions (Baggio and Drucker, 2007). However, despite their similarities, neither peptide is able to activate the other’s receptor. Moon compared the two amino acid sequences and generated a hybrid that replaced four residues in GIP with those of GLP-1. 
This allowed the mutant peptide to activate both receptors with moderate potency, so that two related but distinct messenger pathways can be activated with a single ligand. Subsequently, further research was conducted to increase the half-life and potency of GIP/GLP-1 hybrids (Moon ; Kim ). In addition, comparative techniques, in which related proteins are compared to analyse the function of individual amino acids, can be used to highlight amino acids in receptors that are highly conserved among vertebrate species. In 7TMRs, the intra- and extracellular loops as well as the N- and C-termini of the receptors tend to diverge at a relatively high rate. Therefore, amino acid sequences in these regions that are well conserved among vertebrate species likely have a function in receptor activity or stability. By comparing GLP-1 receptor sequences and introducing point mutations at evolutionarily conserved residues, Moon found that amino acid residue Arg380, flanked by Leu379 and Phe381 in extracellular loop 3, may interact with Asp9 and Gly4 of the GLP-1 neuropeptide. This information helps to clarify the mechanisms by which GLP-1 interacts with GLP1R, which are currently not well understood owing to a lack of crystal structure data for the ligand-bound receptor complex.
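The idea of flagging well-conserved positions within otherwise variable receptor regions can be illustrated with a simple per-column conservation score. The following is a minimal sketch under stated assumptions: the aligned strings are invented toy data, not real GLP1R loop sequences, the species labels and the function names `column_conservation` and `fully_conserved` are hypothetical, and the score (fraction of sequences sharing the most common residue) is only one of many possible conservation measures.

```python
from collections import Counter

# HYPOTHETICAL toy alignment: invented strings, not real receptor sequences.
TOY_ALIGNMENT = {
    "species_A": "ACLRFTG",
    "species_B": "SCLRFAG",
    "species_C": "TALRFSE",
    "species_D": "ACLRFQG",
}


def column_conservation(seqs):
    """Fraction of sequences sharing the most common residue at each column."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    n = len(seqs)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*seqs)]


def fully_conserved(seqs):
    """Return 1-based columns where every sequence has the same residue."""
    scores = column_conservation(seqs)
    return [i for i, score in enumerate(scores, start=1) if score == 1.0]


sequences = list(TOY_ALIGNMENT.values())
scores = column_conservation(sequences)
candidates = fully_conserved(sequences)

print(scores)
print(candidates)
```

Columns that remain invariant in a region that otherwise varies freely are natural candidates for point mutation experiments of the kind described above; in practice the alignment would come from a dedicated alignment tool rather than hand-written strings.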

CONCLUSIONS

Currently, a number of ‘orphan’ 7TMRs are predicted to bind peptidergic ligands, which may be expressed from currently undiscovered genes or from known genes whose functional relationships with orphan 7TMRs have not yet been identified. In these situations, the use of VAC/GAC maps and the identification of peptide-receptor systems in more basal species, which have not diverged as greatly, may help to identify the ligand genes in humans. However, it has been postulated that some 7TMRs remain orphans, despite intensive efforts to identify endogenous ligands, because they simply do not have ligands (Davenport ). Instead, some 7TMRs may influence signal transduction through dimerisation with, and modulation of, other 7TMRs (Levoye ). It is also possible that, as 7TMRs have constitutive activity irrespective of ligand binding, other factors such as pH, pressure, or temperature may be the primary mediators of some orphan 7TMR activity (Ahmad ). Given the wide variety of important biological processes mediated by 7TMR signalling, there is a high demand for novel drugs that can target individual receptors and regulate these signalling pathways. However, the high clinical standards that new drugs must meet before being authorised for sale, combined with the high cost of drug development, mean that there is strong interest in techniques that facilitate the design of drug candidates which can specifically target individual 7TMR-mediated pathways. The discussed techniques and examples demonstrate how comparison of genomes, gene families, and individual protein sequences, using phylogeny and synteny, can aid in gene discovery, gene function identification, and the design of hybrid ligands.