Jason A Wilder, Elizabeth K Hewett, Meredith E Gansner.
Abstract
GYPC encodes two erythrocyte surface sialoglycoproteins in humans, glycophorin C and glycophorin D (GPC and GPD), via initiation of translation at two start codons on a single transcript. The malaria-causing parasite Plasmodium falciparum uses GPC as a means of invasion into the human red blood cell. Here, we examine the molecular evolution of GYPC among the Hominoidea (Greater and Lesser Apes) and also the pattern of polymorphism at the locus in a global human sample. We find an excess of nonsynonymous divergence among species that appears to be caused solely by accelerated evolution of GYPC in the human lineage. Moreover, we find that the ability of GYPC to encode both GPC and GPD is a uniquely human trait, caused by the evolution of the GPC start codon in the human lineage. The pattern of polymorphism among humans is consistent with a hitchhiking event at the locus, suggesting that positive natural selection affected GYPC in the relatively recent past. Because GPC is exploited by P. falciparum for invasion of the red blood cell, we hypothesize that selection for evasion of P. falciparum has caused accelerated evolution of GYPC in humans (relative to other primates) and that this positive selection has continued to act in the recent evolution of our species. These data suggest that malaria has played a powerful role in shaping molecules on the surface of the human red blood cell. In addition, our examination of GYPC reveals a novel mechanism of protein evolution: co-option of untranslated region (UTR) sequence following the formation of a new start codon. In the case of human GYPC, the ancestral protein (GPD) continues to be produced through leaky translation. Because leaky translation is a widespread phenomenon among genes and organisms, we suggest that co-option of UTR sequence may be an important source of protein innovation.
Year: 2009 PMID: 19679754 PMCID: PMC2775107 DOI: 10.1093/molbev/msp183
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Figure. Structure of human GYPC. The proteins GPC and GPD are encoded via initiation of translation at separate start codons in exons 1 and 2, respectively (coding exons are indicated by tall boxes, untranslated regions by short boxes, and introns by lines). The gene is drawn to scale, except that exons 1 and 2 are separated by an intron that is approximately 34 kb in length.
Figure. (A) Translation initiation regions of GPC and GPD. The GPC start codon (boxed) is present only in humans, whereas the GPD start codon is conserved across all Apes. Starred sites denote loci important for efficient recognition of the start codon by the ribosome; translation is typically most efficient when there is a purine at position −3 (which is true for GPC and GPD), and a G at position +1 (which is true only for GPD). (B) Western blotting of GYPC products in human and chimpanzee. Using an antibody recognizing an epitope shared by GPC and GPD, and exactly conserved between humans and chimpanzee, we detect both proteins in human erythrocyte ghosts (Hs) but only GPD in chimpanzee (Pt).
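The start-codon context rule in panel A can be sketched as a small check. This is an illustration only: the function name `kozak_context` and both example sequences are hypothetical (they are not the GYPC sequences), and the downstream position is assumed to be the base immediately 3' of the start codon.

```python
PURINES = {"A", "G"}


def kozak_context(seq: str, start: int) -> dict:
    """Report the two context features named in the caption.

    seq   -- DNA or mRNA sequence (hypothetical input)
    start -- 0-based index of the A in the ATG start codon
    """
    seq = seq.upper().replace("U", "T")
    if seq[start:start + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if start < 3 or start + 3 >= len(seq):
        raise ValueError("sequence too short to read the flanking bases")
    upstream = seq[start - 3]    # position -3 relative to the A of ATG
    downstream = seq[start + 3]  # assumed: base immediately after the codon
    return {
        "purine_at_minus3": upstream in PURINES,
        "g_after_codon": downstream == "G",
    }


# Hypothetical sequences, chosen only to exercise both outcomes.
strong = "GCCACCATGGCG"   # purine at -3 and G after the codon
weaker = "TTTGTTATGCAA"   # purine at -3 but no G after the codon

for name, s in (("strong", strong), ("weaker", weaker)):
    print(name, kozak_context(s, s.index("ATG")))
```

A start codon that satisfies only the first criterion, as the caption reports for GPC, is the kind of site at which some ribosomes may scan past and initiate downstream.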
Table 1. PAML-Implemented Site Models of Positive Selection
| Model | ln L | Parameter Estimates | Selected Codons |
| All species, full gene | | | |
| M1a | −679.00 | | |
| M2a | −673.02* | | 24 (0.82), 27 (0.99), 36 (0.88) |
| M7 | −679.12 | | |
| M8a | −678.99 | | |
| M8 | −673.06* | | 24 (0.85), 27 (0.99), 36 (0.91) |
| All species, exon 3 excluded | | | |
| M1a | −413.81 | | |
| M2a | −413.80 | | |
| M7 | −413.71 | | |
| M8a | −413.71 | | |
| M8 | −413.71 | | |
| Human excluded, full gene | | | |
| M1a | −619.31 | | |
| M2a | −618.26 | | |
| M7 | −619.32 | | |
| M8a | −619.31 | | |
| M8 | −618.26 | | |
*P < 0.05, comparing model including positive selection (M2a or M8) to nearly neutral model (M1a, M7, or M8a).
All models and comparisons among models are described in the text. Asterisks indicate comparisons where models that incorporate a class of sites affected by positive selection are better fits to the data than nearly neutral models (Models M2a and M8 in the “all species, full gene” comparison); in all other comparisons, models incorporating positive selection are not better fits to the data than nearly neutral models.
Parameter abbreviations are as follows: p0 = proportion of sites falling into the nearly neutral site class (followed by the estimate of ω0 for models M1a and M2a, or by the shape parameters of the beta distribution (p, q) for models M7, M8a, and M8); p1 = proportion of sites falling into the neutral site class (ω1 = 1); p2 = proportion of sites falling into the positively selected site class (followed by the estimate of ω2).
BEB estimates of sites falling into positively selected class are listed for those where the posterior probability (in parentheses) is greater than 0.8.
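The model comparisons marked with asterisks can be reproduced approximately from the log-likelihoods in the table with a likelihood ratio test. The sketch below is a minimal illustration, not the authors' procedure: the degrees of freedom (2 for M1a vs. M2a and M7 vs. M8, 1 for M8a vs. M8) are assumed from common PAML practice, and the plain chi-square reference distribution is likewise an assumption. The helper names are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (closed forms for df 1 and 2)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")


def lrt(lnl_null: float, lnl_alt: float, df: int):
    """Return the statistic 2 * (lnL_alt - lnL_null) and its chi-square P value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# (label, lnL of null model, lnL of selection model, assumed df)
comparisons = [
    ("full gene: M1a vs M2a", -679.00, -673.02, 2),
    ("full gene: M7 vs M8", -679.12, -673.06, 2),
    ("full gene: M8a vs M8", -678.99, -673.06, 1),
    ("exon 3 excluded: M1a vs M2a", -413.81, -413.80, 2),
    ("human excluded: M1a vs M2a", -619.31, -618.26, 2),
]

for label, null, alt, df in comparisons:
    stat, p = lrt(null, alt, df)
    print(f"{label}: 2*dlnL = {stat:.2f}, df = {df}, P = {p:.4f}")
```

Under these assumptions, only the comparisons on the full gene with all species fall below 0.05, which matches the pattern of asterisks in the table.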
PAML-Implemented Branch-Site Tests of Positive Selection
| Model | ln L | Parameter Estimates | Selected Codons |
| Anull | −679.00 | Foreground: | |
| | | Background: | |
| A | −673.08* | Foreground: | 27 (0.99) |
| | | Background: | |
NOTE.—Models are as discussed in text; parameters are as described in table 1. Foreground branch includes the branch leading to humans, background branches include all other branches of the Hominoidea phylogeny. On the foreground branch in Model A, two classes of selected sites (ω > 1) are possible (as described in Materials and Methods); however, only one class is populated in the present analysis.
*P < 0.05, comparing model A with Anull.
Figure. Putative selected codons within GYPC. The posterior probability of each codon having an ω > 1 is shown for those with a probability >0.5 (values taken from the BEB analysis performed under PAML model M2a for the full data set). Extracellular, transmembrane, and cytosolic regions of the encoded protein are shown. The dotted line shows the location of the exon 3 deletion, which segregates in human populations and is hypothesized to confer resistance to falciparum malaria; bases upstream of the dotted line fall within exon 2 and downstream bases fall within exon 4.
Summary Statistics Describing Polymorphism at Human GYPC
| Statistic | Value | P |
| | 9 | |
| | 0.141 | |
| | 0.096 | |
| TD (Tajima's D) | −0.792 | |
| FWH (Fay and Wu's H) | −3.788 | |
| FLD (Fu and Li's D) | −1.749 | |
NOTE.—P values indicate probability of observing a smaller value based on neutral coalescent simulations.
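The one-tailed P value described in the note can be sketched as the fraction of simulated values that fall below the observed statistic. The simulated values here are a toy list, not output from a coalescent simulator, and the function name `lower_tail_p` is hypothetical; the resulting number says nothing about the study's actual P values.

```python
def lower_tail_p(observed: float, simulated: list) -> float:
    """Fraction of simulated statistics strictly smaller than the observed one."""
    if not simulated:
        raise ValueError("need at least one simulated value")
    smaller = sum(1 for value in simulated if value < observed)
    return smaller / len(simulated)


# Toy null distribution (hypothetical); a real analysis would use many
# thousands of neutral coalescent replicates matched to the sample.
toy_null = [-2.0, -1.0, -0.5, 0.0, 0.3, 0.8, 1.2, 1.5, 2.0, 2.5]

observed_td = -0.792  # Tajima's D as listed in the table
print(lower_tail_p(observed_td, toy_null))
```

A strongly negative observed statistic sits in the lower tail of the simulated distribution, so few replicates fall below it and the P value is small.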