Zélia Ferreira, Belen Hurle, Aida M Andrés, Warren W Kretzschmar, James C Mullikin, Praveen F Cherukuri, Pedro Cruz, Mary Katherine Gonder, Anne C Stone, Sarah Tishkoff, Willie J Swanson, Eric D Green, Andrew G Clark, Susana Seixas.
Abstract
Recent efforts have attempted to describe the population structure of the common chimpanzee, focusing on four subspecies: Pan troglodytes verus, P. t. ellioti, P. t. troglodytes, and P. t. schweinfurthii. However, few studies have pursued the effects of natural selection in shaping their response to pathogens and reproduction. Whey acidic protein (WAP) four-disulfide core domain (WFDC) genes and neighboring semenogelin (SEMG) genes encode proteins with combined roles in immunity and fertility. They display a strikingly high rate of amino acid replacement (dN/dS), indicative of adaptive pressures during primate evolution. In human populations, three signals of selection at the WFDC locus were described, possibly influencing the proteolytic profile and antimicrobial activities of the male reproductive tract. To evaluate the patterns of genomic variation and selection at the WFDC locus in chimpanzees, we sequenced 17 WFDC genes and 47 autosomal pseudogenes in 68 chimpanzees (15 P. t. troglodytes, 22 P. t. verus, and 31 P. t. ellioti). We found a clear differentiation of P. t. verus and estimated the divergence of the P. t. troglodytes and P. t. ellioti subspecies at 0.173 Myr; further, at the WFDC locus we identified a signature of strong selective constraints common to the three subspecies in WFDC6, a recent paralog of the epididymal protease inhibitor EPPIN. Overall, chimpanzees and humans do not display similar footprints of selection across the WFDC locus, possibly due to different selective pressures between the two species related to immune response and reproductive biology.
Keywords: WFDC; chimpanzees; innate immunity; natural selection; reproduction; serine protease inhibitor
Year: 2013 PMID: 24356879 PMCID: PMC3879984 DOI: 10.1093/gbe/evt198
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Schematic representation of the 20q13 WFDC locus, showing the relative positions of the WFDC genes. As depicted, the WFDC locus spans 700 kb and its genes are organized into two subloci (centromeric and telomeric; WFDC-CEN and WFDC-TEL, respectively), separated by 215 kb of unrelated sequence.
Estimated Parameters Using the 47 Control Regions
| Parameter | This study | | | | |
|---|---|---|---|---|---|
| NPtt | 51,975 (19,737–69,162) | — | 134,900 (75,900–251,200) | 26,900 (16,100–43,900) | 23,100 (8,600–59,700) |
| NPte | 43,512 (19,925–56,175) | — | — | — | — |
| NPtv | 21,062 (16,500–27,637) | — | 9,800 (5,000–72,400) | 7,400 (5,400–10,000) | 10,100 (7,700–21,100) |
| NAPtt-Pte | 57,412 (18,762–126,287) | — | — | — | — |
| NAPtt-Pte-Ptv | 70,525 (58,850–91,900) | — | — | — | — |
| NAPan | 22,787 (1,025–47,175) | — | 89,100 (36,300–245,500) | 7,100 (3,500–12,500) | 32,900 (22,200–48,700) |
| TDIVHomoPan (Myr) | 6.58 (4.34–8.54) | — | — | — | — |
| TDIV Ptt-Pte-Ptv (Myr) | 0.31 (0.236–0.405) | 0.46 (0.37–0.53) | 0.55 (0.34–0.91) | 0.46 (0.35–0.65) | 0.44 (0.32–1.10) |
| TDIV Ptt-Pte (Myr) | 0.173 (0.034–0.237) | 0.11 (0.09–0.13) | — | — | — |
NPtt, P. t. troglodytes effective population size; NPte, P. t. ellioti effective population size; NPtv, P. t. verus effective population size; NAPtt-Pte, ancestral effective population size of P. t. troglodytes and P. t. ellioti; NAPtt-Pte-Ptv, ancestral effective population size of the three subspecies; NAPan, ancestral effective population size of common chimpanzee; TDIVHomoPan, human–chimpanzee divergence time in million years; TDIV Ptt-Pte-Ptv, P. t. verus divergence time from P. t. troglodytes and P. t. ellioti; TDIV Ptt-Pte, P. t. troglodytes and P. t. ellioti divergence time in million years.
ᵃ Confidence intervals are 95% highest posterior density intervals.
ᵇ Confidence intervals are 90% highest posterior density intervals.
Schematic representation of the inferred demographic history of the three subspecies: Pan troglodytes troglodytes (Ptt); P. t. verus (Ptv); and P. t. ellioti (Pte). NA, ancestral effective population size; N, effective population size; TDIV, divergence time.
Folded site frequency spectrum (SFS) for the subspecies that were resequenced. The x axis shows the allele frequency bins in the generated data set, and the y axis shows the number of alleles found within each frequency bin. Syn, synonymous changes; NSyn, nonsynonymous changes. (A) Folded SFS in WFDC locus; (B) folded SFS of WFDC locus highlighting coding mutations.
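As a minimal sketch of how a folded SFS like the one in this figure can be tabulated, the Python snippet below bins sites by minor allele count. The function name `folded_sfs` and the example counts are hypothetical and are not taken from the study; the input is assumed to be one allele count per site for a sample of `n_chromosomes` chromosomes.

```python
def folded_sfs(allele_counts, n_chromosomes):
    """Tabulate a folded site frequency spectrum.

    allele_counts: count of one (arbitrarily chosen) allele at each site.
    n_chromosomes: number of sampled chromosomes.
    Returns a list where index i-1 holds the number of segregating sites
    whose minor allele is observed i times (i = 1 .. n_chromosomes // 2).
    """
    sfs = [0] * (n_chromosomes // 2)
    for count in allele_counts:
        # Skip monomorphic sites: they do not belong to any frequency bin.
        if count <= 0 or count >= n_chromosomes:
            continue
        # Folding: without a known ancestral state, use the rarer allele.
        minor = min(count, n_chromosomes - count)
        sfs[minor - 1] += 1
    return sfs


# Hypothetical example: five sites scored in six chromosomes.
example_sfs = folded_sfs([1, 5, 2, 3, 6], 6)
print(example_sfs)  # [2, 1, 1]
```

Counts of 1 and 5 fall into the same bin because folding ignores which allele is ancestral, and the site with count 6 is dropped because it is not segregating in the sample.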
Summary Statistics for All the WFDC Genes
| Gene | Subspecies | L | S | π (10−4) | θw | D | D* | H | P(HKA) |
|---|---|---|---|---|---|---|---|---|---|
| | | 5,536 | 32 | 11.87 | 8.077 | −0.4202 | −0.3432 | −1.6644 | 0.3858 |
| | | 1,323 | 16 | 3.026 | 4.039 | −0.6318 | 0.4407 | −0.6437 | 0.1866 |
| | | 3,377 | 25 | 7.748 | 6.310 | −0.3775 | 0.1365 | 1.5448 | 0.9625 |
| | | 3,305 | 23 | 4.015 | 5.806 | −1.172 | −0.8307 | −1.2782 | 0.7463 |
| | | 4,324 | 31 | 8.471 | 7.825 | −0.8778 | 0.2350 | −2.4092 | 0.9241 |
| | | 4,709 | 30 | 16.84 | 7.573 | 0.5078 | 0.6260 | 2.5195 | 0.9095 |
| | | 3,984 | 24 | 5.850 | 6.058 | −0.7272 | 0.3391 | −3.6322 | 0.4648 |
| | | 3,727 | 35 | 10.72 | 8.835 | −0.8613 | −0.9377 | 14.278 | 0.0604 |
| | | 2,807 | 19 | 1.269 | 4.796 | −2.073 | −2.290 | −1.7839 | 0.1907 |
| | | 3,233 | 23 | 2.356 | 5.806 | −1.811 | −2.218 | 5.2828 | 0.1362 |
| | | 7,179 | 36 | 9.139 | 9.087 | −1.169 | −1.054 | 0.5977 | 0.5583 |
| | | 6,863 | 44 | 16.64 | 11.11 | −0.8419 | −0.8662 | 2.0736 | 0.2643 |
| | | 5,037 | 58 | 37.48 | 14.64 | −0.3621 | −0.1044 | 9.0184 | 0.4493 |
| | | 7,365 | 41 | 13.99 | 10.35 | −0.9045 | −0.8869 | −5.0023 | 0.882 |
| | | 3,527 | 14 | 2.296 | 3.534 | −0.7060 | −0.1432 | −1.9862 | 0.9805 |
| | | 7,572 | 51 | 20.48 | 12.87 | −0.9501 | −0.6436 | −0.5163 | 0.4458 |
| | | 5,536 | 24 | 9.751 | 5.110 | 0.8670 | 0.4099 | −1.9799 | 0.1653 |
| | | 1,323 | 7 | 1.277 | 1.491 | 0.8178 | 0.4173 | −0.0063 | 0.9754 |
| | | 3,377 | 14 | 3.947 | 2.981 | 0.9111 | −0.4956 | 1.1021 | 0.3997 |
| | | 3,305 | 28 | 3.491 | 5.962 | −1.250 | −0.5904 | −0.9815 | 0.321 |
| | | 4,324 | 22 | 2.378 | 4.685 | −1.193 | −0.4544 | −0.1861 | 0.2107 |
| | | 4,709 | 30 | 5.464 | 6.388 | −0.8478 | −1.875 | 4.6113 | 0.9302 |
| | | 3,984 | 25 | 7.905 | 5.323 | 0.2832 | −0.1960 | 5.2226 | 0.5012 |
| | | 3,727 | 14 | 3.866 | 2.981 | 0.8654 | 0.5262 | 0.7953 | 0.2557 |
| | | 2,807 | 13 | 1.364 | 2.768 | −0.7455 | −1.177 | −1.6838 | 0.3477 |
| | | 3,233 | 23 | 3.936 | 4.898 | −0.6386 | −0.0079 | 12.0571 | 0.054 |
| | | 7,179 | 29 | 5.759 | 6.175 | −0.6888 | −1.990 | 4.7842 | 0.3548 |
| | | 6,863 | 28 | 10.18 | 5.962 | 0.3822 | 0.9311 | 3.1793 | 0.0392 |
| | | 5,037 | 42 | 15.62 | 8.943 | −0.1915 | −0.1990 | 4.7848 | 0.2656 |
| | | 7,365 | 29 | 6.956 | 6.175 | −0.4046 | −0.2142 | 0.055 | 0.2968 |
| | | 3,527 | 14 | 0.8043 | 2.981 | −1.498 | −1.007 | 4.3977 | 0.8262 |
| | | 7,572 | 59 | 14.79 | 12.56 | −1.182 | −0.8554 | 15.3739 | 0.7554 |
| | | 5,536 | 14 | 1.743 | 3.218 | −0.8086 | 0.6041 | 9.148 | 0.4733 |
| | | 1,323 | 7 | 1.396 | 1.609 | 0.7787 | −1.027 | −0.8393 | 0.1127 |
| | | 3,377 | 7 | 1.979 | 1.609 | 1.6256 | 0.4908 | 0.8076 | 0.6147 |
| | | 3,305 | 20 | 2.323 | 4.598 | −1.2464 | −0.8163 | −1.4884 | 0.0296 |
| | | 4,324 | 9 | 0.883 | 2.069 | −0.7344 | −1.207 | −0.5666 | 0.2473 |
| | | 4,709 | 14 | 2.797 | 3.218 | −0.0454 | 1.071 | 0.8203 | 0.8173 |
| | | 3,984 | 3 | 0.212 | 0.690 | −0.4387 | −0.3775 | 0.7653 | 0.3611 |
| | | 3,727 | 9 | 1.865 | 2.069 | 0.5711 | 0.0750 | 3.3425 | 0.6688 |
| | | 2,807 | 9 | 0.629 | 2.069 | −1.1725 | −0.5657 | −1.0973 | 0.0616 |
| | | 3,233 | 7 | 1.523 | 1.609 | 0.9748 | −0.2682 | −0.4207 | 0.2532 |
| | | 7,179 | 7 | 0.815 | 1.609 | −0.2537 | 0.4908 | −1.1501 | 0.0954 |
| | | 6,863 | 22 | 3.333 | 5.057 | −1.002 | −0.2720 | 3.2896 | 0.4979 |
| | | 5,037 | 25 | 7.986 | 5.747 | 0.0283 | −0.3329 | 2.408 | 0.0605 |
| | | 7,365 | 14 | 3.356 | 3.218 | 0.3020 | 1.071 | 0.4249 | 0.4759 |
| | | 3,527 | 6 | 1.062 | 1.379 | 0.6795 | −0.4940 | −0.4989 | 0.856 |
| | | 7,572 | 32 | 17.27 | 7.356 | 0.6888 | 0.3704 | 12.222 | 0.8309 |
L, length sequenced (bp); S, number of segregating sites; π, nucleotide diversity per base pair (× 10−4); θw, Watterson’s estimator of θ (4Neμ) (Watterson 1975) per base pair (× 10−4); D, Tajima’s D statistic (Tajima 1989); D*, Fu and Li’s D* test (Fu and Li 1993); H, Fay and Wu H test (Fay et al. 2002; Zeng et al. 2006); P(HKA), HKA test P value (Hudson et al. 1987).
*P value ≤0.025 using three different demographic models (constant size, our best-fit model, and Hey 2010).
**P value <0.01 using three different demographic models (constant size, our best-fit model, and Hey 2010).
***P value ≤0.025 using our best-fitting model.
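To make two of the statistics defined above concrete, the following Python sketch implements the standard textbook formulas for Watterson's estimator and Tajima's D (Watterson 1975; Tajima 1989). It is an illustration only, not the software used in the study, and the function names are hypothetical. The usage line assumes that the first data row of the table (S = 32, θw = 8.077) comes from 15 diploid P. t. troglodytes, that is, n = 30 chromosomes; under that assumption the tabulated θw is reproduced by S/a1 without dividing by L.

```python
import math


def harmonic_sums(n):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1 .. n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    return a1, a2


def watterson_theta(num_segregating, n):
    """Watterson's estimator for the whole region: S / a1."""
    a1, _ = harmonic_sums(n)
    return num_segregating / a1


def tajimas_d(mean_pairwise_diff, num_segregating, n):
    """Tajima's D from the mean number of pairwise differences (whole
    region, not per base pair), the number of segregating sites S, and
    the number of sampled chromosomes n."""
    S = num_segregating
    if S == 0:
        return 0.0  # D is undefined without variation; 0.0 is a convention
    a1, a2 = harmonic_sums(n)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * S + e2 * S * (S - 1)
    return (mean_pairwise_diff - S / a1) / math.sqrt(variance)


# Assumption: n = 30 chromosomes for the first table row (S = 32).
theta_first_row = watterson_theta(32, 30)
print(round(theta_first_row, 3))  # 8.077
```

A negative D arises when the mean pairwise difference is smaller than S/a1, which corresponds to an excess of low-frequency variants; D is zero when the two estimates agree.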
Empirical comparisons generated from the 47 control regions. Tajima’s D (Tajima 1989) was calculated for each region using SLIDER and plotted with the 2.5 and 97.5 percentiles represented as dashed lines.
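The empirical comparison described in this caption can be sketched as follows: a statistic from a candidate region is compared against the 2.5 and 97.5 percentiles of the same statistic across control regions. This is a hedged illustration in plain Python, not the SLIDER workflow; the helper names, the linear-interpolation percentile rule, and the evenly spaced control values are all assumptions made for the example.

```python
def percentile(values, q):
    """Percentile q (0-100) with linear interpolation between ranks."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    frac = pos - lower
    return xs[lower] + (xs[upper] - xs[lower]) * frac


def is_empirical_outlier(value, control_values, lower_q=2.5, upper_q=97.5):
    """Return (flag, (low, high)): flag is True when value falls outside
    the [lower_q, upper_q] percentile band of the control distribution."""
    low = percentile(control_values, lower_q)
    high = percentile(control_values, upper_q)
    return (value < low or value > high), (low, high)


# Hypothetical control distribution: 41 evenly spaced values in [-1, 1].
controls = [-1.0 + 0.05 * i for i in range(41)]
flag, band = is_empirical_outlier(-2.073, controls)
print(flag)  # True
```

With real data, `controls` would hold one Tajima's D value per control region, and the band would correspond to the dashed lines in the figure.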
Inferred haplotype network at WFDC6. Each circle represents a unique haplotype, and its area is proportional to its frequency. Within each circle, Pan troglodytes verus, P. t. ellioti, and P. t. troglodytes are labeled in green, purple, and orange, respectively. The mutations that differentiate each haplotype are shown along each branch.
Amino acid alignment of WFDC6 and EPPIN. Cysteines are marked in light green; the PSA binding site is marked in pink; and disulfide bridges are marked with black lines. Black squares represent stop codons.