Andrew E Webb, Z Nevin Gerek, Claire C Morgan, Thomas A Walsh, Christine E Loscher, Scott V Edwards, Mary J O'Connell.
Abstract
It has been proposed that positive selection may be associated with protein functional change. For example, human and macaque have different outcomes to HIV infection and it has been shown that residues under positive selection in the macaque TRIM5α receptor locate to the region known to influence species-specific response to HIV. In general, however, the relationship between sequence and function has proven difficult to fully elucidate, and it is the role of large-scale studies to help bridge this gap in our understanding by revealing major patterns in the data that correlate genotype with function or phenotype. In this study, we investigate the level of species-specific positive selection in innate immune genes from human and mouse. In total, we analyzed 456 innate immune genes using codon-based models of evolution, comparing human, mouse, and 19 other vertebrate species to identify putative species-specific positive selection. Then we used population genomic data from the recently completed Neanderthal genome project, the 1000 human genomes project, and the 17 laboratory mouse genomes project to determine whether the residues that were putatively positively selected are fixed or variable in these populations. We find evidence of species-specific positive selection on both the human and the mouse branches and we show that the classes of genes under positive selection cluster by function and by interaction. Data from this study provide us with targets to test the relationship between positive selection and protein function and ultimately to test the relationship between positive selection and discordant phenotypes.
Keywords: adaptive evolution; innate immune evolution; predicting phenotypic response; protein functional shift; species-specific responses
Year: 2015 PMID: 25758009 PMCID: PMC4476151 DOI: 10.1093/molbev/msv051
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Sample of the Genes under Positive Selection in Human and Mouse and Their Parameter Estimates.
| Gene Name | Positively Selected Sites | Number of Sites by Posterior Probability (PP) Threshold |
|---|---|---|
| Genes under positive selection in human | ||
| CARD6 | 264, 346, 382, 750, 767, 805, 818, 903, 916, 937, 998, 1010, and 1031 | 13 > 0.50, 0 > 0.95, 0 > 0.99 |
| IRF9 | 119, 129, and 333 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Genes under positive selection in mouse | ||
| Adipoq | 25, 27, 29, and 82 | 4 > 0.50, 0 > 0.95, 0 > 0.99 |
| Atg9a | 634 and 662* | 2 > 0.50, 1 > 0.95, 0 > 0.99 |
| C1inh | 332*, 365, 468, and 479 | 4 > 0.50, 1 > 0.95, 0 > 0.99 |
| C1ra | 468, 520, 574, 631, 633, and 634* | 6 > 0.50, 1 > 0.95, 0 > 0.99 |
| C6 | 220, 233, 319, 353, 378, 408, 419, 430, 554, 655, 681, 703, 792, and 930 | 14 > 0.50, 0 > 0.95, 0 > 0.99 |
| C8b | 242*, 263, 278, 383*, and 488 | 5 > 0.50, 2 > 0.95, 0 > 0.99 |
| Card6 | 394, 501, and 702 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Cd200 | 129 and 177 | 2 > 0.50, 0 > 0.95, 0 > 0.99 |
| Cd63 | 31, 118, 143, 184*, 194, and 203 | 6 > 0.50, 1 > 0.95, 0 > 0.99 |
| Cfh | 209, 243, 474, 767, 1005, 1068, 1074, 1104, 1181, and 1227 | 10 > 0.50, 0 > 0.95, 0 > 0.99 |
| Ecsit | 10, 12, 75, 82, 176, 325, 330**, 348, and 371 | 9 > 0.50, 1 > 0.95, 1 > 0.99 |
| Eif2ak2 | 136, 155, 181, 182*, 344, and 345 | 6 > 0.50, 1 > 0.95, 0 > 0.99 |
| F12 | 45, 65, 166, 243**, and 454 | 5 > 0.50, 1 > 0.95, 1 > 0.99 |
| Grn | 18, 101, 198, 303, 375*, 382, 411, 549, and 597 | 9 > 0.50, 1 > 0.95, 0 > 0.99 |
| Ifit2 | 191, 402, and 420 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Il1rapl2 | 566, 628, and 666* | 3 > 0.50, 1 > 0.95, 0 > 0.99 |
| Il2rb | 4, 13, 31, 55, 174, 202, 347*, 402, 418, 491, 496, and 516 | 12 > 0.50, 1 > 0.95, 0 > 0.99 |
| Il4ra | 47, 67, 308, 330, and 626 | 5 > 0.50, 0 > 0.95, 0 > 0.99 |
| Irf5 | 232, 259, and 262 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Lbp | 24, 40, and 329 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Lgals3 | 22, 92, 94, and 260 | 4 > 0.50, 0 > 0.95, 0 > 0.99 |
| Lrrfip1 | 328, 449, 468, 480, and 571 | 5 > 0.50, 0 > 0.95, 0 > 0.99 |
| Ltb4r1 | 53, 101, and 175 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Nlrp14 | 77, 79*, 186, 212, 219, 254, 257, 263, 272, 281, 284, 291, 294, 315, 319, 333, 358, 393, 415, 424, 453, 465, 530, 549, 552, 553, 584, 613, 657, 679, 684, 685, 687, 696, 782, 810, 814, 829, 846, 848, 902, 908, 912, 931, 953, 956, 958, 978, 982, 984, and 986 | 51 > 0.50, 0 > 0.95, 0 > 0.99 |
| Nlrp6 | 22, 25, 72, 77, 80, 81, 85, 96, 101, 113, 114, 190, 192, 251, 260**, 329, 344, 479, 488, 515, 553, 571, 628, 657, 727, 737, 739, 744, 771, 775*, 776, 793, 807, 865, 877, and 880 | 36 > 0.50, 2 > 0.95, 1 > 0.99 |
| Oas2 | 55, 56, 139, 171, 199, 211, 221, 298, 481, 549, and 711 | 11 > 0.50, 0 > 0.95, 0 > 0.99 |
| Plcg2 | 461 and 594* | 2 > 0.50, 1 > 0.95, 0 > 0.99 |
| Ptpn2 | 166, 206, 319, 321, and 329 | 5 > 0.50, 0 > 0.95, 0 > 0.99 |
| Rnf31 | 203, 431, and 1025 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Sirt1 | 107, 537, 698, and 701 | 4 > 0.50, 0 > 0.95, 0 > 0.99 |
| Snap23 | 109, 133, and 197 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Stat2 | 21, 130, 149, 157, 195, 205, 218, 354, 623*, 869, 871, 874, 876, and 877 | 14 > 0.50, 1 > 0.95, 0 > 0.99 |
| Tcf4 | 139 | 1 > 0.50, 0 > 0.95, 0 > 0.99 |
| Tlr3 | 266, 297, and 603 | 3 > 0.50, 0 > 0.95, 0 > 0.99 |
| Trif | 18, 327, 338, 388, 482, 556, and 711 | 7 > 0.50, 0 > 0.95, 0 > 0.99 |
Note.—For each gene the positively selected sites are displayed and the associated posterior probability is indicated as follows: * >0.95, ** >0.99 and no asterisk implies 0.50 < PP < 0.95.
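The third column of the table can be re-derived from the site lists using the asterisk convention described in the note. The following is a minimal sketch of that bookkeeping, not the authors' pipeline; the function names `count_sites_by_pp` and `format_summary` are hypothetical, and the code assumes the thresholds are cumulative (a `**` site also counts toward the > 0.95 class), which matches rows such as Ecsit and Nlrp6.

```python
import re


def count_sites_by_pp(site_field: str) -> dict:
    """Count sites per posterior-probability (PP) class in one table cell.

    A site is an integer optionally followed by '*' (PP > 0.95)
    or '**' (PP > 0.99); no asterisk means 0.50 < PP < 0.95.
    """
    # Each match is a (position, stars) pair, e.g. ('330', '**').
    tokens = re.findall(r"(\d+)(\*{0,2})", site_field)
    return {
        ">0.50": len(tokens),
        # Assumption: classes are cumulative, so '**' also counts as > 0.95.
        ">0.95": sum(1 for _, stars in tokens if len(stars) >= 1),
        ">0.99": sum(1 for _, stars in tokens if len(stars) == 2),
    }


def format_summary(counts: dict) -> str:
    """Render the counts in the same style as the table's third column."""
    return (f"{counts['>0.50']} > 0.50, "
            f"{counts['>0.95']} > 0.95, "
            f"{counts['>0.99']} > 0.99")


if __name__ == "__main__":
    ecsit = "10, 12, 75, 82, 176, 325, 330**, 348, and 371"
    print(format_summary(count_sites_by_pp(ecsit)))
```

Running a check like this over every row is a quick way to spot cells where the site list and the summary counts disagree.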
Figure. Phylogeny of species included in this study and summary of lineage-specific positive selection results. The lineages that were tested for species-specific selective pressure variation are highlighted. The number and percentage of genes displaying evidence of species-specific positive selection are also shown.
Figure. Innate immune pathways incorporating multiple positively selected genes. The positively selected genes in (a) the complement system and (b) the TLR signaling pathway are illustrated as darkened rectangles. Signaling cascades are depicted as arrows and inhibitors are depicted as blunt-ended lines. Defined pathways and complexes are highlighted in gray boxes with the given name. (c) Positively selected sites of the complement system alongside information on domain structure. Information on the function of these domains is also given.
Figure. Dynamic flexibility index (dfi) of human TLR3 ectodomain. Ribbon diagrams of the crystal structure of the TLR3 ectodomain of (a) human (PDB id: 2A0Z) and (b) mouse (PDB id: 3CIG) colored with a spectrum of red–yellow–green–cyan–blue with respect to dfi. Red indicates the highest dfi values whereas blue indicates the lowest values. (c) The stability change for all possible substitutions was computed for the positively selected sites in mouse (E266, Y297, and E603), their human homologs (N265, W296, and P602), known human disease-associated sites (N284, F303, L412, and P554; Stenson et al. 2003), and randomly selected sites. All sites except the randomly selected ones are colored and marked on the respective ribbon diagrams.
Population Level Analysis of Positively Selected Sites in Human and Mouse Genes.
| Genes Tested | Number of Positively Selected Sites | Number of Coding SNPs Tested | Unfixed Sites |
|---|---|---|---|
| Genes under positive selection specifically in the human lineage | |||
| CARD6 | 13 | 38 | G264E |
| IRF9 | 3 | 9 | None |
| Genes under positive selection specifically in the mouse lineage | |||
| Adipoq | 4 | 9 | None |
| Atg9a | 1 | 3 | None |
| C1ra | 6 | 16 | None |
| C6 | 14 | 53 | R554L |
| C8b | 5 | 36 | M263I |
| Card6 | 3 | 4 | None |
| Cd63 | 6 | 6 | None |
| Cfh | 10 | 8 | None |
| Ecsit | 9 | 1 | S75L |
| Grn | 9 | 16 | None |
| Ifit2 | 3 | 14 | None |
| Il1rapl2 | 3 | 11 | None |
| Il4ra | 5 | 51 | F47S & D626G |
| Irf5 | 3 | 14 | None |
| Lbp | 2 | 33 | None |
| Lgals3 | 4 | 15 | None |
| Lrrfip1 | 5 | 49 | None |
| Ltb4r1 | 3 | 16 | None |
| Nlrp14 | 51 | 16 | A613S |
| Nlrp6 | 36 | 53 | None |
| Plcg2 | 1 | 51 | None |
| Rnf31 | 3 | 24 | None |
| Snap23 | 3 | 2 | None |
| Stat2 | 14 | 58 | L874M |
| Tcf4 | 1 | 20 | None |
| Tlr3 | 3 | 4 | None |
| Trif | 7 | 3 | None |
Note.—Assessment of the fixation of positively selected sites was only possible for the genes shown above. For each gene, the number of positively selected sites and the number of coding SNPs tested are given. If a coding SNP within a positively selected codon resulted in a nonsynonymous substitution, the resulting unfixed variant is shown as a mutation at its respective site (e.g., G264E).
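The rule in the note (report a coding SNP only if it falls in a positively selected codon and changes the amino acid) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function `classify_snp`, the toy coding sequence, and the use of the standard genetic code are all assumptions, and positions are taken to be 1-based offsets into the coding sequence.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(cds, cds_pos, alt_base, selected_codons):
    """Return a label such as 'G264E' for a nonsynonymous SNP in a selected codon.

    cds             -- coding sequence (reference allele), uppercase DNA
    cds_pos         -- 1-based nucleotide position of the SNP within the CDS
    alt_base        -- alternative allele at that position
    selected_codons -- set of 1-based codon numbers under positive selection
    Returns None if the SNP is outside the selected codons or is synonymous.
    """
    codon_number = (cds_pos - 1) // 3 + 1
    if codon_number not in selected_codons:
        return None
    start = (codon_number - 1) * 3
    ref_codon = cds[start:start + 3]
    offset = (cds_pos - 1) % 3
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return None  # synonymous change: the residue itself is unchanged
    return f"{ref_aa}{codon_number}{alt_aa}"


if __name__ == "__main__":
    toy_cds = "ATGGGATTT"  # hypothetical CDS encoding M-G-F
    print(classify_snp(toy_cds, 5, "A", {2}))  # GGA -> GAA in codon 2
    print(classify_snp(toy_cds, 6, "C", {2}))  # GGA -> GGC, synonymous
```

A site with no reported variant under this rule corresponds to "None" in the Unfixed Sites column.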
Figure. Neutrality tests for positively selected genes in the human lineage. Sliding window analysis of Tajima’s D of the positively selected genes identified in human. The analysis was conducted using a window size of 1 kb within 100 kb upstream and downstream of each gene. The 95% confidence interval is shown as a red highlighted region.
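For readers unfamiliar with the statistic, a sliding-window Tajima's D can be sketched as below. This is a minimal illustration using the standard Tajima (1989) formula on aligned haplotype strings; it is not the software used in the study. The function names, the step size, and the handling of each variable column as a single segregating site are assumptions, and missing data are not handled.

```python
from collections import Counter
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings; None if undefined."""
    n = len(haplotypes)
    if n < 4:
        return None  # the variance term is zero for n < 4
    n_pairs = n * (n - 1) / 2
    seg_sites = 0
    pi = 0.0  # mean number of pairwise differences
    for col in range(len(haplotypes[0])):
        counts = Counter(h[col] for h in haplotypes)
        if len(counts) > 1:
            seg_sites += 1
            same_pairs = sum(c * (c - 1) / 2 for c in counts.values())
            pi += (n_pairs - same_pairs) / n_pairs
    if seg_sites == 0:
        return None  # no variation in this window
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


def sliding_tajimas_d(haplotypes, window, step):
    """Return (window_start, D) pairs; the step size is an assumed parameter."""
    length = len(haplotypes[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [h[start:start + window] for h in haplotypes]
        results.append((start, tajimas_d(chunk)))
    return results


if __name__ == "__main__":
    toy = ["AAAA", "AAAA", "AAAA", "TAAA"]  # hypothetical haplotypes
    print(sliding_tajimas_d(toy, window=2, step=2))
```

With a 1 kb window, `window=1000` would correspond to the setting in the caption. Negative values indicate an excess of rare variants and positive values an excess of intermediate-frequency variants.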