Hannes Svardal, Anna J Jasinska, Cristian Apetrei, Giovanni Coppola, Yu Huang, Christopher A Schmitt, Beatrice Jacquelin, Vasily Ramensky, Michaela Müller-Trutwin, Martin Antonio, George Weinstock, J Paul Grobler, Ken Dewar, Richard K Wilson, Trudy R Turner, Wesley C Warren, Nelson B Freimer, Magnus Nordborg.
Abstract
Vervet monkeys are among the most widely distributed nonhuman primates, show considerable phenotypic diversity, and have long been an important biomedical model for a variety of human diseases and in vaccine research. Using whole-genome sequencing data from 163 vervets sampled from across Africa and the Caribbean, we find high diversity within and between taxa and clear evidence that taxonomic divergence was reticulate rather than following a simple branching pattern. A scan for diversifying selection across taxa identifies strong and highly polygenic selection signals affecting viral processes. Furthermore, selection scores are elevated in genes whose human orthologs interact with HIV and in genes that show a response to experimental simian immunodeficiency virus (SIV) infection in vervet monkeys but not in rhesus macaques, suggesting that part of the signal reflects taxon-specific adaptation to SIV.
Year: 2017 PMID: 29083404 PMCID: PMC5709169 DOI: 10.1038/ng.3980
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Fig. 1. Sample information and genetic relatedness. (a) Taxon distribution, approximate sampling locations (triangles) and representative drawings (where available; hilgerti and pygerythrus are morphologically very similar and have often not been considered separately). Number of whole-genome sequenced monkeys given in parentheses. (b) Neighbor-joining tree based on pairwise differences, oriented to approximately fit geographic sampling locations. (c) Matrix plot of pairwise genetic differences per callable site (above diagonal) and fixation index (FST) between groups (below diagonal). Rows and columns are sorted according to a hierarchical clustering (UPGMA) tree of vervet pairwise genetic differences. (d) Clustering tree of SIVagm pol gene sequences sampled from wild vervets (Africa only). Vervet taxonomic relationships are given on the left for comparison. The tree was constructed with sequences from both vervets included in this study and with sequences from the HIV Sequence Databases. Colors correspond to group labels in (a). Monkey heads in (a) are redrawn from Haus et al. (2013)[16], adapted from Hill (1966)[17]. Distribution maps are adapted from IUCN (2017)[18].
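The two quantities behind panels (b) and (c), pairwise differences per callable site and a UPGMA tree built from them, can be illustrated with a minimal sketch. It is not the pipeline used in the study: it assumes haploid sequences given as lists, with `None` marking a non-callable site, and the function names `pairwise_diff_per_site` and `upgma` are hypothetical.

```python
from itertools import combinations

def pairwise_diff_per_site(a, b):
    """Fraction of differing sites among sites callable in both sequences.

    Assumption: sequences are equal-length lists; None marks a non-callable site.
    """
    callable_pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not callable_pairs:
        return float("nan")
    return sum(x != y for x, y in callable_pairs) / len(callable_pairs)

def upgma(names, dist):
    """Average-linkage (UPGMA) clustering.

    dist maps a pair of names, e.g. ("A", "B"), to a distance.
    Returns the tree topology as nested tuples (branch lengths omitted).
    """
    sizes = {name: 1 for name in names}          # cluster label -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        for c in sizes:
            # Size-weighted mean keeps every leaf pair equally weighted.
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))

# Toy data (invented for illustration only).
seqs = {"A": list("AAAA"), "B": list("AAAT"), "C": list("TTTT")}
dist = {(x, y): pairwise_diff_per_site(seqs[x], seqs[y])
        for x, y in combinations(seqs, 2)}
tree = upgma(list(seqs), dist)
```

With real data, diploid genotypes and callability masks would need more careful handling than this toy version provides.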
Fig. 2. Evidence for gene flow across taxa. (a) Admixture clustering of individuals. Each pie chart represents an individual and colors represent contributions from five assumed admixture clusters. The choice of five clusters is discussed in the legend of Supplementary Fig. 12. Full results are shown in Supplementary Fig. 13. Colored lines mark comparisons in panels (b) and (c). (b, c) MSMC plots of cross-coalescence rate, a measure of gene flow, across time (on a log scale). Shaded areas correspond to ± 3 block-jackknife standard deviations. (d) UPGMA tree of pairwise distance matrix summarized by country. Arrows point to evidence of cross-taxon gene flow. (e) D-statistic (ABBA-BABA test) for instances of gene flow shown in (d). For full results see Supplementary Fig. 14 and Supplementary Data 2.
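For readers unfamiliar with the D-statistic in panel (e), the following is a minimal sketch of the standard ABBA-BABA calculation from derived-allele frequencies, with a block-jackknife standard error. It is an illustration under stated assumptions, not the study's implementation: the outgroup is assumed fixed for the ancestral allele, blocks are equal-sized runs of consecutive sites, and the function names are hypothetical.

```python
from math import sqrt

def d_statistic(sites):
    """Patterson's D for the tree (((P1, P2), P3), outgroup).

    sites: iterable of (p1, p2, p3) derived-allele frequencies.
    Assumption: the outgroup carries only the ancestral allele.
    """
    abba_sum = baba_sum = 0.0
    for p1, p2, p3 in sites:
        abba_sum += (1 - p1) * p2 * p3   # derived allele shared by P2 and P3
        baba_sum += p1 * (1 - p2) * p3   # derived allele shared by P1 and P3
    denom = abba_sum + baba_sum
    if denom == 0:
        return float("nan")
    return (abba_sum - baba_sum) / denom

def block_jackknife_se(sites, n_blocks):
    """Delete-one block-jackknife standard error of D.

    Sites are split into n_blocks equal-sized consecutive blocks;
    trailing sites that do not fill a block are dropped.
    """
    size = len(sites) // n_blocks
    blocks = [sites[i * size:(i + 1) * size] for i in range(n_blocks)]
    leave_one_out = []
    for i in range(n_blocks):
        rest = [s for j, blk in enumerate(blocks) if j != i for s in blk]
        leave_one_out.append(d_statistic(rest))
    mean = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((d - mean) ** 2 for d in leave_one_out)
    return sqrt(variance)
```

A positive D indicates excess allele sharing between P2 and P3, a negative D between P1 and P3; dividing D by its jackknife standard error gives the Z-score usually reported for such tests.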
Fig. 3. Enrichment map network of Gene Ontology (GO) categories enriched for high average gene selection scores. Edges represent overlap in genes. Colors represent p-values on a log scale (red most highly significant, TopGO Kolmogorov-Smirnov weight01 p < 0.001). Node size represents number of genes in a category (capped at 474). Terms are grouped using Cytoscape clustermaker[34].
Fig. 4. Gene co-expression modules with differential expression pre- and post-SIV infection that are also significantly enriched for high selection scores. Genes that were differentially expressed in CD4+ blood cells in vervet or macaque in response to SIV infection were grouped into 36 co-expression modules using WGCNA (shown in Supplementary Fig. 35). (a) Expression pattern of the five co-expression modules that are significantly enriched for high selection scores (p < 0.01, FWER < 0.05). (b) Joint enrichment map network of GO enrichments of the genes in the top three panels of (a) (“acute”, circles) and the bottom two panels of (a) (“chronic”, diamonds). GO enrichment was tested using TopGO Fisher’s exact test with the weight01 algorithm. Edges represent overlap in genes. Node size represents number of genes in a category (capped at 474).
Fig. 5. Selection scores across the genome and candidate genes with strong selection signals. (a) Manhattan plot of selection scores across all chromosomes. (b–c) Selection scores along chromosomes 6 and 16, respectively. (d) Magnification of the region containing NFIX. (e) Magnification of a peak containing multiple candidates, among them CD68 (Cluster of Differentiation 68), a glycoprotein highly expressed on monocytes/macrophages, and FXR2 (Fragile X mental retardation, autosomal homolog 2), which interacts with the HIV-1 Tat gene. Slightly downstream of the shown region, we note the high-scoring gene KDM6B (lysine-specific demethylase 6B), which is upregulated by HIV-1 gp120 in human B-cells.