Literature DB >> 35757177

Oxytocin and vasotocin receptor variation and the evolution of human prosociality.

Constantina Theofanopoulou1, Alejandro Andirkó2,3, Cedric Boeckx2,3,4, Erich D Jarvis1,5.   

Abstract

Modern human lifestyle strongly depends on complex social traits like empathy, tolerance and cooperation. These diverse facets of social cognition have been associated with variation in the oxytocin receptor (OTR) and its sister genes, the vasotocin/vasopressin receptors (VTR1A/AVPR1A and AVPR1B/VTR1B). Here, we compared the available genomic sequences of these receptors between modern humans, archaic humans, and 12 non-human primate species, and identified sites that show heterozygous variation in modern humans and archaic humans distinct from variation in other primates, and for which we could find association studies with clinical implications. On these sites, we performed a range of analyses (variant clustering, pathogenicity prediction, regulation, linkage disequilibrium frequency), and reviewed the literature on selection data in different modern-human populations. We found five sites with modern human specific variation, where the modern human allele is the major allele in the global population (OTR: rs1042778, rs237885, rs6770632; VTR1A: rs10877969; VTR1B: rs33985287). Among them, variation in the OTR-rs6770632 site was predicted to be the most functional. Two alleles (OTR: rs59190448 and rs237888) present only in modern humans and archaic humans were putatively under positive selection in modern humans, with rs237888 predicted to be a highly functional site. Three sites showed convergent evolution between modern humans and bonobos (OTR: rs2228485 and rs237897; VTR1A: rs1042615), with OTR-rs2228485 ranking highly in terms of functionality and reported to be under balancing selection in modern humans (Schaschl, 2015) [1]. Our findings have implications for understanding hominid prosociality, as well as the similarities between modern human and bonobo social behavior.
© 2022 The Authors.

Entities:  

Keywords:  Denisovan; Neanderthal; Oxytocin; SNPs; Vasopressin; Vasotocin

Year:  2022        PMID: 35757177      PMCID: PMC9227999          DOI: 10.1016/j.cpnec.2022.100139

Source DB:  PubMed          Journal:  Compr Psychoneuroendocrinol        ISSN: 2666-4976


Introduction

Modern humans are characterized by prosociality, a broad term that encompasses intraspecies empathy, social tolerance, cooperation and altruism. While our closest living relatives, chimpanzees (Pan Troglodytes) and bonobos (Pan Paniscus), live in highly organized social groups, modern human social networks are larger and denser, powered by a complex social cognitive machinery [2]. Modern humans are also characterized by great intrasocial compassion and display a clear tendency to act in concert, to the extent that our species has been labeled as ‘ultra-social’ [3]. Less is known about the sociality of our extinct human relatives, the Neanderthals and the Denisovans. Paleogenomic studies show that both Neanderthal and Denisovan populations endured a population size bottleneck (i.e., sharp reductions in their population size) [4]. In addition to this, high inbreeding rates evident in both the Altai Neanderthal [5] and the Denisovan [6] genomes, as well as lower genetic diversity in Neanderthals compared to modern humans [7] suggest that their social groups were most likely quite smaller than ours. According to Rogers and colleagues [8], Neanderthals were deeply subdivided into small isolated populations with scarce contact between them, which may be associated with a social profile distinct from Homo sapiens. A wide range of studies have highlighted a role of oxytocin and vasopressin/vasotocin receptor genes in a broad range of behaviors, including social [[9], [10], [11]] and non-social behaviors [12,13]. Relevant to this study, some of the social behaviors these genes have been associated with are bond formation, parental care, social recognition and social vocalizations [[9], [10], [11]]. Alterations in these genes have been linked to disorders that include social deficits, such as Autism Spectrum Disorders (ASD) [10]. These findings have led some researchers, most prominently Hare [14], to ascribe to these genes a key role in the emergence of human ultra-social behavior. Considering both the evidence on modern human prosociality and the evidence on the involvement of the oxytocin and vasopressin/vasotocin receptors in social behaviors, we hypothesized that the evolution of these receptors might elucidate the genetic basis of the evolution of hominin prosociality. For this reason, we searched for sequence variation in the OTR-VTR1 family of genes in modern humans, archaic humans and non-human primates (specifically, OTR (a.k.a. OXTR), VTR1A (a.k.a. AVPR1A) and VTR1B (a.k.a. AVPR1B), following the nomenclature proposed in Ref. [15]). Since fixed or nearly fixed changes associated with sociality have not been found so far in these genes in comparisons between modern humans (MH) and archaic humans (AH) or non-human primates (NHP) (chimpanzees and bonobos in particular) [5,[16], [17], [18]], we sought to identify potential polymorphic heterozygous sites in MH that are not found in AH or/and NHP, and that have been associated with social features/deficits; polymorphic heterozygous sites are those where at least two alternative sequences are found in a population. We identified 5 sites with MH specific variation, where the modern human alleles are the major alleles in the global population. Among them, there were sites in the OTR regulatory regions that are active in the brain and associated with prosocial behavior in modern humans. We also identified 3 convergent sites between modern humans and bonobos, a primate species that shows convergence of prosocial behaviors with humans, separate from chimpanzees. Some of the sites have been reported to be under balancing or positive selection in MH. Our findings shed light on the evolutionary questions on modern human and hominid prosociality in general, as well as on similarities in the social behavior of MH and bonobos.

Results

Variation pattern analyses and associations with sociality

We performed alignments of the OTR, VTR1A and VTR1B genomic sequences (exon, introns, and surrounding regulatory regions 600 bp upstream and downstream) between seven high-coverage present-day MH genomes, 14 Neanderthal [5,[19], [20], [21], [22], [23], [24]] a Denisovan [6] genome (i.e. archaic humans; AH), and multiple genomes of 3 non-human primate (NHP) species (chimpanzee, bonobo, macaque; Supplementary Table 1). We also included several VTR1A-microsatellites that have been associated with social-related phenotypes [25,26]: RS3-(CT)4TT (CT)8 (GT)24; RS1-(GATA)14; GT25; and AVR-(GT)14 (GA)13(A)8. For the NHP, we additionally used available Single Nucleotide Polymorphism (SNP) data from bonobo, chimpanzee, and macaque populations (∼30–∼1000 individuals/species; Supplementary Table 2; Methods) to account for variation in non-human primate populations. We inferred the ancestral state of the sites using Ortheus multialignments (Supplementary Note 1). From the total number of SNPs found on the OTR (5409 SNPs), VTR1A (2340 SNPs) and VTR1B (3031 SNPs) in the NCBI Variation Viewer (National Center for Biotechnology Information) in MH (GRCh38. p12 reference genome), only 61 had associated literature findings, consisting of 42 SNPs (285 studies) for OTR, 8 SNPs (24 studies) for VTR1A, and 11 SNPs (14 studies) for VTR1B (Supplementary Tables 3, 4 and 5). We classified these associations based on whether they were strongly associated to sociality (e.g., ‘prosociality’, ‘social cognition’), possibly related to sociality (e.g., ‘depression’, ‘Attention Deficit Hyperactivity Disorder’), or not related to sociality (e.g., ‘diabetes’). We then calculated the percentage of the associations for each site, averaged all the percentages for each category relative to the total number of the SNPs and tested whether the differences in the associations between the 3 receptors were significant. We found that variant sites in the OTR had a higher functional association to sociality over VTR1A and VTR1B (Supplementary Note 2, Supplementary Tables 3–6). Specifically, of the 61 sites analyzed, 29 had associations with sociality (Supplementary Tables 7–9) and simultaneously showed a variation pattern found only in MH, in AH and MH, or in MH and a NHP species: 19 OTR-SNPs, 7 VTR1A-SNPs and 3 VTR1B-SNPs. Based on their variation patterns in MH, AH and NHP, we classified the 29 sites into five categories (Fig. 1): 1) Modern Human Unique (MHU) for sites where the heterozygous allelic variation is unique to MH (e.g., T/A in MH and A/A in all other species); 2) Modern Human Specific (MHS) for sites where MH have a specific variation pattern (e.g. T/A) different from other species (e.g, C/A); 3) Homo Unique (HU) for sites that are variant in both MH and AH in the same fashion (e.g., T/A), while the rest of the NHP are invariant (e.g., A/A) or have a different variant (e.g., C/A); 4) Homo Specific (HS) for sites that are variant in MH (e.g., T/A), and invariant in AH (e.g., T/T), different from all NHP species (e.g. A/A); and 5) MH-NHP for sites with convergent variation between MH (e.g., T/A) and another NHP (e.g., T/A in bonobo), where other species/populations are invariant for the ancestral allele (e.g., A/A). Of these 29 sites associated with sociality, most (13) were MHU in OTR, while the 3 in VTR1B were MHS. The others were in a mixture of gene-species combinations, with 2 HU sites in OTR and 1 in VTR1A; 4 HS sites in OTR and 1 VTR1A; and 3 convergent sites between MH and bonobos (2 in OTR and 1 in VTR1A) (Fig. 1, Table 1).
Fig. 1

Variant clustering. a, For each variant site in OTR, VTR1A and VTR1B, we list the variation pattern found in MH (Modern Humans), AH (Archaic humans: Neanderthals and Denisovans), Chimpanzees, Bonobos and Macaques. In the column ‘Type’ we classify each allele on its assigned Type-category. Alleles that were key for each classification are in red font. b, A schematic representation of the four classification types. Shown in the columns are examples of each type. More details described for each type are in the main text results. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Table 1

List of identified variants in Modern humans and pathogenicity prediction scores. For each variant, we have listed where it is found in the genetic sequence (intronic, 3′-UTR) and if in the exon, whether it gives rise to a missense or a synonymous mutation (‘Consequence’); if exonic, and what ‘Amino acids’ it gives rise to; all MH (Modern human) alleles found on this site, with the ancestral being in bold; the ‘Type’ of each site based on our 5 categories (MHU: Modern Human Unique; MHS: Modern Human Specific; HU: Homo Unique; HS: Homo Specific; MH-NHP: Modern Human- Non-Human Primate); the Major Allele Frequency (MAF); the Highest Minor Allele Frequency (MAF) in a specific population from the 1000 Genomes or the gnomAD project (AFR: African, AMR: American, EAS: East Asian, SAS: South Asian, AMI: Amish); CADD (Combined Annotation Dependent Depletion) and GERP (Genomic Evolutionary Rate Profiling), Mutation Assessor, REVEL (Rare Exome Variant Ensemble Learner), MetaLR (Meta Logistic Regression), SIFT (Sorting Intolerant From Tolerant) and PolyPhen-2 scores. Scores in red: deleterious; scores in yellow: neutral; scores in green: benign.

Variant clustering. a, For each variant site in OTR, VTR1A and VTR1B, we list the variation pattern found in MH (Modern Humans), AH (Archaic humans: Neanderthals and Denisovans), Chimpanzees, Bonobos and Macaques. In the column ‘Type’ we classify each allele on its assigned Type-category. Alleles that were key for each classification are in red font. b, A schematic representation of the four classification types. Shown in the columns are examples of each type. More details described for each type are in the main text results. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.) List of identified variants in Modern humans and pathogenicity prediction scores. For each variant, we have listed where it is found in the genetic sequence (intronic, 3′-UTR) and if in the exon, whether it gives rise to a missense or a synonymous mutation (‘Consequence’); if exonic, and what ‘Amino acids’ it gives rise to; all MH (Modern human) alleles found on this site, with the ancestral being in bold; the ‘Type’ of each site based on our 5 categories (MHU: Modern Human Unique; MHS: Modern Human Specific; HU: Homo Unique; HS: Homo Specific; MH-NHP: Modern Human- Non-Human Primate); the Major Allele Frequency (MAF); the Highest Minor Allele Frequency (MAF) in a specific population from the 1000 Genomes or the gnomAD project (AFR: African, AMR: American, EAS: East Asian, SAS: South Asian, AMI: Amish); CADD (Combined Annotation Dependent Depletion) and GERP (Genomic Evolutionary Rate Profiling), Mutation Assessor, REVEL (Rare Exome Variant Ensemble Learner), MetaLR (Meta Logistic Regression), SIFT (Sorting Intolerant From Tolerant) and PolyPhen-2 scores. Scores in red: deleterious; scores in yellow: neutral; scores in green: benign. Ancestral allele and functional association analyses revealed that in almost all cases, the ancestral and the MHU, MHS, HU or HS alleles showed mixed associations with sociality, i.e., the same allele was associated with both ‘prosocial’ and ‘antisocial’ features in the literature. In other words, we did not find a general trend where the ancestral allele was associated with ‘antisocial’ features, and the new allele with ‘prosocial’ ones, or the other way around (Supplementary Tables 7–9). For example, the MHU OTR-rs1042778(G), which is very frequent in MH populations, has been associated with ASD [27], but also with altruistic and comforting behaviors [28] (Supplementary Table 7). A similar trend of mixed social associations for each variant, or absence of evidence for one of the allelic variants, was found for the rest of the sites (Supplementary Tables 7–9).

Pathogenicity prediction analyses

Most of the sites we identified were located in regulatory regions of the genes, but some in exons giving rise to missense substitutions (OTR-rs4686302 [HS], VTR1B-rs28632197 [MHS] and VTR1A-rs1042615 [MH-NHP]) or synonymous substitutions (VTR1A-rs2228485 [MH-NHP] and VTR1B-rs28676508 [MHS]) (Table 1). We performed pathogenicity prediction analyses on the 29 sites we identified using genome wide scores of conservation that predict deleteriousness of the substitutions (CADD [29] [Combined Annotation Dependent Depletion] and GERP [30] [Genomic Evolutionary Rate Profiling]), and algorithms that evaluate the impact of missense variants (Mutation Assessor [31], REVEL [32] [Rare Exome Variant Ensemble Learner], MetaLR [33] [Meta Logistic Regression], SIFT [34] [Sorting Tolerant From Intolerant] and PolyPhen-2 [35]) (Table 1). CADD scores that predict variant deleteriousness showed that MH-NHP OTR-rs2228485(A) and VTR1A-rs1042615 (G, C, T) alleles are likely deleterious (i.e., scores >20 predict deleteriousness), while HS OTR-rs4686302(T) also scored very close to this value (19; Table 1). Scores for most other sites ranged from ∼0 to 6, meaning their predicted effect is likely benign. GERP tests showed positive evolutionary rate scores (Methods) only for MHU OTR-rs2254298 and HS OTR-rs13316193 (Table 1), which represent highly conserved sites that are likely to be functional. Combined REVEL, MetaLR, SIFT and PolyPhen-2 scores classified HS OTR-rs4686302 as more likely benign, with only the Mutation Assessor classifying it as likely deleterious (Table 1). MH-NHP VTR1A-rs1042615 was ranked by most of the aforementioned analyses as deleterious or likely deleterious, with only MetaLR ranking it as benign. SIFT and PolyPhen-2 tests categorized MHS VTR1B-rs28632197 as benign. The finding that in both MH-NHP sites we identified deleterious signals for the ancestral alleles, namely those present in MH and all NHP, including bonobos, and not for the MH-NHP convergent alleles, perhaps suggests selection differences in social behavior in MH and bonobos vs. other NHP.

Regulation analyses

We performed regulation analyses using RegulomeDB [36] and the Ensembl regulation data resources [37]. RegulomeDB makes use of large datasets, including ChIP-seq, FAIRE-seq and DNase I hypersensitive information for a variety of important regulatory factors across a diverse set of cell types; this includes chromatin state information for over 100 cell types, transcription factor binding sites (including Position-Weight Matrix and DNase Footprinting), and expression quantitative trait loci (eQTL) information allowing the association of distal sites with gene promoters. The Ensembl resources provide a functional annotation of the regulatory elements in the human genome, including data on epigenetic marks, transcription factor binding and DNA methylation, as well as microarray probe mappings. Our RegulomeDB analyses ranked highly the MHU OTR-rs237889 variant site as lying in a functional genomic location (rank 1f; Table 2). Consistent with these findings, this same site is found in a genomic location with peak Chip-seq and DNase-seq signal for open chromatin in the brain and other cell types (Fig. 2). Open chromatin regions are regions that can be bound by protein factors and play various roles in gene transcription, DNA replication, and nuclear organization. Accordingly, this region (OTR-rs237889) contains binding sites for the EGR2 (Early Growth Response 2) and ZFHX2 (Zinc Finger Homeobox 2) transcription factors and is located in an eQTL that affects the expression of CAMK1 (Calcium/Calmodulin Dependent Protein Kinase I) (Table 2). The second most highly ranked site was the MH-NHP OTR-rs2228485 allele (rank 2b). ChIP-seq, DNase-seq and FAIRE-seq data indicate that this site is in a genomic region active in the brain, and in particular the dopaminergic substantia nigra neurons and one of its projection targets, the caudate nucleus of the striatum, as well as the hippocampus, other parts of the temporal lobe, the cingulate and angular gyri, and the middle frontal area 46 (Table 2). This site is implicated in binding with CTCF (CCCTC-Binding Factor), EZH2 (Enhancer Of Zeste 2 Polycomb Repressive Complex 2 Subunit), POLR2A (RNA Polymerase II Subunit A), RAD21 (RAD21 Cohesin Complex Component) and IKZF1 (IKAROS Family Zinc Finger 1) transcription factors, but with the SNP potentially altering the binding for PAX6 (Paired Box 6) and ZSCAN4C (Zinc Finger And SCAN Domain Containing 4) (Table 2).
Table 2

RegulomeDB high-ranked results and Ensembl regulation results. For each site, we list the gene they are found in, ranking scores from RegulomeDB, if they are found at open chromatin regions based on Chip-seq, DNase-seq or FAIRE-seq (✓) data from Ensembl regulation database. This includes sites found on promoter or open chromatin regions, expression quantitative trait loci (eQTL) regions, proteins bound (* for those mentioned only in Ensembl, *+ for those supported from both RegulomeDB and Ensembl data. All the rest come from RegulomeDB), PWM (positional weight matrix), whether they are active in the brain (e.g. in chromatin with evidence for strong transcription in the brain) and in which brain regions (SN: Substantia nigra; Cd: Caudate nucleus; CG: Cingular gyrus; Hi: Hippocampus; MFA 46: Middle Frontal Area 46; ANG: Angular Gyrus; TL: temporal lobe).

Fig. 2

Ensembl regulation and RegulomeDB results. a, Ensembl regulation data resources for OTR (chromosome 3), showing the CTCF (transcription factor) binding domain (green), the open chromatin regions (grey), the promoter regions (red) and the enhancer regions (yellow). We highlighted SNP sites analyzed in this study. b, RegulomeDB browser (chromosome 3: 8800–8802 Mb, OTR intron) showing open chromatin regions (blue peaks) for 3 sites identified in this study. Shown are DNase sequencing data from undifferentiated embryonic stem cells (DNase-seq-H7-hESC), from osteoblasts, and from astrocytes from the hippocampus. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

RegulomeDB high-ranked results and Ensembl regulation results. For each site, we list the gene they are found in, ranking scores from RegulomeDB, if they are found at open chromatin regions based on Chip-seq, DNase-seq or FAIRE-seq (✓) data from Ensembl regulation database. This includes sites found on promoter or open chromatin regions, expression quantitative trait loci (eQTL) regions, proteins bound (* for those mentioned only in Ensembl, *+ for those supported from both RegulomeDB and Ensembl data. All the rest come from RegulomeDB), PWM (positional weight matrix), whether they are active in the brain (e.g. in chromatin with evidence for strong transcription in the brain) and in which brain regions (SN: Substantia nigra; Cd: Caudate nucleus; CG: Cingular gyrus; Hi: Hippocampus; MFA 46: Middle Frontal Area 46; ANG: Angular Gyrus; TL: temporal lobe). Ensembl regulation and RegulomeDB results. a, Ensembl regulation data resources for OTR (chromosome 3), showing the CTCF (transcription factor) binding domain (green), the open chromatin regions (grey), the promoter regions (red) and the enhancer regions (yellow). We highlighted SNP sites analyzed in this study. b, RegulomeDB browser (chromosome 3: 8800–8802 Mb, OTR intron) showing open chromatin regions (blue peaks) for 3 sites identified in this study. Shown are DNase sequencing data from undifferentiated embryonic stem cells (DNase-seq-H7-hESC), from osteoblasts, and from astrocytes from the hippocampus. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.) MH-NHP OTR-rs2228485, MHU OTR-rs2268493, HU OTR-rs60902022, HU OTR-rs237888 and MHS VTR1B-rs28632197 were all less likely to affect regulation (rank 3a), but still had open chromatin regions that either fell within the binding site of different proteins, or alter gene motifs, or both, and showed activity in brain regions like the cingulate gyrus and the hippocampus (Table 2; Fig. 2a and b). The remaining sites (OTR: rs2268490, rs4686302, rs237917, rs6770632, rs59190448 and rs9872310; VTR1A: rs10877969 and rs1042615; VTR1B: rs33985287) did not rank highly in changing regulation, according to the RegulomeDB scores (ranks 4–6), but were also found in open chromatin or promoter regions (according to Ensembl or RegulomeDB data), or in a transcription factor binding domain (Fig. 2a); all showed activity in the brain. A total of 8 of these 15 sites were located within CTCF and the POLR2A transcription factor binding motifs (Table 2; Supplementary Table 10). These findings indicate that variant sites associated with sociality are located within transcription factor binding motifs, which could cause variant brain gene expression among the MH population.

Linkage disequilibrium analysis

We ran Linkage Disequilibrium analysis (using LDmatrix [38]) to test if any of the sites we identified are co-inherited, using data from all populations included in the 1000 Genomes Project, using a cut off of R2 > 0.8. This analysis revealed four haplotype blocks: one containing co-inheritances among rs60902022, rs13316193 and rs11131149 in OTR; a second between rs10784339 and rs10877969 in VTR1A; a third between rs11174811 and rs3021529 also in VTR1A; and a fourth between rs33985287 and rs28676508 in VTR1B (Fig. 3, Supplementary Table 11). This suggests that the combinations of these variant alleles might act in concert to shape functional differences that end up affecting phenotypes controlled by these genes.
Fig. 3

Linkage Disequilibrium analysis. a, OTR heatmap matrix of pairwise linkage disequilibrium statistics. Pairwise statistics are colored based on their R2 and D' significance scores, with the darkest shades of red being the most significant R2 score, and white the least, and the darkest shades of blue being the highest D' scores, and white the least. ✓, pairwise R2 > 0.8, our threshold for a significant co-inheritance of the variant alleles. The same coloring and symbols were used in the rest of the heatmaps. b, VTR1A heatmap matrix of pairwise linkage disequilibrium statistics. c, VTR1B heatmap matrix of pairwise linkage disequilibrium statistics. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Linkage Disequilibrium analysis. a, OTR heatmap matrix of pairwise linkage disequilibrium statistics. Pairwise statistics are colored based on their R2 and D' significance scores, with the darkest shades of red being the most significant R2 score, and white the least, and the darkest shades of blue being the highest D' scores, and white the least. ✓, pairwise R2 > 0.8, our threshold for a significant co-inheritance of the variant alleles. The same coloring and symbols were used in the rest of the heatmaps. b, VTR1A heatmap matrix of pairwise linkage disequilibrium statistics. c, VTR1B heatmap matrix of pairwise linkage disequilibrium statistics. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Population data and putative selection signatures

We further analyzed the human allele frequency data for these sites (1000 Genomes Project [39] Phase 3 and gnomAD [40] v2.1.1 and v3) and reviewed the literature for selection on these sites (Table 1, Supplementary Tables 12–13). We used the supplementary information in Schaschl et al., 2015 [1], and filtered the results of their FDIST, Bayescan, and extended Lewontin and Krakauer (FLK) tests, reporting here only those on our identified SNPs with significant values (p or q values < 0.05) (Supplementary Table 13). We also analyzed data from Kovalaskas and colleagues [41] on 13 bonobos, where strong positive selection signals in OTR and VTR1A were identified. We recognize that selection signals are hard to ascertain, and no strong claims should be laid on them. We report them as an additional source of available information concerning the alleles under discussion. We found four MHU and one MHS case where the new allele is currently the major allele in MH populations: (MHU: OTR-rs6770632(C), rs1042778(G), rs237885(T); MHS: VTR1A-rs10877969(T); VTR1B-rs33985287(T); Table 1). Several of these sites have a high derived allele frequency of ≥70% globally (OTR-rs6770632(C), VTR1A-rs10877969(T), VTR1B-rs33985287(T)), while OTR-rs1042778(G) and OTR-rs237885(T) showed lower frequencies (59% and 51%, respectively). Similarly, we found one HU and one HS case, where the new hominin (AH or MH) allele is the major allele in the global MH population (HU: OTR-rs59190448(G); HS: OTR-rs237888(T); Table 1). Both sites show derived allele frequencies of >85%, and they have been associated with signals of positive selection. Specifically, based on our filtering of the results in Ref. [1], all three approaches (FDIST, BayeScan and FLK) identified the OTR-rs59190448(G) to be strongly under positive directional selection (FDIST p = 0.0004, BayeScan q = 0.0004, FLK p = 0.001), while only the FLK test (p = 0.011) detected the OTR-rs237888 to be under positive selection (Supplementary Table 13). Interestingly, both the OTR-rs237888(T) and the OTR-rs59190448(G) are present in the majority of the AH genomes (6/8 for OTR-rs237888(T) and 5/7 for OTR-rs59190448(G); Supplementary Table 1); however, we acknowledge that the genomes of a small number of AH individuals currently available might not be representative of the general archaic population. These results, though tentative, show a similar frequency trend in AH and MH. Several sites where we encountered a new allele either in AH or in MH are currently the minor allele in MH populations, with some of them being present at low percentages in the global population (<25%) (OTR-rs9872310, rs2268493, rs2254298, rs4686302, rs237911; VTR1A-rs11174811, rs3803107, rs3021529, rs3759292; and VTR1B-rs28676508, rs28632197). Our filtering of the results in Ref. [1] showed that some of them are under balancing selection: OTR-rs237911 (FDIST, p = 0.0485), VTR1A-rs11174811 (FDIST, p = 0; BayeScan, q = 0.0336) and VTR1A-rs3021529 (FDIST, p = 0.0001; BayeScan q = 0.0266); only the VTR1A-rs3759292(A) (not the MHU G allele) was found to be under positive selection (F.LK, p = 0.001; Supplementary Table 13). Interestingly, VTR1A-rs3759292 is also in a region that is found deleted in 0.3% of the macaques sampled on that site (739; Supplementary Table 2), and according to Donaldson and colleagues [26], who found sites with both variation and deletion events in the MH and NHP VTR1A microsatellite, such regions might have influenced sociobehavioral traits during primate evolution. Among the three sites where MH and bonobos showed convergent variation (OTR-rs2228485 and rs237897; VTR1A-rs1042615), the OTR-rs2228485(G) was present in 13/15 (87%) bonobos we studied, and the alternate allele (A) was present in 2/15 (13%) bonobos. Interestingly, the 13 G-carriers were all from the same bonobo population [42], while the remaining 2 A-carriers were bonobos whose genome was sequenced for reference genomes. Although in all these sites, the convergent allele was not the major allele in the global MH population, in some MH populations it is present at ∼50% (Table 1, Supplementary Table 12). In particular, the convergent OTR-rs2228485 G-allele that was found at 87% among the bonobo population, is present at 28% in the global MH population, while at 46% in the American population. OTR-rs237897(A) was found in 40% of the MH global population (50% in Europeans) and 13% in bonobos. VTR1A-rs1042615(A) reached 35% in the global MH population (49% in Europeans), and 87% in bonobos. Of these sites, only the OTR-rs2228485 was found to show signs of balancing selection (FDIST, p = 0.0149) in MH (Supplementary Table 13). None of these convergent sites were among the specific sites of strong positive selection identified independently in bonobos [41], but several of these sites were located in close vicinity to our identified sites in the OTR and VTR1A (e.g., one site was found at a ∼14 kb distance from VTR1A-rs1042615). These findings suggest a correlation of several sites associated with sociality also showing convergent changes in MH and bonobos, not found in chimpanzees and other NHP examined.

Discussion

As Ambrose [43] stated more than a decade ago, “Neanderthals may have been deficient in the neuroendocrine receptor variants that responded to the hormones that have been shown to correlate with trust, reciprocity, and cooperation”. When they made that statement, they pointed out that “comparison of Neanderthal and modern human genes for these neuroendocrine systems could help test this hypothesis.” In some sense, we followed this suggestion, taking advantage of the increasing number of (ancient) genomes available. We found multiple cases of allelic heterozygosity variation in the OTR, VTR1A and VTR1B genes that are specific to MH or to both MH and AH, where one allele is unique to them among primates and the other shared with other primates. Variants in all three genes have been associated with sociality, although more so for OTR than VTR1A and VTR1B. We are aware that this could also be influence by a publication bias favoring sociality studies with OTR. Taken as a whole, our results paint a mosaic picture concerning the evolution of our prosocial nature. In line with other works [44,45], we do not contend that a specific allele made all the difference for any increase in prosociality. Rather, we believe it is the combination of heterozygous alleles among the population that can have different impacts on genes and traits, such as the protective affect for sickle cell anemia [46]. In the same way that SNPs act as general risk factors, and not as specific modifier alleles in certain behavioral or biological contexts (i.e. disorder, gender) [47] (Supplementary Note 2), we suggest that these variants could have a synergistic heterozygous impact on the evolution of social behavior in hominids. In other words, we found that certain alleles that were associated with more social deficits in one disorder (e.g., ASD) had either no effect, or conferred a benefit with respect to social abilities in another disorder (e.g., ADHD). Additionally, the impact of the alleles of a certain variant site is different depending on environmental factors, because environmental processes may be differentially accentuated through hormone signaling. We thus focus our interpretations on the functional and frequency analyses. We consider that of the MHU and MHS SNPs, only those five where the new allele is the major allele in MH populations could be responsible for beneficial changes in MH relevant to sociality (OTR-rs6770632, rs1042778, rs237885; VTR1A-rs10877969; VTR1B-rs33985287), especially those found at ≥70% (OTR-rs6770632, VTR1A-rs10877969, VTR1B-rs33985287). OTR-rs6770632 is the only site among these three that falls in open chromatin regions active in the brain, and particularly in the cingulate gyrus, a brain region involved in social cognition [48]. Moreover, this site is in a genomic region containing the binding domain of EZH2, a gene that is involved in the etiology of memory impairments [49], autism [50] and Weaver syndrome, an overgrowth/intellectual disability syndrome [51]. These MHS and MHU sites bear on several working hypotheses that account for MH prosociality. Among them, the ‘self-domestication hypothesis’ [14,[52], [53], [54], [55]] posits that certain physiological and behavioral traits that MH share with domesticated animals support a significant turning point exclusive of MH on the prosocial continuum. Interestingly, Hare [14] and Theofanopoulou [53] have hypothesized that human self-domestication was likely facilitated by changes in the oxytocin/vasotocin system, based in part on studies showing that oxytocin and vasotocin ligands or receptors are under relaxed selective constraint in domesticated species [56], and their gene expression and methylation patterns differ in wild versus domesticated species [[57], [58], [59]]. We also found that some of the variants present only in MH among primates (VTR1A-rs11174811 and VTR1A-rs3021529) show signs of relaxed selection. Moreover [60], it is likely that just as argued for other species' domestication [60], oxytocin did not play a single role in human self-domestication. Different stages in the self-domestication process likely drew on different properties of the oxytocin system. This in part explains why not all our results align on a single ‘prosocial’ dimension. These sites can also shed light on the genetics underlying possible sociality differences identified between MH and AH (Fig. 4). For example, they might be relevant to the smaller social groups attributed to Neanderthals and Denisovans [4] or to the decreased MH androgenization revealed by both craniofacial anatomy and digit ratio measurements compared to AH. According to Nelson and colleagues [61], the digit ratios of Neanderthals could be linked to a polygynous social structure and a higher level of male–male competition than most contemporary MH populations. Interestingly, one of the MHU sites identified in this study (OTR-rs53576) has been found to predict digit ratios, emotional (anger) and cognitive (hostility) aggression in 7 ethnic MH groups, something that could be the case also between MH and AH.
Fig. 4

Hypothesized functional links between identified SNPs on the OTR-VTRs and sociality-related changes/similarities across species/lineages. On the left, a schematic tree between portraying the phylogenetic relationships between chimpanzee, bonobo, MH and AH. On the right, a table showing the evolutionary comparisons discussed (1st Column), the sociality-related functional links reviewed for each respective evolutionary comparison (2nd Column; all references can be found in the Discussion), and the strongest putative SNPs on the OTR-VTRs we identified (3rd Column) for each comparison (1st Column), which we hypothesize might be underlying the neurobiology of their functional differences/similarities (2nd Column).

Hypothesized functional links between identified SNPs on the OTR-VTRs and sociality-related changes/similarities across species/lineages. On the left, a schematic tree between portraying the phylogenetic relationships between chimpanzee, bonobo, MH and AH. On the right, a table showing the evolutionary comparisons discussed (1st Column), the sociality-related functional links reviewed for each respective evolutionary comparison (2nd Column; all references can be found in the Discussion), and the strongest putative SNPs on the OTR-VTRs we identified (3rd Column) for each comparison (1st Column), which we hypothesize might be underlying the neurobiology of their functional differences/similarities (2nd Column). HU sites, where AH is homozygous with a new mutation not found in NHP, and MH are variant, can offer information on those SNPs that were likely present in AH and were selected for or purged out in the MH populations. OTR-rs59190448 and OTR-rs237888 are the only two sites where the HU allele ended up being positively selected in MH [1], possibly resulting in beneficial phenotypes. With the open chromatin region for OTR-rs237888 also in the cingulate gyrus, we propose that this site along with the aforementioned OTR-rs59190448, could account for a more prosocial phenotype in the human lineage compared to our non-human primate ancestors at the time of the Pan-Homo lineage split [45,62]. These sites can elucidate the genetic underpinnings of the progressive social tolerance needed for the intensive cultural transmission of technological innovations (e.g., fire use) in the evolution of humankind [63], as well as for the reduced aggression indicated by several markers in early hominid evolution, such as the reduction of male canine size [64] and the accelerated demographic success [65]. Contrariwise, some of the other HU alleles that were found in open chromatin or conserved sites, have not been highly retained in the global MH population, and probably can account for changes that were not beneficial (OTR: rs9872310 [open chromatin], rs13316193 [conserved site], rs4686302 [open chromatin, mixed results on whether it is benign or deleterious]; VTR1A: rs3803107). For the three sites of convergent evolution between MH and bonobos, OTR-rs2228485 was found to be under balancing selection in MH and is also one of the most highly ranked sites in our RegulomeDB analysis, pointing to putative high functionality. Only the OTR-rs2228485(G) and the VTR1A-rs1042615(A) sites were the major alleles in our bonobo sample, but in order to conclusively resolve the state of these sites in bonobos, larger population samples will be needed. Nonetheless, our findings on convergent changes for MH with bonobos could be insightful for understanding the posited similarities in prosociality, social tolerance and cooperation between MH and bonobos, and the differences of both compared to chimpanzees [66,67]. For example, bonobos outperform chimpanzees on tasks relevant to social causality or theory of mind [68] and are more attentive to the face and eyes [69], suggestive of higher empathic sensitivity. In addition, both adult testosterone [70] and prenatal androgen [71] levels are lower in bonobos than in chimpanzees, with the latter being more in line with human measures in relevant correlates (i.e., second-to-fourth finger length ratio), and associated with higher empathy [72]. Added to that, according to a recent study [41], bonobos, since their split from chimpanzees, show strong positive selection signals on variants located in genes of the oxytocin and vasotocin pathway (including OTR and VTR1A), something viewed as evidence for reduced aggression and self-domestication in bonobos (we note though that a more recent study [73] did not find such selection signatures using branch and branch-site models). Staes and colleagues [73] in a recent comparison of several neurochemical receptor genes in MH, chimpanzees and bonobos, but not AH, showed partial overlap with our results, which we confirmed (Nicky Staes; personal communication) was due to them not including the reference bonobo genomes in their sample, as well as using different ancestral state inferences. Our results for convergent evolution between MH and bonobos for the OTR-rs237897(A) agree, although in our sample bonobos are heterozygous (A/G), like MH, and in their sample homozygous (G/G). Our findings disagree for our identified OTR-rs2228485(G) convergence between MH and bonobos, where we found that both of them were heterozygous (A/G), with the ancestral state being homozygous (A/A) in all studied NHP. Due to different bonobo samples used, where all bonobos were homozygous (G/G), and no further ancestral state being considered, Staes et al. deemed this was a MH-chimpanzee convergence, something that contradicts our data. Lastly, when it comes to variants they identified in MH only, our results widely agree, although in our study, since we included AH genomes, we found that some of their MH-variants were present already in AH (e.g., OTR-rs59190448 and OTR-rs237888). Similarly, other contradictions can be explained due to our use of macaque SNPs in our inference of the ancestral state. For instance, since we found the OTR-rs237911(G) allele in one of the macaque databases we used (Supplementary Table 2), this excluded it from being considered as unique to MH, in contrast to what Staes et al. suggested. In summary, we used an interdisciplinary approach to understand the evolution of hominid prosociality through the lens of the OTR-VTRs, where we combined evidence from modern and archaic genomics, population genetics, transcriptomics, and behavioral and neuroscientific studies, among others (see Supplementary Table 14 for all our results at a glance). Our approach was hypothesis-driven, narrowed down to 3 gene candidates, which we deem necessary in conjunction with genome-wide studies, whose strict filters end up discovering what is present only in very high percentages in certain species' comparisons. In doing this, we identified several strong allelic candidates pointing towards a prosocial shift in MH (5 sites in OTR-VTRs) and in the hominin populations altogether (2 sites). Future studies are needed to investigate further their presence in larger sample sizes, as well as their functionality. So far, our results and interpretations are broadly compatible with information based on the fossil record, paleogenomic evidence [[5], [6], [7], [8]], and with behavioral differences between chimpanzees and bonobos [66,67]. We lastly suggest that other receptors and ligands should be studied as well (glutamate receptors [74], dopamine [45], β-endorphines [62], as well as other genes in the genetic signaling pathways where OTR-VTRs function), so that we can achieve a broader understanding of how they could all have acted in concert to form a broader set of changes that set the stage for our prosocial profile.

Methods

Variation pattern analyses

We retrieved the OTR, VTR1A and VTR1B DNA sequences from the following sources: the publicly available genomes of 14 Neanderthal individuals [5,[19], [20], [21], [22], [23], [24]] and a Denisovan [6]; seven high-coverage present-day human genomes (San (HGDP01036), Mbuti (HGDP00982), Karitiana (HGDP01015), Yoruba (HGDP00936), Dinka (DNK07), French (HGDP00533) and Han (HGDP00775), originally sequenced for [5]; 1000 Genomes project data Phase 3 [39] (as shown in Ensembl [75]); a chimpanzee (Pan Troglodytes) (CHIMP2.1.4/panTro4), bonobo (Pan Paniscus) (panpan1.1/panPan2) and rhesus macaque (Macaca Mulatta) (BMC Mmul_8.0.1/rheMac8). We ran a multi-alignment between the modern human, archaic human, chimpanzee, bonobo and macaque gene sequences of OTR, VTR1A and VTR1B. Of the differences we found, we focused on those which are polymorphic in MH with literature associated with sociality or social deficits. We performed the alignments using the ‘Phylogenetic Context’ tool built in Ensembl [75], tools in the Max Planck for Evolutionary Anthropology Ancient Genome Browser (https://bioinf.eva.mpg.de/jbrowse/), including MUSCLE [76] and MView [77] to generate the alignments, and Aliview [78], Decipher for R [79], Bedtools to edit and analyze them. We used all the genomic sequence of the genes we aligned, as provided in the standard layout of the files of the genomic sequences in the Ensembl database, namely with 600 bp upstream and downstream. We defined the genomic sequences in the same way when we extracted the gene sequences from the archaic genomes. The only additional regions we included ad hoc in our analysis that fall further out of 600bp window were those of several VTR1A-microsatellites that have been associated with social-related phenotypes [25,26] (RS3-(CT)4TT (CT)8 (GT)24, RS1-(GATA)14, GT25, and AVR-(GT)14 (GA)13(A)8). We then aligned these sites with the rest of the available Neanderthal genomes (AH: Spy [20] (2 individuals), Goyet [21], Les Cottés [22] (5 individuals), Hohlenstein-Stadel [23], Scladina [23], Chagyrskaya [80]) and additional NHP genomes, chimpanzee (Pan Troglodytes) (Pan_tro 3.0/panTro5 and Clint_PTRv2/panTro6), bonobo (Pan Paniscus) (Max-Planck Institute panpan1/panPan1) and Rhesus macaque (Macaca Mulatta) (Mmul_10/rheMac10). Of the AH sites, we included only sequence reads that had a mapping quality of ≥25. Due to low-quality DNA for most of the ancient samples, and given the heterogeneity of procedures for data production in AH genome sequencing, we considered variant alleles only those present in at least two AH genomes, to avoid false positives due to sequencing errors. Following [81], we filtered out the samples that strongly deviated from the remaining individuals in the dataset. This filtering of samples only affected two major alleles discussed in the study: the MH-bonobo convergent VTR1A-rs1042615(A), that is also found in the Denisovan sequence; and the MHS OTR-rs6770632(C) allele that is also found in the Altai Neanderthal sequence. All alleles we retrieved can be found in Supplementary Table 1. To assess variation in NHP, we used Single Nucleotide Variant (SNV)-data from: bonobos (13 bonobos from Ref. [42], 27 individuals with only OTR and VTR1A data from Ref. [82], 113 individuals with only VTR1A data from Ref. [83]); chimpanzees (25 individuals from Ref. [42], 35 [82] and 62 [84] individuals with only VTR1A and OTR data); and macaques (1234 individuals from the mGAP database [85] and dbSNP 150 in Ensembl [75], where the mGAP data have been incorporated) (Supplementary Table 2). The SNV-data from bonobos and chimpanzees from Ref. [42] were lifted over to the GRCh38/hg38 version of the human genome (Homo sapiens), while the macaque SNVs [85] were lifted to the GRCh37/hg19 version of the human genome; they were lifted from different human references because these are where the different primate data sets were mapped to. We lastly used information from several other publicly available genomes of these species, since they were sequenced from different individuals (chimpanzee: Pan_tro 3.0/panTro5 and Clint_PTRv2/panTro6, bonobo: Max-Planck Institute panpan1/panPan1, and rhesus macaque: Mmul_10/rheMac10).

Ancestral allele inference

In order to infer the ancestral alleles of the identified sites, we aligned the MH sequences of OTR, VTR1A and VTR1B against 12 non-human primate species, and inferred the ancestral sequences through Ortheus [86] using the ‘Phylogenetic Context’ tool built in Ensembl [75]. Ortheus is a probabilistic method that infers the ancestral allele, using a phylogenetic model, that incorporates gaps, to infer insertion and deletion events. Ancestral sequences are predicted for each node of the phylogenetic tree. Multiple alignments of all the sites we identified in 12 primate species, with the inferred ancestral sequences in each node, can be found in Supplementary Note 1. Although we found several other putative sites of convergent evolution between humans and other non-human primates (e.g., mouse lemur, olive baboon, gelada etc.; Supplementary Note 1), in absence of variation data in these species we can't determine the reliability of this convergence as we have done with bonobo population data.

Association studies analysis

We went through all the association studies identified in the NCBI Variation Viewer [87] as of October 2020 (e.g. for OTR: https://www.ncbi.nlm.nih.gov/variation/view/?q=OXTR) for SNPs that have publications (Filters: ‘Variant type’ (: Single Nucleotide Polymorphism), and ‘Has publications’ (: Yes)), and report the association(s) identified for each SNP site (Supplementary Tables 3, 4 and 5). We then classified these associations on whether they were strongly related to sociality (e.g., ‘prosociality’, ‘social cognition’), ‘possibly related to sociality’ (e.g., ‘depression’, ‘Attention Deficit Hyperactivity Disorder’), or not (e.g. ‘diabetes’). We calculated the percentage of the associations for each site first, to then add all the percentages for each category together averaged out by the total number of the SNPs. Although there are more studies on the OTR SNPs in total, mostly due to replication bias, we normalized this factor, by calculating an association-percentage for each SNP first (whether it came from one or several studies), and then used the number of the SNPs to average out the final results (instead of the number of studies). We calculated p values between the percentages, using a Chi-squared test (n exact sample sizes, degrees of freedom and confidence intervals are noted in Supplementary Tables 3, 4 and 5). We lastly calculated the difference between the proportions of the studies related to sociality for each pair-wise comparison of the analyzed genes, using a Chi-squared test (n exact sample sizes, degrees of freedom and confidence intervals are noted in Supplementary Table 6). All the sites we report have studies that we classified as clearly related to sociality, with the exception of rs10784339 that has been associated with substance use disorders (Supplementary Tables 7–9). Although we classified those studies as not related to sociality, we included this site in the analysis, recognizing that there might be subjectivity in the classification of substance use disorders as being related to sociality or not. For some sites where we identified variation patterns we wanted to study further in terms of disorders, we performed an additional exhaustive literature research in the National Center for Biotechnology Information [87] (https://www.ncbi.nlm.nih.gov/pubmed/), SNPedia (http://snpedia.com) and Google Scholar (https://scholar.google.com/platforms) (as of October 2020) for the clinical significance of each one of the SNPs we identified in present-day human populations. On Supplementary Tables 10–12, we have included most of studies that report an association of these variants, a short description of their exact impact, as well as the trial sample of each study. We have not included review studies that recapitulate findings of original studies or studies showing negative results. We performed pathogenicity prediction analyses using the following seven tools: CADD [29] (Combined Annotation Dependent Depletion), which scores the predicted deleteriousness of SNVs and insertion/deletions variants in the human genome by integrating multiple annotations including conservation and functional information into one metric. In our analysis, we display scores above 20 as likely deleterious and scores below 20 as likely benign. GERP [30] (Genomic Evolutionary Rate Profiling), which identifies constrained loci in multiple sequence alignments by comparing the level of substitution observed to that expected if there was no functional constraint. Positive scores represent highly-conserved positions while negative scores represent highly-variable positions. Mutation Assessor [31], which predicts the functional impact of amino-acid substitutions in proteins using the evolutionary conservation of the affected amino acid in protein homologs. The rank score is between 0 and 1, with variants with higher scores being more likely to be deleterious. REVEL [32] (Rare Exome Variant Ensemble Learner), which is an ensemble method for predicting the pathogenicity of missense variants. It integrates scores from MutPred, FATHMM v2.3, VEST 3.0, PolyPhen-2, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP++, SiPhy, phyloP, and phastCons. Scores range from 0 to 1, and variants with higher scores are predicted to be more likely to be pathogenic. We considered scores above 0.5 as ‘likely disease causing’ and below 0.5 is ‘likely benign'. MetaLR [33] (Meta Logistic Regression), which uses logistic regression to integrate nine independent variant deleteriousness scores and allele frequency information to predict the deleteriousness of missense variants. Variants are classified as ‘tolerated’ or ‘damaging’; scores rank between 0 and 1, and variants with higher scores are more likely to be deleterious. SIFT [34] (Sorting Intolerant From Tolerant), which predicts whether an amino acid substitution is likely to affect protein function based on sequence homology and the physico-chemical similarity between the alternate amino acids. Scores <0.05 are classified as 'deleterious' and all others as ‘tolerated'. PolyPhen-2 [35], which predicts the effect of an amino acid substitution on the structure and function of a protein using sequence homology, Pfam annotations, 3D structures from PDB where available, and a number of other databases and tools (including DSSP, ncoils etc.). The PolyPhen-2 score represents the probability that a substitution is damaging, so values nearer 1 are more confidently predicted to be deleterious (note that this is opposite to SIFT). We performed regulation analyses using RegulomeDB [36] and Ensembl regulation data resources [37]. RegulomeDB analysis make use of large datasets, including ChIP-seq, FAIRE-seq and DNase I hypersens Itive information for a variety of important regulatory factors across a diverse set of cell types, chromatin state information across over 100 cell types, binding sites of transcription factors (including Position-Weight Matrix and DNase Footprinting), and expression quantitative trait loci (eQTL) information allowing the association of distal sites with gene promoters. The Ensembl resources provide a functional annotation of the regulatory elements in the human genome, including data on epigenetic marks, transcription factor binding and DNA methylation, as well as microarray probe mappings (Table 2, Fig. 2). RegulomeDB has developed a heuristic scoring system based on functional confidence of a variant. Category 1 variants are those that are known eQTLs for genes, and thus have been shown to be associated with expression. Within Category 1, subcategories indicate additional annotations from the most confident (1a, which has transcription factor (TF) binding, a motif for that TF, and a DNase footprint) to the least confident (1f, which has only TF binding or a DNase peak). According to Boyle and colleagues [36], who developed the system: “the additional categories represent analogous annotations to Category 1 but without eQTL data and, thus, no known direct effect on binding. Category 2(a–c) demonstrates direct evidence of binding through ChIP-seq and DNase with either a matched PWM (positional weight matrix) to the ChIP-seq factor or a DNase footprint. Category 3(a–b) is considered less confident in affecting binding due to a more incomplete set of evidence. These sites have ChIP-seq evidence and either a motif that matches the ChIP-seq data but no DNase evidence, or DNase evidence and any other motif. Finally, Categories 4–6 lack evidence of the variant actually disrupting the site of binding. These include DNase and ChIP-seq evidence (Category 4), DNase or ChIP-seq evidence (Category 5), or any single annotation not in the above categories (Category 6).” A list of the scores of all variants studied can be found in Supplementary Table 10. We ran Linkage Disequilibrium analysis (using LDmatrix [38]) to test if any of the sites we identified get coinherited, using data from all population included in the 1000 Genomes Project and a cut off of R2 > 0.8. LDMatrix (https://ldlink.nci.nih.gov/?tab=ldmatrix) creates an interactive heatmap matrix of pairwise linkage disequilibrium statistics. R2 and D′ scores for all sites' combinations can be found in Supplementary Table 11.

Population data and selection analysis

We collected human allele frequency data for our identified sites using data from the 1000 Genomes Project [39] (Phase 3) and gnomAD v2.1.1 and v3 [40]. We listed the minor allele frequency (MAF) and the highest MAF in present day human populations in Table 1, and specific frequencies only from the 1000 Genomes Project in Supplementary Table 12. For bonobos, we also analyzed the data from 13 bonobos by Kovalaskas and colleagues [41] (raw data provided by Sarah Kovalaskas), where strong positive selection signals on OTR and VTR1A were identified. We also reviewed the literature for studies on selection signals of these sites in MH. We used the supplementary information found in Schaschl et al., 2015 [1], and filtered the results of their FDIST, Bayescan and extended Lewontin and Krakauer (FLK) tests, reporting only those on our identified SNPs with significant values (p or q values < 0.05) (Supplementary Table 13). In their study, Schaschl et al., 2015 [1] reported in the main text only those SNPs with p or q values < 0.005. Schaschl et al., 2015 [1] analyzed 14 human populations (n = 1092) from the 1000 Genomes database and aimed to identify SNPs under selection employing two tests that use Bayesian models to detect FST outliers (FDIST and Bayescan) and the FLK test that compares patterns of differences between allele frequencies in several populations relative to their expectation under neutral evolution (details for each analysis in the Methods of Schaschl et al., 2015 [1]).

Funding statement

CB acknowledges the financial support from the Spanish Ministry of Economy and Competitiveness/FEDER funds (grant FFI2016-78034-C2-1-P), the Spanish Ministry of Science and Innovation (PID2019-107042 GB-I00), a Marie Curie International Reintegration Grant from the European Union (PIRG-GA-2009-256,413), research funds from the Fundacio' Bosch i Gimpera, from the Generalitat de Catalunya (2017-SGR-341), and the MEXT/JSPS Grant-in-Aid for Scientific Research on Innovative Areas 4903 (Evolinguistics: JP17H06379). CTh acknowledges support from the Generalitat de Catalunya in the form of a doctoral (FI) and the Rockefeller University. AA acknowledges financial support from the Spanish Ministry of Economy and Competitiveness and the European Social Fund (BES-2017-080,366). EDJ is funded by the Howard Hughes Medical Institute and the Rockefeller University.

Author contributions statement

CTh conceptualized and designed the study, ran variation, pathogenicity prediction, regulation, human population and macaque data, and association analyses. AA ran variation, pathogenicity prediction, human, chimpanzee and bonobo variation analyses and reviewed the literature. EDJ and CB supervised the study.

Data availability statement

All data generated or analyzed during this study are included in the Suppl. Notes and Tables.

Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. ☐The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:
  77 in total

1.  MUSCLE: multiple sequence alignment with high accuracy and high throughput.

Authors:  Robert C Edgar
Journal:  Nucleic Acids Res       Date:  2004-03-19       Impact factor: 16.971

2.  Early hominids may have been weed species.

Authors:  Richard S Meindl; Morgan E Chaney; C Owen Lovejoy
Journal:  Proc Natl Acad Sci U S A       Date:  2018-01-22       Impact factor: 11.205

3.  Oxytocin variation and brain region-specific gene expression in a domesticated avian species.

Authors:  Yasuko Tobari; Constantina Theofanopoulou; Chihiro Mori; Yoshimi Sato; Momoka Marutani; Sayaka Fujioka; Norifumi Konno; Kenta Suzuki; Akari Furutani; Shiomi Hakataya; Cheng-Te Yao; En-Yun Yang; Chia-Ren Tsai; Pin-Chi Tang; Chih-Feng Chen; Cedric Boeckx; Erich D Jarvis; Kazuo Okanoya
Journal:  Genes Brain Behav       Date:  2021-12-02       Impact factor: 3.449

4.  Differences in the cognitive skills of bonobos and chimpanzees.

Authors:  Esther Herrmann; Brian Hare; Josep Call; Michael Tomasello
Journal:  PLoS One       Date:  2010-08-27       Impact factor: 3.240

5.  Signatures of positive selection in the cis-regulatory sequences of the human oxytocin receptor (OXTR) and arginine vasopressin receptor 1a (AVPR1A) genes.

Authors:  Helmut Schaschl; Susanne Huber; Katrin Schaefer; Sonja Windhager; Bernard Wallner; Martin Fieder
Journal:  BMC Evol Biol       Date:  2015-05-13       Impact factor: 3.260

6.  The complete genome sequence of a Neanderthal from the Altai Mountains.

Authors:  Kay Prüfer; Fernando Racimo; Nick Patterson; Flora Jay; Sriram Sankararaman; Susanna Sawyer; Anja Heinze; Gabriel Renaud; Peter H Sudmant; Cesare de Filippo; Heng Li; Swapan Mallick; Michael Dannemann; Qiaomei Fu; Martin Kircher; Martin Kuhlwilm; Michael Lachmann; Matthias Meyer; Matthias Ongyerth; Michael Siebauer; Christoph Theunert; Arti Tandon; Priya Moorjani; Joseph Pickrell; James C Mullikin; Samuel H Vohr; Richard E Green; Ines Hellmann; Philip L F Johnson; Hélène Blanche; Howard Cann; Jacob O Kitzman; Jay Shendure; Evan E Eichler; Ed S Lein; Trygve E Bakken; Liubov V Golovanova; Vladimir B Doronichev; Michael V Shunkov; Anatoli P Derevianko; Bence Viola; Montgomery Slatkin; David Reich; Janet Kelso; Svante Pääbo
Journal:  Nature       Date:  2013-12-18       Impact factor: 49.962

7.  Bonobo personality traits are heritable and associated with vasopressin receptor gene 1a variation.

Authors:  Nicky Staes; Alexander Weiss; Philippe Helsen; Marisa Korody; Marcel Eens; Jeroen M G Stevens
Journal:  Sci Rep       Date:  2016-12-02       Impact factor: 4.379

8.  A catalog of single nucleotide changes distinguishing modern humans from archaic hominins.

Authors:  Martin Kuhlwilm; Cedric Boeckx
Journal:  Sci Rep       Date:  2019-06-11       Impact factor: 4.379

9.  Dosage analysis of the 7q11.23 Williams region identifies BAZ1B as a major human gene patterning the modern human face and underlying self-domestication.

Authors:  Matteo Zanella; Alessandro Vitriolo; Alejandro Andirko; Pedro Tiago Martins; Stefanie Sturm; Thomas O'Rourke; Magdalena Laugsch; Natascia Malerba; Adrianos Skaros; Sebastiano Trattaro; Pierre-Luc Germain; Marija Mihailovic; Giuseppe Merla; Alvaro Rada-Iglesias; Cedric Boeckx; Giuseppe Testa
Journal:  Sci Adv       Date:  2019-12-04       Impact factor: 14.136

10.  Sickle cell anaemia and malaria.

Authors:  Lucio Luzzatto
Journal:  Mediterr J Hematol Infect Dis       Date:  2012-10-03       Impact factor: 2.576

View more
  1 in total

1.  Love and fear: A special issue.

Authors:  C Sue Carter; Robert Dantzer
Journal:  Compr Psychoneuroendocrinol       Date:  2022-06-17
  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.