Ancient West African foragers in the context of African population history.

Mark Lipson¹, Isabelle Ribot², Swapan Mallick^3,4,5, Nadin Rohland³, Iñigo Olalde^3,6, Nicole Adamski^3,5, Nasreen Broomandkhoshbacht^3,5,7, Ann Marie Lawson^3,5, Saioa López⁸, Jonas Oppenheimer^3,5,9, Kristin Stewardson^3,5, Raymond Neba'ane Asombang¹⁰, Hervé Bocherens^11,12, Neil Bradman^8,13, Brendan J Culleton¹⁴, Els Cornelissen¹⁵, Isabelle Crevecoeur¹⁶, Pierre de Maret¹⁷, Forka Leypey Mathew Fomine¹⁸, Philippe Lavachery¹⁹, Christophe Mbida Mindzie¹⁰, Rosine Orban²⁰, Elizabeth Sawchuk²¹, Patrick Semal²⁰, Mark G Thomas^8,22, Wim Van Neer^20,23, Krishna R Veeramah²⁴, Douglas J Kennett²⁵, Nick Patterson^3,26, Garrett Hellenthal^8,22, Carles Lalueza-Fox⁶, Scott MacEachern²⁷, Mary E Prendergast^3,28, David Reich^3,4,5,26.

Abstract

Our knowledge of ancient human population structure in sub-Saharan Africa, particularly prior to the advent of food production, remains limited. Here we report genome-wide DNA data from four children, two of whom were buried approximately 8,000 years ago and two 3,000 years ago, from Shum Laka (Cameroon), one of the earliest known archaeological sites within the probable homeland of the Bantu language group [1-11]. One individual carried the deeply divergent Y chromosome haplogroup A00, which today is found almost exclusively in the same region [12, 13]. However, the genome-wide ancestry profiles of all four individuals are most similar to those of present-day hunter-gatherers from western Central Africa, which implies that populations in western Cameroon today, as well as speakers of Bantu languages from across the continent, are not descended substantially from the population represented by these four people. We infer an Africa-wide phylogeny that features widespread admixture and three prominent radiations, including one that gave rise to at least four major lineages deep in the history of modern humans.

Year: 2020 PMID: 31969706 PMCID: PMC8386425 DOI: 10.1038/s41586-020-1929-1

Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 49.962

The deposits at Shum Laka, a rockshelter located in the Grassfields region of western Cameroon, are among the most important archaeological sources for the study of Late Pleistocene and Holocene prehistory in West-Central Africa [1-4]. The oldest human-occupied layers at the site date to ∼30,000 calendar years before present (BP), but of special interest are a series of artifacts and skeletons from ∼8000–3000 BP, between the Later Stone Age (LSA) and the Iron Age (Extended Data Fig. 1; Supplementary Information section 1). This transitional period, sometimes referred to as the Stone to Metal Age (SMA), featured a gradual appearance of new stone tools as well as pottery [3-5]. Subsistence evidence in the rockshelter during the SMA points primarily to foraging, but with increasing use of fruits from Canarium schweinfurthii coinciding with developments in material culture and serving as a foundation for later agriculture [3] (Supplementary Information section 1; Supplementary Table 1). These cultural changes and their early appearance at Shum Laka are particularly intriguing because the Cameroon/Nigeria border area during the late Holocene was likely the cradle of Bantu languages, and of populations whose descendants would spread across much of the southern half of Africa between ∼3000–1500 BP, resulting in the vast range and diversity of the Bantu language family [6-11].

Extended Data Figure 1:

Overview of the site of Shum Laka.

The left column represents generalized stratigraphy, with radiocarbon dates (uncalibrated) shown as red dots on the y-axis, and deposits indicated by their archaeological nomenclature (P, S/Si = Pleistocene; T, A = Holocene; Ao = Holocene ochre ashy layer; Ag = Holocene gray ashy layer; after ref. [76]). Columns 1–6 display chronological extents of technological traditions: 1, microlithic quartz industry; 2, macrolithic flake and blade industry on basalt; 3, bifaces of the axe-hoe type; 4, pecked grounded adze and arrow heads; 5, pottery; and 6, iron objects. Column 7 indicates the two Shum Laka burial phases. Column 8 shows climatic reconstructions based on carbon stable isotopes and pollen from organic matter extracted from sediment cores at Lake Barombi Mbo in western Cameroon (more arid conditions to the left and more humid conditions to the right [60, 76]), along with archaeological eras (LSA, Later Stone Age; SMA, Stone to Metal Age; IA, Iron Age). RMCA Collection; Drawings Y. Paquay, composition © RMCA, Tervuren; modified by E. Cornelissen [77].

A total of 18 human skeletons have been discovered at Shum Laka, comprising two distinct burial phases (Supplementary Information section 1) [1-3]. We attempted to retrieve DNA from six petrous bone samples and obtained working data from two early SMA and two late SMA individuals (∼8000 and ∼3000 BP, respectively; Table 1, Supplementary Table 2). The two earlier individuals—a boy of 4±1 years (2/SE I) lying on top of the lower limbs of an adolescent male of 15±3 years (2/SE II) [2]—were recovered from a primary double burial, while the two later individuals—a boy of 8±2 years (4/A) and a girl of 4±1 years (5/B) [2]—were in adjacent single burials.

Table 1:

Details for the four ancient Shum Laka individuals in the study

ID	Age at death (yrs)	Date (cal BP)	Radiocarbon date (uncal)	Sex	Mt hap	Y hap	Cov	SNPs	Mt/X contam (%)

2/SE I	4±1	7920–7690	6985 ± 30 BP (PSUAMS-6307)	M	L0a2a1	B	0.70	564164	1.0/1.0
2/SE II	15±3	7970–7800	7090 ± 35 BP (PSUAMS-6308)	M	L0a2a1	A00	7.71	1082018	1.5/0.6
4/A	8±2	3160–2970	2940 ± 20 BP (PSUAMS-6309)	M	L1c2a1b	B2b	3.83	935777	0.3/0.5
5/B	4±1	3210–3000	2970 ± 25 BP (PSUAMS-6310)	F	L1c2a1b	..	6.41	1014618	0.5/..

Calibrated direct radiocarbon dates are given as 95.4% CI (Methods). Age (mean ± SE) was determined from skeletal remains [2], and sex from genetic data. Mt/Y hap, mtDNA/Y-chromosome haplogroup; Cov, average sequencing coverage; Mt/X contam, estimated contamination from mtDNA/X chromosome. See also Supplementary Table 2.

We extracted DNA from bone powder and prepared 2–4 libraries per individual for Illumina sequencing, enriching for ∼1.2 million target single-nucleotide polymorphisms (SNPs) across the genome (Methods; Supplementary Table 2). Final coverage ranged from 0.7–7.7× (0.56–1.08 million SNPs). Authenticity of the data was supported by the observed rate of apparent C-to-T substitutions in the final base of sequenced fragments (4–10%, within the expected range given our library preparation strategy [14]) and of heterozygosity for mitochondrial DNA (mtDNA) and for the X chromosome in males (estimated contamination 0.3–1.5%). We also generated whole-genome shotgun sequence data for individuals 2/SE II (∼18.5× coverage) and 4/A (∼3.9×), as well as genome-wide data (∼598,000 SNPs) for 63 individuals from five present-day Cameroonian populations (Extended Data Table 1; Supplementary Table 3).

Extended Data Table 1:

Populations used in the study

Population	Country	Language family	Date	Sample size	Data type	Reference

Shum Laka	Cameroon		~8000–3000 BP	4/1/1	1240k/DG/SG	This paper
Ancient Malawi HG	Malawi		~8100–2500 BP	7*	1240k	[22]
Mota	Ethiopia		~4500 BP	1	SG	[23]
Ancient South African HG	South Africa		~2000 BP	3[†]	SG	[21,22]
Taforalt	Morocco		~15,000–14,000 BP	6	1240k	[26]
Altai Neanderthal	Russia		~120,000 BP	1	DG	[78]
Aghem	Cameroon	NC	Present	28	HO	This paper
Bafut	Cameroon	NC	Present	11	HO	This paper
Baka	Cameroon	NC	Present	2	DG	[20]
Bakoko	Cameroon	NC	Present	1	HO	This paper
Bakola	Cameroon	NC	Present	2	DG	[20]
Bangwa	Cameroon	NC	Present	2	HO	This paper
Bedzan	Cameroon	NC	Present	2	DG	[20]
Fulani	Cameroon	NC	Present	2	DG	[20]
Lemande	Cameroon	NC	Present	2	DG	[25]
Mada	Cameroon	AA	Present	2	DG	[20]
Mbo	Cameroon	NC	Present	21	HO	This paper
Ngumba	Cameroon	NC	Present	2	DG	[20]
Tikar	Cameroon	NC	Present	2	DG	[20]
Agaw	Ethiopia	AA	Present	2	DG	[20]
Aka (Biaka)	Central African Republic	NC	Present	20/2	HO/DG	[22,25]
Chewa	Malawi	NC	Present	11	HO	[22]
Dinka	Sudan	NS	Present	7/4	HO/DG	[22,25]
French	France	IE	Present	3	DG	[25]
Hadza	Tanzania	KS	Present	5(2)/1	HO/DG	[22,25]
Han	China	ST	Present	4	DG	[25]
Herero	Namibia	NC	Present	2	DG	[25]
Khoesan	Namibia	KS	Present	22	HO	[22]
Mbuti	DR Congo	NC, NS	Present	10/4	HO/DG	[22,25]
Mende	Sierra Leone	NC	Present	8/2	HO/DG	[22,25]
Mursi	Ethiopia	NS	Present	2	DG	[20]
Sandawe	Tanzania	KS	Present	22	HO	[22]
Somali	Kenya	AA	Present	13	HO	[22]
Yoruba	Nigeria	NC	Present	70/3	HO/DG	[22,25]

List of populations used in analyses in the study. Data types are in-solution targeted SNP capture (1240k), whole-genome sequence with pseudo-haploid genotype calls (SG), high-coverage whole-genome sequence with diploid genotype calls (DG), and Human Origins SNP array (HO). For some populations, we used different sample sets for different analyses, indicated by slashes; Human Origins array genotyped individuals were used for PCA and for f-statistics testing differential relatedness to Shum Laka (Fig. 3B, Extended Data Fig. 3B). For Hadza, we used five individuals with Human Origins data for PCA and two of those five individuals for admixture graph modeling. HG, hunter-gatherers; AA, Afroasiatic; IE, Indo-European; KS, Khoesan; NC, Niger-Congo; NS, Nilo-Saharan; ST, Sino-Tibetan.

*Individuals from Hora, Chencherere, and Fingira.

†Individuals from Ballito Bay (A and B) and St. Helena Bay.

Uniparental markers and kinship analysis

All of the mtDNA and Y chromosome haplogroups we observe at Shum Laka are associated today with sub-Saharan Africans. The two earlier individuals carry mtDNA haplogroup L0a (specifically L0a2a1), which is widespread in Africa, while the two later individuals carry L1c (specifically L1c2a1b), which is found among both farmers and hunter-gatherers in Central and West Africa [15, 16]. Individuals 2/SE I and 4/A have Y chromosomes from macrohaplogroup B, often found today in Central African hunter-gatherers [17], while 2/SE II has the rare Y chromosome haplogroup A00, which was discovered in 2013 and is present at appreciable frequencies only in Cameroon, in particular among the Mbo and Bangwa in the western part of the country [12, 13]. A00 is the oldest known branch of the modern human Y chromosome tree, with a split time of ∼300,000–200,000 BP [12, 18, 19]. At 1666 positions (from whole-genome sequence data; Supplementary Table 4) that differ between present-day A00 [18] and all other Y chromosomes, the Shum Laka A00 carries the non-reference allele at 1521, translating to a within-A00 split at ∼37,000–25,000 BP (95% CI; Methods; Fig. 1).

Figure 1:

Y chromosome phylogeny.

Circles represent mutations along the (unrooted) A00 lineage where we observe the alternative (filled) or reference (empty) allele in the Shum Laka A00.

Leveraging the effects of chromosomal segments shared identical by descent (IBD), we computed rates of allelic identity for each pair of individuals to infer degrees of relatedness. Both contemporaneous pairs display elevated identity, with 2/SE I and 2/SE II at the level of fourth-degree relatives and 4/A and 5/B at the level of second-degree relatives (either uncle and niece, aunt and nephew, or half-siblings; Extended Data Fig. 2), supporting archaeological interpretations that the rockshelter was used as an extended family cemetery during both burial phases [2]. We would expect more recent shared ancestry for the contemporaneous pairs even if they were not closely related, but we observe clear signatures of long IBD segments across the genome, confirming their close family relatedness (Supplementary Information section 2). All four individuals also show evidence of recent inbreeding (i.e., intra-individual IBD).

Extended Data Figure 2:

Kinship analysis.

Shown are mean genome-wide allelic mismatch rates for each pair of individuals (blue), as well as intra-individual comparisons (red). We selected one read per individual at random at each targeted SNP (using all 1,233,013 targeted sites). Monozygotic twins (or intra-individual comparisons) are expected to have a value one-half as large as unrelated individuals; first-degree relatives, halfway between monozygotic twins and unrelated individuals; second-degree relatives, halfway between first-degree relatives and unrelated individuals; and so on. The presence of inbreeding also serves to reduce the rates of mismatches. For 4/A and 5/B, because both died as children, we can eliminate a grandparent-grandchild relationship, and the lack of long segments with both homologous chromosomes shared IBD implies that they are not double cousins (the few ostensible double-IBD stretches are likely a result of inbreeding; see Supplementary Information section 2). Thus, we can conclude that they were either uncle and niece (or aunt and nephew) or half-siblings. Bars show 99% confidence intervals (computed by block jackknife).
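The expectations stated in this caption follow a simple halving rule, which the minimal Python sketch below illustrates. It assumes mismatch rates have been normalized so that unrelated individuals have a value of 1, and it ignores inbreeding (which, as noted above, lowers mismatch rates). The function names are hypothetical and are not taken from the study's methods.

```python
def expected_relative_mismatch(degree):
    """Expected mismatch rate relative to unrelated individuals (= 1.0).

    degree 0 = monozygotic twins or an intra-individual comparison,
    degree 1 = first-degree relatives, degree 2 = second-degree, and so on.
    """
    if degree < 0:
        raise ValueError("degree must be non-negative")
    # Each additional degree halves the remaining gap to the unrelated value.
    return 1.0 - 0.5 ** (degree + 1)


def nearest_degree(relative_mismatch, max_degree=6):
    """Return the degree whose expected value is closest to the observation."""
    return min(
        range(max_degree + 1),
        key=lambda d: abs(expected_relative_mismatch(d) - relative_mismatch),
    )


# Print the expected value for twins through fourth-degree relatives.
for d in range(5):
    print(d, expected_relative_mismatch(d))
```

In practice the unrelated baseline has to be estimated from the data, and confidence intervals (here computed by block jackknife) determine whether adjacent degrees can be distinguished.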

PCA and allele-sharing statistics

We visualized the genome-wide relationships between the Shum Laka individuals and diverse present-day and ancient sub-Saharan Africans (Extended Data Table 1) using principal component analysis (PCA). Initially, we computed axes using East and West Africans and southern and East-Central African hunter-gatherers (Fig. 2A). The Shum Laka individuals project to the right of Bantu speakers and related West African populations (Chewa, Mbo, and Mende), closest to present-day West-Central hunter-gatherers from Cameroon (Baka, Bakola, and Bedzan [20]) and the Central African Republic (Aka, often known as Biaka). We then carried out a second PCA using only West and East Africans and Aka to compute the axes, and again the Shum Laka individuals project in the direction of West-Central hunter-gatherers (Fig. 2B). By contrast, present-day Niger-Congo-speaking groups from western Cameroon cluster tightly with other West Africans (Fig. 2; Extended Data Fig. 3A). In both plots, the two earlier Shum Laka individuals fall slightly closer to West and East Africans, but based on their overall similarity, we grouped all four together for most subsequent analyses.

Figure 2:

PCA results.

(A) Broad-scale analysis. (B) Narrow-scale analysis. Groups in blue (including ancient individuals, filled symbols) were projected onto axes computed using the other populations, using 593,124 SNPs (Methods). HG, hunter-gatherers; S.L., Shum Laka; W-Cent. HG, Aka plus Cameroon hunter-gatherers (Baka, Bakola, and Bedzan).

Extended Data Figure 3:

Alternative PCA and allele-sharing analyses.

(A) Broad-scale PCA (differing from Fig. 2A by projecting all present-day Cameroon populations; again using 593,124 Human Origins SNPs). Groups shown in blue were projected onto axes computed using the other populations. HG, hunter-gatherers; S. L., Shum Laka. The W-Cent. HG grouping consists of Aka and Cameroon hunter-gatherers (Baka, Bakola, and Bedzan). The majority of the present-day Cameroon individuals fall in a tight cluster near other West Africans and Bantu speakers. (B) Relative allele sharing (mean ± SE, multiplied by 10,000, computed on 538,133 SNPs, as in Fig. 3B) with Shum Laka versus East Africans (f4 (X, Yoruba; Shum Laka, Somali); x-axis) and versus Aka (f4 (X, Yoruba; Shum Laka, Aka); y-axis) for present-day populations from Cameroon (blue points) and southern and eastern Bantu speakers (Herero in red and Chewa in orange). Mada and Fulani share more alleles with Shum Laka than with Aka, but this is likely a secondary consequence of admixture from East or North African sources (as reflected in greater allele sharing with Somali; see also Supplementary Information section 3). Bars show one standard error in each direction.

Using f-statistics (Fig. 3A), we investigated components of “deep ancestry” from sources diverging earlier than the split between non-Africans and most sub-Saharan Africans (above point (2) in Fig. 4A). We began with the statistic f4 (X, Mursi; South Africa HG, Han), which is expected to be increasingly positive for increasing deep ancestry in population X (via allele-sharing between X and ancient South African hunter-gatherers [21, 22]), with a baseline of zero set by Mursi, Nilotic-speaking pastoralists from western Ethiopia [20]. Shum Laka shows a large positive statistic, comparable to West-Central African hunter-gatherers (Fig. 3A, top), while other West Africans (e.g., Yoruba and Mende) yield smaller but significantly positive values, as do East African hunter-gatherers (Hadza from Tanzania and the ∼4500 BP Mota individual from Ethiopia [23]). We also obtained consistent results from analogous statistics with different reference groups (Extended Data Table 2).
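For readers unfamiliar with the notation, an f4-statistic is conventionally defined as the average over SNPs of the product of two allele-frequency differences. The sketch below is a minimal illustration in plain Python, assuming per-SNP allele frequencies are already available as equal-length lists. The function name and the toy frequencies are hypothetical, and this is not the software used in the study.

```python
def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies, one entry per SNP,
    with all four lists in the same SNP order.
    """
    if not a or not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("need four non-empty lists of equal length")
    products = [(ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)]
    return sum(products) / len(products)


# Toy data (invented): X deviates from the baseline in the same direction
# as population C deviates from population D, so the statistic is positive.
x = [0.9, 0.2, 0.7]
baseline = [0.5, 0.5, 0.5]
c = [0.8, 0.3, 0.6]
d = [0.5, 0.5, 0.5]

print(f4(x, baseline, c, d))
```

Swapping the first two populations flips the sign, which is why the choice of baseline population (Mursi here) fixes what a value of zero means.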

Figure 3:

Allele-sharing statistics.

(A) Statistics sensitive to deep ancestry (mean ± 2SE, multiplied by 1000; blue, deeper than non-Africans; red, deeper than South African hunter-gatherers; computed on 1,121,119 SNPs). S.L., Shum Laka; SA, ancient South African hunter-gatherers. (B) Relative allele sharing (mean ± SE, multiplied by 10,000; computed on 538,133 SNPs) with Shum Laka versus East Africans (f4(X, Yoruba; Shum Laka, Somali); x-axis) and versus Aka (f4 (X, Yoruba; Shum Laka, Aka); y-axis) for present-day populations from Cameroon (blue) and southern (Herero, red) and eastern (Chewa, orange) Bantu speakers. See also Extended Data Fig. 3B.

Figure 4:

Admixture graph results.

Points at which multiple lineages are shown diverging simultaneously indicate splits occurring in short succession (whose order we cannot confidently assess) but do not represent exact multifurcations. Key points are (1) early modern human split, (2) East African divergences, and (3) Bantu expansion. Branch lengths not drawn to scale. (A) Full model; see also Extended Data Fig. 4. HG, hunter-gatherer; AP, agro-pastoralist; *proportion not well constrained. (B) Geographical structure: shaded areas denote hypothesized historical locations of lineages descended from split point (1) in panel (A), and branching order is shown for populations descended from split point (2) (one ancestry component per population, with leaf nodes at sampling locations). The blue star represents Shum Laka (dashed line, possible direction of gene flow).

Extended Data Table 2:

Allele-sharing statistics for deep ancestry

	f4 (X, Mursi; SA, Han)		f4 (X, Mota; SA, Han)		f4 (X, Han; SA, Mursi)		f4 (X, Mota; SA, Mursi)

Test pop	Value	Z-score	Value	Z-score	Value	Z-score	Value	Z-score

Dinka	1.4	5.8	−2.0	−5.5	0.1	0.2	−6.3	−20.2
Mota	3.4	9.0	0	0	6.3	18.1	0	0
Hadza	4.1	10.3	0.8	1.7	7.3	21.2	1.0	2.7
Yoruba	4.7	17.8	1.3	3.8	5.2	18.2	−1.1	−3.5
Lemande	5.0	16.8	1.7	4.5	5.7	18.2	−0.6	−2.1
Mende	5.7	19.1	2.3	6.3	6.3	20.0	0	0
Shum Laka	11.7	38.7	8.3	22.6	12.7	40.8	6.4	20.5
Aka	13.3	39.1	9.9	25.2	13.6	40.4	7.3	22.0
Mbuti	16.4	50.4	13.0	34.9	16.4	49.9	10.0	31.8
Mursi	0	0	−3.4	−9.0	..	..	..	..
Agaw	..	..	..	..	0.1	0.3	−6.2	−18.9
SA	..	..	..	..	..	..	..	..

	f4 (X, Mursi; SA, Mota)		f4 (X, Han; SA, Mota)		f4 (X, Han; SA, Yor)		f4 (X, Mursi; Chimp, Yor)

Test pop	Value	Z-score	Value	Z-score	Value	Z-score	Value	Z-score

Dinka	0.8	3.3	3.7	11.9	−0.7	−2.8	−0.9	−4.7
Mota	..	..	..	..	5.7	18.1	5.2	17.7
Hadza	4.1	11.5	7.0	17.7	4.8	15.2	3.4	11.4
Yoruba	4.1	15.7	7.1	21.6	..	..	..	..
Lemande	4.1	14.5	7.1	21.0	..	..	..	..
Mende	4.8	17.3	7.8	22.5	..	..	..	..
Shum Laka	9.1	29.8	12.0	33.7	8.0	28.7	8.3	31.9
Aka	10.3	33.4	13.2	35.5	7.8	24.8	8.5	30.1
Mbuti	12.5	41.8	15.5	44.1	11.6	40.8	11.8	46.3
Mursi	0	0	3.0	8.8	0.6	2.2	0	0
Agaw	−2.4	−7.7	0.6	1.8	0	0.2	−0.2	−0.9
SA	..	..	..	..	..	..	20.3	66.0

Variations of allele-sharing statistics (multiplied by 1000; computed on 1,121,119 SNPs) sensitive to ancestry in the test population X from a deeply-splitting lineage, along with Z-scores for difference from zero. We note that the zero level has a different meaning depending on which population is in the second position in the statistic. Blank entries are statistics that are confounded by specific relationships between the test population and one of the reference populations (in the third or fourth position; either duplication of the same group, Agaw with Han due to non-African-related ancestry, or Yoruba with other West Africans). From the statistics f4 (Mursi/Agaw, Han; South Africa HG, Yoruba), we find minimal differences in deep ancestry proportions among Han, Mursi, and Agaw; from f4 (X, Mursi; Chimp, Yoruba), we obtain a value for South African hunter-gatherers that is roughly twice as large as for Central African hunter-gatherers. SA, ancient South African hunter-gatherers; Yor, Yoruba.
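The Z-scores in this table express each statistic in units of its standard error. As a rough illustration of how such a value can be obtained, the sketch below applies an unweighted delete-one-block jackknife to per-SNP values grouped into blocks. This is a simplified assumption: the function name and toy data are hypothetical, equal-sized blocks are assumed, and the study's own procedure may differ (for example, by weighting blocks by SNP count).

```python
import math


def block_jackknife_z(blocks):
    """Return (estimate, standard error, Z-score) for the pooled mean.

    `blocks` is a list of lists; each inner list holds the per-SNP values
    (e.g. f4 products) for one contiguous genomic block.
    """
    n = len(blocks)
    if n < 2:
        raise ValueError("need at least two blocks")

    def pooled_mean(selected):
        values = [v for block in selected for v in block]
        return sum(values) / len(values)

    estimate = pooled_mean(blocks)
    # Recompute the estimate with each block left out in turn.
    leave_one_out = [pooled_mean(blocks[:j] + blocks[j + 1:]) for j in range(n)]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out)
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z


# Toy data (invented): five blocks of two SNPs each.
toy_blocks = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [5.0, 5.0]]
estimate, se, z = block_jackknife_z(toy_blocks)
print(estimate, se, z)
```

Leaving out whole blocks rather than single SNPs is what accounts for correlation between neighboring sites.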

Next, we computed f4 (X, Mursi; Chimp, South Africa HG) (using chimpanzee as an outgroup symmetric to all human populations) to evaluate whether any of this deep ancestry is from sources diverging more deeply than southern African hunter-gatherers (the modern human lineage with the oldest known average split date [21, 24, 25]). Previous work has shown that southern African hunter-gatherers are not a symmetric outgroup relative to other sub-Saharan Africans, with West Africans (especially Mende) having excess affinity toward deeper outgroups [22]. Indeed, our test statistic is maximized in Mende and other West Africans (Fig. 3A, bottom). Hadza and Mota have values close to zero, and Shum Laka and Central African hunter-gatherers are intermediate. Some populations yield positive values for both f4-statistics (Fig. 3A), but the two sets are poorly correlated, implying that they in part reflect separate signals. Combining our newly genotyped individuals with published data [20], we searched for differential allele-sharing between the Shum Laka individuals (compared to either East Africans [Somali] or Aka) and present-day Cameroonians (Fig. 3B, Extended Data Fig. 3B). We identified three distinct clusters: (a) Mada and Fulani, (b) hunter-gatherers, and (c) other Niger-Congo-speaking populations (in closeup in Fig. 3B). Within the third cluster are the only groups—Mbo, Aghem, and Bafut, all living close to Shum Laka today—with significantly Shum Laka-directed statistics in both dimensions, consistent with small proportions of Shum Laka-related admixture (maximum ∼7–8%; Supplementary Information section 3).

Admixture graph analysis

Finally, we built an admixture graph (Methods, Fig. 4A, Extended Data Fig. 4) co-modeling the ancient Shum Laka, Mota, and South African hunter-gatherer individuals; present-day Mbuti, Aka, Agaw (Afroasiatic speakers from Ethiopia [20]), Yoruba, Mende, and Lemande; non-Africans (French); and two outgroups (Altai Neanderthal and chimpanzee). We also fit versions of the model using alternative SNP ascertainments and additional populations (Hadza, Mbo, Herero, Chewa, Mursi, Baka, Bakola, Bedzan, Mada, Fulani, and ancient individuals from Taforalt in Morocco [26]) and obtained similar results (Extended Data Table 3; Supplementary Information section 3).

Extended Data Figure 4:

Primary inferred admixture graph with full parameters.

Of the ∼1.2M targeted SNPs, 932k are used for fitting (i.e., are covered by all populations in the model). Branch lengths (in units of squared allele frequency divergence) are rounded to the nearest integer. All f-statistics relating the populations are predicted to within 2.3 standard errors of their observed values.

Extended Data Table 3:

Admixture graph parameter estimates

Model version:	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23

Mixture proportions (%)

Shum Laka basal WA	64	66	62	71	64	58	63	61	63	61	64	64	64	64	63	63	63	..	64	61	69	63	67/62*
Aka Bantu-associated	59	59	57	63	59	56	58	57	59	58	59	59	59	59	59	59	58	58	59	58	62	61	59
Mbuti Bantu-associated	26	24	33	19	28	27	26	12	28	30	32	25	24	26	29	28	35	35	25	35	23	36	27
Mbuti East African-related	17	19	10	27	14	9	16	23	15	13	11	19	20	18	13	14	6	6	18	9	23	8	16
West African clade archaic	2	2	4	4	3	3	3	2	2	2	3	2	2	2	3	3	3	2	..	..	..	..	2
West African clade deep modern human	10	9	17	8	12	29	15	24	11	18	19	9	8	9	14	13	29	29	..	..	..	..	11
Mende deep ancestry	4	4	4	3	4	3	4	6	5	5	5	4	4	4	5	5	5	5	4	4	4	3	4
Mota deep ancestry	29	29	30	29	30	31	31	30	29	31	29	29	29	28	30	31	29	29	29	30	27	26	29

Branch lengths

Basal WA split[†]	2	3	3	3	3	1	3	2	2	2	2	2	3	3	2	2	3	..	2	3	3	1	3
South African HG split[‡]	1	1	0	4	1	−1	1	2	1	1	1	1	1	1	1	1	1	1	1	0	4	0	1
Ghost modern human split[#]	1	1	1	−3	1	1	0	−2	1	0	−1	1	1	1	0	1	1	1	..	..	..	..	2

Key admixture graph parameter estimates across different model versions (see Supplementary Information section 3 for full details): 1, primary model; 2, no “dummy” admixture; 3, African-ascertained SNPs; 4, transversion SNPs; 5, Shum Laka whole-genome sequence data; 6, outgroup-ascertained transversions; 7, Hadza added; 8, Mbo in place of Lemande; 9, Herero added; 10, Chewa added; 11, Mursi in place of Agaw; 12, Baka added; 13, Bakola added; 14, Bedzan added; 15, Mada added; 16, Fulani added; 17, Taforalt added; 18, alternative admixture for Shum Laka; 19, alternative deep source; 20, alternative deep source with African-ascertained SNPs; 21, alternative deep source with transversion SNPs; 22, alternative deep source with outgroup-ascertained transversions; 23, Shum Laka pairs fit separately. HG, hunter-gatherers.

*Earlier pair/later pair

†Units above the main West African clade

‡Units below the split of the Central African hunter-gatherer lineage (negative value indicates distance above)

#Units along the Central African hunter-gatherer lineage (negative values indicate distances along an adjacent edge)

Among modern humans, the deepest-splitting branch is inferred to be the one leading to Central African hunter-gatherers, although four lineages diverge in a very short span: those contributing the primary ancestry to (a) Central African hunter-gatherers, (b) southern African hunter-gatherers, and (c) other modern human populations, along with (d) a “ghost” source contributing a minority of the ancestry in West Africans and the Mota individual. Central African hunter-gatherers separate into eastern (Mbuti) and western clades, with the latter then branching into components represented in Aka and Shum Laka. Next, a second cluster of divergences involves West Africans, two East African lineages (hunter-gatherer-associated and agro-pastoralist-associated), and non-Africans, the latter tentatively inferred to be a sister group to Mota but with no deep “ghost” ancestry. Within the West African clade, we identify Yoruba and Mende as sister groups, with Lemande as an outgroup, and most basally a separate West African-related lineage contributing to Shum Laka (64%). A Bantu-associated source (most closely related to Lemande) contributes 59% of the ancestry in Aka and 26% in Mbuti (who also harbor ancestry [17%] from an East African agro-pastoralist-related source). In a model separating the ∼8000 BP and ∼3000 BP Shum Laka pairs, the latter have ∼5% more Central African hunter-gatherer-related ancestry (as confirmed by the significantly positive statistic f4 (Shum Laka 8000 BP, Shum Laka 3000 BP; Yoruba, Aka) [Z=4.2]; Supplementary Information section 3). We can also obtain a good fit for the Shum Laka individuals in a less-parsimonious alternative model using three components, replacing the basal West African source with a combination of ancestry from inside the clade defined by the other West African populations and from a source splitting between East and West Africans (near one lineage contributing to Taforalt; Extended Data Fig. 5, Supplementary Information section 3). 
However, two-component models for Shum Laka with the majority source splitting closer to other West or East Africans are rejected (Z=7.1 and Z=3.7, respectively).
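The logic behind the f4 check on the two Shum Laka pairs can be illustrated with a toy calculation. The sketch below assumes a simple two-source mixture using the basal West African proportions from model 23 in Extended Data Table 3 (67% for the earlier pair, 62% for the later pair), and, as a simplification, uses the two sources themselves as stand-ins for Yoruba and Aka. All allele frequencies and function names are invented for illustration.

```python
def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    products = [(ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)]
    return sum(products) / len(products)


def mix(alpha, source1, source2):
    """Allele frequencies of a population with fraction alpha from source1."""
    return [alpha * p + (1.0 - alpha) * q for p, q in zip(source1, source2)]


# Hypothetical source frequencies (not real data).
west_african_like = [0.9, 0.1, 0.6, 0.3]
hunter_gatherer_like = [0.4, 0.5, 0.2, 0.8]

earlier_pair = mix(0.67, west_african_like, hunter_gatherer_like)
later_pair = mix(0.62, west_african_like, hunter_gatherer_like)

# Analogue of f4(Shum Laka 8000 BP, Shum Laka 3000 BP; Yoruba, Aka).
stat = f4(earlier_pair, later_pair, west_african_like, hunter_gatherer_like)
print(stat)
```

Because the statistic is linear in the mixture proportion, its value here equals the 5% difference in proportions times the squared divergence between the two sources, so it is positive whenever the later pair carries more hunter-gatherer-related ancestry.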

Extended Data Figure 5:

Schematic of first alternative admixture graph.

Results are shown including ancient individuals from Taforalt in Morocco associated with the Iberomaurusian culture, with the Shum Laka individuals modeled as having a mixture of hunter-gatherer-related ancestry plus two additional components: one from within the main portion of the West African clade, and one splitting at nearly the same point as one of the sources contributing ancestry to Taforalt. Branch lengths are not drawn to scale. Points at which multiple lineages are shown diverging simultaneously indicate splits occurring in short succession (whose order we cannot confidently assess) but are not meant to represent exact multifurcations. HG, hunter-gatherer; AP, agro-pastoralist. ∗ Proportion not well constrained (for Mbuti, the sum of the two indicated proportions is well constrained but not the separate values). See Supplementary Information section 3 for full inferred model parameters.

The West African clade is distinguished by admixture from a deep source that can be modeled as a combination of modern human and archaic ancestry. The modern human component diverges at almost the same point as Central and southern African hunter-gatherers and is tentatively related to the deep source contributing ancestry to Mota, while the archaic component diverges close to the split between Neanderthals and modern humans (Supplementary Information section 3). The signals of deep ancestry in West African-related groups (Fig. 3A) can be explained by two admixture events: one along the ancestral West African lineage, and a second, smaller contribution (∼4%) to Mende from the same source (Fig. 4A). Accordingly, f4 -statistics testing for ancestry basal to southern African hunter-gatherers (Fig. 3A, bottom) are well correlated to inferred proportions of ancestry from the West African clade (Extended Data Fig. 6). We estimate the shared admixture to introduce 10% deep modern human and 2% archaic ancestry, although the first proportion is not well constrained (Extended Data Table 3). An alternative model with no archaic component, in which the West African clade receives deep ancestry from a single source [22] splitting before point (1) in Fig. 4A, also provides a reasonable fit to the data (Extended Data Fig. 5, Supplementary Information section 3), although it does not account for previous evidence of archaic ancestry in sub-Saharan Africans [27-31].

Extended Data Figure 6:

Deep ancestry correlation from the West African clade.

An allele-sharing statistic sensitive to ancestry splitting more deeply than South African hunter-gatherers (f4 (X, Mursi; Chimp, South Africa HG), mean ± 2SE from block jackknife, computed on 1,121,119 SNPs, as in Fig. 3A) is shown as a function of West African-related ancestry (from admixture graph results; Mota, Yoruba, and Lemande shifted slightly away from the boundaries for legibility). The (relative) allele-sharing rate for Mursi is identically zero according to the definition of the statistic.

Shum Laka in genetic and archaeological context

Our analyses show that the four sampled children from Shum Laka can be modeled as admixed with ∼35% ancestry related to West-Central African hunter-gatherers and ∼65% from a basal West African-related source, or alternatively as a mixture of hunter-gatherer-related ancestry plus two additional components, one from inside the clade of present-day West Africans and one splitting between East and West Africans. The first component plausibly represents ancestry present in the area since at least the LSA, whereas the second component (third in the alternative model) may have originated farther to the north, given the geography and phylogeny of other sampled populations (Fig. 4B). The chronology of the archaeological record at Shum Laka suggests a possible northern influence on cultural developments during the SMA [3, 9]; these include changes in stone tools, which can be interpreted as a fusion of local LSA tool-making traditions with new macrolithic technologies introduced from the north [3], and the appearance of ceramics (four sherds found in the early SMA burial layer, and more abundant and distinct ceramics in later SMA deposits) potentially related to earlier pottery-working traditions in the Sahara and Sahel [3, 32]. Gene flow from the north before 8000 BP is also plausible due to a short period of Saharan and Sahelian aridification [3, 33]. Present-day groups in northern West Africa and the Sahel have substantial admixture connected to later migrations [34], so identifying the exact source area may await additional ancient DNA studies. Although the scope of our sampling is limited to two individuals at either end of the SMA, the observed genetic similarity across a span of almost 5000 years—also consistent with skeletal morphometric analyses—suggests a long-term presence of related peoples who used the rockshelter for various activities, including burying their dead (Supplementary Information section 1). 
Today, however, most populations in Cameroon are more closely related to other West Africans than to the group represented by these individuals. Present-day hunter-gatherers in Cameroon are also not descended substantially from this specific group, as they lack the signal of basal West African ancestry (Supplementary Information section 3). We do observe elevated allele-sharing between the Shum Laka individuals and present-day Grassfields populations, so the genetic discontinuity is not absolute. Additionally, the adolescent male 2/SE II carried an A00 Y chromosome, suggesting that the concentration of this haplogroup in western Cameroon may have a long history, and moreover that A00 was formerly more diverse, given that the Shum Laka sequence falls outside of known present-day variation [12, 13]. The ∼300,000–200,000 BP divergence time of A00 from other modern human haplogroups [18, 19] could support its association either with the Central African hunter-gatherer-related ancestry component of the Shum Laka individuals or with the deep modern human portion of their West African-related ancestry. Linguistic and genetic evidence points to western Cameroon as the most likely area for the development of Bantu languages and as the ultimate source of subsequent migrations of Bantu speakers, and while the regional mid-Holocene archaeological record is sparse, Shum Laka has been highlighted as a possibly important site in the early phase of this process [1–4, 6–11]. However, the genetic profiles of our four sampled individuals (even by ∼3000 BP, when the spreads of Bantu languages and of ancestry associated with Bantu-speaking populations were already underway) are very different from those of most Niger-Congo speakers today, implying that these individuals are not representative of the primary source population(s) ancestral to present-day Bantu speakers. 
These results neither support nor contradict a central role for the Grassfields area in the origins of Bantu-speaking peoples, and it may be that multiple, highly differentiated populations formerly lived in the region, with potentially either high or low levels of linguistic diversity. It would not be surprising if the Shum Laka site itself was used (either successively or concurrently) by multiple groups with different ancestry, cultural traditions, or languages [1], evidence of which may not be visible from the collection of remains as preserved today.

Implications for deep African population history

By analyzing data from Shum Laka and other ancient individuals in conjunction with present-day groups, we gain new insights into African population structure on multiple timescales. First, we infer a series of closely spaced population splits involving West African-related and two East African-related lineages, as well as non-Africans (point (2) in Fig. 4A). From the geography of the populations involved, the center of this radiation was plausibly in East Africa (Fig. 4B), with a date of ∼80,000–60,000 BP based on estimated divergences of African and non-African populations [24, 35]. Such an expansion is also consistent with mtDNA phylogeography—specifically the diversification of haplogroup L3, likely originating in East Africa ∼70,000 BP [36, 37]—and potentially with the origins of clade CT in the Y chromosome tree at a similar time depth [18, 38]. Second, we infer a phase of divergences involving at least four lineages early in the history of modern humans (point (1) in Fig. 4A). Recent consensus has been that southern African hunter-gatherers, who split from other populations ∼250,000–200,000 BP, represent the deepest sampled branch of modern human variation [21, 24, 25]. Our results suggest that Central African hunter-gatherers split at close to the same time (perhaps slightly earlier), and thus that both clades, as well as the lineage that would later diversify at point (2), originated as part of a large-scale African radiation. In addition to the well-characterized deep lineages, we also detect at least one deep “ghost” source contributing to West Africans and East African hunter-gatherers. This signal corroborates previous evidence for Hadza and Sandawe [39] and for West Africans [22], although we find that the best fit is a source splitting near the same point as southern and Central African hunter-gatherers. Our results are also consistent with previous reports of archaic ancestry in African populations [27-31], specifically in West Africans. 
The presence of deep ancestry in the West African clade is notable in light of the Pleistocene archaeological record [5, 40], which includes Homo sapiens fossils dated to ∼300,000 BP in northwestern Africa [41], as well as an individual with archaic features buried ∼12,000 BP in southwestern Nigeria (the oldest known human fossil from West Africa proper) [42]. Middle Stone Age artifacts have also been found in parts of West Africa into the terminal Pleistocene [43], despite the development of LSA technologies elsewhere (e.g., Shum Laka). Thus, the available material and fossil evidence is concordant with our genetic results in indicating long-term African population structure and admixture [44, 45]. Further genetic studies may reveal additional complexities in deep human population history, while some early human groups will likely remain known only through fossils [44, 45]. Based on our current understanding, the presence of at least four modern human lineages that diversified ∼250,000–200,000 BP and are represented in people living today supports archaeological evidence that this was a pivotal period for human evolution in Africa.

Methods

Ancient DNA sample processing

We obtained bone powder from the Shum Laka skeletons (see Supplementary Information section 1 for more information on the site and burials) by drilling cochlear portions of petrous bone samples in a clean room facility at the Royal Belgian Institute of Natural Sciences. In dedicated clean rooms at Harvard Medical School, we extracted DNA using published protocols [46, 47]. From the extracts, we prepared barcoded double-stranded libraries treated with uracil-DNA glycosylase (UDG) to reduce the rate of characteristic ancient DNA damage [14, 48] in a modified partial UDG preparation including magnetic bead cleanups [14, 49]. For the SNP capture data, we used two rounds of in-solution target hybridization to enrich for sequences overlapping the mitochondrial genome and approximately 1.2 million genome-wide SNPs [50-54]. We then added 7-base-pair indexing barcodes to the adapters of each library [55] and sequenced on an Illumina NextSeq 500 machine with 76-base-pair paired-end reads. For individuals 2/SE II and 4/A, we also generated whole-genome shotgun data from the same libraries but without the target enrichment step. Sequencing was performed at the Broad Institute on an Illumina HiSeq X Ten machine, using 19 lanes for 2/SE II (yielding approximately 18.5× average coverage, including 1,216,658 sites covered from the set of target SNPs used in most analyses) and two lanes for 4/A (3.9× average coverage, 1,158,884 sites covered). From the raw sequencing results, we retained reads with no more than one mismatch per read pair to the library-specific barcodes. Prior to alignment, we merged paired-end sequences based on forward and reverse mate overlaps and trimmed barcodes and adapters. Preprocessed reads were then mapped to both the mitochondrial reference genome RSRS [37] and the human reference genome (version hg19) using the “samse” command with default parameters in BWA (version 0.6.1) [56]. 
Duplicate molecules (having the same mapped start and end positions and strand orientation) were removed post-alignment. We filtered the mapped sequences (requiring mapping quality scores of at least 10 for targeted SNP capture and 30 for whole-genome shotgun data) and trimmed two terminal bases to eliminate (almost all) damage-induced errors. For mitochondrial DNA, we called haplogroups using HaploGrep2 [57]. For nuclear DNA obtained from SNP capture and for the whole-genome shotgun data for individual 4/A, we selected one allele at random per site to create pseudo-haploid genotypes. For the whole-genome shotgun data for individual 2/SE II, we used a previously described reference-bias-free diploid genotype calling procedure [25], converting resulting genotypes into a fasta-like encoding allowing for extraction of data at specified sites via cascertain and cTools [25]. We determined the sex of each individual by examining the fractions of sequences mapping to the X and Y chromosomes [58], and we determined Y-chromosome haplogroups by comparing sequence-level SNP information to the tree established by the International Society of Genetic Genealogy (http://www.isogg.org). To ensure authenticity, we computed the proportion of C-to-T deamination errors in terminal positions of sequenced molecules and evaluated possible contamination via heterozygosity at variable sites in haploid genome regions, using contamMix [50] and ANGSD [59] for mtDNA and the X chromosome (in males), respectively. Observed damage rates (4–10%) were relatively low but within the expected range after partial UDG treatment [14], and apparent heterozygosity rates for mtDNA (0.3–1.5% estimated contamination) and the X chromosome (0.5–1.0% estimated contamination) were minimal. The molecular preservation of the samples is impressive given the long-term warm and humid climate at Shum Laka [60] (supporting a mixed forest-savannah environment, at an elevation of ∼1650 meters above sea level).
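
As an illustration only, the terminal-base trimming and random-allele ("pseudo-haploid") steps described above can be sketched as follows. This is a minimal sketch and not the authors' pipeline: the pileup layout (a dict from site to a list of `(base, position_in_read, read_length)` tuples), the function name `pseudo_haploid`, and the toy site names are all hypothetical.

```python
import random


def pseudo_haploid(pileup, trim=2, seed=None):
    """Pick one allele at random per site after masking read ends.

    pileup: dict mapping a site label to a list of
            (base, pos_in_read, read_len) tuples, with pos_in_read 0-based.
    trim:   number of bases ignored at each end of a read (2 in the text).
    """
    rng = random.Random(seed)
    calls = {}
    for site, observations in pileup.items():
        # Keep only bases that are not within `trim` positions of a read end,
        # where damage-induced C-to-T errors are concentrated.
        usable = [base for base, pos, length in observations
                  if trim <= pos < length - trim]
        if usable:
            # One randomly chosen observation represents the site.
            calls[site] = rng.choice(usable)
    return calls


# Hypothetical toy pileup: the T at rs1 sits on the first base of its read
# and the only read at rs2 covers the site with its second-to-last base.
pileup = {
    "rs1": [("C", 10, 60), ("T", 0, 55), ("C", 30, 45)],
    "rs2": [("T", 58, 60)],
    "rs3": [("G", 5, 40)],
}
calls = pseudo_haploid(pileup, seed=1)
print(calls)
```

Sites with no usable observation are simply left uncalled, which is why analyses of pseudo-haploid data must tolerate missing genotypes.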

Radiocarbon dates

At the Pennsylvania State University (PSU) Radiocarbon Laboratory, we generated new direct radiocarbon dates via accelerator mass spectrometry (AMS) for the four analyzed individuals, using fragments of the same temporal bone portions that were sampled for ancient DNA. We extracted and purified amino acids using a modified XAD process [61] and assessed sample quality via stable isotope analysis. C:N ratios for all four samples fell between 3.3 and 3.4, well within the nominal range of 2.9–3.6 indicating good collagen preservation [62]. The PSU dates were in good agreement with previously reported direct dates for different bones from individuals 2/SE II (8160–7790 cal BP, 7150 ± 70 BP, OxA-5203) and 4/A (3380–3010 cal BP, 3045 ± 60 BP, OxA-5205) [1, 2, 63, 64], but on the basis of a (modestly) aberrant date [65] from a rib of individual 2/SE I (Supplementary Table 5), we restricted our final reported results to the temporal bones. We performed calibrations using OxCal [66] version 4.3.2 with a mixture of the IntCal13 [67] and SHCal13 [68] curves, specifying “U(0,100)” to allow for a flexible combination [66, 69], and rounding final results to the nearest 10 years (see also Supplementary Information section 1).

New present-day data

We generated genome-wide SNP genotype data for 63 individuals from five present-day Cameroonian populations on the Human Origins array: Aghem (28), Bafut (11), Bakoko (1), Bangwa (2), and Mbo (21) (Extended Data Table 1; Supplementary Table 3). Samples were collected with informed consent, with collection and analysis approved by the UCL/UCLH Committee on the Ethics of Human Research, Committee A and Alpha.

A00 Y chromosome split time estimation

Present-day A00 Y chromosomes are classified into the subtypes A00a, A00b, and A00c, whose divergence times from each other have not been precisely estimated but are quite recent, perhaps only a few thousand years [12, 13]. To estimate the split time of the Shum Laka A00 Y chromosome from present-day A00, we called genotypes for individual 2/SE II (from our whole-genome sequence data) at a set of positions where sequences from two present-day individuals with haplogroup A00 [18] differ from all non-A00 individuals. (At every subtype-specific site for which we had coverage, the Shum Laka A00 carries the ancestral allele.) To avoid needing to determine the status of mutations as ancestral or derived, we considered the entire unrooted lineage specific to A00 (see Fig. 1). The total time span represented by this lineage is approximately 359,000 years, using published values of ∼275,000 BP for the divergence of the A00 lineage from other modern human haplogroups [19] and ∼191,000 BP for the next-oldest split within macrohaplogroup A [70]. With a requirement of at least 90% agreement among the reads at each site, we called 1521 positions as having the alternative allele (i.e., matching present-day A00 and differing from the human reference sequence) and 145 as having the reference allele (taking the average of 143 and 147 for the two present-day individuals). The fraction 145/(145+1521) then defines the position of the Shum Laka split along the (unrooted) A00 lineage. We note that split times computed either from all sites (relaxing the 90% threshold and using the majority allele), or from additionally requiring at least two reads per site, differ from our primary estimate by only a few hundred years. To produce a confidence interval, we used the variance in the published estimates and assumed an independent Poisson sampling error for the number of observed reference alleles. 
The final point estimate was ∼31,000 BP (95% CI: 37,000–25,000 BP), meaning that the Shum Laka A00 (with a sample date of ∼8000 BP) cannot be directly ancestral to the present-day subtypes.
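
The arithmetic behind the point estimate can be reproduced with a short sketch. It uses only the numbers quoted above; the function name and the reading that the reference-allele fraction is measured from the present-day end of the lineage are our assumptions, and the confidence interval (which also depends on the variance of the published dates) is not reproduced.

```python
def a00_split_time(n_ref, n_alt, a00_root_bp, next_split_bp):
    """Place the Shum Laka split along the unrooted A00-specific lineage.

    The unrooted lineage runs from present-day A00 up to the A00 root and
    back down to the next-oldest split within macrohaplogroup A.
    """
    lineage_years = a00_root_bp + (a00_root_bp - next_split_bp)
    # Sites where Shum Laka carries the reference allele are taken to mark
    # the portion of the lineage after the Shum Laka sequence branched off.
    fraction = n_ref / (n_ref + n_alt)
    return fraction * lineage_years, lineage_years


split_bp, lineage_years = a00_split_time(
    n_ref=145, n_alt=1521, a00_root_bp=275_000, next_split_bp=191_000
)
print(lineage_years)         # 359000
print(round(split_bp, -3))   # 31000.0
```

Under these inputs the lineage spans 275,000 + (275,000 − 191,000) = 359,000 years, and 145/1666 of that span is roughly 31,000 years, matching the reported point estimate.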

PCA and allele-sharing statistics

We performed PCA using smartpca (with the “lsqproject” and “autoshrink” options) [71, 72] and computed f4-statistics using ADMIXTOOLS (with standard errors estimated via block jackknife over 5 cM chromosomal segments) [73]. We projected all ancient individuals in PCA rather than using them to compute axes in order to avoid artifacts caused by missing data. In each PCA, we also projected a subset of the present-day populations to allow controlled comparisons with ancient individuals. In most cases, reported f4-statistics are based on the approximately 1.15M autosomal SNPs from our target capture set. For PCA and for f4-statistics testing differential relatedness to Shum Laka, we used autosomal SNPs from the Human Origins array (a subset of the target capture set), with some populations in the analyses only genotyped on this subset (see Extended Data Table 1). For these latter f4-statistics, we excluded for all populations a set of roughly 40k SNPs having high missingness in the present-day Cameroon data.
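
For readers unfamiliar with the statistic, the following is a minimal sketch of an f4-statistic with a block-jackknife standard error. It is not ADMIXTOOLS code: it uses the standard definition f4(A, B; C, D) as the mean over SNPs of (a − b)(c − d), where a, b, c, d are allele frequencies, and a simple unweighted delete-one-block jackknife. ADMIXTOOLS may differ in details such as weighting blocks by SNP count. The block labels stand in for the 5 cM segments, and the toy frequencies are invented.

```python
from math import sqrt


def f4(freqs):
    """freqs: list of (a, b, c, d) allele frequencies, one tuple per SNP."""
    return sum((a - b) * (c - d) for a, b, c, d in freqs) / len(freqs)


def f4_block_jackknife(freqs, block_ids):
    """Return (estimate, standard_error) from a delete-one-block jackknife."""
    blocks = sorted(set(block_ids))
    estimate = f4(freqs)
    # Recompute the statistic with each block of SNPs removed in turn.
    leave_one_out = [
        f4([f for f, b in zip(freqs, block_ids) if b != blk])
        for blk in blocks
    ]
    n = len(blocks)
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return estimate, sqrt(variance)


# Invented toy data: six SNPs in three blocks.
freqs = [
    (1.0, 0.0, 1.0, 0.0), (0.0, 1.0, 0.0, 1.0),
    (1.0, 0.0, 0.0, 1.0), (0.5, 0.5, 1.0, 0.0),
    (1.0, 0.0, 1.0, 0.0), (0.0, 0.0, 1.0, 0.0),
]
block_ids = [0, 0, 1, 1, 2, 2]
est, se = f4_block_jackknife(freqs, block_ids)
print(est, se, est / se)  # estimate, standard error, Z-score
```

Dividing the estimate by its standard error gives the Z-score used to judge whether a statistic differs significantly from zero.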

Admixture graphs

We fit admixture graphs with the ADMIXTUREGRAPH (qpGraph) program in ADMIXTOOLS (with the options “outpop: NULL,” “lambdascale: 1,” “inbreed: YES,” and “diag: 0.0001”) [73-75], using the 1.15M autosomal SNPs from our target capture set by default, and other sets of SNPs in alternative model versions as specified. The program requires as input the branching order of the populations in the graph and a list of admixture events, and it then solves for the optimal parameters of the model (branch lengths and mixture proportions) via an objective function measuring the deviation between predicted and observed values of a basis set of f-statistics. From the inferred parameters, poorly fitting topologies (including positions of admixture sources) can be corrected by changing split orders at internal nodes that appear as trifurcations under the constraints enforced by the input (see Supplementary Information section 3). To evaluate the fit quality of output models, we employed two metrics: first, a list of residual Z-scores for all f-statistics relating the populations in the graph, and second, a combined approximate log-likelihood score. The first metric is useful for identifying particularly poorly fitting models and the elements that are most responsible for the poor fits, while the second provides a means for comparing the overall fits of separate models (Supplementary Information section 3). In order to assess the degree of constraint on individual parameter inferences, we were guided primarily by the variability across different model versions (using different populations and SNP sets; see Extended Data Table 3 and Supplementary Information section 3), which reflects both statistical uncertainty and changes in model-specific assumptions. In our primary model, all f-statistics relating subsets of the populations are predicted to within 2.3 standard errors of their observed values. 
Initially, we detected a slight but significant signal (max Z=2.5) of allele-sharing between Shum Laka and non-Africans, which we hypothesize is due to a small amount of DNA contamination. To prevent this effect from influencing our results, we included a “dummy” admixture of non-African ancestry into Shum Laka (inferred 1.1%, consistent with mtDNA- and X chromosome-based contamination estimates), although model parameters without the dummy admixture are also very similar (Extended Data Table 3, Supplementary Information section 3).
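
The first fit metric, residual Z-scores, can be illustrated with a short sketch. This is not qpGraph output or code: the statistic labels and numerical values below are invented, and the function names are hypothetical. Each residual is simply the difference between the observed and model-predicted f-statistic, divided by the standard error of the observed value.

```python
def residual_z_scores(stats):
    """stats: dict mapping a label to (observed, predicted, standard_error)."""
    return {name: (obs - pred) / se for name, (obs, pred, se) in stats.items()}


def worst_residual(stats):
    """Return the label and Z-score of the most poorly predicted statistic."""
    z = residual_z_scores(stats)
    name = max(z, key=lambda k: abs(z[k]))
    return name, z[name]


# Invented values for illustration only.
toy_stats = {
    "f4(A, B; C, D)": (0.0021, 0.0010, 0.0005),
    "f4(A, C; B, D)": (-0.0004, -0.0001, 0.0003),
    "f3(A; B, C)": (0.0150, 0.0148, 0.0004),
}
worst_name, worst_z = worst_residual(toy_stats)
print(worst_name, round(worst_z, 2))
```

In this toy case the largest residual is about 2.2 standard errors, so a model like this would meet a criterion such as the one stated above for the primary model (all statistics predicted to within 2.3 standard errors).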

Data availability

The aligned sequences are available through the European Nucleotide Archive under accession number PRJEB32086. Genotype data used in analysis are available at https://reich.hms.harvard.edu/datasets.

Overview of the site of Shum Laka.

Kinship analysis.

Alternative PCA and allele-sharing analyses.

Primary inferred admixture graph with full parameters.

Schematic of first alternative admixture graph.

Deep ancestry correlation from the West African clade.

Schematic of second alternative admixture graph.

Results are shown with a single-component deep source for West Africans. Branch lengths are not drawn to scale. Points at which multiple lineages are shown diverging simultaneously indicate splits occurring in short succession (whose order we cannot confidently assess) but are not meant to represent exact multifurcations. HG, hunter-gatherer; AP, agro-pastoralist. *Proportion not well constrained (for Mbuti, the sum of the two indicated proportions is well constrained but not the separate values). See Supplementary Information section 3 for full inferred model parameters.

Populations used in the study.

List of populations used in analyses in the study. Data types are in-solution targeted SNP capture (1240k), whole-genome sequence with pseudo-haploid genotype calls (SG), high-coverage whole-genome sequence with diploid genotype calls (DG), and Human Origins SNP array (HO). For some populations, we used different sample sets for different analyses, indicated by slashes; Human Origins array genotyped individuals were used for PCA and for f-statistics testing differential relatedness to Shum Laka (Fig. 3B, Extended Data Fig. 3B). For Hadza, we used five individuals with Human Origins data for PCA and two of those five individuals for admixture graph modeling. HG, hunter-gatherers; AA, Afroasiatic; IE, Indo-European; KS, Khoesan; NC, Niger-Congo; NS, Nilo-Saharan; ST, Sino-Tibetan.
Individuals from Hora, Chencherere, and Fingira.
Individuals from Ballito Bay (A and B) and St. Helena Bay.

Allele-sharing statistics for deep ancestry.

Variations of allele-sharing statistics (multiplied by 1000; computed on 1,121,119 SNPs) sensitive to ancestry in the test population X from a deeply-splitting lineage, along with Z-scores for difference from zero. We note that the zero level has a different meaning depending on which population is in the second position in the statistic. Blank entries are statistics that are confounded by specific relationships between the test population and one of the reference populations (in the third or fourth position; either duplication of the same group, Agaw with Han due to non-African-related ancestry, or Yoruba with other West Africans). From the statistics f4 (Mursi/Agaw, Han; South Africa HG, Yoruba), we find minimal differences in deep ancestry proportions among Han, Mursi, and Agaw; from f4 (X, Mursi; Chimp, Yoruba), we obtain a value for South African hunter-gatherers that is roughly twice as large as for Central African hunter-gatherers. SA, ancient South African hunter-gatherers; Yor, Yoruba.

Admixture graph parameter estimates.

Key admixture graph parameter estimates across different model versions (see Supplementary Information section 3 for full details): 1, primary model; 2, no “dummy” admixture; 3, African-ascertained SNPs; 4, transversion SNPs; 5, Shum Laka whole-genome sequence data; 6, outgroup-ascertained transversions; 7, Hadza added; 8, Mbo in place of Lemande; 9, Herero added; 10, Chewa added; 11, Mursi in place of Agaw; 12, Baka added; 13, Bakola added; 14, Bedzan added; 15, Mada added; 16, Fulani added; 17, Taforalt added; 18, alternative admixture for Shum Laka; 19, alternative deep source; 20, alternative deep source with African-ascertained SNPs; 21, alternative deep source with transversion SNPs; 22, alternative deep source with outgroup-ascertained transversions; 23, Shum Laka pairs fit separately. HG, hunter-gatherers.
Earlier pair/later pair.
Units above the main West African clade.
Units below the split of the Central African hunter-gatherer lineage (negative value indicates distance above).
Units along the Central African hunter-gatherer lineage (negative values indicate distances along an adjacent edge).
