Monica Fahey1,2, Maurizio Rossetto2, Emilie Ens1, Andrew Ford3.
Abstract
Over millennia, Indigenous peoples have dispersed the propagules of non-crop plants through trade, seasonal migration or attending ceremonies, and have potentially increased the geographic range or abundance of many food species around the world. Genomic data can be used to reconstruct these histories. However, it can be difficult to disentangle anthropogenic from non-anthropogenic dispersal in long-lived non-crop species. We developed a genomic workflow that can be used to screen out species that show patterns consistent with faunal dispersal or long-term isolation, and identify species that carry dispersal signals of putative human influence. We used genotyping-by-sequencing (DArTseq) and whole-plastid sequencing (SKIMseq) to identify nuclear and chloroplast Single Nucleotide Polymorphisms in east Australian rainforest trees (4 families, 7 genera, 15 species) with large (>30 mm) or small (<30 mm) edible fruit, either with or without a known history of use by Indigenous peoples. We employed standard population genetic analyses to test for four signals of dispersal using a limited and opportunistically acquired sample scheme. We expected different patterns for species that fall into one of three broadly described dispersal histories: (1) ongoing faunal dispersal, (2) post-megafauna isolation and (3) post-megafauna isolation followed by dispersal of putative human influence. We identified five large-fruited species that displayed strong population structure combined with signals of dispersal. We propose coalescent methods to investigate whether these genomic signals can be attributed to post-megafauna isolation and dispersal by Indigenous peoples.
Keywords: Indigenous; anthropogenic dispersal; chloroplast genome; ethnobotany; fruit size; genomic screening; incipient domestication; non-crop species; propagule dispersal; rainforest assembly
Year: 2022 PMID: 35328030 PMCID: PMC8954434 DOI: 10.3390/genes13030476
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
The patterns expected from four tests of dispersal assuming different dispersal traits and histories. For each signal, we expected different results for species with a history of long-term isolation, long-term faunal-mediated dispersal or dispersal following long-term isolation. Note that more than one dispersal scenario is hypothesised for species in the small fruit categories. Signal 1 = “low Fst values and absence of isolation-by-distance”. Signal 2 = “admixture between sites”. Signal 3 = “genomic outliers within sites”. Signal 4 = “haplotype long-distance dispersal”. ✓ = expected genomic signal from post-megafauna Indigenous dispersal. ✗ = genomic pattern not consistent with post-megafauna Indigenous dispersal. IBD = isolation-by-distance. LDD = long-distance dispersal.
| Dispersal Trait | Signal 1 | Signal 2 | Signal 3 | Signal 4 |
|---|---|---|---|---|
| Small fruit | ||||
| Small fruit | ||||
| Large fruit | ||||
| Large fruit |
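Signal 1 pairs low Fst values with an absence of isolation-by-distance, and IBD is commonly assessed with a Mantel test between pairwise genetic and geographic distance matrices. The following is a minimal sketch of a one-sided permutation Mantel test. The function names, the toy site positions and the permutation count are illustrative assumptions and are not taken from the study's own analysis code or data.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Upper-triangle values of a square matrix after reordering its rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(genetic, geographic, permutations=999, seed=1):
    """One-sided permutation Mantel test; returns (r, p)."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)
    hits = 1  # the observed statistic counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel sites in the genetic matrix only
        if pearson(upper_triangle(genetic, order), geo) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Toy example (made-up values): four sites along a transect, positions in km,
# with genetic distance exactly proportional to geographic distance.
sites_km = [0.0, 100.0, 250.0, 600.0]
geo_dist = [[abs(a - b) for b in sites_km] for a in sites_km]
gen_dist = [[0.001 * d for d in row] for row in geo_dist]
r, p = mantel(gen_dist, geo_dist, permutations=99, seed=1)
```

In this toy case `r` is 1 because genetic distance was constructed as a linear function of geographic distance; a species matching Signal 1 would instead show a weak, non-significant correlation together with low pairwise Fst.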
Figure 1. The study area in eastern Australia. Geographic regions separated by disjunctions of rainforest vegetation are indicated by the blue boxes. NNSW = Northern New South Wales, SEQ = Southeast Queensland, CQLD = Central Queensland, AWT = Australian Wet Tropics, CYP = Cape York Peninsula. Low elevation biogeographic barriers that structure the genomic variation in some of the study species are demarcated by red lines. CRC = Clarence River Corridor, WBB = Wide Bay-Burnett, CCL = Cairns-Cardwell Lowlands, BMC = Black Mountain Corridor.
The study species and their fruit traits, the genomic data used in the study and references that report use of each species by Indigenous Australians. Fruit traits: S = Small (<30 mm), L = Large (>30 mm), F = Fleshy, W = Woody, O = Other. Seed traits: L = Large, S = Small. nDNA = nuclear DNA. cpDNA = chloroplast DNA. Fst = Wright’s Fixation Index. Location: AWT = Australian Wet Tropics, CQLD = Central Queensland, SEQ = Southeast Queensland, NNSW = Northern New South Wales, SBMC = South of the Black Mountain Corridor in the AWT, NBMC = North of the Black Mountain Corridor in the AWT.
| Family | Species | Common Names | Fruit Trait | Max. Fruit Width (mm) | Seed Number & Traits | nDNA Markers (SNPs) | cpDNA Sequence (bp) | Mantel score ( | Reported Indigenous Use |
|---|---|---|---|---|---|---|---|---|---|
|
| |||||||||
| Fabaceae |
| Moreton Bay chestnut, black bean, bean tree | LO | 45 | 3–5 L | 38,124 | 0.67 ( | ‘Black bean was a staple food of many northern rainforest Aboriginal people and is still prepared and eaten today.’ (cited [ | |
| Lauraceae |
| Yellow walnut, | LO | 75 × 62 | 1 L | 2080 | 108,132 | 0.36 ( | Seed preparation described in the AWT [ |
| Lauraceae |
| Hairy walnut | LF | 90 × 100 | 1 L | 13,913 | 106,112 | 0.99 ( | Seed preparation described in the AWT [ |
| Sapotaceae |
| Black apple, brush apple, wild plum, native plum | LF | 50 | 1–5 S | 24,873 | 86,899 | 0.63 ( | Ethnographic records [ |
| Elaeocarpaceae |
| Kuranda quandong, ebony heart, nutwood, Johnstone River almond | LF | 55 × 40 | 1 L | 17,085 | 0.14 ( | Ethnographic records [ | |
| Lauraceae |
| LF | 71 × 60 | 1 L | 4025 | 107,869 | 0.91 ( | ||
| Lauraceae |
| Black walnut | LF | 60 × 60 | 1 L | 24,382 | 107,910 | 0.99 ( | |
| Lauraceae |
| Hairy walnut | LF | 75 × 75 | 1 L | 23,322 | 107,371 | 0.99 ( | |
| Sapotaceae |
| LF | 50 × 50 | 1 L | 22,778 | 84,279 | 0.91 ( | ||
| Sapotaceae |
| LF | 20–50 | 1 L | 10,669 | 87,841 | 0.61 ( | ||
| Elaeocarpaceae |
| Kuranda quandong | LO | 40 × 25 | 1 L | 1274 | 0.99 ( | Bush tucker guide described the seed as edible [ | |
| Elaeocarpaceae |
| Blue quandong, | SF | 33 × 33 | 1 L | 10,273 | 0.54 ( | ‘You can eat the thin layer of flesh of the ripe purple-blue fruits when flesh is soft.’ (cited [ | |
| Elaeocarpaceae |
| SF | 12 × 12 | 1 S | 14,731 | 0.56 ( | B. McLeod describes the fruit as “good bush tucker tea” that can be eaten raw or as a jam [ | ||
| Sapotaceae |
| SF | 22 × 9 | 1 S | 15,270 | 85,895 | |||
| Lauraceae |
| SF | 17 × 13 | 1 S | 23,081 | 107,031 | −0.05 ( | ||
|
| |||||||||
| Lauraceae |
| LF | 50 × 50 | 3461 | |||||
| Lauraceae |
| Brown walnut, | LF | 55 × 35 | 3461 | Bush tucker guide describes edible fruit [ | |||
| Lauraceae |
| LF | 67 × 65 | 3461 | |||||
| Sapindaceae |
| Native tamarind, tamarind tree, | SF | 15 | 4640 | 0.88 ( | Ethnographic sources [ | ||
| Lauraceae |
| SF | 11 × 11 | 2881 | 0.91 ( | ||||
| Lauraceae |
| SF | 15 × 18 | 14,970 | 0.89 ( | ||||
| Elaeocarpaceae |
| SF | 17 × 17 | 7429 | 0.59 ( | ||||
| Myrtaceae |
| Water gum, kanooka | W | 10 × 6 | 13,841 | 0.59 ( | |||
| Myrtaceae |
| Mountain water gum | W | 10 × 6 | 10,721 | 0.82 ( | |||
| Cunoniaceae |
| Coachwood | W | >8 | 659 | 0.75 ( | |||
Figure 2. Violin plots of the average pairwise Fst values calculated for 25 species at 50 km distance intervals and coloured by fruit trait.
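Averaging pairwise Fst within 50 km distance intervals can be sketched as follows. This is only an illustration of the binning step: the function name `mean_fst_by_interval` and the example site pairs are hypothetical and do not come from the study data.

```python
from collections import defaultdict


def mean_fst_by_interval(pairs, width_km=50.0):
    """Average pairwise Fst within fixed-width geographic distance intervals.

    `pairs` is an iterable of (distance_km, fst) tuples, one per site pair.
    Returns {interval_start_km: mean_fst}; each interval is [start, start + width_km).
    """
    bins = defaultdict(list)
    for distance_km, fst in pairs:
        start = int(distance_km // width_km) * width_km
        bins[start].append(fst)
    return {start: sum(values) / len(values) for start, values in sorted(bins.items())}


# Made-up site pairs for illustration only: (distance in km, pairwise Fst).
example_pairs = [(12.0, 0.02), (48.0, 0.04), (55.0, 0.10), (130.0, 0.30)]
binned = mean_fst_by_interval(example_pairs)
```

Here the first two pairs fall into the 0–50 km interval and are averaged together, while the remaining pairs each occupy their own interval.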
Figure A1. The mean pairwise Fst values calculated across 100 replicate simulations of a post-glacial faunal-mediated dispersal scenario (“fd” in Table A1). This scenario of faunal dispersal assumes a symmetric distance-weighted migration matrix.
Figure A2. The mean pairwise Fst values calculated across 100 replicate simulations of a post-glacial faunal-mediated dispersal scenario (“fd + exp” in Table A1). This scenario of faunal dispersal assumes that deme0 was established by propagules from deme1 6 kya, and a symmetric distance-weighted migration matrix.
Figure A3. The mean pairwise Fst values calculated across 100 replicate simulations of a post-megafauna dispersal scenario (“nd” in Table A1). This dispersal scenario assumes that there has been no migration for 60,000 years (3500 or 1750 generations).
Figure A4. The mean pairwise Fst values calculated across 100 replicate simulations of a post-megafauna Indigenous-mediated dispersal scenario (“hd1” in Table A1). This scenario of Indigenous dispersal assumes a symmetric island model of migration between all demes from 5000 to 200 years ago.
Figure A5. The mean pairwise Fst values calculated across 100 replicate simulations of a post-megafauna Indigenous-mediated dispersal scenario (“hd2” in Table A1). This scenario of Indigenous dispersal assumes that deme0 was established by propagules from deme1 5 kya, followed by a symmetric island model of migration between all demes from 5000 to 200 years ago.
Figure A6. The mean pairwise Fst values calculated across 100 replicate simulations of a post-megafauna Indigenous-mediated dispersal scenario (“hd3” in Table A1). This scenario of Indigenous dispersal assumes that deme0 was established by propagules from deme1 5 kya, followed by a symmetric island model of migration between all demes from 5000 to 4000 years ago.
Figure A7. The mean pairwise Fst values calculated across 100 replicate simulations of a post-megafauna Indigenous-mediated dispersal scenario (“hd4” in Table A1). This scenario of Indigenous dispersal assumes that deme0 was established by propagules from deme1 5 kya, with no further migration.
Figure A8. The mean pairwise Fst values calculated across 100 replicate simulations of a post-megafauna Indigenous-mediated dispersal scenario (“hd5” in Table A1). This scenario of Indigenous dispersal assumes an asymmetric stepping-stone model of migration from 5000 to 200 years ago.
Figure A9. The mean pairwise Fst values calculated across 100 replicate simulations of a post-megafauna Indigenous-mediated dispersal scenario (“hd6” in Table A1). This scenario of Indigenous dispersal assumes an asymmetric stepping-stone model of migration from 5000 to 4000 years ago.
Summary of dispersal signals found in the study species. The presence or absence of these signals can be used to evaluate whether a species would make a suitable candidate to investigate the influence of Indigenous dispersal. Signal 1 = “low Fst values and absence of IBD”. Signal 2 = “admixture between sites”. Signal 3 = “genomic outliers within sites”. Signal 4 = “haplotype LDD”. Species identified as candidates for Indigenous dispersal studies have an asterisk *.
| Species | Fruit | Verified Indigenous Use | Signal 1 | Signal 2 | Signal 3 | Signal 4 |
|---|---|---|---|---|---|---|
Figure 3. (a–o) The 15 study species evaluated for genomic signals of dispersal. For each species: (i) The distribution of the species in the study area is indicated by the black circles, and the sample sites are coloured according to a latitudinal gradient defined by the extent of the study area. (ii) Genotype assignment proportions identified by sNMF, assuming K = 2–4. The sample site and geographic region (or position in relation to a barrier) are indicated in the bottom panel. (iii) Principal components analysis of nDNA genomic variance between samples, ordinated by the first three primary axes of variation. Samples are coloured according to latitude, and shape indicates sample site. (iv) Median-joining network of chloroplast haplotypes (epsilon = 0). Circles are proportional to the number of samples per haplotype and coloured by the latitude of the sample site. The number of mutations between haplotypes is given in brackets, and the length of nodes is indicative of, but not directly proportional to, the number of mutations.
Candidate species that warrant investigation of historical Indigenous dispersal and suggested follow up studies. Species were identified as candidates if they displayed at least one of five genomic signals of dispersal that can be tested as anthropogenic vs. non-anthropogenic in future studies, and generated hypotheses on Indigenous dispersal scenarios. We considered species as weak candidates if they displayed genomic patterns from which putative Indigenous dispersal could not be differentiated from widespread faunal dispersal or if they showed an absence of dispersal events.
| Species | Dispersal Hypotheses | Follow Up Studies |
|---|---|---|
| | During the Holocene, extensive human-dispersal pathways in NNSW disrupted natural patterns of IBD evident in the north. Upland populations in NNSW were established by humans. Founder effects and/or a subsequent lack of gene flow into these populations has led to drift. | Sample upland sites and multiple lowland sites in multiple catchments across the species’ distribution, including CQLD. Whole-genome sequencing for phased dataset that can be used to identify the geographic distribution of identity-by-descent blocks and recent coalescent events. Select population samples within each region to date the arrival of Employ directional migration models between catchments to verify non-water modes of dispersal and test putative human-dispersal pathways inferred from ethnographic sources. Employ directional migration models within catchments to verify that connectivity has been lost at upland sites. |
| | Mid-late Holocene human-mediated dispersal between two previously isolated sites, B and CF. Holocene propagation along ancient walking routes between Atherton Tableland and the coast. A subsequent decline or loss of dispersal has led to drift between populations. | Sample additional populations at Atherton where there is archaeological evidence of To investigate dispersal across the BMC and between isolated upland sites, sample additional sites north of the BMC and at southern part of the range near the most differentiated population at site B. Coalescent isolation with migration model to test for pre-Holocene vicariance between Bolinda and Curtain Fig, followed by Holocene-era LDD. |
| | Following megafauna decline, a long history of isolation has driven extreme haplotype differentiation between sites. Bottlenecks have reduced nDNA diversity and overall differentiation between sites. Reinforcement: Holocene-era Indigenous dispersal facilitated limited migration between sites. | Additional cp-sequencing per population to identify further evidence of dispersal events. Isolation with migration coalescent models to test hypothesis of long-term vicariance followed by recent Indigenous-facilitated migration between sites. |
| | Rapid dispersal along cultural rather than geographic pathways. Reinforcement: Holocene-era Indigenous dispersal facilitated limited migration and admixture across the BMC. | Cp-sequencing to better infer dispersal between sites. Coalescent model to evaluate ILS versus admixture between populations across the BMC. |
| | Mid-late Holocene human-mediated LDD explains the disjunct distribution of A subsequent decline or loss of dispersal has led to drift and strong nDNA structure. | Sample additional populations in southern AWT to investigate the likelihood of vicariance versus LDD as the cause of disjunct distribution between AWT and CQLD. Coalescent analysis to date divergence between AWT and CQLD. Divergence < 10 kya is likely human LDD, >21 kya is likely climate-driven vicariance. Test for founder effects in CQLD, as support for LDD. |
Historical events that determine coalescence under nine dispersal scenarios. The first three columns indicate the time of historical events in years or generations before present, assuming a 20-year or 40-year generation time (“gen20” and “gen40”). Fission between demes was used to simulate rapid range expansion events. Going backwards in time, the “source” is the deme from which genes originate, “sink” is the deme to which they go, and “m” indicates the proportion of genes in the sink that originate from the source (1 = all genes). Ne is re-scaled by “size” at each historical event and by the “growth rate” per generation until the next event (negative values imply population expansion backwards in time). The migration matrix at each historical event is indicated for each dispersal scenario. fd = post-glacial faunal dispersal, fd + exp = post-glacial faunal dispersal and range expansion, nd = post-megafauna isolation, hd1–6 = post-megafauna Indigenous dispersal scenarios.
Migration matrix according to dispersal scenario (columns fd to hd6):
| Years | gen20 | gen40 | Source | Sink | m | Size | Growth Rate | fd | fd + exp | nd | hd1 | hd2 | hd3 | hd4 | hd5 | hd6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 200 | 10 | 5 | 0 | 0 | 0 | 1 | −0.02 | - | - | - | 6 | 6 | - | - | 7 | - |
| 3999 | 199 | 99 | 0 | 0 | 0 | 1 | −0.02 | - | - | - | - | - | 6 | - | - | - |
| 4000 | 200 | 100 | 0 | 0 | 0 | 1 | −0.02 | - | - | - | - | - | - | - | - | 7 |
| 4999 | 249 | 124 | 0 | 1 | 1 | 1 | −0.02 | - | - | - | - | 6 | 2 | 2 | - | - |
| 5000 | 250 | 125 | 0 | 0 | 0 | 1 | −0.02 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 6000 | 300 | 150 | 0 | 1 | 1 | 1 | −0.02 | - | 1 | - | - | - | - | - | - | - |
| 9000 | 450 | 225 | 0 | 0 | 0 | 1 | −0.005 | 0 | 0 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 18,000 | 900 | 450 | 0 | 0 | 0 | 0.5 | 0.02 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 25,000 | 1250 | 625 | 0 | 0 | 0 | 1 | 0.005 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 40,000 | 2000 | 1000 | 0 | 0 | 0 | 1 | −0.005 | 0 | 0 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 60,000 | 3000 | 1500 | 0 | 0 | 0 | 0.5 | 0.02 | 2 | 2 | - | 2 | 2 | 2 | 2 | 2 | 2 |
| 70,000 | 3500 | 1750 | 0 | 0 | 0 | 1 | −0.005 | 0 | 0 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 110,000 | 5500 | 2750 | 0 | 0 | 0 | 1 | −0.02 | 1 | 1 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
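The gen20 and gen40 columns appear to be the event time in years divided by the generation time and rounded down to a whole generation. The short sketch below reproduces both columns under that assumption; the rounding rule is inferred from the tabulated values rather than stated in the source, and the function name is illustrative.

```python
def years_to_generations(years, generation_time):
    """Convert years before present to whole generations, rounding down."""
    return years // generation_time


# Event times (years before present) copied from the Years column of the table above.
event_years = [0, 200, 3999, 4000, 4999, 5000, 6000, 9000, 18000,
               25000, 40000, 60000, 70000, 110000]

gen20 = [years_to_generations(y, 20) for y in event_years]
gen40 = [years_to_generations(y, 40) for y in event_years]
```

For example, the event at 3999 years maps to 199 generations (20-year generation time) or 99 generations (40-year generation time), one generation before the event at 4000 years.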
Migration matrices employed in simulation models.
| Matrix | Migration | Nm | Dispersal Vector |
|---|---|---|---|
| 0 | Symmetric distance-weighted migration with barrier between deme2 and deme3 | 0.0005, 0.0002, 0.0000 | Volant fauna |
| 1 | High symmetric distance-weighted migration with no barrier | 0.0200, 0.0100, 0.0050, 0.0025, 0.0012 | Volant fauna |
| 2 | No migration | 0.0000 | NA |
| 3 | Low symmetric distance-weighted migration with barrier between deme2 and deme3 | 0.0025, 0.0012, 0.0000 | Volant fauna |
| 4 | Symmetric stepping-stone with barrier between deme2 and deme3 | 0.0050, 0.0000 | Megafauna |
| 5 | High symmetric stepping-stone migration with no barrier | 0.0200, 0.0000 | Megafauna |
| 6 | Low island migration model | 0.0025 | Human |
| 7 | Low asymmetric stepping-stone migration | 0.0025 | Human |
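One way to picture the matrix types listed above is as deme-by-deme rate matrices. The sketch below is a hypothetical construction, not the authors' simulation input: it assumes six demes (deme0 to deme5) arranged in a line, which is inferred from the five distance classes listed for matrix 1, and it assumes that entry `[a][b]` holds the rate between deme a and deme b. The direction used for the asymmetric stepping-stone model is likewise an assumption, and all function names are illustrative.

```python
def distance_weighted(n_demes, rates, barrier=None):
    """Symmetric matrix in which rates[d - 1] applies to demes d steps apart.

    Distances beyond len(rates) get 0. If `barrier` is (i, j), every pair of
    demes lying on opposite sides of the gap between deme i and deme j gets 0.
    """
    m = [[0.0] * n_demes for _ in range(n_demes)]
    for a in range(n_demes):
        for b in range(n_demes):
            d = abs(a - b)
            if d == 0 or d > len(rates):
                continue
            lo, hi = min(a, b), max(a, b)
            if barrier is not None and lo <= barrier[0] and hi >= barrier[1]:
                continue  # pair straddles the barrier
            m[a][b] = rates[d - 1]
    return m


def stepping_stone(n_demes, rate, barrier=None, symmetric=True):
    """Migration between adjacent demes only; the asymmetric form fills one direction."""
    m = [[0.0] * n_demes for _ in range(n_demes)]
    for a in range(n_demes - 1):
        if barrier is not None and (a, a + 1) == tuple(barrier):
            continue  # no exchange across the barrier
        m[a][a + 1] = rate
        if symmetric:
            m[a + 1][a] = rate
    return m


def island(n_demes, rate):
    """Equal migration between every pair of distinct demes."""
    return [[0.0 if a == b else rate for b in range(n_demes)] for a in range(n_demes)]


N_DEMES = 6  # assumption: six demes in a line
matrix0 = distance_weighted(N_DEMES, [0.0005, 0.0002], barrier=(2, 3))
matrix1 = distance_weighted(N_DEMES, [0.0200, 0.0100, 0.0050, 0.0025, 0.0012])
matrix4 = stepping_stone(N_DEMES, 0.0050, barrier=(2, 3))
matrix6 = island(N_DEMES, 0.0025)
matrix7 = stepping_stone(N_DEMES, 0.0025, symmetric=False)
```

The rates passed in are the values from the Nm column; the trailing 0.0000 entries in that column are represented here by the zeros left for barrier-crossing or non-adjacent pairs.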