Hilary C Martin, Elizabeth M Batty, Julie Hussin, Portia Westall, Tasman Daish, Stephen Kolomyjec, Paolo Piazza, Rory Bowden, Margaret Hawkins, Tom Grant, Craig Moritz, Frank Grutzner, Jaime Gongora, Peter Donnelly.
Abstract
The platypus is an egg-laying mammal which, alongside the echidna, occupies a unique place in the mammalian phylogenetic tree. Despite widespread interest in its unusual biology, little is known about its population structure or recent evolutionary history. To provide new insights into the dispersal and demographic history of this iconic species, we sequenced the genomes of 57 platypuses from across the whole species range in eastern mainland Australia and Tasmania. Using a highly improved reference genome, we called over 6.7 million SNPs, providing an informative genetic data set for population analyses. Our results show very strong population structure in the platypus, with our sampling locations corresponding to discrete groupings between which there is no evidence for recent gene flow. Genome-wide data allowed us to establish that 28 of the 57 sampled individuals had at least a third-degree relative among other samples from the same river, often taken at different times. Taking advantage of a sampled family quartet, we estimated the de novo mutation rate in the platypus at 7.0 × 10⁻⁹/bp/generation (95% CI 4.1 × 10⁻⁹ to 1.2 × 10⁻⁸/bp/generation). We estimated effective population sizes of ancestral populations and haplotype sharing between current groupings, and found evidence for bottlenecks and long-term population decline in multiple regions, and early divergence between populations in different regions. This study demonstrates the power of whole-genome sequencing for studying natural populations of an evolutionarily important species.
Year: 2018 PMID: 29688544 PMCID: PMC5913675 DOI: 10.1093/molbev/msy041
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 8.800
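The abstract reports a per-generation de novo mutation rate with a 95% confidence interval estimated from a family quartet. As a rough illustration of how an estimate of that kind can be formed, the sketch below divides a count of de novo mutations by the number of haploid base pairs examined and attaches an exact Poisson interval to the count. The counts, the callable genome size, and the choice of an exact Poisson interval are all assumptions made for illustration; the record does not state the values or the interval method the authors used.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def _bisect_decreasing(f, lo, hi, iterations=200):
    """Root of a decreasing function f on [lo, hi], with f(lo) > 0 > f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(k, alpha=0.05):
    """Exact (Garwood) confidence interval for a Poisson mean given count k."""
    upper_limit = 10 * k + 20  # generous search bound for the bisection
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= k | lam) = alpha / 2
        lower = _bisect_decreasing(
            lambda lam: alpha / 2 - (1 - poisson_cdf(k - 1, lam)), 0.0, upper_limit
        )
    # Upper bound: P(X <= k | lam) = alpha / 2
    upper = _bisect_decreasing(
        lambda lam: poisson_cdf(k, lam) - alpha / 2, 0.0, upper_limit
    )
    return lower, upper

def mutation_rate(n_mutations, callable_bp, n_offspring):
    """Mutations per bp per generation; each offspring carries two haploid genomes."""
    return n_mutations / (2 * callable_bp * n_offspring)

# Hypothetical inputs, NOT taken from the study.
n_mut, callable_bp, n_offspring = 10, 1_000_000_000, 2
count_lo, count_hi = exact_poisson_ci(n_mut)
rate = mutation_rate(n_mut, callable_bp, n_offspring)
rate_lo = mutation_rate(count_lo, callable_bp, n_offspring)
rate_hi = mutation_rate(count_hi, callable_bp, n_offspring)
print(f"{rate:.2e} per bp per generation (95% CI {rate_lo:.2e} to {rate_hi:.2e})")
```

With few observed mutations the interval is wide and asymmetric, which is the same qualitative pattern as the interval quoted in the abstract.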
Fig. 1. Map showing our sampling locations within river regions in eastern and southeastern Australia. The lighter lines on the figure are rivers and the darker lines demarcate catchment areas. The waterways our samples come from are indicated by arrows, with sample sizes in brackets after the river name. The specific catchments these rivers fall into and their corresponding larger drainage division are indicated in small letters after the river name. The text colors correspond to those used in later figures. The transparent gray region represents the Great Dividing Range (GDR). Note that all samples except those from the Fish River, Gwydir River, and Rifle Creek come from river basins that drain east from the GDR. This map is adapted from one obtained from the Australian Bureau of Meteorology (www.bom.gov.au/water/geofabric/documents/BOM002_Map_Poster_A3_Web.pdf) under a CC-BY license (https://creativecommons.org/licenses/by/2.0/).
Table 1. Summary of Genetic Data and Samples Used in This and Previous Studies.
| Paper | Number of Samples | Sampling Locations | Genetic Data |
|---|---|---|---|
| | 90 | QLD, NSW, VIC, TAS, South Australia | 57 retrotransposons |
| | 120 | Five river systems in NSW, predominantly Hawkesbury and Shoalhaven | 12 microsatellites |
| | 284 | 22 river systems across whole platypus range | Haplotypes of mitochondrial control region and cytochrome b gene |
| | 752 | 33 river systems across NSW, Victoria, TAS | Three microsatellites, two mitochondrial haplotypes |
| This study | 57 | 12 river systems across QLD, NSW, TAS | 6.7 million SNPs from WGS data |
Fig. 2. Principal component analysis on 43 unrelated samples. The first two principal components (PC1 and PC2) are shown, with the proportion of variance accounted for by each indicated in parentheses. Each unfilled circle in the plot represents an individual, with the colors of the circles corresponding to the sampling location. Circles are plotted at the value of the first two principal components for that individual.
Table 2. Nucleotide Diversity (π) across Different Sampling Locations.
| Sampling Location | No. of Samples | π |
|---|---|---|
| All | 43 | 0.00117 |
| Barnard | 11 | 0.00100 |
| Central NSW | 15 | 0.00104 |
| North NSW | 12 | 0.00101 |
| North QLD | 7 | 0.00048 |
| Shoalhaven | 12 | 0.00100 |
| TAS | 5 | 0.00059 |
Note.—North NSW is Barnard + Gwydir, and Central NSW is Shoalhaven + Wingecarribee + Fish Rivers. Central QLD has only three samples and is excluded from this analysis.
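Nucleotide diversity (π) is the average number of differences per site between two chromosomes drawn at random from a sample. A minimal sketch of that calculation from allele counts at biallelic sites follows. The function name, the toy counts, and the choice to divide by all callable sites are illustrative assumptions and not the authors' pipeline.

```python
def nucleotide_diversity(alt_counts, n_chromosomes, n_sites):
    """Average pairwise differences per site.

    alt_counts:    alternate-allele count at each variable biallelic site
    n_chromosomes: number of sampled chromosomes (2 x diploid individuals)
    n_sites:       total callable sites, including invariant ones
    """
    n = n_chromosomes
    pairs = n * (n - 1) / 2  # number of distinct chromosome pairs
    total = 0.0
    for k in alt_counts:
        # k * (n - k) pairs carry different alleles at this site
        total += k * (n - k) / pairs
    return total / n_sites

# Toy data (hypothetical): 5 diploid individuals, 3 variable sites in 1,000 bp.
pi = nucleotide_diversity([1, 5, 3], n_chromosomes=10, n_sites=1000)
print(f"pi = {pi:.5f}")
```

Invariant sites contribute zero to the sum but still count in the denominator, so an accurate count of callable sites matters as much as the SNP calls themselves.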
Table 3. FST across Different Sampling Locations.
| | Wingecarribee | North QLD | TAS | Barnard | Shoalhaven |
|---|---|---|---|---|---|
| Wingecarribee | – | 0.364 | 0.544 | 0.086 | 0.088 |
| North QLD | 0.335 | – | 0.726 | 0.362 | 0.394 |
| TAS | 0.445 | 0.676 | – | 0.555 | 0.555 |
| Barnard | 0.078 | 0.335 | 0.459 | – | 0.145 |
| Shoalhaven | 0.078 | 0.361 | 0.463 | 0.126 | – |
Note.—The black numbers above the diagonal are calculated using SNPs before LD pruning, and the blue ones below the diagonal are calculated after LD pruning (see Materials and Methods).
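For readers who want to see how a pairwise FST value of this kind can be computed from allele frequencies, here is a minimal sketch using Hudson's estimator combined as a ratio of averages across SNPs. This excerpt does not say which estimator the authors used, so the choice of Hudson's estimator, the function name, and the toy frequencies are all assumptions.

```python
def hudson_fst(freqs1, freqs2, n1, n2):
    """Hudson's FST for two populations, as a ratio of averages over SNPs.

    freqs1, freqs2: sample allele frequencies at the same SNPs in each population
    n1, n2:         number of chromosomes sampled in each population
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        # Squared frequency difference, corrected for finite sample size
        numerator += (
            (p1 - p2) ** 2
            - p1 * (1 - p1) / (n1 - 1)
            - p2 * (1 - p2) / (n2 - 1)
        )
        # Probability that one allele drawn from each population differs
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    return numerator / denominator

# Toy frequencies (hypothetical) at four SNPs, 10 chromosomes per population.
pop_a = [0.9, 0.1, 0.5, 1.0]
pop_b = [0.2, 0.8, 0.5, 0.0]
print(f"FST = {hudson_fst(pop_a, pop_b, 10, 10):.3f}")
```

Because the estimate is built from per-SNP terms, correlated SNPs in linkage disequilibrium can shift the result, which is consistent with the table reporting values both before and after LD pruning.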
Fig. 3. Population structure inferred from 43 unrelated individuals using STRUCTURE. Each individual is represented by a vertical bar partitioned into K colored segments that represent the individual’s estimated membership fractions in K clusters. Ten STRUCTURE runs at each K produced very similar results, and so the run with the highest likelihood is shown. The Broken River and Carnarvon samples are labeled as central QLD and the Running River sample as north QLD, even though these samples did not form part of large clusters on the PCA and were excluded from the groupings used in tables 2 and 3 and supplementary figures S5 and S6, Supplementary Material online.
Fig. 4. Coancestry matrix from 43 unrelated individuals using FineSTRUCTURE. Each row represents one of the sampled individuals, with the colors along the row for a particular individual representing the number of pieces of their genome for which each other individual shares most recent common ancestry with them. The tree shows the clusters inferred by FineSTRUCTURE from the coancestry matrix. The groupings on the x-axis are as in figure 3.
Fig. 5. Historical effective population sizes inferred using PSMC. Each line represents a single individual with lines colored according to sampling location. Trajectories were scaled using g = 10 and μ = 7 × 10⁻⁹. Effective population size was truncated at 60,000. Samples from a similar sampling location show very similar trajectories.
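The scaling mentioned in the caption converts PSMC's unitless output into years and effective population sizes. A minimal sketch of the standard conversion follows, assuming that g is the generation time in years, that the input used PSMC's default bin size of 100 bp, and that the θ₀ value and interval list are hypothetical stand-ins for real PSMC output.

```python
def scale_psmc(theta0, intervals, mu=7e-9, gen_time=10, bin_size=100):
    """Convert PSMC output to (years ago, effective population size) pairs.

    theta0:    PSMC's scaled mutation rate per bin (theta0 = 4 * N0 * mu * bin_size)
    intervals: list of (t_k, lambda_k) pairs in PSMC's scaled units
    mu:        mutation rate per bp per generation (value from the caption)
    gen_time:  generation time, assumed here to be in years (g in the caption)
    bin_size:  bp per PSMC bin (assumed default of 100)
    """
    n0 = theta0 / (4 * mu * bin_size)  # reference effective population size
    scaled = []
    for t_k, lambda_k in intervals:
        years_ago = 2 * n0 * t_k * gen_time  # time is in units of 2 * N0 generations
        n_e = n0 * lambda_k                  # size is relative to N0
        scaled.append((years_ago, n_e))
    return scaled

# Hypothetical PSMC output, not values from the study.
for years_ago, n_e in scale_psmc(0.00028, [(0.5, 2.0), (1.0, 1.5), (2.0, 3.0)]):
    print(f"{years_ago:>10.0f} years ago: Ne = {n_e:.0f}")
```

Both axes depend on μ, so a different mutation rate estimate rescales every trajectory in time and in size without changing its shape.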