Emily Humble, Kanchon K Dasmahapatra, Alvaro Martinez-Barrio, Inês Gregório, Jaume Forcada, Ann-Christin Polikeit, Simon D Goldsworthy, Michael E Goebel, Jörn Kalinowski, Jochen B W Wolf, Joseph I Hoffman.
Abstract
Recent advances in high-throughput sequencing have transformed the study of wild organisms by facilitating the generation of high-quality genome assemblies and dense genetic marker datasets. These resources have the potential to significantly advance our understanding of diverse phenomena at the level of species, populations and individuals, ranging from patterns of synteny through rates of linkage disequilibrium (LD) decay and population structure to individual inbreeding. Consequently, we used PacBio sequencing to refine an existing Antarctic fur seal (Arctocephalus gazella) genome assembly and genotyped 83 individuals from six populations using restriction site associated DNA (RAD) sequencing. The resulting hybrid genome comprised 6,169 scaffolds with an N50 of 6.21 Mb and provided clear evidence for the conservation of large chromosomal segments between the fur seal and dog (Canis lupus familiaris). Focusing on the most extensively sampled population of South Georgia, we found that LD decayed rapidly, reaching the background level by around 400 kb, consistent with other vertebrates but at odds with the notion that fur seals experienced a strong historical bottleneck. We also found evidence for population structuring, with four main Antarctic island groups being resolved. Finally, appreciable variance in individual inbreeding could be detected, reflecting the strong polygyny and site fidelity of the species. Overall, our study contributes important resources for future genomic studies of fur seals and other pinnipeds while also providing a clear example of how high-throughput sequencing can generate diverse biological insights at multiple levels of organization.
Keywords: PacBio sequencing; RAD sequencing; comparative genomics; genomic inbreeding coefficient; linkage disequilibrium (LD); population structure; single nucleotide polymorphism (SNP)
Year: 2018; PMID: 29954843; PMCID: PMC6071602; DOI: 10.1534/g3.118.200171
Source DB: PubMed; Journal: G3 (Bethesda); ISSN: 2160-1836; Impact factor: 3.154
Figure 1. Individual assignment to genetic clusters based on STRUCTURE analysis for K = 4 using 27,592 SNPs. Each horizontal bar represents a different individual and the relative proportions of the different colors indicate the probabilities of belonging to each cluster. Individuals are separated by sampling locations as indicated on the map.
Genome assembly statistics for successive improvements to the original Antarctic fur seal genome assembly
| | v1.0.2 ALLPATHS3 | v1.1 GapCloser | v1.2 PBJelly2 | v1.3 Quiver | v1.4 Pilon |
|---|---|---|---|---|---|
| Number of scaffolds | 8,126 | 8,126 | 6,170 | 6,170 | 6,169 |
| N90 | 890,836 (768) | 888,912 (768) | 1,624,547 (387) | 1,511,352 (387) | 1,542,705 (387) |
| N50 | 3,169,165 (233) | 3,165,747 (233) | 6,454,664 (108) | 6,076,522 (108) | 6,207,322 (108) |
| N10 | 8,459,351 (25) | 8,458,289 (25) | 17,733,103 (11) | 16,529,571 (11) | 16,861,656 (11) |
| Longest scaffold (bp) | 13,012,173 | 12,999,316 | 34,690,325 | 32,399,786 | 33,062,611 |
| Total size (bp) | 2,405,038,055 | 2,403,626,805 | 2,426,014,533 | 2,268,217,244 | 2,313,485,084 |
| Gaps present (%) | 4.79 | 3.26 | 0.62 | 0.57 | 0.55 |
| Number of gaps | 136,284 | 90,432 | 45,102 | 22,783 | 20,611 |
| Average gap size (bp) | 845.56 | 866.87 | 331.16 | 570.37 | 613.39 |
N90, N50 and N10 are given as size in bp (number of scaffolds).
Excluding the mitochondrial genome, which was filtered out by Pilon.
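
The N90, N50 and N10 rows in the table above each pair a scaffold length with a scaffold count. The sketch below shows one way such values can be derived from a list of scaffold lengths, using the usual definition of Nx: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least x% of the assembly. The function names `nx` and `gap_percent` and the toy lengths are illustrative assumptions, not the authors' pipeline. The gap calculation assumes that "Gaps present (%)" equals number of gaps times average gap size divided by total size, which reproduces the tabulated 0.55 for v1.4.

```python
def nx(lengths, x):
    """Return (scaffold length, scaffold count) for the Nx statistic.

    Scaffolds are taken from longest to shortest until their summed
    length reaches at least x percent of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length, count
    raise ValueError("lengths must contain at least one scaffold")


def gap_percent(n_gaps, mean_gap_bp, total_bp):
    """Percentage of the assembly made up of gaps (assumed definition)."""
    return 100 * n_gaps * mean_gap_bp / total_bp


# Hypothetical toy assembly: five scaffolds, 30 bp in total.
toy_scaffolds = [10, 8, 6, 4, 2]
print(nx(toy_scaffolds, 50))   # (8, 2): the two longest scaffolds cover 18 of 30 bp
print(nx(toy_scaffolds, 90))   # (4, 4)

# v1.4 Pilon column of the table: 20,611 gaps, mean 613.39 bp, 2,313,485,084 bp total.
print(round(gap_percent(20611, 613.39, 2313485084), 2))   # 0.55
```

Returning the count alongside the length mirrors the "size in bp (number of scaffolds)" layout used in the table.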
Figure 2. Synteny of the longest 40 Antarctic fur seal scaffolds (10.7–33.1 Mb; right, prefixed S) with dog chromosomes (left, prefixed D). Mapping each fur seal scaffold to the dog genome resulted in multiple alignment blocks (mean = 2.1 kb, range = 0.1–52.8 kb) and alignments over 5 kb are shown.
Figure 3. Plot of linkage disequilibrium (r²) against physical distance between SNPs in the Antarctic fur seal. LD was calculated using 25,068 filtered SNPs from the 100 largest scaffolds of 33 South Georgia adults. Gray points indicate observed pairwise LD values. The dark gray curve shows the expected decay of LD in the data estimated by nonlinear regression.
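
The gray points in an LD decay plot of this kind are pairs of (physical distance, LD value) for SNPs on the same scaffold. As a minimal sketch, the code below computes such pairs under the assumption that LD is estimated as the squared Pearson correlation between genotype dosages (0, 1 or 2 copies of an allele), a common estimator for unphased data. The caption does not state which estimator or software the authors used, and the function names and toy genotypes are hypothetical.

```python
from itertools import combinations


def genotype_r2(x, y):
    """Squared Pearson correlation between two SNPs' genotype dosages.

    Returns None when either SNP is monomorphic in the sample, because
    the correlation is undefined in that case.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return None
    return sxy * sxy / (sxx * syy)


def pairwise_ld(positions, genotypes):
    """List (distance in bp, r2) for every SNP pair on one scaffold.

    positions[i] is the bp coordinate of SNP i; genotypes[i] holds its
    dosage in each individual, in the same individual order for all SNPs.
    """
    pairs = []
    for i, j in combinations(range(len(positions)), 2):
        r2 = genotype_r2(genotypes[i], genotypes[j])
        if r2 is not None:
            pairs.append((abs(positions[j] - positions[i]), r2))
    return pairs


# Hypothetical toy data: three SNPs typed in six individuals.
positions = [1000, 5000, 250000]
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [2, 1, 0, 2, 1, 0],   # perfectly (negatively) correlated with SNP 0
    [0, 2, 1, 1, 0, 2],
]
for distance, r2 in pairwise_ld(positions, genotypes):
    print(distance, r2)
```

A decay curve would then be fitted to these pairs by nonlinear regression; the form of that model is not given in the caption, so it is left out of the sketch.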
Figure 4. Scatterplots showing individual variation in principal components (PCs) one and two (panels A and B), and one and three (panels C and D) derived from a principal component analysis conducted using 27 microsatellites (panels A and C) and 27,592 SNPs (panels B and D). The amount of variance explained by each PC is shown in parentheses.
Figure 5. (A) Distribution of genomic inbreeding coefficients () for 56 individual fur seals from South Georgia; (B) Distribution of identity disequilibrium () estimates from bootstrapping over individuals. The vertical dashed line represents the empirical estimate and the horizontal black line shows the corresponding 95% confidence interval based on 1000 bootstrap replicates. The vertical colored lines represent the variance in four different inbreeding coefficients. Panels C, D and E show pairwise Pearson’s correlation coefficients between and , and sMLH.
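
The caption above refers to sMLH, standardized multilocus heterozygosity. As a rough sketch, assuming the commonly used definition (an individual's proportion of heterozygous typed loci divided by the population mean heterozygosity at those same loci), it can be computed as follows. The function name `smlh`, the input encoding and the toy data are illustrative assumptions, not the authors' code.

```python
def smlh(genotypes):
    """Standardized multilocus heterozygosity for each individual.

    genotypes[i][l] is True if individual i is heterozygous at locus l,
    False if homozygous, and None if the genotype is missing.
    Returns one value per individual, or None where it is undefined.
    """
    n_loci = len(genotypes[0])

    # Mean heterozygosity of each locus across the individuals typed at it.
    locus_het = []
    for locus in range(n_loci):
        typed = [ind[locus] for ind in genotypes if ind[locus] is not None]
        locus_het.append(sum(typed) / len(typed) if typed else None)

    values = []
    for ind in genotypes:
        typed_loci = [l for l in range(n_loci) if ind[l] is not None]
        if not typed_loci:
            values.append(None)
            continue
        observed = sum(ind[l] for l in typed_loci) / len(typed_loci)
        expected = sum(locus_het[l] for l in typed_loci) / len(typed_loci)
        values.append(observed / expected if expected > 0 else None)
    return values


# Hypothetical toy data: three individuals typed at four loci, no missing calls.
toy = [
    [True, True, False, False],
    [True, False, False, False],
    [True, True, True, False],
]
print(smlh(toy))   # approximately [1.0, 0.5, 1.5]
```

With complete data the values average to 1 across individuals, so sMLH above 1 marks a relatively outbred individual and below 1 a relatively inbred one.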