Literature DB >> 34702078

Speciation in the deep: genomics and morphology reveal a new species of beaked whale Mesoplodon eueu.

Emma L Carroll1, Michael R McGowen2, Morgan L McCarthy3, Felix G Marx4,5, Natacha Aguilar6, Merel L Dalebout7, Sascha Dreyer3, Oscar E Gaggiotti8, Sabine S Hansen3, Anton van Helden, Aubrie B Onoufriou6,8, Robin W Baird9,10, C Scott Baker11, Simon Berrow12, Danielle Cholewiak13, Diane Claridge14, Rochelle Constantine1, Nicholas J Davison15, Catarina Eira16,17, R Ewan Fordyce4,5, John Gatesy18, G J Greg Hofmeyr19,20, Vidal Martín21, James G Mead2, Antonio A Mignucci-Giannoni22,23, Phillip A Morin24, Cristel Reyes8, Emer Rogan25, Massimiliano Rosso26, Mónica A Silva27, Mark S Springer28, Debbie Steel11, Morten Tange Olsen3.   

Abstract

The deep sea has been described as the last major ecological frontier, as much of its biodiversity is yet to be discovered and described. Beaked whales (ziphiids) are among the most visible inhabitants of the deep sea, due to their large size and worldwide distribution, and their taxonomic diversity and much about their natural history remain poorly understood. We combine genomic and morphometric analyses to reveal a new Southern Hemisphere ziphiid species, Ramari's beaked whale, Mesoplodon eueu, whose name is linked to the Indigenous peoples of the lands from which the species holotype and paratypes were recovered. Mitogenome and ddRAD-derived phylogenies demonstrate reciprocally monophyletic divergence between M. eueu and True's beaked whale (M. mirus) from the North Atlantic, with which it was previously subsumed. Morphometric analyses of skulls also distinguish the two species. A time-calibrated mitogenome phylogeny and analysis of two nuclear genomes indicate divergence began circa 2 million years ago (Ma), with geneflow ceasing 0.35-0.55 Ma. This is an example of how deep sea biodiversity can be unravelled through increasing international collaboration and genome sequencing of archival specimens. Our consultation and involvement with Indigenous peoples offers a model for broadening the cultural scope of the scientific naming process.

Entities:  

Keywords:  Mesoplodon eueu; Mesoplodon mirus; Ramari's beaked whale; True's beaked whale; biodiversity; taxonomy

Mesh:

Year:  2021        PMID: 34702078      PMCID: PMC8548795          DOI: 10.1098/rspb.2021.1213

Source DB:  PubMed          Journal:  Proc Biol Sci        ISSN: 0962-8452            Impact factor:   5.349


Introduction

The Earth's deep ocean remains less understood than the surface of Mars [1]. However, much biodiversity is waiting to be discovered in the deep sea, and there is great potential for this region to contribute to and challenge major ecological hypotheses [2]. Here, we focus on beaked whales (ziphiids), which are among the most visible inhabitants, due to their large size, worldwide distribution and surfacing to breathe [3]. Even so, their diversity and ecology remain obscure, with seven of the 23 species in the IUCN Red List classified as Data Deficient. Our understanding of beaked whales has been limited by the scarcity of records, with many species known only by a handful of incomplete skeletons [4]. Beaked whales typically spend limited time at the surface [5], during which they are difficult to distinguish [6]. Genetic tools have been instrumental in developing an understanding of the diversity and phylogenetic relationships among ziphiids [7], but the application of genomic tools to understand their ecology and evolution has been limited [8] despite recent technological advances [9]. Here, we use a range of genomic approaches to access genetic information from museum and archival specimens of varying age and quality to investigate the taxonomic status of disjunct populations of True's beaked whale (Mesoplodon mirus). First described by Frederick W. True in 1913 from a male that stranded in North Carolina, USA [10], the known distribution (figure 1) of M. mirus grew to include much of the temperate North Atlantic (NA) [3,6,12]. In 1959, the species was first reported in the Southern Hemisphere (SH) off South Africa [13], with the subsequent discovery of a breeding population [14]. Additional SH specimens were reported from Mozambique [15], Tristan da Cunha [16], Walter Shoals (south of Madagascar) [3], southern Brazil [17], Australia [18] and Aotearoa New Zealand [11] (figure 1).
Figure 1

Sampling locations in the NA (black circles) and SH (yellow circle). Global map viewed as a Spilhaus projection that shows the connectedness of the ocean, with sampling locations and distribution of Mesoplodon mirus and proposed species M. eueu shown by the key (sourced from [3,11]), with the artist's impression of the species in top right. Credit: Vivian Ward, University of Auckland.

Sampling locations in the NA (black circles) and SH (yellow circle). Global map viewed as a Spilhaus projection that shows the connectedness of the ocean, with sampling locations and distribution of Mesoplodon mirus and proposed species M. eueu shown by the key (sourced from [3,11]), with the artist's impression of the species in top right. Credit: Vivian Ward, University of Auckland. NA and SH True's beaked whales are separated by thousands of kilometres, raising questions about their level of divergence [6,19]. Initial suspected differences in pigmentation between hemispheres [19] have been outweighed by greater variation in colouration within NA, reducing evidence for differentiation [6]. Similarly, the only comprehensive anatomical analysis to date (based on six SH and nine NA specimens) found differences in just one cranial measurement out of 47 [19]. Molecular analyses did reveal notable divergence between NA and SH specimens, but this was based on mitochondrial control region sequences from four individuals [20,21] and ad hoc data collected thereafter for species identification [6,11]. Here, we take an integrative approach, combining genomic and morphological datasets, to resolve this taxonomic quandry.

Results

Genomes reveal deep genetic divergence

To resolve the status of SH ‘True's’ beaked whales, we first examined interhemispheric divergence using both complete mitogenomes and nuclear single-nucleotide polymorphisms (SNPs; figure 2). A Bayesian phylogeny of ten NA and two SH mitochondrial genomes clustered individuals from each hemisphere into well-supported, reciprocally monophyletic sister clades. Within this phylogeny, the holotype of M. mirus (USNM175019) clearly groups within the NA group (figure 2c); the holotype for the SH form clusters with the other SH sample with high support. The same pattern appears in a phylogeny constructed using 17 217 SNPs derived from ddRADseq for five samples from each hemisphere and three other beaked whale species (44 total samples; figure 2a; electronic supplementary material table S1 and dataset S1). The divergence between M. mirus in NA and SH appears greater than in the two other globally distributed beaked whales: Cuvier's beaked whale (Ziphius cavirostris, n = 15) and Blainville's beaked whale (M. densirostris, n = 9) (figure 2a). The ddRAD genetic clustering results revealed little admixture between the NA and SH samples, suggesting no major introgression between the two forms after divergence (figure 2b).
Figure 2

Deep genetic divergence between M. mirus (black) and M. eueu (yellow). (a) ddRAD-based phylogenetic tree on 17 217 SNPs, with 95% bootstrap support indicated by black boxes; (b) inset showing ddRAD-based admixture plot of M. mirus and M. eueu, based on 15 761 SNPs variable in these two species. (c) Mitogenome phylogenetic tree with 95% HPDs indicated by black boxes. (d) hPSMC analysis of nuclear genomes from one specimen each from USA and South Africa showing the end of gene flow between the Northern Hemisphere and SHs estimated at 0.35–0.55 Ma. The black line is the hPSMC result from our observed data and the grey lines represent the simulations closest to the real data without overlapping it, following Li and Durbin [22]. Holotype samples in (a) and (c) indicated with yellow and black stars.

Deep genetic divergence between M. mirus (black) and M. eueu (yellow). (a) ddRAD-based phylogenetic tree on 17 217 SNPs, with 95% bootstrap support indicated by black boxes; (b) inset showing ddRAD-based admixture plot of M. mirus and M. eueu, based on 15 761 SNPs variable in these two species. (c) Mitogenome phylogenetic tree with 95% HPDs indicated by black boxes. (d) hPSMC analysis of nuclear genomes from one specimen each from USA and South Africa showing the end of gene flow between the Northern Hemisphere and SHs estimated at 0.35–0.55 Ma. The black line is the hPSMC result from our observed data and the grey lines represent the simulations closest to the real data without overlapping it, following Li and Durbin [22]. Holotype samples in (a) and (c) indicated with yellow and black stars. The deep genetic split between the NA and SH is supported by high levels of genetic differentiation based on 15 761 ddRADseq SNPs in M. mirus (NA/SH n = 5/5, FST = 0.64, p < 0.01, with 1909 (12%) fixed differences, estimated error rate of 0.002 per SNP allele), mitogenomes (NA/SH n = 9/2, d = 0.04, FST = 0.96, p < 0.01) and a shorter fragment of the mitochondrial control region for which more samples were available (304 bp, NA/SH n = 19/14, d = 0.04, FST = 0.85, p < 0.01; electronic supplementary material, table S2). This level of differentiation is consistent with that proposed to indicate species separation in cetaceans [23]. A time-calibrated Bayesian analysis of the mitochondrial genomes of the focal species and its closest relative based on previous cetacean phylogenies [24] dates the NA/SH split to approximately 2 Ma (95% HPD: 1.4–2.6 Ma; figure 2c). To examine the pattern of divergence further, we sequenced two whole nuclear genomes using Illumina short-read technology, one from each ocean basin (NA, 51x and SH 7x coverage). An associated hybrid pairwise sequential Markovian coalescent (hPSMC) analysis suggests that the cessation of gene flow likely occurred around 0.55–0.35 Ma (figure 2d). The SH dataset has significantly higher nuclear diversity (SNP observed heterozygosity H = 0.159, 95% CI: 0.156–0.163, t-test t = −10.85, Kolmogorov–Smirnov D = 0.053, p < 0.001, genome-wide heterozygosity h = 0.00223, 95% CI: 0.00220, 0.00225, t = 35.49, p < 0.001) than the NA dataset (H = 0.134, 95% CI:0.131–0.1370, h = 0.00165, 95% CI: 0.00162, 0.00168), but there was no significant difference in mitochondrial control region haplotype diversity (H, 304 bp) between the two regions (SH H = 0.84, 95% CI: 0.813–0.867, NA H = 0.80, 95% CI:0.772–0.828, permutation test [25], p > 0.05). This is consistent with the estimates of effective population size during the last glacial maximum derived from whole-genome data using PSMC of approximately 17 000 for the SH and approximately 13 000 from NA (figure 3). Although down-sampling flattened the PSMC trend (electronic supplementary material, figure S1), the recent expansion was consistent with the Tajima's D statistic from ddRAD data suggesting that both the NA and SH forms have undergone expansion in recent evolutionary time (SH = −0.11, 95% CI: −0.134, −0.089, t = −9.82; Wilcoxon signed-rank test V = 14240039, NA = −0.164, 95% CI: −0.187, −0.141, t = −13.856, V = 10858056, p < 0.001 for all analyses).
Figure 3

Historical effective population sizes of M. mirus (black) and M. eueu (yellow) over time. Pairwise sequential Markovian coalescent reconstruction of the two species effective population sizes from 20 000 to 3 million years before present, assuming a generation span of 25.9 years [26] and a mutation rate [27,28] of 2.3 × 10–8. The dark single line represents the median and the lighter lines represent 100 bootstrap.

Historical effective population sizes of M. mirus (black) and M. eueu (yellow) over time. Pairwise sequential Markovian coalescent reconstruction of the two species effective population sizes from 20 000 to 3 million years before present, assuming a generation span of 25.9 years [26] and a mutation rate [27,28] of 2.3 × 10–8. The dark single line represents the median and the lighter lines represent 100 bootstrap.

The skulls of the two forms are distinct

To determine morphological distinctiveness, we conducted a geometric morphometric analysis of 15 NA and 23 SH skulls, including the holotype of M. mirus and holotype of the SH form (figure 4a–e; electronic supplementary material, dataset S2). A principal component analysis (PCA) and associated PERMANOVA (F = 18.98; p = 0.0001) of eight size-standardized cranial measurements (electronic supplementary material, dataset S2, figure S2) clearly separate NA and SH individuals (figure 4f; electronic supplementary material, figure S3). The first two principal components account for 83.3% of variance and reveal the southern form has a shorter rostrum and mandibular symphysis (PC1: 56.25%), a broader rostral base and slightly more expanded premaxillary crests and sac fossae (PC2: 26.31%; first six PCs comprise 96% of the variation).
Figure 4

Skull and morphological distinctiveness of M. eueu shown by holotype (NMNZ MM003000). (a) Skull in dorsal (left), ventral (centre) and left lateral (right) view (b) periotic in dorsal (left) and ventral (right) view; (c) tympanic bulla in dorsal (left) and ventral (right) view; (d) mandible in dorsal (top) and lateral (bottom) view; (e) mandibular tusks in medial (left) and lateral (right) view. (f) PCA of cranial and mandibular measurements showing clear separation between M. eueu from the SH and M. mirus from the NA. Percentages next to principal components (PCs) denote the total variance explained. Filled stars are holotypes and hollow stars paratypes. (Online version in colour.)

Skull and morphological distinctiveness of M. eueu shown by holotype (NMNZ MM003000). (a) Skull in dorsal (left), ventral (centre) and left lateral (right) view (b) periotic in dorsal (left) and ventral (right) view; (c) tympanic bulla in dorsal (left) and ventral (right) view; (d) mandible in dorsal (top) and lateral (bottom) view; (e) mandibular tusks in medial (left) and lateral (right) view. (f) PCA of cranial and mandibular measurements showing clear separation between M. eueu from the SH and M. mirus from the NA. Percentages next to principal components (PCs) denote the total variance explained. Filled stars are holotypes and hollow stars paratypes. (Online version in colour.) Both visual examination and morphometric analysis (F = 3.93, p = 0.017) show that the SH individuals are sexually dimorphic (female: figure 4; male: electronic supplementary material, figures S4–S6). Females (figure 4) have a longer mandibular symphysis, while males (electronic supplementary material, figure S4) are significantly larger (greater bizygomatic width: mean = 359.4 mm versus 346.6 mm; t = 2.39, p = 0.028) and have a pair of large, erupted apical ‘tusks’, more ossification of the mesorostral cartilage (electronic supplementary material, figure S6 [19]), a wider rostrum base, and broader premaxillary crests and sac fossae. In the NA form (female: electronic supplementary material, figure S7; male: electronic supplementary material, figure S8), males have enlarged ‘tusks’, more mesorostral ossification, broader premaxillary crests and a shorter mandibular symphysis also (F = 4.98, p = 0.013; electronic supplementary material, figure S3) but the sexes are comparably sized (mean bizygomatic width = 350.25 mm versus 348.43 mm; t = 0.34, p = 0.74).

Introducing a new species: Ramari's beaked whale, Mesoplodon eueu sp. nov

Molecular and morphological data reveal that True's beaked whales from the NA and SH form two distinct, long-divergent evolutionary lineages, consistent with species under the Genealogical Concordance Species Concept [29]. Therefore, we propose that the SH form should be reclassified as a new species. Systematic biology Cetacea Brisson 1762 Ziphiidae Gray 1865 Mesoplodon Gervais 1850 Mesoplodon eueu sp. nov. Holotype NMNZ MM003000, a pregnant, 5.06 m long adult female named Nihongore by Te Rūnanga o Makaawhio. Collected by Ramari Stewart, Nathaniel Scott and Don Neale after beachcast on 27 November 2011. The complete skeletons of the female and fetus are held by Museum of New Zealand Te Papa Tongarewa (NMNZ, Wellington, Aotearoa New Zealand; specimen MM003000), and a tissue sample is held in the New Zealand Cetacean Tissue Archive (NZCeTA, University of Auckland, Auckland, Aotearoa New Zealand; all institutional abbreviations in electronic supplementary material, S1). Type locality Waiatoto Spit, South Westland, Aotearoa New Zealand. Paratypes Adult females (PEM N0136 and PEM N3438) and adult male (PEM N1114) held at Port Elizabeth Museum (Gqeberha, South Africa) and adult males (SAM-ZM-041596 and SAM-ZM-039840) held at Iziko South African Museum (Cape Town, South Africa). Full description of paratypes is found in electronic supplementary material S2. Etymology The scientific and common names acknowledge links with Indigenous communities in South Africa and Aotearoa New Zealand, respectively, and were chosen in consultation with these peoples. Most of the South African strandings come from territory inhabited by the Khoisan peoples. Guided by the Khoisan Council, we chose the name //eu//’eu (simplified to eueu to fit nomenclature standards; correct pronunciation available in associated audio clip a1 in the electronic supplementary material), which means ‘big fish’ in Khwedam (from the Khoe language family). In Aotearoa New Zealand, Māori cultural expert Brad Haami developed a shortlist of potential names meaningful in the Māori language, which was then sent for comment to Te Rūnanga o Ngāi Tahu. The selected common name, Ramari's beaked whale, pays homage to Māori tohunga (expert) Ramari Stewart, who has kept traditional knowledge alive, contributed extensively to scientific research on marine mammals, and helped prepare the skeleton of the holotype. The word ‘Ramari’ means a rare event in the Māori language, reflecting the elusive nature of most beaked whales. Diagnosis Molecular characteristics M. eueu differs from M. mirus based on nuclear DNA markers, and from M. mirus and its closest relatives M. europaeus, M. ginkgodens and M. bidens using mtDNA markers (figure 2). M. mirus is distinct from all other mesoplodont species based on previous mitochondrial and nuclear DNA trees [7,20,24]. Mitochondrial DNA: analysis of mitochondrial data includes sequences from the holotypes of both M. mirus and M. eueu at all sequence lengths. Over the 304 bp mitochondrial control region segment, M. eueu is distinguished by seven fixed differences from M. mirus, with FST = 0.85 (p < 0.01), d = 0.04 between the two species. At the full mitochondrial genome lengths, M. eueu is distinguished by 579 fixed differences from M. mirus with FST = 0.96 (p < 0.01) and d = 0.04 (electronic supplementary material, table S2). Nuclear DNA: reduced representation sequencing with ddRAD showed M. eueu had a distinct admixture pattern to M. mirus (figure 2), and an FST = 0.64 (p < 0.0001) was estimated between the two species. M. mirus and M. eueu were distinguished by 1909 fixed differences (12%, per SNP allele error rate = 0.002), across a dataset of 15 671 SNPs found between or within both species. Comparison of one whole nuclear genome each from M. mirus and M. eueu showed a level of nucleotide divergence of 0.28%. Morphological characters M. eueu is a larger (5.3 m) species of Mesoplodon differing from all other members of the genus except M. mirus, M. hectori and M. perrini in having tusks positioned at the tip of the mandible. It also differs from M. hectori and M. perrini in having smaller, less triangular tusks and from M. mirus in having a relatively shorter rostrum with a wider base, a shorter mandibular symphysis, wider premaxillary sac fossae and crests, and a taller cranium. External appearance The external appearances of M. eueu and M. mirus are not known to be consistently distinguishable. Both species are rotund mesoplodonts with bodies that taper towards the tail and rostrum, with somewhat bulbous and well-defined melons, a mostly straight beak and short, straight gape. Colouration is generally grey with a dark eye patch in both species, but there may be specific colouration patterns linked to M. eueu; for example, a female that stranded in South Africa showed a whitish dorsal colouration from the fin to caudal peduncle [14]. However, Aguilar et al. [6] reported M. mirus with varying dorsal and ventral white colouration in the Canary and Azores Islands, so colouration patterns linked to species cannot yet be conclusively defined. Distribution M. eueu probably occurs throughout temperate SH waters, with reports from several locations. Genetic methods have confirmed at least some of these records in South Africa ([20], this study), Australia [20] and Aotearoa New Zealand [11]. Nomenclatural acts This published work and the nomenclatural acts it contains are registered in ZooBank (http://zoobank.org/), the online registration system for the International Commission on Zoological Nomenclature (ICZN). The ZooBank Life Science Identifier is urn:lsid:zoobank.org:pub:C61A1A33-B234-476E-AA24-86C6ED130C10, and for the new species is urn:lsid:zoobank.org:act:C56121C4-1E15-4A07-A270-D92AF43AE74A.

Discussion

As many as 1.5 million species await discovery in the deep sea [30]. Here, we show that detailed analysis of even a small number of samples can yield profound insights into the diversity and phylogeography of the species that occur in this vast habitat. As one of the few mammalian deep sea specialists, it is perhaps not surprising that ziphiids are among the most speciose cetacean lineages. Our results emphasize this pattern and corroborate the idea that the deep sea is more biodiverse than previously thought [2]. Deep sea ecosystems are governed by temperature, primary productivity and habitat complexity [31]. Modelling of energy input suggests that deep sea biodiversity peaks at latitudes of 30–50° [32], coinciding with the ranges of M. eueu and M. mirus. Based on our time-calibrated, phylogenetic reconstruction of mitogenomes, the initial divergence of these species may have been driven by intense cooling in tropical ocean temperatures approximately 2 Ma, which in turn would have facilitated cross-equatorial dispersal from one hemisphere into another. It is unclear in which ocean basin the ancestral species originated, but the slightly greater genetic diversity and wider geographical range of M. eueu may indicate a southern origin. Either way, our hPSMC analysis suggests that by 0.35 Ma all significant gene flow between the two hemispheres had ceased. After divergence, the population size of both species probably expanded as M. mirus and M. eueu showed signals of expansion in recent evolutionary time. Examples of anti-tropical species pairs and population structuring are common among cetaceans and other marine mammals [33,34]. In our analysis, M. densirostris also shows distinct populations in the NA and Indo-Pacific (figure 2a), but with far less divergence than M. mirus and M. eueu. This may indicate dispersal during a more recent glacial period, or greater gene flow in M. densirostris with its more continuous, tropical distribution [3]. Mesoplodon eueu is the fifth beaked whale species to be described or elevated to species status in the past few decades [21,35-37]. Ziphiids occur primarily offshore, spend little time at the surface [5] and are hard to distinguish visually [6], which makes them difficult to study. An integrative approach combining genetic and morphometric analysis has proved effective at uncovering the diversity and relationship of these elusive animals [21,36], and is likely to produce further taxonomic insights in the future. Our study required international collaboration, generating a global archive (International Tissue Archive for Beaked Whale) of ziphiid samples collected over five decades. Exploiting everything from whole-genome sequences to short mtDNA fragments allowed us to use samples of varying quality and thus maximize geographical coverage. Overall, our results highlight the value of museum and tissue archives in documenting and understanding speciation, especially when paired with techniques like genomic sequencing. Describing a new species requires important decisions around the naming of the new taxon. We sought to recognize Indigenous peoples’ deep connection with, and knowledge of, the natural environment by consulting with them on potential species and common names. This is part of a critical shift in the global science community, which strives to collaborate with Indigenous knowledge holders in ecology [38,39] and conservation biology [40]. Here, this has resulted in one of the first cetaceans named after an Indigenous woman.

Methods

Dataset overview

The geographic origin of and analyses used in each sample is summarized by species in electronic supplementary material, table S1. Full metadata, including sex, collection date and location, are given for samples in the genomic and morphometric analyses in the electronic supplementary material, datasets S1 and S2, respectively.

Investigating genomic differentiation

DNA extraction and genomic sequencing

DNA extraction method varied depending on the sample type. For tissue, DNA was extracted either using the Qiagen DNeasy kit, Gentra Puregene kit or using standard phenol:chloroform methods [41,42]. For bone, extraction followed the silica-based method [43], as modified by [44] or [45]. Genomic sequencing was conducted using different methods and platforms for double digest Restriction Associated DNA (ddRAD), shotgun nuclear and mitogenome analyses (see electronic supplementary material, S3). To increase the sample size for the mitochondrial control region analysis, we combined novel unpublished with previously published [6,11,20] data (electronic supplementary material, dataset S1). PCR amplification was conducted with forward primer Dlp1.5 [46] and reverse primer Ha500R_R (5'-CCATCGAGATGTCTTATTTAAGAGG-3') using standard protocols [7]. PCR products were sent to Macrogen Europe for Sanger sequencing.

ddRAD data analysis

STACKS v. 2 [47] was used to genotype the samples using the reference mapped pipeline, employing the algorithm of Maruki & Lynch [48] that accounts for sequencing quality, allele balance and allele frequencies. For the phylogenetic analysis, we used a dataset comprising the species of interest, M. mirus, two other globally distributed beaked whales to compare divergence patterns, M. densirostris and Ziphius cavirostris, and the more closely related M. bidens (electronic supplementary material, table S1). For the population genomics analysis, we focused on M. mirus samples. Both analyses followed the same pipeline. First, reads were demultiplexed using the process_radtags command in the STACKSv2 program and aligned against the M. bidens genome (GenBank: PRJNA399476) using BWA-MEM v. 0.7.17 [49]. The SNP and genotyping calling likelihood α = 0.01, ensuring only high-quality variants were called. We used a tiered and iterative approach to filtering, starting with low cut-off values for missing data (applied separately per locus and individual) and finalizing the dataset with higher thresholds, an approach that can maximize available data [50]. Genotyping error rate and discordant loci were identified by running one sample through the laboratory work and genotyping pipeline twice. Differences between these replicates were used to identify discordant loci that were then removed from the analysis and estimate a per SNP allele error rate for the pipeline. This QC pipeline was done in R v. 3.6.0 [51] and VCFtools v. 0.1.12a [52] and code is available at https://github.com/emmcarr/Mmirus. Loci were exported from STACKS v2 as a phylip file and a total of 14 468 parsimony informative sites were used to construct a maximum-likelihood phylogenetic tree in IQtree [53] using a general time-reversible model with unequal rates and unequal base frequencies [54] and ascertainment bias correction. No partitioning was used. Confidence in the clades was given by standard nonparametric bootstrapping (n = 1000). We estimated levels of genetic differentiation using FST calculated with the R package hierfstat [55], with significance assessed using the permutation method in the R package strataG [56]. The number of fixed differences between the NA and SH was estimated using the R package dartR [57] and Tajima's D was estimated for each region using vcftools (sliding window of 10 000 bases), with significance from zero of the latter tested with t-test and Wilcoxon signed-rank test in R. We estimated mean and 95% CI for observed heterozygosity [58] for NA and SH using hierfstat and compared them using t-test and Kolmogorov–Smirnov tests in R. Admixture was assessed using the genetic clustering program sNMF [59] implemented in the R package LEA [60]. The best k was inferred by calculating cross-entropy values from 10 runs of k set from 1 to 4.

Mitochondrial genome (mitogenome) assembly

Mitochondrial reads were assessed prior to and after trimming and filtering reads with FastQC v. 0.11.8a [61]. Adapters were removed and sequences were trimmed in BBDUK (options: ref = adapters ktrim = r k = 23 min k = 8 h dist = 1 tbo qtrim = rl trimq = 15 ma q = 20 min len = 40: https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbduk-guide/). Trimmed reads were mapped to a M. mirus reference mitogenome (Genbank: NC_042217.1) using BWA-MEM. PCR duplicates were removed with SAMtools v. 1.9 (rmdup command: [62]). The consensus sequence was generated with ANGSD v. 0.931 [63] (options: -doFasta2 -doCounts1 -minQ30 -minMapQ30 -setMinDepth3).

Bayesian mitogenome phylogeny

All mitogenomes were aligned using MAFFT v. 7.388 [64] with M. bidens, M. ginkgodens and M. europaeus as outgroups, M. mirus's most closely related species [24]. The 12 s rRNA, 16 s rRNA and 13 protein-coding mitochondrial genes were extracted and individually aligned. Stop codons in coding genes were manually removed in Geneious Prime 2020.1.2 (https://www.geneious.com/). The final alignments were concatenated into a combined alignment and substitution models were inferred with PartitionFinder v. 2.1.1 [65]. There were eight subsets (electronic supplementary material, table S3) which were adjusted to fit the BEAUTi v. 2.5.2 [66] XML file generator and linked with a relaxed clock log normal and a linked Yule tree model. Three MRCA node priors were implemented based on fossil evidence [24] (electronic supplementary material, table S4). The MCMC model was implemented in BEAST2 v. 2.5.2 [66] with 15 million chains and sampled every 1500 steps. The model was run three separate times to ensure convergence. The log files were inspected with Tracer v. 1.7.1 (https://github.com/beast-dev/tracer/releases/tag/v1.7.1) and combined with LogCombiner v. 2.5.2 after a 10% burn-in. The tree files were combined after a 10% burn-in with LogCombiner and a Maximum Clade Credibility (MCC) tree was generated with TreeAnnotator v. 2.4.2 [66]. The final MCC tree was visualized in FigTree v. 1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/).

mtDNA control region analysis

The mtDNA control region sequences were aligned and trimmed using Geneious Prime 2020.1.2. Given their origins, sequences varied in length and overlap. We created three datasets of lengths 600 bp (n = 28), 423 bp (n = 32) and 304 bp (n = 33), balancing length and sampling representation. The phylogenetic analyses were conducted on each datasets using MrBayes [67] (GTR gamma: 1 000 000 replicates, 100 000 burn-in) and NJ (HKY: 10 000 bootstraps) as implemented in Geneious Prime. Haplotype and nucleotide diversity, as well as estimates of d, FST, KST and number of fixed differences between ocean basins were estimated in DNAsp v. 6 [68]. Significant differences in diversity statistics between NA and SH were assessed using a permutation test [25].

Nuclear genome assembly and haploid consensus sequence generation

Genomes were assembled following steps modified from [69]. For genome assembly, raw sequences were inspected for QC with FastQC. Sequence adapters and short reads were trimmed using BBDUK and subsequently re-assessed with FastQC. Remaining reads were mapped to the M. bidens [8] reference genome using the BWA-MEM algorithm. PCR duplicates were removed with Samtools -rmdup command. Consensus haploid sequences for hPSMC were generated with ANGSD (minimum quality score = 25, minimum mapping score = 5, minimum read depth = 3, -uniqueonly 1 and -remove_bads 1). Autosomal scaffolds and scaffolds shorter than 100 000 base pairs were filtered out using the filterbyname.sh script from BBMAP v. 28.70 (https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbmap-guide/). Assembly depth of coverage estimates were generated with SAMtools. The NA assembly was downsampled to the same depth of coverage of SH assembly using SAMtools to account for inflated levels of heterozygosity in the lower depth of coverage assemblies. Subsequently, the two genome heterozygosity estimates were calculated with ROHan [70]. Windows containing less than 90% called sites were filtered out, the mean and 95% CI were calculated each for NA and SH, and the distributions compared with a t-test in R.

Pairwise sequential Markovian coalescent and hybrid PSMC modelling

The PSMC plot was generated following [22]. Sex-linked scaffolds [8] were removed from the assembly and a subsequent diploid genome was generated with bcftools using a minimum depth of coverage of five and mapping quality score of 30. The depth of coverage for M. mirus was downsampled to 5x–40x using SAMtools for a more similar comparison and to see the effects of the depth of coverage on PSMC plots (electronic supplementary material, figure S1). The PSMC file was generated and visualized in gnuplot v. 5.2 (options: −N25-t15-r5-p"4 + 25 * 2 + 4 + 6″: http://www.gnuplot.info/). The plot was scaled using a general odontocete mutation rate of 2.34 × 10−8 per year [27], following previous similar work [28], and a generation time of 25.9 years (based on Berardius [26]). The two nuclear consensus genomes were combined into a single pseudo-hybrid psmcfa file with the psmcfa_from_2_fastas.py script [71]. A PSMC run was implemented as above [22] with a modified -s10 setting. From the hybrid M. mirus hPSMC output, we manually visualized the text file output from the psmc_plot.pl script to estimate the pre-divergence N. The N of 17 000 was incorporated to run simulations dating the end of gene flow from 300 000 to 600 000ya in 50 000 year intervals using ms [72]. All simulations were plotted with the real data and the range for dating the end of gene flow was estimated from the two simulations which occurred closest to the real data without overlapping it, within the range of 1.5–10 × the pre-divergence N.

Identity by state calculation

The haploid pairwise identity between NA and SH sample partitions was calculated in ANGSD. The identity by state (IBS) value was calculated with the -doIBS 2 (consensus base) command and included filters including only sites present in both bam files with a minimum depth of coverage of 5 and reads that had a mapping score of 30, reads that mapped uniquely (samtools flag 256) and bases with a quality score of 30.

Investigating morphological variation

Skull morphometrics

We quantified the morphology 15 NA and 23 SH specimens via eight measurements of the cranium and mandible (electronic supplementary material, figure S5). Our sample included the holotype of M. mirus (USNM175019) and M. eueu (NMNZMM003000). All measurements were taken to the nearest mm, divided by the bizygomatic width of the skull to account for differences in body size (see [73] for the use of this metric as a size proxy in cetaceans), log-transformed and summarized via PCA using PAST 4.03 [74]. We tested for (i) interhemispheric differences and (ii) sexual dimorphism per hemisphere via a PERMANOVA. Missing values were estimated via mean value imputation (PCA) or accounted for via pairwise deletion (PERMANOVA: both PAST defaults). Click here for additional data file.
  42 in total

1.  Generating samples under a Wright-Fisher neutral model of genetic variation.

Authors:  Richard R Hudson
Journal:  Bioinformatics       Date:  2002-02       Impact factor: 6.937

Review 2.  Marine Biodiversity, Biogeography, Deep-Sea Gradients, and Conservation.

Authors:  Mark J Costello; Chhaya Chaudhary
Journal:  Curr Biol       Date:  2017-06-05       Impact factor: 10.834

3.  These aren't the loci you'e looking for: Principles of effective SNP filtering for molecular ecologists.

Authors:  Shannon J O'Leary; Jonathan B Puritz; Stuart C Willis; Christopher M Hollenbeck; David S Portnoy
Journal:  Mol Ecol       Date:  2018-07-10       Impact factor: 6.185

4.  Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.

Authors:  Jesse Dabney; Michael Knapp; Isabelle Glocke; Marie-Theres Gansauge; Antje Weihmann; Birgit Nickel; Cristina Valdiosera; Nuria García; Svante Pääbo; Juan-Luis Arsuaga; Matthias Meyer
Journal:  Proc Natl Acad Sci U S A       Date:  2013-09-09       Impact factor: 11.205

5.  DnaSP 6: DNA Sequence Polymorphism Analysis of Large Data Sets.

Authors:  Julio Rozas; Albert Ferrer-Mata; Juan Carlos Sánchez-DelBarrio; Sara Guirao-Rico; Pablo Librado; Sebastián E Ramos-Onsins; Alejandro Sánchez-Gracia
Journal:  Mol Biol Evol       Date:  2017-12-01       Impact factor: 16.240

6.  What influences the worldwide genetic structure of sperm whales (Physeter macrocephalus)?

Authors:  Alana Alexander; Debbie Steel; Kendra Hoekzema; Sarah L Mesnick; Daniel Engelhaupt; Iain Kerr; Roger Payne; C Scott Baker
Journal:  Mol Ecol       Date:  2016-05-02       Impact factor: 6.185

7.  IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies.

Authors:  Lam-Tung Nguyen; Heiko A Schmidt; Arndt von Haeseler; Bui Quang Minh
Journal:  Mol Biol Evol       Date:  2014-11-03       Impact factor: 16.240

8.  Genome-culture coevolution promotes rapid divergence of killer whale ecotypes.

Authors:  Andrew D Foote; Nagarjun Vijay; María C Ávila-Arcos; Robin W Baird; John W Durban; Matteo Fumagalli; Richard A Gibbs; M Bradley Hanson; Thorfinn S Korneliussen; Michael D Martin; Kelly M Robertson; Vitor C Sousa; Filipe G Vieira; Tomáš Vinař; Paul Wade; Kim C Worley; Laurent Excoffier; Phillip A Morin; M Thomas P Gilbert; Jochen B W Wolf
Journal:  Nat Commun       Date:  2016-05-31       Impact factor: 14.919

9.  Historical and contemporary indigenous marine conservation strategies in the North Pacific.

Authors:  Natalie C Ban; Emma Wilson; Doug Neasloss
Journal:  Conserv Biol       Date:  2019-11-20       Impact factor: 6.560

10.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

View more
  1 in total

1.  Sowerby's beaked whale biosonar and movement strategy indicate deep-sea foraging niche differentiation in mesoplodont whales.

Authors:  Fleur Visser; Machiel G Oudejans; Onno A Keller; Peter T Madsen; Mark Johnson
Journal:  J Exp Biol       Date:  2022-05-12       Impact factor: 3.308

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.