Literature DB >> 27016485

Parallel Mitogenome Sequencing Alleviates Random Rooting Effect in Phylogeography.

Shotaro Hirase1, Hirohiko Takeshima2, Mutsumi Nishida3, Wataru Iwasaki4.   

Abstract

Reliably rooted phylogenetic trees play irreplaceable roles in clarifying diversification in the patterns of species and populations. However, such trees are often unavailable in phylogeographic studies, particularly when the focus is on rapidly expanded populations that exhibit star-like trees. A fundamental bottleneck is known as the random rooting effect, where a distant outgroup tends to root an unrooted tree "randomly." We investigated whether parallel mitochondrial genome (mitogenome) sequencing alleviates this effect in phylogeography using a case study on the Sea of Japan lineage of the intertidal goby Chaenogobius annularis Eighty-three C. annularis individuals were collected and their mitogenomes were determined by high-throughput and low-cost parallel sequencing. Phylogenetic analysis of these mitogenome sequences was conducted to root the Sea of Japan lineage, which has a star-like phylogeny and had not been reliably rooted. The topologies of the bootstrap trees were investigated to determine whether the use of mitogenomes alleviated the random rooting effect. The mitogenome data successfully rooted the Sea of Japan lineage by alleviating the effect, which hindered phylogenetic analysis that used specific gene sequences. The reliable rooting of the lineage led to the discovery of a novel, northern lineage that expanded during an interglacial period with high bootstrap support. Furthermore, the finding of this lineage suggested the existence of additional glacial refugia and provided a new recent calibration point that revised the divergence time estimation between the Sea of Japan and Pacific Ocean lineages. This study illustrates the effectiveness of parallel mitogenome sequencing for solving the random rooting problem in phylogeographic studies.
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Entities:  

Keywords:  Chaenogobius annularis; Sea of Japan; divergence time estimation; mitogenome; phylogeography; random rooting

Mesh:

Substances:

Year:  2016        PMID: 27016485      PMCID: PMC4860695          DOI: 10.1093/gbe/evw063

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

Phylogeography is a field that deals with principles and processes underlying spatial patterns of genetic lineages, particularly among closely related species (Dumolin-Lapègue et al. 1997; Taberlet et al. 1998; Avise 2000). In this field, mitochondrial DNA, which is characterized by maternal transmission, extensive intraspecific variation, and general absence of intermolecular recombination, has contributed to many findings such as the impacts of Pleistocene glacial cycles on modern species distribution (Avise et al. 1987; Hewitt 2000). Although these discoveries have been facilitated by various methods such as molecular variance analysis, haplotype network analysis, mismatch distribution analysis, and Bayesian skyline plots, reliably rooted phylogenetic trees are still the key to revealing diversification patterns along evolutionary paths (Avise 2000). However, there has been a fundamental bottleneck that inhibits the reliable rooting of phylogenetic trees in phylogeographic studies: The random rooting effect. In phylogenetics, the random rooting effect represents a phenomenon in which a distant outgroup tends to root an unrooted tree “randomly” (Wheeler 1990). If the distance to an outgroup is much larger than that within an ingroup, the outgroup likely contains little information on the temporal polarity of the characters within the ingroup. Consequently, the long branch from the outgroup tends to be connected to any branch in the ingroup with low reliability (Wheeler 1990; Graham et al. 2002; Brinkmann et al. 2005; Gatesy et al. 2007; Rosenfeld et al. 2012). In phylogeographic studies, the random rooting effect often causes severe problems (Cassens et al. 2003), even though it is sometimes not explicitly stated. A typical case would be when an ingroup comprises rapidly expanded population that is characterized by a star-like phylogeny with short internal branches and an outgroup is a distant lineage that diverged a much longer time ago. Recently, the combination of high-throughput and parallel-tagged sequencing (PTS) techniques has made it possible to obtain large amounts of genetic data from many individuals at substantially lower costs and shorter time than previously (Meyer et al. 2008). Notably, whole mitochondrial genome (mitogenome) sequences of hundreds of individuals can now be sequenced in one run of a benchtop sequencer and be annotated with accurate and automatic software (Iwasaki et al. 2013). Because longer sequences contain more information, it is expected that “mitogenomic phylogeography,” or phylogeography based on parallel mitogenome sequencing (Jacobsen et al. 2012) can alleviate the random rooting effect. In this study, we applied mitogenomic phylogeography to the Sea of Japan lineage of an intertidal goby, Chaenogobius annularis, and investigated whether the approach reduced the random rooting effect. This goby species is distributed throughout rocky coastlines of Japan and Korea (Akihito et al. 2002) and has two major lineages, the Pacific Ocean and Sea of Japan lineages (Hirase et al. 2012; Hirase and Ikeda 2014b). In the Sea of Japan, the Pleistocene interglacial periods caused frequent rapid population expansion events for several coastal marine species (Kojima et al. 2004; Akihito et al. 2008; Kitano et al. 2007; Kokita and Nohara 2011; Hirase et al. 2012; Hirase and Ikeda 2014a). The Sea of Japan lineage of C. annularis also diverged over a short period of time, and a previous study on specific mitochondrial genes could not reliably root its star-like phylogenetic tree using the Pacific Ocean lineage as an outgroup (Hirase et al. 2012).

Materials and Methods

Sampling and DNA Extraction

From 2006 to 2010, we collected C. annularis by hand net from 16 and 2 sampling locations along the Japanese and Korean coastlines, respectively (fig. 1 and table 1). Part of the collected specimens were previously subjected to sequencing of mitochondrial CytB and ND2 genes and genotyping of nuclear microsatellite markers (Hirase et al. 2012; Hirase and Ikeda 2014b, 2015). Individuals from seven and ten sampling locations were assigned exclusively to the Pacific Ocean and the Sea of Japan lineages, respectively. At only one location (Miyako) were individuals from both lineages codistributed. DNA was extracted from muscle or fin by phenol/chloroform extraction (Asahida et al. 1996) and stored at − 20 °C prior to analysis.
F

Sampling locations of C. annularis in and around the Japanese archipelago. The upper-left inset map shows the coastlines during the last glacial period (Crusius et al. 1999).

Table 1

Sampling Locations of Chaenogobius annularis and Genetic Diversity in Each Location

LocationLatitude, LongitudeNo. of Individuals Examinedhπ(%)
Sea of Japan lineage
    Otaru, Hokkaido Pref.43°14′N, 141°0′E50.8000.121
    Shiriuchi, Hokkaido Pref.41°31′N, 140°25′E61.0000.211
    Hachinohe, Aomori Pref.40°32′N, 141°33′E61.0000.082
    Aomori, Aomori Pref.40°53′N, 140°51′E50.9000.251
    Fukaura, Aomori Pref.40°44′N, 139°59′E51.0000.209
    Murakami, Niigata Pref.38°16′N, 139°26′E81.0000.259
    Kyotango, Kyoto Pref.35°41′N, 135°2′E51.0000.243
    Shimonoseki, Yamaguchi Pref.33°57′N, 130°52′E70.9520.147
    Busan35°12′N, 129°13′E50.8000.292
    Gangneung, Gangwon Prov.37°51′N, 128°51′E21.0000.024
Pacific Ocean lineage
    Ofunato, Iwate Pref.38°59′N, 141°44′E30.6670.037
    Onagawa, Miyagi Pref.38°26′N, 141°27′E40.8330.198
    Hitachinaka, Ibaraki Pref.36°21′N, 140°37′E51.0000.150
    Omaezaki, Shizuoka Pref.34°35′N, 138°13′E31.0000.087
    Toba, Mie Pref.34°27′N, 136°52′E21.0000.268
    Konan, Kochi Pref.33°26′N, 133°23′E41.0000.360
    Iyo, Ehime Pref.33°42′N, 132°39′E61.0000.749
Both lineages
    Miyako, Iwate Pref.39°46′N, 141°59′E21.0005.226

Note.—h, haplotype diversity; π, nucleotide diversity.

Sampling locations of C. annularis in and around the Japanese archipelago. The upper-left inset map shows the coastlines during the last glacial period (Crusius et al. 1999). Sampling Locations of Chaenogobius annularis and Genetic Diversity in Each Location Note.—h, haplotype diversity; π, nucleotide diversity.

PTS of Mitogenomes

Chaenogobius annularis mitogenomes were amplified by long polymerase chain reaction (PCR). Two sets of complementary primers (30 bp each) were designed for conserved regions of 16S ribosomal RNA (rRNA) and transfer RNA leucine (tRNA-leu) genes. The primer sequences were S-LA-16S-H (5′-TGC ACC ATT AGG ATG TCC TGA TCC AAC ATC-3′) and S-LA-16S-L (5′-GAT GTT GGA TCA GGA CAT CCC AAT GGT GCA-3′) of 16S rRNA, and L-12321-Leu (5′-GGT CTT AGG AAC CAA AAA CTC TTG GTG CAA-3′) and H-12321-Leu (5′-TTG CAC CAA GAG TTT TTG GTT CCT AAG ACC-3′) of tRNA-leu (Kim et al. 2004). Because the two sets of complementary sequences were used as primers, the two long PCR fragments do not overlap if the primer sequences are excluded. PCR was performed in a 20-µl reaction volume containing 10 µl 2× PCR buffer, 5 mg/l template DNA, 25,000 units/l Tks Gflex DNA polymerase (TAKARA), and 0.2 µM of each primer. The PCR conditions comprised an initial denaturation at 94 °C for 1 min followed by 30 cycles of denaturation at 98 °C for 10 s and annealing and extension at 68 °C for 10 min. The PCR products were purified using AMPURE beads (Beckman Coulter) or SeqTarget (Qiagen), and DNA concentration was measured with a Qubit dsDNA HS Assay Kit (Life Technologies). For each individual, products amplified by the two primer sets were mixed in an equimolar ratio and fragmented with dsDNA Fragmentase (New England Biolabs) in a 20-µl reaction volume containing 2 µl 10× fragmentase reaction buffer, 2 µl fragmentase solution, 100 mg/l BSA, and 2.5–100 mg/l PCR product at 37 °C for 40–65 min. After purification with AMPURE beads, approximately 10–200 ng of the fragmented products were tagged with individual-specific 8-nt PTS barcode tags following Meyer et al. (2008). Two GS junior libraries that contained the equimolar-tagged products were generated using Next Quick DNA library Prep Master Set for 454 (New England Biolabs). Two MID tags (Roche) were used for each library. Short DNA fragments were removed with AMPURE beads subsequently using GenRead (Qiagen). Each library was then sequenced on a GS Junior System (Roche) following the manufacturer’s protocol after emulsion PCR under a condition of 0.2 DNA copies per amplification bead. The output SFF files were processed to obtain one SFF file per individual using sfffiles (Roche) and in-house Perl scripts. Sequences that contained no PTS tag sequence in their 3′- and 5′-end stretches (25 bases each) were discarded.

De Novo Assembly of Reference Sequences

No reference mitogenome sequence was available for C. annularis. Accordingly, de novo assembly was performed using the sequenced reads of Ofunato2 and Aomori9 samples, which had the highest average sequencing depths (89.5 and 79.5×) among the Pacific Ocean and Sea of Japan lineages, respectively. After removal of primer sequences, GS De Novo Assembler 2.7 (Roche) was applied, with the default settings, to obtain two reference contig sequences that corresponded to the two long-PCR products of each lineage. Nucleotides with quality scores less than 30 were substituted with N and accounted for 0.09% of the reference sequences on average.

Alignment of Mitogenome Sequences

The sequenced reads from each individual were mapped to the reference contigs of the corresponding lineage using GS Reference Mapper 2.7 (Roche) with the default settings. Nucleotides with quality scores less than 30 were substituted with N. In the subsequent analyses, only mitogenomes composed of two contigs and greater than 16,000 nt bases (∼97% of a mitogenome) were used. For each individual, the two contigs were concatenated and the coordinates were adjusted to place the tRNA-Phe gene at the first position. Note that sequences that corresponded to the complementary primer positions (60 bases in total) remained undetermined. The mitogenome sequences were subjected to multiple alignment with a mitogenome sequence of Gymnogobius petschiliensis (AY525784), which is closely related to the Chaenogobius genus (Hirase and Ikeda 2014a), by the FFT-NS-2 algorithm of MAFFT 7.058 (Katoh et al. 2005) with manual checking. We annotated the mitogenome sequences using MitoAnnotator (Iwasaki et al. 2013) and by reference to the G. petschiliensis mitogenome annotation. We observed some deletions and insertions in the mitogenomes of C. annularis. Because the long PCR reduced the possibility that pseudogenes in nuclear mitochondrial DNA were amplified and we observed no increase in the frequencies of deletions and insertions whose lengths were multiples of three within protein-coding sequences (i.e., no increase in nonframeshift mutations), we attributed them to homopolymer errors of pyrosequencing. In protein-coding regions, we added Ns to or removed nucleotides from sequences to eliminate any frameshift error and make the gene lengths as close to the corresponding positions of the G. petschiliensis mitogenome as possible. In noncoding regions, we added Ns or removed nucleotides to make the gene lengths equal to those of G. petschiliensis. Finally, we obtained mitogenomes of 83 C. annularis individuals, whose sequence coverage values ranged from 6.8 to 116.9 × (INSDC accession numbers: AP014725–AP014799). Individuals having the same mitogenome sequences were grouped using Phylogears 2.0 (Tanabe 2008) into 75 haplotypes.

Comparison with Sanger Sequences

Among the 83 individuals whose mitogenome sequences were obtained, the CytB and ND2 gene sequences of 48 individuals had been obtained by the Sanger method in a previous study (Hirase et al. 2012). The R package ADEGENET (Jombart and Ahmed 2011) was applied to those CytB and ND2 genes to call single nucleotide variations (SNVs). An in-house Perl script was then used to compare the SNVs between the Sanger and pyrosequenced sequences (314 SNVs on average per pair).

Phylogenetic Analysis

MEGA 5.05 (Tamura et al. 2011) with options of 95%-cutoff partial deletion, Jukes–Cantor distance, and 1,000 bootstrap replications was applied to the mitogenome alignment to reconstruct neighbor-joining (NJ) trees (Saitou and Nei 1987) because NJ trees based on Jukes–Cantor distance are reported to show good performance in estimating phylogenetic relationships of closely related lineages (Nei and Kumar 2000). For reference, we also reconstructed maximum likelihood (ML) trees using PartitionFinder 1.0.1 (Lanfear et al. 2012) and RAxML 7.2.8 (Stamatakis 2006). In the PartitionFinder analysis, we adopted the greedy heuristic search algorithm and Bayesian information criterion for identifying the best fit partitioning scheme, where the designated data blocks comprised each codon in each protein-coding gene, each of the tRNA and rRNA genes, and the control region. In the RAxML analysis, we reconstructed ML trees based on the GTR + G substitution model and performed rapid bootstrap analyses with 1,000 replicates (–f a option). We obtained a similar topology but with a smaller bootstrap value of the northern lineage than of the NJ tree (supplementary fig. S1, Supplementary Material online). The discrepancy in the bootstrap value agreed with a previous report that complex substitution models would perform worse in phylogenetic analysis of closely related lineages (Nei and Kumar 2000). For assessment of the random rooting effect, we extracted 17 single-gene (or region) alignments that corresponded to 13 protein-coding genes, 2 rRNA genes, concatenated tRNA genes, and the control region, from the mitogenome alignment. The additional 17 single-gene (or region) alignments that contained only sequences of the Sea of Japan lineage (i.e., without outgroup) were also created. SEQBOOT of the Phylip 3.695 package (Felsenstein 1993) was used to produce 1,000 bootstrap replicates of each of the 34 alignments. For each replicate, an NJ tree was reconstructed using DNADIST with Jukes–Cantor distance and NEIGHBOR. The proportions of the bootstrap trees that supported lineages under investigation were then examined with CompareToBootstrap.pl in Fast Tree-Comparison Tools (http://meta.microbesonline.org/fasttree/treecmp.html, last accessed March 29, 2016). We repeated the same phylogenetic analyses by gradually increasing the number of genes and regions in the alignment and observing whether root placement converged.

Population Genetic Analysis

Arlequin 3.5 (Excoffier et al. 2005) was used to calculate nucleotide diversity, haplotype diversity, mismatch distribution (Rogers and Harpending 1992), Tajima’s D (Tajima 1989), and Fu’s Fs (Fu 1997) for the whole mitogenomes. To determine whether rapid population expansion or positive selection explains the observed values (Slatkin and Hudson 1991), Tajima’s D was also calculated for each of the protein-coding, tRNA, and rRNA genes and for the control region. In addition, DNAsp 5.10 (Librado and Rozas 2009) was used to calculate Tajima’s D for synonymous and nonsynonymous sites of each protein-coding gene, and significance of their differences was tested with a Wilcoxon two-sample test.

Divergence Time Estimation

We assumed that the time of the most recent common ancestor of the northern lineage corresponds to one of four times during the late Pleistocene interglacial periods (9–19, 125–135, 238–248, and 332–342 ka) (Lisiecki and Raymo 2005). We used Reltime of MEGA 6.06 (Tamura et al. 2013) to estimate the branch lengths by the ML method with the TN93 (Tamura and Nei 1993) + G + I substitution model, which was selected by MEGA as the best-fit substitution model. Although we adopted the midpoint of each interglacial period as the calibration point, use of the start or end point did not affect the conclusion. We also estimated evolutionary clock rates of CytB, ND2, ATP6, COI, 16S rRNA genes, and control region using the abovementioned calibration points and average pairwise sequence divergences in the northern lineage, and compared them with previously reported rates (those of fish pairs separated by the Isthmus of Panama [ND2: 1.3%/Myr, ATP6: 1.3%/Myr, COI: 1.2%/Myr, Bermingham et al. 1997; 16S rRNA: 0.8–1.2%/Myr, Tringali et al. 1999; control region: 3.6%/Myr, Donaldson and Wilson 1999] and those of the Gymnogobius genus [CytB: 2.2–2.7%/Myr, Harada et al. 2002 and Sota et al. 2005]). We also performed BEAST 1.8.0 (Drummond and Rambaut 2007) analysis under the assumption that the evolutionary rate of C. annularis mitogenomes is similar to previously reported rates of fish mitogenomes. We obtained no significant evidence for rate heterogeneity among branches using an uncorrelated log-normal clock model and by removing sequences of G. petschiliensis (ucld.stdev = 0.30. Note that ucld.stdev > 1.0 denotes rate heterogeneity Drummond et al. 2007). A likelihood ratio test performed by MEGA also did not show among-lineage rate heterogeneity in C. annularis (P = 0.19). We accordingly removed sequences of G. petschiliensis from the data set, chose the strict clock model, and set a prior distribution of the evolutionary rate (clock rate = 1.0 ± 1.0%/Myr per lineage) by reference to the typical range in fish (0.4–1.8%/Myr per lineage, Tringali et al. 1999; Donaldson and Wilson 1999). It may be noted that the typical evolutionary rate in mammals is approximately 1%/Myr per lineage (Brown et al. 1979). TNr (TN93) + G + I substitution model and the Yule (speciation) tree prior were used, and all other model parameters and priors were set to default values. In Markov chain Monte Carlo analysis, a total of 100 million generations were generated and one in every 1,000 generations was sampled (10% of the initial samples were removed as burn-in). The convergence of the chains to the stationary distribution and large effective sample size (>500) were confirmed using Tracer 1.5 (Rambaut and Drummond 2009).

Results

Parallel Mitogenome Sequencing of C. annularis

Mitogenomes of C. annularis collected from 18 sampling locations (fig. 1 and table 1) were sequenced by two runs of a GS Junior sequencer with the PTS technique (Meyer et al. 2008). In total, we obtained 182,784 raw sequence reads whose average length and quality score were 431.8 bp and 32.7, respectively. Individual-specific PTS and MID tag sequences allowed 146,967 reads (80.4%) to be unambiguously assigned to corresponding individuals. Following de novo assembly of reference mitogenomes of the Pacific Ocean and Sea of Japan lineages, mitogenomes of 83 individuals were determined. To ensure high quality of the mitogenome sequences, we called SNVs of the CytB and ND2 gene sequences of 48 individuals previously obtained by the Sanger method and compared them with the mitogenome sequences. The comparison confirmed that the SNVs from the two sequencing methods were perfectly consistent. Finally, we grouped individuals having the same sequences into 75 mitogenomes (haplotypes).

A Mitogenomic Phylogenetic Tree Reveals the Existence of a Northern Lineage within the Sea of Japan Lineage

We reconstructed a mitogenomic phylogenetic tree using the 75 C. annularis mitogenome sequences and that of G. petschiliensis (fig. 2). Within the Sea of Japan lineage, the tree revealed with 99% bootstrap support the existence of a novel lineage whose members are widely distributed along the northern coastline of the Sea of Japan (designated as ‘Northern’ in figure 2 and called the northern lineage hereafter). The existence of several small clades (e.g., Busan, Gangneung, Shimonoseki, Hachinohe, and Hokkaido [Otaru and Shiriuchi] clades) that correspond to specific geographic areas was also suggested.
F

A phylogenetic tree of 75 mitogenomes (haplotypes) of C. annularis and one of G. petschiliensis. Branch lengths are proportional to estimated numbers of nucleotide substitutions per site. Closed circles and numbers indicate strongly (>80%) supported geographical clades and their bootstrap values, respectively.

A phylogenetic tree of 75 mitogenomes (haplotypes) of C. annularis and one of G. petschiliensis. Branch lengths are proportional to estimated numbers of nucleotide substitutions per site. Closed circles and numbers indicate strongly (>80%) supported geographical clades and their bootstrap values, respectively. Nonmitogenomic trees that were reconstructed using each mitochondrial gene, concatenated tRNA genes, or the control region did not support the existence of the northern lineage (highest bootstrap value = 56% in the ND1 gene tree). Thus, the mitogenome sequences were indispensable in the establishment of the northern lineage.

Use of Mitogenomes Alleviated the Random Rooting Effect

To determine whether the random rooting effect was the cause of the low bootstrap values of the northern lineage in the nonmitogenomic trees, we reconstructed an ND1 gene tree using sequences of the Sea of Japan lineage only. Within this “ingroup-only” tree, the split corresponding to the northern lineage was supported by an 85% bootstrap value. This large value suggested that random rooting actually took place in the reconstruction of the nonmitogenomic ND1 tree, where an outgroup branch from the Pacific Ocean lineage was connected to diverse branches in the Sea of Japan lineage and, as a result, decreased the bootstrap support value of the northern lineage. For any phylogenetic tree that was reconstructed using each gene or region of the mitogenomes, a sharp decrease in the bootstrap support value of the northern lineage was observed following the addition of the outgroup (table 2).
Table 2

Bootstrap Values of the Northern Lineage without and with the Outgroup

LengthWithout OutgroupWith Outgroup
ND19758556
ND21,0443724
COI1,5544830
COII69000
ATP815600
ATP66753530
COIII783326
ND334800
ND41,380120
ND4L28800
ND51,83600
ND65224417
CytB1,1404920
12S rRNA94600
16S rRNA1,66000
Control region83500
tRNAs1,5218325
Mitogenome16,42610098
Bootstrap Values of the Northern Lineage without and with the Outgroup To further examine the random rooting effect, we investigated topologies of bootstrap trees reconstructed using ND1 genes including the outgroup, and identified the ingroup branches to which the outgroup branch was connected (fig. 3). As expected, the outgroup branch was connected to diverse branches with low reliability (3−34%) in the ND1 gene tree, likely because the outgroup contained little information on the temporal polarity of the characters within the Sea of Japan lineage, which had diverged rapidly (fig. 3). The same trend was observed when another gene or region was used in the analysis instead of ND1 (supplementary fig. S2, Supplementary Material online). We concluded that the use of the mitogenome efficiently stabilized the rooting position. However, it may be noted that a highly supported phylogeny does not necessarily represent a correct phylogeny, particularly where long-branch attraction artifacts are present. We also repeated the phylogenetic analysis using different partial mitogenomic sequences by gradually increasing the number of genes under evaluation. When we included approximately ten genes (or regions) in the analyses, the root placement became almost stabilized. This root placement was the same (i.e., converged) even if different sets of ten genes or regions were used (supplementary table S1, Supplementary Material online). Thus, incorrect rooting owing to a specific gene whose evolutionary trend is atypical (such as by long-branch attraction) is unlikely.
F

Proportions of branches of the Sea of Japan lineage to which the outgroup branch was connected among the bootstrap trees of the ND1 gene. The outgroup branch was connected to diverse ingroup branches, failing to root the tree reliably.

Proportions of branches of the Sea of Japan lineage to which the outgroup branch was connected among the bootstrap trees of the ND1 gene. The outgroup branch was connected to diverse ingroup branches, failing to root the tree reliably.

The Northern Lineage Experienced Rapid Population Expansion

The star-like shape of the phylogenetic tree based on the present mitogenome data suggested that the northern lineage experienced rapid population expansion. Thus, using the mitogenomes of the northern lineage, further population genetic analysis was performed to test this hypothesis. The shape of the mismatch distribution was unimodal and similar to that of a simulated distribution under the model of rapid population expansion (Rogers and Harpending 1992) (fig. 4). Neither sum of squared deviation nor raggedness index suggested that the two distributions differed significantly (P > 0.05). Significantly negative values of Tajima’s D (−2.26; P = 0.001) and Fu’s Fs (−7.45; P = 0.017) also supported the rapid population expansion scenario. Furthermore, because most regions in the mitogenomes showed significantly negative Tajima’s D values and no skews between the nonsynonymous and synonymous sites were observed (table 3), positive selection was not likely a major factor in the genetic divergences in the northern lineage (Slatkin and Hudson 1991). We concluded from these findings that rapid population expansion gave rise to the northern lineage along the northern coastline of the Sea of Japan.
F

Mismatch distribution in the northern lineage mitogenomes. Observed (histogram) and simulated (line) frequencies of pairwise nucleotide differences between the mitogenome sequences are shown.

Table 3

Tajima’s D Value of the Northern Lineage in Each Region of Mitogenome

AllSynonymousNonsynonymous
SubstitutionsSubstitutionsSubstitutions
ND1−2.34*−1.98*−1.81*
ND2−2.06*−1.95*−1.89*
COI−2.35*−2.19*−1.71*
COII−1.60−1.60NA
ATP8−1.31−1.12−1.12
ATP6−2.03*−1.52−2.01*
COIII−2.01*−1.56−1.76
ND3−1.82*−1.35−1.48
ND4−2.37*−2.14*−2.32*
ND4L−1.81*−1.44−1.12
ND5−2.20*−2.17*−1.98*
ND6−1.86*−1.82*−1.64
CytB−2.08*−2.08*−1.81*
12S rRNA−1.47
16S rRNA−2.18*
Control region−1.74*
tRNAs−2.37*

Note.—NA, not applicable.

P < 0.05.

Mismatch distribution in the northern lineage mitogenomes. Observed (histogram) and simulated (line) frequencies of pairwise nucleotide differences between the mitogenome sequences are shown. Tajima’s D Value of the Northern Lineage in Each Region of Mitogenome Note.—NA, not applicable. P < 0.05.

Divergence Time Estimation Based on the Rapid Population Expansion of the Northern Lineage

Population expansion in the Sea of Japan has been observed for many coastal marine species (Kojima et al. 2004; Akihito et al. 2008; Kokita and Nohara 2011; Hirase et al. 2012; Hirase and Ikeda 2014a). Thus, population expansion of these species may be attributed to large-scale environmental changes rather than species-specific factors, and the glacial–interglacial cycles played the largest role in such changes in the Sea of Japan (Hirase et al. 2012). During glacial periods, the Sea of Japan was clearly characterized by colder temperature and lower salinity because of the closing of the south Tsushima Strait, with cold fresh water from rivers supplying much of the water (Oba et al. 1991; Tada 1994; Gorbarenko and Southon 2000). In contrast, during interglacial periods, the resumed inflow of the warm Tsushima Current drastically improved the severe conditions in the Sea of Japan after the late Pleistocene (Kitamura et al. 2001; Kitamura and Kimoto 2006). We accordingly speculated that the rapid population expansion of the northern lineage occurred during an interglacial period. We calibrated the phylogenetic tree by making the time of the most recent common ancestor of the northern lineage correspond to each of the four interglacial periods in the late Pleistocene (i.e., 9–19, 125–135, 238–248, and 332–342 ka; Lisiecki and Raymo 2005). Among these scenarios, comparison with previously reported evolutionary rates immediately rejected the 9–19 ka expansion scenario because of unrealistically large estimated rates (table 4). Based on the other three scenarios, the divergence time between the Pacific Ocean and Sea of Japan lineages was estimated to be from 3.37 to 8.74 Ma (table 4 and supplementary figs. S3–S5, Supplementary Material online). As an alternative approach without the calibration based on the interglacial periods, we also performed Bayesian analysis with a strict molecular clock model and prior distribution of the typical evolutionary rate in fish (a clock rate of 1.0 ± 1.0%/Myr per lineage). This independent analysis estimated the divergence time between the Pacific Ocean and Sea of Japan lineages to be 3.30 Ma (95% highest posterior density: 0.90–7.56 Ma) and the time of the most recent common ancestor of the northern lineage to be 144 ka (95% highest posterior density: 38–330 ka).
Table 4

Estimation of Divergence Time between the Pacific Ocean and Sea of Japan Lineages

TMRCA of Northern Lineage9–19 ka125–135 ka238–248 ka332–342 ka
Average time (95% confidence intervals)
    Divergence between tde Pacific Ocean and Sea of Japan lineages3.37 (1.18–5.56) Ma6.30 (2.20–10.40) Ma8.74 (3.06–14.42) Ma
Evolutionary rate
    ND2 (reported rate: 1.3%/Myra)40.0%/Myr4.3%/Myr2.3%/Myr1.7%/Myr
    COI (reported rate: 1.2%/Myra)14.9%/Myr1.6%/Myr0.9%/Myr0.6%/Myr
    ATP6 (reported rate: 1.3%/Myra)27.0%/Myr2.9%/Myr1.6%/Myr1.1%/Myr
    CytB (reported rate: 2.2–2.7%/Myrb)21.1%/Myr2.3%/Myr1.2%/Myr0.9%/Myr
    16S rRNA (reported rate: 0.8–1.2%/Myrc)13.3%/Myr1.4%/Myr0.8%/Myr0.6%/Myr
    Control region (reported rate: 3.6%/Myrd)44.4%/Myr4.8%/Myr2.6%/Myr1.8%/Myr

Note.—TMRCA, the time of the most recent common ancestor.

Evolutionary rates used as reference: aBermingham et al. (1997);

bHarada et al. (2002); Sota et al. (2005);

cTringali et al. (1999);

dDonaldson and Wilson (1999).

Estimation of Divergence Time between the Pacific Ocean and Sea of Japan Lineages Note.—TMRCA, the time of the most recent common ancestor. Evolutionary rates used as reference: aBermingham et al. (1997); bHarada et al. (2002); Sota et al. (2005); cTringali et al. (1999); dDonaldson and Wilson (1999).

Discussion

Discovery of the Northern Lineage Suggests the Existence of Additional Glacial Refugia

Discovery of the northern lineage provides a new perspective on glacial refugia of C. annularis. In a previous study that analyzed mitochondrial CytB gene and nuclear microsatellite DNA markers, the entire Sea of Japan lineage was proposed to have expanded from southern refugia through an inflow of the Tsushima Current (Hirase and Ikeda 2014b). This estimation was based on observations that the Sea of Japan lineage showed a star-like phylogeny without large genetic divergences and that southern Shimonoseki populations were of the richest genetic diversity. This scenario was also consistent with a paleoenvironmental study (Gorbarenko and Southon 2000) that found that the southern area had a constant inflow of fresh seawater through a portion of the strait even during the last glacial period. However, the present mitogenomic analysis requires this scenario to be revised because the northern lineage is clearly distinct from the southern lineages in the mitogenomic tree and likely experienced rapid population expansion. We propose that there were additional glacial refugia in the northern areas of the Sea of Japan coastlines and that the entire Sea of Japan lineage did not expand from the southern refugia alone. It is notable that the presence of small clades whose members are distributed in the southern areas may indicate that there were multiple glacial refugia in the southern areas of the Sea of Japan. However, these small clades seem to have experienced genetic drift, a hypothesis consistent with an observation that the Korean populations showed no or few private alleles in the microsatellite DNA loci in contrast to the Shimonoseki population’s display of rich genetic diversity (Hirase and Ikeda 2014b). Mitogenome data from more sampling points would be necessary to clarify whether multiple glacial refugia were present in the southern areas of the Sea of Japan.

The New Calibration Point Leads to a Revised Divergence Time Estimation

In a previous study that analyzed the mitochondrial CytB gene, the Pacific Ocean and the Sea of Japan lineages were estimated to have diverged 1.08–2.46 Ma (Hirase et al. 2012). Based on the new mitogenome data, the divergence time estimation was revised to be either of 3.37, 6.30, or 8.74 Ma using the time of the most recent common ancestor of the northern lineage as a new calibration point (table 4). Of these times, we propose that the 125–135 ka northern lineage expansion and 3.37 Ma divergence scenario is the most likely, given that the independent Bayesian analysis with a strict molecular clock model and prior distribution of typical evolutionary rate in fish gave similar estimates (144 ka expansion and 3.30 Ma divergence) and rapid population expansions around 130 ka during the last interglacial periods have been reported for other marine species as well (Provan et al. 2005; Hoarau et al. 2007; Ni et al. 2014). Although comparison between the estimated and previously reported evolutionary rates suggests that the 238–248 ka expansion and 6.30 Ma divergence can be the case (table 4), the 6.30 Ma divergence is unlikely because Tsushima Strait was closed during 10.5–3.5 Ma (Tada 1994; Kitamura and Kimoto 2006) and this closure should have greatly inhibited the migration of goby species between the Pacific Ocean and the Sea of Japan. Conversely, the 3.37 Ma divergence scenario reasonably synchronizes with the opening of Tsushima Strait 3.5 Ma. Furthermore, it may be noted that the opening of the Tsushima Strait 3.5 Ma has been reported to account for the divergence between the continental and Japanese archipelago populations of a related group, the Gymnogobius castaneus–taranetzi species complex (Sota et al. 2005). Geographic changes after the last interglacial period such as disappearance and reappearance of a land bridge across the Tsugaru Strait (Koaze et al. 2003) may have contributed to the formation of small clades (including the Hachinohe and Hokkaido clades) in the northern lineage. The availability of recent calibration points lays a foundation for understanding the phylogeography of various species and populations. However, particularly for marine species, molecular clock calibrations often need to rely on old fossils and geological events because of a general lack of recent calibration points (Marko 2002; Lessios 2008). Rapid population expansions during interglacial periods would provide alternative ways to obtain recent calibration points (e.g., Crandall et al. [2012] calibrated clock rates of marine invertebrates by assuming rapid population expansion on the Sunda Shelf in southeastern Asia). We propose that reliably rooted mitogenomic phylogenetic trees will facilitate obtaining calibration points that correspond to recent expansion events.

Mitogenomic Phylogeography Is a Powerful Approach

In this study, we showed that high-throughput and low-cost parallel mitogenome sequencing led to the discovery of a new lineage within the Sea of Japan lineage of C. annularis by alleviating the random rooting effect in phylogenetic tree reconstruction. Furthermore, this novel lineage, which expanded rapidly in the interglacial, suggested the existence of additional glacial refugia and provided a new calibration point for divergence time estimation. Our study illustrates the power of parallel mitogenome sequencing in phylogeographic studies. Because sequencing of mitogenomes of many individuals previously required considerable cost and time (Keis et al. 2013), whole mitogenome sequences have been used in a limited number of phylogeographic studies (for a recent study, see Jacobsen et al. 2012). In 2015, more than 50 mitogenomes can be sequenced in one run of a benchtop sequencer, costing approximately $2,100 (library construction, $800; sequencing, $1,300), within 1 week. Although the GS Junior System used in this study will be phased out in 2016 according to the manufacturer’s announcement, this approach can readily be adopted with other sequencing platforms. Mitogenomic phylogeography would contribute to a wide range of phylogeographic studies, particularly for revealing geographical and historical patterns behind rapidly expanded populations and predicting impacts of future climate changes on species diversity (Pearson 2006). Another promising application would be a combination of mitogenomic and nuclear genomic high-throughput sequencing approaches such as restriction site-associated DNA-Seq analysis (Emerson et al. 2010) to investigate the phylogeography of sexual organisms in depth.

Supplementary Material

Supplementary figures S1–S5 and table S1 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).
  40 in total

Review 1.  The genetic legacy of the Quaternary ice ages.

Authors:  G Hewitt
Journal:  Nature       Date:  2000-06-22       Impact factor: 49.962

2.  Partitionfinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses.

Authors:  Robert Lanfear; Brett Calcott; Simon Y W Ho; Stephane Guindon
Journal:  Mol Biol Evol       Date:  2012-01-20       Impact factor: 16.240

3.  An empirical assessment of long-branch attraction artefacts in deep eukaryotic phylogenomics.

Authors:  Henner Brinkmann; Mark van der Giezen; Yan Zhou; Gaëtan Poncelin de Raucourt; Hervé Philippe
Journal:  Syst Biol       Date:  2005-10       Impact factor: 15.683

4.  RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models.

Authors:  Alexandros Stamatakis
Journal:  Bioinformatics       Date:  2006-08-23       Impact factor: 6.937

5.  Parallel tagged sequencing on the 454 platform.

Authors:  Matthias Meyer; Udo Stenzel; Michael Hofreiter
Journal:  Nat Protoc       Date:  2008       Impact factor: 13.491

6.  Comparative phylogeography and postglacial colonization routes in Europe.

Authors:  P Taberlet; L Fumagalli; A G Wust-Saucy; J F Cosson
Journal:  Mol Ecol       Date:  1998-04       Impact factor: 6.185

7.  Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection.

Authors:  Y X Fu
Journal:  Genetics       Date:  1997-10       Impact factor: 4.562

Review 8.  Comparative phylogeography in marginal seas of the northwestern Pacific.

Authors:  Gang Ni; Qi Li; Lingfeng Kong; Hong Yu
Journal:  Mol Ecol       Date:  2014-02       Impact factor: 6.185

9.  Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.

Authors:  F Tajima
Journal:  Genetics       Date:  1989-11       Impact factor: 4.562

10.  The neighbor-joining method: a new method for reconstructing phylogenetic trees.

Authors:  N Saitou; M Nei
Journal:  Mol Biol Evol       Date:  1987-07       Impact factor: 16.240

View more
  6 in total

1.  Comparative Mitogenomic Analyses of Praying Mantises (Dictyoptera, Mantodea): Origin and Evolution of Unusual Intergenic Gaps.

Authors:  Hong-Li Zhang; Fei Ye
Journal:  Int J Biol Sci       Date:  2017-02-25       Impact factor: 6.580

2.  Mitochondrial Genomics Reveals Shared Phylogeographic Patterns and Demographic History among Three Periodical Cicada Species Groups.

Authors:  Zhenyong Du; Hiroki Hasegawa; John R Cooley; Chris Simon; Jin Yoshimura; Wanzhi Cai; Teiji Sota; Hu Li
Journal:  Mol Biol Evol       Date:  2019-06-01       Impact factor: 16.240

3.  Reconstructing the population history of the sandy beach amphipod Haustorioides japonicus using the calibration of demographic transition (CDT) approach.

Authors:  Kay Sakuma; Risa Ishida; Taketoshi Kodama; Yoshitake Takada
Journal:  PLoS One       Date:  2019-10-09       Impact factor: 3.240

4.  Complete mitochondrial genomes of four species of praying mantises (Dictyoptera, Mantidae) with ribosomal second structure, evolutionary and phylogenetic analyses.

Authors:  Yan Shi; Lin-Yu Li; Qin-Peng Liu; Muhammad Yasir Ali; Zhong-Lin Yuan; Guy Smagghe; Tong-Xian Liu
Journal:  PLoS One       Date:  2021-11-04       Impact factor: 3.240

5.  Lineage Divergence of Dendrolimus punctatus in Southern China Based on Mitochondrial Genome.

Authors:  Huicong Du; Man Liu; Sufang Zhang; Fu Liu; Zhen Zhang; Xiangbo Kong
Journal:  Front Genet       Date:  2020-02-19       Impact factor: 4.599

6.  Genomic Evidence for Speciation with Gene Flow in Broadcast Spawning Marine Invertebrates.

Authors:  Shotaro Hirase; Yo Y Yamasaki; Masashi Sekino; Masato Nishisako; Minoru Ikeda; Motoyuki Hara; Juha Merilä; Kiyoshi Kikuchi
Journal:  Mol Biol Evol       Date:  2021-10-27       Impact factor: 16.240

  6 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.