
Genome statistics and phylogenetic reconstructions for Southern Hemisphere whelks (Gastropoda: Buccinulidae).

Felix Vaux, Simon F K Hills, Bruce A Marshall, Steve A Trewick, Mary Morgan-Richards.

Abstract

This data article provides genome statistics, phylogenetic networks and trees for a phylogenetic study of Southern Hemisphere Buccinulidae marine snails [1]. We present alternative phylogenetic reconstructions using mitochondrial genomic and 45S nuclear ribosomal cassette DNA sequence data, as well as trees based on short-length DNA sequence data. We also investigate the proportion of variable sites per sequence length for a set of mitochondrial and nuclear ribosomal genes, in order to examine the phylogenetic information provided by different DNA markers. Sequence alignment files used for phylogenetic reconstructions in the main text and this article are provided here.


Year:  2017        PMID: 29201984      PMCID: PMC5702863          DOI: 10.1016/j.dib.2017.11.021

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

Summary statistics for whole mitochondrial DNA sequences and 45S nuclear ribosomal genes are presented because such information for gastropods is currently rare, and base bias is known to influence phylogenetic inferences.
Phylogenetic reconstructions (from short-length DNA sequence data) presented here include multiple buccinid and buccinulid taxa not included in the main-text trees, and may be useful to future evolutionary studies of Neogastropoda.
DNA sequence variation and phylogenetic trees are provided because Southern Hemisphere taxa are currently under-sampled.
The proportion of variable DNA sites for a selection of mitochondrial and nuclear genes from buccinulid whelks is compared. This information can improve genetic marker selection for future molluscan studies.

Data

The data presented here originate from a phylogenetic study of Southern Hemisphere whelks [1], a group of marine snails that can be classified within the taxonomic families Buccinulidae or Buccinidae [2], [3], [4], [5], [6], [7], [8]. The classification of these gastropods depends upon a biogeographic hypothesis and an assumption of reciprocal monophyly between the majority of lineages in the Northern and Southern Hemispheres [3], [6], [7], [8]. Results from our study indicated that Buccinulidae and Southern Hemisphere whelks are not monophyletic [1].

Thirty-two putative buccinulid and buccinid marine snails, as well as three fasciolariid snails used as a phylogenetic outgroup, were high-throughput sequenced on the Illumina HiSeq 2500 platform. Sequence data were assembled to provide mitochondrial genomic (mtDNA) and 45S nuclear ribosomal DNA (rDNA) sequences for most taxa, although the entire mtDNA or rDNA could not be recovered for some individuals. These data were complemented with short-length sequence data from the mitochondrial 16S RNA and cox1 genes and the nuclear ribosomal 28S RNA gene, acquired via PCR amplification and Sanger sequencing using universal primers.

Sequence alignments used for analyses presented in the main text are attached to this paper. Using these sequence alignments, we present maximum-likelihood and Bayesian phylogenetic reconstructions for the sampled buccinulid whelks. These phylogenetic trees are alternative reconstructions that can be compared to trees presented in the main text. Splits networks are also estimated using the mtDNA genomic and nuclear ribosomal RNA (18S, 5.8S, 28S) sequence data. The proportion of variable sites per sequence length for a set of mitochondrial and nuclear ribosomal genes is also investigated, which indicates how informative each marker is for recent and distant evolutionary change in neogastropods (Fig. 1, Fig. 9).
Fig. 1

Maximum-likelihood mtDNA phylogeny of buccinid and buccinulid whelks. A maximum-likelihood derived phylogeny generated using RAxML 8.2.8 [9], based on an alignment of 31 concatenated mitochondrial genome sequences (11,128 bp incorporating protein-encoding, tRNA and rRNA genes). No partitions were used. No outgroup or monophyly was enforced for this tree. Genera putatively belonging to Buccinulidae are shown in different colours.

Fig. 2

Maximum-likelihood 45S rDNA phylogeny of buccinid and buccinulid whelks. A maximum-likelihood derived phylogeny generated using RAxML 8.2.8 [9], based on a 4667 bp alignment of 31 concatenated nuclear rDNA gene sequences (18S, 5.8S, 28S rRNA). No partitions were used. No outgroup or monophyly was enforced for this tree. Genera putatively belonging to Buccinulidae are shown in different colours.

Fig. 3

Bayesian calibrated mtDNA phylogeny of buccinid and buccinulid whelks. A Bayesian phylogeny based on an alignment of 25 concatenated mitochondrial genome sequences (incorporating protein-encoding, tRNA and rRNA genes), which has been fossil calibrated to estimate divergence dates among the whelk lineages. Two sequence partitions were used: 1) protein-encoding and rRNA genes (10,635 bp), and 2) tRNA genes (1065 bp), using the GTR + I + G and HKY + I + G substitution models respectively [10], [11]. Black stars indicate splits that were fossil calibrated. Tree root height was calibrated using the earliest known buccinoid fossils [12], and fossil calibrations were also used for the earliest Fasciolariidae (un-enforced outgroup) [13], [14], and the earliest known occurrence of the tip branch Buccinulum v. vittatum [15]. BEAST 1.8.3 [16] was used to generate this phylogeny, with an MCMC length of 100 million, a sample frequency of 1000 and a 10% burn-in. Node labels are estimated median divergence dates, with the 95% highest posterior density (HPD) range shown as a blue bar. Posterior support values are also shown at nodes, but only if support was < 1.0. Putative buccinulid genera are shown in different colours.

Fig. 4

Bayesian cox1 phylogeny of buccinid and buccinulid whelks. A Bayesian phylogeny based on a 439 bp alignment of mitochondrial cox1 gene sequences obtained from 54 individual marine snails. The GTR + I + G substitution model was used [11]. The phylogeny was produced using a Bayesian method (100 million MCMC, 10% burn-in, 1000 sample frequency, node labels are posterior support values), via BEAST 1.8.3 [16]. For this tree no outgroup was specified explicitly, but reciprocal monophyly was enforced for the Fasciolariidae and Buccinidae/Buccinulidae/Nassariidae. Genera putatively belonging to Buccinulidae are shown in different colours.

Fig. 5

Bayesian 28S RNA phylogeny of buccinid and buccinulid whelks. A Bayesian phylogeny based on an alignment of nuclear ribosomal 28S RNA gene sequences obtained from 44 individual marine snails (1486 bp). The GTR + I + G substitution model was used [11]. The phylogeny was produced using a Bayesian method (100 million MCMC, 10% burn-in, 1000 sample frequency, node labels are posterior support values), via BEAST 1.8.3 [16]. For this tree no outgroup was specified explicitly, but reciprocal monophyly was enforced for the Fasciolariidae and Buccinidae/Buccinulidae/Nassariidae. Genera putatively belonging to Buccinulidae are shown in different colours.

Fig. 6

Bayesian 16S RNA phylogeny of buccinid and buccinulid whelks. A Bayesian phylogeny based on an alignment of mitochondrial 16S RNA gene sequences obtained from 35 individual marine snails (868 bp). The GTR + I + G substitution model was used [11]. The phylogeny was produced using a Bayesian method (100 million MCMC, 10% burn-in, 1000 sample frequency, node labels are posterior support values), via BEAST 1.8.3 [16]. For this tree no outgroup was specified explicitly, but reciprocal monophyly was enforced for the Fasciolariidae and Buccinidae/Buccinulidae/Nassariidae. Genera putatively belonging to Buccinulidae are shown in different colours.

Fig. 7

Proportion of variable sites at increasingly deep levels of divergence. The proportion of variable sites per sequence length (bp) for a selection of mtDNA and nuclear rDNA genes reflects different rates of DNA substitution. Values were calculated using Geneious 9.1.3 [17]. The trends plotted effectively represent change in the phylogenetic information provided by each gene for different levels of investigation. Average numbers of variable sites were used for groups in genus and family-level comparisons. For example, we used the average number of differences for all sampled whelk (Buccinidae/Buccinulidae) taxa from all sampled Fasciolariidae taxa. Sampling from Aeneator, Buccinulum and Penion was used to estimate generic-level differences as these groups contained more than two specimens. Likewise, only P. sulcatus, P. chathamensis, and P. c. cuvierianus were used for within-species estimates as these taxa were sampled twice. Since read coverage varies for some genes, not all individuals were included in the estimates made for each gene.

Fig. 8

Splits network illustrating alternative phylogenetic signal in mtDNA sequence data for marine snails. The splits network is based on an alignment of 31 concatenated mitochondrial genome sequences (incorporating protein-encoding, tRNA and rRNA genes; 11,128 bp). Splits were generated using the Neighbor-Net algorithm in SplitsTree 4 [18]. The splits network presents a generalisation of all possible topological solutions for the phylogenetic signal contained in the underlying sequence data, but it does not quantify the likelihood of alternative phylogenetic relationships. Edge length is proportional to split weight, and box structures within the network indicate signal for alternative topologies in the underlying sequence data.

Fig. 9

Splits network illustrating alternative phylogenetic signal in 45S rDNA sequence data for marine snails. The splits network is based on a 4667 bp alignment of 31 concatenated nuclear rDNA gene sequences (18S, 5.8S, 28S rRNA genes). Splits were generated using the Neighbor-Net algorithm in SplitsTree 4 [18]. The splits network presents a generalisation of all possible topological solutions for the phylogenetic signal contained in the underlying sequence data, but it does not quantify the likelihood of alternative phylogenetic relationships. Edge length is proportional to split weight, and box structures within the network indicate signal for alternative topologies in the underlying sequence data.
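The proportion of variable sites per sequence length, averaged between groups as described in the captions above, can be illustrated with a short Python sketch. This is not the authors' procedure (their values were calculated in Geneious 9.1.3 [17]); the function names and toy sequences are hypothetical, and two choices are assumptions made here: gaps and ambiguous bases are not counted as differences, and the denominator is the full alignment length.

```python
from itertools import product


def pairwise_variable_proportion(seq_a, seq_b, skip="-?N"):
    """Proportion of aligned sites at which two sequences differ.

    Assumption: sites where either sequence has a gap or ambiguous
    base (characters in `skip`) are not counted as differences, and
    the count is divided by the full alignment length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a not in skip and b not in skip and a != b
    )
    return differences / len(seq_a)


def mean_between_groups(group_a, group_b):
    """Average the pairwise proportion over every pair drawn from two groups."""
    pairs = list(product(group_a, group_b))
    if not pairs:
        raise ValueError("both groups need at least one sequence")
    return sum(pairwise_variable_proportion(a, b) for a, b in pairs) / len(pairs)


# Toy data only: two "ingroup" sequences compared against one "outgroup" sequence.
toy_ingroup = ["ACGTACGT", "ACGTACGA"]
toy_outgroup = ["ACGAACGA"]
toy_result = mean_between_groups(toy_ingroup, toy_outgroup)
```

With the toy data, the two pairs differ at 2 of 8 and 1 of 8 sites, so the between-group average is 0.1875. A within-group estimate would average over pairs drawn from a single group instead.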

Experimental design, materials and methods

The DNA extraction, purification and sequencing methods and the routine for sequence assembly are described in the main text [1]. The main text also explains how the figures presented here were generated, including the software and settings used. Legends for the tables and figures in this article specify which sequence alignments were used (again referenced in the main text) (Table 1, Table 2).
Table 1

A summary of statistics for the length and nucleotide composition for the concatenated DNA sequences for the nuclear ribosomal RNA genes 18S, 5.8S and 28S (the internal transcribed spacer regions are not included). All listed specimens were newly sequenced for this study.

Columns from Length (bp) onwards refer to the concatenated nuclear rDNA (18S, 5.8S, 28S).

| Species | Museum ID | Length (bp) | % A | % C | % G | % T | GC bias |
|---|---|---|---|---|---|---|---|
| Pararetifusus carinatus | SFKH-TMP005 | 5337 | 23 | 24.5 | 30.0 | 22.2 | 54.5 |
| Glaphyrina caudata | SFKH-TMP004 | 5339 | 23 | 24.5 | 30.0 | 22.2 | 54.5 |
| Taron dubius | SFKH-TMP006 | 5339 | 23 | 24.7 | 30.1 | 22.0 | 54.8 |
| Austrofusus glans | SFKH-TMP014 | 5338 | 24 | 24.4 | 30.0 | 22.2 | 54.4 |
| Colus islandicus | 20140782 | 5334 | 24 | 24.5 | 30.0 | 22.2 | 54.5 |
| Volutopsius norwegicus | 20140781 | 5338 | 24 | 24.4 | 30.0 | 22.3 | 54.4 |
| Buccinum undatum | 20140783 | 5339 | 24 | 24.3 | 30.0 | 22.3 | 54.3 |
| Cominella adspersa | SFKH-TMP009 | 5339 | 23 | 24.6 | 30.0 | 22.0 | 54.6 |
| Cominella v. brookesi | SFKH-TMP010 | 5339 | 21 | 24.9 | 30.3 | 21.7 | 55.2 |
| Buccinulum fuscozonatum | M.302907/2 | 5340 | 22 | 24.8 | 30.1 | 22.0 | 54.9 |
| Buccinulum linea | SFKH-TMP016 | 5340 | 22 | 24.8 | 30.1 | 22.0 | 54.9 |
| Buccinulum v. littorinoides | SFKH-TMP011 | 5340 | 22 | 24.7 | 30.1 | 22.0 | 54.8 |
| Buccinulum pallidum | M.258277/6 | 5340 | 22 | 24.7 | 30.2 | 21.9 | 54.9 |
| Buccinulum p. finlayi | M.302870/2 | 5340 | 22 | 24.7 | 30.1 | 22.0 | 54.8 |
| Buccinulum robustum | M.314755/1 | 5340 | 22 | 24.8 | 30.1 | 21.9 | 54.9 |
| Buccinulum v. vittatum | SFKH-TMP004 | 5340 | 22 | 24.7 | 30.1 | 22.0 | 54.8 |
| Aeneator benthicolus | M.274111 | 5340 | 22 | 24.6 | 30.1 | 22.1 | 54.7 |
| Aeneator elegans | SFKH-TMP015 | 5340 | 22 | 24.7 | 30.1 | 22.0 | 54.8 |
| Aeneator otagoensis | M.279437 | 5340 | 22 | 24.7 | 30.2 | 21.9 | 54.9 |
| Aeneator recens | M.190119 | 5340 | 22 | 24.6 | 30.1 | 22.1 | 54.7 |
| Penion benthicolus | M.183832 | 5337 | 23 | 24.4 | 30.0 | 22.3 | 54.4 |
| Kelletia kelletii | KK12 | 5337 | 24 | 24.4 | 29.9 | 22.3 | 54.3 |
| Kelletia lischkei | KL2 | 5337 | 24 | 24.3 | 29.9 | 22.4 | 54.2 |
| Penion mandarinus | C.456980 | 5339 | 24 | 24.4 | 29.9 | 22.2 | 54.3 |
| Penion maximus | C.487648 | 5339 | 25 | 24.4 | 29.9 | 22.2 | 54.3 |
| Penion sulcatus | Phoenix9 | 5339 | 23 | 24.4 | 30.0 | 22.3 | 54.4 |
| Penion sulcatus | Phoenix1 | 5339 | 23 | 24.4 | 30.0 | 22.3 | 54.4 |
| Penion chathamensis | M.190085/3 | 5339 | 23 | 24.4 | 29.9 | 22.3 | 54.3 |
| Penion chathamensis | M.190082/2 | 5339 | 23 | 24.4 | 29.9 | 22.3 | 54.3 |
| Penion c. cuvierianus | M.183792 | 5339 | 23 | 24.3 | 30.0 | 22.3 | 54.3 |
| Penion c. cuvierianus | M.183927 | 5339 | 23 | 24.3 | 30.0 | 22.4 | 54.3 |
Table 2

A summary of the statistics for the length and nucleotide composition for the mitochondrial genomes newly sequenced as part of this study. Specimens marked with one asterisk (*) exhibit drops in read coverage for some small regions, for example K. kelletii has 54 bp missing from cox1. Specimens marked with two asterisks (**) have genomes with large gaps in genome coverage for some regions, such as B. v. vittatum that has 266, 151 and 64 bp missing from the ATP6, cox1 and ND2 genes respectively.

Columns from Length (bp) onwards refer to the mtDNA genome.

| Species | Museum ID | Length (bp) | % A | % C | % G | % T | GC bias |
|---|---|---|---|---|---|---|---|
| Pararetifusus carinatus | SFKH-TMP005 | 15204 | 31.5 | 14 | 15.0 | 40.1 | 28.4 |
| Glaphyrina caudata | SFKH-TMP004 | 15235 | 31.5 | 13 | 14.6 | 40.7 | 27.9 |
| Taron dubius | SFKH-TMP006 | 15189 | 29.3 | 15.7 | 17.0 | 38.0 | 32.7 |
| Austrofusus glans | SFKH-TMP014 | 15195 | 31.1 | 14.5 | 15.3 | 39.1 | 29.8 |
| Colus islandicus | 20140782 | 15158 | 30.5 | 14.9 | 15.8 | 38.7 | 30.7 |
| Volutopsius norwegicus | 20140781 | 15232 | 29.3 | 15.7 | 16.5 | 38.4 | 32.2 |
| Buccinum undatum | 20140783 | 15231 | 29.5 | 15.6 | 16.3 | 38.7 | 31.9 |
| Cominella adspersa | SFKH-TMP009 | 15251 | 30.4 | 15.7 | 16.0 | 38.0 | 31.7 |
| Cominella v. brookesi | SFKH-TMP010 | 15263 | 29.6 | 15.9 | 16.7 | 37.8 | 32.6 |
| Buccinulum fuscozonatum | M.302907/2 | 15246 | 30.2 | 14.8 | 15.8 | 39.1 | 30.6 |
| Buccinulum pallidum | M.258277/6 | 15247 | 30.9 | 14.1 | 15.2 | 39.7 | 29.3 |
| Buccinulum p. finlayi | M.302870/2 | 15247 | 30.1 | 14.8 | 15.9 | 39.1 | 30.7 |
| Buccinulum robustum | M.314755/1 | 15244 | 29.6 | 15.2 | 16.1 | 39.0 | 31.3** |
| Buccinulum v. vittatum | SFKH-TMP012 | 15244 | 29.6 | 15.1 | 16.4 | 38.9 | 31.5** |
| Aeneator benthicolus | M.274111 | 15254 | 30.4 | 14.7 | 15.7 | 39.2 | 30.4 |
| Aeneator elegans | SFKH-TMP015 | 15254 | 30.3 | 14.6 | 15.8 | 39.3 | 30.4 |
| Aeneator otagoensis | M.279437 | 15249 | 30.3 | 14.7 | 15.5 | 39.5 | 30.2* |
| Aeneator recens | M.190119 | 15264 | 30.0 | 14.9 | 16.0 | 39.1 | 30.9 |
| Aeneator valedictus | SFKH-TMP013 | 15258 | 29.3 | 15.8 | 16.7 | 38.2 | 32.5 |
| Penion benthicolus | M.183832 | 15229 | 29.5 | 16.2 | 17.0 | 37.3 | 32 |
| Kelletia kelletii | KK12 | 15104 | 29.3 | 16.0 | 17.1 | 37.6 | 31* |
| Kelletia lischkei | KL2 | 15225 | 29.6 | 16.1 | 16.8 | 37.5 | 32.9 |
| Penion mandarinus | C.456980 | 15250 | 30.4 | 15.1 | 16.2 | 38.3 | 31.3 |
| Penion maximus | C.487648 | 15249 | 30.6 | 15.1 | 16.0 | 38.2 | 31.1 |
| Penion sulcatus | Phoenix9 | 15227 | 29.2 | 16.0 | 17.2 | 37.5 | 32 |
| Penion sulcatus | Phoenix1 | 15227 | 29.2 | 16.1 | 17.2 | 37.4 | 33 |
| Penion chathamensis | M.190085/3 | 15227 | 28.6 | 16.8 | 18.0 | 36.7 | 34.8 |
| Penion chathamensis | M.190082/2 | 15228 | 28.5 | 16.8 | 18.0 | 36.7 | 34.8 |
| Penion c. cuvierianus | M.183792 | 15235 | 28.6 | 16.9 | 17.8 | 36.7 | 34.7 |
| Penion c. cuvierianus | M.183927 | 15241 | 28.3 | 17.1 | 18.0 | 36.6 | 35.1** |
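Summary statistics of the kind reported in Table 1 and Table 2 can be recomputed from a sequence with a few lines of Python. The sketch below is illustrative only: the function name is hypothetical, the "GC bias" column is interpreted here as the combined percentage of G and C (an assumption that matches most rows, for example 15.7 + 17.0 = 32.7 for Taron dubius in Table 2), and gaps or ambiguous bases are excluded from the length, which may differ from how the published lengths were counted.

```python
from collections import Counter


def base_composition(sequence):
    """Return length and percentage A, C, G, T and G+C for one DNA sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted, so gaps
    and ambiguity codes contribute neither to the length nor to the
    percentages.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    summary = {base: round(100 * counts[base] / total, 1) for base in "ACGT"}
    summary["GC"] = round(100 * (counts["G"] + counts["C"]) / total, 1)
    summary["length"] = total
    return summary


# Toy sequence only, not taken from the data set.
toy_summary = base_composition("AACCGGTTAT")
```

For the toy sequence this gives 30.0% A, 20.0% C, 20.0% G, 30.0% T and 40.0% G+C over 10 bp.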
Specifications Table

Subject area: Biology
More specific subject area: Phylogenetics; Genetics; Evolutionary Biology
Type of data: Table, text file, graph, figure
How data was acquired: High-throughput and Sanger DNA sequencing
Data format: Text file format for DNA sequence alignments and phylogenetic trees is .nex (nexus) and .tree respectively.
Experimental factors: Total DNA was extracted from specimens using CTAB buffer. DNA was paired-end sequenced using the high-throughput Illumina HiSeq 2500 platform. Short-length DNA sequences were amplified via PCR and Sanger sequenced.
Experimental features: mtDNA genome and 45S nuclear ribosomal DNA sequences were assembled using reference sequences. Sequences were aligned with gaps and poorly-aligned positions removed. Phylogenetic trees were constructed using Bayesian (BEAST 1.8.3) and Maximum-Likelihood methods (RAxML 8.2.8). The unrooted phylogenetic network of some alignments was investigated using SplitsTree 4.
Data source location: Most specimens originate from New Zealand waters; some were collected from the coasts of Australia, Japan, USA (California), and the UK.
Data accessibility: Interactive .nwk (Newick) tree files are provided here and with the main article [1].
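Because the alignments are distributed as .nex (nexus) text files, a reader may want to load them into a script. The following is a minimal sketch of a reader for a simple NEXUS MATRIX block, written with the Python standard library only. The exact layout of the published files has not been inspected here, so the example text, taxon names and function name are hypothetical. Quoted taxon names containing spaces and bracketed comments inside rows are not handled, and a dedicated phylogenetics library would be a safer choice for real work.

```python
def read_nexus_matrix(text):
    """Return {taxon: sequence} from a simple NEXUS MATRIX block.

    Assumptions: the MATRIX keyword sits on its own line, each row is
    'taxon<whitespace>sequence', and taxon names contain no spaces.
    Rows repeated for the same taxon (interleaved files) are concatenated.
    """
    sequences = {}
    in_matrix = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not in_matrix:
            if line.lower() == "matrix":
                in_matrix = True
            continue
        if not line or line.startswith("["):
            continue  # skip blank lines and whole-line comments
        ends_block = line.endswith(";")
        line = line.rstrip(";").strip()
        if line:
            parts = line.split(None, 1)
            if len(parts) == 2:
                name, residues = parts
                # Remove any internal whitespace from the sequence.
                sequences[name] = sequences.get(name, "") + "".join(residues.split())
        if ends_block:
            break  # a semicolon terminates the MATRIX command
    return sequences


# Hypothetical example text, not taken from the published files.
EXAMPLE_NEXUS = """#NEXUS
BEGIN DATA;
  DIMENSIONS NTAX=2 NCHAR=8;
  FORMAT DATATYPE=DNA MISSING=? GAP=-;
  MATRIX
  taxon_A ACGTACGT
  taxon_B ACGTAC-A
  ;
END;
"""

example_alignment = read_nexus_matrix(EXAMPLE_NEXUS)
```

The returned dictionary maps each taxon label to its aligned sequence, which can then be passed to per-sequence or per-column calculations.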

1.  Application of phylogenetic networks in evolutionary studies.

Authors:  Daniel H Huson; David Bryant
Journal:  Mol Biol Evol       Date:  2005-10-12       Impact factor: 16.240

2.  A phylogeny of Southern Hemisphere whelks (Gastropoda: Buccinulidae) and concordance with the fossil record.

Authors:  Felix Vaux; Simon F K Hills; Bruce A Marshall; Steven A Trewick; Mary Morgan-Richards
Journal:  Mol Phylogenet Evol       Date:  2017-06-29       Impact factor: 4.286

3.  Dating of the human-ape splitting by a molecular clock of mitochondrial DNA.

Authors:  M Hasegawa; H Kishino; T Yano
Journal:  J Mol Evol       Date:  1985       Impact factor: 2.395

4.  Bayesian phylogenetics with BEAUti and the BEAST 1.7.

Authors:  Alexei J Drummond; Marc A Suchard; Dong Xie; Andrew Rambaut
Journal:  Mol Biol Evol       Date:  2012-02-25       Impact factor: 16.240

5.  Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data.

Authors:  Matthew Kearse; Richard Moir; Amy Wilson; Steven Stones-Havas; Matthew Cheung; Shane Sturrock; Simon Buxton; Alex Cooper; Sidney Markowitz; Chris Duran; Tobias Thierer; Bruce Ashton; Peter Meintjes; Alexei Drummond
Journal:  Bioinformatics       Date:  2012-04-27       Impact factor: 6.937

6.  RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies.

Authors:  Alexandros Stamatakis
Journal:  Bioinformatics       Date:  2014-01-21       Impact factor: 6.937
