
Ctenophore relationships and their placement as the sister group to all other animals.

Nathan V Whelan1,2, Kevin M Kocot3, Tatiana P Moroz4, Krishanu Mukherjee4, Peter Williams4, Gustav Paulay5, Leonid L Moroz6,7, Kenneth M Halanych8.   

Abstract

Ctenophora, comprising approximately 200 described species, is an important lineage for understanding metazoan evolution and is of great ecological and economic importance. Ctenophore diversity includes species with unique colloblasts used for prey capture, smooth and striated muscles, benthic and pelagic lifestyles, and locomotion with ciliated paddles or muscular propulsion. However, the ancestral states of traits are debated and relationships among many lineages are unresolved. Here, using 27 newly sequenced ctenophore transcriptomes, publicly available data and methods to control systematic error, we establish the placement of Ctenophora as the sister group to all other animals and refine the phylogenetic relationships within ctenophores. Molecular clock analyses suggest modern ctenophore diversity originated approximately 350 million years ago ± 88 million years, conflicting with previous hypotheses, which suggest it originated approximately 65 million years ago. We recover Euplokamis dunlapae, a species with striated muscles, as the sister lineage to other sampled ctenophores. Ancestral state reconstruction shows that the most recent common ancestor of extant ctenophores was pelagic, possessed tentacles, was bioluminescent and did not have separate sexes. Our results imply at least two transitions from a pelagic to benthic lifestyle within Ctenophora, suggesting that such transitions were more common in animal diversification than previously thought.


Year:  2017        PMID: 28993654      PMCID: PMC5664179          DOI: 10.1038/s41559-017-0331-3

Source DB:  PubMed          Journal:  Nat Ecol Evol        ISSN: 2397-334X            Impact factor:   15.460


Ctenophores, or comb jellies, have successfully colonized nearly every marine environment and can be key species in marine food webs[1-6]. For example, invasive ctenophores have caused dramatic fisheries collapses by voraciously preying on native fish larvae and their food, resulting in the economic loss of millions of US dollars to impacted areas[4]. Understanding morphological and life history diversity of ctenophores in a comparative context is essential for our knowledge of ctenophore and metazoan diversification as a whole[7]. Ctenophores have received considerable attention in the debate about whether they are the sister group to all other animals[3,5,8-11], but relationships within Ctenophora have been the focus of only limited research[3,12,13]. Putative ctenophore fossils date back to the Ediacaran period[14], with substantial morphological diversity being present in the Cambrian[15,16]. All ctenophores possess smooth muscles, and at least one genus, Euplokamis, has striated muscles[17]. Most ctenophores possess tentacles (Fig. 1), but species in the genus Ocyropsis lose tentacles as adults[18] and beroids lack them throughout their life cycle (Fig. 1)[1,6]. Many species are pelagic, but some are benthic or semi-benthic as adults and can have a relatively flattened body and lose the ciliary comb rows that otherwise characterize the phylum[6,19] (Fig. 1). Relationships among ctenophore lineages remain poorly resolved, as past phylogenetic analyses have either had too few taxa to recover broad evolutionary patterns[3] or resulted in weak support for the deepest nodes, likely resulting from the use of only one or two genes[12,13]. Past researchers[12,13] have also hypothesized that Ctenophora has undergone a bottleneck in species diversity, possibly as recently as 65 MYA. However, the age of crown group ctenophores has yet to be estimated with molecular dating methods.
Here, we sequenced 27 transcriptomes from species across most of the known phylogenetic diversity of Ctenophora. New sequence data were combined with 10 ctenophore and 50 non-ctenophore publicly available transcriptomes (Supplementary Tables S1, S2) to clarify the phylogenetic placement of Ctenophora[11,20-22]. Thus, we performed analyses to determine appropriate outgroups and ctenophore placement among other metazoans using more ctenophore taxa than previous studies[3,5,9-11,20] (Supplementary Table S2).
Fig. 1

Exemplar morphological forms of Ctenophora. a) Cydippid morphology (ovate body, long tentacles); photograph taken by James Townsend. b) Lobate morphology (reduced tentacles, large lobes). c) Beroida morphology (lacking tentacles and lobes). d) Platyctenida morphology (flattened, long tentacles). e) Cestida morphology (ribbon-like); photograph taken by Roberto Pillon and contrast adjusted in Adobe Photoshop.

Results

Ctenophora is the sister lineage to all other extant metazoans

Using a variety of data filtering schemes and different substitution models to control for systematic error (Supplementary Table 2), we recovered ctenophores as the sister group to all other extant metazoans (1.00 Bayesian posterior probability (PP), 100% bootstrap support (BS); Fig. 2, Supplementary Figs. S1–S14). The percentage of individual genes favoring the hypothesis of ctenophores sister to all other animals was higher in every dataset (56.8%–75.4%; Table 1) than the percentage of genes favoring the hypothesis of sponges sister to all other animals (32.7%–43.2%; Table 1). Datasets that were trimmed of genes most likely to cause long-branch attraction had the highest percentage of genes supporting Ctenophora-sister, indicating that the Ctenophora-sister hypothesis is not a result of long-branch attraction. Our recovered placement of ctenophores does not change when concerns of Pisani et al.[20] about outgroup choice and use of site-heterogeneous models are taken into account (see additional considerations in[22,23]).
Fig. 2

Relationships among metazoans inferred with the CAT-GTR substitution model and dataset Metazoa_Choano_RCFV_strict. All nodes have 100% PP. Inferred relationships among phyla are identical to those inferred with other models and datasets (Supplementary Figs. S1–S15; Supplementary Discussion). Scale bar in expected substitutions per site. Silhouette images downloaded from phylopic.org.

Table 1

Number of genes and sites in each dataset supporting alternative hypotheses of the sister lineage to all other metazoans.

Dataset (1) | Genes supporting Ctenophora-sister | Genes supporting Porifera-sister | Sites supporting Ctenophora-sister | Sites supporting Porifera-sister
Metazoa_full | 144 (64.3%) | 80 (35.7%) | 38,378 (56.4%) | 29,684 (43.6%)
Metazoa_RCFV_relaxed | 133 (64.8%) | 72 (35.2%) | 36,255 (55.5%) | 29,072 (44.5%)
Metazoa_RCFV_strict | 70 (60.3%) | 46 (39.7%) | 22,897 (52.9%) | 20,415 (47.1%)
Metazoa_LB_relaxed | 112 (68.3%) | 52 (31.7%) | 28,642 (55.9%) | 22,554 (44.1%)
Metazoa_LB_strict | 105 (69.5%) | 46 (30.5%) | 25,875 (55.1%) | 21,071 (44.9%)
Metazoa_RCFV_LB_relaxed | 97 (65.1%) | 52 (34.9%) | 26,647 (54.3%) | 22,389 (45.7%)
Metazoa_RCFV_LB_strict | 53 (71.6%) | 21 (28.4%) | 15,194 (52.8%) | 13,558 (47.2%)
Metazoa_Choano | 144 (61.5%) | 90 (38.5%) | 41,971 (55.3%) | 33,850 (44.7%)
Metazoa_Choano_RCFV_relaxed | 111 (68.9%) | 50 (31.3%) | 32,434 (54.3%) | 27,247 (45.7%)
Metazoa_Choano_RCFV_strict | 87 (68.5%) | 40 (31.5%) | 27,257 (55.2%) | 22,131 (44.8%)
Metazoa_Choano_LB_relaxed | 104 (56.8%) | 79 (43.2%) | 33,268 (54.8%) | 27,417 (45.2%)
Metazoa_Choano_LB_strict | 156 (75.4%) | 51 (32.7%) | 29,875 (59.2%) | 20,595 (40.8%)
Metazoa_Choano_RCFV_LB_relaxed | 83 (63.8%) | 47 (36.2%) | 26,586 (54.2%) | 22,493 (45.8%)
Metazoa_Choano_RCFV_LB_strict | 56 (68.3%) | 26 (31.7%) | 17,873 (57.3%) | 13,334 (42.7%)

(1) See Supplementary Table S3 for more information on datasets.
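The percentages in Table 1 follow directly from the gene and site counts. As a minimal sketch (the function name `support_percentages` is ours and is not taken from the study's scripts), the Metazoa_full row can be recomputed as follows:

```python
def support_percentages(cteno_count, porifera_count):
    """Return the percentage of genes (or sites) favoring each hypothesis.

    The two counts are assumed to partition the dataset, as in Table 1.
    """
    total = cteno_count + porifera_count
    return (round(100 * cteno_count / total, 1),
            round(100 * porifera_count / total, 1))


# Gene counts for dataset Metazoa_full, taken from Table 1.
gene_support = support_percentages(144, 80)
# Site counts for the same dataset.
site_support = support_percentages(38378, 29684)

print(gene_support)  # (64.3, 35.7)
print(site_support)  # (56.4, 43.6)
```

The same check can be applied to any other row to compare the reported counts with the reported percentages.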

A recent study by Simion et al.[21] recovered sponges as the sister lineage to all other animals, but methodological problems in their analyses explain disagreement with our results. The placement of sponges as the sister lineage to all other animals was only recovered using the CAT-F81 substitution model (often referred to as “CAT”), which has been shown to sometimes result in less accurate phylogenetic hypotheses than models used here[24]. More problematically, not a single Bayesian analysis conducted by Simion et al.[21] converged (Simion et al. pers. communication), rendering them statistically invalid. Use of other site-heterogeneous models that may not suffer from problems associated with CAT-F81 (see[24] and Supplementary Discussion) resulted in Ctenophora sister to all other animals[21], consistent with our findings (Fig. 2, Supplementary Fig. S1–S14) and those of two recent papers that employed novel methods[25,26]. Wide consensus exists that Ctenophora is a hard lineage to place on the animal tree of life[8,11,20,21], and increased taxon sampling is broadly accepted to aid in placement of difficult lineages[27-29]. Our datasets have greater ctenophore taxon sampling than past studies, including 27 novel ctenophore transcriptomes, and are arguably the most appropriate datasets, generated to date, for assessing the placement of Ctenophora. Using datasets with reasonably high ctenophore and other non-bilaterian taxon sampling, our results strongly reject the hypothesis that sponges are the sister lineage to all other extant metazoans. Bayesian inference with a relaxed molecular clock also recovered ctenophores as the sister group to all other animals with maximum support (1.00 PP; Supplementary Figure S15, Supplementary Discussion). These analyses indicated that sampled ctenophores shared a common ancestor much more recently than either crown group sponges, cnidarians, or bilaterians (Supplementary Figure S15; Supplementary Discussion). 
Thus, our findings are consistent with the hypothesis that Ctenophora has undergone a species-diversity bottleneck, but we acknowledge uncertainty in our absolute diversification timing (350 MYA ± 88 MY, Supplementary Discussion). Nevertheless, this bottleneck appears to have occurred between 456 and 261 MYA (Supplementary Figure S15), much earlier than the 65 MYA previously hypothesized[12,13]. Given our results, ancestral ctenophores likely experienced a drastic decline before, or during, the Permian-Triassic (P-Tr) extinction (~250 MYA[30]). Early to mid Paleozoic ctenophore fossils display substantially greater morphological diversity (e.g., more than eight comb rows) than seen today[14-16], supporting the hypothesis that the phylum underwent a major diversity decline during the Paleozoic.

Evolution of Ctenophora

Relationships among ctenophores were assessed using a novel set of ctenophore-centric core orthologs. Orthology determination, subject to paralog and contamination screening, resulted in a primary dataset of 350 genes and 98,844 amino acid positions (Supplementary Table S3). Potential causes of systematic error were controlled for by creating additional datasets that removed potentially problematic genes (Supplementary Table S3)[11]. Phylogenetic analyses were conducted with data partitioning under maximum likelihood and with the CAT-GTR[31] site-heterogeneous substitution model under Bayesian inference. All phylogenetic analyses focusing on intra-ctenophore relationships resulted in identical, highly-supported relationships (Fig. 3, Supplementary Figs. S16–S19).
Fig. 3

Evolutionary relationships among Ctenophora and ancestral character state reconstruction of general body plan. Traditional orders labeled with colors matching corresponding body plan morphotype. Nodes are labelled with pie charts depicting posterior probability of character states. Phylogeny was inferred with dataset Ctenophore_RCFV_LB. Lines connect photographs of exemplars with species identity. Sponge and cnidarian outgroups that were used to root the tree were removed for illustrative purposes. Nodes have 100% BS or 1.00 PP support unless otherwise noted (BS/PP).

We found pervasive non-monophyly among currently recognized ctenophore higher taxonomic groups, including Tentaculata, Cydippida, and Lobata (Fig. 3, Supplementary Figs. S16–S19), corroborating previous analyses[3,12,13]. Other traditional groups based on morphology, like the benthic Platyctenida and atentaculate Beroida, were recovered as monophyletic (Fig. 3, Supplementary Figs. S16–S19), congruent with past analyses[3,12,13]. Lobata was paraphyletic by inclusion of Cestida, represented by the ribbon-like Cestum veneris (Fig. 3, Supplementary Fig. S20). Ocyropsis species, which lose tentacles as adults, move by muscle propulsion, and are dioecious (Supplementary Figs. S21, S22), were monophyletic and sister to a clade with Cestida and all other lobates except the benthic Lobatolampea tetragona. These results indicate that the cydippid and lobate body plans are plesiomorphic (Fig. 3, Supplementary Fig. S20). We recovered Euplokamis dunlapae as the sister lineage to all other sampled ctenophores with maximum support (Fig. 3, Supplementary Figs. S16–S19), consistent with initial genomic analyses[3]. Previous studies also recovered Mertensia ovum and Charistephane fugiens with Euplokamis dunlapae as a united sister group to all other ctenophores[13]. Novel analyses based on 18S rRNA, which included many more taxa than our transcriptome-based analyses, recovered Mertensiidae as non-monophyletic and a clade including Mertensia ovum, Charistephane fugiens, and Euplokamis spp. sister to all other extant ctenophores (see Supplementary Discussion; Supplementary Fig. S23). As we were unable to sample M. ovum and C. fugiens, we cannot reject the possibility that any one of these three species, a clade of all three, or a yet to be discovered species could be the sister lineage to all other extant ctenophores.
Euplokamis dunlapae is the only ctenophore species known to have striated muscles[2], and Bayesian ancestral state reconstruction suggests that striated muscles likely evolved after the split between E. dunlapae and other ctenophores (PP = 0.90; Supplementary Fig. S24), rather than being present in the most recent common ancestor (MRCA) of extant ctenophores. Striated muscles have evolved at least three times: after the split of the Euplokamis dunlapae lineage from other ctenophores, in select Cnidaria[32], and in bilaterians[32] (Supplementary Fig. S25). Given that all extant ctenophores have smooth muscles, the MRCA of all extant ctenophores almost certainly possessed smooth muscles (PP = 1.0; Supplementary Fig. S24). Given our inferred relationships among ctenophores, sponges, placozoans, and cnidarians (Fig. 2), the MRCA to extant metazoans either possessed smooth muscles that were subsequently lost at least twice (in Porifera and in Placozoa), or, more parsimoniously, muscles evolved independently at least twice (in Ctenophora and the lineage leading to Cnidaria + Bilateria)[3]. The MRCA of extant ctenophores was most likely pelagic (PP = 0.91; Fig. 4, Supplementary Fig. S24), with cydippid-like morphology (i.e., ovate body and branched tentacles; PP = 0.92; Fig. 3, Supplementary Fig. S20; Supplementary Discussion), and a simultaneous hermaphrodite (PP = 0.99; Supplementary Fig. S22). Ancestral state reconstruction suggests plesiomorphy of the cydippid body plan with most other morphotypes evolving from it (Fig. 3, Supplementary Fig. S20). The one exception appears to be the ribbon-like Cestida, which evolved from a lobate-like ancestor. Aside from beroids, which are atentaculate at all life stages, all ctenophores for which larval information is available have a free-swimming larval stage with cydippid-like morphology[1,33]. 
However, platyctenids, and to a lesser extent lobates and cestids, undergo considerable morphological and functional changes during development[33]. Nevertheless, juvenile morphology among all ctenophores, except the derived beroids, resembles the inferred ancestral state of extant ctenophores (Fig. 3, Supplementary Fig. S20).
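The parsimony argument about muscle origins can be illustrated with Fitch's small-parsimony algorithm. This is only a sketch and not the Bayesian ancestral state reconstruction used in the study. The rooted topology, in particular the position of Placozoa, and the coding of choanoflagellates as lacking muscles are assumptions made for illustration.

```python
def fitch(node, states):
    """Return (possible_states, minimum_changes) for a nested-tuple tree.

    Leaves are taxon names; internal nodes are 2-tuples of subtrees.
    """
    if isinstance(node, str):            # leaf: use its observed state
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                           # children agree: no change needed here
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


# Assumed topology with Ctenophora sister to all other animals (hypothetical
# placement of Placozoa; choanoflagellates serve as the outgroup).
TREE = ("Choanoflagellata",
        ("Ctenophora",
         ("Porifera",
          ("Placozoa", ("Cnidaria", "Bilateria")))))

# 1 = muscles present, 0 = muscles absent.
MUSCLES = {"Choanoflagellata": 0, "Ctenophora": 1, "Porifera": 0,
           "Placozoa": 0, "Cnidaria": 1, "Bilateria": 1}

root_states, min_changes = fitch(TREE, MUSCLES)
print(min_changes)  # 2
```

Under these assumptions the minimum is two changes. A single origin on the metazoan stem followed by losses in Porifera and Placozoa would require three events (one gain and two losses).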
Fig. 4

Evolutionary relationships of Ctenophora and ancestral character state reconstruction of benthic vs. pelagic lifestyle. Nodes (and unique taxa) are labelled with pie charts depicting posterior probability of character states. Traditional orders are labeled. a) Phylogeny was inferred with dataset Ctenophore_RCFV_LB. Sponge and cnidarian outgroups that were used to root the tree were removed for illustrative purposes. Nodes have 100% BS or 1.00 PP support unless otherwise noted (BS/PP). b) Benthic Platyctenida, Coeloplana astericola on a seastar. c) Pelagic Pleurobrachia bachei. d) Benthic Lobata, Lobatolampea tetragona.

Ancestral state reconstruction indicates that ctenophores have transitioned from a pelagic to a benthic, or semi-benthic, adult lifestyle at least twice (Fig. 4, Supplementary Fig. S25). These two transitions occurred on the branches leading to Platyctenida and to Lobatolampea, but we cannot rule out additional transitions in undescribed benthic lineages. Interestingly, Lobatolampea was recovered as the sister lineage to a clade with all other lobates and Cestida, while Platyctenida was recovered as sister to all other ctenophores but Euplokamis. Thus, the two benthic lineages evolved separately. Transition between benthic and pelagic lifestyles has been studied in numerous invertebrate groups[34], with most documented transitions occurring from a benthic to a pelagic existence. However, we found no evidence that any ancestrally benthic ctenophore lineage has evolved to occupy the water column (Fig. 4, Supplementary Fig. S25). Pleurobrachiidae is one of the most common and well-studied groups of ctenophores and is often used as a reference for the phylum[3,35]. However, Pleurobrachiidae lacks bioluminescence[35], and past uncertainty about the phylogenetic position of the family limited the ability to fully analyze evolution of bioluminescence in ctenophores[12,13]. We confidently recovered Pleurobrachiidae (i.e., Pleurobrachia and Hormiphora), plus Pukiidae, as a monophyletic lineage on a relatively long branch (Figs. 3, Supplementary Figs. S16–S19). Like Pleurobrachiidae, Pukiidae is incapable of bioluminescence. Ancestral state reconstruction suggests that the MRCA to extant ctenophores was bioluminescent (PP = 0.96; Supplementary Fig. S25; Supplementary Discussion), and this trait has likely been lost only once within Ctenophora. Bioluminescence is generally considered advantageous in deep water[36], but most pleurobrachiids are found near-shore at shallow depths[1,37,38], which may have relaxed selective pressures for maintaining bioluminescence. 
The MRCA of extant ctenophores likely fed by capturing plankton with branched tentacles equipped with colloblasts, a unique synapomorphy of ctenophores. However, multiple transitions in adult feeding mode have occurred (Fig. 5, Supplementary Fig. S20; Supplementary Discussion). These transitions are associated with lineage-specific behavioral and morphological innovations[38]. For instance, the simplification of tentacles seen in Dryodora followed by the complete loss of tentacles in Beroe (Fig. 5, Supplementary Figs. S20, S21) is associated with engulfing larger prey items, rather than using tentacles and/or lobes for food capture as in other lineages (Fig. 5, Supplementary Fig. S20); although Dryodora has tentacles, they are likely used for sensing rather than capturing prey (Supplementary Discussion). The sister relationship between Dryodora and Beroe suggests a gradual transition from branched to reduced tentacles, followed by complete loss of tentacles. More broadly, ancestral state reconstruction of feeding behaviors produced three nodes where no character state had posterior probabilities of 90% or greater (Fig. 5, Supplementary Figs. S20). Ambiguity at these nodes is associated with a clear shift away from using primarily, or only, tentacles for prey capture as adults and dramatic morphological transitions.
Fig. 5

Evolutionary relationships of Ctenophora and ancestral state reconstruction of primary feeding mode. Traditional orders are labeled. Nodes are labelled with posterior probability of character states. Phylogeny was inferred with dataset Ctenophore_RCFV_LB. Sponge and cnidarian outgroups that were used to root the tree were removed for illustrative purposes. Nodes have 100% BS or 1.00 PP support unless otherwise noted (BS/PP).

Discussion

Using greater ctenophore taxon sampling than previous studies, data filtering schemes to remove potential causes of systematic error, and a variety of substitution models, we recovered Ctenophora as the sister lineage to all other animals. The debate surrounding the phylogenetic placement of Ctenophora has complicated studies on evolution of complex characters such as muscles and neurons. Genomic components of these features suggest extensive convergent and parallel evolution across Metazoa[3], which is further supported by our phylogenetic results. However, events of independent origins of neural and muscular systems are not directly coupled with competing hypotheses of metazoan phylogeny[3,39,40]. Nevertheless, the placement of Ctenophora as the sister lineage to all other animals appears robust to error. Our results suggest that Ctenophora has undergone a species-diversity bottleneck considerably farther in the past than previously hypothesized (Supplementary Fig. S15). Subsequent diversification resulted in numerous morphotypes evolving from a cydippid-like ancestor (Fig. 3). A benthic lifestyle has evolved convergently in at least two ctenophore lineages (Fig. 4), but evolution of striated muscles, loss of bioluminescence, and loss of tentacles throughout all life cycles each appear to have occurred only once (Supplementary Figs. S20–S24). Ctenophora is in need of thorough taxonomic revision, and we expect progress to be made on that front in the coming years. Ctenophora is one of the most morphologically diverse and understudied metazoan groups, and our results provide a phylogenetic foundation for future studies on developmental, neuro-muscular, and tissue/organ evolution both within Ctenophora and among all metazoans.

Methods

Taxon sampling and sequencing

We sampled ctenophores from locations around the world (Table S1), mostly between 2013 and 2016. Ctenophore specimens were identified to the lowest taxonomic level possible (Table S1). Many newly sequenced species, particularly those sampled from Antarctica, are undescribed. Complementary DNA (cDNA) libraries for newly collected ctenophores were constructed by a template-switch method with the SMART™ cDNA library construction (Cat# 639537, Clontech). Full-length cDNA was amplified using the Advantage 2 PCR system (Cat# 639201, Clontech) and the minimum number of PCR cycles necessary for single-end sequencing for Ion Proton or 2 × 100 bp paired-end sequencing with Illumina. Illumina and Ion Proton sequencing libraries were subsequently prepared using NEBNext® Ultra™ DNA Library Prep Kit for Illumina® (Cat# E7645S, New England Biolabs Inc.) or NEBNext® Fast DNA Library Prep Set for Ion Torrent™ (Cat# E6270S, New England Biolabs Inc.). Each library was sequenced using either an Illumina NextSeq 500 or Ion Proton (see Table S1). Publicly available ctenophore and non-ctenophore transcriptomes or gene models were retrieved from NCBI and other databases (Supplementary Tables S1, S2). Bolinopsis infundibulum from Moroz et al.[3] was determined to be misidentified based on our sequencing of a novel B. infundibulum transcriptome. Thus, we now use the name “Cydippida sp. Washington, USA” for the transcriptome labeled “Bolinopsis infundibulum” in Moroz et al.[3]. We performed phylogenetic analyses at two scales to achieve different goals. First, we inferred relationships among non-bilaterian metazoan phyla (with other opisthokonts as outgroups) to determine the sister lineage to Ctenophora. Second, we analyzed relationships and trait evolution within Ctenophora using appropriate outgroups as identified with the broader Metazoa analyses. Depending on the focal taxonomic scale, different taxon sampling schemes were used (see Supplementary Tables S1, S2).
Datasets designed to examine relationships between metazoan phyla are named with the prefix “Metazoa_” followed by more specific information about the dataset as appropriate. For example, datasets with only choanoflagellates as outgroups are named “Metazoa_Choano_”. Datasets designed to test relationships among ctenophores are named in a similar fashion except they have the prefix “Cteno_”. See Supplementary Table S3, and below, for additional information about dataset naming conventions. When testing relationships among metazoan phyla, taxon sampling was similar to that of Whelan et al.[11] with three exceptions. First, fewer bilaterians were included to decrease computational time. Second, a larger number of choanoflagellates were sampled, which we expected to result in more robust rooting of Metazoa than in past analyses[3,5,8-11,24,41-43]. Finally, more ctenophores were sampled than in previous studies[3,5,8-11,24,41-43], which will likely increase the accuracy of ctenophore placement[28,29,44]. For analyses that focused on relationships among metazoan phyla, we generated datasets that only had choanoflagellate outgroups and datasets that had Ichthyosporea, Filasterea, and choanoflagellate outgroups (Supplementary Tables S2, S3). These datasets included fewer ctenophores than datasets generated to test relationships within Ctenophora because we did not include individuals that were repetitive at or near the species level (e.g., only one individual identified as Pleurobrachia bachei was included in the broader analyses; Supplementary Tables S1, S2). This was done in order to decrease required computational time. Pukia falcata was also not included in the broad metazoan analyses, despite its inclusion in ctenophore-centric phylogenetic inference, because preliminary phylogenetic inference (not shown) revealed that its inclusion caused unstable relationships among metazoan phyla. Presumably, this was due to the comparatively high amount of missing data in P. falcata. “Mertensiidae sp. (Antarctica)” was inadvertently not included in ctenophore-specific dataset generation. However, inclusion of this species would likely not have affected overall conclusions about ctenophore evolution given its inferred placement from analyses with the metazoan datasets (Fig. 2, Supplementary Figures S1–S14).
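The naming convention for the metazoan datasets can be sketched by enumerating the names listed in Table 1. The function below is a hypothetical helper, not part of the study's pipeline. It assumes that each outgroup scheme has one unfiltered dataset plus every combination of filter (RCFV, LB, or both) and cutoff stringency (relaxed or strict).

```python
def metazoa_dataset_names():
    """Enumerate the 14 metazoan dataset names that appear in Table 1."""
    names = []
    # (prefix for filtered datasets, name of the unfiltered "initial" dataset)
    schemes = (("Metazoa", "Metazoa_full"),
               ("Metazoa_Choano", "Metazoa_Choano"))
    for prefix, initial in schemes:
        names.append(initial)
        for gene_filter in ("RCFV", "LB", "RCFV_LB"):
            for stringency in ("relaxed", "strict"):
                names.append(f"{prefix}_{gene_filter}_{stringency}")
    return names


for name in metazoa_dataset_names():
    print(name)
```

The two initial datasets are handled explicitly because their names are irregular: the all-outgroup dataset carries the suffix "full", whereas the choanoflagellate-only dataset has no suffix.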

Informatics and data matrix assembly

Prior to assembly, raw transcriptome reads were digitally normalized to a target of 30× coverage using normalize-by-median.py[45] and assembled with Trinity 20140717[46] using default parameters. After assembly, open reading frames and putative protein sequences were identified with TransDecoder[46] using default parameters. We used HaMStR 13.2[47] and two core ortholog sets to recover orthologous groups (OGs) for phylogenomic analyses (Supplementary Table S3). The model organism core ortholog set packaged with HaMStR 13.2 was used for testing relationships among metazoan phyla because it was designed to be of broad taxonomic utility. For reconstructing ctenophore phylogeny, we designed a ctenophore-centric core ortholog set to increase the number of OGs in our datasets (Supplementary Table S3). The ctenophore-centric core ortholog set was created by first performing an all-versus-all blastp search[48] among transcriptomes of Beroe abyssicola, Coeloplana astericola, Euplokamis dunlapae, Mnemiopsis leidyi, Ocyropsis sp. from Florida, USA, and Pleurobrachia bachei. These species were chosen because they were hypothesized to represent a wide swath of ctenophore phylogeny and had relatively deeply sequenced transcriptomes. An e-value cutoff of 10^-5 was used for blastp searches. Blastp results were used to perform Markov clustering with OrthoMCL 2.0[49] with an inflation parameter of 2.1 following Hejnol et al.[10] and Kocot et al.[50]. Markov clustering resulted in 55,433 putative OGs. These OGs were further filtered to remove possible paralogs and low-quality OGs. First, any sequence that was less than 100 amino acids in length was removed. Each OG was then aligned with MAFFT[51] using an automatically chosen alignment strategy and a “maxiterate” value of 1,000. After alignment, an approximately maximum likelihood tree was generated for each OG with FastTree 2[52] using “slow” and “gamma” options.
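The search, alignment, and tree-building settings described above can be expressed as command-line argument lists. This is a sketch under stated assumptions, not the authors' actual invocation: the file names are placeholders, the e-value cutoff is taken to be 10^-5, tabular BLAST output (`-outfmt 6`) is our choice, and the FastTree executable may be named differently on a given system. The functions only build the commands and do not run them.

```python
def blastp_cmd(query, db, out, evalue="1e-5"):
    """All-versus-all protein search with the stated e-value cutoff."""
    return ["blastp", "-query", query, "-db", db,
            "-evalue", evalue, "-outfmt", "6", "-out", out]


def mafft_cmd(fasta):
    """MAFFT with automatic strategy selection and maxiterate 1000.

    MAFFT writes the alignment to standard output.
    """
    return ["mafft", "--auto", "--maxiterate", "1000", fasta]


def fasttree_cmd(alignment):
    """FastTree with the 'slow' and 'gamma' options on a protein alignment.

    FastTree writes the tree to standard output.
    """
    return ["FastTree", "-slow", "-gamma", alignment]


# Placeholder file names, for illustration only.
print(" ".join(blastp_cmd("six_species.fa", "six_species_db", "all_vs_all.tsv")))
print(" ".join(mafft_cmd("OG0001.fa")))
print(" ".join(fasttree_cmd("OG0001.aln")))
```

Each list can be passed to `subprocess.run`, with `stdout` redirected to a file for MAFFT and FastTree.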
Each tree and corresponding OG was processed with PhyloTreePruner[53] to screen for paralogs; a bootstrap value of 90 was used for collapsing nodes. If more than one sequence for any of the six respective species was present after the paralog pruning step, then the longest sequence for that species was retained and others were discarded. Lastly, we removed OGs that had sequences for fewer than 4 species and any OG that did not have a Mnemiopsis leidyi sequence because it was chosen as the HaMStR primer taxon. The 2,354 remaining OG alignments were used to build protein hidden Markov models using HMMER tools hmmbuild and hmmcalibrate[54]. Our ctenophore core ortholog set has been deposited on figshare (doi:xxxx.xxx). Transcriptomes and gene models were processed with HaMStR using one or both core ortholog sets (i.e., model organism or ctenophore) depending on which analyses each taxon was included in (Tables S1, S2). Post-HaMStR orthology filtering followed Whelan et al.[11] with slight script modifications to increase speed and accuracy. For datasets generated to infer relationships among Bilateria and non-Bilateria phyla, OGs were discarded if they had less than 42 taxa present for datasets generated with all outgroups and less than 38 taxa present for datasets generated with only choanoflagellate outgroups (i.e., datasets Metazoa_full and Metazoa_Choano, respectively; Table S2). For datasets designed for testing relationships among ctenophores, OGs were discarded if they had less than 27 taxa present. After orthology filtering of each dataset, single gene trees were generated with RAxML 8.2.4[55] using a gamma distribution to model rate heterogeneity and amino acid substitution models identified by model testing implemented in RAxML. We performed 100 fast bootstrap replicates for each gene tree to assess nodal support. 
Resulting gene trees were used with TreSpEx[56] for more thorough screening of paralogs and contamination that may have passed through initial orthology determination. Briefly, we used the BLAST-associated method in TreSpEx with the packaged Capitella teleta and Helobdella robusta blast databases following Struck[56] and Whelan et al.[11]. All sequences identified as certain or uncertain paralogs by TreSpEx (such sequences may also be non-target sequence contamination) were removed from OGs. Subsequently, OGs that then had fewer than 42 taxa for dataset Metazoa_full, 38 taxa for dataset Metazoa_Choano, and 27 taxa for dataset Ctenophore_full after paralog pruning with TreSpEx were also discarded to minimize missing data. For clarity, datasets Metazoa_full, Metazoa_Choano, and Ctenophore_full are herein referred to as “initial” datasets that were then filtered for OGs that had the highest potential for causing systematic error.
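The sequence-level and occupancy filters described in this section can be sketched as follows. The code is illustrative only: it omits alignment, tree inference, and paralog screening with PhyloTreePruner and TreSpEx, and the function names and data layout are our assumptions rather than the study's scripts.

```python
def filter_core_og(og, min_len=100, min_species=4, primer="Mnemiopsis leidyi"):
    """Filter one candidate core orthologous group (OG).

    og: list of (species, protein_sequence) pairs.
    Returns {species: sequence}, or None if the OG should be discarded.
    """
    longest = {}
    for species, seq in og:
        if len(seq) < min_len:                      # drop short sequences
            continue
        if species not in longest or len(seq) > len(longest[species]):
            longest[species] = seq                  # keep longest per species
    # Discard OGs with too few species or without the primer taxon.
    if len(longest) < min_species or primer not in longest:
        return None
    return longest


# Minimum taxon occupancy for the three initial datasets.
MIN_TAXA = {"Metazoa_full": 42, "Metazoa_Choano": 38, "Ctenophore_full": 27}


def passes_occupancy(taxa_present, dataset):
    """True if an OG retains enough taxa to be kept in the given dataset."""
    return len(set(taxa_present)) >= MIN_TAXA[dataset]


example_og = [
    ("Mnemiopsis leidyi", "M" * 120),
    ("Mnemiopsis leidyi", "M" * 150),      # longer copy is the one retained
    ("Beroe abyssicola", "M" * 100),
    ("Euplokamis dunlapae", "M" * 110),
    ("Pleurobrachia bachei", "M" * 99),    # shorter than 100 residues: removed
    ("Coeloplana astericola", "M" * 130),
]
kept = filter_core_og(example_og)
print(sorted(kept))
```

In this toy example four species remain, including the primer taxon, so the OG is kept.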

Systematic Error

To assess the effect of systematic error on phylogenetic inference, we generated datasets with potential sources of systematic error removed from the initial datasets. Specifically, genes with the highest potential for causing long-branch attraction (LBA) or that had the highest levels of base compositional heterogeneity were removed. By creating nested datasets with different potential causes of systematic error removed, we were able to assess if inferred relationships were influenced by systematic error. Branch length heterogeneity scores (LB), which can be used to rank genes based on their possible contribution to LBA, were calculated using TreSpEx. This was done with individual trees for each OG in the three initial datasets; new trees for each paralog-pruned OG were inferred with RAxML as described above. Density plots of LB score heterogeneity and upper quartile LB score for each OG and dataset were plotted using R[57] (Supplementary Fig. S26). The two datasets designed to test relationships among metazoan phyla (i.e., Metazoa_full, Metazoa_Choano) had fewer genes than the ctenophore-centric dataset. Thus, to strike a balance between removing OGs that may cause systematic error and not having enough phylogenetic signal (i.e., OGs) to accurately resolve relationships, we identified a strict and a relaxed cutoff for removing genes with outlier LB scores (Supplementary Fig. S26). For the ctenophore-centric dataset, we only identified one set of genes as outliers (Supplementary Fig. S26). Using the initial datasets, nested datasets were generated by removing genes that were identified as having outlier LB scores (Supplementary Table S3). Relative Composition Frequency Variability (RCFV)[58], which is a measure for how much base compositional heterogeneity is present in an OG, was calculated for each gene using BaCoCa[59]. A density plot of RCFV for each initial dataset was plotted in R (Supplementary Fig. S26).
As with LB scores, for datasets Metazoa_full and Metazoa_Choano two sets of outliers were identified and removed to create datasets with all outlier RCFV genes removed (i.e., strict) and some outlier RCFV genes removed (i.e., relaxed; Supplementary Fig. S65, Table S3). Only a single set of RCFV outlier genes was identified for dataset Cteno_full (Supplementary Fig. S26, Table S3). We also created datasets that had both LB and RCFV outlier genes removed from the initial three datasets (Supplementary Fig. S26, Table S3). For the ctenophore-centric datasets, we created corresponding datasets with outgroups removed to test whether relationships among ctenophores were affected by relatively distantly related outgroups.
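To make the compositional heterogeneity measure concrete, the following Python sketch computes RCFV for one alignment using the commonly cited definition: the sum over taxa and character states of the absolute deviation of each taxon's state frequency from the across-taxon mean frequency, divided by the number of taxa. This is an assumption-laden illustration and not BaCoCa itself; the function name `rcfv`, the input layout, and the treatment of gap and ambiguity characters are hypothetical choices.

```python
from collections import Counter


def rcfv(alignment, ignored="-?X"):
    """Relative Composition Frequency Variability of one alignment.

    alignment: dict mapping taxon name to its aligned sequence (hypothetical
               layout). Characters listed in `ignored` (gaps, unknowns) are
               skipped; this handling is an assumption of the sketch.
    """
    # Per-taxon character frequencies.
    freqs = {}
    for taxon, seq in alignment.items():
        residues = [c for c in seq.upper() if c not in ignored]
        if residues:
            counts = Counter(residues)
            freqs[taxon] = {c: n / len(residues) for c, n in counts.items()}
    if not freqs:
        raise ValueError("alignment contains no scorable characters")

    n_taxa = len(freqs)
    alphabet = {c for f in freqs.values() for c in f}

    # Mean frequency of each character state across taxa.
    mean = {c: sum(f.get(c, 0.0) for f in freqs.values()) / n_taxa
            for c in alphabet}

    # Sum of absolute deviations from the mean, scaled by taxon number.
    total = sum(abs(f.get(c, 0.0) - mean[c])
                for f in freqs.values() for c in alphabet)
    return total / n_taxa


if __name__ == "__main__":
    homogeneous = {"a": "ACAC", "b": "CACA"}
    heterogeneous = {"a": "AAAA", "b": "CCCC"}
    print(rcfv(homogeneous), rcfv(heterogeneous))
```

Genes could then be ranked by this value and the upper tail of the distribution flagged as outliers; where that cutoff falls is a judgment made from the density plots and is not encoded here.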

Phylogenetic reconstructions

Bayesian inference with the site-heterogeneous CAT-GTR substitution model was done with PhyloBayes MPI[60]. Analyses with CAT-GTR are notoriously time-consuming[24], so a number of steps were taken to facilitate convergence of independent Bayesian runs. First, only two datasets were analyzed with CAT-GTR: dataset Metazoa_Choano_RCFV_strict for testing relationships among metazoan phyla and dataset Cteno_RCFV_LB for determining relationships among ctenophore lineages. We removed three ctenophore taxa from dataset Metazoa_Choano_RCFV_strict to facilitate convergence; these three ctenophores were unstable in preliminary CAT-GTR analyses that failed to converge (see Supplementary Tables S1–S3). For CAT-GTR analyses on both datasets, two independent chains were sampled every generation. Trace plots of Markov chain Monte Carlo (MCMC) runs were visually inspected in Tracer 1.6[61] to assess stationarity and appropriate burn-in, which was determined to be 3,500 and 4,000 generations for datasets Metazoa_Choano_RCFV_strict and Cteno_RCFV_LB, respectively. PhyloBayes runs were sampled for 18,436 generations on dataset Metazoa_Choano_RCFV_strict and for 23,947 generations on dataset Cteno_RCFV_LB. All parameters and tree shape reached convergence, which was considered to have occurred when the maxdiff value was < 0.1 as measured by bpcomp[60] and when the rel_diff value was < 0.3 and effective sample size was > 50 as measured by tracecomp[60]. Although some have advocated for using CAT-F81 when CAT-GTR is deemed computationally prohibitive[20,62], Whelan and Halanych[24] recently showed that CAT-F81 can result in critically inaccurate trees. Thus, tree inference was not done with the CAT-F81 model on datasets that would have been too computationally demanding for analyses with CAT-GTR. Maximum likelihood trees for each dataset were inferred with site-homogeneous amino acid substitution models coupled with data partitioning[63].
Best-fit partitions and amino acid substitution models for each dataset were inferred with PartitionFinder 2.0[64] using 20% relaxed clustering[65], the rcluster_f command, and the Bayesian information criterion. Maximum likelihood phylogenetic inference using best-fit partitions and amino acid substitution models was done with RAxML 8.2.4[55]. A discrete gamma distribution with four categories was used on each partition for modeling rate heterogeneity. Nodal support was assessed with 100 fast bootstrap replicates. Files with best-fit partitions and models for each dataset have been deposited on FigShare (doi: XXXX).
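The convergence criteria stated for the CAT-GTR runs (maxdiff < 0.1, rel_diff < 0.3, effective sample size > 50) can be expressed as a small check. The Python sketch below is illustrative only: it does not call bpcomp or tracecomp, and the function names and input layouts are hypothetical. The `maxdiff` helper follows the usual meaning of that statistic, the largest difference in bipartition frequency between two chains.

```python
def maxdiff(freqs_a, freqs_b):
    """Largest absolute difference in bipartition frequency between chains.

    freqs_a, freqs_b: dicts mapping a bipartition label to its posterior
    frequency in one chain (hypothetical layout); a bipartition missing
    from a chain is treated as having frequency 0.
    """
    keys = set(freqs_a) | set(freqs_b)
    return max(
        (abs(freqs_a.get(k, 0.0) - freqs_b.get(k, 0.0)) for k in keys),
        default=0.0,
    )


def converged(maxdiff_value, rel_diffs, ess_values,
              maxdiff_cut=0.1, rel_diff_cut=0.3, ess_cut=50):
    """Apply the three thresholds quoted in the text."""
    return (
        maxdiff_value < maxdiff_cut
        and all(r < rel_diff_cut for r in rel_diffs)
        and all(e > ess_cut for e in ess_values)
    )


if __name__ == "__main__":
    chain1 = {"AB|CDE": 0.95, "ABC|DE": 0.40}
    chain2 = {"AB|CDE": 1.00, "ABC|DE": 0.46}
    md = maxdiff(chain1, chain2)
    print(md, converged(md, rel_diffs=[0.10, 0.25], ess_values=[120, 75]))
```

All three conditions must hold at once, so a run with good topological agreement but a single poorly mixed parameter would still be reported as not converged.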

Measuring support for competing hypotheses of non-bilaterian relationships

The number of genes and sites favoring each of the two competing hypotheses (sponges sister to all other extant metazoans, and ctenophores sister to all other metazoans) was assessed under a maximum likelihood framework. For each metazoan dataset, site-wise likelihood scores were inferred for both hypotheses with RAxML 8.2.4 (option -f G). The same partitioning schemes and models utilized in the original tree inference were used. The two different phylogenetic hypotheses passed to RAxML (via -z) were the tree inferred with RAxML (i.e., the ctenophore-sister tree) and the corresponding tree that was modified to have sponges sister to all other metazoans; constraints were made by modifying the original tree in Mesquite 3.2[66]. The number of genes and sites supporting each hypothesis was calculated with RAxML output and Perl scripts from Shen et al.[26].
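The counting step can be illustrated with a short Python sketch. It assumes the site-wise log-likelihoods for the two trees have already been parsed into two equal-length lists and that gene boundaries are known as half-open index ranges; the function name `count_support` and these input layouts are hypothetical, and the sketch does not reproduce the Perl scripts of Shen et al. A site favors the tree with the higher site log-likelihood, and a gene favors the tree with the higher summed log-likelihood across its sites.

```python
def count_support(site_lnl_a, site_lnl_b, gene_ranges):
    """Count sites and genes favoring tree A versus tree B.

    site_lnl_a, site_lnl_b: per-site log-likelihoods under each tree.
    gene_ranges: dict mapping gene name to (start, end) half-open indices
                 into the concatenated alignment (hypothetical layout).
    Ties are counted for neither tree.
    """
    if len(site_lnl_a) != len(site_lnl_b):
        raise ValueError("site likelihood vectors differ in length")

    result = {"sites_a": 0, "sites_b": 0, "genes_a": 0, "genes_b": 0}

    # Site-wise comparison.
    for a, b in zip(site_lnl_a, site_lnl_b):
        if a > b:
            result["sites_a"] += 1
        elif b > a:
            result["sites_b"] += 1

    # Gene-wise comparison: sum site log-likelihoods within each gene.
    for start, end in gene_ranges.values():
        delta = sum(site_lnl_a[start:end]) - sum(site_lnl_b[start:end])
        if delta > 0:
            result["genes_a"] += 1
        elif delta < 0:
            result["genes_b"] += 1

    return result


if __name__ == "__main__":
    lnl_cteno = [-1.0, -2.0, -3.0, -1.5]   # toy values, tree A
    lnl_sponge = [-1.2, -1.9, -2.0, -1.0]  # toy values, tree B
    genes = {"gene1": (0, 2), "gene2": (2, 4)}
    print(count_support(lnl_cteno, lnl_sponge, genes))
```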

Molecular Clock Analyses

Past authors have hypothesized that Ctenophora underwent a species bottleneck, possibly as recently as 65 MYA[12,13,67]. However, the bottleneck hypothesis has not been tested with molecular clock methods. BEAST 2[68] is a well-tested and widely used program that implements molecular clock models, but analyses with amino acids can be prohibitively slow. Thus, for molecular clock analyses, we used our smallest dataset, Metazoa_Choano_RCFV_LB_strict. We also trimmed the same taxa that were deemed unstable for analyses with CAT-GTR (see above and Table S2). As before, amino acid substitution models and best-fit partitions were inferred with PartitionFinder using 20% relaxed clustering with the rcluster_f command. The best-fit number of relaxed molecular clock models for use in BEAST 2 was inferred with ClockstaR[70] using default parameters. One molecular clock was inferred to be most appropriate for this dataset. A relaxed molecular clock with a lognormal distribution[71] and a Yule tree model was used. A calibration was placed on the node representing the most recent common ancestor (MRCA) of Metazoa using a normal distribution with a mean of 750 MYA and a standard deviation of 35 following the findings of dos Reis et al.[72]; monophyly of Metazoa was enforced. We only used one calibration point for the molecular clock analysis, even though this may result in inaccurate absolute branching time estimates. We attempted to perform analyses with a greater number of node-age calibrations (e.g., for sponges, cnidarians, and bilaterian lineages; see Supplementary Table S4)[72], but Bayesian analyses failed to show evidence of convergence after over four months of run time. However, a single calibration point still allows for inference of relative timing of extant ctenophore diversification compared to better studied lineages and lineages with better fossil records.
Thus, even if absolute timing of diversification events is imprecise in our molecular clock tree inference, we can analyze the inferred timing of ctenophore diversification relative to well-studied diversification events where timing of diversification is reasonably well known (e.g., Bilateria, protostomes) to estimate the age of the extant ctenophore MRCA. Molecular clock analyses with BEAST 2 consisted of two independent runs with 27,246,750 MCMC generations sampled every 250 generations. Trace plots were viewed in Tracer, and burn-in was visually determined (12% for run 1 and 50% for run 2). Convergence was checked and confirmed by comparing trace plots in Tracer, making sure that the effective sample size of each parameter was greater than 50 and that stationarity appeared to have been achieved; most parameters had effective sample sizes well in excess of 200. A maximum clade credibility tree with median heights was calculated using TreeAnnotator[68]. Bayesian inference using a molecular clock resulted in identical branching patterns among phyla as analyses with RAxML and PhyloBayes (e.g., Ctenophora sister to all other animals PP = 1.00; Supplementary Figs. S1–S15).
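Two of the numerical settings in this section can be unpacked with simple arithmetic: the spread implied by the normal calibration prior on the metazoan MRCA (mean 750 MYA, standard deviation 35) and the number of posterior samples left in each BEAST 2 run after burn-in. The Python sketch below is a back-of-the-envelope illustration; the function names are hypothetical, and the sample count ignores the initial state that BEAST 2 also logs, so real log files may differ by one line.

```python
from statistics import NormalDist


def calibration_interval(mean=750.0, sd=35.0, mass=0.95):
    """Central interval of a normal node-age prior (ages in MYA)."""
    prior = NormalDist(mu=mean, sigma=sd)
    tail = (1.0 - mass) / 2.0
    return prior.inv_cdf(tail), prior.inv_cdf(1.0 - tail)


def retained_samples(generations, sample_every, burnin_fraction):
    """Number of logged samples kept after discarding a burn-in fraction."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total - discarded


if __name__ == "__main__":
    low, high = calibration_interval()
    print(f"95% prior interval: {low:.1f} to {high:.1f} MYA")
    for run, burnin in (("run 1", 0.12), ("run 2", 0.50)):
        kept = retained_samples(27_246_750, 250, burnin)
        print(run, kept)
```

Under these assumptions, each run logs 108,987 samples, and the 95% prior interval on the calibrated node spans roughly 681 to 819 MYA.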

Ancestral State Reconstruction

We performed ancestral state reconstruction for the following traits: 1) general body plan (i.e., “cydippid-like”, “lobata-like”, Platyctenida, Cestida; Fig. 3, Supplementary Fig. S20), 2) primary food capture mode (i.e., with tentacles, with body lobes, and engulfing prey with a comparatively large mouth; Supplementary Fig. S20), 3) presence/absence of tentacles as adults (Supplementary Fig. S21), 4) presence/absence of dioecy (Supplementary Fig. S22), 5) presence/absence of striated muscles (Supplementary Fig. S23), 6) presence/absence of smooth muscle (Supplementary Fig. S23), 7) pelagic vs. benthic/semi-benthic lifestyle (Supplementary Fig. S24), 8) ability to bioluminesce (Supplementary Fig. S24), 9) presence/absence of tentacles throughout life cycles (Supplementary Fig. S21). Characteristics were assigned using previous descriptive work[2,6,17,19,37,38,73-79] and/or personal observations of individuals we collected (see Supplementary Table S5). Additional information about trait assignment can be found in Supplementary Discussion. Phylogenetic signal of each trait was measured with Blomberg’s K[80], using the phytools 0.5–10[81] package in R[57]; each trait had significant phylogenetic signal (p < 0.05). Stochastic mapping of character evolution, a Bayesian method for ancestral state reconstruction[82,83], was done to generate character state joint probabilities on the phylogeny inferred with dataset Cteno_RCFV_LB. This was done in R using phytools 0.5–10. Uncertainty in relationships was ignored because the only uncertain nodes were those at the tips among closely related taxa with identical character states (Fig. 3, Supplementary Figs. S16–S19; Table S5). Analyses that incorporated uncertainty in branch lengths were effectively the same as those that ignored uncertainty (Supplementary Discussion). For ancestral state reconstruction, Cydippida sp. from Friday Harbor was removed because this species was labeled as Bolinopsis infundibulum in Moroz et al.[3], and we could not confidently assign character states given the misidentification. The larval ctenophore specimen (Ctenophora sp.) was also removed because many character states that would be present only in adults were undetermined. These tips were removed from trees using the R package ape[84]. Outgroups were removed from all stochastic mapping analyses except presence/absence of striated and smooth muscle. The best-fit model of character evolution to be used for stochastic mapping was determined by fitting an equal rates model, a symmetrical model, and an all rates different model to each character state dataset using the R package geiger[85]; the corrected Akaike information criterion was used to determine the best-fit model for each respective character dataset. For each analysis, the prior probability of the root’s character state was estimated directly from the data and Bayesian MCMC was used to generate a posterior probability distribution for the character transition matrix. With these parameters, 1,000 stochastic maps were generated for each trait. Evolution of traits was visualized by displaying pie charts of posterior probabilities for each character state on every node.
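The model comparison step (equal rates, symmetrical, and all rates different, ranked by corrected AIC) can be sketched as follows. This Python illustration is not the geiger implementation: the function names are hypothetical, the log-likelihood values in the usage example are invented placeholders, and the sample size is assumed to be the number of tips in the tree. The parameter counts follow the standard definitions for a character with m states: 1 rate for equal rates, m(m-1)/2 for symmetrical, and m(m-1) for all rates different.

```python
def n_rate_params(model, n_states):
    """Number of free transition rates for a discrete-character model."""
    if model == "ER":   # equal rates
        return 1
    if model == "SYM":  # symmetrical
        return n_states * (n_states - 1) // 2
    if model == "ARD":  # all rates different
        return n_states * (n_states - 1)
    raise ValueError(f"unknown model: {model}")


def aicc(log_lik, k, n):
    """Corrected AIC: -2 lnL + 2k + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("sample size too small for the AICc correction")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def best_model(log_liks, n_states, n_tips):
    """Return the model with the lowest AICc and all AICc scores.

    log_liks: dict mapping model name ("ER", "SYM", "ARD") to the fitted
              log-likelihood. n_tips is used as the sample size, which is
              an assumption of this sketch.
    """
    scores = {
        model: aicc(ll, n_rate_params(model, n_states), n_tips)
        for model, ll in log_liks.items()
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Placeholder log-likelihoods for a three-state trait on 35 tips.
    fits = {"ER": -20.0, "SYM": -19.5, "ARD": -18.0}
    model, scores = best_model(fits, n_states=3, n_tips=35)
    print(model, scores)
```

For a binary trait the equal rates and symmetrical models have the same single parameter, so they receive identical AICc scores whenever their likelihoods match.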
  60 in total

1.  EMBOSS: the European Molecular Biology Open Software Suite.

Authors:  P Rice; I Longden; A Bleasby
Journal:  Trends Genet       Date:  2000-06       Impact factor: 11.639

2.  A molecular phylogenetic framework for the phylum Ctenophora using 18S rRNA genes.

Authors:  M Podar; S H Haddock; M L Sogin; G R Harbison
Journal:  Mol Phylogenet Evol       Date:  2001-11       Impact factor: 4.286

3.  Mapping mutations on phylogenies.

Authors:  Rasmus Nielsen
Journal:  Syst Biol       Date:  2002-10       Impact factor: 15.683

4.  Stochastic mapping of morphological characters.

Authors:  John P Huelsenbeck; Rasmus Nielsen; Jonathan P Bollback
Journal:  Syst Biol       Date:  2003-04       Impact factor: 15.683

5.  Testing for phylogenetic signal in comparative data: behavioral traits are more labile.

Authors:  Simon P Blomberg; Theodore Garland; Anthony R Ives
Journal:  Evolution       Date:  2003-04       Impact factor: 3.694

6.  Increased taxon sampling greatly reduces phylogenetic error.

Authors:  Derrick J Zwickl; David M Hillis
Journal:  Syst Biol       Date:  2002-08       Impact factor: 15.683

7.  A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process.

Authors:  Nicolas Lartillot; Hervé Philippe
Journal:  Mol Biol Evol       Date:  2004-03-10       Impact factor: 16.240

8.  APE: Analyses of Phylogenetics and Evolution in R language.

Authors:  Emmanuel Paradis; Julien Claude; Korbinian Strimmer
Journal:  Bioinformatics       Date:  2004-01-22       Impact factor: 6.937

9.  OrthoMCL: identification of ortholog groups for eukaryotic genomes.

Authors:  Li Li; Christian J Stoeckert; David S Roos
Journal:  Genome Res       Date:  2003-09       Impact factor: 9.043

10.  Relaxed phylogenetics and dating with confidence.

Authors:  Alexei J Drummond; Simon Y W Ho; Matthew J Phillips; Andrew Rambaut
Journal:  PLoS Biol       Date:  2006-03-14       Impact factor: 8.029

