
Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns.

Hui Shen1,2, Dongmei Jin1,2, Jiang-Ping Shu1,2, Xi-Le Zhou1,2, Ming Lei3, Ran Wei4, Hui Shang1,2, Hong-Jin Wei1,2, Rui Zhang1,2, Li Liu1,2, Yu-Feng Gu1,2, Xian-Chun Zhang4, Yue-Hong Yan1,2.   

Abstract

Background: Ferns, which originated about 360 million years ago, are the sister group of seed plants. Despite remarkable progress in our understanding of fern phylogeny, conflicting molecular evidence and differing morphological interpretations mean that relationships among major fern lineages remain controversial.
Results: With the aim of obtaining a robust fern phylogeny, we carried out a large-scale phylogenomic analysis using high-quality transcriptome sequencing data covering 69 fern species from 38 families and 11 orders. Both coalescent-based and concatenation-based methods were applied to both nucleotide and amino acid sequences in species tree estimation. The resulting topologies are largely congruent with each other, except for the placement of Angiopteris fokiensis, Cheiropleuria bicuspis, Diplaziopsis brunoniana, Matteuccia struthiopteris, Elaphoglossum mcclurei, and Tectaria subpedata.
Conclusions: Our results confirmed that Equisetales is sister to the rest of ferns, and that Dennstaedtiaceae is sister to eupolypods. Moreover, our results strongly supported some relationships that differ from the current view of fern phylogeny, including that Marattiaceae may be sister to the monophyletic clade of Psilotaceae and Ophioglossaceae; that Gleicheniaceae and Hymenophyllaceae form a monophyletic clade sister to Dipteridaceae; and that Aspleniaceae is sister to the rest of the groups in eupolypods II. These results were interpreted in light of morphological traits, especially sporangial characters, and a new evolutionary route of the sporangial annulus in ferns was suggested. This backbone phylogeny in ferns sets a foundation for further studies in biology and evolution in ferns, and therefore in plants.
© The Authors 2017. Published by Oxford University Press.


Keywords:  evolution; monilophytes; phylogenomic; sporangium; transcriptome


Year: 2018; PMID: 29186447; PMCID: PMC5795342; DOI: 10.1093/gigascience/gix116

Source DB: PubMed; Journal: Gigascience; ISSN: 2047-217X; Impact factor: 6.524


Background

Phylogeny, which reflects natural history, is fundamental to understanding evolution and biodiversity. Ferns (monilophytes), which originated about 360 million years (MY) ago, are the sister group of seed plants [1, 2]. With an estimated 10 578 extant species globally [3], they are the second most diverse group of vascular plants. Phylogenetic studies of ferns, especially those based on molecular evidence, have been widely carried out in recent decades. These studies have revolutionized our understanding of the evolutionary history of ferns. Milestones include establishing ferns as the sister group of seed plants [1, 2], placing Psilotaceae and Equisetaceae within ferns [2, 4, 5], and revealing a major polypod radiation following the rise of angiosperms [6, 7]. Resolution at shallow phylogenetic depths among families or genera has also improved remarkably [8-14]. However, previous research on fern phylogeny has mostly relied on plastid genes [10, 12, 13], some combined with a few nuclear genes [4, 5, 14] or morphological traits [5, 11]. Due to incomplete lineage sorting (ILS), genes from different sources often show conflicting evolutionary patterns, especially when based on a limited number of samples, and some deep relationships in fern phylogeny remain controversial (Fig. 1). In the latest PPG I system [3], which was derived from many recent phylogenetic studies, some important nodes remain uncertain: (i) what are the relationships among Marattiales, Ophioglossales, and Psilotales? (ii) are Hymenophyllales and Gleicheniales sister groups? and (iii) what are the relationships among families in eupolypods II?
Figure 1:

Topologies (a-f) adapted from published results [5, 12–14, 26, 34]. Branches with support <75% were shown using dotted lines, and taxa that differ in their phylogeny locations were shown in different colors.

Transcriptome sequencing (RNA-Seq) provides massive transcript information from the genome. When whole-genome data are lacking, phylogenetic reconstructions based on RNA-Seq are more efficient and cost-effective than traditional polymerase chain reaction–based or expressed sequence tag (EST)-based methods [15]. Successful cases in recent years include mollusks [16], insects [17], the grape family [18], angiosperms [19], and land plants, including 6 ferns [20]. Here, with the aim of reconstructing the framework of fern phylogeny, we sampled abundant fern species representing all important lineages and applied the latest phylogenomic analyses based on RNA-Seq.

To reconstruct a robust and well-resolved phylogeny in ferns, applying multiple methods of phylogenomic analysis is extremely important. Concatenation-based estimations of species trees usually have good accuracy under a low level of ILS, whereas coalescent-based methods were developed to overcome the effect of ILS but are sensitive to gene tree estimation error [21]; we therefore applied both. Nucleotide sequences, with higher variability than amino acid sequences, usually carry more useful information for phylogeny reconstruction, especially for closely related taxa. However, substitutional saturation and compositional bias in nucleotide sequences, especially at the third codon position, may lead to a deviation from the true phylogeny. Here, both nucleotide and amino acid sequences are used in phylogeny reconstruction.

Morphologically, the fern sporangium is an organ for enclosing and dispersing spores, and in most ferns it functions like a unique catapult driven by the annulus [22]. During the last century, Bower's hypothesis on the evolution of sporangia, with its focus on the annulus [23], has been one of the most important cornerstones of morphology-based fern phylogeny [24, 25]. However, this hypothesis has been challenged by somewhat conflicting frameworks of fern phylogeny [4, 10, 12, 14, 26]. A robust framework of fern phylogeny that reflects evolutionary history will improve our understanding of the evolution of fern sporangia as well as other characters.

Data Description

Taxa sampling and RNA-Seq

We chose 69 fern species from 38 families according to the PPG I system (48 fern families in total), covering all 11 orders (Equisetales, Psilotales, Ophioglossales, Marattiales, Osmundales, Hymenophyllales, Gleicheniales, Schizaeales, Salviniales, Cyatheales, and Polypodiales). Information about the location and time of sampling is given in Table S1. All the sampled species were collected with the permission of the nature reserves and Shanghai Chenshan Botanical Garden in China. Sporophylls and/or trophophylls were collected, frozen immediately in liquid nitrogen, and stored at –80°C in an ultra-low-temperature freezer before RNA extraction. Total RNA was extracted using TRIzol (Life Technologies Corp., Carlsbad, California, USA) according to the manufacturer's protocols. The RNA concentration was determined using a NanoDrop spectrophotometer, and RNA quality was assessed with an Agilent Bioanalyzer. Paired-end reads were generated by Majorbio Company (Shanghai, China) using the HiSeq 2500 system. Raw reads were deposited in NCBI [27].

Transcriptomes assembly and orthology assignment

Transcriptome data were generated from 69 fern species (Table 1). After filtering, about 2726.9 million paired-end DNA sequence reads (about 313 Gbp) were retained. We assembled these reads de novo and obtained a total of 5 449 842 contigs [28].
Table 1:

Sequencing and assembly information of the transcriptome data

ID | Species | Clean data, G | Total reads (clean) | Q30 % | Number of contigs | N50, bp | Mean, bp | Genes in Matrix 1 | Genes in Matrix 2
RS1 | Pronephrium simplex | 4.7 | 38 045 864 | 91.24 | 151 319 | 887 | 581.07 | 2168 | 1254
RS10 | Antrophyum callifolium | 4.0 | 32 745 384 | 91.76 | 64 107 | 1819 | 998.73 | 2226 | 1305
RS101 | Oleandra musifolia | 4.5 | 36 487 068 | 91.45 | 37 075 | 1493 | 919.3 | 2093 | 1248
RS103 | Woodsia polystichoides | 3.9 | 31 465 870 | 90.91 | 47 812 | 1348 | 811.3 | 2287 | 1310
RS107 | Equisetum diffusum | 4.4 | 35 693 238 | 90.21 | 88 932 | 1154 | 655.64 | 1811 | 1254
RS108 | Oreogrammitis dorsipila | 4.6 | 37 037 324 | 90.57 | 266 540 | 591 | 485.1 | 2141 | 1273
RS11 | Vandenboschia striata | 4.8 | 38 639 790 | 90.3 | 261 724 | 460 | 422.76 | 1959 | 1276
RS111 | Pleurosoriopsis makinoi | 4.8 | 38 983 796 | 90.13 | 98 187 | 1145 | 632.29 | 2182 | 1277
RS112 | Azolla pinnata subsp. asiatica | 4.4 | 35 735 206 | 90.57 | 78 295 | 1348 | 777.92 | 1418 | 839
RS114 | Taenitis blechnoides | 4.1 | 32 898 682 | 90.98 | 70 495 | 1262 | 711.3 | 2186 | 1278
RS115 | Gymnogrammitis dareiformis | 3.9 | 31 630 988 | 89.81 | 119 483 | 569 | 449.38 | 1996 | 1220
RS116 | Schizaea dichotoma | 4.5 | 36 668 734 | 89.6 | 67 422 | 1350 | 826.92 | 2035 | 1285
RS119 | Botrychium japonicum | 4.8 | 38 603 000 | 90.28 | 85 236 | 1477 | 846.97 | 1866 | 1283
RS122 | Goniophlebium niponicum | 4.8 | 38 786 214 | 90.82 | 54 152 | 1663 | 951.92 | 2279 | 1300
RS123 | Arthropteris palisotii | 4.4 | 35 646 740 | 91 | 50 700 | 1454 | 891.67 | 2286 | 1311
RS124 | Matteuccia struthiopteris | 4.2 | 34 080 998 | 90.44 | 57 514 | 1345 | 776.52 | 2290 | 1313
RS127 | Salvinia natans | 4.2 | 33 780 056 | 91.17 | 79 393 | 1379 | 767.14 | 1905 | 1173
RS128 | Woodwardia prolifera | 5.1 | 40 967 322 | 91.63 | 69 931 | 1557 | 859.72 | 2328 | 1328
RS14 | Diplazium viridescens | 4.0 | 32 320 416 | 90.46 | 88 236 | 1434 | 780.87 | 2269 | 1310
RS16 | Bolbitis appendiculata | 4.7 | 37 503 336 | 91.66 | 201 426 | 802 | 556.39 | 2226 | 1288
RS17 | Dryopteris pseudocaenopteris | 4.1 | 33 136 196 | 91.23 | 102 751 | 723 | 514.92 | 2236 | 1298
RS18 | Dicranopteris pedata | 4.2 | 33 942 120 | 92.04 | 74 011 | 1193 | 684.09 | 2031 | 1304
RS19 | Haplopteris amboinensis | 4.2 | 42 772 168 | 94.17 | 47 603 | 1713 | 1041.8 | 2249 | 1307
RS21 | Psilotum nudum | 8.5 | 85 199 034 | 93.6 | 66 212 | 1739 | 927.19 | 1741 | 1223
RS24 | Cyclopeltis crenata | 4.6 | 37 158 058 | 91.5 | 29 668 | 600 | 491.82 | 2146 | 1279
RS25 | Asplenium formosae | 4.6 | 46 629 754 | 93.5 | 73 318 | 1722 | 989.84 | 2273 | 1312
RS27 | Lomariopsis spectabilis | 4.1 | 33 233 594 | 91.77 | 98 030 | 1466 | 750.42 | 2225 | 1304
RS28 | Cheiropleuria bicuspis | 5.1 | 41 617 294 | 91.35 | 99 411 | 1435 | 832.82 | 2022 | 1295
RS31 | Plagiogyria japonica | 5.7 | 46 472 760 | 91.92 | 89 532 | 1258 | 733.9 | 2036 | 1222
RS34 | Alsophila podophylla | 4.9 | 48 768 608 | 93.43 | 66 254 | 1580 | 904.62 | 2195 | 1289
RS35 | Histiopteris incisa | 4.3 | 43 115 390 | 93.81 | 61 231 | 1749 | 985.03 | 2319 | 1316
RS36 | Pteris vittata | 4.1 | 41 212 858 | 94.37 | 76 666 | 1868 | 1021.13 | 2296 | 1312
RS37 | Cibotium barometz | 4.1 | 33 263 550 | 91.92 | 85 555 | 1612 | 891.87 | 1790 | 1099
RS38 | Osmunda japonica | 4.1 | 33 485 274 | 92.05 | 58 612 | 1730 | 901.28 | 1732 | 1159
RS39 | Loxogramme chinensis | 3.9 | 31 392 952 | 92.16 | 84 796 | 1065 | 651.88 | 2240 | 1305
RS4 | Microlepia hookeriana | 4.0 | 40 561 422 | 94.49 | 95 951 | 1610 | 874.06 | 2262 | 1301
RS41 | Pteridium aquilinum | 4.6 | 46 157 134 | 93.51 | 55 615 | 1742 | 960.37 | 2321 | 1316
RS42 | Hypolepis punctata | 4.4 | 43 828 154 | 93.56 | 59 717 | 1371 | 833.68 | 2277 | 1308
RS43 | Dicksonia antarctica | 3.9 | 31 210 608 | 91.69 | 56 494 | 1533 | 902.96 | 2045 | 1213
RS45 | Rhachidosorus mesosorus | 4.4 | 35 348 994 | 91.98 | 80 069 | 1541 | 835.92 | 2300 | 1315
RS46 | Drynaria bonii | 4.5 | 36 017 548 | 92.02 | 68 132 | 1077 | 643.93 | 2176 | 1279
RS47 | Platycerium bifurcatum | 4.1 | 33 209 740 | 91.62 | 40 456 | 1097 | 694.56 | 2148 | 1283
RS48 | Angiopteris fokiensis | 4.4 | 35 120 302 | 91.12 | 57 637 | 1629 | 932.57 | 1917 | 1306
RS5 | Diplaziopsis brunoniana | 4.3 | 34 698 846 | 91.35 | 70 184 | 822 | 541.31 | 2040 | 1234
RS50 | Dennstaedtia pilosella | 4.5 | 45 618 446 | 93.63 | 84 813 | 1582 | 831.56 | 2308 | 1313
RS51 | Monachosorum henryi | 4.1 | 41 658 504 | 93.42 | 87 832 | 1465 | 803.17 | 2255 | 1288
RS52 | Acystopteris japonica | 5.5 | 44 662 146 | 91.15 | 57 118 | 1507 | 873.59 | 1222 | 677
RS53 | Monachosorum maximowiczii | 4.8 | 48 497 004 | 93.58 | 101 448 | 1817 | 899.54 | 2257 | 1294
RS54 | Dennstaedtia scabra | 5.1 | 51 360 716 | 93.47 | 92 158 | 1565 | 845.44 | 1818 | 1056
RS56 | Arachniodes nigrospinosa | 5.1 | 50 929 362 | 94.47 | 57 168 | 1623 | 916.1 | 2332 | 1319
RS69 | Cheilanthes chusana | 5.2 | 51 851 066 | 94.18 | 49 449 | 1727 | 1012.63 | 2317 | 1324
RS7 | Elaphoglossum mcclurei | 4.1 | 32 800 248 | 92.31 | 57 330 | 1398 | 846.79 | 2267 | 1299
RS70 | Lomagramma matthewii | 4.4 | 35 218 876 | 91.21 | 65 170 | 1748 | 947.18 | 2258 | 1307
RS71 | Osmolindsaea odorata | 4.6 | 46 808 646 | 94.13 | 113 778 | 1521 | 845.96 | 2257 | 1312
RS72 | Aleuritopteris chrysophylla | 4.8 | 47 955 674 | 94.18 | 61 637 | 1669 | 929.63 | 2307 | 1322
RS77 | Marsilea quadrifolia | 4.3 | 34 724 432 | 91.76 | 65 227 | 1607 | 930.31 | 2188 | 1299
RS8 | Humata repens | 4.5 | 36 606 746 | 91.17 | 68 932 | 1267 | 690.35 | 2264 | 1315
RS81 | Tectaria subpedata | 4.2 | 42 539 482 | 94.43 | 57 384 | 1326 | 797.83 | 2128 | 1242
RS84 | Ophioglossum vulgatum | 4.4 | 35 637 330 | 91.77 | 71 821 | 1226 | 741.62 | 1631 | 1179
RS85 | Nephrolepis cordifolia | 5.0 | 40 063 236 | 90.81 | 55 207 | 1530 | 842.63 | 2302 | 1319
RS86 | Microlepia platyphylla | 4.6 | 46 324 294 | 94 | 74 956 | 1763 | 945.87 | 2267 | 1295
RS88 | Lygodium flexuosum | 4.2 | 34 098 316 | 91.44 | 66 751 | 1514 | 867.82 | 2064 | 1296
RS89 | Hypodematium crenatum | 4.1 | 32 711 798 | 91.58 | 52 813 | 1416 | 852.57 | 2298 | 1319
RS90 | Acrostichum aureum | 5.4 | 43 422 574 | 90.69 | 46 189 | 1729 | 1043.2 | 2303 | 1319
RS91 | Adiantum caudatum | 5.1 | 51 062 204 | 94.23 | 51 145 | 1575 | 950.49 | 2323 | 1327
RS92 | Parahemionitis cordata | 4.1 | 33 309 450 | 91.72 | 47 508 | 1456 | 894.42 | 2306 | 1317
RS93 | Microlepia speluncae | 4.4 | 44 124 842 | 94.55 | 94 980 | 1720 | 917.59 | 2292 | 1308
RS97 | Stenochlaena palustris | 4.7 | 37 887 642 | 91.81 | 58 416 | 1655 | 945.83 | 2300 | 1316
RS98 | Ceratopteris thalictroides | 3.9 | 31 741 082 | 91.4 | 74 728 | 1610 | 912.26 | 2231 | 1296

The number of ortholog genes used in Matrix 1 and Matrix 2 were shown.

To obtain a reliable phylogenetic relationship, we selected 4 species as the outgroup, representing the main lineages of land plants: Amborella trichopoda (representing angiosperms), Picea abies (representing gymnosperms), Selaginella moellendorffii (representing lycophytes), and Physcomitrella patens (representing bryophytes). The translated ORF (protein) sequences of these 4 species were downloaded from Phytozome [29] and used in the following analysis. To ensure the consistency of the phylogenomic analysis, we used a phylogeny-based ortholog selection method and obtained 2 subsets of 1-to-1 orthologous genes that differed in gene number and species occupancy rate, named “Matrix 1” and “Matrix 2” [30]. Matrix 1 consists of 2391 genes that are present in at least 52 taxa (75% of the 69 taxa), resulting in 2 024 565 nucleotide and 674 855 amino acid positions; the gene and character occupancy were 88% and 85%, respectively. Matrix 2 consists of 1334 genes that are present in at least 62 taxa (90% of the 69 taxa), resulting in 1 171 332 nucleotide and 390 444 amino acid positions; the gene and character occupancy reached 94% and 90%, respectively. For each orthologous gene set, coalescent-based and concatenation-based methods were applied separately to both nucleotide and amino acid sequences. A workflow diagram showing the major processes in this study is presented in Fig. 2.
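The occupancy cut-offs used to define the two matrices can be illustrated with a short calculation. The following is a minimal sketch, not the authors' pipeline: the function names and the toy gene-to-taxa mapping are hypothetical, and rounding to the nearest integer is an assumption chosen because it yields the stated thresholds of 52 and 62 taxa.

```python
from typing import Dict, Set


def min_taxa(n_taxa: int, fraction: float) -> int:
    """Minimum number of taxa a gene must cover (assumption: nearest integer)."""
    return round(n_taxa * fraction)


def filter_by_occupancy(
    gene_taxa: Dict[str, Set[str]], n_taxa: int, fraction: float
) -> Dict[str, Set[str]]:
    """Keep only genes present in at least `fraction` of the sampled taxa."""
    threshold = min_taxa(n_taxa, fraction)
    return {gene: taxa for gene, taxa in gene_taxa.items() if len(taxa) >= threshold}


def gene_occupancy(gene_taxa: Dict[str, Set[str]], n_taxa: int) -> float:
    """Fraction of filled (gene, taxon) cells in the matrix."""
    if not gene_taxa:
        return 0.0
    filled = sum(len(taxa) for taxa in gene_taxa.values())
    return filled / (len(gene_taxa) * n_taxa)


print(min_taxa(69, 0.75))  # 52, the Matrix 1 cut-off
print(min_taxa(69, 0.90))  # 62, the Matrix 2 cut-off

# Hypothetical toy input: 3 genes sampled across 4 taxa.
toy = {"g1": {"A", "B", "C", "D"}, "g2": {"A", "B"}, "g3": {"A", "B", "C"}}
kept = filter_by_occupancy(toy, n_taxa=4, fraction=0.75)
print(sorted(kept))             # ['g1', 'g3']
print(gene_occupancy(kept, 4))  # 0.875
```

As a consistency check on the reported alignment lengths, each nucleotide total is exactly three times the corresponding amino acid total (3 × 674 855 = 2 024 565 and 3 × 390 444 = 1 171 332), as expected for codon-aligned data.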
Figure 2:

A working flow diagram showing the major processes of data production and analysis in this study. Three major processes are de novo transcriptome assembly, 1-to-1 orthologs prediction, and phylogenetic analysis. The rectangles represent the main results, and the ellipses represent the main methods and analysis.


Results

Species tree estimated in 69 ferns

For each combination of reconstruction methods (coalescent-based or concatenation-based) and sequence types (nucleotide or amino acid), Matrix 1 and Matrix 2 [31, 32] always yielded the same topology. In general, the 4 topologies (Fig. 3, Figs S1, S2, S3) from a combination of methods and sequence types are consistent, except for 6 positions (Table 2). Among the topologies, the one estimated by applying a coalescent-based method to the nucleotide sequence (Fig. 3) and the one applying a concatenation-based method (Figure S2) are most congruent.
Figure 3:

Phylogeny of ferns reconstructed by the coalescent-based method using nucleotide sequences, with divergence times calculated. Support values for the main phylogeny (A) calculated from Matrix 1/Matrix 2 are listed as percentages. *Indicates 100%/100%. Representative leaf or leaves, sporangium, and the corresponding lineage are labeled with the same number. Simplified topology (B) shows the main lineages as in Fig. 1. Species in phylogeny (A) and the corresponding lineage in topology (B) are shown in the same color.

Table 2:

Inconsistent topologies using different methods and sequences

Site | Coalescent-based method, nucleotide | Coalescent-based method, amino acid | Concatenation-based method, nucleotide | Concatenation-based method, amino acid
A | (Anfo,(Pnu,(Ovu,Bja))) | (Anfo,(Pnu,(Ovu,Bja))) | ((Pnu,(Ovu,Bja)),(Anfo,a)) | ((Pnu,(Ovu,Bja)),(Anfo,a))
B | (Cbi,(Dpe,Vst)) | (Cbi,(Dpe,Vst)) | (Cbi,(Dpe,Vst)) | ((Dpe,Vst),(Cbi,a))
C | (Asfo,(Aja,(Dbr,a))) | (Asfo,(Aja,(Dbr,a))) | (Asfo,(Aja,(Dbr,a))) | (Asfo,((Aja,Dbr),a))
D | (Dvi,(Mst,(Spa,Wpr))) | ((Dvi,Mst),(Spa,Wpr)) | (Dvi,(Mst,(Spa,Wpr))) | (Dvi,(Mst,(Spa,Wpr)))
E | (Bap,(Emc,Lma)) | (Emc,(Bap,Lma)) | (Bap,(Emc,Lma)) | (Emc,(Bap,Lma))
F | (Nco,((Tsu,Apa),a)) | (Nco,(Tsu,(Apa,a))) | (Nco,((Tsu,Apa),a)) | (Nco,((Tsu,Apa),a))

(A) Anfo: Angiopteris fokiensis, Pnu: Psilotum nudum, Ovu: Ophioglossum vulgatum, Bja: Botrychium japonicum; (B) Cbi: Cheiropleuria bicuspis, Dpe: Dicranopteris pedata, Vst: Vandenboschia striata; (C) Asfo: Asplenium formosae, Aja: Acystopteris japonica, Dbr: Diplaziopsis brunoniana; (D) Dvi: Diplazium viridescens, Mst: Matteuccia struthiopteris, Spa: Stenochlaena palustris, Wpr: Woodwardia prolifera; (E) Bap: Bolbitis appendiculata, Emc: Elaphoglossum mcclurei, Lma: Lomagramma matthewii; (F) Nco: Nephrolepis cordifolia, Tsu: Tectaria subpedata, Apa: Arthropteris palisotii.

a: Indicates other sampled species within this lineage. Topologies consistent with the one yielded from coalescent-based methods and nucleotide sequences are shown in bold.
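The parenthesised topologies in Table 2 can be compared programmatically by listing the clades each one implies. The sketch below is illustrative only: the `clades` function is a hypothetical helper written for this example, it handles plain nested parentheses without branch lengths, and it treats each string as a rooted local topology.

```python
from typing import FrozenSet, Set, Tuple


def clades(topology: str) -> Set[FrozenSet[str]]:
    """Return the leaf sets of all internal nodes except the root."""
    text = topology.replace(" ", "").rstrip(";")
    found: Set[FrozenSet[str]] = set()

    def parse(pos: int) -> Tuple[FrozenSet[str], int]:
        if pos >= len(text):
            raise ValueError("unexpected end of topology string")
        if text[pos] == "(":
            leaves: Set[str] = set()
            pos += 1  # skip the opening parenthesis
            while True:
                child, pos = parse(pos)
                leaves |= child
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ",":
                    pos += 1
                elif text[pos] == ")":
                    pos += 1
                    break
                else:
                    raise ValueError(f"unexpected character {text[pos]!r}")
            node = frozenset(leaves)
            found.add(node)
            return node, pos
        # Otherwise read a leaf label up to the next delimiter.
        end = pos
        while end < len(text) and text[end] not in ",()":
            end += 1
        if end == pos:
            raise ValueError("empty leaf label")
        return frozenset([text[pos:end]]), end

    root, pos = parse(0)
    if pos != len(text):
        raise ValueError("trailing characters after topology")
    found.discard(root)  # the root clade holds every leaf and is uninformative
    return found


# Site D of Table 2: coalescent-based, nucleotide vs amino acid.
site_d_nt = "(Dvi,(Mst,(Spa,Wpr)))"
site_d_aa = "((Dvi,Mst),(Spa,Wpr))"
print(clades(site_d_nt) == clades(site_d_aa))  # False
shared = clades(site_d_nt) & clades(site_d_aa)
print([sorted(c) for c in shared])             # [['Spa', 'Wpr']]
```

For site D, both analyses recover (Spa,Wpr), but they disagree on whether Matteuccia struthiopteris groups with that pair or with Diplazium viridescens.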


Reconstruction of the evolutionary history of sporangial annulus

Our reconstruction of the evolution of sporangial annulus (Fig. 4) showed that ex-annulus sporangia are inferred to be the ancestral state (proportional likelihood [PL] = 1), and the rest of annulus states are likely derived from ex-annulus sporangia. Vertical annulus is suggested as synapomorphy for all polypod ferns (PL > 0.99). Both oblique annulus and rudimentary annulus have experienced parallel evolution.
Figure 4:

Reconstruction of the evolutionary history of sporangial annulus in ferns. Sampled species with 7 types of sporangial annulus are shown in different colours. For each ancient node, percentage of character state of sporangial annulus is shown.


Discussion

Comparison of topologies estimated by various methods

By comparing topologies estimated by coalescent-based and concatenation-based methods using both nucleotide and amino acid sequences (Table 2), we found that the topologies yielded from coalescent-based and concatenation-based methods using nucleotide sequences are mostly consistent, except for the position of Angiopteris fokiensis. Topologies yielded from coalescent-based methods using nucleotide sequences and amino acid sequences showed 3 positions of inconsistency, all of which belong to eupolypods. As eupolypods have experienced rapid evolutionary radiation in Cenozoic [7] and nucleotide sequences usually provide more information to reconstruct relationships at a shallow phylogenetic scale, we consider the topology yielded from nucleotide sequences to be more reliable. However, the inconsistent positions among topologies often show relatively lower supporting values, and are often the controversial nodes from past studies based on different genes; we suggest that such inconsistency might be caused partially by ILS and reticulate evolution.

Relationships of eusporangiate ferns

Which clade is sister to the remaining taxa in ferns is a long-debated question (Fig. 1). Our results strongly supported that Equisetales (horsetails) are the sister group to all other monilophytes. This topology confirmed the results reported by Rai and Graham [12] and Kuo et al. [33] based on plastid genes, and it was accepted by the PPG I [3] in 2016. Distinct from most fern phylogeny based on molecular evidence (Fig. 1), our results based on a coalescent method revealed that Psilotales (whisk ferns), Ophioglossales (moonworts), and Marattiales (king ferns) form a monophyletic clade as ([Psilotales, Ophioglossales], Marattiales), which is sister to leptosporangiate ferns. The monophyletic origin of Psilotales, Ophioglossales, and Marattiales, which belong to eusporangiate ferns, is supported by the structure of sporangia. Being different from the leptosporangiate type, sporangia of eusporangiate ferns have no sporangiophore; they are thick in wall and large in volume, produce large amounts of spores, and have no sporangial annulus or only have a few enlarged parenchyma cells. The incongruence between the results based on coalescent and concatenation methods may be caused by strong ILS effect, which is a main pitfall when using the concatenation method [21].

Relationship of early leptosporangiates

Within early leptosporangiates, our results revealed a new monophyletic clade in which Gleicheniaceae (forking ferns) is sister to Hymenophyllaceae (filmy ferns), which differs from the mainstream view [3, 10, 12–14, 34]. Our topology is similar to, but still different from, the topology ([(Dipteridaceae, Matoniaceae), Gleicheniaceae], Hymenophyllaceae) reported by Pryer et al. in 2004 [5]: in our results, Cheiropleuria, which belongs to Dipteridaceae and was formerly placed in Gleicheniales [2, 5, 12, 26, 35, 36], is sister to the monophyletic clade of (Gleicheniaceae, Hymenophyllaceae). This new relationship is supported by sporangial characters. Early leptosporangiates [36] are characterized by diverse sporangia and annuli. However, both Gleicheniaceae and Hymenophyllaceae have spherical sporangia with a transverse-oblique annulus, as well as a short sporangial stalk connecting to a prominent receptacle [37]. By contrast, flattened sporangia with a slightly oblique annulus are found in Cheiropleuria. Moreover, a long sporangial stalk and an inconspicuous receptacle are common in Cheiropleuria, Dipteris, and Matonia. We suggest that Dipteridaceae, probably together with its sister lineage Matoniaceae [5, 12], may be sister to the clade of (Gleicheniaceae, Hymenophyllaceae). According to our results, Gleicheniales, which comprises Dipteridaceae, Matoniaceae, and Gleicheniaceae [26], is no longer a monophyletic lineage, but a paraphyletic one.

Relationships within polypod ferns

Polypods include more than 80% of living ferns, and their phylogeny remains somewhat controversial and elusive [26, 35, 36]. Our results strongly supported that Dennstaedtiaceae instead of Pteridaceae is sister to eupolypods. This pattern confirmed the topology suggested recently by Rothfels et al. based on 25 low-copy nuclear genes [14] and Lu et al. based on plastid genes [13], as well as the PPG I system [3]. According to our results, the relationships of Pteridaceae [34, 36, 38] and Dennstaedtiaceae [36] are also well resolved. Notably, Monachosorum is sister to the rest of the members in Dennstaedtiaceae, rather than being sister to the lineage of Pteridium, Hypolepis, and Histiopteris [36]. Our results showed that eupolypods are divided into 2 major lineages, eupolypods I and eupolypods II, in agreement with the consensus opinion [3]. Within eupolypods II, our results supported that Aspleniaceae is the sister group to the rest of the members, which is different from the current viewpoint [26, 36, 39]. Within eupolypods I, our results strongly supported that Lomariopsidaceae and Nephrolepidaceae form a paraphyletic group, rather than the monophyletic clade inferred from plastid genes [10, 26, 36]. Our new topology confirmed the morphology-based hypothesis that Dennstaedtiaceae with 2 indusia, rather than Pteridaceae with 1 false indusium, is more closely related to eupolypod ferns [40]. In Pteridaceae, the unstable structure of spherical sporangia, including a variable annulus and a short sporangial stalk, indicates that these sporangial characters are relatively original and are close to those with an oblique annulus in early leptosporangiates [23]. We also noticed that the spherical sporangia with a slightly oblique annulus in Monachosorum should be more ancestral than the flattened sporangia with a typical vertical annulus in other genera of Dennstaedtiaceae.
For distinguishing eupolypods I and eupolypods II, the number and shape of the vascular bundles at the base of the petiole have been demonstrated to be a powerful diagnostic character [36, 39].

The evolution of sporangial annulus in ferns

By observing the sporangial annulus in abundant samples of each fern group and combining these characters with our well-resolved backbone phylogeny (Fig. 3), we reconstructed the evolutionary history of the sporangial annulus in ferns (Fig. 4). According to the results, we infer that ex-annulus sporangia, as in Equisetaceae, Psilotaceae, and Ophioglossaceae, represent the ancestral state in ferns. The following states have been derived from the ex-annulus state: rudimentary multiseriate annulus, which is inverse U-shaped in Marattiaceae and U-shaped in Osmundaceae; equatorial transverse-oblique uniseriate annulus, as in Gleicheniaceae and Hymenophyllaceae; oblique annulus, as in Cyatheales (tree ferns); and vertical annulus, a synapomorphy of polypods. Both apical annulus, as in Lygodium and Schizaea, and vestigial or absent annulus, as in Salviniales (aquatic ferns), are likely to have specialized in parallel from oblique annulus. Inconsistent with Bower's hypothesis [23], our results showed that sporangia with an apical annulus, as in Schizaeales, are not the ancestral type in ferns but a specialized one. Correspondingly, the oldest fossils of Schizaeaceae are now believed to date from the Jurassic period (201–145 MY BP) rather than the Carboniferous period (359–252 MY BP), as formerly thought [41].

Conclusion

Our results confirmed that Equisetales is sister to all the other monilophytes and that Dennstaedtiaceae is sister to eupolypods, which have been reported previously. Moreover, our results revealed some new relationships, such as that eusporangiate ferns, except Equisetales, may form a monophyletic clade as ([Psilotaceae, Ophioglossaceae], Marattiaceae), while Gleicheniaceae and Hymenophyllaceae form a monophyletic clade, which is sister to Dipteridaceae, and that Aspleniaceae is sister to the rest of the groups in eupolypods II. Most of these results are supported by sporangia characters, and a new evolutionary route of sporangial annulus in ferns is suggested.

Potential implications

Here, we present a robust fern phylogeny yielded from a large-scale phylogenomic analysis based on a high-quality RNA-seq dataset covering 69 fern species. This backbone phylogeny in ferns sets a foundation for further studies in biology and evolution in ferns and therefore in plants, especially when fern genomes are not available.

Methods

De novo transcriptome assembly

For each paired-end library, we first removed the Illumina adapters from raw reads using Scythe (Scythe, RRID: SCR_011844) [42] and trimmed the poor-quality bases using the DynamicTrim Perl script of the SolexaQA package with default parameters [43]. Next, de novo transcriptome assembly of each species was conducted using the Trinity package, version trinityrnaseq_r20140413 (Trinity, RRID: SCR_013048) with default parameters [44]. To discard duplicated sequences, the obtained contigs were clustered using CD-HIT-EST v4.6.1 (CD-HIT, RRID: SCR_007105) to generate nonredundant contigs. All contigs longer than 200 bp were used for downstream analysis. We used TransDecoder, a program in the Trinity package, to identify candidate coding sequences (CDS) from the contigs with default criteria. Finally, the translated protein sequences of the CDS were searched by BLASTP against the nonredundant protein database in NCBI with an e-value threshold of 1e-5. The sequences with BLASTP hits were used for further analysis.
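The contig length filter described above, and the N50 and mean length statistics reported in Table 1, can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up contig lengths; it uses the common definition of N50 and is not taken from the assembly software.

```python
from typing import Iterable, List


def filter_contigs(lengths: Iterable[int], min_length: int = 200) -> List[int]:
    """Keep contigs strictly longer than `min_length` bp."""
    return [n for n in lengths if n > min_length]


def n50(lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contigs given")
    half = sum(ordered) / 2
    running = 0
    for n in ordered:
        running += n
        if running >= half:
            return n
    return ordered[-1]  # not reached for positive lengths


# Hypothetical contig lengths in bp.
raw = [100, 250, 300, 450, 500, 1000]
kept = filter_contigs(raw)
print(kept)                   # [250, 300, 450, 500, 1000]
print(n50(kept))              # 500
print(sum(kept) / len(kept))  # 500.0 (mean contig length)
```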

Orthology assignment, alignment, and alignment masking

For orthology assignment of the 69 sample assemblies together with the 4 outgroup species, a phylogeny-based clustering method described previously [16] was used. In short, an all-vs-all BLAST search of amino acid sequences was performed across the different species, and the BLAST results were clustered using MCL [45] with the parameters -I 2 -tf 'gq(20)'. Optimization of the inflation parameter (I) was conducted as described previously [46], and the default value of 2.0 was ultimately selected. De novo assembly by Trinity produces many highly similar sequences, which include both paralogs and isoforms [47]; when a clustered gene family contains too many sequences (e.g., more than 10), the risk of contamination by isoforms rises, and the analysis becomes computationally infeasible. Hence, when a species had more than 10 sequences in a gene family, we removed all sequences of this species from this gene family. Then, groups with at least 35 (50%) fern species were aligned using the einsi command implemented in MAFFT (MAFFT, RRID: SCR_011811) [48] and trimmed by Gblocks with default parameters [49]. Next, for each group, a homologous gene tree was built with RAxML, version 8.0.20 (RAxML, RRID: SCR_006086), using the maximum likelihood (ML) method [50]. To infer orthologous genes, we used treeprune in the Agalma package [51] to mask the monophyletic sequences. We pruned the paralogous subtrees from the homologous gene trees until only 1 monophyletic subtree was retained. The resulting orthologous gene trees were then further filtered by the criterion that each species should be represented by only 1 sequence, and the resulting subset of genes was referred to as “1-to-1 orthologs,” which were largely free of gene duplication. Then, we extracted both the CDS (nucleotide sequence) and the translated amino acid sequence from each orthologous gene group, followed by aligning with MAFFT and trimming with Gblocks.
Alignments in which the coding sequences were longer than 150 bp (or the corresponding translated sequences longer than 50 amino acids) were kept for further analysis.
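The per-family filtering rules described above (dropping species with more than 10 sequences in a family, requiring at least 35 fern species, and the final 1-to-1 criterion) can be sketched in a few lines. This is a simplified illustration rather than the actual pipeline: the `(species, sequence_id)` tuple layout, the function names, and the toy data are hypothetical, while the thresholds 10 and 35 come from the text.

```python
from collections import Counter
from typing import List, Set, Tuple

Member = Tuple[str, str]  # (species, sequence_id), a hypothetical layout


def drop_overrepresented(family: List[Member], max_per_species: int = 10) -> List[Member]:
    """Remove every sequence of a species that has more than `max_per_species` copies."""
    counts = Counter(species for species, _ in family)
    return [(sp, sid) for sp, sid in family if counts[sp] <= max_per_species]


def has_enough_species(family: List[Member], ferns: Set[str], min_species: int = 35) -> bool:
    """True if the family contains at least `min_species` distinct fern species."""
    present = {sp for sp, _ in family if sp in ferns}
    return len(present) >= min_species


def is_one_to_one(family: List[Member]) -> bool:
    """True if every species in the family is represented by exactly 1 sequence."""
    counts = Counter(species for species, _ in family)
    return all(n == 1 for n in counts.values())


# Toy family: species "sp_x" has 11 copies, "sp_y" has 2, "sp_z" has 1.
toy = [("sp_x", f"x{i}") for i in range(11)] + [("sp_y", "y1"), ("sp_y", "y2"), ("sp_z", "z1")]
cleaned = drop_overrepresented(toy)
print(cleaned)                 # [('sp_y', 'y1'), ('sp_y', 'y2'), ('sp_z', 'z1')]
print(is_one_to_one(cleaned))  # False, because sp_y still has 2 sequences
```

In the study itself, the step between these filters is tree-based (paralogous subtrees are pruned from each gene tree), which this sketch does not attempt to reproduce.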

Basic Universal Single Copy Orthologs analysis

Benchmarking Universal Single-Copy Orthologs (BUSCO, RRID: SCR_015008), which uses a core set of orthologs conserved in eukaryotic species to determine the gene coverage of each assembly [52], was employed to assess the completeness of the transcriptome assemblies we obtained (Table S2) [53]. A total of 303 BUSCOs were searched against the translated amino acid sequences of the assemblies using BLASTP. Then the numbers of complete and partially matched genes in each assembly were counted. Of the 69 samples, 65 (94.2%) had a gene coverage exceeding 82%, with at least 251 complete genes identified. Unexpectedly, 1 sample (Aleuritopteris chrysophylla, named RS_72) presented an extremely low gene coverage, in which only 72 (23.8%) complete housekeeping genes were found (Supplementary Table S2). However, when this sample was deleted from the matrix used to construct the backbone of the phylogenetic tree, the topology remained unchanged, indicating that the lower completeness of this sample did not affect our results (data not shown).
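The completeness percentages quoted above follow directly from the counts. A small sketch, with a hypothetical helper name, reproduces them:

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to 1 decimal place."""
    return round(100 * part / whole, 1)


N_BUSCOS = 303   # size of the BUSCO set stated in the text
N_SAMPLES = 69   # number of fern assemblies

print(percent(251, N_BUSCOS))  # 82.8, the lower bound for the 65 well-covered samples
print(percent(72, N_BUSCOS))   # 23.8, sample RS_72
print(percent(65, N_SAMPLES))  # 94.2, share of samples above the 82% level
```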

Phylogenetic analysis

The coalescent-based species trees were reconstructed with ASTRAL v4.10.4 [54], using 100 replicates of multilocus bootstrapping [55]. Each gene tree was constructed under the PROTCATJTTF model with RAxML v8.2.4 (RAxML, RRID: SCR_006086) [50], with 100 random replicates used to calculate bootstrap values. For the concatenation analysis, we performed ML analysis of each matrix using RAxML (version 8.0.20). Branch support was evaluated using 100 bootstrap replicates. We used the “GTR + Γ4 + I” model for DNA matrices and the JTTF model for the corresponding protein matrices, selected by “ProteinModelSelection.pl” [56]. To estimate divergence times, we used the concatenated alignment of orthologs, calibrated with the ages of 2 fossils (Archaeocalamites Senftenbergia: 354 MY, Grammatopteris: 280 MY) [6, 57] as the minimum ages of monilophytes and leptosporangiate ferns, respectively, and a maximum age constraint of 500 MY for land plants, in a Bayesian relaxed-clock analysis using MCMCTREE [58] on the coalescent-based species tree.
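As an illustration of what a concatenated matrix is, the Python sketch below joins per-gene alignments into a single supermatrix and records the coordinates of each gene. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name, the dictionary layout, the use of "-" for taxa missing from a gene, and the "geneN" partition labels are all hypothetical.

```python
def concatenate_alignments(alignments, taxa, missing="-"):
    """Concatenate per-gene alignments into one supermatrix.

    alignments: list of dicts mapping taxon name -> aligned sequence; all
                sequences within one dict must have the same length.
    taxa:       ordered list of all taxon names to include.
    Returns (supermatrix, partitions), where supermatrix maps taxon ->
    concatenated sequence and partitions lists (name, start, end) with
    1-based inclusive coordinates.
    """
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for index, alignment in enumerate(alignments, start=1):
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {index}: sequences differ in length or gene is empty")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon absent from this gene is padded with the missing-data symbol.
            pieces[taxon].append(alignment.get(taxon, missing * length))
        partitions.append((f"gene{index}", start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy example with two genes; taxon "B" lacks the second gene.
genes = [{"A": "ACG", "B": "A-G"}, {"A": "TT"}]
matrix, parts = concatenate_alignments(genes, ["A", "B"])
print(matrix)
print(parts)
```

Padding absent taxa keeps every row the same length, which matters here because groups were retained when they contained at least 35 fern species, so individual genes need not include all 69 samples.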

Reconstruction of the evolution of sporangial annulus

Characters of the sporangial annulus of the sampled species were observed using a polarized light microscope (Axio Scope.A1, ZEISS) after fresh, mature sporangia were treated with sodium hypochlorite (NaClO) solution. The evolution of the sporangial annulus was reconstructed with the likelihood method implemented in Mesquite v2.7.5 [59]. All character states (i.e., vertical annulus, oblique annulus, rudimentary annulus, ex-annulus, apical annulus, transverse annulus, and vestigial annulus) were treated as unordered and equally weighted. Character evolution was reconstructed by maximum likelihood under the Markov k-state 1-parameter (Mk1) model [60]. To account for phylogenetic uncertainty, the “Trace-characters-over-trees” command was used to calculate the ancestral states at each node, including their probabilities in the context of likelihood reconstructions. For these analyses, characters were plotted onto 100 trees sampled in the ML analyses of the combined dataset using RAxML v7. The results were finally summarized as the percentage of changes of character states on a given branch among all 100 trees, using the “Average-frequencies-across-trees” option.
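Under a Markov k-state 1-parameter model, every change between two different states has the same instantaneous rate, so the transition probabilities along a branch have a simple closed form. The Python sketch below evaluates that form for the 7 annulus states listed above. It is an illustration only: the function name, the parameterization (rate per specific state-to-state change), and the example rate and branch length are assumptions, and Mesquite's own implementation is not reproduced.

```python
import math


def mk1_transition_probability(k, rate, branch_length, same_state):
    """Transition probability under an equal-rates k-state Markov model.

    rate is the instantaneous rate of change from one state to each specific
    other state (an assumed parameterization for this sketch).
    """
    decay = math.exp(-k * rate * branch_length)
    if same_state:
        return 1.0 / k + (k - 1) / k * decay
    return 1.0 / k - decay / k


K_STATES = 7  # number of annulus character states listed in the text

# Hypothetical rate and branch length, chosen only to show the behavior.
p_stay = mk1_transition_probability(K_STATES, rate=0.1, branch_length=1.0, same_state=True)
p_change = mk1_transition_probability(K_STATES, rate=0.1, branch_length=1.0, same_state=False)

print(p_stay, p_change)
print(p_stay + (K_STATES - 1) * p_change)  # probabilities over all end states
```

On a zero-length branch the state cannot change, and on very long branches every state approaches probability 1/k, which is why a single rate parameter is enough to describe the model.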

Availability of data and materials

Raw RNA-Seq reads for the 69 fern species were deposited in GenBank under BioProject accession number PRJNA281136. Transcriptome datasets, alignments, phylogenetic trees, BUSCO results, and other supporting data are available via the GigaScience repository, GigaDB [61].

Abbreviations

BUSCOs: Benchmarking Universal Single-Copy Orthologs; ILS: incomplete lineage sorting; ML: maximum likelihood; MY: million years; PPG: Pteridophyte Phylogeny Group; RNA-Seq: transcriptome sequencing.

Additional files

Additional file 1: Tables S1 and S2 and Figures S1–S3.

Competing interests

The authors declare that they have no competing interests.

Funding

This work was funded by Shanghai Landscaping and City Appearance Administrative Bureau of China, Scientific Research Grants (G142433, G152420, and F112422), and the National Natural Science Foundation of China (31370234).

Author contributions

Y.H.Y., H. Shen, and D.M.J. conceived and designed the study. M.L., J.P.S., D.M.J., R.W., and L.L. implemented the data analyses. Y.H.Y., H. Shen, H.J.W., X.L.Z., H. Shang, and Y.F.G. collected the specimens. H. Shen, R.Z., and Y.F.G. prepared the specimens for sequencing. X.L.Z. provided the anatomical data. D.M.J., H. Shen, Y.H.Y., J.P.S., M.L., R.W., H. Shang, X.L.Z., and X.C.Z. interpreted the results and wrote the manuscript.