Ana Maria R Almeida, Alma Piñeyro-Nelson, Roxana B Yockteng, Chelsea D Specht.
Abstract
The advancement of next-generation sequencing (NGS) technologies has revolutionized our ability to generate large quantities of data at a genomic scale. Despite great challenges, these new sequencing technologies have empowered scientists to explore various relevant biological questions in non-model organisms, even in the absence of a completely sequenced reference genome. Here, we analyzed whole flower transcriptome libraries from exemplar species across the monocot order Zingiberales, using a comparative approach to gain insight into the evolution of the molecular mechanisms underlying flower development in the group. We identified 4,153 coding genes shared by all floral transcriptomes analyzed, and 1,748 genes that are only retrieved in the Zingiberales. We also identified 666 genes that are unique to the ginger lineage, and 2,001 that are only found in the banana group, while in the outgroup species Dichorisandra thyrsiflora J.C. Mikan (Commelinaceae) we retrieved 2,686 unique genes. It is possible that some of these genes underlie lineage-specific molecular mechanisms of floral diversification. We further discuss the nature of these lineage-specific datasets, emphasizing conserved and unique molecular processes, with special emphasis on the Zingiberales. We also briefly discuss the strengths and shortcomings of de novo assembly for the study of developmental processes across divergent taxa from a particular order. Although this comparison is based exclusively on coding genes, with particular emphasis on transcription factors, we believe that the careful study of other regulatory mechanisms, such as non-coding RNAs, might reveal new levels of complexity, which were not explored in this work.
Keywords: Floral evo-devo; Floral evolution; Floral transcriptomes; Ginger transcriptomes; Monocot flower
Year: 2018 PMID: 30155368 PMCID: PMC6110254 DOI: 10.7717/peerj.5490
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Evolution of floral morphology in the Zingiberales.
(A) Most recent Zingiberales phylogeny (modified from Sass et al., 2016). Zingiberales families are divided into the banana group, a paraphyletic assemblage of early branching lineages, and the ginger clade. The asterisk (*) marks the evolution of increased petaloidy and reduced number of fertile stamens as shared characteristics of the ginger clade. (B) M. basjoo flower and floral organs. Calyx and corolla members are mostly fused into what is called the floral tube, with the exception of a single corolla member, the free petal. As a representative of the androecial constitution of the banana group, M. basjoo has five filamentous fertile stamens. The M. basjoo gynoecium is also representative of most species in the banana group. (C) Canna sp. flower and floral organs. Species in the ginger clade usually exhibit an inconspicuous, sepal-like calyx and corolla, while infertile androecial members (staminodes) become laminar and petaloid. Species in the Zingiberaceae and Costaceae families bear a single fertile stamen, while species in the Cannaceae and Marantaceae families develop only half of a fertile stamen. Furthermore, in Canna sp. the gynoecium is also laminarized to some extent. ft, floral tube; fp, free petal; se, sepals; pe, petals; st, stamen; th, theca; std, staminodes; gy, gynoecium (Photos by Ana Almeida).
Species used in this study, collection location and accession numbers.
| Species | Location | Accession |
|---|---|---|
| | UC Davis Greenhouse | B81.521 |
| | UC Botanical Garden | 89.0873 |
| | Oxford Track Greenhouse (UC Berkeley) | 194.656 |
| | Oxford Track Greenhouse (UC Berkeley) | |
| | UC Botanical Garden | 90.1656 |
| | Oxford Track Greenhouse (UC Berkeley) | |
| | Oxford Track Greenhouse (UC Berkeley) | |
Number of cleaned reads and contigs, average contig length in base pairs, and assembly quality metrics (N50 and RSEM-EVAL scores).
RSEM-EVAL scores for each transcriptome were calculated using Arabidopsis, Musa acuminata, Elaeis guineensis and Phoenix dactylifera predicted CDS as references.
| Whole flower transcriptomes | Number of cleaned reads | Number of contigs | Average contig length (bp) | N50 (bp) | RSEM-EVAL to Arabidopsis CDS | RSEM-EVAL to Musa CDS | RSEM-EVAL to Elaeis CDS | RSEM-EVAL to Phoenix CDS |
|---|---|---|---|---|---|---|---|---|
| | 6,103,473 | 59,607 | 1,177 | 1,635 | −554,921,347 | −554,925,496 | −554,909,485 | −554,930,293 |
| | 4,365,085 | 67,283 | 1,032 | 1,408 | −396,133,340 | −396,137,949 | −396,118,692 | −396,143,069 |
| | 142,860,349 | 132,411 | 1,724 | 2,440 | −994,730,221 | −994,728,623 | −994,727,315 | −994,729,011 |
| | 9,357,365 | 74,190 | 1,113 | 1,503 | −860,726,519 | −860,732,496 | −860,711,867 | −860,736,385 |
| | 4,643,266 | 52,798 | 825 | 1,602 | −357,355,187 | −357,358,211 | −357,346,889 | −357,360,742 |
| | 1,292,595 | 19,377 | 632 | 674 | −95,168,818 | −95,169,800 | −95,166,156 | −95,170,392 |
| | 6,252,788 | 64,723 | 891 | 1,166 | −603,219,814 | −603,224,077 | −603,211,474 | −603,225,657 |
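The N50 values reported above summarize contig length distributions. As a minimal sketch, assuming the usual definition (the contig length L such that contigs of length L or longer hold at least half of all assembled bases), N50 can be computed as follows. The function name and the toy lengths are illustrative assumptions, not the tool or data used in the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly: the length L such that contigs of
    length >= L together contain at least half of all assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assembly (not data from the study).
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
print(n50(toy_lengths))
```

Because N50 depends only on the length distribution, it says nothing about whether contigs are correctly assembled, which is presumably why it is reported alongside the reference-based RSEM-EVAL scores.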
Number of predicted long open reading frames (ORFs) from TransDecoder.
Long ORFs were first predicted from the universe of de novo assembled contigs. Blastp and HMMER3 searches were used to further filter long ORFs.
| Whole flower transcriptomes | Long ORFs (TransDecoder) | Long ORFs, % contigs | Filtered ORFs (TransDecoder) | Filtered ORFs, % contigs |
|---|---|---|---|---|
| | 48,051 | 81 | 29,182 | 49 |
| | 39,003 | 58 | 26,790 | 40 |
| | 85,437 | 65 | 55,360 | 42 |
| | 43,932 | 59 | 29,366 | 40 |
| | 39,214 | 74 | 24,463 | 46 |
| | 17,112 | 88 | 13,122 | 68 |
| | 37,449 | 58 | 27,772 | 43 |
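The "% contigs" columns appear to be the ORF count divided by the number of assembled contigs, rounded to a whole percent. The sketch below checks that reading against the first row, assuming that the rows of this table follow the same order as the assembly-metrics table (59,607 contigs in its first row). The helper name is hypothetical.

```python
def percent_of_contigs(n_orfs, n_contigs):
    """ORF count expressed as a whole-number percentage of the contig count."""
    return round(100 * n_orfs / n_contigs)


# First row: 59,607 contigs, 48,051 long ORFs, 29,182 filtered ORFs.
long_pct = percent_of_contigs(48_051, 59_607)
filtered_pct = percent_of_contigs(29_182, 59_607)
print(long_pct, filtered_pct)  # 81 49
```

Both values match the table, which supports the assumed row order and the assumed denominator.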
Orthogroup species overlap as predicted by OrthoFinder.
The largest number of orthogroup overlaps per species is highlighted in bold. The Ca. zebrina transcriptome shows the largest number of overlaps with all species, with the exception of Arabidopsis thaliana, potentially as a result of the greater transcriptome coverage in that species.
| SPECIES | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | 11,511 | 10,403 | 10,298 | 10,049 | 10,814 | 10,089 | 10,543 | 9,161 | 9,627 | 9,448 |
| | 10,403 | 29,032 | | | | | | | | |
| | 10,298 | | 25,460 | 14,822 | 11,225 | 18,149 | 11,726 | 15,503 | 10,830 | 15,524 |
| | 10,049 | 15,927 | 14,822 | 20,139 | 10,875 | 14,985 | 11,073 | 13,757 | 10,494 | 13,985 |
| | | 11,405 | 11,225 | 10,875 | 13,065 | 10,992 | 11,428 | 9,820 | 11,109 | 10,101 |
| | 10,089 | 19,778 | 18,149 | 14,985 | 10,992 | 26,331 | 12,034 | 15,989 | 10,591 | 15,923 |
| | 10,543 | 12,072 | 11,726 | 11,073 | 11,428 | 12,034 | 13,910 | 10,392 | 10,539 | 10,547 |
| | 9,161 | 16,904 | 15,503 | 13,757 | 9,820 | 15,989 | 10,392 | 22,244 | 9,644 | 14,164 |
| | 9,627 | 11,212 | 10,830 | 10,494 | 11,109 | 10,591 | 10,539 | 9,644 | 13,156 | 9,805 |
| | 9,448 | 16,879 | 15,524 | 13,985 | 10,101 | 15,923 | 10,547 | 14,164 | 9,805 | 21,568 |
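A species-overlap matrix of this kind is symmetric: each off-diagonal cell counts the orthogroups that contain genes from both species, and each diagonal cell counts the orthogroups that contain any gene from that species. The following is a minimal sketch of that counting, not OrthoFinder's own code. The function name, orthogroup IDs and species labels are hypothetical.

```python
from itertools import product


def overlap_matrix(orthogroups, species):
    """Count, for every ordered species pair, the orthogroups that contain
    genes from both species. Diagonal entries count the orthogroups that
    contain at least one gene from that species."""
    counts = {(a, b): 0 for a, b in product(species, repeat=2)}
    for members in orthogroups.values():
        present = [s for s in species if s in members]
        for a, b in product(present, repeat=2):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy input: orthogroup ID -> species with at least one gene in it.
toy = {
    "OG1": {"sp_A", "sp_B", "sp_C"},
    "OG2": {"sp_A", "sp_B"},
    "OG3": {"sp_C"},
}
matrix = overlap_matrix(toy, ["sp_A", "sp_B", "sp_C"])
print(matrix[("sp_A", "sp_B")], matrix[("sp_B", "sp_A")])  # 2 2
```

Under this definition a species with a larger, more deeply sequenced transcriptome will tend to share more orthogroups with every other species, which is consistent with the coverage explanation given in the table legend.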
Blastn results between floral transcriptomes and predicted coding sequences (CDS) from the genomes of Arabidopsis thaliana, Musa acuminata, Phoenix dactylifera, and Elaeis guineensis.
| Transcriptomes | Blastn all contigs to CDS | CDS represented in transcriptome | % CDS represented in transcriptome | Blastn all contigs to CDS | CDS represented in transcriptome | % CDS represented in transcriptome |
|---|---|---|---|---|---|---|
| | 49,127 | 29,433 | 80.5 | 19,509 | 19,945 | 44.96 |
| | 38,170 | 20,289 | 55.5 | 21,317 | 18,238 | 41.11 |
| | 75,885 | 20,671 | 56.5 | 42,638 | 17,229 | 38.84 |
| | 35,597 | 20,522 | 56.1 | 19,353 | 17,723 | 39.95 |
| | 16,901 | 14,322 | 39.2 | 8,725 | 11,886 | 26.79 |
| | 9,319 | 9,223 | 25.2 | 4,491 | 6,430 | 14.5 |
| | 12,384 | 8,596 | 23.5 | 11,394 | 12,780 | 28.81 |
Figure 2. Venn diagram of Blastn results for the filtered ORFs of all floral transcriptomes against Elaeis guineensis predicted CDS.
Values represent number of unigenes.
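The regions of such a Venn diagram correspond to set operations on the reference CDS hit by each transcriptome. The sketch below shows one way to derive a "shared by all" set and a lineage-exclusive set. It assumes that "unique to a lineage" means present in every species of that lineage and absent from all others, which the source does not state. All names and gene IDs are hypothetical.

```python
def shared_by_all(hits_by_species):
    """Reference genes hit by every transcriptome."""
    return set.intersection(*hits_by_species.values())


def exclusive_to(hits_by_species, group):
    """Reference genes hit by every species in `group` and by no species
    outside it (an assumed definition of 'lineage-specific')."""
    inside = set.intersection(*(hits_by_species[s] for s in group))
    outside = set().union(
        *(hits for s, hits in hits_by_species.items() if s not in group)
    )
    return inside - outside


# Hypothetical toy data: species label -> reference CDS IDs with a Blastn hit.
hits = {
    "banana_1": {"g1", "g2", "g3"},
    "ginger_1": {"g1", "g4", "g5"},
    "ginger_2": {"g1", "g4"},
    "outgroup": {"g1", "g6"},
}
print(shared_by_all(hits))                          # {'g1'}
print(exclusive_to(hits, ["ginger_1", "ginger_2"]))  # {'g4'}
```

A looser definition (present in at least one species of the lineage) would replace the intersection in `exclusive_to` with a union and would generally give larger counts.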
Figure 3. Distribution of the main ‘functional’ categories of coding genes shared by all floral transcriptomes, and of those shared by all Zingiberales floral transcriptomes, based on Blastn results against the Elaeis guineensis transcriptome.
Distribution of transcription factor families amongst the floral transcriptomes studied.
A total of 508 transcription factors were ascribed to 36 of the 58 plant transcription factor families characterized in the PlantTFDB v4.0. Outgroup species is D. thyrsiflora.
| | Shared by all | Zingiberales | Banana clade | Ginger clade | Canna-Calathea | Zingiber | Outgroup |
|---|---|---|---|---|---|---|---|
| Transcription Factor Families (PlantTFDB v4.0) | 25 | 22 | 19 | 18 | 20 | 30 | 21 |
| Putative Transcription Factors (not in PlantTFDB v4.0) | 0 | 0 | 1 | 0 | 2 | 3 | 2 |