Lu Zhang1, Wei Wu2, Hai-Fei Yan2, Xue-Jun Ge2.
Abstract
Advances in high-throughput sequencing have generated a vast amount of transcriptomic data that are increasingly used in phylogenetic reconstruction. However, processing such large datasets across many genes, and identifying the optimal analytical methodology, remain challenging. Using de novo sequenced data together with data retrieved from public databases, we identified 221 orthologous protein-coding genes to reconstruct the phylogeny of Ericales, an order characterized by rapid ancient radiation. Seven species representing different families in Ericales were used as in-groups. Both concatenation and coalescence methods yielded the same well-supported topology, which matched previous studies except for two nodes that conflicted with previously reported relationships. The results revealed that a partitioning strategy could improve the traditional concatenation methodology. Rapidly evolving genes negatively affected the concatenation analysis, while slowly evolving genes slightly affected the coalescence analysis. With both real and simulated data, the coalescence methods usually accommodated rate heterogeneity better and required fewer genes to yield well-supported topologies than the concatenation methods.
Keywords: Ericales; coalescence methods; concatenation methods; evolving rates; phylogenomics
Year: 2016 PMID: 26819541 PMCID: PMC4718149 DOI: 10.4137/EBO.S22448
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. The species tree yielded by unpartitioned ML analysis with our 221 genes.
Notes: Values above branches indicate the BSVs of four analyses: unpartitioned ML analysis, partitioned ML analysis based on codon positions, partitioned ML analysis based on genes, and MP-EST analysis. An asterisk indicates that the branch was supported by a BSV of 100 in all four analyses.
The number of genes needed to yield a relatively strongly supported topology when genes were removed gradually.
| | ASWE WHEN ALL GENES USED | THE HIGHEST ASWE | NUMBER OF GENES NEEDED TO GET THE HIGHEST ASWE | TOPOLOGY WITH THE HIGHEST ASWE | NUMBER OF GENES NEEDED TO KEEP ASWE HIGHER THAN 85 |
|---|---|---|---|---|---|
| Test 1 | 96.2 | 97.6 | 200 | | 221~160 |
| Test 2 | 96.2 | 97.6 | 180 | | 221~120 |
| Test 1 | 95 | 95 | 221 | | 221~100 |
| Test 2 | 95 | 96.8 | 160 | | 221~60 |
Notes: Test 1: rapidly evolving genes were removed gradually. Test 2: slowly evolving genes were removed gradually.
Abbreviation: ASWE, average number of BSVs within Ericales.
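The table's summary statistics can be illustrated with a minimal Python sketch. It assumes ASWE is the mean of the BSVs on the in-group nodes, and that the "needed to keep ASWE higher than 85" range is found by scanning from the full dataset downward until the average first drops to 85 or below. The function names, gene counts, and BSVs here are hypothetical and are not taken from the study.

```python
from statistics import mean


def aswe(bsvs):
    """Average support: mean of the bootstrap support values (BSVs)."""
    return mean(bsvs)


def genes_keeping_support(bsvs_by_gene_count, threshold=85.0):
    """Return (largest, smallest) gene count over which ASWE stays above
    `threshold`, scanning from the largest dataset downward and stopping
    at the first dataset that fails. Returns None if the largest fails."""
    counts = sorted(bsvs_by_gene_count, reverse=True)
    lowest = None
    for n in counts:
        if aswe(bsvs_by_gene_count[n]) > threshold:
            lowest = n
        else:
            break
    return (counts[0], lowest) if lowest is not None else None


# Hypothetical BSVs per in-group node, keyed by number of genes retained.
example = {
    50: [100, 90, 80],
    40: [100, 95, 90],
    30: [90, 85, 86],
    20: [80, 70, 60],
}

best_count = max(example, key=lambda n: aswe(example[n]))
support_range = genes_keeping_support(example)
print(best_count, support_range)
```

With these made-up values, the highest average is reached at 40 genes, and the average stays above 85 from 50 down to 30 genes, which corresponds to the "221~160" style of range in the table.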
Figure 2. The species trees yielded with dataset M (100 genes with medium evolving rates).
Notes: (A) The species tree yielded by unpartitioned ML analysis. Values above branches indicate the corresponding BSVs. (B) The species tree yielded by partitioned ML analysis based on genes. Values above branches indicate the BSVs of partitioned ML analysis and MP-EST analysis.