Abstract
The question of how many genes are needed to resolve phylogenetic incongruence has been investigated at various taxonomic levels, yet few studies have examined the minimum number of genes required when genes are selected for single-gene tree performance at the genus level or lower. We conducted resampling analyses on transcriptome-based single-copy nuclear gene sequences from 11 species of Primulina (Gesneriaceae) to determine the minimum numbers of both random and selected genes needed to resolve the phylogeny. Only 8 of the 26 selected genes were sufficient for full resolution, whereas 175 genes were needed when genes were sampled at random from all 830. Our results provide a baseline for future gene-sampling strategies in molecular phylogenetic studies of speciose taxa. Gene selection strategies based on single-gene tree performance are strongly recommended in phylogenetic analyses.
Keywords: Primulina; consensus network; gene number; phylogenetic incongruence; resampling analyses
Year: 2015 PMID: 26309387 PMCID: PMC4533847 DOI: 10.4137/EBO.S26047
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. Species tree of the 11 Primulina species inferred from the 830 single-gene trees with the average rank of coalescences (STAR) method.
Figure 2. Consensus network of the 11 Primulina species constructed from 830 single-gene trees at six different threshold values.
Summary of the percentages of correct trees at a bootstrap threshold of 95% or 75% (in parentheses) for all 11 Primulina species, the triplet, and other species, through random resampling by gene or site from the original 830-gene (631,686-nucleotide) data set.
| N (genes, then sites) | ALL | TRIPLET | OTHER |
|---|---|---|---|
| 25 | 26.6 (51.2) | 41.0 (63.4) | 57.2 (77.1) |
| 50 | 50.1 (75.0) | 59.0 (78.9) | 82.7 (94.4) |
| 75 | 68.9 (85.9) | 72.7 (87.2) | 95.2 (98.7) |
| 100 | 79.9 (92.6) | 81.3 (92.8) | 98.5 (99.8) |
| 125 | 87.5 (95.6) | 87.8 (95.6) | 99.5 (100) |
| 150 | 93.8 (98.7) | 93.9 (98.7) | 99.9 (100) |
| 175 | 96.0 (98.9) | 96.1 (98.9) | 99.9 (100) |
| 200 | 97.3 (99.6) | 97.3 (99.6) | 100 (100) |
| 225 | 98.2 (99.7) | 98.2 (99.7) | 100 (100) |
| 250 | 99.3 (99.8) | 99.3 (99.8) | 100 (100) |
| 275 | 99.8 (100) | 99.8 (100) | 100 (100) |
| 300 | 100 (100) | 100 (100) | 100 (100) |
| 10 k | 10.4 (54.7) | 22.1 (61.3) | 50.7 (88.8) |
| 20 k | 39.4 (81.4) | 44.3 (82.6) | 88.4 (98.6) |
| 30 k | 64.7 (92.2) | 66.2 (92.3) | 97.6 (99.9) |
| 40 k | 75.7 (96.7) | 76.0 (96.7) | 99.7 (100) |
| 50 k | 87.7 (98.7) | 87.8 (98.7) | 99.9 (100) |
| 60 k | 92.7 (99.5) | 92.7 (99.5) | 100 (100) |
| 70 k | 96.1 (99.9) | 96.1 (99.9) | 100 (100) |
| 80 k | 97.7 (99.9) | 97.7 (99.9) | 100 (100) |
| 90 k | 99.5 (99.9) | 99.5 (99.9) | 100 (100) |
| 100 k | 99.5 (100) | 99.5 (100) | 100 (100) |
| 110 k | 99.6 (100) | 99.6 (100) | 100 (100) |
| 120 k | 100 (100) | 100 (100) | 100 (100) |
Summary of the percentages of correct trees at a bootstrap threshold of 95% or 75% (in parentheses) for all 11 Primulina species, the triplet and other species, through random resampling by gene from the selected 26-gene data set.
| N (genes) | ALL | TRIPLET | OTHER |
|---|---|---|---|
| 3 | 26.3 (87.1) | 63.1 (96.9) | 40.2 (90.1) |
| 4 | 54.6 (96.3) | 78.4 (99.2) | 69.7 (97.1) |
| 5 | 72.9 (99.5) | 87.8 (99.8) | 83.2 (99.7) |
| 6 | 88.0 (99.9) | 94.8 (100) | 92.8 (99.9) |
| 7 | 93.9 (100) | 97.2 (100) | 96.5 (100) |
| 8 | 98.2 (100) | 99.2 (100) | 99.0 (100) |
| 9 | 99.3 (100) | 99.6 (100) | 99.7 (100) |
| 10 | 99.8 (100) | 100 (100) | 99.8 (100) |
| 11 | 99.9 (100) | 99.9 (100) | 100 (100) |
| 12 | 100 (100) | 100 (100) | 100 (100) |
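
The minimum gene numbers quoted in the abstract (175 random genes, 8 selected genes) can be read off the ALL column of the two tables. The Python sketch below assumes the criterion is "at least 95% of resampled trees correct at the 95% bootstrap threshold". That cutoff is an assumption inferred from the tables rather than stated in the record, and the function and variable names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Row = Tuple[int, float]  # (sample size N, % correct trees, ALL column, 95% bootstrap)

# Values copied from the ALL column of the tables above (95% bootstrap threshold).
RANDOM_GENES: Sequence[Row] = [
    (25, 26.6), (50, 50.1), (75, 68.9), (100, 79.9), (125, 87.5), (150, 93.8),
    (175, 96.0), (200, 97.3), (225, 98.2), (250, 99.3), (275, 99.8), (300, 100.0),
]
# Resampling by site; N is in thousands of nucleotides.
RANDOM_SITES_K: Sequence[Row] = [
    (10, 10.4), (20, 39.4), (30, 64.7), (40, 75.7), (50, 87.7), (60, 92.7),
    (70, 96.1), (80, 97.7), (90, 99.5), (100, 99.5), (110, 99.6), (120, 100.0),
]
SELECTED_GENES: Sequence[Row] = [
    (3, 26.3), (4, 54.6), (5, 72.9), (6, 88.0), (7, 93.9),
    (8, 98.2), (9, 99.3), (10, 99.8), (11, 99.9), (12, 100.0),
]


def minimum_n(rows: Sequence[Row], cutoff: float = 95.0) -> Optional[int]:
    """Return the smallest N whose percentage of correct trees reaches `cutoff`.

    The 95% default is an assumed criterion, not one given in the record.
    Returns None if no row reaches the cutoff.
    """
    for n, pct in sorted(rows):
        if pct >= cutoff:
            return n
    return None


if __name__ == "__main__":
    print("random genes:", minimum_n(RANDOM_GENES))        # 175
    print("random sites (k):", minimum_n(RANDOM_SITES_K))  # 70
    print("selected genes:", minimum_n(SELECTED_GENES))    # 8
```

Under this assumed cutoff the sketch reproduces the abstract's figures of 175 random genes and 8 selected genes, and suggests roughly 70 k randomly resampled sites for the by-site analysis.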