Abstract
Phylogenetic analysis is important for understanding the process of biological evolution, and phylogenetic trees are used to represent evolutionary history. Each taxon in a phylogenetic tree has at most one parent, so phylogenetic trees cannot express the more complex evolutionary information implicit in phylogeny. Phylogenetic networks can express such genome evolutionary histories, so research on the construction of phylogenetic networks is of great significance. The Cass algorithm is an efficient method for constructing phylogenetic networks because it can construct a much simpler network. However, Cass relies heavily on the order of the input data, i.e., different networks can be constructed for the same dataset when the input order differs. Based on the frequency and incompatibility degree of taxa, we propose Frin, an efficient improvement of the Cass algorithm. The experimental results show that the networks constructed by Frin are simpler than those constructed by other methods, and that Frin constructs more consistent phylogenetic networks when the same data are supplied in different input orders. Furthermore, the phylogenetic network constructed by Frin is closer to the original information described by the phylogenetic trees. Frin has been built as a Java software package and is freely available at https://github.com/wangjuanimu/Frin.
Keywords: evolution; frequency; genome; incompatibility degree; phylogenetic network
Year: 2019 PMID: 31867045 PMCID: PMC6909884 DOI: 10.3389/fgene.2019.01261
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
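The abstract names two per-taxon quantities, frequency and incompatibility degree, without defining them. The sketch below is only an illustration under stated assumptions: frequency is taken as the number of clusters containing a taxon, and incompatibility degree as the number of incompatible cluster pairs whose overlap contains the taxon. Two clusters are incompatible when they overlap but neither contains the other. The function names and the ranking rule are hypothetical and are not taken from the Frin package. The point illustrated is that statistics computed from the cluster set as a whole do not depend on the order in which the clusters are supplied.

```python
from collections import Counter
from itertools import combinations


def incompatible(a, b):
    """Clusters are incompatible if they overlap but neither contains the other."""
    return bool(a & b) and not (a <= b or b <= a)


def taxon_statistics(clusters):
    """Return (frequency, incompatibility_degree) per taxon.

    ASSUMPTION (not from the source): frequency counts the clusters that
    contain a taxon; incompatibility degree counts the incompatible cluster
    pairs whose overlap contains the taxon.
    """
    clusters = [frozenset(c) for c in clusters]
    frequency = Counter(t for c in clusters for t in c)
    degree = Counter({t: 0 for t in frequency})
    for a, b in combinations(clusters, 2):
        if incompatible(a, b):
            degree.update(a & b)  # each shared taxon gains one conflict
    return frequency, degree


def removal_order(clusters):
    """Hypothetical ranking: highest degree first, then frequency, then name."""
    frequency, degree = taxon_statistics(clusters)
    return sorted(frequency, key=lambda t: (-degree[t], -frequency[t], t))


if __name__ == "__main__":
    clusters = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}, {"c", "d"}]
    print(removal_order(clusters))                   # ['c', 'b', 'a', 'd']
    print(removal_order(list(reversed(clusters))))   # same ranking
```

Because the ranking key is a function of the whole cluster set and ties are broken by taxon name, reversing or shuffling the input leaves the result unchanged.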
Figure 1. Two networks N1 and N2 are constructed by Frin for the cluster set of Example 3.1.
Figure 2. N3 is the network constructed by Frin for all permutations of input data in Example 3.2.
Figure 3. N4, N5 and N6 are the networks constructed by Cass for all permutations of input data in Example 3.2.
Figure 4. N7, N8 and N9 are the networks constructed by BIMLR for all permutations of input data in Example 3.2.
Figure 5. N10, N11 and N12 are the networks constructed by Lnetwork for all permutations of input data in Example 3.2.
The results of Frin, Cass, Lnetwork and BIMLR on practical datasets with clusters |C| and taxa |X| when the input order varies. The last row gives the column averages.
| \|C\| | \|X\| | Frin n | Frin mean | Frin min | Frin max | Cass n | Cass mean | Cass min | Cass max | Lnetwork n | Lnetwork mean | Lnetwork min | Lnetwork max | BIMLR n | BIMLR mean | BIMLR min | BIMLR max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 35 | 22 | 1 | 0 | 0 | 0 | 2 | 6.5 | 6.5 | 6.5 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 25 | 15 | 1 | 0 | 0 | 0 | 2 | 3 | 3 | 3 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 22 | 13 | 2 | 1.5 | 1.5 | 1.5 | 2 | 0.5 | 0.5 | 0.5 | 2 | 1 | 1 | 1 | 2 | 1.5 | 1.5 | 1.5 |
| 27 | 15 | 3 | 3.3 | 1 | 5 | 3 | 3 | 3 | 3 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 1 |
| 25 | 13 | 1 | 0 | 0 | 0 | 4 | 6.3 | 2 | 7.5 | 3 | 1.2 | 0.5 | 1.5 | 1 | 0 | 0 | 0 |
| 22 | 11 | 2 | 5.5 | 5.5 | 5.5 | 3 | 3 | 2.5 | 3.5 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 17 | 10 | 1 | 0 | 0 | 0 | 3 | 2 | 1.5 | 2.5 | 3 | 1.3 | 1 | 1.5 | 3 | 2 | 1 | 3 |
| 13 | 8 | 1 | 0 | 0 | 0 | 4 | 3.6 | 1.5 | 4 | 2 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| 23 | 11 | 1 | 0 | 0 | 0 | 4 | 5.6 | 3 | 7.5 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 1 |
| 18 | 10 | 1 | 0 | 0 | 0 | 4 | 1.5 | 0.5 | 3 | 3 | 2.5 | 1.5 | 3.5 | 3 | 1.5 | 0.5 | 2.5 |
| 22 | 11 | 2 | 0.5 | 0.5 | 0.5 | 3 | 3.2 | 1.5 | 5 | 1 | 0 | 0 | 0 | 2 | 0.5 | 0.5 | 0.5 |
| 12 | 11 | 1 | 0 | 0 | 0 | 2 | 3 | 3 | 3 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 21 | 10 | 2 | 5.5 | 5.5 | 5.5 | 4 | 3.9 | 1.5 | 5.5 | 2 | 1.5 | 1.5 | 1.5 | 2 | 0.5 | 0.5 | 0.5 |
| 13 | 7 | 1 | 0 | 0 | 0 | 4 | 3.8 | 1.5 | 4 | 2 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| 22 | 10 | 3 | 2.7 | 2 | 3.5 | 2 | 1.5 | 1.5 | 1.5 | 1 | 0 | 0 | 0 | 2 | 0.5 | 0.5 | 0.5 |
| 21.1 | 11.8 | 1.5 | 1.3 | 1.1 | 1.4 | 3.1 | 3.4 | 2.2 | 4.0 | 1.8 | 1.2 | 1.1 | 1.4 | 1.6 | 0.6 | 0.4 | 0.7 |
The results of Frin, Cass, Lnetwork and BIMLR on practical datasets with clusters |C| and taxa |X|. The last row gives the column averages.
| \|C\| | \|X\| | Frin t | Frin k | Frin r | Frin c | Cass t | Cass k | Cass r | Cass c | Lnetwork t | Lnetwork k | Lnetwork r | Lnetwork c | BIMLR t | BIMLR k | BIMLR r | BIMLR c |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14 | 4 | 1s | 3 | 3 | 0 | 1s | 3 | 3 | 0 | 1s | 3 | 3 | 0 | 1s | 3 | 3 | 0 |
| 30 | 5 | 1s | 4 | 4 | 0 | 2s | 4 | 4 | 0 | 2s | 4 | 4 | 0 | 1s | 4 | 4 | 0 |
| 62 | 6 | 6s | 5 | 5 | 0 | 11s | 5 | 5 | 0 | 6s | 5 | 5 | 0 | 7s | 5 | 5 | 0 |
| 42 | 10 | 1s | 4 | 4 | 8 | 5s | 4 | 4 | 34 | 1s | 4 | 4 | 8 | 1s | 4 | 4 | 8 |
| 39 | 11 | 23s | 6 | 6 | 10 | 21s | 5 | 5 | 7 | 13s | 5 | 5 | 8 | 3s | 5 | 5 | 8 |
| 61 | 11 | 23s | 5 | 5 | 11 | 1m26s | 5 | 5 | 48 | 5s | 5 | 5 | 11 | 1s | 5 | 5 | 11 |
| 75 | 30 | 1s | 2 | 2 | 19 | 5s | 2 | 2 | 122 | 1s | 2 | 2 | 19 | 1s | 2 | 2 | 19 |
| 180 | 51 | 8s | 2 | 2 | 0 | 40s | 2 | 2 | 0 | 4s | 2 | 2 | 0 | 1s | 2 | 2 | 0 |
| 70 | 56 | 1s | 1 | 4 | 0 | 1s | 1 | 4 | 0 | 1s | 1 | 4 | 0 | 2s | 1 | 4 | 0 |
| 270 | 76 | 1m7s | 2 | 2 | 0 | 6m22s | 2 | 2 | 0 | 12s | 2 | 2 | 0 | 24s | 2 | 2 | 0 |
| 404 | 122 | 4m1s | 2 | 2 | 0 | 1h44m | 2 | 2 | 0 | 27s | 2 | 2 | 0 | 27s | 2 | 2 | 0 |
| 113.4 | 34.7 | 43.7s | 3.3 | 3.5 | 4.4 | 10m18s | 3.2 | 3.5 | 10 | 6.6s | 3.4 | 3.6 | 8.5 | 7.1s | 3.2 | 3.5 | 4.2 |
The results of Frin, Cass, Lnetwork and BIMLR on artificial datasets with clusters |C| and taxa |X|. The last row gives the column averages.
| \|C\| | \|X\| | Frin t | Frin k | Frin r | Frin c | Cass t | Cass k | Cass r | Cass c | Lnetwork t | Lnetwork k | Lnetwork r | Lnetwork c | BIMLR t | BIMLR k | BIMLR r | BIMLR c |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 86 | 37 | 14s | 4 | 9 | 12 | 3s | 3 | 8 | 27 | 4s | 3 | 8 | 11 | 8s | 3 | 8 | 23 |
| 38 | 20 | 33s | 5 | 7 | 11 | 2s | 4 | 6 | 25 | 25s | 4 | 6 | 15 | 2s | 4 | 6 | 25 |
| 43 | 22 | 1s | 3 | 5 | 3 | 1s | 2 | 4 | 12 | 1s | 3 | 5 | 3 | 1s | 3 | 5 | 11 |
| 72 | 27 | 32s | 5 | 7 | 19 | 15s | 5 | 7 | 43 | 3s | 5 | 7 | 19 | 4s | 5 | 7 | 29 |
| 52 | 22 | 27s | 4 | 8 | 12 | 17s | 4 | 7 | 33 | 3s | 4 | 8 | 15 | 6s | 4 | 8 | 15 |
| 79 | 27 | 3m54s | 8 | 10 | 80 | 7m21s | 6 | 8 | 89 | 47s | 6 | 8 | 44 | 2m40s | 8 | 10 | 52 |
| 38 | 16 | 1m44s | 6 | 8 | 28 | 15s | 5 | 7 | 50 | 4m22s | 7 | 9 | 36 | 13s | 6 | 8 | 25 |
| 41 | 16 | 2s | 4 | 5 | 6 | 1s | 4 | 5 | 29 | 1s | 4 | 5 | 4 | 1s | 4 | 5 | 7 |
| 12 | 8 | 1s | 2 | 2 | 0 | 1s | 2 | 2 | 2 | 1s | 2 | 2 | 0 | 1s | 2 | 2 | 0 |
| 45 | 20 | 1m51s | 6 | 7 | 34 | 4h4m | 6 | 7 | 66 | 35s | 6 | 7 | 28 | 17s | 6 | 7 | 47 |
| 22 | 11 | 44s | 2 | 3 | 1 | 1s | 2 | 3 | 5 | 1s | 2 | 3 | 1 | 1s | 2 | 3 | 4 |
| 17 | 10 | 1s | 3 | 3 | 4 | 1s | 3 | 3 | 8 | 1s | 3 | 3 | 4 | 1s | 3 | 3 | 7 |
| 46 | 16 | 6m8s | 6 | 8 | 10 | 23s | 5 | 7 | 34 | 7s | 6 | 8 | 15 | 12s | 6 | 8 | 22 |
| 22 | 11 | 41s | 4 | 4 | 14 | 2s | 4 | 4 | 23 | 3s | 4 | 4 | 13 | 2s | 5 | 5 | 21 |
| 22 | 10 | 54s | 4 | 4 | 10 | 2s | 4 | 4 | 21 | 6s | 4 | 4 | 12 | 2s | 5 | 5 | 19 |
| 42.3 | 18.2 | 1m2s | 4.4 | 6 | 16 | 16m51s | 3.9 | 5.5 | 31 | 24.9s | 4.2 | 5.8 | 14.7 | 15.4s | 4.4 | 6 | 20.5 |
Figure 6. Frin constructs a level-5 network with r = 5, c = 31 for the three gene trees of the Poaceae datasets.
The clusters represented by a network in the soft-wired sense.
| 1. |
| 2. |
| 3. soft ( |
| 4. |
| 5. |
| 6. |
| 7. switch on the left incoming edge of each reticulate node and switch off the right one |
| 8. |
| 9. switch off the left incoming edge of each reticulate node and switch on the right one |
| 10. |
| 11. |
| 12. |
| 13. add a cluster represented by |
| 14. |
| 15. add clusters represented by the child of |
| 16. |
| 17. |
| 18. |
| 19. |
| 20. |
| 21. |
| 22. |
| 23. |
| 24. |
| 25. |
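Most steps of the listing above did not survive extraction, so the following is not a reconstruction of the authors' procedure. It is a minimal sketch of the standard softwired definition, under these assumptions: the network is a rooted DAG given as a hypothetical `children` dictionary, a reticulate node is any node with more than one parent, and a cluster is represented if some switching (one incoming edge kept "on" per reticulate node) makes it the leaf set below the head of a tree edge. Where the surviving steps mention switching the left or right incoming edges of every reticulate node, this sketch instead enumerates every switching, which is exponential in the number of reticulate nodes.

```python
from itertools import product


def softwired_clusters(children, leaves):
    """Collect the clusters a rooted network represents in the softwired sense.

    children: dict mapping each internal node to a list of its children
              (hypothetical representation chosen for this sketch).
    leaves:   set of leaf labels (taxa).
    """
    parents = {}
    for u, kids in children.items():
        for v in kids:
            parents.setdefault(v, []).append(u)

    reticulations = [v for v, ps in parents.items() if len(ps) > 1]
    tree_heads = [v for v, ps in parents.items() if len(ps) == 1]

    clusters = set()
    # One switching = one "on" parent chosen for every reticulate node.
    for choice in product(*(parents[r] for r in reticulations)):
        on_parent = dict(zip(reticulations, choice))

        def below(v):
            """Leaves reachable from v using only switched-on edges."""
            if v in leaves:
                return frozenset([v])
            out = frozenset()
            for w in children.get(v, []):
                # Tree edges are always on; a reticulation edge v->w is on
                # only if v is the parent chosen for w in this switching.
                if on_parent.get(w, v) == v:
                    out |= below(w)
            return out

        for v in tree_heads:
            cluster = below(v)
            if cluster:
                clusters.add(cluster)
    return clusters


if __name__ == "__main__":
    # Root r; reticulate node h has parents x and y and a single child b.
    net = {"r": ["x", "y"], "x": ["a", "h"], "y": ["h", "c"], "h": ["b"]}
    for c in sorted(softwired_clusters(net, {"a", "b", "c"}), key=sorted):
        print(sorted(c))
```

In the example network, the two switchings of `h` yield `{a, b}` and `{b, c}` respectively. These two clusters are incompatible, so no single tree displays both, but the network represents both.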