Sudhir Kumar
Abstract
Molecular evolutionary analyses require computationally intensive steps such as aligning multiple sequences, optimizing substitution models, inferring evolutionary trees, testing phylogenies by bootstrap analysis, and estimating divergence times. With the rise of large genomic data sets, phylogenomics is imposing a large carbon footprint on the environment, with consequences for the planet's health. Electronic waste and energy usage are major environmental issues. Fortunately, innovative methods and heuristics are available to shrink the carbon footprint, presenting researchers with opportunities to lower environmental costs and make evolutionary computing greener. Green computing will also enable greater scientific rigor and encourage broader participation in big data analytics.
Keywords: carbon footprint; green computing; molecular evolution; phylogenetics
Year: 2022 PMID: 35243506 PMCID: PMC8894743 DOI: 10.1093/molbev/msac043
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. The use of computational methods in molecular evolution has been increasing quickly, as seen in the annual counts of new research articles citing the use of major software packages for molecular evolutionary and phylogenetic analyses. Citation counts for software packages were obtained from Google Scholar (last accessed January 25, 2022) for 2005–2020. See Supplementary Material online for more details on the software versions included.
Carbon Footprints (gram CO2e) of Molecular Phylogenetic Analyses and Software for an MSA of 37 Mammalian Species and 1.3 Million Sites.
| | | Computer Resources | | Environmental Impact | | |
|---|---|---|---|---|---|---|
| Function | Method/Tool | Time (h) | Memory (peak, MB) | Energy (kWh) | C-footprint (g) | Trees (days) |
| (a) Optimal substitution model selection | | | | | | |
| a1. | ModelFinder | 106.0 | 9,300 | 1.64 | 617 | 20.1 |
| a2. | jModelTest | 8.8 | 3,700 | 0.12 | 44 | 1.5 |
| a3. | ModelTest-NG | 8.0 | 3,700 | 0.11 | 41 | 1.2 |
| (b) Clock rate model selection | | | | | | |
| b1. | Bayes factor | 2,500.0 | 46,000 | 51.00 | 19,220 | 540.0 |
| b2. | CorrTest | 0.2 | 4,000 | <0.01 | 1 | <0.1 |
| (c) Phylogeny inference | | | | | | |
| c1. | Maximum likelihood | 8.1 | 4,000 | 0.11 | 41 | 1.2 |
| c2. | FastTree | 0.7 | 700 | 0.01 | 3 | 0.1 |
| c3. | Neighbor-joining | 0.1 | 8 | <0.01 | <1 | <0.1 |
| (d) Statistical tests of phylogenies (ML) | | | | | | |
| d1. | Standard bootstrap | 980.0 | 3,100 | 13.00 | 4,850 | 159.0 |
| d2. | Rapid bootstrap | 98.0 | 3,700 | 1.00 | 493 | 16.2 |
| d3. | Little bootstrap | 18.9 | 100 | 0.23 | 86 | 2.7 |
| d4. | Little+ultrafast-bootstraps | 0.9 | 200 | 0.01 | 4 | 0.1 |
| d5. | Bayesian | 857.9 | 22,000 | 17.00 | 6,490 | 210.0 |
| (e) Relaxed clock dating | | | | | | |
| e1. | Bayesian (slow) | 2,309.5 | 23,000 | 46.00 | 17,460 | 570.0 |
| e2. | Bayesian (fast) | 29.5 | 909 | 0.36 | 135 | 4.5 |
| e3. | RelTime | 0.1 | 8 | <0.01 | <1 | <0.1 |
Note: The C-footprint (carbon footprint) is the amount (g) of CO2 released in the production of the energy (kilowatt-hours, kWh) needed to power computers in the USA, estimated using the Green Algorithms website (Lannelongue, Grealey, and Inouye 2021). Tree days are calculated based on the information that a mature tree can scrub ∼917 g of CO2e per month (Grealey et al. 2022); a monthly rate is what the tabulated values imply (for example, 617 g corresponds to about 20 tree days). The Supplementary Material online provides details on the software used and the options applied.
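The tree-day arithmetic described in the note can be sketched in a few lines of Python. This is an illustrative sketch, not the Green Algorithms implementation: the function names are hypothetical, and it assumes the ∼917 g figure is a monthly sequestration rate and that a month has 30 days. Both assumptions were chosen because they approximately reproduce the tabulated tree days. The second function simply divides a row's C-footprint by its energy use to show the carbon intensity that the row implies.

```python
# Assumed constants; neither is stated in exactly this form in the source.
TREE_G_PER_MONTH = 917.0  # g CO2e absorbed by one mature tree (assumed: per month)
DAYS_PER_MONTH = 30.0     # assumed month length


def tree_days(footprint_g: float) -> float:
    """Days one mature tree would need to absorb footprint_g grams of CO2e."""
    return footprint_g / TREE_G_PER_MONTH * DAYS_PER_MONTH


def implied_intensity(footprint_g: float, energy_kwh: float) -> float:
    """Carbon intensity (g CO2e per kWh) implied by one table row."""
    return footprint_g / energy_kwh


if __name__ == "__main__":
    # (energy in kWh, C-footprint in g) copied from rows a1, d1, and e1.
    rows = {
        "ModelFinder": (1.64, 617),
        "Standard bootstrap": (13.00, 4850),
        "Bayesian (slow)": (46.00, 17460),
    }
    for name, (kwh, grams) in rows.items():
        print(
            f"{name}: {tree_days(grams):.1f} tree days, "
            f"{implied_intensity(grams, kwh):.0f} g CO2e/kWh"
        )
```

Under these assumptions, the three rows land close to their tabulated tree days and imply a roughly constant carbon intensity of 370 to 380 g CO2e per kWh. Small differences from the table are expected because the published values are rounded.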