PhyloTempo: A Set of R Scripts for Assessing and Visualizing Temporal Clustering in Genealogies Inferred from Serially Sampled Viral Sequences.

Melissa M Norström, Mattia C F Prosperi, Rebecca R Gray, Annika C Karlsson, Marco Salemi.

Abstract

Serially-sampled nucleotide sequences can be used to infer the demographic history of evolving viral populations. The shape of a phylogenetic tree often reflects the interplay between evolutionary and ecological processes. Several approaches exist to analyze the topology and traits of a phylogenetic tree, by means of tree balance, branching patterns and comparative properties. The temporal clustering (TC) statistic is a new topological measure, based on ancestral character reconstruction, which characterizes the temporal structure of a phylogeny. Here, we present PhyloTempo, the first implementation of the TC in the R language, which integrates several other topological measures in a user-friendly graphical framework. The comparison of the TC statistic with other measures provides multifaceted insights into the dynamic processes shaping the evolution of pathogenic viruses. The features and applicability of PhyloTempo were tested on serially-sampled intra-host human and simian immunodeficiency virus population data sets. PhyloTempo is distributed under the GNU general public license at https://sourceforge.net/projects/phylotempo/.

Keywords: clustering; coalescence; comparative methods; fast evolving viruses; longitudinal samples; phylodynamics; phylogenetics; positive selection; software

Year: 2012 PMID: 22745529 PMCID: PMC3382462 DOI: 10.4137/EBO.S9738

Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625

Introduction

The evolutionary and demographic history of a measurably evolving viral population1 can be inferred by phylodynamic analysis of longitudinally sampled sequences.2 In particular, the shape of a phylogenetic tree often reflects the characteristics of, and the interactions among, evolutionary and ecological processes. For example, phylogenies of viruses under continuous positive selection, such as inter-host influenza or intra-host human immunodeficiency virus (HIV), usually exhibit a marked staircase-like topology.3,4 Pathogens under weak or absent positive selection show a more balanced tree shape, as happens within the measles serotypes.5 Trees displaying a star-like topology or increasing root-to-tip distance with sampling time are typical of an exponentially growing population, whilst the opposite pattern can usually be associated with constant or decreasing population sizes, as for example in Dengue virus inter-host phylogenies.6 Phylogenetic tree shapes can also be coupled with phenotypic traits (either numeric or categorical) or geographic correlates (such as geographic origin of sampled strains), which can be analyzed via comparative methods in terms of evolutionary or spatiotemporal dynamics.7 Several statistical approaches exist for determining if and how serially sampled sequences evolve under a strict or relaxed molecular clock.8,9 However, such methods give no indication of the phylogenetic tree shape (eg, staircase- or star-like) or of the related temporal structure (eg, whether sequences sampled at the same time point tend to cluster together and to be direct ancestors of sequences sampled at later time points). There are various functions for describing the topological features of a phylogenetic tree. 
Some of these measures consider both topology and branch lengths of a tree, as well as phenotypic tip traits,10–12 while others evaluate only the tree topology in relation to geographic or phenotypic characters associated with the sampled strains.13–15 Finally, there are purely topological measures based on tree symmetry/balance.16–19 The temporal clustering (TC) is a recently developed statistic, which takes into account phylogenetic tree topology and sampling time of the tips.20 The TC statistic assesses the temporal structure of a phylogenetic tree by evaluating the order of time changes from internal nodes to tips. It is based on the maximum parsimony reconstruction of ancestral characters implemented in phylogeography,13 but it has been modified to prevent the estimation of temporally impossible state changes in tip-dated trees (ie, an earlier time point emerging from an ancestor assigned to a later time point). It also allows the comparison of phylogenies inferred from data sets that include different numbers of time points and/or sampled sequences per time point. Currently, there is no available software implementation of the TC statistic, although it can be calculated with a series of manual steps using the MacClade program (http://macclade.org/). In this work, we present PhyloTempo, the first software implementation of the TC statistic. PhyloTempo is written in R, a free software environment for statistical computing and graphics (http://www.r-project.org/). Along with the TC implementation, several other tree topological measures were integrated in PhyloTempo using pre-existing R libraries, in a user-friendly graphical framework. The program was tested on several longitudinally sampled intra-host HIV and simian immunodeficiency virus (SIV) population data sets. The results showed how the comparison of the TC statistic with other topological measures can provide multifaceted insights into the dynamic processes shaping the evolution of pathogenic viruses.

Methods

The original formulation of the TC statistic20 requires a phylogenetic tree with n taxa sampled at t different discrete time points. A state (ie, time) transition matrix is then defined, in which the cost of going from a later to an earlier time point is infinite (ie, time-irreversibility). The other costs are usually defined as integers that increase linearly with the ordering of the time points. Ancestral tree states are inferred using Fitch’s parsimony algorithm.21 A non-normalized TC score is then calculated by summing all the state changes across the tree branches, weighted according to the cost matrix. A tree with a perfect temporal structure, ie, a tree in which all tips sampled at time point t are monophyletic and directly emerge from time point t–1, would have a non-normalized TC equal to t–1. Conversely, a tree with the least temporal structure would have a maximal non-normalized TC, equal to the sum over time points of the number of taxa $n_i$ sampled at each time point multiplied by the corresponding weight $w_i$ of the cost matrix, ie, $\sum_{i=1}^{t} n_i w_i$. The normalized TC rescales the non-normalized TC value to the interval [0,1] by considering a background distribution of TC statistics obtained by shuffling the time points associated with the tree tips (keeping the topology fixed) and re-estimating the ancestral characters. Specifically, the normalized TC statistic is $TC = (\bar{S} - S)/(S_{max} - Min)$, where $\bar{S}$ and $S_{max}$ are, respectively, the average and the maximum non-normalized TC values observed in the randomized trees, while $Min$ is the minimum theoretical non-normalized TC, equal to t–1. $S$ is the observed non-normalized TC value calculated on the original tree. The numerator represents the deviation from the null hypothesis (ie, no temporal clustering), while the denominator represents the range of possible values for the given number of taxa and time points. 
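The parsimony-based calculation described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not PhyloTempo's code: the tree is a hypothetical nested tuple of tip names, the weighted state changes are minimized with Sankoff-style dynamic programming (a cost-matrix generalization of the Fitch algorithm named in the text), costs increase linearly as integers, and the function names `cost_matrix`, `sankoff_score` and `normalized_tc` are invented for the example.

```python
import math
import random

INF = math.inf

def cost_matrix(t):
    """cost[a][b]: b - a for forward changes, infinite for backward (irreversible) ones."""
    return [[(b - a) if b >= a else INF for b in range(t)] for a in range(t)]

def sankoff_score(tree, tip_state, cost):
    """Non-normalized score: minimum total cost of state changes on the tree.

    `tree` is a nested 2-tuple of tip names; `tip_state` maps tip name -> time index.
    """
    t = len(cost)

    def node_costs(node):
        if not isinstance(node, tuple):  # tip: only its sampled time state is allowed
            return [0 if s == tip_state[node] else INF for s in range(t)]
        child_costs = [node_costs(child) for child in node]
        return [
            sum(min(cost[s][c] + cc[c] for c in range(t)) for cc in child_costs)
            for s in range(t)
        ]

    return min(node_costs(tree))

def normalized_tc(tree, tip_state, n_random=300, seed=0):
    """(mean random score - observed score) / (max random score - (t - 1))."""
    t = max(tip_state.values()) + 1
    cost = cost_matrix(t)
    observed = sankoff_score(tree, tip_state, cost)
    rng = random.Random(seed)
    names = list(tip_state)
    states = [tip_state[name] for name in names]
    random_scores = []
    for _ in range(n_random):  # shuffle time labels, keep the topology fixed
        rng.shuffle(states)
        random_scores.append(sankoff_score(tree, dict(zip(names, states)), cost))
    s_mean = sum(random_scores) / n_random
    s_max = max(random_scores)
    s_min = t - 1
    if s_max == s_min:  # degenerate background: the ratio is undefined
        return float("nan")
    return (s_mean - observed) / (s_max - s_min)

# Toy tree with perfect temporal structure: three time points, two tips each.
tree = (("a1", "a2"), (("b1", "b2"), ("c1", "c2")))
states = {"a1": 0, "a2": 0, "b1": 1, "b2": 1, "c1": 2, "c2": 2}
raw = sankoff_score(tree, states, cost_matrix(3))
tc = normalized_tc(tree, states)
print(raw, tc)
```

On this toy tree the raw score equals t–1 = 2, the theoretical minimum, so the normalized value is the largest that the shuffled background allows for this topology.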
Coupled with the TC statistic, PhyloTempo also includes the following tree topology measures and hypothesis tests: Aldous’ graphical test and likelihood ratio test to decide whether a tree fits the Yule or the uniform model;22 Colless’ and Sackin’s shape statistics, both under the Yule or uniform hypotheses;23 cherry count;24 and Pybus’ gamma.25 In addition, a simple tree statistic called “staircase-ness” is introduced, which counts the proportion of sub-trees that are imbalanced (ie, sub-trees where the left child contains more leaves than the right child, or vice-versa) and is compared against the distribution of such proportions obtained from random trees. See the supplementary material for the properties of this measure.
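For readers unfamiliar with the shape statistics listed above, the following Python sketch computes the raw (un-normalized) Colless index, Sackin index and cherry count on a binary tree written as a nested tuple. The tree encoding and the function name `shape_stats` are assumptions made for illustration; PhyloTempo itself obtains these measures from existing R libraries, and the Yule or uniform normalizations are not reproduced here.

```python
def shape_stats(tree):
    """Return (n_leaves, colless, sackin, cherries) for a nested-2-tuple binary tree."""
    if not isinstance(tree, tuple):  # a tip contributes one leaf and nothing else
        return 1, 0, 0, 0
    left, right = tree
    n_left, colless_left, sackin_left, cherries_left = shape_stats(left)
    n_right, colless_right, sackin_right, cherries_right = shape_stats(right)
    n = n_left + n_right
    # Colless: sum over internal nodes of |leaves on the left - leaves on the right|
    colless = colless_left + colless_right + abs(n_left - n_right)
    # Sackin: sum of leaf depths; every leaf below this node sits one level deeper
    sackin = sackin_left + sackin_right + n
    # Cherry: an internal node whose two children are both tips
    is_cherry = 1 if (n_left == 1 and n_right == 1) else 0
    cherries = cherries_left + cherries_right + is_cherry
    return n, colless, sackin, cherries

ladder = ("a", ("b", ("c", "d")))        # staircase-like topology
balanced = (("a", "b"), ("c", "d"))      # fully balanced topology
print(shape_stats(ladder))
print(shape_stats(balanced))
```

For four tips the ladder scores higher than the balanced tree on both Colless and Sackin and has fewer cherries, which is the direction expected for staircase-like topologies.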

Implementation

All the code has been written in the R language. Besides the standard core library set of R, the following R libraries have been used (including their dependencies): “ape”, “ade4”, “phybase”, “phylobase”, “phangorn”, “doBy”, “infotheo”, “apTreeshape”, “diversitree” (http://www.r-phylo.org). The required input of PhyloTempo is a phylogenetic tree file in “newick” format and a two-column text file in which each tip name present in the phylogenetic tree is associated with its corresponding time of sampling (a numeric value such as days or years). The input phylogenetic tree is preliminarily checked for polytomies, which are resolved randomly. If present, negative branch lengths are set to zero, and a value of $10^{-5}$ is then added to all branch lengths. The tree is rooted on the tip that gives the highest linear correlation between the root-to-tip distance and the sampling time of the tips, and finally it is ladderized. The vector of sampling times is then discretized into time intervals using equal-frequency binning, where the optimal number of bins is the square root of the vector size. The maximum allowed number of discrete time intervals is nine, and each time bin needs to contain at least two tips. The TC statistic is calculated following the theoretical description above. However, in this new implementation the ancestral characters are estimated using maximum likelihood26 rather than parsimony. A major advantage of maximum likelihood is that it also allows for an optimized estimate of the weights of the transition cost matrix. The number of tip randomizations is set to 300 by default, but the value can be modified by the user. All the other tree statistics are assembled by combining existing R functions. Both graphical and text output are produced: figures are plotted in multiple windows, text is printed in the R command-line window and results are saved in a tabulated file. 
The graphical plots include: the phylogenetic tree with ancestral character state probabilities drawn as pie charts at internal nodes; the TC statistic compared with the randomized background distribution; a linear correlation plot between the sampling times of tips and root-to-tip distances; a Kruskal-Wallis test comparing the distribution of root-to-tip distances across the discretized time points; and the staircase-ness, Aldous’, Sackin’s and cherry count statistics with the corresponding background randomizations. The text output reports the aforementioned results as well as the P-values from the statistical tests. Also, a script has been made available that allows the analysis of multiple trees in “nexus” format from a posterior distribution (eg, trees output by MrBayes27) or from a bootstrap analysis. PhyloTempo is distributed under the GNU general public license and is available at https://sourceforge.net/projects/phylotempo/ for download.
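Two of the preprocessing steps described in this section, the branch-length clean-up and the equal-frequency discretization of sampling times, can be illustrated with a short Python sketch. The details are assumptions rather than PhyloTempo's actual rules: the square root is rounded down, tips that share a sampling time are kept in the same bin, and the minimum of two tips per bin is only approximated by capping the number of bins at half the number of tips. The function names are hypothetical.

```python
import math

def clean_branch_lengths(lengths, epsilon=1e-5):
    """Set negative branch lengths to zero, then add a small constant to all of them."""
    return [max(0.0, x) + epsilon for x in lengths]

def discretize_times(times, max_bins=9):
    """Equal-frequency binning of sampling times into at most `max_bins` intervals.

    Assumptions of this sketch: k = floor(sqrt(n)) bins, capped at `max_bins`
    and at n // 2; tips with identical sampling times always share a bin.
    """
    n = len(times)
    k = max(1, min(max_bins, math.isqrt(n), n // 2))
    order = sorted(range(n), key=lambda i: times[i])
    first_bin = {}
    for rank, i in enumerate(order):
        # a tied time value takes the bin of its first occurrence in rank order
        first_bin.setdefault(times[i], rank * k // n)
    # relabel so that bin indices are consecutive after ties have been merged
    relabel = {b: j for j, b in enumerate(sorted(set(first_bin.values())))}
    return [relabel[first_bin[x]] for x in times]

print(clean_branch_lengths([-0.2, 0.0, 0.3]))
print(discretize_times([10, 10, 10, 20, 20, 20, 30, 30, 30]))
```

Merging tied sampling times is one way to explain why a data set with many tips but few distinct sampling dates ends up with fewer than nine time intervals.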

Results

PhyloTempo has been tested on different viral data sets. The first data set included intra-host HIV-1 phylogenetic trees, inferred from serially sampled envelope (env) C2-V5 sequences, from nine untreated subjects with fast disease progression,4 named as the “Shankarappa” data set after the first author of the paper. The second data set included intra-host SIV trees, inferred from env gp120 sequences sampled longitudinally from four experimentally infected Rhesus macaques that were CD8-depleted before infection and progressed to AIDS within 75–118 days post infection.28 The third data set included intrahost HIV-1 gag p24 trees from six untreated subjects enrolled in the OPTIONS cohort29 all carrying the HLA-B*5701 allele strongly associated with slower disease progression, that were followed longitudinally from early infection up to seven years.30 In Table 1 the text output of PhyloTempo is reported, after running the program on each proof-of-concept data set (note that for simplicity not all indicators output by PhyloTempo are shown). When comparing the former calculation of the TC statistic based on parsimony with the new one based on maximum likelihood, in general we found a high degree of linear ρ correlation (ρ ≈ 0.8, combining the three data sets, data not shown). On average, the TC exhibited a weak linear ρ correlation with any of the other tree topology measures implemented in PhyloTempo (average ρ = 0.10, standard deviation 0.17), including also dN/dS values (estimated via the Nei-Gojobori method averaging across all positions). The maximum value obtained was ρ = 0.42, found with respect to the root-to-tip-distance vs. sampling time correlation.

Table 1

Summary of PhyloTempo output from different proof-of-concept data sets.

Data set	Time range (post-infection)	No. time intervals	No. tips	RTD vs. ST ρ	Staircaseness	dN/dS	TC
OPTIONS P1 all seqs.	91–1872 days	4	84	0.81	0.75	0.26	0.41
OPTIONS P1 unique seqs.	91–1872 days	4	48	0.89	0.64	0.21	0.35
OPTIONS P2 all seqs.	126–1348 days	3	79	0.88	0.73	0.15	0.31
OPTIONS P2 unique seqs.	126–1348 days	3	65	0.80	0.75	0.17	0.29
OPTIONS P3 all seqs.	91–2234 days	7	186	0.93	0.82	0.17	0.22
OPTIONS P3 unique seqs.	91–2234 days	6	74	0.84	0.70	0.15	0.36
OPTIONS P4 all seqs.	77–2180 days	5	128	0.78	0.80	0.30	0.20
OPTIONS P4 unique seqs.	77–2180 days	5	54	0.66	0.72	0.27	0.33
OPTIONS P5 all seqs.	91–2129 days	5	124	0.94	0.83	0.31	0.37
OPTIONS P5 unique seqs.	91–2129 days	3	55	0.95	0.74	0.18	0.72
OPTIONS P6 all seqs.	70–2602 days	6	140	0.92	0.73	0.24	0.13
OPTIONS P6 unique seqs.	70–2602 days	5	85	0.80	0.65	0.12	0.13
Shankarappa #1	14–133 days	9	137	0.90	0.66	1.00	0.30
Shankarappa #2	14–161 days	9	231	0.93	0.68	1.24	0.20
Shankarappa #3	42–154 days	9	106	0.92	0.72	1.63	0.50
Shankarappa #5	14–567 days	9	236	−0.01	0.68	0.74	0.25
Shankarappa #6	77–154 days	8	130	0.93	0.64	0.89	0.37
Shankarappa #7	35–126 days	7	138	0.92	0.68	1.14	0.22
Shankarappa #8	63–168 days	8	150	0.92	0.69	1.23	0.24
Shankarappa #9	21–357 days	9	120	0.32	0.71	1.71	0.21
Shankarappa #11	35–154 days	6	52	0.94	0.69	0.90	0.32
SIV D03 plasma	22–75 days	3	58	0.50	0.67	0.27	0.03
SIV D04 plasma	22–91 days	3	66	0.52	0.62	0.44	0.22
SIV D05 plasma	22–89 days	3	67	0.35	0.70	0.42	0.05
SIV D06 plasma	22–118 days	3	68	0.52	0.66	0.40	0.13
Correlation with TC		0.05	−0.17	0.42	0.23	0.04	1.00

Abbreviations: Seqs, sequences; RTD, root-to-tip distance; ST, sampling time; ρ, Pearson’s linear correlation; SC, staircase-ness; dN/dS, ratio between non-synonymous and synonymous substitutions; TC, temporal clustering statistic.

The average TC statistic for the Shankarappa data set was 0.29 (st.dev 0.10), for the SIV data set was 0.11 (st.dev 0.09). The OPTIONS data set, based on the highly conserved HIV-1 gag p24 gene, allowed us to evaluate the effect of including or excluding identical sequences and resulted in a TC value of 0.27 (st.dev 0.11) and 0.36 (st.dev 0.19) when analyzing all sequences or only unique sequences, respectively. Figures 1 and 2 illustrate the PhyloTempo graphical output. In detail, Figure 1 shows two of the trees analyzed (OPTIONS and the SIV data sets, respectively), and includes the maximum likelihood estimate of the ancestral time states, with state probabilities reported as pie charts at each internal node. Figure 2 reports the placement of the TC statistic, as well as all the other tests, with respect to background random distributions or null hypotheses. In addition, the correlation plot between the root-to-tip distances and the sampling time is shown, along with the boxplots of the root-to-tip distances stratified by the time intervals.
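The group averages quoted above can be re-derived from the TC column of Table 1. The short Python sketch below does so, assuming that the reported “st.dev” is the sample (n–1) standard deviation; the dictionary keys are labels chosen for this example.

```python
from statistics import mean, stdev

# TC values copied from the last column of Table 1, grouped by data set.
tc_values = {
    "OPTIONS, all sequences": [0.41, 0.31, 0.22, 0.20, 0.37, 0.13],
    "OPTIONS, unique sequences": [0.35, 0.29, 0.36, 0.33, 0.72, 0.13],
    "Shankarappa": [0.30, 0.20, 0.50, 0.25, 0.37, 0.22, 0.24, 0.21, 0.32],
    "SIV": [0.03, 0.22, 0.05, 0.13],
}

# stdev() is the sample standard deviation (n - 1 in the denominator).
summary = {name: (mean(values), stdev(values)) for name, values in tc_values.items()}

for name, (avg, sd) in summary.items():
    print(f"{name}: mean TC = {avg:.3f}, st.dev = {sd:.3f}")
```

Under that assumption the recomputed values agree with the figures in the text to two decimal places.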

Figure 1

PhyloTempo graphical output showing the ancestral character estimation on an input phylogenetic tree.

Notes: Pie charts in the internal nodes of the tree represent probabilities of ancestral states. Left panel shows a tree from the OPTIONS data set (patient P5, unique sequences) with a high TC statistic (0.7); right panel shows a tree from the SIV data set (subject D03) with a poor TC statistic (0.1).

Figure 2

PhyloTempo graphical output summarizing phylogenetic tree shape statistics.

Note: A tree from the OPTIONS data set (patient P5, unique sequences) with a high TC statistic (0.7) was used.

On average, PhyloTempo takes less than 5 minutes on a standard desktop computer to process an input phylogenetic tree with 100–150 leaves (3–4 time points and 300 randomizations). Running times for trees of 300 or 400 tips increase to between half an hour and one hour.

Discussion

In this paper we presented PhyloTempo, a set of scripts in the R language that calculates the TC statistic and other measures of phylogenetic tree shape, with comprehensive text and graphical output. The choice of the R software environment makes the tool available on many platforms (Microsoft Windows, Mac, or Linux) and, since R features a plethora of libraries for both phylogenetic analysis and graphics, makes it ready for the inclusion of other functions related to the analysis of phylogenetic tree shape and comparative statistics. Although other programs that calculate tree shape statistics are available, such as the Java application TreeStat (http://tree.bio.ed.ac.uk/software/treestat/) and Path-O-Gen (http://tree.bio.ed.ac.uk/software/pathogen/), as well as several command-line functions in R, this is the first that implements the TC statistic and combines graphical and text outputs in a user-friendly interface. In addition, PhyloTempo is capable of generating an a posteriori TC statistic by reading a tree ensemble in “nexus” format, such as the output of MrBayes (http://mrbayes.sourceforge.net/). As a future perspective in the context of Bayesian analysis, a theoretical approach to derive an analytical formula for the TC statistic is advisable, since it would avoid the time-consuming tree randomization for each tree. Two interesting biological insights are evident from the present analysis. First, TC does not correlate with any previously described topological tree measure, implying that the new statistic evaluates aspects of the evolutionary process not captured by other methods. Second, TC does not correlate with estimated dN/dS ratios in different data sets. 
Several studies have interpreted temporally structured phylogenies as evidence of sequential viral population bottlenecks driven by continuous selection pressure.2,4,31,32 The trees inferred from the OPTIONS data sets include HIV-1 sequences from patients with the HLA-B*5701 allele, which has been associated with slower disease progression, possibly due to strong positive or purifying selection driving viral escape from cytotoxic T lymphocyte recognition.33 Interestingly, the TC values calculated for the OPTIONS data sets are not significantly different (P = 0.21 from a t-test) from those calculated for the Shankarappa data sets. The finding suggests that temporally structured genealogies may reflect intra-host evolutionary processes that are similar in two groups of patients characterized by different rates of disease progression and that may not be related to selection pressure. However, it is important to point out that the subjects in the Shankarappa data set were followed for a shorter period of time than the OPTIONS subjects, and that the intervals between longitudinal samples were overall shorter in the first data set (data not shown). The low TC values for the Shankarappa data set may simply reflect an incomplete turnover of the viral quasispecies, which can require up to 22 months,34 causing an intermixing of sequences sampled at different time points. Moreover, archival viral strains expressed in cellular reservoirs would decrease the temporal structure of serially sampled genealogies, because sequences from later time points may share a most recent common ancestor with sequences collected much earlier in infection.35 Therefore, the TC statistic could be a powerful tool to investigate the extent and impact of latent viral reservoirs in intra-host HIV-1 evolution. Finally, it is interesting to note that the SIV data sets show the lowest TC values. 
This may be the result of the relatively short time of infection in these animals, as well as a consequence of the depletion of CD8+ T cells right after infection.28 In conclusion, the present work describes a practical and user-friendly implementation of a novel statistic to evaluate the shape of phylogenetic trees, inferred from longitudinal samples of measurably evolving viral populations, which can provide significant insights on underlying evolutionary processes linked to infection dynamics and pathogenesis.

Supplementary Material

On the properties of the staircase-ness measure

The staircase-ness measure counts (i) the proportion of sub-trees that are imbalanced (ie, sub-trees where the left child contains more leaves than the right child, or vice-versa). An alternative formulation (ii) is to take the average of the min(l,r)/max(l,r) values over all sub-trees, where l and r are the numbers of leaves in the left and right children of a sub-tree. In this work we compared the staircase-ness values against the distribution of such proportions obtained from random trees. However, a few properties of this measure are also worth examining analytically. First of all, the staircase-ness of perfectly balanced binary trees is always zero, whichever formulation is used. On the other hand, the staircase-ness of perfectly imbalanced trees (ie, ladder-like trees) is always one when counting the proportions (ie, formulation i), whilst it depends on the number of leaves when performing the average (ie, formulation ii). Specifically, the staircase-ness value for perfectly imbalanced trees using formulation (ii) is $\frac{1}{n}\sum_{i=1}^{n}\frac{1}{i}$, where n is the number of sub-trees. This formula tends to zero as n increases, since it is the Cesàro mean of the sequence 1/i, which converges to zero, as previously demonstrated by E. Cesàro (http://en.wikipedia.org/wiki/Ces%C3%A0ro_mean). The distribution of the staircase-ness values obtained by simulating random trees (function rtree(number_of_tips, branch_length = runif(1)) of the R library “ape”) does not pass the Shapiro-Wilk normality test (P<<0.0001, even when considering only trees with more than 300 tips), nor does it resemble a Gamma distribution whose parameters were fit on the actual data (P<<0.0001, using a Kruskal-Wallis test on simulations). However, the average values under both definitions look stable across all tree sizes (Fig. S1), while the standard deviation seems to decrease as the tree size increases. The limits of the average staircase-ness values for formulation (i) and (ii) are close to 0.61 and 0.64, respectively. 
Figure S1

Distribution of staircase-ness values from random trees (5,000 simulations), by varying tree size (from 3 to 2,000 leaves).

Notes: Upper panels show results for formulation (i), whilst lower panels show results for formulation (ii). Left panels represent the scatterplot of all staircase-ness values depending on the tree size, with a global average and standard deviation indicated in red. Central panels show the boxplots of staircase-ness values stratified by tree size (5 equal-width intervals spanning tree sizes between 3 and 2,000), with the corresponding stratified average and standard deviation. The right panels show the histograms of all staircase-ness values and compare them with simulated distributions whose parameters have been fit on the empirical data (Gaussian and Gamma functions).
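The two formulations of staircase-ness described above can be written down directly from their definitions. The Python sketch below is an illustration under assumptions, not the PhyloTempo code: trees are nested tuples with at least two tips, and `staircaseness`, `ladder` and `ladder_formula` are names invented for the example. It also checks the ladder-tree value of formulation (ii) against the mean of the reciprocals 1/i for i = 1..n.

```python
def staircaseness(tree):
    """Return (formulation_i, formulation_ii) for a nested-2-tuple binary tree.

    (i)  proportion of sub-trees whose two children hold different numbers of leaves
    (ii) average of min(l, r) / max(l, r) over all sub-trees
    """
    counts = {"internal": 0, "imbalanced": 0, "ratio_sum": 0.0}

    def leaves(node):
        if not isinstance(node, tuple):
            return 1
        l, r = leaves(node[0]), leaves(node[1])
        counts["internal"] += 1
        counts["imbalanced"] += 1 if l != r else 0
        counts["ratio_sum"] += min(l, r) / max(l, r)
        return l + r

    leaves(tree)
    n = counts["internal"]
    return counts["imbalanced"] / n, counts["ratio_sum"] / n

def ladder(n_internal):
    """Perfectly imbalanced (ladder-like) tree with `n_internal` sub-trees."""
    tree = ("tip", "tip")
    for _ in range(n_internal - 1):
        tree = ("tip", tree)
    return tree

def ladder_formula(n):
    """Mean of 1/i for i = 1..n, the formulation (ii) value for a ladder tree."""
    return sum(1 / i for i in range(1, n + 1)) / n

for n in (4, 50, 200):
    form_i, form_ii = staircaseness(ladder(n))
    print(n, form_i, form_ii, ladder_formula(n))
```

Read literally, formulation (i) treats the terminal cherry of a ladder as balanced, so the sketch returns (n–1)/n for a ladder with n sub-trees, which approaches one as n grows. The recursion is adequate for trees of a few hundred tips; much deeper ladders would need an iterative traversal.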

26 in total

1. Distributions of cherries for two models of trees.

Authors: A McKenzie; M Steel
Journal: Math Biosci Date: 2000-03 Impact factor: 2.144

Review 2. The causes and consequences of HIV evolution.

Authors: Andrew Rambaut; David Posada; Keith A Crandall; Edward C Holmes
Journal: Nat Rev Genet Date: 2004-01 Impact factor: 53.242

3. MrBayes 3: Bayesian phylogenetic inference under mixed models.

Authors: Fredrik Ronquist; John P Huelsenbeck
Journal: Bioinformatics Date: 2003-08-12 Impact factor: 6.937

4. A robust measure of HIV-1 population turnover within chronically infected individuals.

Authors: G Achaz; S Palmer; M Kearney; F Maldarelli; J W Mellors; J M Coffin; J Wakeley
Journal: Mol Biol Evol Date: 2004-06-23 Impact factor: 16.240

5. Efficient transmission and persistence of low-frequency SIVmac251 variants in CD8-depleted rhesus macaques with different neuropathology.

Authors: Samantha L Strickland; Rebecca R Gray; Susanna L Lamers; Tricia H Burdo; Ellen Huenink; David J Nolan; Brian Nowlin; Xavier Alvarez; Cecily C Midkiff; Maureen M Goodenow; Kenneth Williams; Marco Salemi
Journal: J Gen Virol Date: 2012-02-01 Impact factor: 3.891

6. Identification of shared populations of human immunodeficiency virus type 1 infecting microglia and tissue macrophages outside the central nervous system.

Authors: T H Wang; Y K Donaldson; R P Brettle; J E Bell; P Simmonds
Journal: J Virol Date: 2001-12 Impact factor: 5.103

7. Use of laboratory tests and clinical symptoms for identification of primary HIV infection.

Authors: Frederick M Hecht; Michael P Busch; Bhupat Rawal; Marcy Webb; Eric Rosenberg; Melinda Swanson; Margaret Chesney; Jennifer Anderson; Jay Levy; James O Kahn
Journal: AIDS Date: 2002-05-24 Impact factor: 4.177

8. Variance adjusted weighted UniFrac: a powerful beta diversity measure for comparing communities based on phylogeny.

Authors: Qin Chang; Yihui Luan; Fengzhu Sun
Journal: BMC Bioinformatics Date: 2011-04-25 Impact factor: 3.169

9. Phylodynamics and molecular evolution of influenza A virus nucleoprotein genes in Taiwan between 1979 and 2009.

Authors: Jih-Hui Lin; Shu-Chun Chiu; Ju-Chien Cheng; Hui-Wen Chang; Kuang-Liang Hsiao; Yung-Cheng Lin; Ho-Sheng Wu; Marco Salemi; Hsin-Fu Liu
Journal: PLoS One Date: 2011-08-12 Impact factor: 3.240

Review 10. Unifying the epidemiological and evolutionary dynamics of pathogens.

Authors: Bryan T Grenfell; Oliver G Pybus; Julia R Gog; James L N Wood; Janet M Daly; Jenny A Mumford; Edward C Holmes
Journal: Science Date: 2004-01-16 Impact factor: 47.728
