BACKGROUND: Hadamard conjugation is part of the standard mathematical armoury in the analysis of molecular phylogenetic methods. For group-based models, the approach provides a one-to-one correspondence between the so-called "edge length" and "sequence" spectrum on a phylogenetic tree. The Hadamard conjugation has been used in diverse phylogenetic applications not only for inference but also as an important conceptual tool for thinking about molecular data leading to generalizations beyond strictly tree-like evolutionary modelling. RESULTS: For general group-based models of phylogenetic branching processes, we reformulate the problem of constructing a one-to-one correspondence between pattern probabilities and edge parameters. This takes a classic result previously shown through use of Fourier analysis and presents it in the language of tensors and group representation theory. This derivation makes it clear why the inversion is possible, because, under their usual definition, group-based models are defined for abelian groups only. CONCLUSION: We provide an inversion of group-based phylogenetic models that can be implemented using matrix multiplication between rectangular matrices indexed by ordered-partitions of varying sizes. Our approach provides additional context for the construction of phylogenetic probability distributions on network structures, and highlights the potential limitations of restricting to group-based models in this setting.
Fundamental to evolutionary biology is the development and implementation of molecular phylogenetic methods [1]. These methods provide the means to reconstruct the past evolutionary history of biological entities given present-day molecular data, such as DNA. Considering Kimura’s neutral theory of molecular evolution, it is logical to apply a stochastic model at the level of DNA substitutions to construct a probabilistic description of what molecular alignments are expected to be observed, given a proposed evolutionary history (tree topology and edge lengths). This is commonly implemented assuming an IID (across sites in the alignment) Markov process for DNA substitution, leading to a model that has a continuous-time Markov chain at its core (see Semple and Steel [2] for an introduction to the mathematics underlying modern phylogenetic methodology).

In a series of papers, Hendy and colleagues introduced the Hadamard conjugation as a novel tool for phylogenetic analyses [3-5]. They found an invertible relationship between a phylogenetic tree, as characterized by its edge length spectrum, and the probability distribution of site patterns (referred to as the sequence spectrum). Originally introduced only for the 2-state symmetric model, the Hadamard conjugation was later extended to the K3ST model [6-8] and further to any of the so-called “group-based” models [9]. Hadamard conjugation has been used as both a tool for simulation [10] and to look at statistical properties of methods, exploring the inconsistency of parsimony under a molecular clock [5,11]. For these sorts of applications, following the notation in Felsenstein [1], we can use the Hadamard transform H to start with an edge length spectrum γ and calculate the sequence spectrum $s=H^{-1}\exp(H\gamma)$. The beauty of Hadamard conjugations is that one can also begin with an observed sequence spectrum and perform the inverse of the conjugation, $\gamma=H^{-1}\log(Hs)$, to empirically obtain an edge length spectrum.
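As a minimal sketch of the conjugation just described, the following Python fragment computes $s=H^{-1}\exp(H\gamma)$ and its inverse for a three-taxon tree under the 2-state symmetric model. The function names, the bitmask indexing of splits, the example edge lengths, and the convention that the entry for the empty split equals minus the sum of the other entries are assumptions made for illustration; they are not taken from the cited references.

```python
import math

def hadamard_transform(v):
    """Multiply a vector of length 2**m by the Sylvester Hadamard matrix H."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for i in range(start, start + h):
                a, b = v[i], v[i + h]
                v[i], v[i + h] = a + b, a - b
        h *= 2
    return v

def inverse_hadamard_transform(v):
    """H^{-1} = H / len(v) for the Sylvester Hadamard matrix."""
    return [x / len(v) for x in hadamard_transform(v)]

def sequence_spectrum(gamma):
    """Edge length spectrum -> sequence spectrum: s = H^{-1} exp(H gamma)."""
    return inverse_hadamard_transform(
        [math.exp(x) for x in hadamard_transform(gamma)])

def edge_length_spectrum(s):
    """Sequence spectrum -> edge length spectrum: gamma = H^{-1} log(H s)."""
    return inverse_hadamard_transform(
        [math.log(x) for x in hadamard_transform(s)])

# Three taxa, rooted at taxon 3. Splits are subsets of {1, 2} stored as
# bitmasks: 0 = {}, 1 = {1}, 2 = {2}, 3 = {1, 2}. (Illustrative values.)
lengths = {1: 0.10, 2: 0.20, 3: 0.05}
gamma = [0.0] * 4
for split, q in lengths.items():
    gamma[split] = q
gamma[0] = -sum(lengths.values())  # assumed convention: entries sum to zero

s = sequence_spectrum(gamma)        # site pattern probabilities
gamma_back = edge_length_spectrum(s)  # round trip recovers gamma
```

With these conventions the entries of `s` sum to one, and applying `edge_length_spectrum` to `s` returns the original `gamma` up to floating-point error.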
Although it is not expected that the empirical spectrum will precisely match a tree, Hendy [12] proposed using an optimisation criterion to map from this spectrum to the “closest tree”.

Several authors have commented that it is potentially a useful feature of Hadamard conjugation that data is not forced onto a fixed tree. The conflicting information can be retained and interpreted in the form of a “lentoplot” [13] or a splits-graph [14], with both of these methods implemented in Spectronet [15]. Schliep [16] gives some more statistical justification for such an approach by making a link to modern statistical techniques such as the Lasso and Ridge regression.

von Haeseler and Churchill [17] seems to be the first paper that explicitly suggests using Hadamard conjugation to provide a likelihood framework for networks. The principal idea in this work was to start with an edge length spectrum that encodes a set of incompatible splits, use the Hadamard transformation to get site probabilities and use these to determine a likelihood. This idea was further explored by Bryant [18], and Bryant [19] followed this through by defining the “n-taxon process” for group-based models. It should be noted that likelihoods calculated via Hadamard conjugation are not equivalent to likelihoods calculated by taking a mixture of trees. Indeed, Matsen and Steel [20] and Matsen et al. [21] used Hadamard methods in combination with phylogenetic invariants to show that mixtures of trees with the same topology can exactly mimic another tree under the 2-state model. Considering biological applications, thinking in terms of mixtures of trees or partitions where the data can be thought of as arising on a set of trees [22-24] seems more reasonable than the Hadamard conjugation.
Strimmer and Moulton [25] suggested using split networks as a springboard to likelihood-based analyses on DAGs, but later identified several problems with the approach [26]; most notably, in split networks internal nodes do not have a biological interpretation as an ancestor.

In Sumner et al. [27], we gave some additional insight into the interpretation of applying the Hadamard conjugation in a network setting. We showed that the permutation group structure inherent to the Hadamard transformation, as for any group-based model, restricts the resulting process from being capable of reproducing truly convergent processes. This is a serious limitation, as one of the biological motivations for explicit network models is the ability to model convergent processes. We also presented an alternative algebraic formalism for the general Markov model, analogous to the n-taxon process, but capable of reproducing convergent processes.

From the point of view of group representation theory, the inversion of group-based models relies on the fact that the irreducible representations of an abelian group are one-dimensional, and the model structure essentially reduces to analysing group characters; hence the standard presentation as a Fourier inversion. In this article, we make this connection concrete. For the general Markov model, it is then immediately apparent that an analogous inversion is not possible because the algebraic structure underlying the model is not abelian and hence the irreducible representations are not all one-dimensional. In fact, to obtain one-dimensional representations for the general Markov model, it is necessary to apply higher-degree polynomial maps (beyond the degree 1, linear case), and define “Markov invariants” [28]. These invariants present one-dimensional representations but at the cost of the higher degree: degree 5 in the case of the general Markov model with four states on quartet trees [29,30].
This connection between Hadamard transformation and Markov invariants is an interesting one, but we do not discuss it further here.

In this paper we approach the inversion of group-based phylogenetic models by taking a representation-theoretic perspective and working explicitly with tensor indices. Our approach rests heavily on the formalism of “phylogenetic tensors”, as presented in Bashford et al. [31], for the binary-symmetric and K3ST model, and Sumner et al. [27,28], for the general Markov model.

Although the main inversion results presented here are not more general than those in Székely et al. [7], we think it is important to reformulate them using the language of tensors and representation theory. This viewpoint has already led to new approaches for modeling convergent evolution [27] and for studying non-group-based models [28]. However, in none of our previous work was the link to Hadamard conjugation explicitly discussed. By presenting an old technique (Hadamard conjugation) in a new light we hope to introduce other researchers to the viewpoint of tensor analysis and representation theory.
Methods
Group-based models
We consider the continuous-time formulation of Markov processes, and show how to implement the inversion of a group-based phylogenetic model based on any abelian group G. We note that such an inversion requires a map from tensor product space (where elements are indexed by ordered-n-partitions) to phylogenetic splits (where elements are indexed by bipartitions). We achieve this by finding canonical maps from bipartitions to ordered-n-partitions.

For a group G (not necessarily abelian) with order |G|=d, we write $G=\{\sigma_1,\sigma_2,\ldots,\sigma_d\}$, and, when necessary, write ε∈G to specify the identity element of G. We will discuss the “regular representation” of G shortly, but skipping ahead we find that any rate matrix Q occurring in a group-based Markov model can be written in the form

$$Q=\sum_{\sigma\in G,\ \sigma\neq\varepsilon}\alpha_\sigma\,(K_\sigma-1),\qquad(1)$$

where each $\alpha_\sigma\geq 0$, 1 denotes the identity matrix, and the $K_\sigma$ are the permutation matrices corresponding to the (non-identity) group elements σ∈G.

For the reader interested in deriving this result, consider the d-dimensional vector space $\mathbb{C}G$ of formal linear combinations $v=\sum_{\sigma\in G}v_\sigma\,\sigma$, with scalar multiplication and vector addition defined via

$$\lambda v+w=\sum_{\sigma\in G}(\lambda v_\sigma+w_\sigma)\,\sigma$$

for all $\lambda\in\mathbb{C}$ and $v,w\in\mathbb{C}G$. The regular representation, $\rho_{reg}:G\to GL(\mathbb{C}G)$, is then defined by setting the group action

$$\sigma:\ v\mapsto\sum_{\sigma'\in G}v_{\sigma'}\,\sigma\sigma'$$

for all $v\in\mathbb{C}G$ and σ∈G. If we fix $\{\sigma_1,\sigma_2,\ldots,\sigma_d\}$ as an ordered basis for $\mathbb{C}G$, it is then clear, via Cayley’s theorem, that each group element σ gets mapped to a permutation matrix $K_\sigma:=\rho_{reg}(\sigma)$. Thus $K_\sigma$ has matrix elements

$$(K_\sigma)_{ji}=\begin{cases}1 & \text{if } \sigma\sigma_i=\sigma_j,\\ 0 & \text{otherwise.}\end{cases}\qquad(2)$$

Consider the unit column vectors $\xi_1=(1,0,\ldots,0)^T,\ \xi_2=(0,1,\ldots,0)^T,\ \ldots,\ \xi_d=(0,0,\ldots,1)^T$, and identify each $\xi_i$ with $\sigma_i$, so that the group action becomes $\sigma:\xi_i\mapsto K_\sigma\xi_i=\xi_j$ where $\sigma_j=\sigma\sigma_i$. Thus the matrix elements $(K_\sigma)_{ji}$ have i as the column label and j as the row label.

A group-based Markov model is then obtained by taking a continuous-time Markov chain with state space $G=\{\sigma_1,\sigma_2,\ldots,\sigma_d\}$ and using the group multiplication in G to assign a rate $\alpha_\sigma$ to all substitutions $\sigma_i\mapsto\sigma_j$ where $\sigma\sigma_i=\sigma_j$. Following this through (as is done in detail in [32]) we are led to the formula (1) for rate matrices in any group-based model.

The regular representation is one example of the general concept of a representation of G on a vector space V, defined as a homomorphism ρ:G→GL(V) satisfying ρ(g1g2)=ρ(g1)ρ(g2) for all g1,g2∈G. A representation is said to be reducible if there exists a proper, non-zero subspace U⊂V satisfying ρ(g)U⊆U for all g∈G, i.e. the set of matrices ρ(G) sends vectors in U back to U. In this case, U is called an invariant subspace. The representation ρ is called irreducible if V does not contain any such invariant subspace.

The reader should note that the usual construction of a “group-based” model [2] stipulates that G be abelian. Although the construction just given using the regular representation allows for non-abelian G, we will nonetheless only consider the abelian case in this paper, because, as discussed in the introduction, it is only in the abelian case that a (linear) inversion of phylogenetic models is possible. In this case the irreducible representations of G are all one-dimensional [33], and hence the analysis reduces to computations with group characters, as is exploited in the previous approaches using Fourier analysis [9,34].
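The construction of the regular representation and of a group-based rate matrix can be sketched in a few lines of Python. This is an illustration only: the helper names (`regular_representation`, `rate_matrix`), the choice of the cyclic group of order 3 written additively, and the rate values are assumptions, and the rate matrix is assembled as a non-negative combination of the terms $K_\sigma-1$, which is one way of writing a zero-column-sum combination of the permutation matrices.

```python
def regular_representation(elements, multiply):
    """Permutation matrix K_sigma for each group element sigma.

    Entry K[j][i] is 1 exactly when sigma * elements[i] == elements[j],
    so i is the column label and j is the row label.
    """
    index = {g: k for k, g in enumerate(elements)}
    d = len(elements)
    reps = {}
    for sigma in elements:
        K = [[0] * d for _ in range(d)]
        for i, g in enumerate(elements):
            K[index[multiply(sigma, g)]][i] = 1
        reps[sigma] = K
    return reps

def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def rate_matrix(reps, identity, rates):
    """Q = sum over sigma of alpha_sigma * (K_sigma - 1)."""
    d = len(reps[identity])
    Q = [[0.0] * d for _ in range(d)]
    for sigma, alpha in rates.items():
        for j in range(d):
            for i in range(d):
                Q[j][i] += alpha * (reps[sigma][j][i] - reps[identity][j][i])
    return Q

# Cyclic group of order 3, written additively (an illustrative choice).
Z3 = [0, 1, 2]
K = regular_representation(Z3, lambda a, b: (a + b) % 3)
Q = rate_matrix(K, 0, {1: 0.3, 2: 0.1})  # hypothetical rates
```

Because each `K[sigma]` is a permutation matrix, every row and every column of `Q` sums to zero, and the matrices multiply in the same way as the group elements they represent.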
Phylogenetic tensors
We denote [d]:={1,2,…,d} as the state space for a continuous-time Markov chain. Consider an n-taxa phylogenetic tree and a d-state phylogenetic pattern distribution $p_{i_1i_2\ldots i_n}$, with the interpretation that $p_{i_1i_2\ldots i_n}$ is the probability that, for each k, the observed state at the k-th leaf on the tree is $i_k$. As is shown in Sumner and Jarvis [35] and in more detail in Sumner et al. [27], such phylogenetic pattern distributions can be represented abstractly as tensors in the n-fold tensor product space $V\otimes V\otimes\ldots\otimes V$, as follows. If we choose $\{\xi_1,\xi_2,\ldots,\xi_d\}$ as an ordered basis for V, and $\{\xi_{i_1}\otimes\xi_{i_2}\otimes\ldots\otimes\xi_{i_n}\}$ as an ordered basis for the tensor product space, a “phylogenetic tensor” is then defined as

$$P=\sum_{i_1,i_2,\ldots,i_n}p_{i_1i_2\ldots i_n}\,\xi_{i_1}\otimes\xi_{i_2}\otimes\ldots\otimes\xi_{i_n}.$$

For readers who are unfamiliar with tensor products, it is possible to understand the general concept via the definition of the “Kronecker” product of an n×m matrix A and an n′×m′ matrix B as the nn′×mm′ matrix given by

$$A\otimes B=\begin{pmatrix}A_{11}B & A_{12}B & \cdots & A_{1m}B\\ \vdots & \vdots & & \vdots\\ A_{n1}B & A_{n2}B & \cdots & A_{nm}B\end{pmatrix}.$$

We can index the matrix A⊗B with row indices $i_1i_2=11,12,\ldots,nn'$ and column indices $j_1j_2=11,12,\ldots,mm'$, i.e. generically $(A\otimes B)_{i_1i_2,j_1j_2}=A_{i_1j_1}B_{i_2j_2}$ and specifically $(A\otimes B)_{12,32}=A_{13}B_{22}$. This point of view is useful if one wants to write out specific matrix representations of tensors; however, in the development that follows we will focus heavily on the indexing of tensor components in the various cases discussed.

Suppose $\pi=\sum_i\pi_i\xi_i$ represents the state distribution of a single taxon, i.e. $\pi_i$ is the probability that a randomly chosen site in the sequence will be in state i. Now suppose a phylogenetic branching event occurs and the sequence is copied. The corresponding phylogenetic tensor P representing the joint distribution of the two taxa just after the branching event then has the property that $p_{i_1i_2}=\pi_{i_1}$ if $i_2=i_1$ and is zero otherwise.
Thinking in terms of tensor operations, we find that phylogenetic branching events can be generated by a linear operator $\delta:V\to V\otimes V$ determined by δ(π)=P and defined in general using our chosen basis as

$$\delta(\xi_i)=\xi_i\otimes\xi_i.$$

The remarkable fact for group-based models, central to the present article, is that the permutation matrices “intertwine” particularly simply with the branching operator:

$$\delta\cdot K_\sigma=(K_\sigma\otimes K_\sigma)\cdot\delta.$$

Thus, for any rate matrix Q arising from a group-based model, we have (via the linearity of δ):

$$\delta\cdot Q=\Big(\sum_{\sigma\neq\varepsilon}\alpha_\sigma\,(K_\sigma\otimes K_\sigma-1\otimes 1)\Big)\cdot\delta.\qquad(3)$$

We also note that, since Q can be expressed as a linear combination of permutation matrices representing elements in a group G, the matrix powers Q², Q³, Q⁴, … will also be expressible as linear combinations of the same permutation matrices (although precise expressions for the relevant coefficients may or may not be easily computable). Together with (3), this implies that, for any substitution matrix $e^{Q}$ arising from matrix exponentiation,

$$\delta\cdot e^{Q}=\exp\Big(\sum_{\sigma\neq\varepsilon}\alpha_\sigma\,(K_\sigma\otimes K_\sigma-1\otimes 1)\Big)\cdot\delta.\qquad(4)$$

This relation shows that mathematically, and hence conceptually, “Markov evolution on a single taxon followed by a branching event” can be replaced with “Branching event on a single taxon followed by (correlated) Markov evolution of two taxa.” This equivalence is illustrated in Figure 1, and should be compared to the equivalent discussion of the “n-taxa process” given in [18] and [19].
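The intertwining of permutation matrices with the branching operator can be checked numerically. The sketch below is an illustration under stated assumptions: δ is represented as a d²×d matrix sending the i-th basis vector to the Kronecker product of that vector with itself, the binary case with a single swap matrix is used, and the helper names, the rate value and the matrix `M` are hypothetical.

```python
def kron(A, B):
    """Kronecker product: entry (i1 i2, j1 j2) equals A[i1][j1] * B[i2][j2]."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def branching_operator(d):
    """Matrix of delta: basis vector i is sent to (basis vector i) x (basis vector i)."""
    delta = [[0] * d for _ in range(d * d)]
    for i in range(d):
        delta[i * d + i][i] = 1
    return delta

def close(A, B, tol=1e-12):
    """Entry-wise comparison of two matrices of the same shape."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

K = [[0, 1], [1, 0]]           # permutation matrix of the non-identity element
delta = branching_operator(2)

# Permutation matrices intertwine with delta.
permutation_intertwines = close(matmul(delta, K), matmul(kron(K, K), delta))

# By linearity the same holds for Q = alpha (K - 1), with K x K - 1 x 1
# acting on the two-taxon space (alpha is an illustrative value).
alpha = 0.7
Q = [[alpha * (K[j][i] - int(i == j)) for i in range(2)] for j in range(2)]
KK = kron(K, K)
Q2 = [[alpha * (KK[j][i] - int(i == j)) for i in range(4)] for j in range(4)]
rate_intertwines = close(matmul(delta, Q), matmul(Q2, delta))

# A generic (non-permutation) matrix does not intertwine in this simple way.
M = [[0.9, 0.2], [0.1, 0.8]]
generic_intertwines = close(matmul(delta, M), matmul(kron(M, M), delta))
```

The last check is included to show that the simple form of the relation is special to permutation matrices, which is why the argument is tied to group-based models.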
Figure 1
Markov evolution on a single taxon followed by a branching event (illustrated on the left) is equivalent to a branching event on a single taxon followed by correlated Markov evolution of two taxa (illustrated on the right). Mathematically, this equivalence can be implemented by exploiting the equality given in (4).

In Sumner et al. [27] we showed how to generalise this intertwining action to the case of the general Markov model. Interestingly, for the general Markov model the appropriate intertwining has quite a different structure from what occurs in group-based models, and hence the simplicity of (4) is somewhat misleading in general. We refer the reader to Sumner et al. [27] for more discussion on this point.

Returning to the case of group-based models, for each subset A⊆[n], we define a linear map on the n-fold tensor product space as the tensor product $K_\sigma^{(A)}:=K_\sigma^{a_1}\otimes K_\sigma^{a_2}\otimes\ldots\otimes K_\sigma^{a_n}$, where $a_i=1$ if i∈A and 0 otherwise. For example, if n=5 and A={1,3}, we have $K_\sigma^{(A)}=K_\sigma\otimes 1\otimes K_\sigma\otimes 1\otimes 1$.

To develop a phylogenetic tensor on a tree, we root the phylogenetic tree at taxon n, and label edges by subsets ∅≠e⊆[n−1], where i∈e if the path from taxon n to taxon i crosses the edge labelled by e. A six-taxon tree with this labelling is presented in Figure 2. To each edge labelled by ∅≠e⊆[n−1], we assign the rate matrix
Figure 2
A six-taxon tree rooted at taxon 6 with edges labelled by subsets of {1,2,3,4,5}.

$$Q_e:=\sum_{\sigma\neq\varepsilon}\alpha^{(e)}_\sigma\,\big(K^{(e)}_\sigma-1\big),$$

where each $\alpha^{(e)}_\sigma$ is the rate of substitution for all states $\sigma_i$ to $\sigma_j$ satisfying $\sigma\sigma_i=\sigma_j$, and 1 denotes the identity on the tensor product space. Each edge is then assigned the substitution matrix $e^{Q_e}$, so that the time parameter for each edge is absorbed into the definition of $Q_e$.

Now iterating (4) multiple times, it is shown in [27,31] that any phylogenetic tensor can be written as

$$P=\exp\Big(\sum_{\emptyset\neq e\subseteq[n-1]}Q_e\Big)\cdot\delta^{n-1}\pi,\qquad(5)$$

where $\delta^{n-1}$ denotes n−1 successive applications of the branching operator, and $\delta^{n-1}\pi$ is the d×d×…×d tensor that represents the “zero edge-length star tree” distribution on n taxa. It is this form of phylogenetic tensors that will do a lot of the heavy lifting in the discussion that follows. The reader should note that under this representation, there is no need for the edge parameters to be chosen to be compatible with a particular tree, hence the possibilities for generalising to non-tree-like or network models, as discussed in the introduction.

The stationary distribution for group-based models is uniform (because the rows and the columns of the rate matrices each sum to zero, so that the substitution matrices are doubly stochastic). In this paper we always assume a stationary distribution, so that $\pi_i=1/d$ for all states i, and $\delta^{n-1}\pi$ has tensor components equal to 1/d if $i_1=i_2=\ldots=i_n$ and zero otherwise.

This concludes our discussion of the tensor presentation of phylogenetic probability distributions under group-based models. It is important to note that everything discussed so far works for any group-based model, with no requirement that the underlying group G be abelian.

In what follows, we discuss the inversion of abelian group-based models. We present the simplest case with $G=\mathbb{Z}_2$; the $\mathbb{Z}_3$ case; the $\mathbb{Z}_2\times\mathbb{Z}_2$ case; the general $\mathbb{Z}_r$ case; and finally we discuss the case of any abelian group.
Results
The binary-symmetric case
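We begin with the inversion of the so-called “binary-symmetric” model. Consider $V\cong\mathbb{C}^2$ with standard basis $\xi_0=(1,0)^T$, $\xi_1=(0,1)^T$. As a group-based model, the binary-symmetric model arises by taking the group $\mathbb{Z}_2=\{e,\sigma\}$, with a generic rate matrix given by $Q=\alpha(K-1)$, where $K=\begin{pmatrix}0&1\\1&0\end{pmatrix}$ is the permutation matrix representing σ in the standard basis.

Now $\rho:\mathbb{Z}_2\to GL(V)$, with σ↦K, is the regular representation of $\mathbb{Z}_2$, and the character table of $\mathbb{Z}_2$ given in Table 1 is easily recognised to be the Hadamard matrix $h=\begin{pmatrix}1&1\\1&-1\end{pmatrix}$.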
Table 1
The character table of ℤ₂

         id     sgn
[e]       1      1
[σ]       1     −1
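As $\mathbb{Z}_2$ is an abelian group, the irreducible representations are one-dimensional. The corresponding projection operators can be read off from the columns of the character table. That is, the operators $\tfrac12(1+K)$ and $\tfrac12(1-K)$ project ρreg=id⊕sgn onto the id and sgn representations of $\mathbb{Z}_2$, respectively.

This observation prompts us to work in the alternative basis obtained by applying h to the standard basis. In this basis the permutation matrix is diagonal: $\hat K=hKh^{-1}=\mathrm{diag}(1,-1)$. The representation-theoretic perspective on this diagonal form is to observe that id(σ)=1 and sgn(σ)=−1.

Referring to (5), we know that we can write a generic phylogenetic tensor as

$$P=\exp\Big(\sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e\,\big(K^{(e)}-1\big)\Big)\cdot\delta^{n-1}\pi,$$

where $\alpha_e\geq 0$.

We index matrix and tensor components using $i,j\in\mathbb{Z}_2=\{0,1\}$ and allow multiplication × in the ring of integers $\mathbb{Z}_2$. The Hadamard matrix then has matrix elements $h_{ji}=(-1)^{i\times j}$, where j is the row index and i is the column index. Observe that in the diagonal basis, the permutation matrix has elements $\hat K_{ji}=(-1)^{i}\delta_{ij}$. Thus we have expressions such as $(\hat K\otimes 1\otimes\hat K)_{j_1j_2j_3,\,i_1i_2i_3}=(-1)^{i_1+i_3}\,\delta_{i_1j_1}\delta_{i_2j_2}\delta_{i_3j_3}$.

As we are dealing with tensors of arbitrary size, it is convenient to represent a string such as $i_1i_2\ldots i_n$ as an ordered-bipartition $\mu=\mu_0{:}\mu_1$ of the set [n], where $\mu_0,\mu_1\subseteq[n]$ with $j\in\mu_k$ if and only if $i_j=k$. For example, with n=4 the string 0110 is equivalent to {1,4}:{2,3}, whereas {1,4}:{2,3} and {2,3}:{1,4} are inequivalent because the order of the parts matters. We then have $\hat K^{(e)}_{\mu\nu}=(-1)^{|e\cap\mu_1|}\,\delta_{\mu\nu}$. Defining $h^{(n)}:=h^{(n-1)}\otimes h$ where $h^{(1)}:=h$, in our notation $h^{(n)}$ has tensor components $h^{(n)}_{\mu\nu}=(-1)^{|\mu_1\cap\nu_1|}$.

The zero edge-length star-tree initial distribution has tensor components $(\delta^{n-1}\pi)_{i_1i_2\ldots i_n}=\tfrac12\,\delta_{i_1i_2}\delta_{i_1i_3}\cdots\delta_{i_1i_n}$ (where, although it seems we have given preference to taxon 1 in this expression, there are many ways that this distribution can be expressed using the δ).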
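In the diagonal basis with $\hat P=h^{(n)}P$, the star-tree distribution has components $\tfrac12\big(1+(-1)^{|\mu_1|}\big)$, which is exactly the statement that the component indexed by μ equals 1 if $|\mu_1|$ is even and vanishes if $|\mu_1|$ is odd. Since each $\hat K^{(e)}$ is diagonal in the transformed basis, we can conclude that

$$\hat P_\mu=\tfrac12\big(1+(-1)^{|\mu_1|}\big)\exp\Big(\sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e\big((-1)^{|e\cap\mu_1|}-1\big)\Big).$$

Of course many of these tensor components will be zero and we would like to ignore these.

Take $u=u_0{:}u_1$ as an ordered bipartition of the reduced set [n−1], so that $u\equiv i_1i_2\ldots i_{n-1}$ where $j\in u_k$ if and only if $i_j=k$, and define γ(u)=0 if $|u_1|$ is even and γ(u)=1 if $|u_1|$ is odd, and interpret u·γ(u) as a string: $u\cdot\gamma(u)=i_1i_2\ldots i_{n-1}\gamma(u)$.

If we make the definition $\eta_u:=\log\hat P_{u\cdot\gamma(u)}$, then the non-zero components can be written as

$$\eta_u=\sum_{\emptyset\neq e\subseteq[n-1]}\alpha_e\big((-1)^{|e\cap u_1|}-1\big),\qquad(6)$$

with $\hat P_{u\cdot\gamma(u)}=e^{\eta_u}$ as inverse. This is the first part of the inversion.

We would like to go further and actually recover the individual edge weights $\alpha_e$. To do this we define the (square) $2^{n-1}\times 2^{n-1}$ matrix F with components $F_{u,e}=(-1)^{|e\cap u_1|}$, with e a subset and u an ordered-bipartition of [n−1]. As $F^2=2^{n-1}\,1$, we see that F provides its own inverse (up to scaling), $F^{-1}$, with components $2^{-(n-1)}(-1)^{|e\cap u_1|}$. Defining the column vectors $\eta=(\eta_u)$ and $\alpha=(\alpha_e)$, with the convention $\alpha_\emptyset:=-\sum_{e\neq\emptyset}\alpha_e$, we can write the matrix equations $\eta=F\alpha$ and $\alpha=F^{-1}\eta$. Together with the first part of the inversion (6), these equations give a one-one map between pattern probabilities and edge weights for the binary-symmetric model.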
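The two parts of the binary-symmetric inversion can be traced on a three-taxon example. The following Python sketch is illustrative only: it assumes the tensor form (5) with each edge e acting by flipping the states of the taxa in e, it assumes γ(u) is the parity of the number of ones in u, and it assumes the convention that the weight attached to the empty set equals minus the sum of the edge weights. All function names and numerical values are hypothetical.

```python
import math
from itertools import product

def star_tensor(n):
    """Zero edge-length star tree with uniform root: 1/2 on 00...0 and 11...1."""
    return {bits: (0.5 if len(set(bits)) == 1 else 0.0)
            for bits in product((0, 1), repeat=n)}

def apply_edge(P, edge, alpha):
    """Apply exp(alpha (K^(e) - 1)), where K^(e) flips the taxa listed in `edge`.

    Since K^(e) squares to the identity,
    exp(alpha (K - 1)) = e^{-alpha} (cosh(alpha) 1 + sinh(alpha) K).
    """
    stay = math.exp(-alpha) * math.cosh(alpha)
    flip = math.exp(-alpha) * math.sinh(alpha)
    out = {}
    for bits in P:
        flipped = tuple(b ^ 1 if (k + 1) in edge else b
                        for k, b in enumerate(bits))
        out[bits] = stay * P[bits] + flip * P[flipped]
    return out

def direct_three_taxon(alphas):
    """Pattern probabilities on a 3-taxon star tree, summing over the root state."""
    change = [(1 - math.exp(-2 * a)) / 2 for a in alphas]
    return {bits: sum(0.5 * math.prod(p if b != x else 1 - p
                                      for b, p in zip(bits, change))
                      for x in (0, 1))
            for bits in product((0, 1), repeat=3)}

def hadamard_component(P, mu):
    """Component of the Hadamard-transformed tensor indexed by the 0/1 string mu."""
    return sum((-1) ** sum(m * b for m, b in zip(mu, bits)) * p
               for bits, p in P.items())

# Tree rooted at taxon 3; the edge {1, 2} is the one leading to taxon 3.
edges = {frozenset({1}): 0.10, frozenset({2}): 0.20, frozenset({1, 2}): 0.05}
P = star_tensor(3)
for e, a in edges.items():
    P = apply_edge(P, e, a)
P_direct = direct_three_taxon([0.10, 0.20, 0.05])

# First part: logarithms of the non-vanishing transformed components.
log_hat = {}
for u in product((0, 1), repeat=2):
    mu = u + (sum(u) % 2,)          # append gamma(u) so that |mu_1| is even
    log_hat[u] = math.log(hadamard_component(P, mu))

# Second part: a Hadamard-type sum over u recovers the edge weights.
recovered = {e: sum((-1) ** sum(a * b for a, b in zip(e, u)) * L
                    for u, L in log_hat.items()) / 4
             for e in product((0, 1), repeat=2)}
```

Here `recovered[(1, 0)]`, `recovered[(0, 1)]` and `recovered[(1, 1)]` correspond to the edges {1}, {2} and {1,2}, and `recovered[(0, 0)]` carries minus their sum under the convention assumed above.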
Inversion of the ℤ₃ model
Taking confidence from the previous case we now discuss the inversion of the group-based phylogenetic model with $G=\mathbb{Z}_3$. We take $\mathbb{Z}_3=\{e,\sigma,\sigma^2\}$ and, by analogy to the $\mathbb{Z}_2$ case, index tensors with indices i,j=0,1,2 and allow multiplication × by extending to the ring $\mathbb{Z}_3$.

In this case a generic rate matrix is given by $Q=\alpha(K_\sigma-1)+\beta(K_{\sigma^2}-1)$, where

$$K_\sigma=\begin{pmatrix}0&0&1\\1&0&0\\0&1&0\end{pmatrix},\qquad K_{\sigma^2}=\begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix}$$

are the matrices representing the permutations σ≅(123) and σ²≅(132) under the regular representation, respectively.

We define $\omega=e^{2\pi i/3}$, and present the character table of $\mathbb{Z}_3$ in Table 2. The decomposition of the regular representation is ρreg=id⊕ω⊕ω², and the columns of the character table give the projection operators onto the (one-dimensional) irreducible subspaces:
Table 2
The character table of ℤ₃

         id     ω      ω²
[e]       1     1      1
[σ]       1     ω      ω²
[σ²]      1     ω²     ω
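Therefore, the matrix

$$f=\begin{pmatrix}1&1&1\\1&\omega&\omega^2\\1&\omega^2&\omega\end{pmatrix}$$

diagonalizes the generic rate matrix for this model: both $fK_\sigma f^{-1}$ and $fK_{\sigma^2}f^{-1}$ are diagonal, with the cube roots of unity 1, ω, ω² on the diagonal.

We recall our basic result (5) that for group-based models, a generic phylogenetic tensor can be expressed as

$$P=\exp\Big(\sum_{\emptyset\neq e\subseteq[n-1]}\Big[\alpha_e\big(K_\sigma^{(e)}-1\big)+\beta_e\big(K_{\sigma^2}^{(e)}-1\big)\Big]\Big)\cdot\delta^{n-1}\pi,$$

where $\alpha_e,\beta_e\geq 0$. We take the stationary distribution as initial distribution, so $\pi_i=\tfrac13$ for i=0,1,2.

The matrix elements of f can be expressed as $f_{ji}=\omega^{i\times j}$, where we extend $\mathbb{Z}_3$ to include multiplication × from the ring of integers $\mathbb{Z}_3$. Similarly, $(f^{-1})_{ji}=\tfrac13\,\omega^{-i\times j}$. More generally, tensorial components can be expressed as

We represent a string $i_1i_2\ldots i_n$ as an ordered-tripartition, $i_1i_2\ldots i_n\equiv\mu=\mu_0{:}\mu_1{:}\mu_2$, of the set [n], where $j\in\mu_k$ if and only if $i_j=k$. For example, if we take n=5, we have: Taking n = 3, we have and in general:

Taking the uniform distribution as initial distribution, the initial star-tree distribution can be written as $(\delta^{n-1}\pi)_{i_1i_2\ldots i_n}=\tfrac13\,\delta_{i_1i_2}\delta_{i_1i_3}\cdots\delta_{i_1i_n}$. Defining $f^{(n)}:=f^{(n-1)}\otimes f$ where $f^{(1)}=f$, we have and in the transformed basis, where $\hat P=f^{(n)}P$, we have Indexing by ordered-tripartitions, we conclude that the transformed star-tree distribution has components

$$\tfrac13\big(1+\omega^{w}+\omega^{2w}\big),\qquad w:=|\mu_1|+2|\mu_2|.$$

Now suppose $|\mu_1|+2|\mu_2|=0$ (mod 3), then this component equals 1. If $|\mu_1|+2|\mu_2|=1$ (mod 3), then it equals $\tfrac13(1+\omega+\omega^2)=0$, and if $|\mu_1|+2|\mu_2|=2$ (mod 3), then it equals $\tfrac13(1+\omega^2+\omega^4)=0$. Thus we have found a basis where all the elements of the initial star-tree tensor are zero unless the tripartition μ satisfies $|\mu_1|+2|\mu_2|=0$ (mod 3). Crucially, this statement also holds for the phylogenetic tensor because in this basis the rate matrices of this model are diagonal:

We deal with this condition on μ by taking $u=u_0{:}u_1{:}u_2$ as an ordered-tripartition of the reduced set [n−1] and setting μ=u·γ(u) (considered as the concatenation of strings) where $\gamma(u)\in\{0,1,2\}$ is chosen so that $|u_1|+2|u_2|+\gamma(u)=0$ (mod 3). If we make the definitions we then have the first part of the inversion

As in the $\mathbb{Z}_2$ case, we would like to use η to recover the rate parameters $\alpha_e,\beta_e$ for all ∅≠e⊆[n−1] and thus complete the full inversion for this model. Of course, it is a little bit more difficult this time.

Recall that $\mu=\mu_0{:}\mu_1{:}\mu_2$ with $\mu_k\subseteq[n]$, whereas $u=u_0{:}u_1{:}u_2$ with $u_k\subseteq[n-1]$, and ∅≠e⊆[n−1].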
Considering it follows that and similarly We make the observation that and where $F_1$ and $F_2$ are $2^{n-1}\times 3^{n-1}$ matrices. Thus we may write Defining the column vectors and , we can write and define two $3^{n-1}\times 2^{n-1}$ matrices $G_1$ and $G_2$ as where with $ff^{-1}=1$. Considering that for all ordered-tripartitions u,w of [n−1], we have the matrix products Thus the second part of the inversion for this model is Together with (7), these equations give a one-one map between pattern probabilities and edge weights for the group-based model with $G=\mathbb{Z}_3$.
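The two structural facts used in this section, that the character table diagonalizes the permutation matrices and that the transformed star-tree tensor vanishes unless |μ1|+2|μ2|=0 (mod 3), can be checked numerically. The sketch below assumes the character table has entries given by powers of ω indexed by the product of row and column labels, with ω a primitive cube root of unity, and that σ acts as a cyclic shift of the basis; the function names are hypothetical.

```python
import cmath
from itertools import product

omega = cmath.exp(2j * cmath.pi / 3)

# Character table of the cyclic group of order 3 and its inverse.
f = [[omega ** (i * j) for i in range(3)] for j in range(3)]
f_inv = [[f[i][j].conjugate() / 3 for i in range(3)] for j in range(3)]

def matmul(A, B):
    """Plain-Python matrix product (works for complex entries)."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

# Permutation matrix of sigma: basis vector i is sent to basis vector i+1 (mod 3).
K_sigma = [[1 if j == (i + 1) % 3 else 0 for i in range(3)] for j in range(3)]
K_hat = matmul(matmul(f, K_sigma), f_inv)   # expected to be diagonal

def transformed_star(n):
    """Components of the uniform star-tree tensor after applying f to every factor."""
    out = {}
    for mu in product(range(3), repeat=n):
        total = 0
        for c in range(3):              # c is the common state at all leaves
            term = 1 / 3
            for m in mu:
                term *= f[m][c]
            total += term
        out[mu] = total
    return out

star_hat = transformed_star(4)
nonzero = [mu for mu, val in star_hat.items() if abs(val) > 1e-9]
```

Every string in `nonzero` has its number of ones plus twice its number of twos divisible by 3, which is the selection rule that motivates appending γ(u) to a string u of length n−1.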
Inversion of the K3ST model
We now consider the K3ST model [36] which occurs as the group-based model with $G=\mathbb{Z}_2\times\mathbb{Z}_2$. In this model a generic rate matrix is given by $Q=\sum_{ab\neq 00}\alpha_{ab}\,(K_{ab}-1)$, where

$$K_{01}=1\otimes K,\qquad K_{10}=K\otimes 1,\qquad K_{11}=K\otimes K.\qquad(8)$$

We already know that the 2×2 Hadamard matrix h diagonalizes K, so we see immediately that H=h⊗h diagonalizes this model: each $HK_{ab}H^{-1}$ is diagonal. Of course H is the character table of $\mathbb{Z}_2\times\mathbb{Z}_2$ and the permutation matrices (8), together with $K_{00}:=1$, give the regular representation ρreg≅id⊗id⊕id⊗sgn⊕sgn⊗id⊕sgn⊗sgn, where we recall the basic result that the tensor product of two irreducible representations of a group G gives an irreducible representation of G×G.

Simplifying notation, for this model we index tensors with indices given as pairs, $i=ab$ with $a,b\in\mathbb{Z}_2$, and we express the individual parts using lower case Roman characters. For example, we write i:=ab=01, with a=0 and b=1. This gives matrix elements $H_{ab,cd}=(-1)^{a\times c+b\times d}$, and more complicated tensor products such as

Again we interpret strings such as $\mu\equiv a_1a_2\ldots a_n$ and $\nu\equiv b_1b_2\ldots b_n$ as ordered-bipartitions $\mu=\mu_0{:}\mu_1$ and $\nu=\nu_0{:}\nu_1$ of the set [n]. We can then write matrix elements of tensor products as

Taking the stationary distribution as initial distribution, the zero edge-length star-tree distribution is given by which in the finer index representation is

Recall that elements of the Hadamard matrix can be written as $h_{ij}=(-1)^{i\times j}$, where $i,j\in\mathbb{Z}_2$ and we allow multiplication × by extending to the ring of integers $\mathbb{Z}_2$. In the transformed basis, we have

We recall (5), so under this model we can express a generic phylogenetic tensor as

To exclude the vanishing components we define, for all ordered bipartitions $u=u_0{:}u_1$ of the reduced set [n−1], and interpret u·γ(u) as the string $u\cdot\gamma(u)=a_1a_2\ldots a_{n-1}\gamma(u)$. Then, for each pair u,v of ordered-bipartitions of [n−1], we define and This gives the inversion

Consider the rectangular matrices $F_{01}$, $F_{10}$ and $F_{11}$ with components where e⊆[n−1] and $u=u_0{:}u_1$ and $v=v_0{:}v_1$ are ordered-bipartitions of [n−1].
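If we define the column vector indexed by pairs of ordered-bipartitions and the column vectors , and indexed by subsets of [n−1], we then have the matrix equation Writing $H^{(n)}:=H^{(n-1)}\otimes H$ with $H^{(1)}=H$, we note that and define the rectangular matrices $G_{01}$, $G_{10}$ and $G_{11}$ as Noting that for all u,v,y,z ordered-bipartitions of [n−1], we then have the matrix identities and Writing completes the inversion for the K3ST model.

We now consider the group-based model for $G=\mathbb{Z}_r$. For this model the generic rate matrix has the form $Q=\sum_{s=1}^{r-1}\alpha_s\,(K^s-1)$, where $\alpha_s\geq 0$ and K is the permutation matrix representing a generator of $\mathbb{Z}_r$, so that $K^r=1$.

Defining $\omega=e^{2\pi i/r}$, we have $\omega^r=1$ and $1+\omega+\omega^2+\ldots+\omega^{r-1}=0$, and $f_{ij}=\omega^{i\times j}$ where i,j=0,1,2,…,r−1. Of course, f is the character table of $\mathbb{Z}_r$ and $f^{-1}=\tfrac1r\bar f$.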
Lemma 1.
where μ,ν,μ′ are ordered-r-partitions of the set [n] defined by the strings $i_1i_2\ldots i_n$, $j_1j_2\ldots j_n$ and $k_1k_2\ldots k_n$, respectively.
Proof.
The result is obvious by the definition of tensor product. However, explicitly we have which clearly equals 1 if $i_\ell-k_\ell=0$ for all ℓ, and, by repeatedly applying $1+\omega+\omega^2+\ldots+\omega^{r-1}=0$, equals 0 otherwise.

The regular representation contains exactly one copy of every irreducible representation and the irreducible representations of $\mathbb{Z}_r$ are given by the powers of ω: Thus the change of basis will give diagonal matrices . Additionally,
Lemma 2.
In the diagonal basis, the matrices have matrix elements given by .

Proof. Consider the matrix elements . Thus where we have used . Now and Translating this result using the ordered-r-partitions for indices, we have
Lemma 3.
In the diagonal basis, the uniform initial distribution on the star tree has components where $\mu=\mu_0{:}\mu_1{:}\mu_2{:}\ldots{:}\mu_{r-1}$ is an ordered-r-partition of the set [n].

Again recall that for this model a generic phylogenetic tensor can be written as where . In the diagonal basis, and as a consequence of Lemma 3, the phylogenetic tensor will have many vanishing components. To avoid these we take $u=u_0{:}u_1{:}u_2{:}\ldots{:}u_{r-1}$ as an ordered-r-partition of [n−1] and set If we define and we then have the first part of the inversion for the $\mathbb{Z}_r$ model:

For each i∈[r−1], we define the column vectors , and, for each ∅≠e⊆[n−1] and u an ordered-r-partition of [n−1], we define the rectangular $r^{n-1}\times 2^{n-1}$ matrices so we have the vector equation We claim that
Lemma 4.

Proof. We recall that , so, for $\mu=\mu_0{:}\mu_1{:}\mu_2{:}\ldots{:}\mu_{r-1}$ an ordered-r-partition of [n], and e a subset of [n−1] we have so because e⊆[n−1]. On the other hand , so where e appears in the s-th position.

Define, for i∈[r−1], the rectangular $2^{n-1}\times r^{n-1}$ matrices Of course GF=δ1, so we now have the second part of the inversion:
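The orthogonality that underlies Lemma 1 can be verified directly for small r and n. The sketch below assumes the character table of the cyclic group of order r has entries given by powers of a primitive r-th root of unity indexed by the product of row and column labels, and that tensor components are products over positions of the strings; the helper names are hypothetical.

```python
import cmath
from itertools import product

def character_table(r):
    """Character table of the cyclic group of order r as powers of exp(2 pi i / r)."""
    omega = cmath.exp(2j * cmath.pi / r)
    return [[omega ** ((i * j) % r) for i in range(r)] for j in range(r)]

def tensor_entry(f, mu, nu):
    """Component of the n-fold tensor power of f indexed by the strings mu and nu."""
    out = 1
    for i, j in zip(mu, nu):
        out *= f[i][j]
    return out

def orthogonality_defect(r, n):
    """Largest deviation of the normalised sum over nu from the Kronecker delta."""
    f = character_table(r)
    strings = list(product(range(r), repeat=n))
    worst = 0.0
    for mu in strings:
        for mu2 in strings:
            total = sum(tensor_entry(f, mu, nu)
                        * tensor_entry(f, nu, mu2).conjugate()
                        for nu in strings) / r ** n
            expected = 1.0 if mu == mu2 else 0.0
            worst = max(worst, abs(total - expected))
    return worst

defect_r4_n2 = orthogonality_defect(4, 2)
defect_r5_n1 = orthogonality_defect(5, 1)

# The roots of unity sum to zero, which is what kills the off-diagonal terms.
root_sum = sum(character_table(5)[1])
```

The defects are zero up to rounding, reflecting the repeated use of the vanishing sum of the r-th roots of unity in the proof.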
Inversion of any abelian group-based model
Lemma 5.
Any finite abelian group G is isomorphic to a direct product of cyclic groups of prime-power order, i.e. $G\cong\mathbb{Z}_{r_1}\times\mathbb{Z}_{r_2}\times\ldots\times\mathbb{Z}_{r_q}$, where each $r_s=p_s^{n_s}$ with $p_s$ prime and $n_s$ a positive integer.
Lemma 6.
The group-based model arising from G is defined only up to group isomorphisms of G.

Proof. A generic rate matrix for the group-based model arising from G is given by (1). Under a group isomorphism ϕ:G→G′, we have $\phi(\sigma_i\sigma_j)=\phi(\sigma_i)\phi(\sigma_j)$. Recall (2), so that the matrix elements $(K_\sigma)_{ji}$ are set via the action $\sigma_i\mapsto\sigma\sigma_i=\sigma_j$. If we consider the regular representation of G′ we then have $(K'_{\phi(\sigma)})_{ji}$ defined by $\phi(\sigma_i)\mapsto\phi(\sigma)\phi(\sigma_i)$. Now $\phi(\sigma)\phi(\sigma_i)=\phi(\sigma\sigma_i)=\phi(\sigma_j)$ and, because ϕ is a group isomorphism, this occurs if and only if $\sigma\sigma_i=\sigma_j$. Thus $(K'_{\phi(\sigma)})_{ji}=(K_\sigma)_{ji}$ for all i and j.

This means that we can restrict attention to a single representative in the isomorphism class of G. Of course, for this purpose we choose the representative guaranteed by Lemma 5.

Thus, for any abelian group G, with generators $\sigma_1,\sigma_2,\ldots,\sigma_q$, the corresponding group-based model has rate generators given by for all , where is the permutation matrix representing the generator . The character table f of G is simply the tensor product of the individual character tables of the $\mathbb{Z}_{r_s}$: $f=f_1\otimes f_2\otimes\ldots\otimes f_q$. In the diagonal basis we have matrix elements where ω is a k-th root of unity. Thus

We write phylogenetic tensors for this model in the form where 0≤i≤r for all 0≤s≤q. We simplify notation by writing each group of indices as an ordered-$r_s$-partition of [n].
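The statement that the character table of a product of cyclic groups is the tensor product of the individual character tables can be illustrated numerically. The sketch below uses the product of cyclic groups of orders 2 and 3 as a hypothetical example, takes each generator to act as a cyclic shift on its own factor, and checks that the Kronecker product of the two character tables diagonalizes both generators; all names are illustrative.

```python
import cmath

def character_table(r):
    """Character table of the cyclic group of order r."""
    omega = cmath.exp(2j * cmath.pi / r)
    return [[omega ** ((i * j) % r) for i in range(r)] for j in range(r)]

def shift_matrix(r):
    """Permutation matrix sending basis vector i to basis vector i+1 (mod r)."""
    return [[1 if j == (i + 1) % r else 0 for i in range(r)] for j in range(r)]

def identity(r):
    return [[1 if i == j else 0 for i in range(r)] for j in range(r)]

def kron(A, B):
    """Kronecker product of two matrices."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(A, B):
    """Plain-Python matrix product (works for complex entries)."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def is_diagonal(M, tol=1e-9):
    return all(abs(M[j][i]) < tol
               for j in range(len(M)) for i in range(len(M)) if i != j)

orders = [2, 3]                          # illustrative choice of cyclic factors
f = kron(character_table(2), character_table(3))
size = len(f)
f_inv = [[f[i][j].conjugate() / size for i in range(size)] for j in range(size)]

# One generator per cyclic factor, acting trivially on the other factor.
generators = [kron(shift_matrix(2), identity(3)),
              kron(identity(2), shift_matrix(3))]
diagonalised = [matmul(matmul(f, K), f_inv) for K in generators]
all_diagonal = all(is_diagonal(M) for M in diagonalised)
```

The diagonal entries of each transformed generator are roots of unity of the order of the corresponding cyclic factor, which is the sense in which the analysis reduces to group characters.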
Lemma 7.
In the diagonal basis, the uniform initial distribution on the star tree has components

A generic phylogenetic tensor for this model can be expressed as where π is the uniform distribution on states, i.e. In the diagonal basis , and, as a consequence of the previous lemma, P has many vanishing components. To avoid these, for each i∈[q] we take as an ordered-$r_i$-partition of [n−1] and set We then define and so that we have the first part of the inversion

We define the column vectors and where u is an ordered-$r_i$-partition of [n−1], and we define the $(r_1r_2\cdots r_q)^{n-1}\times 2^{n-1}$ matrices where in each term e appears in the position and the equality follows from Lemma 4. We can then write the vector equation If we define the $2^{n-1}\times(r_1r_2\cdots r_q)^{n-1}$ matrices where in each term e appears in the position, we have the orthogonality relations This gives us the second part of the inversion of any group-based model:
Conclusion
In this article we have given an alternative derivation of the inversion of group-based phylogenetic models. Our method relies primarily on the remarkable intertwining relation between branching events and Markov evolution (4), and the resulting simplified expression of phylogenetic tensors given in (5). From there we took a representation-theoretic approach, concentrating on the structure of tensor indices.