Yue Zhang, Chunfang Zheng, David Sankoff.
Abstract
BACKGROUND: A basic tool for studying the polyploidization history of a genome, especially in plants, is the distribution of duplicate gene similarities in syntenically aligned regions of a genome. This distribution can usually be decomposed into two or more components identifiable by peaks, or local maxima, each representing a different polyploidization event. The distributions may be generated by means of a discrete time branching process, followed by a sequence divergence model. The branching process, as well as the inference of fractionation rates based on it, requires knowledge of the ploidy level of each event, which cannot be directly inferred from the pair similarity distribution.
Keywords: Branching process; Gene triples; Plant genomes; Polyploidy; Whole genome doubling; Whole genome tripling
Year: 2019 PMID: 31842736 PMCID: PMC6915858 DOI: 10.1186/s12859-019-3202-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. Distribution of gene pair similarities. Pairs in the Durio zibethinus (COGE ID 51764) genome after two rounds of whole genome tripling. Discrimination point H = 85.2%. Cut-off for pairs not originating in polyploidization: > 98%
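The two thresholds in this caption can be read as a partition of the pair similarities. The following is a minimal Python sketch of that reading, assuming similarities are expressed as percentages, that pairs below H belong to the older event, and that pairs from H up to the cut-off belong to the more recent one. The function name `classify_pairs`, the labels, the treatment of values exactly at a threshold, and the sample values are all illustrative assumptions, not taken from the paper.

```python
def classify_pairs(similarities, discrimination_point, upper_cutoff):
    """Count gene pairs per component of the similarity distribution.

    Assumed convention (not from the paper): values above `upper_cutoff`
    are excluded, values from `discrimination_point` up to the cut-off are
    assigned to the more recent event, and lower values to the older event.
    """
    counts = {"older": 0, "recent": 0, "excluded": 0}
    for similarity in similarities:
        if similarity > upper_cutoff:
            counts["excluded"] += 1
        elif similarity >= discrimination_point:
            counts["recent"] += 1
        else:
            counts["older"] += 1
    return counts


# Made-up similarity values, using the thresholds quoted for Fig. 1.
example_counts = classify_pairs([72.4, 80.1, 85.2, 91.7, 99.3], 85.2, 98.0)
print(example_counts)
```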
Fig. 2. Sample trajectory, starting from a single gene, of branching process based on two whole genome triplings
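A trajectory of this kind can be sketched as a small simulation. The Python below is an illustration under stated assumptions, not the authors' implementation: each gene is assumed to leave one, two or three surviving copies independently of the others, with at least one copy always surviving, and the function names, labels and numeric probabilities are hypothetical.

```python
import random


def offspring_distribution(ploidy, p_two, p_three=0.0):
    """Probabilities that 1, 2 or 3 copies of a gene survive one event.

    Assumption: at least one copy always survives, so
    P(1 copy) = 1 - p_two - p_three.
    """
    if ploidy == 2 and p_three != 0.0:
        raise ValueError("a doubling cannot leave three copies")
    p_one = 1.0 - p_two - p_three
    if p_two < 0.0 or p_three < 0.0 or p_one < -1e-9:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    return [max(p_one, 0.0), p_two, p_three]


def simulate_trajectory(events, rng):
    """Simulate one trajectory from a single gene.

    `events` is a list of (ploidy, p_two, p_three) tuples, oldest first.
    Returns one list of gene labels per generation; a label such as
    "g.2.1" records the line of descent.
    """
    generations = [["g"]]
    for ploidy, p_two, p_three in events:
        weights = offspring_distribution(ploidy, p_two, p_three)
        next_generation = []
        for gene in generations[-1]:
            n_survivors = rng.choices([1, 2, 3], weights=weights)[0]
            next_generation.extend(f"{gene}.{i}" for i in range(1, n_survivors + 1))
        generations.append(next_generation)
    return generations


# Two whole genome triplings with made-up survival probabilities.
rng = random.Random(0)
trajectory = simulate_trajectory([(3, 0.3, 0.2), (3, 0.4, 0.1)], rng)
for step, genes in enumerate(trajectory):
    print(step, genes)
```

Genes in the final generation whose labels share only the initial `g` form pairs dating from the first event; those sharing a longer prefix form pairs dating from the second.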
Formulae for the expected numbers of triplets W(Δ) of each type Δ, by branching model M
| Model | Tripling-tripling | Tripling-doubling | Doubling-tripling | Doubling-doubling |
|---|---|---|---|---|
| Ploidy of events | (3,3) | (3,2) | (2,3) | (2,2) |

[Formula entries W(Δ) for each triplet type are not recoverable from this extract.]
u = probability that two progeny survive after the first polyploidization event. u′ = probability that three survive. Similarly, v and v′ are the probabilities that two or three progeny survive, respectively, after the second event
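To show how these four probabilities feed into expected counts, here is a hedged Python sketch for gene pairs (not the triplet formulae of the table, which are not reproduced here). It assumes that each gene's fate is independent and that at least one copy always survives, so one copy survives the first event with probability 1 − u − u′. Under those assumptions the expected number of pairs dating from the older event is (u + 3u′)(1 + v + 2v′)², and from the recent event (1 + u + 2u′)(v + 3v′). The function names and numeric inputs are illustrative; the enumeration cross-checks the closed form.

```python
from itertools import product


def copy_distribution(p_two, p_three):
    """Assumed distribution of surviving copies: at least one always survives."""
    return {1: 1.0 - p_two - p_three, 2: p_two, 3: p_three}


def expected_pairs_closed_form(u, u_prime, v, v_prime):
    """Expected (older, recent) gene pair counts from one ancestral gene."""
    mean_first = 1 + u + 2 * u_prime      # expected copies after event 1
    mean_second = 1 + v + 2 * v_prime     # expected copies after event 2
    older = (u + 3 * u_prime) * mean_second ** 2
    recent = mean_first * (v + 3 * v_prime)
    return older, recent


def expected_pairs_by_enumeration(u, u_prime, v, v_prime):
    """Same expectations, obtained by summing over every possible outcome."""
    first = copy_distribution(u, u_prime)
    second = copy_distribution(v, v_prime)
    older = recent = 0.0
    for n_first, p_first in first.items():
        # One copy count per gene present after the first event.
        for counts in product(second, repeat=n_first):
            prob = p_first
            for count in counts:
                prob *= second[count]
            total = sum(counts)
            within = sum(c * (c - 1) // 2 for c in counts)
            recent += prob * within
            older += prob * (total * (total - 1) // 2 - within)
    return older, recent


# Made-up probabilities for a tripling followed by a tripling.
print(expected_pairs_closed_form(0.3, 0.2, 0.4, 0.1))
print(expected_pairs_by_enumeration(0.3, 0.2, 0.4, 0.1))
```

For a doubling, the corresponding three-copy probability (u′ or v′) is set to zero.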
Durian model predictions before (underlying) and after imposition of mutational divergence
Shaded column indicates the model predicted by the literature and the closest fit to the observed profile
Populus model predictions before (underlying) and after imposition of mutational divergence
Shaded column indicates the model predicted by the literature and the closest fit to the observed profile
Brassica oleracea model predictions before (underlying) and after imposition of mutational divergence
In the two shaded columns, the expected (2,3) profile does not fit the observed pattern as well as the (3,3) profile
Fig. 3. Distribution of syntenic gene pair similarities in Populus trichocarpa. Discrimination point H = 84.5%. Cut-off for pairs not originating in polyploidization: > 97.5%
Fig. 4. Distribution of syntenic gene pair similarities in Brassica oleracea. Discrimination point H = 83.3%. Cut-off for pairs not originating in the two recent polyploidizations: ≤ 76%
Formulae for the expected numbers of triplets after three events
Models tabulated (ploidy of the first, second and third events): (2,2,2), (3,2,2), (2,3,2), (2,2,3), (2,3,3), (3,2,3), (3,3,2), (3,3,3).

[Formula entries for each triplet type are not recoverable from this extract.]
u = probability that two progeny survive after the first polyploidization event. u′ = probability that three survive. Similarly, v and v′ are the probabilities that two or three progeny survive, respectively, after the second event. w and w′ are the probabilities that two or three progeny survive after the third event
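With a third event, the same kind of expectation can be written for any number of events. The Python sketch below is an illustration for gene pairs only, under the assumptions that gene fates are independent and that at least one copy always survives each event; it is not the triplet table above, and the function name and inputs are hypothetical. Pairs dating from event i are counted as (expected lineages entering event i) × (expected pairs created at event i) × (expected descendants per copy after event i)².

```python
from math import comb, prod


def expected_pairs_by_event(events):
    """Expected gene pair counts per originating event, oldest first.

    `events` is a list of (p_two, p_three) tuples, e.g. (u, u'), (v, v'),
    (w, w'); use p_three = 0 for a doubling. Assumes at least one copy
    always survives, so P(1 copy) = 1 - p_two - p_three.
    """
    mean_copies = [1 + p_two + 2 * p_three for p_two, p_three in events]
    mean_pairs = [p_two + 3 * p_three for p_two, p_three in events]
    expected = []
    for i in range(len(events)):
        lineages_before = prod(mean_copies[:i])          # genes entering event i
        descendants_after = prod(mean_copies[i + 1:])    # per copy made at event i
        expected.append(lineages_before * mean_pairs[i] * descendants_after ** 2)
    return expected


# Sanity check: three triplings with every copy retained give 27 genes,
# so the per-event counts must add up to C(27, 2) pairs.
full_retention = expected_pairs_by_event([(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)])
print(full_retention, sum(full_retention), comb(27, 2))
```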
Fig. 5. Distribution decomposed into three events. Discrimination points H1 = 73%, H2 = 85%
Brassica oleracea three-event model predictions before (underlying) and after imposition of mutational divergence
In the two shaded columns, the expected (3,2,3) profile fits the observed pattern as well as the (3,3,3) profile