Alexandre De Bruyn, Mireille Harimalala, Innocent Zinga, Batsirai M Mabvakure, Murielle Hoareau, Virginie Ravigné, Matthew Walters, Bernard Reynaud, Arvind Varsani, Gordon W Harkins, Darren P Martin, Jean-Michel Lett, Pierre Lefeuvre.
Abstract
BACKGROUND: Cassava mosaic disease (CMD) in Madagascar is caused by a complex of at least six African cassava mosaic geminivirus (CMG) species. This provides a rare opportunity for a comparative study of the evolutionary and epidemiological dynamics of distinct pathogenic crop-infecting viral species that coexist within the same environment. The genetic and spatial structure of CMG populations in Madagascar was studied and Bayesian phylogeographic modelling was applied to infer the origins of Madagascan CMG populations within the epidemiological context of related populations situated on mainland Africa and other south-western Indian Ocean (SWIO) islands.
Keywords: Begomoviruses; Cassava; Epidemiology; Madagascar; Phylogeography; Recombination
Year: 2016 PMID: 27600545 PMCID: PMC5012068 DOI: 10.1186/s12862-016-0749-2
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Fig. 1 Species composition and distribution map of the CMGs in Madagascar. Black dots indicate the location of samples from which sequences (DNA-A or DNA-B) were isolated, with the degree of transparency representing density of sequence sampling. Based on hierarchical clustering of the sample coordinates, five distinct sampled areas were defined, and are indicated by grey ellipses. The species composition of DNA-A and DNA-B are indicated for each area, with the corresponding colour code being given at the bottom of the figure
Table 1 Comparison of diversity, evolution rates and introduction history in Madagascar of the different CMG species
| | ACMV DNA-A | ACMV DNA-B | SACMV DNA-A | EACMV DNA-A | EACMKV DNA-A | EACMCV DNA-A | EACMV-like DNA-B | EACMCV DNA-B |
|---|---|---|---|---|---|---|---|---|
| [FS] Total sequences | 212 | 95 | 132 | 201 | 114 | 29 | 215 | 10 |
| [FS] Total diversity (mean Id%) | 96.6 % [85.9 %–100 %] | 93.2 % [90.0 %–100 %] | 98.6 % [90.1 %–100 %] | 94.3 % [83.5 %–100 %] | 95.7 % [84.3 %–100 %] | 94.7 % [89.7 %–99.9 %] | 92.2 % [87.3 %–100 %] | 90.3 % [84.8 %–97.3 %] |
| [FS] MG sequences | 93 | 15 | 130 | 1 | 43 | 13 | 98 | 4 |
| [FS] MG diversity (mean Id%) | 98.5 % [97.1 %–100 %] | 97.1 % [95.5 %–99.9 %] | 98.7 % [93.3 %–100 %] | / | 94.4 % [84.3 %–100 %] | 96.3 % [93.1 %–99.9 %] | 97.6 % [90.3 %–100 %] | 91.1 % [84.8 %–94.8 %] |
| [FS] MG detected recombinants (%) | 0 (0 %) | 0 (0 %) | 8 (6 %) | 1 (100 %) | 17 (40 %) | 10 (77 %) | 2 (2 %) | 1 (25 %) |

| | ACMV DNA-A | ACMV DNA-B | SACMV DNA-A | EACMV-like DNA-A | EACMV-like DNA-B | EACMCV DNA-B |
|---|---|---|---|---|---|---|
| [core CP] Total sequences | 218 | / | 132 | 244 | / | / |
| [core CP] Total diversity (mean Id%) | 97.0 % [91.5 %–100 %] | / | 98.5 % [80.3 %–100 %] | 95.6 % [89.8 %–100 %] | / | / |
| [core CP] MG sequences | 93 | / | 130 | 51 | / | / |
| [core CP] MG diversity (mean Id%) | 98.7 % [95.6 %–100 %] | / | 98.8 % [82.1 %–100 %] | 94.9 % [90.0 %–100 %] | / | / |
| Introduction events | 1 | 1 | / | 3–4 | 1 | / |
| Introduction dates (95 % HPD) | 1996–2004 [1995–2005] | 1940–1974 [1924–1986] | / | 1988–1990 [1982–1997]; 1988–1996 [1983–2003]; 1984–2003 [1971–2006]; 1997–1999 [1994–2003] | 1961–1978 [1921–1989] | / |
| Substitution rates (subs/site/year) | 3.83 × 10⁻³ [2.82 × 10⁻³; 4.89 × 10⁻³] | 5.64 × 10⁻⁴ [4.13 × 10⁻⁴; 7.14 × 10⁻⁴] | / | 1.69 × 10⁻³ [1.31 × 10⁻³; 2.12 × 10⁻³] | 1.10 × 10⁻³ [9.57 × 10⁻⁴; 1.28 × 10⁻³] | / |
For each dataset, the total and Madagascan (MG) number of sequences, mean and range of identity percentages are indicated ([FS] = Full Sequence, [core CP] = core of the capsid protein encoding ORF), as well as the number of recombinant Madagascan sequences isolated in this study. The number and dates of inferred introduction events as well as the inferred substitution rates are based on analyses of the core CP datasets for DNA-A and on full component sequences for DNA-B. Correlation coefficients related to the temporal signal of each dataset are listed
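The "mean Id%" entries and their bracketed ranges can be read as summaries over all pairwise comparisons within a dataset. The sketch below illustrates that idea on a toy alignment. It is not the authors' pipeline: the gap handling (columns with a gap in either sequence are skipped) and the function names are assumptions made for illustration.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_summary(aligned):
    """Return (mean, minimum, maximum) of all pairwise identities."""
    values = [pairwise_identity(a, b) for a, b in combinations(aligned, 2)]
    if not values:
        raise ValueError("need at least two sequences")
    return sum(values) / len(values), min(values), max(values)


# Toy alignment (hypothetical data, not taken from the study).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
mean_id, low, high = identity_summary(toy)
print(f"{mean_id:.1f} % [{low:.1f} %–{high:.1f} %]")
```

The printed string follows the same "mean [min–max]" layout used in the table rows.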
Fig. 2 Interspecific DNA-A recombinant sequences that are found in Madagascar. Distinct profiles of recombinant DNA-A sequences isolated in Madagascar are represented with sequence portions coloured with respect to the original species they are most related to (and are therefore presumably derived from). The arrangements of DNA-A ORFs are represented at the top of the figure. Recombination events are numbered according to Table 2. The list of sequences associated with each recombination profile is available in Additional file 7: Table S2. As indicated in the text, some recombinants that are classified as EACMKV isolates according to the 89 % nucleotide identity species demarcation threshold are in fact mostly SACMV-like (i.e., SACMV was detected as the major parent)
Table 2 List of recombination events detected in CMG DNA-A and DNA-B sequences
| Component | Event number | Recombinant | Region begin | Region end | Minor parent | Major parent | Methods | p-value |
|---|---|---|---|---|---|---|---|---|
| DNA-A | 1 | EACMMV/SACMV | 1685 | 1974 | EACMKV | Unknown | | 5.5 × 10⁻¹⁶ |
| DNA-A | 2 | | 131 | 448 | ACMV | SACMV | RGBMC | 4.1 × 10⁻⁷⁰ |
| DNA-A | 3 | | 502 | 907 | EACMV-like | SACMV | | 6.1 × 10⁻⁶⁰ |
| DNA-A | 4 | | 177 | 388 | ACMV | SACMV | | 3.4 × 10⁻³⁹ |
| DNA-A | 5 | | 2093 | 32 | EACMKV | SACMV | RGBMCS | 1.2 × 10⁻³⁰ |
| DNA-A | 6 | | 340 | 449 | ACMV | SACMV | RG | 4.5 × 10⁻¹⁶ |
| DNA-A | 7 | | 163 | 195 | Unknown | SACMV | | 4.5 × 10⁻⁶ |
| DNA-A | 8 | | 440 | 479 | ACMV | SACMV | | 3.6 × 10⁻¹³ |
| DNA-A | 9 | | 580 | 615 | ACMV | SACMV | R | 4.8 × 10⁻¹⁰ |
| DNA-A | 10 | EACMMV | 1996 | 2804 | EACMV | SACMV | RGBMC | 2.2 × 10⁻²⁹ |
| DNA-A | 11 | EACMMV | 54 | 1052 | Unknown | EACMV | RGBMC | 5.0 × 10⁻³⁰ |
| DNA-A | 12 | EACMZV | 93 | 1924 | EACMV-like | Unknown | RGBMC | 2.5 × 10⁻⁶⁷ |
| DNA-A | 13 | EACMZV | 1928ᵃ | 2077 | Unknown | EACMZV | RG | 3.6 × 10⁻¹⁰ |
| DNA-A | 14 | EACMCV | 1131 | 1790 | Unknown | EACMV-like | | 7.8 × 10⁻¹⁴ |
| DNA-A | 15 | EACMCV | 1836 | 2800 | EACMV | Unknown | RGBMc | 1.4 × 10⁻³⁹ |
| DNA-A | 16 | EACMCV | 543 | 1103 | EACMV-like | EACMCV | RG | 2.4 × 10⁻⁴² |
| DNA-A | 17 | EACMCV | 623 | 669 | ACMV | EACMCV | RG | 3.8 × 10⁻¹² |
| DNA-A | 18 | EACMCV | 1847 | 1968ᵃ | EACMV | EACMCV | RG | 5.7 × 10⁻⁶ |
| DNA-A | 19 | EACMCV | 1468 | 1505 | ACMV | EACMCV | | 3.4 × 10⁻³ |
| DNA-A | 20 | EACMCV | 1910 | 2061 | EACMV | EACMCV | | 2.0 × 10⁻⁵ |
| DNA-A | 21 | | 185 | 1079 | EACMKV | EACMCV | RG | 1.7 × 10⁻²⁵ |
| DNA-A | 22 | | 33 | 171ᵃ | Unknown | EACMCV | R | 1.3 × 10⁻²⁴ |
| DNA-A | 23 | EACMCV | 10 | 1054 | EACMV-like | EACMCV | RG | 3.6 × 10⁻²¹ |
| DNA-A | 24 | | 509 | 1089 | EACMKV | EACMCV | RG | 4.0 × 10⁻²⁶ |
| DNA-A | 25 | EACMCV | 1835 | 42 | EACMV | EACMCV | RGB | 1.3 × 10⁻¹¹ |
| DNA-A | 26 | | 654 | 698 | ACMV | EACMCV | RG | 1.9 × 10⁻⁵ |
| DNA-A | 27 | EACMV-UG | 549 | 1007 | ACMV | EACMV | RGBMC | 1.2 × 10⁻⁶² |
| DNA-A | 28 | EACMV | 1710ᵃ | 2084 | EACMZV | EACMV | RG | 1.3 × 10⁻²² |
| DNA-A | 29 | EACMV | 1680 | 1902 | SACMV | EACMV | rgbMCST | 8.9 × 10⁻⁴ |
| DNA-A | 30 | | 1309 | 1988 | EACMKV | EACMV | | 3.0 × 10⁻¹² |
| DNA-A | 31 | EACMV | 1956 | 2798 | EACMV | EACMKV | RGBMC | 1.2 × 10⁻³¹ |
| DNA-A | 32 | | 1 | 815 | EACMV-like | SACMV | | 4.0 × 10⁻⁴⁵ |
| DNA-A | 33 | | 1 | 759 | EACMV-like | SACMV | RGBMCS | 2.9 × 10⁻⁵⁶ |
| DNA-A | 34 | | 141 | 570 | EACMV-like | SACMV | RGBMCS | 6.3 × 10⁻⁵⁹ |
| DNA-A | 35 | | 32 | 128ᵃ | EACMV | SACMV | R | 2.4 × 10⁻¹⁰ |
| DNA-A | 36 | | 1739 | 2006 | Unknown | Unknown | | 8.2 × 10⁻⁴ |
| DNA-A | 37 | | 1884 | 2014 | EACMCV | EACMKV | RG | 9.1 × 10⁻¹⁹ |
| DNA-A | 38 | | 1090 | 1788 | EACMCV | EACMKV | RGBMCS | 9.9 × 10⁻⁷⁸ |
| DNA-A | 39 | | 1865 | 2798 | SACMV | EACMKV | RGBMCS | 4.6 × 10⁻⁷² |
| DNA-A | 40 | | 549 | 1160 | SACMV | EACMKV | RGBMCS | 7.9 × 10⁻⁴⁶ |
| DNA-A | 41 | | 570 | 1154 | Unknown | EACMKV | rGB | 1.7 × 10⁻¹⁵ |
| DNA-A | 42 | | 1804 | 2796 | SACMV | EACMKV | RGBMC | 1.0 × 10⁻⁴³ |
| DNA-A | 43 | | 759 | 1182ᵃ | SACMV | EACMKV | RGBMCS | 1.9 × 10⁻²¹ |
| DNA-A | 44 | | 35 | 1056 | EACMKV | SACMV | RGBMC | 4.6 × 10⁻³⁵ |
| DNA-A | 45 | | 1510 | 1872ᵃ | SACMV | EACMKV | R | 4.4 × 10⁻¹⁴ |
| DNA-B | 1 | | 2541 | 53 | EACMV-like | EACMCV | | 7.6 × 10⁻³⁵ |
| DNA-B | 2 | | 1124 | 1461 | EACMCV | EACMV-like | RGBMC | 1.2 × 10⁻⁴⁹ |
| DNA-B | 3 | | 1569 | 2697 | Unknown | EACMV-like | RGBMCS | 5.0 × 10⁻⁶⁶ |
| DNA-B | 4 | | 2227 | 2287 | Unknown | EACMV-like | | 9.7 × 10⁻²³ |
| DNA-B | 5 | SLCMV | 2595 | 2712 | Unknown | SLCMV | RG | 6.7 × 10⁻¹⁷ |
| DNA-B | 6 | EACMV-like | 2668 | 2780 | EACMCV | EACMV-like | R | 4.3 × 10⁻¹⁶ |
| DNA-B | 7 | EACMCV | 1495 | 2585 | EACMV-like | Unknown | | 1.4 × 10⁻⁷ |
| DNA-B | 8 | | 854 | 1120 | Unknown | EACMV-like | | 3.1 × 10⁻⁹ |
| DNA-B | 9 | EACMV-like | 2345 | 2756 | Unknown | EACMV-like | RGBMC | 1.9 × 10⁻¹⁵ |
| DNA-B | 10 | EACMV-like | 2119 | 2752 | EACMV-like | EACMV-like | RBGMC | 2.9 × 10⁻¹⁶ |
| DNA-B | 11 | | 2638 | 2699 | Unknown | EACMV-like | | 3.0 × 10⁻⁵ |
| DNA-B | 12 | EACMV-like | 869 | 1597 | EACMV-like | EACMV-like | rM | 8.1 × 10⁻⁴ |

For each event, the species of the recombinants and inferred parents, the recombinant region breakpoints and the list of methods which detected the event are indicated (R: RDP; G: GENECONV; B: BOOTSCAN; M: MAXCHI; C: CHIMAERA; S: SISCAN; T: 3SEQ). The reported p-values are for the methods in bold type and are the smallest p-values calculated for the region in question. Whereas upper-case letters imply that a method detected recombination with a multiple comparison corrected p-value <0.05, lower-case letters imply that the method detected recombination with a multiple comparison corrected p-value >0.05
ᵃ Breakpoints not inferred by RDP
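The letter codes in the Methods column follow the convention given in the table note: each letter names a detection method, and its case records whether the multiple-comparison-corrected p-value fell below 0.05. A minimal sketch of decoding such a code is shown here. The function names are hypothetical, and the letter-to-method mapping is copied from the note above.

```python
# Letter-to-method mapping as stated in the table note.
METHODS = {
    "R": "RDP",
    "G": "GENECONV",
    "B": "BOOTSCAN",
    "M": "MAXCHI",
    "C": "CHIMAERA",
    "S": "SISCAN",
    "T": "3SEQ",
}


def decode_methods(code: str):
    """Return a list of (method name, significant) pairs for a Methods code.

    Upper-case letter: corrected p-value < 0.05 (significant is True).
    Lower-case letter: corrected p-value > 0.05 (significant is False).
    """
    decoded = []
    for letter in code:
        name = METHODS.get(letter.upper())
        if name is None:
            raise ValueError(f"unknown method letter: {letter!r}")
        decoded.append((name, letter.isupper()))
    return decoded


def significant_methods(code: str):
    """Names of the methods that detected the event with corrected p < 0.05."""
    return [name for name, significant in decode_methods(code) if significant]


# Code listed for DNA-A event 29 in the table.
print(significant_methods("rgbMCST"))
```

For the code `rgbMCST`, this reports MAXCHI, CHIMAERA, SISCAN and 3SEQ as the methods that passed the corrected threshold, while RDP, GENECONV and BOOTSCAN detected the event without passing it.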
Fig. 3 Recombinationally derived fragment sizes differ between the different CMG groups. Ranges of recombinationally derived fragment sizes are indicated for the ACMV, SACMV and EACMV-like virus lineages, with the average sizes represented by black dots. Arrows point from the inferred minor parent to the major parent. These values are based on inferred recombination events for which both parents and breakpoints were determined by the RDP4 analysis (see Table 2). The number of events corresponding to each modality is indicated, and significant differences are represented by different letters above the ranges
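Fragment sizes of this kind can be derived from the begin and end breakpoints listed in Table 2. Because the genome components are circular, some regions wrap past the origin (for example, a region beginning at 2093 and ending at 32). The sketch below shows one way to handle that case. It is illustrative only: the coordinate length of 2850 is a hypothetical placeholder rather than a value from the source, and treating both breakpoints as inclusive is also an assumption.

```python
# Hypothetical coordinate-system length; NOT taken from the source.
ASSUMED_LENGTH = 2850


def fragment_length(begin: int, end: int, total_length: int = ASSUMED_LENGTH) -> int:
    """Size of a region on a circular molecule, 1-based inclusive coordinates.

    If end < begin, the region is assumed to wrap around the origin.
    """
    if not (1 <= begin <= total_length and 1 <= end <= total_length):
        raise ValueError("breakpoints must lie within 1..total_length")
    if end >= begin:
        return end - begin + 1
    # Wrapped region: begin..total_length, then 1..end.
    return (total_length - begin + 1) + end


# Breakpoint pairs as listed in Table 2 (DNA-A events 2 and 5).
print(fragment_length(131, 448))   # region that does not cross the origin
print(fragment_length(2093, 32))   # region that wraps past the origin
```

The wrapped case depends directly on the assumed total length, so the placeholder should be replaced with the real coordinate length before any sizes are compared.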
Fig. 4 Population genetic structure revealed through discriminant analyses of principal components. Geographical location (on the left) and unrooted maximum-likelihood phylogenetic tree (on the right) of Madagascan EACMV-like DNA-A core CP (a), EACMV-like DNA-B (b) and ACMV DNA-B (c) sequences. Samples and tree tips are coloured according to the groups inferred in the respective DAPC analyses. White dots on the trees represent the location of the root on the corresponding rooted phylogeny. Note that on the tree from panel b, one of the branches has a reduced length for convenience of presentation
Fig. 5 Spatial population structure revealed through spatial principal components analyses. Spatial Principal Components Analyses of the Madagascan EACMV-like DNA-A core CP (a), EACMV-like DNA-B (b) and ACMV DNA-B (c) sequences. The locations of samples are represented with squares coloured according to a black to white gradient corresponding to the eigenvalues for the first axis of the sPCA. The p-values of the global spatial structure detected in each dataset are presented
Fig. 6 Maximum clade credibility tree constructed from the ACMV core CP dataset. Branches are coloured according to the most probable location state of the node on their right (i.e., the likely geographical location of the ancestral sequence represented by this node). The time-scale of evolutionary changes represented in the tree is indicated by the scale bar above it. Whereas filled circles that are associated with nodes indicate >95 % posterior probability support for the branches to their left, open circles indicate nodes with >70 % posterior support for these branches. Nodes to the right of branches with <70 % support are left unlabelled. The bar graph indicates location probabilities of the node at the root of the tree (i.e., the most recent common ancestor of all the sequences represented in the tree). Grey bars represent the probabilities obtained with randomization of the tip locations. The probable introduction event from Africa to Madagascar is indicated with a blue arrow
Fig. 7 Maximum clade credibility tree constructed from the EACMV-like core CP dataset. Branches are coloured according to the most probable location state of the node on their right (i.e., the likely geographical location of the ancestral sequence represented by this node). The large black circle around one of the nodes indicates that the state probability at this node is less than 0.5 (i.e., there is less than 50 % confidence in the indicated location being the actual place where this ancestral sequence existed). The time-scale of evolutionary changes represented in the tree is indicated by the scale bar above it. Whereas filled circles that are associated with nodes indicate >95 % posterior probability support for the branches to their left, open circles indicate nodes with >70 % posterior support for these branches. Nodes to the right of branches with <70 % support are left unlabelled. The bar graph indicates location probabilities of the node at the root of the tree (i.e., the most recent common ancestor of all the sequences represented in the tree). Grey bars represent the probabilities obtained with randomization of the tip locations. Probable introduction events from Africa to the SWIO islands are indicated with red arrows, while introduction events from Africa to Madagascar are numbered and indicated by blue arrows. Groups 2 and 3 inferred in the DAPC analysis of the EACMV-like core CP sequences are indicated on the tree