Carolina L Pometti, Cecilia F Bessega, Beatriz O Saidman, Juan C Vilardi.
Abstract
Bayesian clustering, as implemented in the STRUCTURE and GENELAND software, is widely used to form genetic groups of populations or individuals. Multivariate analyses, by contrast, meet the need for less computer-intensive approaches and are specifically devoted to extracting information from large datasets. In this paper, we use a dataset of AFLP markers from 15 sampling sites of Acacia caven to study the genetic structure and to compare the consistency of three methods: STRUCTURE, GENELAND and DAPC. Of these methods, DAPC was the fastest and was accurate in inferring the number of populations K (K = 12 using the find.clusters option and K = 15 with a priori information on populations). GENELAND, in turn, provides information on the areas of membership probability for individuals or populations in space when coordinates are specified (K = 12). STRUCTURE also inferred the number of populations K and the membership probabilities of individuals based on ancestry, giving K = 11 without prior information on populations and K = 15 using the LOCPRIOR option. Finally, in this work all three methods showed high consistency in estimating the population structure: they inferred similar numbers of populations and similar membership probabilities of individuals to each group, and their results were highly correlated with one another.
Keywords: AFLP; Acacia caven; DAPC; GENELAND
Year: 2013 PMID: 24688293 PMCID: PMC3958328 DOI: 10.1590/s1415-47572014000100012
Source DB: PubMed Journal: Genet Mol Biol ISSN: 1415-4757 Impact factor: 1.771
Populations of Acacia caven sampled in this study.
| Variety | Eco-region | Population | Population code | Latitude (°S) | Longitude (°W) | Number of individuals analyzed |
|---|---|---|---|---|---|---|
| | Pampa | Costanera Sur | CS | 34°38′10.71″ | 58°42′44.08″ | 14 |
| | Pampa | Gualeguaychú | GY | 33°22′4.00″ | 58°44′3.00″ | 22 |
| | Puna | Coiruro | CI | 23°53′34.00″ | 65°27′30.00″ | 18 |
| | Puna | Campo Quijano | CQ | 24°55′12.00″ | 65°39′0.00″ | 13 |
| | Puna | Ruta Nueve | RN | 24°39′48.00″ | 65°22′49.00″ | 14 |
| | Puna | El Carril | EC | 25°4′58.80″ | 65°28′1.20″ | 16 |
| | Puna | Tolombón | TO | 26°11′8.00″ | 65°56′7.00″ | 14 |
| | Wet Chaco | Vivero Forestal | VF | 26°16′0.00″ | 58°17′41.64″ | 12 |
| | Wet Chaco | Formosa | FS | 26°16′13.20″ | 58°17′7.92″ | 12 |
| | Wet Chaco | YPF | YP | 26°11′26.76″ | 58°9′23.82″ | 12 |
| | Espinal | Iberá | IB | 28°15′40.13″ | 56°30′20.38″ | 18 |
| | Dry Chaco | Las Gemelas | LG | 30°53′26.10″ | 64°30′13.50″ | 14 |
| | Dry Chaco | Pan de Azúcar | PA | 31°15′58.90″ | 64°20′28.60″ | 12 |
| | Dry Chaco | Vaquerías | VA | 31°23′38.93″ | 63°51′30.87″ | 12 |
| | Dry Chaco | Valle Hermoso | VH | 31°7′1.20″ | 64°28′58.80″ | 21 |
Pairwise geographic distances in kilometers between Acacia caven sampling sites.
| Pop | CQ | CS | EC | FS | GY | IB | LG | PA | RN | TO | VA | VF | VH | YP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CI | 111.00 | 1348.53 | 123.80 | 768.28 | 1226.00 | 1017.75 | 809.44 | 832.00 | 79.21 | 263.14 | 833.53 | 771.49 | 856.92 | 780.57 |
| CQ | | 1280.00 | 25.00 | 752.46 | 1166.50 | 982.49 | 700.00 | 749.50 | 39.28 | 159.26 | 744.23 | 751.00 | 750.71 | 764.50 |
| CS | | | 1238.29 | 910.82 | 185.00 | 745.31 | 655.00 | 645.00 | 1273.21 | 1155.11 | 650.00 | 911.38 | 664.42 | 932.57 |
| EC | | | | 730.82 | 1121.00 | 961.09 | 674.07 | 716.00 | 50.33 | 139.09 | 701.21 | 729.25 | 683.00 | 743.00 |
| FS | | | | | 777.15 | 282.15 | 779.31 | 801.41 | 731.84 | 759.85 | 780.43 | 3.00 | 800.00 | 15.80 |
| GY | | | | | | 617.17 | 584.38 | 574.00 | 1198.25 | 1046.00 | 593.30 | 770.79 | 593.50 | 793.76 |
| IB | | | | | | | 832.84 | 827.86 | 976.03 | 964.59 | 794.48 | 283.43 | 838.96 | 282.15 |
| LG | | | | | | | | 43.50 | 738.31 | 547.38 | 29.40 | 783.78 | 35.20 | 809.13 |
| PA | | | | | | | | | 798.50 | 576.00 | 21.00 | 795.39 | 22.44 | 823.54 |
| RN | | | | | | | | | | 194.06 | 767.15 | 730.38 | 727.22 | 745.00 |
| TO | | | | | | | | | | | 593.13 | 759.56 | 565.74 | 774.56 |
| VA | | | | | | | | | | | | 775.78 | 4.00 | 797.27 |
| VF | | | | | | | | | | | | | 799.00 | 16.41 |
| VH | | | | | | | | | | | | | | 824.37 |
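The source does not say how the pairwise distances were measured. As a minimal sketch, assuming great-circle (haversine) distances on a spherical Earth, the coordinates in the sampling-site table can be converted from degrees, minutes and seconds to decimal degrees and then to kilometers. The function names and the Earth radius constant are illustrative choices, not taken from the paper, and the result need not reproduce the tabulated values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (not stated in the source)


def dms_to_decimal(degrees, minutes, seconds, negative=True):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    negative=True suits the table above, where latitudes are south
    and longitudes are west.
    """
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if negative else value


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Coordinates of Coiruro (CI) and Ruta Nueve (RN), copied from the sampling-site table.
ci = (dms_to_decimal(23, 53, 34.00), dms_to_decimal(65, 27, 30.00))
rn = (dms_to_decimal(24, 39, 48.00), dms_to_decimal(65, 22, 49.00))

distance_ci_rn = haversine_km(*ci, *rn)
print(f"CI-RN great-circle distance: {distance_ci_rn:.2f} km")
```

Any difference between this estimate and the tabulated CI-RN value would be expected if the authors used another method, such as road distance or a different Earth model.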
Multiple runs for inferring the number of populations using GENELAND software.
| Run | Modal number | % of modal number | Mean of probability density |
|---|---|---|---|
| 1 | 12 | 37.20 | −62443.26 |
| 2 | 12 | 37.80 | −60538.68 |
| 3 | 11 | 38.90 | −60964.61 |
| 4 | 13 | 32.80 | −60583.12 |
| 5 | 11 | 36.80 | −61215.83 |
| 6 | 13 | 33.40 | −60874.19 |
| 7 | 12 | 36.40 | −60953.19 |
| 9 | 12 | 36.80 | −61164.86 |
| 10 | 12 | 36.00 | −60860.80 |
In bold: highest average posterior probability.
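One way to read the table of runs is to count how often each modal K appears and to pick the run with the highest mean posterior density. The sketch below does this using only the values listed above; the variable names are illustrative, and this summary is an assumption about how the runs might be compared, not a description of the authors' procedure.

```python
from collections import Counter

# (run, modal K, % of modal K, mean of probability density), copied from the table above.
# Run 8 is absent from the table, so it is absent here too.
runs = [
    (1, 12, 37.20, -62443.26),
    (2, 12, 37.80, -60538.68),
    (3, 11, 38.90, -60964.61),
    (4, 13, 32.80, -60583.12),
    (5, 11, 36.80, -61215.83),
    (6, 13, 33.40, -60874.19),
    (7, 12, 36.40, -60953.19),
    (9, 12, 36.80, -61164.86),
    (10, 12, 36.00, -60860.80),
]

# How many runs support each modal K.
k_counts = Counter(modal_k for _, modal_k, _, _ in runs)
consensus_k, supporting_runs = k_counts.most_common(1)[0]

# Run with the highest (least negative) mean posterior density.
best_run = max(runs, key=lambda row: row[3])

print(f"Most frequent modal K: {consensus_k} ({supporting_runs} of {len(runs)} runs)")
print(f"Run with highest mean density: run {best_run[0]} (K = {best_run[1]})")
```

Among the listed runs, K = 12 is the most frequent modal value, which agrees with the K = 12 reported for GENELAND in the abstract.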
Figure 1. Plot of the number of populations simulated from the posterior distribution obtained with GENELAND.
Figure 2. Spatial distribution of each group defined by GENELAND at K = 12. Population codes are given in Table 1.
Figure 3. Clustering of individuals by STRUCTURE. Each individual is represented by a vertical bar partitioned into colored segments that represent the individual's estimated membership fractions. The same color in different individuals indicates that they belong to the same cluster. a) K = 11, estimated with no prior information on populations; b) K = 15, estimated with the LOCPRIOR option. Population codes are given in Table 1.
Figure 4. Scatterplot of individuals on the two principal components of DAPC. The graph represents the individuals as dots and the groups as inertia ellipses. Eigenvalues of the analysis are displayed in the inset: a) obtained with the find.clusters option; b) with clusters defined a priori according to the sampling site. Population codes are given in Table 1.
Figure 5. STRUCTURE-like plot of the DAPC analysis, giving a global picture of cluster composition. Each individual is represented by a vertical colored line. The same color in different individuals indicates that they belong to the same cluster. a) K = 12, obtained with the find.clusters option; b) K = 15, obtained with a priori information on sampling sites. Population codes are given in Table 1.
Pairwise comparison of between-individual distances derived from the posterior population membership probabilities produced by each of the five grouping methods. K = number of clusters; r = correlation coefficient; p < 0.0005; STR 1 = STRUCTURE analysis without prior information; STR 2 = STRUCTURE analysis with LOCPRIOR option; DAPC 1 = DAPC analysis with find.clusters option; DAPC 2 = DAPC analysis with a priori information of populations.
| Grouping | K | % of correct assignment | GENELAND | STR 1 | STR 2 | DAPC 1 | DAPC 2 |
|---|---|---|---|---|---|---|---|
| Sampling sites | 15 | - | | | | | |
| GENELAND | 12 | 100 | - | | | | |
| STR 1 | 11 | 96.4 | 0.811 | - | | | |
| STR 2 | 15 | | 0.726 | 0.710 | - | | |
| DAPC 1 | 12 | 84.8 | 0.612 | 0.616 | 0.577 | - | |
| DAPC 2 | 15 | 88.8 | 0.769 | 0.673 | 0.716 | 0.607 | - |
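The correlations in this table compare, method against method, the distances between individuals computed from their membership probabilities. The source names neither the distance metric nor the significance test, so the following is a minimal sketch that assumes Euclidean distances and a Mantel-style permutation test. The membership values are invented toy data, and all function names are hypothetical.

```python
import math
import random
from itertools import combinations


def pairwise_distances(memberships):
    """Euclidean distance between every pair of individuals (upper triangle, flat list)."""
    return [math.dist(a, b) for a, b in combinations(memberships, 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(memb_a, memb_b, permutations=999, seed=0):
    """Correlate two distance matrices; p-value from permuting individuals in memb_b."""
    rng = random.Random(seed)
    dist_a = pairwise_distances(memb_a)
    r_obs = pearson(dist_a, pairwise_distances(memb_b))
    shuffled = list(memb_b)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(shuffled)  # relabel individuals, i.e. permute rows and columns together
        if pearson(dist_a, pairwise_distances(shuffled)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Toy membership probabilities for six individuals (invented, not from the paper).
# The two methods may use different K; only the between-individual distances are compared.
method_a = [(0.90, 0.10), (0.80, 0.20), (0.85, 0.15), (0.10, 0.90), (0.20, 0.80), (0.15, 0.85)]
method_b = [(0.70, 0.20, 0.10), (0.80, 0.10, 0.10), (0.75, 0.15, 0.10),
            (0.10, 0.10, 0.80), (0.10, 0.20, 0.70), (0.05, 0.15, 0.80)]

r, p = mantel(method_a, method_b)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

Permuting whole individuals, rather than individual distance values, keeps the dependence among distances that share an individual, which is why a permutation test is usually preferred over a standard correlation test for distance matrices.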