
Analysis of genetic population structure in Acacia caven (Leguminosae, Mimosoideae), comparing one exploratory and two Bayesian-model-based methods.

Carolina L Pometti, Cecilia F Bessega, Beatriz O Saidman, Juan C Vilardi.

Abstract

Bayesian clustering as implemented in STRUCTURE or GENELAND software is widely used to form genetic groups of populations or individuals. On the other hand, in order to satisfy the need for less computer-intensive approaches, multivariate analyses are specifically devoted to extracting information from large datasets. In this paper, we report the use of a dataset of AFLP markers belonging to 15 sampling sites of Acacia caven for studying the genetic structure and comparing the consistency of three methods: STRUCTURE, GENELAND and DAPC. Of these methods, DAPC was the fastest one and showed accuracy in inferring the K number of populations (K = 12 using the find.clusters option and K = 15 with a priori information of populations). GENELAND in turn, provides information on the area of membership probabilities for individuals or populations in the space, when coordinates are specified (K = 12). STRUCTURE also inferred the number of K populations and the membership probabilities of individuals based on ancestry, presenting the result K = 11 without prior information of populations and K = 15 using the LOCPRIOR option. Finally, in this work all three methods showed high consistency in estimating the population structure, inferring similar numbers of populations and the membership probabilities of individuals to each group, with a high correlation between each other.

Keywords:  AFLP; Acacia caven; DAPC; GENELAND

Year:  2013        PMID: 24688293      PMCID: PMC3958328          DOI: 10.1590/s1415-47572014000100012

Source DB:  PubMed          Journal:  Genet Mol Biol        ISSN: 1415-4757            Impact factor:   1.771


Introduction

Evaluating population genetic structure is of considerable interest because it is a precursor to addressing many other issues, such as estimating migration, identifying conservation units, and specifying phylogeographical patterns (Manel ). Various statistical approaches can be used to form genetic groups of populations or individuals. For statistical inferences, model-based approaches are more suitable. Bayesian clustering (Manel ) based on Hardy-Weinberg and linkage equilibrium, as implemented in the STRUCTURE (Pritchard ) or GENELAND (Guillot ) programs, is widely used for this purpose. These programs can also consider coordinates of sampling locations. For example, when STRUCTURE is applied to population genetics, it is often useful to classify individuals of a sample into populations. In one scenario, the investigator starts with a sample of individuals, aiming to determine something about the properties of populations. In a second scenario, the investigator begins with a set of predefined populations, aiming to classify individuals of unknown origin. Using the estimated allele frequencies, it is then possible to compute the likelihood of a given genotype having originated in each population. Individuals of unknown origin can be assigned to populations according to these likelihoods. Therefore, STRUCTURE uses a Bayesian clustering approach to assign individuals (probabilistically) to populations. A model is assumed in which there are K populations (where K may be unknown), each of which is characterized by a set of allele frequencies at each locus. This method attempts to assign individuals to populations on the basis of their genotypes, while simultaneously estimating population allele frequencies. The method can be applied to various types of markers, but it assumes that the marker loci are unlinked and in linkage equilibrium with one another within the populations. It also assumes that the populations are in Hardy-Weinberg equilibrium (Pritchard ). 
In other words, the method assumes that any disequilibrium found is attributable to population structure. For cases in which the geographic locations of individuals are known and sampling is relatively even in space, spatial model-based clustering methods such as GENELAND (Guillot ) are available to identify clusters of individuals. Assuming that populations occupy geographically delimited areas, the use of spatial information increases the power to detect the underlying population structure correctly (Bonin ). The statistical model implemented in GENELAND helps to infer and locate genetic discontinuities between populations in space from individual multilocus genetic data. The central assumption is that some spatial dependence is often present among individuals. Based on this assumption, a hierarchical spatial model was developed that formally incorporates a priori information on how the individuals are spatially organized. In addition to detecting genetic discontinuities between populations, the method also addresses other points, such as denoising blurred coordinates of sampled individuals, estimating the number of populations in the studied area, quantifying the amount of spatial dependence in the data, assigning individuals to their population of origin, and detecting individual migrants between populations (Guillot ). One shortcoming of Bayesian clustering methods is the assumption of Hardy-Weinberg and linkage equilibrium within populations, which in many cases is not tenable. A further technical yet critical limitation is the considerable computation time required for analyzing large datasets. To satisfy the need for less computer-intensive approaches, multivariate analyses seem particularly appealing, as they are specifically devoted to extracting information from large datasets. This is the context in which the Discriminant Analysis of Principal Components (DAPC) was developed. 
DAPC is based on data transformation, using principal components analysis (PCA) as a prior step to discriminant analysis (DA), which ensures that the variables submitted to DA are perfectly uncorrelated and that their number is less than that of the analyzed individuals. Without necessarily implying a loss of genetic information, this transformation allows DA to be applied to any genetic data. Two options for DAPC are offered, depending on whether group priors are known or not (Jombart ).
Since plant populations are not randomly arranged assemblages of genotypes but are structured in space and time, the above-mentioned programs allow a fine-scale study of the genetic structure of these populations. This genetic structure may be manifested among geographically distinct populations, within a local group of plants, or even in the progeny of individuals. Ecological factors affecting reproduction and dispersal are likely to be particularly important in determining genetic structure. Also, spatial and genetic patterns are often assumed to result from environmental heterogeneity and differential selection pressures (Loveless and Hamrick, 1984).
In this paper, we describe a study on natural Argentinean populations of the plant species Acacia caven (Leguminosae, Mimosoideae). This extremely wide-ranging species probably originated in the warm temperate to subtropical biogeographic region known as the Gran Chaco of southern South America, an origin inferred from its great morphological diversity. This small legume species is found in six countries and is considered to have certain potential as a managed silvopastoral crop (Aronson and Ovalle, 1989). Fruit size and shape are highly variable in A. caven. In 1992, Aronson recognized six varieties for this species, namely A. caven var. caven, A. caven var. dehiscens, A. caven var. sphaerocarpa, A. caven var. stenocarpa, A. caven var. microcarpa and A. caven var. macrocarpa, based on both morphological traits (Aronson 1992; Pometti ) and molecular markers (Pometti ). Argentina is the only country where all varieties cohabit (Aronson, 1992).
In this context, the main objective of the present work was to study the genetic structure of 15 populations of the six varieties of Acacia caven, using a dataset of AFLP markers. To accomplish this objective, we used two model-based approaches (STRUCTURE and GENELAND) and the exploratory method DAPC for estimating genetic structure, and compared the consistency of the three methods.
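The likelihood-based assignment described earlier in this section (computing, from estimated allele frequencies, the likelihood that a genotype originated in each population) can be illustrated with a minimal sketch. It is not STRUCTURE's Bayesian MCMC procedure; it only shows the underlying likelihood idea for dominant markers such as AFLP, assuming Hardy-Weinberg equilibrium and independent loci, so that a band is absent with probability q² (q being the null-allele frequency). The population names and frequencies below are hypothetical.

```python
import math

def log_likelihood(profile, null_freqs):
    """Log-likelihood of a 0/1 band profile given per-locus null-allele frequencies.

    Assumes Hardy-Weinberg equilibrium and independent loci, so that for a
    dominant marker P(band absent) = q**2 and P(band present) = 1 - q**2.
    Frequencies must lie strictly between 0 and 1 to avoid log(0).
    """
    total = 0.0
    for band, q in zip(profile, null_freqs):
        p_absent = q * q
        total += math.log(1.0 - p_absent) if band == 1 else math.log(p_absent)
    return total

def assign(profile, populations):
    """Return (name of the most likely population, dict of log-likelihoods)."""
    scores = {name: log_likelihood(profile, qs) for name, qs in populations.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical null-allele frequencies at three loci for two populations.
populations = {
    "pop_A": [0.9, 0.2, 0.5],
    "pop_B": [0.3, 0.8, 0.5],
}
individual = [0, 1, 1]  # band absent, present, present
best, scores = assign(individual, populations)
print(best, scores)
```

In a full analysis the allele frequencies are themselves unknown and are estimated jointly with the assignments, which is what the Bayesian clustering programs do.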

Materials and Methods

Description of the dataset

In this study, a real dataset was used to compare the results of genetic structure analyses made by alternative approaches. This dataset consists of AFLP patterns of 224 individuals of the six varieties of Acacia caven (Leguminosae, Mimosoideae), collected from 15 sampling sites (Table 1). The distances between the sampling sites are shown in Table 2.
Table 1

Populations of Acacia caven sampled in this study.

| Variety | Eco-region | Population | Population code | Latitude (°S) | Longitude (°W) | Number of individuals analyzed |
|---|---|---|---|---|---|---|
| A. caven var. caven | Pampa | Costanera Sur | CS | 34°38′10.71″ | 58°42′44.08″ | 14 |
| A. caven var. caven | Pampa | Gualeguaychú | GY | 33°22′4.00″ | 58°44′3.00″ | 22 |
| A. caven var. caven | Puna | Coiruro | CI | 23°53′34.00″ | 65°27′30.00″ | 18 |
| A. caven var. caven | Puna | Campo Quijano | CQ | 24°55′12.00″ | 65°39′0.00″ | 13 |
| A. caven var. caven | Puna | Ruta Nueve | RN | 24°39′48.00″ | 65°22′49.00″ | 14 |
| A. caven var. macrocarpa | Puna | El Carril | EC | 25°4′58.80″ | 65°28′1.20″ | 16 |
| A. caven var. macrocarpa | Puna | Tolombón | TO | 26°11′8.00″ | 65°56′7.00″ | 14 |
| A. caven var. microcarpa | Wet Chaco | Vivero Forestal | VF | 26°16′0.00″ | 58°17′41.64″ | 12 |
| A. caven var. stenocarpa | Wet Chaco | Formosa | FS | 26°16′13.20″ | 58°17′7.92″ | 12 |
| A. caven var. stenocarpa | Wet Chaco | YPF | YP | 26°11′26.76″ | 58°9′23.82″ | 12 |
| A. caven var. sphaerocarpa | Espinal | Iberá | IB | 28°15′40.13″ | 56°30′20.38″ | 18 |
| A. caven var. dehiscens | Dry Chaco | Las Gemelas | LG | 30°53′26.10″ | 64°30′13.50″ | 14 |
| A. caven var. dehiscens | Dry Chaco | Pan de Azúcar | PA | 31°15′58.90″ | 64°20′28.60″ | 12 |
| A. caven var. dehiscens | Dry Chaco | Vaquerías | VA | 31°23′38.93″ | 63°51′30.87″ | 12 |
| A. caven var. dehiscens | Dry Chaco | Valle Hermoso | VH | 31°7′1.20″ | 64°28′58.80″ | 21 |
Table 2

Pairwise geographic distances in kilometers between Acacia caven sampling sites.

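| Pop | CQ | CS | EC | FS | GY | IB | LG | PA | RN | TO | VA | VF | VH | YP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CI | 111.00 | 1348.53 | 123.80 | 768.28 | 1226.00 | 1017.75 | 809.44 | 832.00 | 79.21 | 263.14 | 833.53 | 771.49 | 856.92 | 780.57 |
| CQ | | 1280.00 | 25.00 | 752.46 | 1166.50 | 982.49 | 700.00 | 749.50 | 39.28 | 159.26 | 744.23 | 751.00 | 750.71 | 764.50 |
| CS | | | 1238.29 | 910.82 | 185.00 | 745.31 | 655.00 | 645.00 | 1273.21 | 1155.11 | 650.00 | 911.38 | 664.42 | 932.57 |
| EC | | | | 730.82 | 1121.00 | 961.09 | 674.07 | 716.00 | 50.33 | 139.09 | 701.21 | 729.25 | 683.00 | 743.00 |
| FS | | | | | 777.15 | 282.15 | 779.31 | 801.41 | 731.84 | 759.85 | 780.43 | 3.00 | 800.00 | 15.80 |
| GY | | | | | | 617.17 | 584.38 | 574.00 | 1198.25 | 1046.00 | 593.30 | 770.79 | 593.50 | 793.76 |
| IB | | | | | | | 832.84 | 827.86 | 976.03 | 964.59 | 794.48 | 283.43 | 838.96 | 282.15 |
| LG | | | | | | | | 43.50 | 738.31 | 547.38 | 29.40 | 783.78 | 35.20 | 809.13 |
| PA | | | | | | | | | 798.50 | 576.00 | 21.00 | 795.39 | 22.44 | 823.54 |
| RN | | | | | | | | | | 194.06 | 767.15 | 730.38 | 727.22 | 745.00 |
| TO | | | | | | | | | | | 593.13 | 759.56 | 565.74 | 774.56 |
| VA | | | | | | | | | | | | 775.78 | 4.00 | 797.27 |
| VF | | | | | | | | | | | | | 799.00 | 16.41 |
| VH | | | | | | | | | | | | | | 824.37 |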
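The text does not state how the pairwise distances in Table 2 were computed. As a hedged illustration only, the sketch below derives a great-circle distance from two Table 1 coordinates using the haversine formula with a mean Earth radius of 6371 km; both the formula and the radius are assumptions, so the result need not match the tabulated values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption, not from the paper

def dms_to_decimal(degrees, minutes, seconds, negative=True):
    """Convert degrees/minutes/seconds to decimal degrees.

    Table 1 lists latitudes in degrees South and longitudes in degrees West,
    so values are negative by default.
    """
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if negative else value

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Coordinates of the FS and VF sampling sites as listed in Table 1.
fs = (dms_to_decimal(26, 16, 13.20), dms_to_decimal(58, 17, 7.92))
vf = (dms_to_decimal(26, 16, 0.00), dms_to_decimal(58, 17, 41.64))
print(round(haversine_km(*fs, *vf), 2))
```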
The AFLP assay was performed as described by Vos , with a slight modification, as described in Pometti . This technique was used to investigate genetic variation within and among natural populations of A. caven from five eco-regions: Wet Chaco, Dry Chaco, Espinal, Pampa and Puna (Burkart ). From the individuals studied by means of AFLP markers, 225 bands were obtained. Each AFLP band was considered as a single biallelic locus with one amplifiable and one null allele. Bands with the same migration distance were considered homologous. Data were scored manually as band presence (1) or absence (0).

Methods to assess population structure

As mentioned before, different approaches were used here to identify spatial structure in A. caven populations: two Bayesian-model-based methods and one exploratory method. The first was the spatial cluster model implemented in the GENELAND package (Guillot ) of the R program (R Development Core Team, 2011). Different sets of parameters (MCMC repetitions, thinning and burn-in) were tried in test runs in order to find the optimal settings with respect to run time. Finally, following the recommendation of the user’s manual, the Markov chain Monte Carlo (MCMC) repetitions were set at 100,000, thinning was set at 100, and the burn-in period was set at 200 (we eliminated the first 200 iterations whenever the curve was not constant); the number of groups (K) to be tested was set at 1–15. All individuals were assigned to K populations (1 ≤ K ≤ 15) based on their multilocus genotype and the spatial coordinates. To ensure that the run was long enough, we obtained 10 different runs and compared the parameter estimates (K, individual population membership, maps). The best result was chosen, based on the highest average posterior probability.
The other Bayesian-model-based cluster analysis was performed using the STRUCTURE program version 2.3.3 (Pritchard ). This analysis was performed twice: once without prior information of the populations to which the individuals belonged, and once with prior information on the populations (LOCPRIOR model). In both cases, the burn-in period and the number of MCMC repetitions were set, respectively, at 50,000 and 100,000. An admixture model was used, with correlated allele frequencies. K was set at 1–15, and the optimal K value was identified as the one with the highest likelihood value, as recommended by Pritchard . In addition, the values for each K were averaged across 10 iterations. 
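The selection rule described for STRUCTURE (average the likelihood across replicate iterations for each K, then take the K with the highest value) can be sketched as follows. This is an illustration under assumptions, not the procedure's actual code: the Ln P(D) values are hypothetical and the function name is ours.

```python
from statistics import mean

def best_k(lnp_by_k):
    """Return (K with the highest mean Ln P(D), dict of per-K means)."""
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    return max(means, key=means.get), means

# Hypothetical Ln P(D) values from three replicate iterations per K.
lnp_by_k = {
    1: [-21000.0, -21010.0, -20990.0],
    2: [-19500.0, -19480.0, -19520.0],
    3: [-19600.0, -19650.0, -19550.0],
}
k, means = best_k(lnp_by_k)
print(k, means[k])
```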
The exploratory Discriminant Analysis of Principal Components (DAPC) was applied using the dapc function of the adegenet package (Jombart, 2008) for R (R Development Core Team, 2011). This analysis was also performed both with and without prior information on individual populations. When group priors were unknown, the number of clusters was assessed using the find.clusters function, which runs successive K-means clustering with an increasing number of clusters (k). To select the optimal number of clusters, we applied the Bayesian Information Criterion (BIC) to identify the best supported model, and therefore the number and nature of clusters, as recommended by Jombart .
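To make the BIC-based choice of cluster number concrete, the sketch below scores a set of K-means solutions and keeps the one with the lowest BIC. It assumes the form BIC = n·log(WSS/n) + k·log(n), where WSS is the within-cluster sum of squares and n the number of individuals; this form is an assumption that should be checked against the adegenet source, and the WSS values are hypothetical.

```python
import math

def bic(n_individuals, wss, k):
    """BIC of a K-means solution, assuming BIC = n*log(WSS/n) + k*log(n)."""
    return n_individuals * math.log(wss / n_individuals) + k * math.log(n_individuals)

def choose_k(n_individuals, wss_by_k):
    """Return (k with the lowest BIC, dict of BIC values per k)."""
    scores = {k: bic(n_individuals, wss, k) for k, wss in wss_by_k.items()}
    return min(scores, key=scores.get), scores

# Hypothetical within-cluster sums of squares for k = 1..4 and 100 individuals.
wss_by_k = {1: 1000.0, 2: 400.0, 3: 390.0, 4: 385.0}
k_best, bic_scores = choose_k(100, wss_by_k)
print(k_best, {k: round(v, 2) for k, v in bic_scores.items()})
```

The penalty term k·log(n) is what stops the criterion from always favoring more clusters: here the drop in WSS beyond k = 2 is too small to offset it.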

Comparison of individual groupings in the different methods

The probabilities of posterior population membership of individuals obtained by all grouping methods used were converted into between-individual Euclidean distances. Pairwise comparisons of these distance matrices were performed by means of the Mantel test using the ade4 package of R (Chessel ).
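The comparison just described can be sketched in plain Python: membership probabilities are turned into between-individual Euclidean distances, and two distance matrices are compared with a permutation-based Mantel test. This is a simplified stand-in for the ade4 routine used in the study, with hypothetical membership values and a one-sided p-value computed as (hits + 1) / (permutations + 1).

```python
import math
import random

def euclidean_matrix(rows):
    """Pairwise Euclidean distances between membership-probability vectors."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]

def upper(matrix, order):
    """Upper-triangle entries of `matrix` after reordering individuals by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(d1, d2, permutations=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper(d1, identity)
    r_obs = pearson(x, upper(d2, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of d2 together
        if pearson(x, upper(d2, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical membership probabilities for six individuals from two methods.
method_a = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]
method_b = [[0.7, 0.3], [0.75, 0.25], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8], [0.25, 0.75]]
r, p = mantel(euclidean_matrix(method_a), euclidean_matrix(method_b))
print(round(r, 3), p)
```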

Results

Analysis of the Acacia caven AFLP dataset obtained using GENELAND yielded a modal number of populations of 12, varying from 11–13 in different runs (Table 3). The run with the highest average posterior probability was chosen to base the conclusions on. The number of populations simulated from posterior distribution (Figure 1) displays a clear mode at K = 12. MCMC clearly converges within the first 10,000 iterations (Figure 1). Two populations, VA and PA (belonging to var. dehiscens), were included in one of the groups produced by GENELAND (Figure 2, row 3, column 2), and the other group identified comprises the VF, FS, and YP populations (belonging to vars. microcarpa and stenocarpa) (Figure 2, row 2, column 2). In both cases, the populations grouped together are geographically very close to each other. Each of the remaining groups corresponds to a single sampling site. The comparison of posterior probability of assignment of individuals to populations led to unequivocal results, assigning each individual to the population to which it belongs, except for those previously mentioned individuals that are in the same group of populations (100% of correct assignation).
Table 3

Multiple runs for inferring the number of populations using GENELAND software.

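| Run | Modal number | % of modal number | Mean of probability density |
|---|---|---|---|
| 1 | 12 | 37.20 | −62443.26 |
| 2 | 12 | 37.80 | −60538.68 |
| 3 | 11 | 38.90 | −60964.61 |
| 4 | 13 | 32.80 | −60583.12 |
| 5 | 11 | 36.80 | −61215.83 |
| 6 | 13 | 33.40 | −60874.19 |
| 7 | 12 | 36.40 | −60953.19 |
| 8 | 12 | 36.20 | −59999.66 |
| 9 | 12 | 36.80 | −61164.86 |
| 10 | 12 | 36.00 | −60860.80 |

Run 8 has the highest average posterior probability (shown in bold in the original table).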
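The run-selection rule applied to the GENELAND output can be reproduced directly from the Table 3 values. The short sketch below (variable names are ours) picks the run with the highest mean value and tallies the modal K across runs.

```python
from collections import Counter

# (run, modal K, mean of probability density) copied from Table 3.
runs = [
    (1, 12, -62443.26), (2, 12, -60538.68), (3, 11, -60964.61),
    (4, 13, -60583.12), (5, 11, -61215.83), (6, 13, -60874.19),
    (7, 12, -60953.19), (8, 12, -59999.66), (9, 12, -61164.86),
    (10, 12, -60860.80),
]

# Run with the highest (least negative) mean value.
best_run = max(runs, key=lambda run: run[2])

# Most frequent modal K across the ten runs, and how many runs share it.
modal_k, n_runs = Counter(k for _, k, _ in runs).most_common(1)[0]
print(best_run, modal_k, n_runs)
```

With these values the best run is run 8, and K = 12 is the modal number in six of the ten runs.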

Figure 1

Plot of the number of populations simulated from the posterior distribution obtained with GENELAND.

Figure 2

Spatial distribution of each group defined by GENELAND at K = 12. Population codes are given in Table 1.

Data analysis using STRUCTURE with no prior distribution specified revealed that K = 11 had the highest mean probability of density value (Ln P(D) = −16832.60), after which this value plateaued, suggesting that the optimal K was 11. In this analysis (Figure 3a), individuals of populations FS, VF, and YP were grouped together, as were individuals of populations PA and VA; a third group joined individuals of populations CQ and RN, which belong to var. caven and are both located in the Puna eco-region. The assignment of individuals to populations was 96.4% correct.
Figure 3

Clustering of individuals by STRUCTURE. Each individual is represented by a vertical bar that is partitioned into colored segments that represent the individual’s estimated membership fractions. Same color in different individuals indicates that they belong to the same cluster. a) K = 11, estimated with no prior distribution of populations; b) K = 15, estimated with LOCPRIOR option. Population codes are given in Table 1.

When the LOCPRIOR option was used, K = 15 had the highest mean probability of density value (Ln P(D) = −17065.30), suggesting that each population corresponded to a single sampling site (Figure 3b). Moreover, the STRUCTURE results detected admixture of individuals in all populations with both models (Figure 3a, b). The assignment of individuals to populations was 94.2% correct.
DAPC analysis was first performed without any a priori group assignment. To obtain the optimal number of clusters with the find.clusters function, 70 axes that represented more than 88% of the total variance were retained. The program covered a range of possible clusters from 1 to 15. The lowest BIC value (1137.35) corresponded to K = 12. For DAPC analysis, 70 PCA axes and three discriminant functions were retained (52.3% of variance). One of the clusters included individuals of populations VF, FS, and YP, a second cluster joined PA and VA, and the remaining clusters were largely consistent with the rest of the sampling sites. The scatterplot of individuals on the two principal components of DAPC (Figure 4a) showed that the 12 clusters formed four groups. The consistency between prior and posterior assignment was 84.8%.
Figure 4

Scatterplot of individuals on the two principal components of DAPC. The graph represents the individuals as dots and the groups as inertia ellipses. Eigenvalues of the analysis are displayed in inset: a) obtained with the find.clusters option, b) with clusters defined a priori according to the sampling site. Population codes are given in Table 1.

In the second analysis, the clusters were defined a priori, according to the sampling site. Also in this case, 70 axes of the PCA were retained for DAPC, corresponding to more than 88.8% of the variance, and three discriminant functions were obtained (53.9% of the variance). The scatterplot shows overlapping between the a priori defined groups (Figure 4b); the consistency between prior and posterior assignment was 88.8%. The results obtained from the two approaches can also be compared with the posterior probability plots corresponding to the groups defined by the find.clusters procedure (Figure 5a) and with the groups defined by the sampling site (Figure 5b).
Figure 5

STRUCTURE-like plot of DAPC analysis for a global picture of the clusters composition. Each individual is represented by a vertical colored line. Same color in different individuals indicates that they belong to the same cluster. a) K = 12, obtained with find.clusters option; b) K = 15, obtained with a priori information of sampling sites. Population Codes are given in Table 1.

Regarding the consistency between prior and posterior assignment of individuals to groups (Table 4), the maximum corresponded to GENELAND (100%), whereas the lowest consistency was obtained by DAPC without information on population membership (84.8%). Pairwise comparison of distances between individuals obtained from the probabilities of posterior assignment of population membership of individuals resulting from all five grouping methods (Table 4) revealed highly significant correlations (p < 0.0005, based on 2000 permutations) in all cases. The highest consistency value (r = 0.811) corresponded to the groupings obtained by GENELAND and STRUCTURE for the admixture model without prior information on population membership. The grouping obtained by DAPC without prior information on population membership showed the lowest correlation estimates when compared with most of the other grouping methods.
Table 4

Pairwise comparison of distances between individuals obtained from the probabilities of posterior population membership of individuals, obtained by all five grouping methods. K = number of clusters; r = correlation coefficient; p < 0.0005; STR 1= STRUCTURE analysis without prior information; STR 2 = STRUCTURE analysis with LOCPRIOR option; DAPC 1= DAPC analysis with find.clusters option; DAPC 2 = DAPC analysis with a priori information of populations.

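| Method | K | % of correct assignment | r: GENELAND | r: STR 1 | r: STR 2 | r: DAPC 1 | r: DAPC 2 |
|---|---|---|---|---|---|---|---|
| Sampling sites | 15 | - | | | | | |
| GENELAND | 12 | 100 | - | | | | |
| STR 1 | 11 | 96.4 | 0.811 | - | | | |
| STR 2 | 15 | | 0.726 | 0.710 | - | | |
| DAPC 1 | 12 | 84.8 | 0.612 | 0.616 | 0.577 | - | |
| DAPC 2 | 15 | 88.8 | 0.769 | 0.673 | 0.716 | 0.607 | - |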
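The "% of correct assignment" column can be read as the share of individuals whose most probable posterior group matches their prior group. The text does not give the exact computation (in particular, how sites merged into one cluster were scored), so the sketch below is only one plausible reading, with hypothetical labels and probabilities.

```python
def consistency(prior_labels, posteriors, group_names):
    """Fraction of individuals whose most probable posterior group equals the prior group."""
    matches = 0
    for label, probs in zip(prior_labels, posteriors):
        # Index of the largest posterior probability for this individual.
        top = max(range(len(probs)), key=probs.__getitem__)
        if group_names[top] == label:
            matches += 1
    return matches / len(prior_labels)

# Hypothetical example: four individuals, two groups.
group_names = ["A", "B"]
prior_labels = ["A", "A", "B", "B"]
posteriors = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.3, 0.7]]
share = consistency(prior_labels, posteriors, group_names)
print(f"{100 * share:.1f}% consistent")
```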

Discussion

The analysis of genetic diversity within species is vital for understanding evolutionary processes at both the population and the genomic levels. Several recently developed statistical packages, which offer a panel of standard as well as more sophisticated analyses, have been reviewed by Excoffier and Heckel (2006). Most data analyses require the use of more than one program and should start with generalist packages to uncover the basic properties of the data, followed by the use of specialized methodologies to address more specific questions (Excoffier and Heckel, 2006). In line with this recommendation, we evaluated the consistency of different methodological approaches for analyzing genetic properties of populations of Acacia caven, a shrub widely distributed in South America. This species plays an important role in arid ecosystems, as it contributes to the fixation of atmospheric nitrogen, provides fruits and leaves to herbivores, and stabilizes soils by fixing dunes. In addition, it is a valued natural resource for local settlers, because it provides firewood, charcoal and forage for livestock. Due to its great plasticity, it is used in the reforestation of degraded ecosystems (Karlin ). In this work, we chose one exploratory and two Bayesian-model-based methods to infer the genetic structure of A. caven from 15 sampling sites. The exploratory method used here was DAPC, which seeks synthetic variables, the discriminant functions, that show differences between groups as well as possible while minimizing variation within clusters (Jombart, 2012). Using the find.clusters option in this analysis, the number of populations inferred was K = 12, grouping together VF, FS, and YP and also PA and VA. DAPC analysis is preferred when groups are unknown or uncertain and genetic clusters need to be identified before they are described. 
In this work, we found that those sampling sites that grouped together in the same cluster were the geographically closer ones. When we defined the prior groups for the DAPC analysis, the inferred K was 15, the same as the number of sampling sites. In both cases, the percentage of variance explained by the three discriminant functions was < 54%. This could be attributed to the reduction of variables achieved by DAPC; in other words, we had 225 loci or variables, and this method reduced (in this case) the number of composed variables to the 70 more informative axes. Additionally, two Bayesian analyses were applied to the data to study the genetic structure of the samples (GENELAND and STRUCTURE). When STRUCTURE was run with the LOCPRIOR option, the K estimated was coincident with the number of data sampling sites (K = 15). When using STRUCTURE, it is usually assumed that all partitions of individuals are a priori approximately equally likely. Since the number of possible partitions is immense, it takes highly informative data for STRUCTURE to conclude that any particular partition of individuals into clusters has compelling statistical support. In contrast, the LOCPRIOR models assume that, in practice, individuals from the same sampling location often come from the same population. Therefore, the LOCPRIOR models are set up to expect that the sampling locations may be informative about ancestry. If the data suggest that the locations are informative, then the LOCPRIOR models allow STRUCTURE to use this information (Pritchard ). GENELAND analysis in turn showed that the 15 A. caven populations studied could be grouped into K =12 independent groups, indicating that each sampling site represented a single Mendelian population, with the exception of VA and PA, and FS, YP, and VF, which would correspond to two clusters. 
STRUCTURE analysis without prior information of populations showed that the optimal number of populations was K = 11, joining together populations CQ and RN. The other 10 groups coincided with those detected by GENELAND. The slight difference between analyses in the inferred K could be attributed to the model chosen, since GENELAND was run with information on geographic coordinates, which tends to favor partitions that are spatially organized, while STRUCTURE was not. Similar differences in behavior between GENELAND and STRUCTURE were noted by Guillot when comparing the dataset of Montana wolverines (Gulo gulo) recorded by Cegelski , as STRUCTURE inferred K = 3, whereas GENELAND inferred K = 4. In our case, GENELAND grouped together A. caven populations that were geographically and genetically closer and located in the same eco-region, such as VA and PA, and FS, VF, and YP. On the other hand, STRUCTURE detected the genetically similar groups. Variety caven is the most widespread (a generalist, in terms of ecological range), and here we analyzed five of its populations from two eco-regions. One could expect to find these populations grouped together according to the eco-region and the variety they belong to. However, the results for the Puna eco-region suggest that its populations are less connected to each other by gene flow than the populations of the other eco-regions, since CI was not grouped together with CQ and RN in the STRUCTURE analysis. A possible explanation for this clustering is that the geographic distance between CQ and RN is smaller than the distance from either to CI, and the genetic and geographic distances among the populations studied here have been shown to be significantly correlated (Pometti ). Moreover, although these three populations belong to the same variety and the same eco-region, they were found at different altitudes: RN at 1305 m a.s.l., CQ at 1511 m a.s.l., and CI at 2089 m a.s.l. 
This results in an environment of patchy vegetation, because of the presence of mountains that separate CI from RN and CQ. It has been well documented that marginal populations are often less variable than populations within the primary range (Blows and Hoffmann, 1993; Deng ). The results obtained for the variety caven from the Puna eco-region could be explained by the observations of Hamrick and Godt (1990) and Maguire that populations located at range margins are more isolated from sources of immigrants and are thus more prone to genetic bottlenecks. When comparing the number K of populations estimated by the three methods, DAPC using the find.clusters option proved as accurate in detecting population clusters as STRUCTURE without prior information of populations and GENELAND. When prior groups were defined, the DAPC results coincided with those obtained by STRUCTURE with the LOCPRIOR option, where K = 15. As previously explained, in both cases the sampling locations were informative about ancestry. A significant degree of genetic differentiation among A. caven populations was observed using the three methods, since K ranged from 11 to 15, showing a high level of structuring in the 15 sampling sites studied. The most evident associations among populations were found for PA and VA, and FS, VF and YP in all analyses, and for CQ and RN with STRUCTURE. No other association between populations by eco-region or variety was observed consistently with the three methods used. The three methods used here to infer population structure also provide coefficients of membership probabilities of each individual to the different groups, based on the retained discriminant functions in the case of DAPC, or based on ancestry in the case of STRUCTURE and GENELAND. While DAPC coefficients are different from the admixture coefficients of software such as STRUCTURE or GENELAND, they can still be interpreted as proximities of individuals to the different clusters. 
Membership probabilities also provide indications of how clear-cut genetic clusters are (Jombart, 2012). The highest membership probabilities of each individual for the different groups were obtained by GENELAND, followed by STRUCTURE with prior definition of groups, STRUCTURE without population information, DAPC with prior definition of groups, and the lowest membership probabilities were those observed by DAPC without information on population membership. This means that the three methods and their variants provided accurate assignments of individuals, ranging from 84.8% for DAPC using the find.clusters option to 100% for GENELAND. In conclusion, of the three methods used here, DAPC proved to be the fastest one, showing accuracy in inferring the K number of populations and the membership probabilities of each individual for the different groups in a short computational time (only a few minutes, while STRUCTURE and GENELAND needed four or five days to perform the analysis). So, DAPC should be preferred as a starting point when working with large datasets and several sampling sites, as recommended by Excoffier and Heckel (2006). GENELAND, on the other hand, provides information on the area of membership probabilities for individuals or populations in space, when coordinates are specified; moreover, the number of population units is treated as an unknown parameter (Guillot ). STRUCTURE, in addition to inferring the number of K populations and the membership probabilities of individuals based on ancestry, allows a hierarchical analysis of sampling sites from K =2 to K = n, where n is the number of populations estimated with the highest mean probability of density value (Tishkoff ; Pometti ). The two latter analyses present the disadvantage of being more time-consuming and relying on assumptions, such as the type of population subdivision and Hardy-Weinberg and linkage equilibrium inside populations. 
Finally, in this work, all three methods showed high consistency in estimating the population structure of A. caven, inferring similar numbers of populations and membership probabilities of individuals to each group, with a high correlation between each other. This consistency may be interpreted in a similar way as the consistency between phenetic and cladistic analyses, which, although being based on different assumptions, reveal in many cases similar associations between phylogenetically related groups.

References

1.  Inference of population structure using multilocus genotype data.

Authors:  J K Pritchard; M Stephens; P Donnelly
Journal:  Genetics       Date:  2000-06       Impact factor: 4.562

2.  Microsatellite analysis of genetic structure in the mangrove species Avicennia marina (Forsk.) Vierh. (Avicenniaceae).

Authors:  T L Maguire; P Saenger; P Baverstock; R Henry
Journal:  Mol Ecol       Date:  2000-11       Impact factor: 6.185

3.  Assignment methods: matching biological questions with appropriate techniques.

Authors:  Stephanie Manel; Oscar E Gaggiotti; Robin S Waples
Journal:  Trends Ecol Evol       Date:  2005-01-06       Impact factor: 17.712

4.  Computer programs for population genetics data analysis: a survival guide.

Authors:  Laurent Excoffier; Gerald Heckel
Journal:  Nat Rev Genet       Date:  2006-08-22       Impact factor: 53.242

5.  Statistical analysis of amplified fragment length polymorphism data: a toolbox for molecular ecologists and evolutionists.

Authors:  A Bonin; D Ehrich; S Manel
Journal:  Mol Ecol       Date:  2007-09       Impact factor: 6.185

6.  adegenet: a R package for the multivariate analysis of genetic markers.

Authors:  Thibaut Jombart
Journal:  Bioinformatics       Date:  2008-04-08       Impact factor: 6.937

7.  AFLP: a new technique for DNA fingerprinting.

Authors:  P Vos; R Hogers; M Bleeker; M Reijans; T van de Lee; M Hornes; A Frijters; J Pot; J Peleman; M Kuiper
Journal:  Nucleic Acids Res       Date:  1995-11-11       Impact factor: 16.971

8.  Discriminant analysis of principal components: a new method for the analysis of genetically structured populations.

Authors:  Thibaut Jombart; Sébastien Devillard; François Balloux
Journal:  BMC Genet       Date:  2010-10-15       Impact factor: 2.797

9.  The genetic structure and history of Africans and African Americans.

Authors:  Sarah A Tishkoff; Floyd A Reed; Françoise R Friedlaender; Christopher Ehret; Alessia Ranciaro; Alain Froment; Jibril B Hirbo; Agnes A Awomoyi; Jean-Marie Bodo; Ogobara Doumbo; Muntaser Ibrahim; Abdalla T Juma; Maritha J Kotze; Godfrey Lema; Jason H Moore; Holly Mortensen; Thomas B Nyambo; Sabah A Omar; Kweli Powell; Gideon S Pretorius; Michael W Smith; Mahamadou A Thera; Charles Wambebe; James L Weber; Scott M Williams
Journal:  Science       Date:  2009-04-30       Impact factor: 47.728

10.  Assessing population structure and gene flow in Montana wolverines (Gulo gulo) using assignment-based approaches.

Authors:  C C Cegelski; L P Waits; N J Anderson
Journal:  Mol Ecol       Date:  2003-11       Impact factor: 6.185
