Solange Duruz, Natalia Sevane, Oliver Selmoni, Elia Vajana, Kevin Leempoel, Sylvie Stucki, Pablo Orozco-terWengel, Estelle Rochat, Susana Dunner, Michael W Bruford, Stéphane Joost.
Abstract
samβada is genome-environment association software designed to search for signatures of local adaptation. However, pre- and postprocessing of data can be labour-intensive, preventing wider uptake of the method. We have now developed R.SamBada, an R package providing a pipeline for landscape genomic analysis based on samβada, spanning from the retrieval of environmental conditions at sampling locations to gene annotation using the Ensembl genome browser. As a result, R.SamBada standardizes the landscape genomics pipeline and eases the search for candidate genes of local adaptation, enhancing the reproducibility of landscape genomic studies. The efficiency and power of the pipeline are illustrated using two examples: sheep populations from Morocco with no evident population structure, and Lidia cattle from Spain displaying population substructuring. In both cases, R.SamBada enabled rapid identification and interpretation of candidate genes, which are further discussed in the light of local adaptation. The package is available from the CRAN R package repository and on GitHub (github.com/SolangeD/R.SamBada).
Keywords: Lidia cattle breed; Moroccan sheep; gene-environment association; landscape genomics; local adaptation; R package
Year: 2019 PMID: 31136078 PMCID: PMC6790591 DOI: 10.1111/1755-0998.13044
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 7.090
Figure 1. Overall functionalities and process in R.SamBada. Grey boxes with italic names indicate functions included in the package. The process starts with a genomic file and either a file with sample locations or a list of IDs. Preprocessing formats the genomic file and prepares the environmental file; samβada is then run in parallel on multiple cores; once p- and q-values have been computed, Manhattan plots and maps can be drawn and the Ensembl database can be queried.
Figure 5. Manhattan plot of the Lidia cattle study, showing the Bonferroni-corrected p-values derived from the samβada models involving mean annual temperature and one population variable. The red point corresponds to SNP ARS-BFGL-NGS-106879, located 30,000 base pairs from the HSPB8 gene.
Figure 3. Spatial occurrence of the CC genotype for SNP ss1208941124. The background shows the shaded topography with mean annual precipitation [mm/year].
Figure 4. Spatial distribution of the Lidia cattle population structure according to the scores of the first principal component, with shaded relief and mean annual temperature [°C * 10], as provided in the WorldClim database, as background. Because of overlaps, nearby points are scattered around the farm.
Figure 2. Manhattan plot showing the q-values for each marker (with G- or Wald score > 6) on chromosome 23 of Moroccan sheep associated with annual precipitation, as calculated by samβada in univariate mode. Points in red correspond to models involving two nonsynonymous SNPs (ss1208941124 and ss1208941157) in the MC5R gene, with ss1208941124 having the lower q-value of the two. The red horizontal bar marks a significance threshold of 0.05.
Figure 6. Presence-absence of the AA genotype of SNP ARS-BFGL-NGS-106879, shown with shaded relief and mean annual temperature [°C * 10] as background. Because of overlaps, nearby points are scattered around the farm.
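
The captions of Figures 2 and 5 refer to q-values and Bonferroni-corrected p-values. As a rough illustration of what such multiple-testing corrections do, the sketch below adjusts a small list of made-up p-values. It is not the R.SamBada or samβada implementation: the function names and the input values are hypothetical, and the Benjamini-Hochberg procedure is used here only as a simple stand-in for a false-discovery-rate adjustment, which may differ from the q-value method used in the study.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (a stand-in for q-values)."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five markers (illustrative only).
example_p = [0.001, 0.01, 0.03, 0.04, 0.5]
bonf = bonferroni(example_p)
bh = benjamini_hochberg(example_p)
# Markers passing a 0.05 threshold under each correction.
significant_bonf = [i for i, p in enumerate(bonf) if p < 0.05]
significant_bh = [i for i, p in enumerate(bh) if p < 0.05]
```

With these made-up inputs, the Bonferroni correction is the more conservative of the two, which is the usual trade-off between controlling the family-wise error rate and controlling the false discovery rate.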