Imtiaz Ahmed Sajid Randhawa, Mehar Singh Khatkar, Peter Campbell Thomson, Herman Willem Raadsma.
Abstract
BACKGROUND: Distinguishing traits that evolve under neutral conditions from those that evolve rapidly under selection pressure is a major challenge. We propose a new method, composite selection signals (CSS), which unifies multiple pieces of selection evidence through the rank distribution of its diverse constituent tests. Extreme CSS scores capture highly differentiated loci and the underlying common variants carrying excess haplotype homozygosity in the samples of a target population.
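The rank-based combination described in the abstract can be illustrated with a short Python sketch. The abstract states only that CSS unifies evidence through the rank distribution of its constituent tests; the specific steps used here (fractional ranks, an inverse-normal transform, the mean z-score across tests, and an upper-tail p-value that assumes independent tests) and all function names are illustrative assumptions, not the authors' implementation.

```python
import math
from statistics import NormalDist


def fractional_ranks(values):
    """Rank values in ascending order and rescale to (0, 1) as rank / (n + 1).

    Tied values receive the average of their ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return [r / (n + 1) for r in ranks]


def css_scores(tests):
    """Combine per-SNP statistics from several tests into one score per SNP.

    `tests` is a list of equal-length lists; larger values are assumed to
    mean stronger evidence of selection in every test (an assumption).
    """
    std = NormalDist()
    m = len(tests)
    n = len(tests[0])
    z_by_test = [[std.inv_cdf(r) for r in fractional_ranks(t)] for t in tests]
    scores = []
    for s in range(n):
        mean_z = sum(z[s] for z in z_by_test) / m
        # Assuming independent tests, mean_z ~ N(0, 1/m); take the upper tail.
        p = std.cdf(-mean_z * math.sqrt(m))
        scores.append(-math.log10(p))
    return scores
```

With this construction a SNP scores highly only if it ranks near the top in several tests at once, which is the behaviour the abstract attributes to extreme CSS scores.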
Year: 2014 PMID: 24636660 PMCID: PMC4101850 DOI: 10.1186/1471-2156-15-34
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Table 1. Breeds, samples, genotypes (SNPs) and known genes in each group of cattle and sheep
| Polledness | Poll head | 7 | 85 | UMD3.1 | 38,290 | 65.50 | 38,177 | |||
| Horn head | 7 | 127 | ||||||||
| Double muscling | Double muscling | 3 | 49 | UMD3.1 | 38,520 | 65.15 | 38,407 | |||
| Normal muscling | 14 | 308 | ||||||||
| Polledness | Poll head | 37 | 1489 | OARv1.0 | 47,498 | 51.26 | - | |||
| Horn head | 36 | 1290 | ||||||||
| Double muscling | Double muscling | 3 | 149 | OARv1.0 | 47,502 | 51.26 | - | |||
| Normal muscling | 71 | 2654 | ||||||||
| Geographic location | African | 7 | 226 | UMD3.1 | 37,905 | 65.67 | 37,795 | - | ||
| European | 46 | 847 |
(a) Details of breeds and genotyping information for cattle and sheep are available in Additional files 1, 2 and 3 (Tables S1, S2 and S3, respectively).
Figure 1. Composite selection signals (CSS) for validation datasets. Chromosome-wise plots of the highest CSS scores are shown for the trait-wise datasets of cattle (A and B) and sheep (C and D). The dotted red horizontal lines in the CSS plots indicate the genome-wide 0.1% thresholds of the empirical scores. Smooth lines show the CSS scores smoothed by averaging SNPs within each 1 Mb window. Vertical green lines indicate the location of the candidate gene on each chromosome as follows: A = POLL locus for polledness in cattle (dataset A), B = MSTN for double muscling in cattle (dataset B), C = RXFP2 for polledness in sheep (dataset C), and D = MSTN for double muscling in sheep (dataset D).
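The smoothing and thresholding mentioned in the caption can be sketched as follows. This is a minimal illustration under stated assumptions: the window is taken to be centred on each SNP, positions are base pairs sorted in ascending order within a single chromosome, and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def smooth_scores(positions, scores, window_bp=1_000_000):
    """Average the scores of all SNPs within a window centred on each SNP.

    `positions` must be sorted ascending and belong to one chromosome.
    """
    half = window_bp // 2
    smoothed = []
    for pos in positions:
        lo = bisect_left(positions, pos - half)
        hi = bisect_right(positions, pos + half)
        window = scores[lo:hi]  # never empty: it always contains the SNP itself
        smoothed.append(sum(window) / len(window))
    return smoothed


def top_fraction_threshold(values, fraction=0.001):
    """Return the smallest value that still belongs to the top `fraction`."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]
```

Calling `top_fraction_threshold` on the smoothed scores pooled across all chromosomes would give an empirical genome-wide cutoff in the spirit of the 0.1% threshold in the caption.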
Table 2. Genomic regions under selection in cattle and sheep identified using composite selection signals (CSS)
| A5 | 13 | 63.90-65.97 | 18 | 23* | 1 | 5 | 26 | Stature | |
| A7 | 14 | 23.78-25.61 | 11 | 7 | 5* | 10* | 12 | Stature | |
| B2 | 6 | 66.55-68.11 | 11 | 8 | - | - | 6 | Reproduction | |
| B6 | 16 | 44.49-46.05 | 11 | 11 | 1 | - | 12 | Embryonic growth, immunity | |
| B8 | 18 | 13.34-15.03 | 5 | 3 | 1 | - | 33 | Coat colour | |
| C8 | 13 | 66.97-68.50 | 7 | - | 7 | 3 | 17 | Coat colour | |
| C10 | 25 | 6.67-8.29 | 14 | 10 | - | 2 | 16 | Bone growth | |
| D2 | 2 | 119.62-122.30 | 20 | 11* | 10 | 16 | 26 | - | - |
Clusters of at least three significant SNPs within a 1 Mb window centred on a core SNP above the threshold (top 0.1%) of the smoothed CSS statistics are reported and compared with the constituent tests.
(a) The prefix (A, B, C or D) of each region number denotes the dataset as defined in Table 1, and rows in bold indicate the genomic regions containing candidate genes. A complete list of the 36 genomic regions, their positions, the range of all significant clusters (for each test) and the genes under clusters of significant SNPs is shown in Additional file 10: Table S4.
(b) The position of each genomic region includes a 0.5 Mb extension on both sides of the boundaries of the main cluster identified by CSS, used to compare constituent tests and count genes (see Methods). Large regions (> 1 Mb) are formed by joining successive clusters that are less than 1 Mb apart.
(c) Genes mapped on the bovine (UMD3.1) and ovine (OARv1.0) assemblies within the boundaries of the genomic regions.
(d) Candidate genes with known functional or structural effects on a particular trait present in the contrasting panels of multiple breeds.
* Indicates the cluster of highest-ranked SNPs (raw scores) for a particular selection test.
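The region-calling rules in these footnotes (at least three significant SNPs within 1 Mb of a core SNP, clusters less than 1 Mb apart joined, and 0.5 Mb added on each side) can be sketched in Python. The function name, the parameter names, and the choice to count a score equal to the threshold as significant are assumptions made for illustration; the Methods of the paper remain the authoritative description.

```python
def find_regions(positions, smoothed, threshold,
                 window_bp=1_000_000, min_snps=3,
                 join_bp=1_000_000, flank_bp=500_000):
    """Call candidate regions on one chromosome from smoothed scores.

    `positions` must be sorted ascending. Returns (start, end) tuples in bp.
    """
    half = window_bp // 2
    significant = [p for p, s in zip(positions, smoothed) if s >= threshold]

    # A cluster is a core SNP with enough significant neighbours in its window.
    clusters = []
    for core in significant:
        members = [p for p in significant if abs(p - core) <= half]
        if len(members) >= min_snps:
            clusters.append((members[0], members[-1]))

    # Join overlapping clusters and clusters separated by less than join_bp.
    merged = []
    for start, end in sorted(clusters):
        if merged and start - merged[-1][1] < join_bp:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    # Extend each region by flank_bp on both sides, clamped at position 0.
    return [(max(0, s - flank_bp), e + flank_bp) for s, e in merged]
```

For example, three significant SNPs at 1.0, 1.2 and 1.4 Mb form one cluster, which this sketch reports as the region 0.5 to 1.9 Mb after the flanking extension; isolated significant SNPs are ignored.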
Figure 2. Density distribution of false discovery rates (q-values) of SNPs in significant clusters (orange) and of the remaining genome-wide SNPs (gray). Density plots are shown for polled cattle (A), double muscle cattle (B), polled sheep (C) and double muscle sheep (D). Vertical dashed lines indicate q-value (FDR) = 0.05 in each subset. The q-values were calculated from the calibrated p-values. Histograms of the mean Z, empirical and calibrated p-values are shown in Additional file 7: Figure S4. The relationship between q-values and calibrated p-values is shown in Additional file 8: Figure S5.
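As an illustration of how q-values can be derived from a set of p-values, the sketch below uses the Benjamini-Hochberg step-up adjustment. The caption does not say which q-value estimator was applied, so this is a swapped-in standard procedure rather than the authors' calculation, and the function names are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues


def share_below(qvalues, cutoff=0.05):
    """Percentage of SNPs whose q-value falls below the cutoff."""
    return 100.0 * sum(q < cutoff for q in qvalues) / len(qvalues)
```

Applying `share_below` separately to the SNPs inside significant clusters and to the rest of the genome gives the kind of comparison that the two density curves in the figure convey.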
Table 3. False discovery rates within identified genomic regions in each validation dataset of cattle and sheep
| A1 | 1 | 14 | 85.7 | 9.8 (A) |
| A5 | 13 | 19 | 78.9 | |
| A7 | 14 | 11 | 81.8 | |
| B1 | 2 | 10 | 90.0 | 6.2 (B) |
| B2 | 6 | 11 | 63.6 | |
| B6 | 16 | 11 | 36.4 | |
| B8 | 18 | 12 | 41.7 | |
| C5 | 10 | 26 | 46.2 | 5.3 (C) |
| C8 | 13 | 9 | 44.4 | |
| C10 | 25 | 15 | 60.0 | |
| D2 | 2 | 23 | 87.0 | 2.4 (D) |
| D4 | 2 | 54 | 75.9 |
(a) Total number of SNPs located within the boundaries of the main cluster identified by CSS; these positions exclude the 0.5 Mb extensions added for gene investigation (shown in Table 2).
Figure 3. Composite selection signals (CSS) for geographically isolated cattle populations. Manhattan plots of -log10(p) of CSS are shown for (A) European Bos taurus and (B) African Bos taurus. Genome-wide smoothed CSS scores for SNPs on consecutive chromosomes are shown in alternating colours. The dotted red lines in the CSS plots indicate the genome-wide 0.1% (upper cutoff) thresholds of the empirical smoothed scores. Raw CSS scores are shown as gray stars in the genome-wide background and in bold at the putative selection regions underlying the significant clusters.
Figure 4. Circos plot of genome-wide composite (CSS) and constituent (XPEHH, ΔDAF and FST) smoothed test statistics in European Bos cattle. Significant selection signatures in each test are highlighted with red dots. Genes with important functions underlying the significant genomic regions identified by CSS are annotated, and the complete list of genes is available in Additional file 13: Table S5. The Circos plot was created using modified functions from the R package "RCircos" [63].