Carmen Amador, Jennifer Huffman, Holly Trochet, Archie Campbell, David Porteous, James F Wilson, Nick Hastie, Veronique Vitart, Caroline Hayward, Pau Navarro, Chris S Haley.
Abstract
BACKGROUND: The Generation Scotland Scottish Family Health Study (GS:SFHS) includes 23,960 participants from across Scotland with records for many health-related traits and environmental covariates. Genotypes at ~700K SNPs are currently available for 10,000 participants. The cohort was designed as a resource for genetic and health-related research and the study of complex traits. In this study we developed a suite of analyses to disentangle the genomic differentiation within GS:SFHS individuals to describe and optimise the sample and methods for future analyses.
Year: 2015 PMID: 26048416 PMCID: PMC4458001 DOI: 10.1186/s12864-015-1605-2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Origin, location, number of individuals and given values for latitude and longitude for the different populations in the GS + 1 kG data set
| Code | Origin | Location | N. ind |
|---|---|---|---|
| GS:SFHS | Europe | Scotland | 9,889 |
| ASW | Africa | African ancestry individuals in SW US | 61 |
| CEU | Europe | Utah residents with N. and W. European ancestry | 85 |
| CHB | Asia | Han Chinese in Beijing | 97 |
| CHS | Asia | Han Chinese South | 100 |
| CLM | America | Colombian in Medellin, Colombia | 60 |
| FIN | Europe | Finnish individuals in Finland | 93 |
| GBR | Europe | British individuals in England and Scotland | 89 |
| IBS | Europe | Iberian populations in Spain | 14 |
| JPT | Asia | Japanese individuals in Tokyo, Japan | 89 |
| LWK | Africa | Luhya individuals in Webuye, Kenya | 97 |
| MXL | America | Mexican ancestry individuals in LA California | 66 |
| PUR | America | Puerto Rican in Puerto Rico | 55 |
| TSI | Europe | Tuscan individuals in Tuscany, Italy | 98 |
| YRI | Africa | Yoruba individuals in Ibadan, Nigeria | 88 |
| Total | | | 10,981 |
Fig. 1. Results of the PCA in the GS + 1 kG data set. a Values for PC1 and PC2 in GS + 1 kG individuals; b Values for PC1 and PC2 only in GS:SFHS individuals (open circles were defined as outliers); c Values for PC3 and PC4 in GS + 1 kG individuals; d Values for PC5 and PC6 in GS + 1 kG individuals
Fig. 2. Results of the PCA in the GS + European data set. Values for PC1 and PC2 in Generation Scotland and the other European samples
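The captions above report principal components of genotype data but do not say how they were computed. A minimal sketch of one common approach follows, assuming a 0/1/2 allele-count matrix and an SVD of frequency-standardised genotypes; the function name `genotype_pcs` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def genotype_pcs(genotypes, n_components=6):
    """Return PC scores and explained-variance fractions.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts
    (assumed input format; the source does not specify one).
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0                 # sample allele frequency per SNP
    poly = (p > 0.0) & (p < 1.0)             # drop monomorphic SNPs
    g, p = g[:, poly], p[poly]
    # Centre by 2p and scale by the binomial standard deviation.
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, explained
```

Plotting `scores[:, 0]` against `scores[:, 1]` would give a PC1 versus PC2 scatter of the kind described in panels a and b of Fig. 1.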
Genomic relationship coefficients between several pairs of individuals using different thresholds for the computation of the GRM
| Ind. 1 | Ind. 2 | GRM_ALL | GRM_>1% | GRM_>5% | GRM_<1% | GRM_<5% |
|---|---|---|---|---|---|---|
| 40280 | 11786 | 0.453 | 0.080 | 0.075 | 4.776 | 2.638 |
| 132098 | 30436 | 0.341 | 0.062 | 0.049 | 3.580 | 2.025 |
| 67527 | 30436 | 0.263 | 0.038 | 0.029 | 2.870 | 1.612 |
| 145349 | 30436 | 0.187 | 0.011 | 0.006 | 2.217 | 1.226 |
| 147185 | 30436 | 0.455 | 0.053 | 0.038 | 5.120 | 2.863 |
| 147185 | 132098 | 0.272 | 0.038 | 0.027 | 2.999 | 1.691 |
| 147185 | 67527 | 0.230 | 0.026 | 0.021 | 2.602 | 1.440 |
| 147185 | 145349 | 0.181 | 0.014 | 0.012 | 2.123 | 1.162 |
| 147185 | 34327 | 0.175 | 0.017 | 0.013 | 2.010 | 1.116 |
| 147185 | 9025 | 0.180 | 0.008 | 0.007 | 2.177 | 1.184 |
| 147185 | 118411 | 0.242 | 0.030 | 0.022 | 2.702 | 1.518 |
| 114918 | 30436 | 0.195 | 0.011 | 0.006 | 2.315 | 1.281 |
| 108361 | 147185 | 0.195 | 0.021 | 0.015 | 2.216 | 1.237 |
| 153784 | 30436 | 0.200 | 0.043 | 0.033 | 2.007 | 1.158 |
| 153784 | 145349 | 0.219 | 0.026 | 0.024 | 2.467 | 1.354 |
| 153784 | 147185 | 0.458 | 0.043 | 0.034 | 5.295 | 2.918 |
| 153784 | 108361 | 0.195 | 0.017 | 0.014 | 2.257 | 1.240 |
| 40280 | 30436 | 0.271 | 0.024 | 0.017 | 3.133 | 1.734 |
| 40280 | 147185 | 0.176 | 0.008 | 0.003 | 2.132 | 1.178 |
| 62626 | 147185 | 0.173 | 0.015 | 0.012 | 2.015 | 1.110 |
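The table above compares genomic relationship coefficients under different SNP thresholds, but the estimator itself is not given here. The sketch below assumes the widely used standardised estimator, A_jk = (1/M) Σ_i (x_ij − 2p_i)(x_ik − 2p_i) / (2p_i(1 − p_i)), and assumes the thresholds refer to minor allele frequency (MAF). The function `grm` and its `maf_min`/`maf_max` parameters are hypothetical names.

```python
import numpy as np

def grm(genotypes, maf_min=0.0, maf_max=0.5):
    """Genomic relationship matrix from SNPs with maf_min < MAF <= maf_max.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts
    (assumed input format).
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0
    maf = np.minimum(p, 1.0 - p)
    keep = (maf > maf_min) & (maf <= maf_max) & (maf > 0.0)
    if not keep.any():
        raise ValueError("no SNPs pass the MAF filter")
    g, p = g[:, keep], p[keep]
    # Each SNP is weighted by 1 / (2p(1-p)), so sharing a rare allele
    # contributes far more to a pair's coefficient than sharing a common one.
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    return z @ z.T / z.shape[1]
```

Under these assumptions, `grm(G)` would use all polymorphic SNPs, `grm(G, maf_min=0.01)` only those above 1 %, and `grm(G, maf_max=0.01)` only those at or below 1 %. Because of the 1/(2p(1 − p)) weighting, a matrix built from rare SNPs alone can give coefficients well above 1 for pairs that share rare alleles.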
Fig. 3. Score values of individuals 40280 and 11786. a Individual marker score of individual 40280; b Individual marker score of individual 11786; c Pair marker score of individuals 40280 and 11786; d Rarity scores of individuals 40280 and 11786
Fig. 4. Origin of rare alleles in GS:SFHS. Frequencies for rare Generation Scotland alleles (p ≤ 0.0004) in the a African, b Asian and c European populations of the 1000 Genomes data set
Distribution of the peaks detected using the rarity scores in windows of 50 SNPs in the two groups of individuals (outliers and non-outliers)
| Group | Statistic | Number of peaks per individual | Total coverage of peaks per individual (Mb) |
|---|---|---|---|
| Non-outliers | Mean | 5.3 | 2.3 |
| Non-outliers | Max | 118 | 50 |
| Non-outliers | Min | 0 | 0 |
| Outliers | Mean | 46.5 | 136.7 |
| Outliers | Max | 185 | 800 |
| Outliers | Min | 11 | 51 |
Table shows mean, maximum and minimum number of peaks and total peak size per individual
Fig. 5. Selected results from chromosomal PCA. Location of GS:SFHS individuals (whole genome and different chromosomes analyses) for PC 1 and 2 (upper row) and several PCs showing a distinct pattern (lower row). The colours show the correspondence between the groups shown in PC 5 and 6 when using the whole genome, and those obtained when analysing only chromosome 8 for PC 2 and 3
Areas in Scotland, number of individuals in the cohort born in each of the areas, number of individuals with the four grandparents coming from that area, and values of latitude and longitude used for each of the areas in the regression analyses
| No. | Area | N ind. | N 4GPs | Lat. | Lon. |
|---|---|---|---|---|---|
| 1 | Aberdeen City | 470 | 78 | 57.15 | -2.09 |
| 2 | Aberdeenshire | 100 | 85 | 57.16 | -2.72 |
| 3 | Angus | 290 | 66 | 56.80 | -2.92 |
| 4 | Argyll & Bute | 48 | 6 | 56.37 | -5.03 |
| 5 | Clackmannanshire | 0 | 0 | 56.12 | -3.55 |
| 6 | Dumfries & Galloway | 44 | 7 | 54.99 | -3.86 |
| 7 | Dundee City | 1016 | 202 | 56.46 | -2.97 |
| 8 | East Ayrshire | 23 | 5 | 55.46 | -4.33 |
| 9 | East Dunbartonshire | 80 | 4 | 55.96 | -4.20 |
| 10 | East Lothian | 11 | 1 | 55.95 | -2.77 |
| 11 | Edinburgh City | 189 | 21 | 55.95 | -3.19 |
| 12 | Western Isles | 18 | 8 | 57.76 | -7.02 |
| 13 | Falkirk | 34 | 4 | 56.00 | -3.78 |
| 14 | Fife | 184 | 26 | 56.21 | -3.15 |
| 15 | Glasgow City | 1644 | 414 | 55.86 | -4.25 |
| 16 | Highland | 77 | 29 | 57.36 | -5.10 |
| 17 | Inverclyde | 28 | 6 | 55.91 | -4.74 |
| 18 | Midlothian | 14 | 0 | 55.83 | -3.13 |
| 19 | Moray | 36 | 7 | 57.51 | -3.25 |
| 20 | North Ayrshire | 60 | 8 | 55.71 | -4.73 |
| 21 | North Lanarkshire | 117 | 24 | 55.83 | -3.92 |
| 22 | Orkney Islands | 8 | 5 | 58.94 | -2.74 |
| 23 | Perth & Kinross | 512 | 49 | 56.59 | -3.86 |
| 24 | Renfrewshire | 167 | 12 | 55.83 | -4.54 |
| 25 | Scottish Borders | 22 | 4 | 55.54 | -2.79 |
| 26 | Shetland Islands | 8 | 3 | 60.35 | -1.24 |
| 27 | South Ayrshire | 31 | 7 | 55.27 | -4.65 |
| 28 | South Lanarkshire | 110 | 20 | 55.52 | -3.70 |
| 29 | Stirling | 67 | 5 | 56.12 | -3.94 |
| 30 | West Dunbartonshire | 66 | 7 | 55.96 | -4.50 |
| 31 | West Lothian | 1 | 0 | 55.91 | -3.55 |
| - | Not disclosed | 1338 | - | NA | NA |
Results of the multiple linear regressions between the PC and the values of latitude and longitude in 6739 unrelated individuals of GS:SFHS
| Analysis | R2 | P-value | Signif. (model) | Signif. (latitude) | Signif. (longitude) |
|---|---|---|---|---|---|
| PC1 | 0.1092 | 4.20E-139 | *** | *** | *** |
| PC2 | 0.1632 | 6.81E-214 | *** | *** | *** |
| PC3 | 0.0936 | 2.47E-118 | *** | *** | *** |
| PC4 | 0.0010 | 6.26E-02 | * | ||
| PC5 | 0.0070 | 4.21E-09 | *** | *** | *** |
| PC6 | 0.0003 | 3.88E-01 | |||
| PC7 | 0.0077 | 6.13E-10 | *** | *** | |
| PC8 | 0.0035 | 6.51E-05 | *** | *** | |
| PC9 | 0.0002 | 5.15E-01 | |||
| PC10 | 0.0144 | 4.35E-18 | *** | ** | *** |
| PC11 | 0.0024 | 1.39E-03 | ** | * | |
| PC12 | 0.0008 | 1.09E-01 | |||
| PC13 | 0.0093 | 6.74E-12 | *** | ** | *** |
| PC14 | 0.0016 | 1.09E-02 | * | * | ** |
| PC15 | 0.0001 | 8.28E-01 | |||
| PC16 | 0.0033 | 1.04E-04 | *** | *** | |
| PC17 | 0.0023 | 1.91E-03 | ** | *** | * |
| PC18 | 0.0018 | 7.54E-03 | ** | * | |
| PC19 | 0.0033 | 1.01E-04 | *** | *** | |
| PC20 | 0.0046 | 3.35E-06 | *** | *** | |
Signif: ***p ≤ 0.001, **p ≤ 0.01, *p ≤ 0.05
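Each row of the table above summarises a multiple linear regression of one PC on latitude and longitude. A minimal sketch of how such an R2 could be obtained follows, assuming ordinary least squares with an intercept. The function name `regress_pc_on_location` is hypothetical, and the p-values and significance codes reported in the table are not computed here.

```python
import numpy as np

def regress_pc_on_location(pc, lat, lon):
    """Fit pc ~ intercept + lat + lon by least squares.

    Returns (beta, r2) with beta = [intercept, lat slope, lon slope].
    """
    pc = np.asarray(pc, dtype=float)
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    X = np.column_stack([np.ones_like(lat), lat, lon])
    beta, *_ = np.linalg.lstsq(X, pc, rcond=None)
    resid = pc - X @ beta
    ss_res = float(resid @ resid)                    # unexplained variation
    ss_tot = float(np.sum((pc - pc.mean()) ** 2))    # total variation
    return beta, 1.0 - ss_res / ss_tot
```

Calling the function once per PC, with each individual's area coordinates as `lat` and `lon`, would produce one R2 per row.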
Results of the multiple linear regressions between the PC and the values of latitude and longitude of the grandparents in 1113 individuals of GS:SFHS
| Analysis | R2 | P-value | Signif. (model) | Signif. (latitude) | Signif. (longitude) |
|---|---|---|---|---|---|
| PC1 | 0.2077 | 7.42E-57 | *** | ** | *** |
| PC2 | 0.3085 | 1.24E-89 | *** | *** | ** |
| PC3 | 0.2056 | 3.26E-56 | *** | *** | * |
| PC4 | 0.0100 | 3.68E-03 | ** | *** | |
| PC5 | 0.0301 | 4.34E-08 | *** | *** | *** |
| PC6 | 0.0100 | 3.85E-03 | ** | ** | ** |
| PC7 | 0.0247 | 9.27E-07 | *** | *** | |
| PC8 | 0.0085 | 9.00E-03 | ** | ** | |
| PC9 | 0.0076 | 1.47E-02 | * | ** | |
| PC10 | 0.0552 | 2.06E-14 | *** | *** | *** |
| PC11 | 0.0037 | 1.26E-01 | |||
| PC12 | 0.0055 | 4.65E-02 | * | * | |
| PC13 | 0.0082 | 1.06E-02 | * | ** | |
| PC14 | 0.0057 | 4.30E-02 | * | * | * |
| PC15 | 0.0017 | 3.93E-01 | |||
| PC16 | 0.0147 | 2.73E-04 | *** | *** | |
| PC17 | 0.0148 | 2.62E-04 | *** | *** | *** |
| PC18 | 0.0047 | 7.29E-02 | |||
| PC19 | 0.0045 | 8.16E-02 | |||
| PC20 | 0.0102 | 3.44E-03 | ** | ** | |
Signif: ***p ≤ 0.001, **p ≤ 0.01, *p ≤ 0.05
Fig. 6. Locations and predictions within Scotland. a Real location of the 31 different origins of the GS:SFHS individuals. b Predicted latitude and longitude of the individuals using the genomic principal components
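Panel b of Fig. 6 shows latitude and longitude predicted from the genomic principal components, but the prediction model is not described here. One simple possibility, sketched below as an assumption, is a least-squares fit of both coordinates on the leading PCs. The names `fit_location_model` and `predict_location` are hypothetical.

```python
import numpy as np

def fit_location_model(pcs, coords):
    """Least-squares fit of (lat, lon) on the PCs.

    pcs:    (n_individuals, k) matrix of PC scores
    coords: (n_individuals, 2) matrix with columns latitude, longitude
    Returns a (k + 1, 2) coefficient matrix; row 0 is the intercept.
    """
    pcs = np.asarray(pcs, dtype=float)
    X = np.column_stack([np.ones(len(pcs)), pcs])
    coef, *_ = np.linalg.lstsq(X, np.asarray(coords, dtype=float), rcond=None)
    return coef

def predict_location(coef, pcs):
    """Predict (lat, lon) for new individuals from their PC scores."""
    pcs = np.asarray(pcs, dtype=float)
    X = np.column_stack([np.ones(len(pcs)), pcs])
    return X @ coef
```

In this sketch the model would be fitted on individuals with known origin and then applied to the PC scores of any individual to place them on the map.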