Yukinori Okada1,2,3, Yukihide Momozawa4, Saori Sakaue5,6,7, Masahiro Kanai5,6,8, Kazuyoshi Ishigaki6, Masato Akiyama6, Toshihiro Kishikawa5,9, Yasumichi Arai10, Takashi Sasaki10, Kenjiro Kosaki11, Makoto Suematsu12, Koichi Matsuda13, Kazuhiko Yamamoto14, Michiaki Kubo15, Nobuyoshi Hirose10, Yoichiro Kamatani6,16.
Abstract
Understanding natural selection is crucial to unveiling the evolution of modern humans. Here, we report natural selection signatures in the Japanese population using 2234 high-depth whole-genome sequence (WGS) data (25.9×). Using rare singletons, we identify signals of very recent selection for the past 2000–3000 years in multiple loci (ADH cluster, MHC region, BRAP-ALDH2, SERHL2). In a large-scale genome-wide association study (GWAS) dataset (n = 171,176), variants with selection signatures show enrichment in heterogeneity of derived allele frequency spectra among the geographic regions of Japan, highlighted by two major regional clusters (Hondo and Ryukyu). While the selection signatures do not show enrichment in archaic hominin-derived genome sequences, they overlap with the SNPs associated with modern human traits. The strongest overlaps are observed for the alcohol or nutrition metabolism-related traits. Our study illustrates the value of high-depth WGS to understand evolution and its relationship with disease risk.
Year: 2018 PMID: 29691385 PMCID: PMC5915442 DOI: 10.1038/s41467-018-03274-0
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Site frequency spectrum and fraction of sites under selection in the worldwide populations. Site frequency spectrum (SFS) estimated for the worldwide populations. In addition to the Japanese WGS datasets, data obtained from the Genome Aggregation Database (gnomAD; African, admixed American, East Asian, Finnish, and non-Finnish European) and the UK10K project (European) are indicated[13,17]. a Fraction of sites under selection pressure (= f) calculated separately for loss-of-function variants, nonsynonymous SNVs, or synonymous SNVs. b Ratio of f between loss-of-function variants and synonymous SNVs.
Fig. 2 Longitudinal change of the effective population size of the Japanese population. Longitudinal change of the effective population size of the Japanese population estimated from the WGS data. The effective population sizes were estimated separately for datasets 1–3, using the SMC++ software[20]. Times are indicated on a logarithmic and b linear scales. One generation was considered to be 29 years.
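The 29-year generation time in the Fig. 2 caption lets the "2000–3000 years" window quoted in the abstract be read in generations. The following is a minimal sketch of that conversion; the function names are hypothetical and are not taken from the paper or from SMC++.

```python
# Generation time stated in the Fig. 2 caption.
YEARS_PER_GENERATION = 29


def years_to_generations(years, years_per_generation=YEARS_PER_GENERATION):
    """Convert a time span in years to a (fractional) number of generations."""
    return years / years_per_generation


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations back to years."""
    return generations * years_per_generation


# The "very recent selection" window from the abstract, in generations.
recent_window_generations = (
    years_to_generations(2000),
    years_to_generations(3000),
)
print(recent_window_generations)
```

Under this assumption, the window corresponds to roughly 69 to 103 generations.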
Fig. 3 Genome-wide very recent natural selection signatures of the Japanese population. A Manhattan plot of the genome-wide natural selection signatures obtained from the WGS data of 2234 Japanese individuals. The y-axis indicates the −log10(P) of a genome-wide selection signature calculated by using SDS[9]. The horizontal gray line represents the genome-wide significance threshold (P < 5.0 × 10⁻⁸).
SNPs with very recent natural selection signatures in the Japanese population
| rsID | Chr | Position (hg19) | Ancestral/derived | DAF in WGS | Gene | Selection signature: SDS | Selection signature: P | DAF heterogeneity, 1000 Genomes Project global: fold change | DAF heterogeneity, 1000 Genomes Project global: P | DAF heterogeneity, Japanese: fold change | DAF heterogeneity, Japanese: P |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SNPs with SDS selection signatures (2234 Japanese subjects) | | | | | | | | | | | |
| rs75721934 | 4 | 100,142,780 | G/A | 0.750 | ADH clusters | 7.13 | 9.7 × 10⁻¹³ | 10.26 | 8.5 × 10⁻⁶ | 4.32 | 0.021 |
| rs58008302 | 6 | 29,493,261 | G/A | 0.186 | MHC region | 8.14 | 4.1 × 10⁻¹⁶ | 0.61 | 0.63 | 14.82 | 7.4 × 10⁻⁵ |
| rs3782886 | 12 | 112,110,489 | T/C | 0.289 | BRAP-ALDH2 | 8.13 | 4.4 × 10⁻¹⁶ | 3.66 | 0.0044 | 12.17 | 1.5 × 10⁻⁵ |
| rs4822159 | 22 | 42,932,013 | C/G | 0.193 | SERHL2 | −5.80 | 6.6 × 10⁻⁹ | 0.99 | 0.40 | 3.41 | 0.045 |
DAF, derived allele frequency; WGS, whole-genome sequence
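The SDS values and P-values in the table are numerically consistent with treating SDS as a standardized score and taking a two-sided tail probability under a standard normal distribution. The sketch below checks this. The normal-approximation assumption and the function name `sds_to_two_sided_p` are ours and are not stated in the text above, so small discrepancies from the tabulated values (the SDS column is rounded to two decimals) are expected.

```python
import math

# Genome-wide significance threshold quoted in the Fig. 3 caption.
GENOME_WIDE_ALPHA = 5.0e-8


def sds_to_two_sided_p(sds):
    """Two-sided P-value for a score assumed to be standard normal under the null.

    Uses P = erfc(|z| / sqrt(2)), which equals 2 * (1 - Phi(|z|)).
    """
    return math.erfc(abs(sds) / math.sqrt(2.0))


# (SDS, reported P) pairs copied from the table.
reported = {
    "rs75721934": (7.13, 9.7e-13),
    "rs58008302": (8.14, 4.1e-16),
    "rs3782886": (8.13, 4.4e-16),
    "rs4822159": (-5.80, 6.6e-9),
}

for rsid, (sds, p_reported) in reported.items():
    p_computed = sds_to_two_sided_p(sds)
    passes = p_computed < GENOME_WIDE_ALPHA
    print(f"{rsid}: computed P = {p_computed:.2e}, "
          f"reported P = {p_reported:.1e}, genome-wide significant: {passes}")
```

The sign of SDS is discarded by the two-sided test, so the negative score for rs4822159 yields a P-value in the same way as the positive scores.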
Fig. 4 Derived allele frequency heterogeneity of the SNPs with natural selection signatures. a DAF heterogeneity of the SNPs within subpopulations of the 1000 Genomes Project global subjects, or the regional populations of the Japanese subjects from the BBJ cohort. Strength of blue color corresponds to the density of the SNPs. Circles indicate the top SNPs identified by SDS, and the top SNPs with nominally significant enrichment of DAF heterogeneity are labeled (P < 0.05). b DAF spectra of the four SNPs with genome-wide SDS selection signatures in each sub- or regional population. The DAF in each of the seven regions of Japan (Hokkaido, Tohoku, Kanto-Koshinetsu, Chubu-Hokuriku, Kinki, Kyushu, and Okinawa) is colored on the geographical map. We note that the DAF in Chugoku-Shikoku was not available (colored in gray). c Correlations among the regional vector of Japan, PCs, and the SDS top SNP genotypes. PC1 separated the Japanese population into the two major clusters, Hondo and Ryukyu (left panel). Correlations between the regional vector and each of the PCs (middle panel), and between the top two PCs and each of the top SNP genotypes from the SDS analysis (right panel) are indicated. PC1 showed strong correlations with the regional vector and the SNP genotypes.
Fig. 5 Overlap between natural selection signatures and genetic risk of human phenotypes in Japanese. Enrichment of the natural selection signatures in the GWAS-associated variants of the diseases (n = 36), quantitative traits (n = 61), and immune-cell-specific eQTL (n = 6) in Japanese. For each trait, inflation of the selection χ2 value is indicated along the x-axis, and −log10(P) of enrichment is plotted along the y-axis. The horizontal gray line represents the significance threshold based on Bonferroni's correction for the number of evaluated traits (P < 0.00049).
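The Bonferroni threshold in the Fig. 5 caption can be reproduced from the trait counts given there. The sketch below assumes a family-wise significance level of 0.05, which the caption does not state explicitly; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni's correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Trait counts from the Fig. 5 caption.
n_diseases = 36
n_quantitative_traits = 61
n_eqtl_cell_types = 6
n_traits = n_diseases + n_quantitative_traits + n_eqtl_cell_types

# Assumption: family-wise alpha of 0.05 (not stated in the caption).
threshold = bonferroni_threshold(0.05, n_traits)
print(f"{n_traits} traits, threshold = {threshold:.5f}")
```

With 103 evaluated traits, 0.05 / 103 rounds to the 0.00049 quoted in the caption.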