Jun Yasuda, Fumiki Katsuoka, Inaho Danjoh, Yosuke Kawai, Kaname Kojima, Masao Nagasaki, Sakae Saito, Yumi Yamaguchi-Kabata, Shu Tadaka, Ikuko N Motoike, Kazuki Kumada, Mika Sakurai-Yageta, Osamu Tanabe, Nobuo Fuse, Gen Tamiya, Koichiro Higasa, Fumihiko Matsuda, Nobufumi Yasuda, Motoki Iwasaki, Makoto Sasaki, Atsushi Shimizu, Kengo Kinoshita, Masayuki Yamamoto.
Abstract
BACKGROUND: Genotype imputation from single-nucleotide polymorphism (SNP) genotype data using a haplotype reference panel consisting of thousands of unrelated individuals from populations of interest can help to identify strongly associated variants in genome-wide association studies. The Tohoku Medical Megabank (TMM) project was established to support the development of precision medicine, together with the whole-genome sequencing of 1070 human genomes from individuals in the Miyagi region (Northeast Japan) and the construction of the 1070 Japanese genome reference panel (1KJPN). Here, we investigated the performance of 1KJPN for genotype imputation of Japanese samples not included in the TMM project and compared it with other population reference panels.
Keywords: Genome reference panel; Genotype imputation; Japan; Population genetics
Year: 2018 PMID: 30041597 PMCID: PMC6057088 DOI: 10.1186/s12864-018-4942-0
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Human genomic similarities among four different regions of mainland Japan. a. Schematic diagram indicating the geographic origins of the Japanese samples used in this study. b. Scatter plot of the first and second principal components from a principal component analysis (PCA) of genetic diversity on chromosome 1 of Japanese and other East Asian populations analyzed in the 1000 Genomes Project. Horizontal and vertical axes indicate the first and second components, respectively. The dots represent individuals and the legend is shown at the top right. c. PCA of the five Japanese populations analyzed in this study (close-up view of Fig. 1b). The individuals used in the 1KJPN panel are indicated by small dots to improve the visualization of other populations
Fixation index (FST) estimation* of genetic differentiation of samples in four parts of Japan
| | Iwate | Nagahama | Aki |
|---|---|---|---|
| Miyagi | 0.000345 | 0.000380 | 0.00154 |
| Iwate | – | 0.000876 | 0.00192 |
| Nagahama | – | – | 0.000920 |
*The FST values are based on the SNPs on chromosome 1
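The source does not say which FST estimator produced the values above. As a rough illustration of how a pairwise FST can be computed from per-SNP allele frequencies, the sketch below uses Hudson's estimator in its ratio-of-averages form. That choice is an assumption and not necessarily the authors' method; the function name, its arguments, and the toy frequencies are hypothetical.

```python
def hudson_fst(freqs1, freqs2, n1, n2):
    """Pairwise FST from per-SNP allele frequencies (Hudson's estimator).

    freqs1, freqs2: allele frequencies of the same SNPs in two populations.
    n1, n2: number of sampled chromosomes (2 x individuals) in each population.
    Numerators and denominators are summed over SNPs before dividing
    (ratio of averages), so the result is one genome-region-wide value.
    """
    if len(freqs1) != len(freqs2):
        raise ValueError("frequency lists must cover the same SNPs")
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (
            (p1 - p2) ** 2
            - p1 * (1 - p1) / (n1 - 1)
            - p2 * (1 - p2) / (n2 - 1)
        )
        # Probability that two alleles drawn from different populations differ.
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        raise ValueError("no polymorphic SNPs between the two populations")
    return numerator / denominator


# Toy input (made-up frequencies, not data from the study).
toy_a = [0.10, 0.32, 0.55]
toy_b = [0.12, 0.30, 0.50]
print(hudson_fst(toy_a, toy_b, n1=200, n2=200))
```

Because of the sample-size correction, the estimate can be slightly negative when two populations have nearly identical frequencies; such values are usually read as zero differentiation.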
Fig. 2 Pairwise coincidence matrix of individuals from the four Japanese populations created by fineSTRUCTURE. The color scale represents the posterior confidence probability. The origins of each sample are indicated with colors in the top left of the figure (different color codes compared with Fig. 1b and c). Four clusters are presented and the numbers of samples from each region are indicated in brackets at the top of the matrix
Fig. 3 Imputation accuracies using 1KJPN data for Japanese populations not included in the 1KJPN panel. Plot of the imputation accuracy (vertical axis, aggregate r² value) against the non-reference allele frequency in the reference panel (horizontal axis) when the 1KJPN panel was used as the haplotype reference. Each population is indicated by a different color. Each point on the curves is the average of the corresponding allele frequency bin
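The caption above does not spell out how the per-bin accuracy was computed. One common reading of "aggregate r²" is the squared Pearson correlation between true genotypes and imputed dosages, pooled over all SNPs whose reference-panel allele frequency falls in the same bin. The sketch below follows that reading as an assumption; averaging per-SNP r² values within a bin would be an alternative. The function names, the input layout, and the bin edges are hypothetical.

```python
from collections import defaultdict


def squared_pearson(xs, ys):
    """Squared Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined without variation
    return (sxy * sxy) / (sxx * syy)


def aggregate_r2_by_bin(snps, bin_edges):
    """Pool genotypes per allele-frequency bin and return r2 for each bin.

    snps: iterable of (ref_panel_freq, true_genotypes, imputed_dosages),
          where genotypes are 0/1/2 and dosages are floats in [0, 2].
    bin_edges: ascending edges; each bin is half-open, [lower, upper).
    SNPs whose frequency falls outside every bin are ignored.
    """
    pooled = defaultdict(lambda: ([], []))
    for freq, truth, dosage in snps:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= freq < bin_edges[i + 1]:
                pooled[i][0].extend(truth)
                pooled[i][1].extend(dosage)
                break
    return {
        (bin_edges[i], bin_edges[i + 1]): squared_pearson(truth, dosage)
        for i, (truth, dosage) in sorted(pooled.items())
    }


# Toy input (made-up genotypes and dosages, not data from the study).
toy_snps = [
    (0.02, [0, 1, 0, 0], [0.1, 0.8, 0.0, 0.2]),
    (0.30, [0, 1, 2, 1], [0.2, 1.1, 1.9, 0.9]),
]
print(aggregate_r2_by_bin(toy_snps, [0.0, 0.05, 0.5]))
```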
Fig. 4 Differences in imputation accuracies using reference panels for Japanese populations. Vertical axis indicates the r² values and horizontal axis indicates the minor allele frequencies of the SNPs. Sample regions analyzed are indicated at the top of each panel
SNPs that differentiate Miyagi from Nagahama or Aki based on MAF differences
| Test | rsid (dbSNP 138) | Chr | Pos (hg19) | Miyagi | Nagahama | Aki | p-value | Annotation |
|---|---|---|---|---|---|---|---|---|
| Nagahama-Miyagi | rs1899621 | 3 | 157280172 | 0.03178 | 0.1795 | 0.0625 | 4.44E-07 | PQLC2L: Intronic |
| Nagahama-Miyagi | rs9501875 | 6 | 2685970 | 0.09331 | 0.2949 | 0.04688 | 8.41E-07 | MYLK4: Intronic |
| Nagahama-Miyagi | rs148081741 | 6 | 135758259 | 0.02944 | 0.2051 | 0.01562 | 4.32E-09 | AHI1: Intronic |
| Nagahama-Miyagi | rs141380643 | 6 | 135812869 | 0.02897 | 0.2051 | 0.01562 | 3.54E-09 | AHI1: Intronic |
| Nagahama-Miyagi | chr7:144127237:C:T | 7 | 144127237 | 0.0323 | 0.1795 | 0.1406 | 5.31E-07 | intergenic |
| Nagahama-Miyagi | rs55897843 | 10 | 28739479 | 0.01731 | 0.1538 | 0 | 4.98E-08 | LOC105376468: Intronic |
| Nagahama-Miyagi | chr11:32314153:G:A | 11 | 32314153 | 0.03087 | 0.1795 | 0.09375 | 3.26E-07 | intergenic |
| Nagahama-Miyagi | chr16:47678044:A:T | 16 | 47678044 | 0.01457 | 0.1282 | 0.0625 | 7.88E-07 | PHKB: Intronic |
| Nagahama-Miyagi | rs8094961 | 18 | 14267309 | 0.009615 | 0.1154 | 0 | 3.69E-07 | intergenic |
| Nagahama-Miyagi | rs57064200 | 21 | 46286788 | 0.3396 | 0.6282 | 0.371 | 3.86E-07 | PTTG1IP: Intronic |
| Aki-Miyagi | rs117933761 | 1 | 100267335 | 0.01402 | 0.03846 | 0.1562 | 8.90E-08 | intergenic |
| Aki-Miyagi | rs4922078 | 8 | 19512537 | 0.3312 | 0.359 | 0.6406 | 7.15E-07 | CSGALNACT1: Intronic |
| Aki-Miyagi | rs9423657 | 10 | 5607370 | 0.02682 | 0.03846 | 0.1875 | 3.33E-07 | LOC105376381: Intronic |
| Aki-Miyagi | rs10899501 | 11 | 78131408 | 0.4565 | 0.4359 | 0.1452 | 3.96E-07 | intergenic |
| Aki-Miyagi | chr13:103019948:C:T | 13 | 103019948 | 0.0565 | 0.1923 | 0.25 | 7.78E-07 | FGF14: Intronic |
| Aki-Miyagi | rs118020607 | 14 | 99987537 | 0.06232 | 0.1026 | 0.2656 | 5.24E-07 | CCDC85C: Intronic |
| Aki-Miyagi | rs59993898 | 15 | 80826474 | 0.009813 | 0.01282 | 0.125 | 8.60E-07 | ARNT2: Intronic |
| Aki-Miyagi | rs150711498 | 18 | 21327345 | 0.008879 | 0.01282 | 0.125 | 4.66E-07 | LAMA3: Intronic |
| Aki-Miyagi | rs11669387 | 19 | 33999497 | 0.1076 | 0.141 | 0.3438 | 7.66E-07 | PEPD: Intronic |
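The table above reports p-values for allele-frequency differences but does not name the statistical test, and it lists frequencies instead of allele counts. As one plausible way to test such a difference, the sketch below runs a two-sided Fisher's exact test on a 2x2 table of allele counts. The choice of test is an assumption, and the counts in the usage example are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are two populations; columns are (alternate, reference) allele counts.
    The p-value sums the hypergeometric probabilities of all tables with the
    same margins that are no more likely than the observed table.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    total = comb(n, col1)

    def prob(x):
        # Probability that row 1 holds x of the col1 alternate alleles.
        return comb(row1, x) * comb(n - row1, col1 - x) / total

    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)
    p_observed = prob(a)
    p_value = 0.0
    for x in range(lowest, highest + 1):
        p_x = prob(x)
        # Small relative tolerance guards against floating-point ties.
        if p_x <= p_observed * (1 + 1e-9):
            p_value += p_x
    return min(1.0, p_value)


# Hypothetical allele counts (not taken from the study):
# population A: 9 alternate alleles out of 78 chromosomes,
# population B: 20 alternate alleles out of 2000 chromosomes.
print(fisher_exact_two_sided(9, 78 - 9, 20, 2000 - 20))
```

For a genome-wide scan, each p-value would still need to be compared against a multiple-testing threshold; the source does not state which threshold was applied.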
Numbers of SNPs found in three populations but not found in 1KJPN (Miyagi population)
| Type | Iwate (per person) | Nagahama (per person) | Aki (per person) |
|---|---|---|---|
| Total SNPs (AC ≥ 2)* | 113,541 (834.86) | 33,067 (847.87) | 44,634 (1275.26) |
| Exonic | 1596 (11.74) | 411 (10.54) | 593 (16.94) |
| Missense | 1032 (7.59) | 282 (7.23) | 366 (10.46) |
| Stop gain | 14 (0.10) | 2 (0.05) | 9 (0.26) |
*AC: allele counts in the three regions