
An analysis of the demographic history of the risk allele R4810K in RNF213 of moyamoya disease.

Kae Koganebuchi¹,²,³, Kimitoshi Sato⁴, Kiyotaka Fujii⁴, Toshihiro Kumabe⁴, Kuniaki Haneji⁵, Takashi Toma⁵, Hajime Ishida⁵, Keiichiro Joh⁶, Hidenobu Soejima⁶, Shuhei Mano⁷, Motoyuki Ogawa¹,⁸, Hiroki Oota¹,³,⁸.

Abstract

BACKGROUND: Ring finger protein 213 (RNF213) is a susceptibility gene for moyamoya disease (MMD). A previous case-control study and a family analysis demonstrated a strong association of the East Asian-specific variant, R4810K (rs112735431), with MMD. Our aim is to uncover the evolutionary history of R4810K in East Asian populations.
METHODS: The RNF213 locus of 24 MMD patients in Japan was sequenced using targeted-capture sequencing. Based on the sequence data, we conducted population genetic analyses and estimated the age of R4810K using coalescent simulation.
RESULTS: The diversity of the RNF213 gene was higher in Africans than in non-Africans, which can be explained by the bottleneck effect of the out-of-Africa migration. Coalescent simulation showed that the risk variant arose in East Asia 14,500-5100 years ago and came to the Japanese archipelago afterward, probably in the period when the migration known from archaeological evidence occurred.
CONCLUSIONS: Although clinical data show that the symptoms vary, all sequences harboring the risk allele are almost identical, with a small number of exceptions, suggesting that the MMD phenotypes are unaffected by variants of this gene and are more likely affected by environmental factors.

Keywords: East Asia, Jomon people, Moyamoya disease, RNF213, Yayoi immigrants, age of mutation

Year: 2021 PMID: 34013582 PMCID: PMC8453937 DOI: 10.1111/ahg.12424

Source DB: PubMed Journal: Ann Hum Genet ISSN: 0003-4800 Impact factor: 1.670

INTRODUCTION

Moyamoya disease (MMD) is a disorder characterized by spontaneous occlusion of the circle of Willis with the development of a fragile network of basal collateral vessels that resembles a puff of smoke on a cerebral angiogram. MMD was reported as a disease of the bilateral internal carotid arteries in 1957 (Takeuchi & Shimizu, 1957). The term “moyamoya,” which means “puffs of smoke” in Japanese, was applied to this disease in 1969 (Suzuki & Takaku, 1969) and is still used worldwide. MMD occurs frequently in East Asia, including Japan, but it is rare in Europe and Africa (Goto & Yonekawa, 1992). Though the majority of MMD cases are sporadic, a substantial number of familial cases have also been reported in previous studies (Kim, 2016). A family history of MMD is found in approximately 10−15% of patients (Kuriyama et al., 2008). Several previous studies have revealed loci associated with MMD in East Asia: 3p26–p24.2 (Ikeda et al., 1999), 6q25 (Inoue et al., 2000), 17q25 (Liu et al., 2010; Mineharu et al., 2006), and 8q23 (Sakurai et al., 2004). A genome‐wide association study and a pedigree analysis identified the same susceptibility gene, Ring finger protein 213 (RNF213), which is located at 17q25 (Kamada et al., 2011; Liu et al., 2011); the RNF213 gene is a regulator of cytoplasmic lipid droplets (Sugihara et al., 2019). The two genetic studies demonstrated a strong association of the variant R4810K in the RNF213 gene (rs112735431) with MMD in the East Asian population, with an extremely high odds ratio (approximately 110). The allele frequencies of R4810K in MMD cases are 48.1% in Japanese, 39.5% in Korean, and 12.5% in Chinese patients (Liu et al., 2011), whereas the variant has never been found in MMD patients with European ancestry (Kobayashi et al., 2016; Liu et al., 2011). A very low frequency of R4810K (0.00−2.84%) has been found in healthy subjects in East Asia (Liu et al., 2011, 2012; Takamatsu et al., 2017).
It is rare for a single risk variant to show such high penetrance, as R4810K does in East Asians, while patients with MMD nevertheless have very diverse symptoms (Research Committee on the Pathology and Treatment of Spontaneous Occlusion of the Circle of Willis, & Health Labour Sciences Research Grant for Research on Measures for Intractable Diseases, 2012). Although many variants associated with common (multifactorial) diseases are shared by populations around the world, their penetrance is generally low (Schork et al., 2009). On the other hand, the penetrance of risk alleles for Mendelian disorders that are observed only in certain families is generally high (Antonarakis et al., 2010). Previous studies on celiac disease and Crohn's disease have revealed that these diseases show characteristics intermediate between Mendelian disorders and common diseases (Nakagome et al., 2010, 2012; Zhernakova et al., 2010), which is thought to reflect the history of the population. MMD patients show various diagnostic characteristics, yet most East Asian patients carry R4810K; among Japanese patients in particular, 70%−90% carry the risk allele (Liu et al., 2011; Takamatsu et al., 2017). The questions are whether the diagnostic diversity is due to other mutations in the RNF213 gene, and how and when the lineage harboring R4810K spread across the East Eurasian continent and, in particular, into the Japanese archipelago. In the present study, we aim to clarify the population genetic characteristics and the spread of the MMD risk allele found in East Asian populations. We recently reported sequences of the 214‐kb region covering the RNF213 gene for 24 MMD patients (Koganebuchi et al., 2018). In this study, we perform population genetic analyses of RNF213 using these sequence data to elucidate how this risk allele has been maintained in East Asians and to estimate the age of the risk allele.
For a greater understanding of MMD, we discuss the evolution of the risk allele in the context of the peopling history of the Japanese.

MATERIALS AND METHODS

DNA samples of MMD cases

All patients were diagnosed with MMD according to the criteria issued by the Japanese Ministry of Health, Labor, and Welfare (Research Committee on the Pathology and Treatment of Spontaneous Occlusion of the Circle of Willis, & Health Labour Sciences Research Grant for Research on Measures for Intractable Diseases, 2012). Blood samples were collected from 30 unrelated MMD patients at Kitasato University Hospital, Kanagawa Prefecture, Japan. Information on family histories and onset of symptoms was obtained by interview. The patients presented with various symptoms and findings, such as transient ischemic attack, stroke, cerebral hemorrhage, headache, and unilateral or bilateral involvement. All individuals provided written informed consent to participate in this genetic research project. This project was approved by the ethical committee at Kitasato University School of Medicine. Genomic DNA was extracted from blood using the DNA Extractor WB Kit (Wako Pure Chemical Corporation, Japan).

DNA samples of healthy individuals in Japan

We analyzed four local populations in the Japanese archipelago: Northern Kyushu, the Okinawa islands, the Miyako islands and the Ishigaki islands (Figure S1). The latter three are part of the Ryukyu islands. We examined 47 Ryukyu islanders (15 Okinawa, 16 Miyako, and 16 Ishigaki islanders) who were originally described in Matsukusa et al. (2010) and who partially overlap with the population described in Sato et al. (2014). Saliva samples of Northern Kyushu were collected from 43 healthy individuals at Saga University. All subjects provided written informed consent to participate in this genetic research project. This project was approved by the ethical committees at the Faculty of Medicine, University of the Ryukyus and the Saga University School of Medicine, respectively. For the Saga University samples, we confirmed by interview that the birthplaces of all four grandparents were in the Saga and/or the Fukuoka prefectures and labeled these subjects as “Northern Kyushu.” Genomic DNA was extracted from the saliva samples following a modified protocol (Quinque et al., 2006) or a modified protocol of Oragene (DNA Genotek, Canada) saliva sample purification using the Puregene DNA Purification Kit (Qiagen, Germany) (http://www.dnagenotek.com/US/pdf/PD‐PR‐00212.pdf).

Polymerase chain reaction amplification

We amplified the 967 bp region, including the risk variant R4810K (rs112735431), by PCR using a pair of primers: 5′‐ACA TGG GCC CAA GGG ACA GAT TTC‐3′ and 5′‐ACA GGG ATG GGC CGA GTC AGG‐3′. For details on the PCR amplification conditions, see the Supporting Information text.

Direct sequencing analysis

The DNA sequencing reactions were performed using the BigDye® Terminator v3.1 Cycle Sequencing Kit (Life Technologies, USA) using the diluted PCR primers according to the commercial protocol. For details on the direct sequencing conditions, see the Supporting Information text.

Haplotype estimation and analysis of 1000 Genomes data

To initially explore LD and haplotype patterns, we selected 13 SNPs (Figure S2) in RNF213 that were polymorphic in all four global populations in the 1000 Genomes Project: Yoruba in Ibadan, Nigeria (YRI), Utah Residents (CEPH) with Northern and Western European Ancestry (CEU), Han Chinese in Beijing, China (CHB), and Japanese in Tokyo, Japan (JPT). A total of 414 individuals (108 YRI, 99 CEU, 103 CHB, 104 JPT) were included in the analyses. The 13 SNPs were spaced at intervals of 10−20 kb. SNPs were chosen if their expected heterozygosity was estimated to be greater than 0.2 in at least three of the four populations in the 1000 Genomes Project. This is because SNPs shared among diverse populations at intermediate frequencies are likely to be old and to have appeared before the out‐of‐Africa migration (about 70,000 years ago), and such SNPs are likely to yield higher r² values than low‐frequency alleles (Nordborg & Tavaré, 2002; VanLiere & Rosenberg, 2008). For details on the haplotype estimation, see the Supporting Information text.
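The SNP selection rule above can be sketched in a few lines of Python. This is only an illustration: the source does not state which estimator of expected heterozygosity was used, so the standard biallelic form H = 2p(1 − p) without sample-size correction is an assumption, and the allele frequencies and the function names `expected_heterozygosity` and `passes_filter` are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p.

    Assumption: the simple form 2p(1-p), with no sample-size correction.
    """
    return 2.0 * p * (1.0 - p)


def passes_filter(freqs_by_pop, threshold=0.2, min_pops=3):
    """True if H exceeds `threshold` in at least `min_pops` populations."""
    n_pass = sum(
        expected_heterozygosity(p) > threshold for p in freqs_by_pop.values()
    )
    return n_pass >= min_pops


# Hypothetical allele frequencies, for illustration only (not study data).
intermediate_snp = {"YRI": 0.30, "CEU": 0.45, "CHB": 0.15, "JPT": 0.08}
population_specific_snp = {"YRI": 0.05, "CEU": 0.02, "CHB": 0.40, "JPT": 0.35}

print(passes_filter(intermediate_snp))         # H > 0.2 in YRI, CEU, CHB
print(passes_filter(population_specific_snp))  # H > 0.2 only in CHB, JPT
```

Under this rule a SNP at intermediate frequency in three populations is kept, while one that is common only in East Asia is dropped, which matches the stated preference for old, widely shared variants.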

Population genetic statistics

Intrapopulation genetic diversity was measured by nucleotide diversity (π). The π of exons and introns was estimated using VCFtools (version 0.1.13) (Danecek et al., 2011) with a 1,000 bp window size and a 100 bp step size, and the window values were averaged. We compared the linkage disequilibrium (LD) of RNF213 among JPT, CHB, CEU, and YRI: we calculated two statistics of pairwise linkage disequilibrium, D′ and r², using Arlequin 3.5 (Excoffier & Lischer, 2010).
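As a rough illustration of what a windowed π estimate measures, the following Python sketch computes the average number of pairwise differences per site on a toy alignment and slides a window along it. It is not the VCFtools implementation, which works from VCF genotypes; the function names and the toy sequences are hypothetical, and only the window and step sizes (1,000 bp and 100 bp) come from the text.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    total_diffs = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


def sliding_window_pi(seqs, window=1000, step=100):
    """Pi for each window; defaults mirror the window/step sizes in the text."""
    length = len(seqs[0])
    return [
        nucleotide_diversity([s[start:start + window] for s in seqs])
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment (hypothetical): 3 sequences, 4 sites.
toy = ["AAAA", "AAAT", "AATT"]
pi_total = nucleotide_diversity(toy)                 # 4 differences / (3 pairs * 4 sites)
pi_windows = sliding_window_pi(toy, window=2, step=1)
mean_pi = sum(pi_windows) / len(pi_windows)          # the "average" reported per region
print(pi_total, pi_windows, mean_pi)
```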

RNF213 sequences of the 1000 Genomes Project data

VCF files of the 1000 Genomes Phase 3 containing the genomic region of the RNF213 gene (about 138 kb) were obtained for JPT, CHB, CEU, and YRI using the Data Slicer web tool (http://www.internationalgenome.org/data‐slicer/). The VCF files were converted to FASTA files using the vcf‐consensus function included in the VCFtools (version 0.1.13) software package (Danecek et al., 2011).

RNF213 sequencing data of the MMD patients

We used nucleotide sequence data of the 214‐kb region covering the RNF213 gene for 24 individuals, which we previously reported (Koganebuchi et al., 2018). In the previous study, we first sequenced the RNF213 locus including the R4810K site for the 30 MMD patients by Sanger PCR‐direct sequencing, and found that 22 individuals had the risk allele. Second, we sequenced the 214‐kb region covering the RNF213 gene using MiSeq (Illumina, USA) for the 22 individuals with the risk allele and 2 individuals without the risk allele for comparison. Hereafter, we call the 24 subjects “MY individuals” in this study. We labeled the two sequences of each individual as 1 and 2: for instance, the two sequences from an individual, MY1, were labeled MY1_1 and MY1_2. For details on the MiSeq sequencing procedure, see the Supporting Information text.

Constructing the RNF213 gene sequence tree

To select an outgroup sequence for the RNF213 gene tree, we constructed a phylogenetic tree using YRI individuals and the human ancestral reference sequence (https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/human_ancestor_GRCh37_e59.tar.bz2) provided by the 1000 Genomes Project. A sequence of NA18498 (NA18498_1) was selected as the outgroup for the RNF213 gene tree because NA18498_1 was placed on the branch nearest to the human ancestral reference sequence. The total number of sequences in the phylogenetic analysis, comprising two sequences from each of the MY and JPT individuals (128 individuals in total) and NA18498_1, was 257. We obtained a dataset of 135,539 bp of aligned sequences. All positions containing gaps and missing data were eliminated, and a total of 135,041 bp was used in the subsequent phylogenetic analysis. For details on the multiple alignment of the sequences, see the Supporting Information text. Phylogenetic analyses were performed using the program MEGA7 (Kumar et al., 2016). Pairwise sequence divergences were calculated under the Tamura‐Nei model (Tamura & Nei, 1993). Phylogenetic trees were constructed with the neighbor‐joining (NJ) method (Saitou & Nei, 1987). The reliability of the tree was evaluated using 100 bootstrap replicates.
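The elimination of all positions containing gaps or missing data (complete deletion) can be illustrated with a short Python sketch. This is not the MEGA7 code; the function name `complete_deletion`, the set of characters treated as missing (`-`, `N`, `?`), and the toy alignment are assumptions made for illustration. The sequence count is taken from the text (24 MY and 104 JPT individuals plus one outgroup sequence).

```python
def complete_deletion(seqs, missing="-N?"):
    """Drop every alignment column in which any sequence has a gap/missing symbol.

    Assumption: '-', 'N' and '?' mark gaps or missing data.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    keep = [
        i for i in range(len(seqs[0]))
        if all(s[i].upper() not in missing for s in seqs)
    ]
    return ["".join(s[i] for i in keep) for s in seqs]


# Two sequences per individual plus the single outgroup sequence.
n_sequences = 2 * (24 + 104) + 1

# Toy alignment (hypothetical): columns 3 and 5 contain missing data.
toy = ["ACGT-A", "ACNTTA", "ACGTTA"]
cleaned = complete_deletion(toy)
print(n_sequences, cleaned)
```

In the study, this step reduced the 135,539 aligned positions to 135,041, that is, 498 columns were dropped.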

Estimating the age of R4810K

We conducted coalescent simulations using GENETREE software version 9.0 (Griffiths, 2007) to generate maximum likelihood estimates for the scaled population mutation rate (θML), the coalescent tree and the age of mutations for R4810K using MY and JPT. We also used BEAST version 2.5.2 (Bouckaert et al., 2019) to estimate the age of mutation. For details on the analyses, see the Supporting Information text.

RESULTS

Allele frequency of RNF213 R4810K of healthy individuals

We examined the allele frequency of the RNF213 R4810K allele and found the risk allele at a frequency of 1.16% in Northern Kyushu (Figure S3), which was almost the same as that shown in previous studies (1.38% in Honshu and 1.85% in Northern Kyushu) (Liu et al., 2012; Takamatsu et al., 2017). In the Ryukyu islands, the risk allele was found only on Ishigaki island, at a slightly higher frequency (3.12%; one risk allele in 32 chromosomes); it was not found at all in the Miyako islands and the Okinawa islands (0.00%) (Figure S3). The frequency in the total population of the Ryukyu Islands (1.06%) was slightly lower than that of other Japanese populations. Thus, we found a regional difference in the risk allele frequencies between the mainland and the Ryukyu islands in the Japanese archipelago.
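The percentages above follow from simple chromosome counts, as the Python sketch below shows. The sample sizes come from the Materials and Methods, and the single Ishigaki risk allele is stated in the text; the count of one risk allele in Northern Kyushu is an assumption inferred from the reported 1.16% (1 in 86 chromosomes). The function name `allele_frequency` is hypothetical.

```python
def allele_frequency(n_risk_alleles, n_individuals):
    """Risk allele frequency in a diploid sample (2 chromosomes per individual)."""
    return n_risk_alleles / (2 * n_individuals)


# (risk alleles, individuals); the Northern Kyushu count of 1 is inferred, not stated.
samples = {
    "Northern Kyushu": (1, 43),
    "Okinawa": (0, 15),
    "Miyako": (0, 16),
    "Ishigaki": (1, 16),
}

freqs = {pop: allele_frequency(k, n) for pop, (k, n) in samples.items()}

# Pooled Ryukyu islands: Okinawa + Miyako + Ishigaki = 47 individuals, 94 chromosomes.
ryukyu_pops = ("Okinawa", "Miyako", "Ishigaki")
ryukyu_total = allele_frequency(
    sum(samples[p][0] for p in ryukyu_pops),
    sum(samples[p][1] for p in ryukyu_pops),
)

for pop, f in freqs.items():
    print(f"{pop}: {100 * f:.3f}%")
print(f"Ryukyu total: {100 * ryukyu_total:.3f}%")
```

With only 32 chromosomes sampled on Ishigaki, a single observed allele already corresponds to about 3%, which is why that estimate is sensitive to sampling.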

Nucleotide diversity

We examined the average π of RNF213 in four global populations (JPT, CHB, CEU, YRI) from the 1000 Genomes Project (Figure S4) and found that the π of exons (0.42−0.77 × 10⁻³) was lower than that of introns (1.01−1.39 × 10⁻³). The diversity of exons was higher in YRI (0.77 × 10⁻³) than in the other populations (0.42−0.50 × 10⁻³). The diversity of introns was also higher in YRI (1.39 × 10⁻³) than in the others (1.01−1.17 × 10⁻³). Nucleotide diversities of the entire RNF213 gene in each population were higher than the averages of those of exons (0.37−0.47 × 10⁻³) and introns (0.64−0.87 × 10⁻³) in whole‐genome data of the 1000 Genomes Project (Mu et al., 2011). Thus, this locus was a relatively highly polymorphic region.

Linkage disequilibrium

We calculated LD values (D′ and r²) of RNF213 in the global populations (Figures 1 and S5). More pairs of SNPs showed high LD in JPT, CHB, and CEU than in YRI. We found two high LD regions (blocks) in CEU, JPT, and CHB but not in YRI. The LD pattern of JPT was slightly different from those of CEU and CHB: a high LD was found in the region including rs112735431, corresponding to the risk allele, R4810K. However, this difference was not statistically significant because the risk allele frequency was too low (0.010) to show significant LD.
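The effect of a low allele frequency on these statistics can be seen in a small Python sketch that computes D′ and r² from haplotype and allele frequencies. This is the textbook two-locus definition, not the Arlequin implementation, and the second-locus frequency (0.4) and complete association are hypothetical values chosen for illustration; only the risk allele frequency of 0.010 is taken from the text.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Risk allele at frequency 0.010 that always sits on the B background
# (hypothetical B frequency of 0.4, complete association).
d_prime, r2 = ld_stats(p_ab=0.010, p_a=0.010, p_b=0.4)
print(d_prime, r2)
```

Even with complete association (D′ of 1), r² stays very small when one allele is rare, which is in line with the observation that significant LD is hard to show at an allele frequency of 0.010.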

FIGURE 1

Heatmap matrices of pairwise LD statistics D′ and r²

Note: The upper sides of the heatmaps are from the 5′ end. The lower sides of the heatmaps are from the 3′ end


Haplotype analysis

We examined the frequencies of the 13‐SNP haplotypes in the global populations from the 1000 Genomes Project (YRI, CEU, JPT, and CHB). We found that 12 of 172 haplotypes each accounted for ≥2% of the total of 414 individuals and named these 12 common haplotypes H1–H12 (Figure 2A). We combined the frequencies of the remaining haplotypes (Residual). Comparing the total number of haplotypes among the populations, YRI (82) showed more haplotypes than CEU (58), JPT (64) and CHB (56), indicating that YRI was the most diverse population for the RNF213 gene among the global populations; four haplotypes (H1, H6, H9, H11) accounted for 13.0% in YRI. The non‐African populations, CEU and CHB–JPT, showed eight (H1, H2, H4, H7–H9, H11, H12) and nine haplotypes (H2–H5, H7–H11) that accounted for 49.5% and 63.1%–46.6%, respectively. Six haplotypes (H2, H4, H7, H8, H9, H11) were common among the non‐African populations (CEU and JPT–CHB), while nine haplotypes (H2–H5, H7–H11) were shared between the East Asian populations (CHB and JPT). Though only 4 of the 12 common haplotypes (H1, H6, H9, H11) were found in YRI, almost all haplotypes, except for H3 and H10, were found in at least one of the seven African populations in the 1000 Genomes Project (Table S2). Thus, the pattern of haplotype frequencies can be explained simply by the bottleneck effect of the out‐of‐Africa migration that was shown in many previous studies (Jakobsson et al., 2008; Li et al., 2008; Ramachandran et al., 2005).
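The bookkeeping behind the common haplotypes and the residual class can be sketched in Python as follows. The 2% threshold on the pooled sample is taken from the text; the function name `summarize_haplotypes` and the toy three-SNP haplotypes are hypothetical, and phasing (which the study handled separately) is assumed to have been done already.

```python
from collections import Counter


def summarize_haplotypes(haplotypes, min_freq=0.02):
    """Split haplotypes into common ones (freq >= min_freq) and a pooled residual."""
    counts = Counter(haplotypes)
    total = len(haplotypes)
    common = {
        hap: count / total
        for hap, count in counts.most_common()
        if count / total >= min_freq
    }
    residual = 1.0 - sum(common.values())
    return common, residual


# Toy pooled sample of 100 phased chromosomes (hypothetical data).
pooled = ["AAA"] * 60 + ["AAT"] * 38 + ["TTT"] + ["TAT"]
common, residual = summarize_haplotypes(pooled)
print(common, residual)
```

In the study, the same kind of tally over 828 chromosomes (414 individuals) gave 12 common haplotypes out of 172.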

FIGURE 2

RNF213 13 SNP haplotypes. (A) Haplotype frequencies in each geographic population. Abbreviations of the populations are the same as in the Materials and Methods. Haplotypes with ≥2% frequency in the total population are numbered from H1 to H12, and those with <2% frequency in the total population are shown as residual. (B) A phylogenetic network for 13 haplotypes of 13 SNPs from RNF213 constructed by the median‐joining method. Pie charts represent 12 major haplotypes (see Figure 2A). The circle sizes are proportional to the world average of frequencies for the 12 major haplotypes with subdivisions showing the frequencies of the haplotypes in the geographic populations: JPT (black), CHB (dark gray), CEU (light gray), and YRI (black shaded lines). The star indicates the ancestral haplotype. The double circle indicates the risk allele‐harboring haplotype. The black bars and the numbers on the branches represent the SNP positions of the haplotype in Figure S2. The presence of the same number multiple times suggests repeat mutations and/or recombinations. The ambiguity in the evolution of the haplotypes is indicated by the two pathways from the ancestral haplotype

A haplotype harboring R4810K was found only in JPT; it was shared by two sequences included in the residual of the RNF213 haplotypes estimated using the 13 SNP sites. To understand the evolutionary history of the RNF213 haplotypes, we constructed a phylogenetic network of 13 haplotypes (the 12 common haplotypes and the haplotype harboring R4810K) and the human ancestral reference sequence as the ancestral haplotype (Figure 2B). The four populations of the 1000 Genomes Project did not have the ancestral haplotype. Because H3 and H10 were found in the East Asian populations but not in the African populations (Table S2), and were located at the edges of the network (Figure 2B), it is likely that the two haplotypes arose outside of Africa after the out‐of‐Africa migration.
The network showed two reticulations, indicating at least two (probably more) past recombination events within the RNF213 locus. Although two separate evolutionary directions from the ancestral haplotype are often interpreted as evidence of balancing selection (Bamshad & Wooding, 2003), there is no conventional statistical test to distinguish between balancing selection and random genetic drift. The topology of the RNF213 haplotype network can be explained by either of them. First, the network does not really have two separate directions: it contained further invisible reticulations caused by multiple recurrent mutations and/or recombinations in the locus. Second, if balancing selection were truly acting, two or more haplotypes should show remarkably high frequencies. However, there were no outstanding high‐frequency haplotypes, suggesting no evidence of strong balancing selection in the genomic region. H4 showed a relatively higher frequency than the other haplotypes, and the haplotype harboring R4810K likely arose from H4.

Genotypes of the R4810K site on the MMD cases

We analyzed genotypes of the R4810K site in the 30 MMD cases by direct sequencing. Eight individuals were homozygous for the healthy allele, while 22 individuals were heterozygous. No individual was homozygous for the risk allele. Thus, 73.3% of the cases were heterozygotes. A previous study reported a similar genotype frequency in a Japanese MMD patient cohort (Takamatsu et al., 2017). For targeted‐capture sequencing using MiSeq, we selected all 22 MMD patients harboring the risk allele and randomly chose 2 of the 8 MMD patients without the risk allele. These 2 MMD patients were analyzed for comparison with the 22 MMD patients harboring the risk allele. Hereafter, we analyzed the 24 MMD patients (MY individuals) in total.

NJ tree of RNF213

We constructed an NJ tree of RNF213 using the nucleotide sequence data of MY (24 patients) and JPT (104 individuals) (Figure 3A and B). Out of the 48 MY sequences, 22 sequences harbored R4810K. Out of these 22 sequences, 20 clustered with the 2 JPT sequences harboring R4810K, NA18977_1 and NA19070_2. We call this the “R4810K cluster.” The remaining two sequences, MY02_1 and MY21_1, were placed at other nodes. As shown in Figure S6, these two sequences showed two sequential gaps at approximately 14 and 40 kb from the 5′ end, indicating past recombination events. Namely, the two sequences plausibly arose through multiple recombinations between a sequence in the R4810K cluster and a sequence placed on branches closer to the root than the R4810K cluster. Out of the 20 clustered sequences harboring R4810K, 16 sequences were completely identical (Figure 3B). Thus, the sequences harboring R4810K had very high similarity to each other.

FIGURE 3

Phylogenetic tree for RNF213 based on 136‐kbp nucleotide sequences constructed by the neighbor‐joining (NJ) method

Note: The analysis involved 257 nucleotide sequences because the tree was constructed using nucleotide sequence data of an outgroup sequence and individuals in MY and JPT populations, which provided two sequences each. The genetic distance was estimated with the Tamura‐Nei model. Bootstrapping (×100) was performed. The black highlights represent sequences harboring R4810K found in MY, while the gray highlights represent the sequences harboring R4810K found in JPT of the 1000 Genomes Project database. (A) The NJ tree of the RNF213 gene. (B) A part of the NJ tree of the RNF213 gene focused on the R4810K cluster


Estimation of the ages of two mutations

To estimate the ages of R4810K and a mutation shared by the MY cluster (mutation nos. 6 and 1 in Figure S7, respectively), we looked for a nonrecombining region harboring the R4810K site. We analyzed all 24 sequences harboring R4810K in MY and JPT together with the outgroup sequence (NA19088_2) and found a nonrecombining region (79,723 bp) fitting the infinitely‐many‐sites model using GENETREE (Figure S7). We found seven polymorphic sites in the nonrecombining region (Tables S3 and S4) and a cluster including the patients (MY cluster) in the coalescent tree (Figure S7). Using the nonrecombining region, we estimated the ages with GENETREE and BEAST2 (Table 1). We used two models, constant population size and exponential population growth, for the estimation but found no difference between the ages estimated under the two models with GENETREE: the ages of R4810K and the shared mutation in the MY cluster were approximately 11,000 and 6500 years, respectively. The estimates from BEAST2 differed between the two models: the ages of R4810K and the shared mutation in the MY cluster were approximately 14,000 and 5900 years under the constant model, or 5100 and 4200 years under the growth model, respectively. Due to the large standard errors, the 95% confidence intervals of the ages of R4810K and MY overlapped among the four runs.

TABLE 1

Time estimates (in years ±SD) for the TMRCA of 24 chromosomes harboring R4810K and coalescent tree nodes of MMD cases

	GENETREE (constant)	GENETREE (population growth)	BEAST2 (constant)	BEAST2 (population growth)
R4810K	11,553 ± 4044	10,545 ± 3374	14,465 ± 4626	5107 ± 2590
MY	6730 ± 2256	6244 ± 1953	5884 ± 2816	4189 ± 1870

The estimated ages of R4810K and the mutation shared by the MY patients varied slightly between GENETREE and BEAST2 (Table 1). Though both programs are based on coalescent simulation, they use different algorithms for genealogy sampling: GENETREE adopts independent sampling, whereas BEAST2 adopts correlated sampling. Independent sampling assumes a restrictive mutation model, the infinitely‐many‐sites model. The correlated sampling method allows various mutational models but has more difficulty producing an exhaustive sample of high‐quality genealogies (Kuhner, 2009), which might explain the different estimates from GENETREE and BEAST2. The estimation of GENETREE can provide a stricter result than that of BEAST2.
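The overlap among the four runs can be checked roughly from the values in Table 1 with the Python sketch below. This is an approximation under stated assumptions: the table reports means ± SD, and the intervals here are built as mean ± 1.96 × SD under a normal approximation, which may not be how the intervals in the study were derived. The names `TABLE1`, `interval`, and `all_overlap` are hypothetical.

```python
# (mean, SD) in years, in the column order of Table 1:
# GENETREE constant, GENETREE growth, BEAST2 constant, BEAST2 growth.
TABLE1 = {
    "R4810K": [(11553, 4044), (10545, 3374), (14465, 4626), (5107, 2590)],
    "MY": [(6730, 2256), (6244, 1953), (5884, 2816), (4189, 1870)],
}


def interval(mean, sd, z=1.96):
    """Approximate 95% interval, assuming a normal distribution (an assumption)."""
    return (mean - z * sd, mean + z * sd)


def all_overlap(estimates):
    """True if all intervals share a common point.

    For intervals on a line this is equivalent to every pair overlapping.
    """
    intervals = [interval(mean, sd) for mean, sd in estimates]
    highest_lower = max(lo for lo, _ in intervals)
    lowest_upper = min(hi for _, hi in intervals)
    return highest_lower <= lowest_upper


for name, estimates in TABLE1.items():
    print(name, [interval(m, s) for m, s in estimates], all_overlap(estimates))
```

Under this approximation the four intervals for each row share a common range, so the difference between the BEAST2 growth model and the other runs is within the stated uncertainty.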

DISCUSSION

East Asia origin of the risk allele R4810K

The comparison of diversity among the RNF213 sequences from YRI, CEU, CHB and JPT reveals the genetic characteristics of the gene. The π values of African populations are higher than those of non‐African populations, which is a typical pattern of the bottleneck effect in the out‐of‐Africa expansion (Figure 2). In the averages over all genes in the entire genome, the average π of exons was smaller than that of introns, which is common in neutral genes. Meanwhile, both the exon and intron π in RNF213 were higher than the averages of the 1000 Genomes (Figure S4). The mutation rate of the RNF213 gene (see Materials and Methods) was higher than that of the whole genome in a recent study (approximately 0.5 × 10⁻⁹ mutations per nucleotide per year) (Scally, 2016). Such higher mutation rates are characteristic of genes related to the immune system (Martinsohn et al., 1999), which is often useful for environmental adaptation. However, the currently known function of RNF213 is related to fat metabolism (Sugihara et al., 2019). The comparison of RNF213 haplotypes revealed the expansion process of the gene and the occurrence of R4810K after the out‐of‐Africa migration of modern humans. The risk allele haplotype arose by a single nucleotide substitution from H4, which was shared by the three non‐African populations (JPT, CHB, and CEU) but not by YRI (Figure 2B). When we looked at H4 in all African data from the 1000 Genomes Project, one chromosome of Gambians in western Africa carried H4 (0.44%) (Table S2). H7 and H8, which were placed at the branches nearest to H4 in the haplotype network (Figure 2B), were not found in the African populations except for one chromosome carrying H8 in Kenya (Table S2). From these results, it is likely that H4 arose outside of Africa after the out‐of‐Africa migration, or that H4 arose in Africa but disappeared from the African populations.
The observation of H4 in Gambians and H8 in Kenyans at very low frequency might be explained by back‐migration from non‐African populations (e.g., colonization in the early modern period) or by some Gambians/Kenyans having non‐African ancestry. However, there is currently no evidence for these hypothetical explanations. More African populations should be examined to determine the origin of H4. It is much clearer that the R4810K mutation arose in East Asia, because the allele was found only there.

R4810K brought by ancient immigrants

The high similarity of the sequences harboring R4810K indicates a recent common ancestor. The region without recombination in the 22 MY sequences harboring R4810K was approximately 80 kb (Figure S6). Of the 22 sequences, 16 were exactly the same sequence type (Figure S6), indicating that the MY patients in the present study are closely related genetically, even though we collected the MY patient samples from different families. The similarity of the MY sequences harboring R4810K fits the peopling history discussed next. Recent population genomic studies of ancient and modern samples have revealed the peopling history of the Japanese (Gakuhari et al., 2020; Jinam et al., 2012; Kanzawa‐Kiriyama et al., 2017, 2019; Nakagome et al., 2015; Sato et al., 2014). According to archaeological records, the Jomon culture spanned ∼16,000−3000 years ago in the entire Japanese archipelago after the Upper Paleolithic period (Mizoguchi, 2013). The people who had the Jomon culture, whom we call the Jomon people, were aboriginal hunters–fishers–gatherers. The Japanese archipelago was connected to the East Eurasian continent via the northern island, Hokkaido, until the end of the Last Glacial Maximum, and separated from the continent around 18,000 years ago. The Jomon people were, therefore, isolated from the other populations in East Eurasia (Kaifu et al., 2015). Around 3000 years ago, immigrants came to the Japanese archipelago from East Asia and introduced paddy rice farming. This culture is called the Yayoi culture, and the people who had the Yayoi culture are called Yayoi immigrants. Based on our ancient genome analysis of a Jomon individual, the divergence between the Jomon people and the Yayoi immigrants dates back to the common ancestor of present‐day East Eurasians and Native Americans (40,000−26,000 years ago) (Gakuhari et al., 2020). According to previous genetic studies, R4810K is found in Chinese, Korean, and Japanese populations (Figure S3).
In our estimation, R4810K arose approximately 14,500−5100 years ago (Table 1), which corresponds to the Early to Middle Jomon period, though the estimated age has a wide range of uncertainty. Based on the current geographical distribution of R4810K, it is likely that R4810K arose on the East Asian continent. Because the mutation shared in the MY cluster (the split time shown by the grey arrow in Figure S7) was more recent than R4810K (the split time shown by the black arrow in Figure S7), we speculate that the ancestor of the MY patients examined in this study came to the Japanese archipelago together with the immigrants from the East Asian continent. By analyzing ancient genomes from bones of this period from Korea and China, we might be able to reveal the whole picture of the RNF213 gene. The distribution of R4810K in the Japanese archipelago could fit “the dual structure model” of the peopling history of the Japanese archipelago, which is based on morphological anthropology (Hanihara, 1991; Ishida et al., 2009; Matsumura, 2001). Recent genetic and genomic population studies have strongly supported that Ryukyu islanders show a lower genetic contribution from the Yayoi immigrants than the Hondo people (Koganebuchi et al., 2012; Matsukusa et al., 2010; Nakagome et al., 2015; Sato et al., 2014). When comparing the allele frequency of R4810K among the six sampling sites of the Japanese archipelago, we found the risk allele at higher frequencies in Hondo and Northern Kyushu than in the Ryukyu islands (Figure S3). Thus, it is plausible that the R4810K frequency distribution pattern was formed by the demographic history explained by the dual structure model.
The exception, a relatively high frequency (3.12%) of the R4810K allele in the Ishigaki population, may reflect drift due to the small sample size (16 individuals) and may also reflect the influence of migration from Hondo to Ishigaki Island that occurred before the generation of the subjects' grandparents. However, data on the diversity of RNF213 sequences harboring R4810K in the Ryukyu Islands and Kyushu, as well as in mainland East Asia, will be required to reveal the detailed expansion process of R4810K.
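To illustrate why a sample of 16 individuals gives only a coarse frequency estimate, the following minimal sketch works through the arithmetic. It assumes diploid autosomal genotypes (32 chromosomes in total), under which the reported 3.12% is consistent with a single observed copy of the allele. The function names and the "true" frequency of 1% used in the second calculation are hypothetical values chosen for illustration and are not taken from this study.

```python
def allele_frequency(copies: int, individuals: int, ploidy: int = 2) -> float:
    """Observed allele frequency: copies / (individuals * ploidy)."""
    return copies / (individuals * ploidy)


def prob_at_least_one_copy(true_freq: float, individuals: int, ploidy: int = 2) -> float:
    """Binomial probability of sampling at least one copy of an allele
    whose population frequency is true_freq, assuming independent draws."""
    chromosomes = individuals * ploidy
    return 1.0 - (1.0 - true_freq) ** chromosomes


# One copy among 16 diploid individuals (assumption: 32 chromosomes).
observed = allele_frequency(copies=1, individuals=16)
print(f"observed frequency: {observed:.4%}")

# Hypothetical population frequency of 1% (illustrative only).
chance = prob_at_least_one_copy(true_freq=0.01, individuals=16)
print(f"chance of seeing at least one copy: {chance:.1%}")
```

Under these assumptions, the smallest non-zero frequency that such a sample can report is 1/32, so a single carrier is enough to produce an apparently elevated value.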

Diagnostic variation and genetic diversity

Our analysis indicates that R4810K has been maintained by demographic history despite its deleterious effects, which is consistent with the diagnostic characteristics of MMD. MMD shows some different diagnostic characteristics between homozygotes and heterozygotes for the risk allele (Miyatake, Miyake, et al., 2012; Miyatake, Touho, et al., 2012). R4810K homozygosity is strongly associated with early‐onset MMD (age at onset <5 years), severe symptomatic manifestations at diagnosis, and poor prognosis (Kim et al., 2015). MMD patients heterozygous for R4810K have a later onset (after reproductive age) and show less severe symptoms than homozygous patients. No advantageous phenotype has been reported so far in individuals harboring R4810K. Thus, the R4810K mutation could be passed on to descendants in the heterozygous state regardless of its deleterious effect. Mendelian disease alleles have large effects and are very rare because they are generally found in particular family lineages, while common disease alleles are shared among global populations (Manolio et al., 2009). Risk alleles of common diseases appeared before the divergence of human populations (approximately 200,000 years ago), while Mendelian disease alleles are likely to be very recent mutations (some decades to hundreds of years ago) (Nakagome & Oota, 2012). R4810K is East Asian‐specific and the age of the mutation is approximately 12,000−5000 years. Thus, the age and the distribution pattern of the MMD risk allele are intermediate between those of Mendelian and common disease alleles. Although the RNF213 sequences with the risk allele were very similar to each other, the diagnostic characteristics of the MMD cases are highly diverse (e.g., transient ischemic attack, cerebral infarction, intracranial bleeding, headache) (Kuroda & Houkin, 2008), suggesting that RNF213 genetic variation cannot explain the high diversity of the phenotypes of MMD.
Rather, interactions with other genes and the environment could explain the high phenotypic diversity of MMD, which is an important issue to be addressed in the near future.

AUTHOR CONTRIBUTION

Study design: Kae Koganebuchi and Hiroki Oota. Sample collection: Kae Koganebuchi, Kimitoshi Sato, Kiyotaka Fujii, Toshihiro Kumabe, Kuniaki Haneji, Takashi Toma, Hajime Ishida, Keiichiro Joh, Hidenobu Soejima, and Hiroki Oota. Data analysis: Kae Koganebuchi. Supervision: Shuhei Mano and Motoyuki Ogawa. Manuscript preparation: Kae Koganebuchi and Hiroki Oota.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.

SUPPORTING INFORMATION

Supplementary figure 1. Map of the local areas examined in this study. We referred to figure 1 in Koganebuchi et al. (2018).
Supplementary figure 2. Relational map for 13 SNPs at RNF213. The black boxes indicate exons. The white boxes indicate UTR regions.
Supplementary figure 3. Allele frequencies (%) of RNF213 R4810K in East Asian populations.
Supplementary figure 4. The averages of nucleotide diversity (π) (×10³) in (A) the exons and (B) the introns of RNF213.
Supplementary figure 5. The exact test and the χ² test for D′ and r².
Supplementary figure 6. RNF213 sequence types used for coalescent simulation.
Supplementary figure 7. Coalescent tree of RNF213 harboring R4810K with no recombination (79,723 bp).
Supplementary table 1. Maximum likelihood fits of 24 different nucleotide substitution models.
Supplementary table 2. Haplotype frequency of seven African populations in the 1000 Genomes Project.
Supplementary table 3. Haplotype frequencies of the genetic region in RNF213 harboring R4810K with no recombination (79,723 bp).
Supplementary table 4. Analysis on the MMD risk allele cluster by GENETREE.

52 in total

1. Mathematical properties of the r2 measure of linkage disequilibrium.

Authors: Jenna M VanLiere; Noah A Rosenberg
Journal: Theor Popul Biol Date: 2008-06-01 Impact factor: 1.570

2. A rare Asian founder polymorphism of Raptor may explain the high prevalence of Moyamoya disease among East Asians and its low prevalence among Caucasians.

Authors: Wanyang Liu; Hirokuni Hashikata; Kayoko Inoue; Norio Matsuura; Yohei Mineharu; Hatasu Kobayashi; Ken-Ichiro Kikuta; Yasushi Takagi; Toshiaki Hitomi; Boris Krischek; Li-Ping Zou; Fang Fang; Roman Herzig; Jeong-Eun Kim; Hyun-Seung Kang; Chang-Wan Oh; David-Alexandre Tregouet; Nobuo Hashimoto; Akio Koizumi
Journal: Environ Health Prev Med Date: 2009-11-19 Impact factor: 3.674

3. Guidelines for diagnosis and treatment of moyamoya disease (spontaneous occlusion of the circle of Willis).

Authors:
Journal: Neurol Med Chir (Tokyo) Date: 2012 Impact factor: 1.742

4. Genome-wide SNP analysis reveals population structure and demographic history of the ryukyu islanders in the southern part of the Japanese archipelago.

Authors: Takehiro Sato; Shigeki Nakagome; Chiaki Watanabe; Kyoko Yamaguchi; Akira Kawaguchi; Kae Koganebuchi; Kuniaki Haneji; Tetsutaro Yamaguchi; Tsunehiko Hanihara; Ken Yamamoto; Hajime Ishida; Shuhei Mano; Ryosuke Kimura; Hiroki Oota
Journal: Mol Biol Evol Date: 2014-08-01 Impact factor: 16.240

5. Importance of RNF213 polymorphism on clinical features and long-term outcome in moyamoya disease.

Authors: Eun-Hee Kim; Mi-Sun Yum; Young-Shin Ra; Jun Bum Park; Jae Sung Ahn; Gu-Hwan Kim; Hyun Woo Goo; Tae-Sung Ko; Han-Wook Yoo
Journal: J Neurosurg Date: 2015-10-02 Impact factor: 5.115

6. Mapping of a familial moyamoya disease gene to chromosome 3p24.2-p26.

Authors: H Ikeda; T Sasaki; T Yoshimoto; M Fukui; T Arinami
Journal: Am J Hum Genet Date: 1999-02 Impact factor: 11.025

7. A novel susceptibility locus for moyamoya disease on chromosome 8q23.

Authors: Kaoru Sakurai; Yasue Horiuchi; Hidetoshi Ikeda; Kiyonobu Ikezaki; Takashi Yoshimoto; Masashi Fukui; Tadao Arinami
Journal: J Hum Genet Date: 2004 Impact factor: 3.172

8. The history of human populations in the Japanese Archipelago inferred from genome-wide SNP data with a special reference to the Ainu and the Ryukyuan populations.

Authors: Timothy Jinam; Nao Nishida; Momoki Hirai; Shoji Kawamura; Hiroki Oota; Kazuo Umetsu; Ryosuke Kimura; Jun Ohashi; Atsushi Tajima; Toshimichi Yamamoto; Hideyuki Tanabe; Shuhei Mano; Yumiko Suto; Tadashi Kaname; Kenji Naritomi; Kumiko Yanagi; Norio Niikawa; Keiichi Omoto; Katsushi Tokunaga; Naruya Saitou
Journal: J Hum Genet Date: 2012-11-08 Impact factor: 3.172

9. A partial nuclear genome of the Jomons who lived 3000 years ago in Fukushima, Japan.

Authors: Hideaki Kanzawa-Kiriyama; Kirill Kryukov; Timothy A Jinam; Kazuyoshi Hosomichi; Aiko Saso; Gen Suwa; Shintaroh Ueda; Minoru Yoneda; Atsushi Tajima; Ken-Ichi Shinoda; Ituro Inoue; Naruya Saitou
Journal: J Hum Genet Date: 2016-09-01 Impact factor: 3.172

10. Differences in the Genotype Frequency of the RNF213 Variant in Patients with Familial Moyamoya Disease in Kyushu, Japan.

Authors: Yuichiro Takamatsu; Ken Higashimoto; Toshiyuki Maeda; Masatou Kawashima; Muneaki Matsuo; Tatsuya Abe; Toshio Matsushima; Hidenobu Soejima
Journal: Neurol Med Chir (Tokyo) Date: 2017-09-21 Impact factor: 1.742
