
Atlas of Cryptic Genetic Relatedness Among 1000 Human Genomes.

Larisa Fedorova, Shuhao Qiu, Rajib Dutta, Alexei Fedorov.

Abstract

A novel computational method for detecting identical-by-descent (IBD) chromosomal segments between sequenced genomes is presented. It utilizes the distribution patterns of very rare genetic variants (vrGVs), which have minor allele frequencies <0.2%. In contrast to existing probabilistic approaches, our method is essentially deterministic, because it considers a group of very rare events that cannot plausibly occur together by chance alone. This method has been applied in an exhaustive computational search for shared IBD segments among 1,092 sequenced individuals from 14 populations. It demonstrated that clusters of vrGVs are unique and powerful markers of genetic relatedness that uncover IBD chromosomal segments between and within populations, irrespective of whether divergence was recent or occurred hundreds to thousands of years ago. We found that several IBD segments are shared by practically any possible pair of individuals belonging to the same population. Moreover, shared short IBD segments (median size 183 kb) were found in 10% of inter-continental human pairs, each comprising a person from sub-Saharan Africa and a person from Southern Europe. The shortest shared IBD segments (median size 54 kb) were found in 0.42% of inter-continental pairs composed of individuals from Chinese/Japanese populations and Africans from Kenya and Nigeria. Knowledge of the inheritance of IBD segments is important in clinical case-control and cohort studies, since unknown distant familial relationships could compromise interpretation of collected data. Clusters of vrGVs should be useful markers for familial relationship and common multifactorial disorders.
© The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords:  DNA; bioinformatics; biomarker; evolution; genealogy; inheritance

Year:  2016        PMID: 26907499      PMCID: PMC4824066          DOI: 10.1093/gbe/evw034

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

Studies of genetic relatedness rely on the fundamental concept of identical-by-descent (IBD) for inheritance of genetic material (Powell et al. 2010; Browning and Browning 2012; Carmi et al. 2013, 2014; Thompson 2013). The genome of every individual is a mosaic of IBD segments inherited from previous generations. Real human populations have limited sizes and have frequently experienced admixture. Thus, even genealogically unrelated individuals from the same geographical region frequently share one or several IBD genomic segments transmitted from their common distant ancestors. Investigation of peculiarities in IBD segment inheritance is critical for understanding fundamental questions regarding human evolution and demographic history, as well as for practical purposes including individualized medicine and clinical association studies. However, precise detection of IBD segments, even when shared by not-very-distant genetic relatives, faces several problems. Whole-genome SNP analysis on gene arrays frequently produces erroneous results (Browning and Browning 2011; Huff et al. 2011; Durand et al. 2014; Li et al. 2014). The widely used shotgun next-generation sequencing does not confidently distinguish maternal and paternal genomic portions (the so-called “phasing” of sequenced DNA). Numerous phasing errors in distinguishing between parental chromosomes have led to frequent incorrect detection of IBD segments (Kong et al. 2008). Characterization of IBD segments by current methods depends on complex statistical algorithms, multiple assumptions, and probabilistic approaches (Su et al. 2012; Browning and Browning 2013; Durand et al. 2014). Hence, false-positive and false-negative predictions often occur in establishing distant genetic relatedness. Our group recently presented a novel and simple computational method for detecting shared IBD segments (Al-Khudhair et al. 2015).
This method utilizes the distribution patterns of very rare genetic variants (vrGVs), which have minor allele frequencies <0.2%, and does not require phasing of genomic sequences. Since all living species experience an intense influx of mutations in their genomes, vrGVs are very abundant. Any given human being has 50–100 de novo DNA changes, on average (Conrad et al. 2011; Li and Durbin 2011; Kondrashov and Shabalina 2002). Due to this intense mutagenesis, vrGVs occur by the tens of thousands in every individual, and their patterns along chromosomes are exceptional clues to their most recent evolution. The usefulness of rare SNPs has been acknowledged in several publications (Hochreiter 2013; Moore et al. 2013). We showed that shared vrGVs between two individuals are clustered in a single or a few genomic loci. This article introduces and defines clusters of vrGVs and presents a new approach to distinguish between identical-by-state (IBS) and IBD chromosomal segments. When two people share five adjacent vrGVs located in the same region, the probability of this event occurring by random coincidence (the so-called IBS event) is equal to $0.002^5$, which is less than one in $10^{13}$. Therefore, these clusters of shared vrGVs are credible markers of IBD genomic segments. Five or more adjacent shared vrGVs are called rare variant clusters (RVCs). Characterization of shared RVCs gives remarkable reliability for IBD segment identification and, at the same time, precise localization of IBD segments on the chromosome. In this article, we characterized the entire set of shared RVCs for every possible pair of 1,092 sequenced individuals (1,191,372 pairs) and demonstrated that the distribution of shared RVCs perfectly matches human history and migration routes during the last 9,500 years.
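
The arithmetic behind this bound can be checked directly. The sketch below assumes that each vrGV has a minor allele frequency of at most 0.2% and that the five sharing events are independent of one another; the variable names are illustrative only.

```python
from fractions import Fraction

# Upper bound on the minor allele frequency of a vrGV (0.2%, from the text).
MAX_MAF = Fraction(2, 1000)
# Minimum number of adjacent shared vrGVs that defines an RVC.
CLUSTER_SIZE = 5

# Assumption: the five sharing events are independent, so the chance of a
# purely coincidental (IBS) match is bounded by the product of the frequencies.
p_ibs = MAX_MAF ** CLUSTER_SIZE

print(float(p_ibs))                 # bound on the coincidence probability
print(p_ibs < Fraction(1, 10**13))  # is it below one in 10^13?
```

Exact fractions are used here so that the comparison with one in $10^{13}$ is not affected by floating-point rounding.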

Materials and Methods

Datasets

We used data from the 1000 Genomes Project, phase 1, which are available through the public FTP site ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20110521/, last accessed March 1, 2016 (Abecasis et al. 2012). Specifically, we used Variant Call Format (VCF) files version 4.1, which contain a total of 38.2 million SNPs, 3.9 million short insertions/deletions, and 14 thousand deletions across all the human chromosomes. We defined vrGVs as polymorphisms whose minor alleles have frequencies <0.2% in the 1000 Genomes data (Al-Khudhair et al. 2015). By processing the VCF files of 1,092 individuals we identified 16,326,219 vrGVs in total for which the minor allele is the mutant (non-reference) allele, and only 17,611 vrGVs for which the minor allele is the reference allele. This asymmetry exists because the reference human genome was created from the pooled information of six sequenced individuals (Lander et al. 2001). Since the vrGVs corresponding to reference alleles represent only 0.1% of the vrGVs corresponding to mutant alleles, we omitted the former and processed only the vrGVs from mutant alleles. Algorithms, programs, parameters, and calculated probabilities for false-positive detection of RVCs are presented in the supplementary file “M&M”, Supplementary Material online.
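
To make the frequency threshold concrete, the sketch below converts it into an allele count for this dataset. It assumes diploid autosomal genotypes, so that 1,092 individuals carry 2,184 chromosome copies; the function name `is_vrgv` is hypothetical and is not part of the authors' pipeline.

```python
N_INDIVIDUALS = 1092
N_CHROMOSOMES = 2 * N_INDIVIDUALS  # assumption: diploid autosomal calls
MAF_THRESHOLD = 0.002              # 0.2%, from the vrGV definition


def is_vrgv(minor_allele_count: int) -> bool:
    """Return True if the allele is observed at least once and is below 0.2%."""
    if minor_allele_count < 1:
        return False
    return minor_allele_count / N_CHROMOSOMES < MAF_THRESHOLD


# Largest allele count that still qualifies as a vrGV under these assumptions.
max_count = max(c for c in range(1, N_CHROMOSOMES + 1) if is_vrgv(c))
print(max_count)
```

Under these assumptions the cutoff works out to four copies among the 2,184 chromosomes.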

Availability

All our programs, their instruction manuals and notes, supporting files, and results files are available in supplementary file “Data”, Supplementary Material online, and also from our web site (http://bpg.utoledo.edu/~afedorov/lab/atlas_vrGV.html, last accessed March 1, 2016). Supplementary file “Data”, Supplementary Material online, is an archive file in the “.tar.gz” format, which contains five folders: ProgramsInstructions, InputData, OutputDataWindow20, OutputDataWindow9, and Modeling. The execution of the entire pipeline of Perl programs for processing 1,092 genomes takes <3 h on a modern desktop Linux workstation using a single CPU (no parallelization required). The execution of our modeling program IBDsimulator.pl takes about 1–2 min per primogenitor genome on a modern desktop computer. To repeat this program 500,000 times, we ran it in parallel on 64 cores for several weeks.

Statistics

Statistical analyses of the distribution of RVC lengths between two pairs of populations were performed using R (R Development Core Team 2010).

Results

Creation of vrGV Databases

Complete sets of vrGVs for each of the 1,092 individuals have been created and are available as supplementary file “Data”, Supplementary Material online (folder InputData). An example of such an individual-specific vrGV database is shown in table 1, which represents an arbitrarily chosen individual (HG01365) from the Colombian population (CLM). Each minor allele of a vrGV in this individual-specific database is present in the genome of the person HG01365 and, often, in one to three other genomes of the 1,092 sequenced individuals. When a vrGV from an individual-specific vrGV database is shared by two, three, or four people, the identifiers of all individuals who have this minor allele are also present in the database (columns 6–9, table 1). The number of vrGVs of a particular person depends on the population that person belongs to (Al-Khudhair et al. 2015). The highest number of vrGVs is seen in the African populations (average 67,000 vrGVs per person; SD = 7,500), followed by American (average 24,600 vrGVs; SD = 4,500), Asian (24,100 ± 4,100), and European (16,200 ± 2,700) populations. The number of shared vrGVs for a particular pair of individuals also depends on the populations these two persons belong to (tables 2 and 3). When a pair of individuals shares several vrGVs, these shared vrGVs are usually grouped in one locus or a few loci (table 1, italicized entries). Five or more shared adjacent vrGVs are called RVCs. In order to characterize shared RVCs we created a Perl program, RVC.pl, which scans an individual-specific vrGV database and identifies all shared RVCs inside it. Across all 1,092 processed genomes, an RVC contains 12.6 vrGVs on average. The distribution of clusters along chromosomes is seemingly random; no obvious patterns in their genomic locations have been observed.
A segment of the output file from RVC.pl, representing a complete list of the RVCs that the individual under analysis shares with the other sequenced individuals, is shown in table 4. Such output files were obtained for each of the 1,092 individuals, and they provide information on the number of RVCs an individual shares with other people and also on the length of the shared clusters. These datasets are available in the supplementary file “Data”, Supplementary Material online (folder OutputDataWindow20). By computational processing of these datasets, we created the complete table of shared RVCs for each of the possible 1,092 × 1,092 pairs (tables S1 and S2 in supplementary file “Data”, Supplementary Material online, folder OutputDataWindow20). Its heat-map representation is shown in figure 1.
Table 1. An Exemplified Segment of an Individual-Specific Database for the Individual HG01365 from the CLM Population

chr    Position   Identifier    ref  mut  Person-1      Person-2      Person-3      Person-4
CHR1   3438563    rs185156707   G    A    CLM_HG01365
CHR1   3503010    rs141463795   C    T    CLM_HG01365   TSI_NA20813
CHR1   3567024    rs184518958   A    G    CLM_HG01365
CHR1   3669552    rs186811888   C    T    FIN_HG00355   CLM_HG01365
CHR1   4022297    rs185199014   C    T    CLM_HG01365
CHR1   4393739    rs116739584   T    G    FIN_HG00190   CLM_HG01365   CEU_NA11829   MXL_NA19762
CHR1   4530544    rs187766509   G    A    CLM_HG01365   CEU_NA11829   CHB_NA18749   JPT_NA18987
CHR1   4722235    rs148197646   T    G    CLM_HG01365   CEU_NA11829
CHR1   4937903    rs185870613   G    C    CLM_HG01365   ASW_NA19922
CHR1   4978507    rs183610263   A    T    CLM_HG01365   ASW_NA19922
CHR1   5219590    rs191615351   C    T    GBR_HG00096   GBR_HG00106   GBR_HG00120   CLM_HG01365
CHR1   5343566    rs146986028   C    T    GBR_HG00106   CLM_HG01365
CHR1   5481720    rs190394368   G    A    CLM_HG01365   CEU_NA11931
CHR1   5551303    rs186874087   G    A    CLM_HG01365   IBS_HG01625   TSI_NA20797
CHR1   5553504    rs180676356   G    A    CLM_HG01365   IBS_HG01625   TSI_NA20797
CHR1   5559272    rs192278468   G    C    CLM_HG01365   IBS_HG01625   TSI_NA20797
CHR1   5560643    rs146515020   G    A    CLM_HG01365   IBS_HG01625   TSI_NA20529   TSI_NA20797
CHR1   5561084    rs187249140   G    A    CLM_HG01365   IBS_HG01625   TSI_NA20797
CHR1   5576119    rs142071781   T    C    CLM_HG01365   IBS_HG01625   TSI_NA20795   TSI_NA20813
CHR1   5710524    rs189964921   G    A    CLM_HG01365
CHR1   5713296    rs189528396   C    A    CLM_HG01365

In a window of nine consecutive rows (italicized entries) the person HG01365 shares five vrGVs with the individual TSI_NA20797 and also six vrGVs with the individual IBS_HG01625. We have named such chromosomal regions, with five or more shared neighboring vrGVs inside a scanning window, RVCs. The default size parameter for a scanning window is 20 consecutive rows.
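
The scanning rule described in this note can be sketched as follows. This is not the authors' RVC.pl: the function `find_rvcs`, its tuple-based input format, and the greedy way clusters are extended are assumptions made for illustration. Only the two stated parameters (at least five shared vrGVs within a window of 20 consecutive rows) come from the text, and the input rows are those of table 1.

```python
from collections import defaultdict

HOST = "CLM_HG01365"

# Rows of table 1: (position on chromosome 1, carriers of the minor allele).
ROWS = [
    (3438563, "CLM_HG01365"),
    (3503010, "CLM_HG01365 TSI_NA20813"),
    (3567024, "CLM_HG01365"),
    (3669552, "FIN_HG00355 CLM_HG01365"),
    (4022297, "CLM_HG01365"),
    (4393739, "FIN_HG00190 CLM_HG01365 CEU_NA11829 MXL_NA19762"),
    (4530544, "CLM_HG01365 CEU_NA11829 CHB_NA18749 JPT_NA18987"),
    (4722235, "CLM_HG01365 CEU_NA11829"),
    (4937903, "CLM_HG01365 ASW_NA19922"),
    (4978507, "CLM_HG01365 ASW_NA19922"),
    (5219590, "GBR_HG00096 GBR_HG00106 GBR_HG00120 CLM_HG01365"),
    (5343566, "GBR_HG00106 CLM_HG01365"),
    (5481720, "CLM_HG01365 CEU_NA11931"),
    (5551303, "CLM_HG01365 IBS_HG01625 TSI_NA20797"),
    (5553504, "CLM_HG01365 IBS_HG01625 TSI_NA20797"),
    (5559272, "CLM_HG01365 IBS_HG01625 TSI_NA20797"),
    (5560643, "CLM_HG01365 IBS_HG01625 TSI_NA20529 TSI_NA20797"),
    (5561084, "CLM_HG01365 IBS_HG01625 TSI_NA20797"),
    (5576119, "CLM_HG01365 IBS_HG01625 TSI_NA20795 TSI_NA20813"),
    (5710524, "CLM_HG01365"),
    (5713296, "CLM_HG01365"),
]


def find_rvcs(rows, host, window=20, min_shared=5):
    """Return {partner: [(first_pos, last_pos, n_shared), ...]}."""
    # Row indices at which each other individual carries the same minor allele.
    hits = defaultdict(list)
    for i, (_, carriers) in enumerate(rows):
        for person in carriers.split():
            if person != host:
                hits[person].append(i)

    clusters = {}
    for partner, idx in hits.items():
        found, start = [], 0
        while start + min_shared <= len(idx):
            end = start + min_shared - 1
            if idx[end] - idx[start] >= window:
                start += 1  # these min_shared rows do not fit in one window
                continue
            # Extend while the newest min_shared shared rows still fit in a window.
            while end + 1 < len(idx) and idx[end + 1] - idx[end + 2 - min_shared] < window:
                end += 1
            found.append((rows[idx[start]][0], rows[idx[end]][0], end - start + 1))
            start = end + 1
        if found:
            clusters[partner] = found
    return clusters


for partner, found in sorted(find_rvcs(ROWS, HOST).items()):
    print(partner, found)
```

On the table 1 excerpt this sketch reports one cluster of six variants shared with IBS_HG01625 and one cluster of five variants shared with TSI_NA20797; no other individual reaches five shared rows.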

Table 2. Intra-Population Sharing of RVCs

Population   Number of shared RVC/pair   Length RVC/pair (Mb)
CEU          1.56                        1.73
FIN          6.67                        2.77
GBR          2.21                        2.74
IBS          5.04                        1.69
TSI          2.64                        2.23
CHB          2.41                        1.26
CHS          3.75                        2.14
JPT          10.9                        1.53
ASW          11.7                        0.88
LWK          25.3                        0.89
YRI          19.1                        0.74
CLM          6.45                        3.97
MXL          2.85                        3.87
PUR          8.17                        4.44

Table 3. Inter-Population Sharing of RVC for 1,092 Individuals from 14 Populations

Pop 1   Pop 2   Number of RVC/pair   Median Length   Average Length
CEU     ASW     0.293                686.5           1161
CHB     ASW     0.016                241.5           610
CHB     CEU     0.015                405.5           760
CHS     ASW     0.021                308             630
CHS     CEU     0.005                244             458
CHS     CHB     2.440                830             1164
CLM     ASW     1.224                313             485
CLM     CEU     0.550                577.5           993
CLM     CHB     0.016                397             715
CLM     CHS     0.009                323             524
FIN     ASW     0.140                647             1038
FIN     CEU     0.770                760             1306
FIN     CHB     0.033                365             676
FIN     CHS     0.014                317             796
FIN     CLM     0.277                569.5           949
GBR     ASW     0.293                717.5           1210
GBR     CEU     1.404                897             1435
GBR     CHB     0.013                427             764
GBR     CHS     0.004                322             793
GBR     CLM     0.576                581             937
GBR     FIN     0.730                725             1249
IBS     ASW     0.305                416.5           699
IBS     CEU     0.858                647             1041
IBS     CHB     0.005                207             306
IBS     CHS     0.006                116             202
IBS     CLM     1.731                1018            1416
IBS     FIN     0.454                573.5           955
IBS     GBR     1.031                671             1098
JPT     ASW     0.008                190             465
JPT     CEU     0.005                232             402
JPT     CHB     1.355                641             1042
JPT     CHS     1.088                542             911
JPT     CLM     0.004                223             505
JPT     FIN     0.021                350.5           714
JPT     GBR     0.002                150             440
JPT     IBS     0.005                117             218
LWK     ASW     8.750                360             421
LWK     CEU     0.041                216             389
LWK     CHB     0.006                93              230
LWK     CHS     0.008                42.5            66
LWK     CLM     0.995                253             432
LWK     FIN     0.009                167.5           294
LWK     GBR     0.031                189             366
LWK     IBS     0.171                190             343
LWK     JPT     0.005                33              62
MXL     ASW     0.712                283             517
MXL     CEU     0.446                556.5           971
MXL     CHB     0.026                287             673
MXL     CHS     0.024                307             527
MXL     CLM     0.927                696             1050
MXL     FIN     0.210                528.5           897
MXL     GBR     0.440                616             1000
MXL     IBS     1.224                841             1257
MXL     JPT     0.017                301             501
MXL     LWK     0.544                212             439
PUR     ASW     1.676                324             520
PUR     CEU     0.605                578             985
PUR     CHB     0.014                265             528
PUR     CHS     0.010                151             469
PUR     CLM     1.163                669             1087
PUR     FIN     0.294                573             956
PUR     GBR     0.634                617.5           1035
PUR     IBS     1.421                810             1237
PUR     JPT     0.006                147.5           256
PUR     LWK     1.340                258.5           464
PUR     MXL     0.858                650             1086
TSI     ASW     0.200                416.5           792
TSI     CEU     0.846                604             1059
TSI     CHB     0.020                352.5           659
TSI     CHS     0.010                184.5           554
TSI     CLM     0.654                497.5           891
TSI     FIN     0.410                554             1029
TSI     GBR     0.816                593             1072
TSI     IBS     0.847                551.5           947
TSI     JPT     0.005                122             200
TSI     LWK     0.132                218             392
TSI     MXL     0.521                467             857
TSI     PUR     0.758                513             943
YRI     ASW     12.71                498             556
YRI     CEU     0.026                158             354
YRI     CHB     0.002                82              114
YRI     CHS     0.003                42              82
YRI     CLM     1.150                285             466
YRI     FIN     0.003                194.5           395
YRI     GBR     0.016                144             258
YRI     IBS     0.115                153             273
YRI     JPT     0.001                30              44
YRI     LWK     8.182                364             442
YRI     MXL     0.597                236.5           463
YRI     PUR     1.576                282             470
YRI     TSI     0.049                170.5           348

Median and Average sizes of IBD segments are shown in kb.

Table 4. Example of a segment of the output file CEU_NA12763_dat4

Individual host   Individuals with shared RVCs   Number of shared vrGV clusters   Total length of clusters (bp)   Total number of vrGVs
CEU_NA12763       CEU_NA12286                    4                                3,585,741                       39
CEU_NA12763       MXL_NA19779                    1                                304,700                         5
CEU_NA12763       ASW_NA20317                    1                                374,390                         11
CEU_NA12763       FIN_HG00382                    1                                497,473                         10
CEU_NA12763       GBR_HG00141                    2                                1,242,733                       15
CEU_NA12763       FIN_HG00280                    4                                5,300,628                       28
CEU_NA12763       CEU_NA12827                    3                                1,612,501                       16
CEU_NA12763       FIN_HG00361                    1                                744,208                         6
CEU_NA12763       CLM_HG01271                    1                                511,489                         9
CEU_NA12763       FIN_HG00173                    1                                1,376,231                       8
CEU_NA12763       GBR_HG00106                    1                                1,216,725                       5
CEU_NA12763       CLM_HG01134                    3                                991,604                         25
CEU_NA12763       FIN_HG00266                    1                                771,106                         9
CEU_NA12763       FIN_HG00344                    1                                676,386                         8
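
As an illustration of how per-individual output rows like those in table 4 could be folded into a pairwise table, the sketch below stores each pair under an order-independent key. The tuple layout and the helper name `pair_table` are assumptions made for this example and do not describe the authors' Perl programs.

```python
# A few rows copied from table 4: (host, partner, shared clusters, total length in bp).
rows = [
    ("CEU_NA12763", "CEU_NA12286", 4, 3_585_741),
    ("CEU_NA12763", "MXL_NA19779", 1, 304_700),
    ("CEU_NA12763", "FIN_HG00280", 4, 5_300_628),
]


def pair_table(rows):
    """Map each unordered pair of individuals to (cluster count, total bp)."""
    table = {}
    for host, partner, n_clusters, total_bp in rows:
        # frozenset makes (A, B) and (B, A) the same key, so lookups are symmetric.
        table[frozenset((host, partner))] = (n_clusters, total_bp)
    return table


table = pair_table(rows)
n, bp = table[frozenset(("FIN_HG00280", "CEU_NA12763"))]
print(n, bp, round(bp / n))  # clusters, total length, mean cluster length in bp
```

Keying on an unordered pair means a lookup succeeds regardless of which individual's output file the row came from.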
Fig. 1. Heat-map table (1,092 × 1,092) presenting the total length (A) or number (B) of shared vrGV clusters for every possible pair of 1,092 individuals. Populations are grouped by the continent they originated from and labeled by different colors according to the Olympic scheme. Five populations of European origin are labeled in blue (CEU, FIN, GBR, IBS, and TSI), three Asian populations in yellow (CHB, CHS, and JPT), three African populations in black (ASW, LWK, and YRI), and three American populations in red (CLM, MXL, and PUR). If a pair of individuals does not share an IBD segment, the corresponding square is shown in white. The squares corresponding to pairs that share IBD segment(s) are colored according to a rainbow scheme. Pairs whose total shared length does not exceed 900 kb are shown in violet, while pairs whose total shared length exceeds 10 Mb are shown in red.


Analysis of genetic relations based on the number and length of shared RVCs

Sharing of RVC Within the Same Population

The highest number of shared RVCs between two individuals is observed, unsurprisingly, when the two persons belong to the same population (fig. 1). The average numbers and lengths of shared RVCs within the 14 studied populations are shown in table 2. African and African-American individuals have the highest number of shared RVCs per pair within their populations, followed by the Japanese and the Puerto Ricans. In European groups, the highest cluster sharing is observed among the Finns (on average, 6.7 shared RVCs per pair), while the lowest (1.6 RVCs) is found in the Utah white population (CEU). Among Asian people, the average number of shared RVCs also varies broadly, from 10.9 for Japanese (JPT) to 2.4 for Chinese (CHB) (table 2). These results are consistent with the finding of Frazer et al. (2007) that an average pair of individuals from the same population shares ∼0.5% of their genomes through recent IBD.

Sharing of RVC by Individuals from Neighboring Populations

Continental inter-population RVC sharing correlates well with the geographic distances between the populations (table 3). In Europe, the lowest number of shared RVCs is observed between Finnish people (FIN) and the South European populations TSI (Toscani in Italia) and IBS (Iberian Population in Spain), which are geographically distant and historically have not intermixed intensively (on average, TSI–FIN and IBS–FIN pairs have ∼0.4 shared RVCs per pair). RVC sharing between all other groups of European origin is higher. In particular, the number of shared RVCs per pair of individuals belonging to any two European (non-FIN) populations ranges from 0.82 to 1.03. The highest RVC sharing (1.03) is observed between pairs of people from Utah (CEU) and Britain (GBR), who inhabit different geographic regions but originated from the same founder populations. In Asia, Chinese Han people from the South and from Beijing (CHS and CHB) share on average 2.4 RVCs per pair between themselves, and approximately half as many with the geographically remote Japanese people (1.1–1.4 RVCs). The African YRI (Yoruba in Ibadan, Nigeria) and LWK (Luhya in Webuye, Kenya) groups share on average 7.8 RVCs per pair. However, such high numbers of shared RVCs between African populations may be partially due to the fact that they have ∼3 and 4 times more vrGVs than Asian and European populations, respectively.

Sharing of RVC by Individuals from Different Continents

The lowest numbers of shared RVCs are detected for pairs of individuals inhabiting distant parts of the Old World. A majority of these inter-continental pairs of individuals do not share RVCs at all. Thus, these areas in figure 1 are predominantly white (see table 3 for details). The lowest RVC sharing is found in Asia–Africa pairs (0.0042 shared RVC per pair) and specifically for YRI–JPT populations (where only nine pairs among all 7,832 possible pairs have shared RVCs with median RVC length of merely 44 kb). The second lowest inter-continental RVC sharing is observed between people from Asia and Europe. Only 0.2–3.3% of these inter-continental pairs have shared RVCs. Interestingly, among these Asian–European pairs, the highest admixture is observed between both Chinese groups and two European populations, FIN and TSI (0.01–0.03 shared RVC per pair). Japanese people share <0.005 RVC per pair with all Europeans except the Finns (0.02). Such enrichment of Asian RVCs among the Finns is presumably due to the Finns belonging, unlike other Western Europeans, to the Finno–Ugric population of the Uralic family of Northern Eurasians (Lahermo et al. 1996; Lappalainen et al. 2006; Rootsi et al. 2007). The highest Old World intercontinental RVC admixture is observed between Africa and Europe. Our data are in accordance with the previously reported increasing gradient of admixture of African genes from Northern to Southern Europe (Adams et al. 2008; Moorjani et al. 2011; Cerezo et al. 2012; Botigue et al. 2013). We also found that all European groups share more RVCs with LWK than with YRI (see table 3 for details), thus supporting the hypothesis of gene exchange between Europe and Africa through Near Eastern rather than Trans-Saharan migration routes (Cavalli-Sforza et al. 1994; Richards et al. 2000; Currat and Excoffier 2005). The Italian (TSI) population has 2.7 times more shared RVCs with Kenyan (LWK) than with Nigerian (YRI) populations (0.132 vs. 0.049 RVCs per pair, respectively). Northern Europeans (CEU, FIN, and GBR) also share more RVCs with LWK than with YRI (table 3). This prevalence of LWK over YRI in shared IBD chromosomal segments in Southern European populations is statistically significant according to the Chi-squared test with P-value $<10^{-15}$.
RVC sharing among people inhabiting the New World perfectly reflects recent global human demographic events and migration routes. All three Caribbean populations (CLM, MXL, and PUR: Colombians, Mexicans from Los Angeles, and Puerto Ricans, respectively) share a considerable number of RVCs with African and European populations (on average ∼0.9 and 0.5 shared RVCs per pair and 254 and 620 kb of average median RVC size, respectively). However, these numbers vary considerably from population to population (e.g., compare MXL–GBR vs. PUR–GBR, or PUR–YRI vs. MXL–YRI in table 3). The American South-West Black population (ASW) represents another good example of recent admixture. Figure 1 demonstrates the extensive presence of RVCs from European and American populations in ASW genomes. However, the admixture of African and American populations is still nonhomogeneous, and there are multiple stripes in the corresponding segments of the heat-map in figure 1. Interestingly, one person from ASW (NA20314) shares 10 times fewer RVCs with LWK and YRI than any other ASW representative (for details compare the ASW_NA20314_dat3 and ASW_NA20314_dat4 files with other “dat3” and “dat4” files from the ASW population available from the supplementary file “Data”, Supplementary Material online, folder OutputDataWindow20). This person also shares the minimal number of RVCs among all possible pairs within the ASW population. NA20314 appears as a thin white line across African populations in figure 1. Possibly, some errors occurred in the population identification of this individual.
RVC sharing between Caribbean and European populations revealed at least two-fold less admixture of the Caribbean populations with the British, Italians, and Finns than with the Spanish, reflecting the extensive Spanish colonial exploration of the region. Due to well-known social restrictions, the African–American genomes share only 0.3 RVC per pair on average with the Spanish, the British, and Utah whites. Genomes of all Caribbean groups are enriched with African RVCs (1.0–1.5 RVC per pair in CLM and PUR and 0.5 in MXL). All three Caribbean populations share a considerable number of RVCs with Africans, and the contributions of YRI and LWK are about equal. These data are consistent with the database of slave-trading voyages (www.slavevoyages.org) and also the Atlas of the Transatlantic Slave Trade (Eltis and Richardson 2010). According to this atlas, three million people were taken from the Bight of Benin, the native land of the YRI population. In addition, about five million people were taken from West Central Africa, whose people belong to the Bantu linguistic/ethnic group, the same as the LWK population (Gomez et al. 2014). However, a considerable portion of slaves from West Central Africa were brought to Brazil. Finally, African–Americans from the South–West (ASW) share more RVCs with Western Africans (YRI) than with Eastern Africans (LWK). The impact of gene flow from Asia to the American continent is the least pronounced (0.04–0.26 shared RVCs per Asian–American pair).

Modeling the number and size of IBD segment inheritance in generations

Computer simulations in population genetics have several advantages over mathematical modeling (Qiu et al. 2014). We created a program, IBDsimulator.pl, that models the inheritance of IBD autosomal segments from an initial person (primogenitor) along a chain of his/her descendants in multiple successive generations. The program uses the real distribution of meiotic recombination sites along the human genome from the HapMap table of genetic versus physical distances in human chromosomes (Frazer et al. 2007). In order to obtain reliable statistics, this program was run independently 500,000 times. The distributions of the average numbers and sizes of inherited computer-simulated IBD segments from a primogenitor in successive generations are shown in figure 2A and B, respectively, while the data from the program are available in the supplementary file “Data”, Supplementary Material online, folder Modeling.
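
A heavily simplified sketch of this kind of simulation is given below. It is not IBDsimulator.pl: it assumes a uniform crossover rate of 0.0118 per Mb instead of the HapMap recombination map, approximate rounded autosome lengths, no crossover interference, and mates who carry no primogenitor material. All function names are hypothetical.

```python
import random

R_PER_MB = 0.0118  # assumption: uniform crossover rate per Mb per meiosis

# Assumption: approximate autosome lengths in Mb (rounded), chromosomes 1 to 22.
AUTOSOME_MB = [249, 243, 198, 191, 181, 171, 159, 146, 141, 136, 135,
               134, 115, 107, 103, 90, 81, 78, 59, 63, 48, 51]


def crossover_points(length_mb, rng):
    """Crossover positions from a Poisson process along one chromosome."""
    points, pos = [], rng.expovariate(R_PER_MB)
    while pos < length_mb:
        points.append(pos)
        pos += rng.expovariate(R_PER_MB)
    return points


def transmit(segments, length_mb, rng):
    """Pass primogenitor segments on the carrier homolog through one meiosis."""
    bounds = [0.0] + crossover_points(length_mb, rng) + [length_mb]
    keep = rng.random() < 0.5  # does the gamete start on the carrier homolog?
    inherited = []
    for left, right in zip(bounds, bounds[1:]):
        if keep:  # this tract is copied from the carrier homolog
            for a, b in segments:
                lo, hi = max(a, left), min(b, right)
                if lo < hi:
                    inherited.append((lo, hi))
        keep = not keep  # each crossover switches to the other homolog
    return inherited


def simulate(generations, rng):
    """Number of primogenitor IBD segments in each generation G1..Gn."""
    genome = [[(0.0, float(length))] for length in AUTOSOME_MB]  # G1: 22 autosomes
    counts = [sum(len(chrom) for chrom in genome)]
    for _ in range(generations - 1):
        genome = [transmit(segs, length, rng)
                  for segs, length in zip(genome, AUTOSOME_MB)]
        counts.append(sum(len(chrom) for chrom in genome))
    return counts


rng = random.Random(1)
runs = [simulate(10, rng) for _ in range(2000)]
mean_counts = [sum(run[g] for run in runs) / len(runs) for g in range(10)]
print([round(m, 2) for m in mean_counts])
```

Under these assumptions the mean count starts at 22 in G1 and rises to roughly 28 in G2, because each crossover adds on average half a segment while half of the material is lost, and then decays. A uniform map cannot reproduce the effect of recombination hotspots, so the output should not be expected to match the published curves in detail.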
Fig. 2. Distribution of length and number of IBD autosomal segments inherited from a model primogenitor in consecutive generations, calculated by the computer simulation program IBDsimulator.pl. (A) Average number of primogenitor’s IBD segments per descendant. The first generation contains one copy of the primogenitor’s 22 autosomes (22 IBD segments). (B) Average size of primogenitor’s IBD segments per descendant obtained by IBDsimulator.pl (red curve). The blue curve (open circles) shows the average size of primogenitor’s IBD segments calculated from equation (1) with recombination rate $r = 0.0118$ Mb$^{-1}$.

An offspring (generation G1) of a primogenitor inherits 22 IBD segments (22 autosomes) from the parent (fig. 2). In the next generation (G2), the average number of IBD segments inherited from this primogenitor reaches its maximum value (28.5 IBD segments per grandchild). In the following generations, the average number of IBD chromosomal segments inherited from the primogenitor decreases monotonically. The tenth generation (the G10 progeny of the primogenitor) retains, most often, one or zero IBD segments (on average, 0.37 IBD segments per G10 descendant). At the 20th generation only 7 out of 10,000 G20 descendants inherit an IBD segment from their particular primogenitor, while the remaining 99.93% of G20 descendants do not possess any genetic material from this particular primogenitor. The length of IBD segments shortens dramatically during the first few generations. Then, the decrease in IBD segment length slows down (fig. 2). However, the distribution of the sizes of IBD segments in these generations is very wide (fig. 3). This phenomenon is due to the very uneven distribution of meiotic recombination rates along human chromosomes (Arnheim et al. 2003). On many occasions, an IBD segment in a G20 descendant might be longer than an IBD segment in a G5 descendant (fig. 3).
Therefore, the length of a particular IBD segment does not allow accurate determination of the number of generations separating a descendant from the ancestor. For this reason, many papers use genetic distances (measured in centimorgans, cM) rather than physical IBD length in nucleotides (e.g., Browning and Browning 2013). In this article, we use physical lengths of IBD segments because measurement of genetic distances along human chromosomes is still not very accurate and is based on the HapMap tables (Frazer et al. 2007), which are not continuous and have many gaps.
Fig. 3. Distribution of model primogenitor’s IBD segments by their lengths at the 5th, 10th, 15th, and 20th generations. The number of IBD segments within particular ranges of lengths was calculated for 0.4 Mb bins. The last bin represents the number of segments with length >10 Mb.

The diminution of average IBD segment size in generations is described by the formula (Browning and Browning 2012):

$L = 1/(N \times r)$ (1)

where $L$ is the average IBD segment length, $N$ is the consecutive generation number (or equivalently the number of meioses) and $r$ is the recombination rate ($r = 0.0118$ Mb$^{-1}$, one event per 85 Mb). To get the most accurate estimate of the time of the last common ancestor (in generations) from the length of a shared IBD segment, one should use the local recombination rate in place of the genome-average $r$ in equation (1). The computer-simulated curve for IBD segment length in figure 2 (red line) is almost the same as the theoretical one based on equation (1) (blue line). Equation (1) allows us to estimate the time of population admixture/separation below.

Along a genealogical lineage, first degree relatives (e.g., parent–offspring) share on average 50% of genetic material, second degree relatives (grandparent–grandchild) 25%, third degree relatives (great grandparent–great grandchild) 12.5%, and so on according to the formula $100\% \times 2^{-n}$, where $n$ is the degree of relationship in generations. However, due to a limited number of meiotic recombination events per gamete (on average, 36), which are distributed very unevenly along the genome, the inheritance of the primogenitor’s chromosomal material in generations may be very uneven (Consortium 2003). Our computational modeling with real distributions of human meiotic recombination sites based on the HapMap dataset (Frazer et al. 2007) generated statistics of such unevenness of autosomal material inheritance (fig. 4). For example, a grandchild does not always get exactly 25% of genetic material from a grandparent. This amount frequently varies between 20% and 33%. An explanation of figure 4 is provided in the “Discussion” section and figure 5.
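The two relations in this passage can be evaluated directly. The sketch below is illustrative only: it reads equation (1) as L = 1/(N × r), which is the form consistent with the worked example later in the Discussion (1/(L × r) gives about 118 generations for L = 0.72 Mb), and it uses the genome-average rate quoted in the text. The function names are hypothetical.

```python
# Genome-average recombination rate quoted in the text (events per Mb per meiosis).
RECOMB_RATE_PER_MB = 0.0118


def mean_ibd_length_mb(generation, r=RECOMB_RATE_PER_MB):
    """Average IBD segment length (Mb) after `generation` meioses,
    assuming equation (1) has the form L = 1 / (N * r)."""
    if generation <= 0:
        raise ValueError("generation must be a positive number")
    return 1.0 / (generation * r)


def expected_share_percent(degree):
    """Average percentage of genetic material shared along a direct lineage,
    100% * 2**(-n), where n is the degree of relationship in generations."""
    return 100.0 * 2.0 ** (-degree)


if __name__ == "__main__":
    for n in (5, 10, 15, 20):
        print(f"generation {n:2d}: mean IBD length ~ {mean_ibd_length_mb(n):.2f} Mb")
    for n in (1, 2, 3):
        print(f"degree {n}: expected share = {expected_share_percent(n)}%")
```

These are averages only; the simulations described in the text show that individual outcomes scatter widely around them.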

Distribution of proportion of model Primogenitor’s genetic materials inherited by descendants in six successive generations. The figure presents 100,000 computational simulation experiments performed with the IBDsimulator.pl program. The x-axis presents the percentage of the total autosomal length of Primogenitor in the descendants. The y-axis shows the number of occurrences of different proportions of Primogenitors’ genetic material in different generations out of 100,000 experiments.

Randomness and unevenness of the inheritance pattern of the primogenitor’s chromosomal material in generations. The chromosomal material of the primogenitor (parent I) is shown in red. The chromosomal material of the mating partner of the primogenitor (parent II) is shown in white. First generation offspring inherit one copy of the primogenitor’s chromosome. Since meiotic recombination events (black horizontal lines) are few and random, different gametes from the progeny could obtain different amounts of the primogenitor’s IBD segments. Gamete “gam A1”, which creates a second generation offspring “Ind A2”, has only ∼10% of the primogenitor’s (red) chromosome and 90% of the second parent’s (white) chromosome, while another gamete, B1, transfers ∼90% of the primogenitor’s (red) chromosome in three IBD segments and only 10% of the other parent’s (white) chromosome to the other second generation offspring “Ind B2”. The yellow chromosome in both individuals A2 and B2 is contributed by their second parent (i.e., the mating partner of the first generation offspring). Even the third generation could easily lose all of the primogenitor’s chromosomal material (red) via gamete A2, or inherit a majority of the primogenitor’s chromosomal material (red) in several IBD segments via gamete B2.

Discussion

Genealogical and Genetic Relatedness

From a genealogical viewpoint, every human being has two biological parents, four grandparents, and so on in geometrical progression. At the $n$th generation a person has $2^n$ direct genealogical ancestors. When $n = 20$ the number of ancestors becomes 1,048,576, while when $n = 40$ it becomes 1,099,511,627,776. Therefore, at generation ∼20–30 a majority of people from the same geographical region are distant genealogical relatives of each other. Rohde et al. (2004) examined human genealogical relations and estimated that the last common genealogical ancestor for all modern humans (presumably from the same continent) lived ∼76 generations ago (∼2,000 years ago).

How genetic material is transmitted through generations along a genealogical tree determines the genetic relatedness. The transmission occurs via gametes, which are created from pieces of maternal and paternal chromosomes via meiotic recombination. On average, the 22 human autosomes have 34.5 recombination events per gamete, and these recombination sites are distributed very unevenly along chromosomes. The inheritance of genetic material is random and may be uneven (fig. 5). Due to immense variations in recombination rates along the genome, the spread of IBD segment sizes is very wide (fig. 3). During transmission from generation to generation, IBD segments become smaller and smaller (fig. 2). After the tenth generation, a majority of direct genealogical descendants have lost all genetic material from their particular G10-primogenitor. However, since human populations have limited sizes, individuals often share multiple short IBD segments from their common distant ancestors. The patterns (numbers and lengths) of shared IBD segments vary significantly from population to population depending on population size, mating traditions, migration, admixture, and other parameters.
Knowledge of inheritance of IBD genomic segments is important for medicine, specifically in case–control association and cohort studies, since unknown distant familial relationships could potentially compromise interpretation of collected data.
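The unevenness of inheritance described in this section can be illustrated with a toy simulation. This is a minimal sketch and not the IBDsimulator.pl program: it assumes crossovers fall uniformly along each chromosome (the text stresses that real recombination sites are very unevenly distributed), uses 22 hypothetical autosomes of equal length, and borrows only the genome-average rate of 0.0118 Mb⁻¹ from the text. All names and the chromosome length are assumptions made for illustration.

```python
import random

R_PER_MB = 0.0118        # genome-average crossover rate quoted in the text
N_AUTOSOMES = 22
CHROM_LEN_MB = 130.0     # hypothetical equal chromosome length (assumption)


def crossover_positions(length_mb, rate, rng):
    """Crossover sites on one chromosome, drawn as a uniform Poisson process."""
    sites = []
    pos = rng.expovariate(rate)
    while pos < length_mb:
        sites.append(pos)
        pos += rng.expovariate(rate)
    return sites


def primogenitor_mb_in_gamete(length_mb, rate, rng):
    """Mb of the primogenitor's haplotype passed on in one gamete of the
    first-generation offspring (who carries one primogenitor haplotype)."""
    carrying = rng.random() < 0.5   # which parental haplotype the gamete starts on
    total, prev = 0.0, 0.0
    for site in crossover_positions(length_mb, rate, rng) + [length_mb]:
        if carrying:
            total += site - prev
        carrying = not carrying     # each crossover switches haplotype
        prev = site
    return total


def grandchild_share(rng):
    """Fraction of a grandchild's diploid autosomal genome derived from
    one grandparent (the primogenitor)."""
    inherited = sum(
        primogenitor_mb_in_gamete(CHROM_LEN_MB, R_PER_MB, rng)
        for _ in range(N_AUTOSOMES)
    )
    return inherited / (2 * N_AUTOSOMES * CHROM_LEN_MB)


def simulate(trials=2000, seed=1):
    rng = random.Random(seed)
    return [grandchild_share(rng) for _ in range(trials)]


if __name__ == "__main__":
    shares = simulate()
    print(f"mean share: {sum(shares) / len(shares):.3f}")
    print(f"range: {min(shares):.3f} to {max(shares):.3f}")
```

Even with uniform crossovers the share scatters around the 25% average; a realistic recombination map would be needed to compare the spread with figure 4.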

IBD Segments Identification with Modern Approaches

In Al-Khudhair et al. (2015), our team discussed various methods used to detect IBD familial relationships with up to the tenth degree of relatedness. In a nutshell, even for close relatives, modern algorithms have a very high level of errors in IBD segment identification (Huff et al. 2011; Durand et al. 2014; Li et al. 2014). Recent papers extrapolated statistical analyses of SNP distributions to much older events for intercontinental population admixture, and even for the relationship between modern humans and other, now extinct, archaic hominid groups (Reich et al. 2010; Meyer et al. 2012; Castellano et al. 2014; Lazaridis et al. 2014). These sophisticated statistical methods have been recently reviewed by Racimo et al. (2015). They include Patterson’s D statistic (Green et al. 2010; Durand et al. 2011; Patterson et al. 2012); analysis of incomplete lineage sorting from introgressed haplotypes seen by increased long-range linkage disequilibrium (LD) using the S* statistic (Wall et al. 2009; Vernot and Akey 2014); a probabilistic hidden Markov model (Prufer et al. 2014; Seguin-Orlando et al. 2014); and a conditional random field model (Sankararaman et al. 2014). Yet, the accuracy and the reliability of these methods cannot be directly verified. Importantly, anthropologists have raised reasonable doubts about modern statistical methods for evaluating population admixtures and evolution, showing that statistical conclusions “go so much against the well-known evolutionary realities…” (Weiss and Lambert 2014). A major problem in statistical approaches for revealing genetic relatedness lies in the pipeline of approximations, which may amplify errors in a vicious cycle. For example, to calculate LD, the phasing of genomic sequences is required, which is prone to errors. Calculated LD values, with already-embedded errors, are frequently used for nucleotide imputations of de novo sequenced genomes, as well as for their phasing.
This cycling may result in progressive multiplication of the initial errors. In addition, nucleotide sequence imputations often do not consider many important biological processes (e.g., biased gene conversion) that are often involved in haplotype changes and alteration of LD values.

A direct comparison between our approach and the computer predictions of shared IBD segments by two popular algorithms (GERMLINE and PLINK) is possible using table 5 from Gusev et al. (2009). That table presents the IBD data for pairs from 45 unrelated individuals from Japan (JPT) and also Chinese people from Beijing (CHB). Our table 2, in turn, contains the distribution of IBD segments predicted by RVCs within 89 Japanese (JPT) and 97 Han people from Beijing (CHB). For the Japanese population, the detected mean IBD segment length is 1.53 Mb for our RVC algorithm, 1.8 Mb for GERMLINE, and 4.8 Mb for PLINK. For the CHB population the mean IBD segment length is 1.26 Mb (RVC), 2.1 Mb (GERMLINE), and 4.8 Mb (PLINK). Thus, we detected on average shorter IBD segments. Table 2 provides the mean number of IBD segments per pair (2.41 for CHB and 10.9 for JPT) and the mean length of shared IBD segment (1.26 Mb for CHB and 1.53 Mb for JPT). According to these records, the expected total number of shared IBD segments for a group of 45 people would be 4,772 (CHB) and 21,582 (JPT), while the total length of all shared IBD segments for 45 people would be 6,012 Mb (CHB) and 33,020 Mb (JPT). These numbers are many times higher than the corresponding numbers for JPT and CHB populations in table 5 of Gusev et al. (2009). Hence, our RVC approach detects several times more shared IBD segments than GERMLINE and PLINK. In addition, GERMLINE and PLINK are used for predicting relatively long shared IBD segments (>1 Mb) that originated over the last 10 generations (e.g., on the order of second to ninth cousins) (Gusev et al. 2009; Henn et al. 2012; Zhuang et al. 2012).
Hence, the advantage of our RVC approach is its ability to detect short IBD segments (down to 30 kb) inherited from common ancestors who lived as far back as ∼378 generations ago (or ∼9,500 years) (see the section “Estimations of Time for Common Ancestors from Shared RVC” below). In the supplementary file “Data”, Supplementary Material online (folder OutputDataWindow20), we provide exhaustive details about the distribution of RVCs for every possible pair of the 1,092 sequenced genomes, so our results can be compared with those of any competing program.

There is a clear difference between the STRUCTURE software (Falush et al. 2003) and our RVC algorithm. The main use of STRUCTURE is for assigning individuals to populations, inferring the presence of distinct populations, and identifying migrants and admixed individuals. For these purposes, STRUCTURE relies heavily on much more frequent SNPs or other genetic markers. In contrast, our approach is aimed at revealing the most distant cryptic genetic relatedness among pairs of individuals.
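The expected totals quoted above for a group of 45 people follow from the per-pair means in table 2. The sketch below reproduces that arithmetic. One point is an inference rather than something stated in the text: the reported figures are matched when each person is paired with each of the other 44 (45 × 44 = 1,980 pairings). Counting each unordered pair once (990 pairs) would halve both totals. The function name is hypothetical.

```python
def expected_totals(n_people, mean_segments_per_pair, mean_length_mb):
    """Expected number and total length (Mb) of shared IBD segments in a group.

    Uses n * (n - 1) pairings, the convention that reproduces the figures
    quoted in the text (an inference, not stated by the authors).
    """
    pairings = n_people * (n_people - 1)
    total_segments = pairings * mean_segments_per_pair
    total_length_mb = total_segments * mean_length_mb
    return total_segments, total_length_mb


if __name__ == "__main__":
    # Per-pair means taken from the text (table 2 of the paper).
    for label, mean_n, mean_len in (("CHB", 2.41, 1.26), ("JPT", 10.9, 1.53)):
        segments, length = expected_totals(45, mean_n, mean_len)
        print(f"{label}: ~{round(segments):,} segments, ~{round(length):,} Mb")
```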

Population Analysis Using RVCs

Contrary to the probabilistic approaches, our method is rather deterministic because we consider a group of very rare events which, practically speaking, cannot happen together only by chance. Indeed, our threshold probability for sharing of clusters of vrGVs between individuals is $0.5 \times 10^{-9}$ for the default search parameters (five shared vrGVs in a consecutive window of 20 individual-specific vrGVs; see supplementary file “M&M”, Supplementary Material online). Our approach allowed detection of genetic relatedness among people from remote geographic regions. It is in good agreement with the known human population history. Moreover, it helped to clarify some debated issues.

For example, our data (fig. 1) clearly demonstrate that the Finns, who migrated from Northern Eurasia several thousand years ago, admixed extensively with the European populations and now share the majority of their RVCs with the Europeans. According to the analysis of the mtDNA haplogroups and several autosomal markers, the Finns are indistinguishable from other Europeans (Lahermo et al. 1996). On the contrary, the Y-chromosome investigations show a high prevalence (>50%) of North Eurasian-specific N3 haplogroups among the Finns, which are also present in China and Japan (Lappalainen et al. 2006; Rootsi et al. 2007). Thus, the elevated number of short RVCs shared between Finns and both the Chinese and Japanese, compared with other Europeans, supports the Y-chromosome data on the ancient origin of Finns from Asia.

Another example is the distribution of the African RVCs among Europeans. Higher levels of African admixture in Southern (especially South-Western) Europeans compared with Northern Europeans have been identified by analysis of Y-chromosome and mtDNA haplogroups as well as by autosomal SNP distribution and IBD sharing (Adams et al. 2008; Moorjani et al. 2011; Cerezo et al. 2012; Botigue et al. 2013). African (sub-Saharan) ancestry was estimated to be slightly below 3% in Iberia and ∼1% in Northern Italy (Moorjani et al. 2011), or <1% for Iberia and TSI (Botigue et al. 2013). However, those authors did not find a difference between YRI and LWK in IBD segment sharing with European populations. Hence, the source and the routes of the delivery of African genomes to the Europeans have been debated. Our data demonstrate a significantly higher number of Kenyan (LWK) RVCs than Western African (YRI) RVCs in all European populations, thus supporting the hypothesis of the Near Eastern rather than the trans-Saharan route of gene exchange between the Africans and the Europeans.
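The default search criterion mentioned at the start of this section (five shared vrGVs within a window of 20 consecutive individual-specific vrGVs) can be sketched as a sliding-window scan. This is an illustrative simplification and not the authors' program: variants are reduced to integer positions on a single chromosome, overlapping hits are merged, and each cluster is reported by its leftmost and rightmost shared positions. The function name and data layout are hypothetical.

```python
def find_shared_clusters(vrgvs_a, vrgvs_b, window=20, min_shared=5):
    """Return (left, right) position pairs of shared rare-variant clusters.

    vrgvs_a, vrgvs_b: iterables of vrGV positions on one chromosome for
    individuals A and B (a simplification: alleles are ignored here).
    A window is `window` consecutive vrGVs of individual A; it is a hit
    when at least `min_shared` of them are also present in individual B.
    """
    positions = sorted(set(vrgvs_a))
    in_b = set(vrgvs_b)
    clusters = []
    last_start = max(1, len(positions) - window + 1)
    for start in range(last_start):
        window_positions = positions[start:start + window]
        shared = [p for p in window_positions if p in in_b]
        if len(shared) < min_shared:
            continue
        left, right = shared[0], shared[-1]
        if clusters and left <= clusters[-1][1]:
            # Overlaps the previous hit: extend that cluster instead.
            clusters[-1] = (clusters[-1][0], max(clusters[-1][1], right))
        else:
            clusters.append((left, right))
    return clusters


if __name__ == "__main__":
    person_a = range(0, 3000, 100)                 # 30 hypothetical vrGV positions
    person_b = {500, 600, 700, 800, 900, 5000}     # five of them shared with A
    print(find_shared_clusters(person_a, person_b))
```

Reporting only the span between the outermost shared variants mirrors the remark in the next section that the RVC approach characterizes the inner part of an IBD segment rather than the whole segment.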

Estimations of Time for Common Ancestors from Shared RVC

There are two obstacles to estimating the time of the last common ancestors for people sharing RVCs. First, we detect only the intersections of IBD segments between pairs of individuals. Since the intersections occur randomly, the whole IBD segments may be considerably larger than their intersections. According to figure 3, there is great variation in IBD segment sizes, which may differ by dozens of times within the same generation. The size of the intersection of a long and a short fragment never exceeds that of the shorter one. Second, the RVC approach characterizes not the whole IBD segment but only its inner part, bordered by the extreme leftmost and rightmost vrGV positions. Even though a human being on average bears ∼30,000 vrGVs, there are still some areas with no rare variant differences between individuals.

Therefore, it is necessary to make an adjustment (calibration) for the estimation of time for the last common ancestors for the pairs with shared RVCs. It can be achieved using the well-known admixture of Spanish–American populations starting in 1492 (about 21 generations ago), for which the median size of shared intersected RVCs detected by our approach is 890 kb (table 3). However, according to figure 2, the median size of entire IBD segments after the 21st generation should be 3.55 Mb, about 4 times larger than the detected value. This shows that although the expected IBD segment length after the 21st generation is around 3.55 Mb in two individuals, the average length ($l$) of the RVC intersection is only 890 kb. Based on these Spanish–American data, we calibrated the RVC length as $L = (3.55/0.89) \times l$, which can be placed in equation (1) to calculate the time of common ancestors between populations as follows: $N = 1/(L \times r)$, where $N$ is the number of generations since the common ancestor. Thus, we can calculate the time when the common ancestors of the European–African pairs (median size of shared RVCs is 180 kb) lived using $N_{Af–Eu} = 1/(L \times r)$, where $L = 4 \times 180$ kb and $r = 0.0118$ Mb$^{-1}$. This gives $N_{Af–Eu} = 118$ generations ago, or ∼2,950 years ago. This estimate is congruent with previous estimations (Moorjani et al. 2011). Using the same approach, we estimated that the last common ancestors for the shortest shared RVCs, which are observed for Asian–African pairs (median 54 kb), probably lived ∼378 generations ago or ∼9,500 years ago.
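The calibrated estimate for the European–African pairs can be checked numerically. The sketch below applies the rounded calibration factor of 4 used in the worked example (the ratio 3.55/0.89 is about 3.99) and the genome-average recombination rate from the text. The generation time of 25 years is an assumption inferred from the reported figures (118 generations, ∼2,950 years) and is not stated explicitly; the function name is hypothetical.

```python
RECOMB_RATE_PER_MB = 0.0118      # genome-average rate quoted in the text
CALIBRATION_FACTOR = 4.0         # rounded 3.55 Mb / 0.89 Mb, as in the worked example
YEARS_PER_GENERATION = 25        # assumption inferred from 118 generations ~ 2,950 years


def generations_since_common_ancestor(median_rvc_kb,
                                      r=RECOMB_RATE_PER_MB,
                                      calibration=CALIBRATION_FACTOR):
    """Generations since the common ancestor, 1 / (L * r), where
    L = calibration * (median RVC length in Mb)."""
    calibrated_length_mb = calibration * median_rvc_kb / 1000.0
    return 1.0 / (calibrated_length_mb * r)


if __name__ == "__main__":
    generations = generations_since_common_ancestor(180)   # European-African pairs
    years = generations * YEARS_PER_GENERATION
    print(f"~{generations:.0f} generations, ~{years:.0f} years")
```

Substituting a local recombination rate for the genome-average value, as the Results suggest, would change the estimate for any particular segment.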

Future Directions

For broad public usage, a precise definition and cataloging of vrGVs are required. Creation of a public database of human vrGVs is among our immediate plans. With a massive flood of genome sequencing in the next few years, hundreds of millions of novel vrGVs will become available. Hence, the size of the vrGV database should be enormous. (Theoretically, seven billion people on the planet may have up to ten billion SNPs.) Therefore, it would be sensible to generate a database of frequent genetic variants, which are NOT-vrGVs. Any genetic variant that is absent from the NOT-vrGVs database should be considered a very rare one. According to our preliminary data, the size of the NOT-vrGV database is only 22 million genetic variants based on the phase 1 dataset of the “1000 Genomes Project”. This number should not increase much with further sequenced genomes because adding new people will not generate novel frequent genetic variants. When vrGVs are considered across multiple populations, a variant may have a total frequency of <0.2%, yet the local frequency of the same variant in a particular population might be considerably higher (e.g., 5%). We would rather exclude such variants from the vrGV count if their frequency in a particular population is above a certain threshold (e.g., 1%). Human populations vary significantly in the number of vrGVs per person. However, these variations should not noticeably influence the detection of cryptic relatedness, since rare variants are spread over vast genomic regions and the probability of sharing five or more vrGVs within a particular locus depends only on their frequencies and the window size for registration of RVCs, according to equation (2) from the “Materials and Methods” section (which is merely $0.5 \times 10^{-9}$ for our default parameters). Due to its simplicity and computational speed, our method may be used for large cohort studies and GWAS where thousands of sequenced genomes will be available.
Proper identification of genetic relationships is essential for forensic identification, in criminal investigations, inheritance claims, and in other areas of human life.
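The proposed NOT-vrGV lookup can be sketched as a simple membership test combined with the per-population frequency filter described above. This is only an illustration of the idea: the variant key format, the function name, and the in-memory set are hypothetical, and the 1% local threshold is the example value given in the text.

```python
def is_very_rare(variant, not_vrgv, population_freqs=None, local_threshold=0.01):
    """Classify a variant as a vrGV.

    variant: hashable key, e.g. (chromosome, position, ref, alt) (hypothetical layout).
    not_vrgv: set of frequent variants (the proposed NOT-vrGV database).
    population_freqs: optional mapping of population code -> local allele frequency.
    A variant is very rare when it is absent from `not_vrgv` and no single
    population carries it above `local_threshold`.
    """
    if variant in not_vrgv:
        return False
    if population_freqs:
        return all(freq <= local_threshold for freq in population_freqs.values())
    return True


if __name__ == "__main__":
    frequent = {("1", 1000, "A", "G")}             # toy NOT-vrGV database
    print(is_very_rare(("1", 1000, "A", "G"), frequent))
    print(is_very_rare(("1", 2000, "C", "T"), frequent))
    print(is_very_rare(("1", 3000, "G", "A"), frequent, {"FIN": 0.05, "YRI": 0.0}))
```

A database of this kind stays small because, as the text argues, newly sequenced genomes add rare variants rather than frequent ones.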

Conclusion

Inheritance of genetic materials creates an intricate fractal mosaic of IBD chromosomal segments in the genome. Close familial relationships are presented by shared long IBD segments that in turn are mosaics of shorter IBD segments from previous generations. Further, each IBD segment is built from smaller pieces inherited from distant ancestors. Identification of shared vrGV clusters presents a powerful tool for characterization of long and short IBD segments and for evaluation of population stratification. Proper recognition of genetic relationships is essential for individualized medicine, forensic identification, criminal investigations, inheritance claims, and in other areas of human life.
  53 in total

Review 1.  Identity by descent: variation in meiosis, across genomes, and in populations.

Authors:  Elizabeth A Thompson
Journal:  Genetics       Date:  2013-06       Impact factor: 4.562

2.  Detecting identity by descent and estimating genotype error rates in sequence data.

Authors:  Brian L Browning; Sharon R Browning
Journal:  Am J Hum Genet       Date:  2013-10-24       Impact factor: 11.025

3.  Ancient admixture in human history.

Authors:  Nick Patterson; Priya Moorjani; Yontao Luo; Swapan Mallick; Nadin Rohland; Yiping Zhan; Teri Genschoreck; Teresa Webster; David Reich
Journal:  Genetics       Date:  2012-09-07       Impact factor: 4.562

Review 4.  Identity by descent between distant relatives: detection and applications.

Authors:  Sharon R Browning; Brian L Browning
Journal:  Annu Rev Genet       Date:  2012-09-17       Impact factor: 16.830

5.  The variance of identity-by-descent sharing in the Wright-Fisher model.

Authors:  Shai Carmi; Pier Francesco Palamara; Vladimir Vacic; Todd Lencz; Ariel Darvasi; Itsik Pe'er
Journal:  Genetics       Date:  2012-12-24       Impact factor: 4.562

6.  What type of person are you? Old-fashioned thinking even in modern science.

Authors:  Kenneth M Weiss; Brian W Lambert
Journal:  Cold Spring Harb Perspect Biol       Date:  2014-01-01       Impact factor: 10.005

7.  Gene flow from North Africa contributes to differential human genetic diversity in southern Europe.

Authors:  Laura R Botigué; Brenna M Henn; Simon Gravel; Brian K Maples; Christopher R Gignoux; Erik Corona; Gil Atzmon; Edward Burns; Harry Ostrer; Carlos Flores; Jaume Bertranpetit; David Comas; Carlos D Bustamante
Journal:  Proc Natl Acad Sci U S A       Date:  2013-06-03       Impact factor: 11.205

8.  The complete genome sequence of a Neanderthal from the Altai Mountains.

Authors:  Kay Prüfer; Fernando Racimo; Nick Patterson; Flora Jay; Sriram Sankararaman; Susanna Sawyer; Anja Heinze; Gabriel Renaud; Peter H Sudmant; Cesare de Filippo; Heng Li; Swapan Mallick; Michael Dannemann; Qiaomei Fu; Martin Kircher; Martin Kuhlwilm; Michael Lachmann; Matthias Meyer; Matthias Ongyerth; Michael Siebauer; Christoph Theunert; Arti Tandon; Priya Moorjani; Joseph Pickrell; James C Mullikin; Samuel H Vohr; Richard E Green; Ines Hellmann; Philip L F Johnson; Hélène Blanche; Howard Cann; Jacob O Kitzman; Jay Shendure; Evan E Eichler; Ed S Lein; Trygve E Bakken; Liubov V Golovanova; Vladimir B Doronichev; Michael V Shunkov; Anatoli P Derevianko; Bence Viola; Montgomery Slatkin; David Reich; Janet Kelso; Svante Pääbo
Journal:  Nature       Date:  2013-12-18       Impact factor: 49.962

9.  HapFABIA: identification of very short segments of identity by descent characterized by rare variants in large sequencing data.

Authors:  Sepp Hochreiter
Journal:  Nucleic Acids Res       Date:  2013-10-29       Impact factor: 16.971

10.  An integrated map of genetic variation from 1,092 human genomes.

Authors:  Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal:  Nature       Date:  2012-11-01       Impact factor: 49.962
