Marla Mendes, Manjari Jonnalagadda, Shantanu Ozarkar, Flávia Carolina Lima Torres, Victor Borda Pua, Christopher Kendall, Eduardo Tarazona-Santos, Esteban J Parra.
Abstract
In this study, we present the results of a genome-wide scan for signatures of positive selection using data from four tribal groups (Kokana, Warli, Bhil, and Pawara) and two caste groups (Deshastha Brahmin and Kunbi Maratha) from the west of Maharashtra State in India, as well as two samples of South Asian ancestry from the 1KG project (Gujarati Indian from Houston, Texas, and Indian Telugu from the UK). We used an outlier approach based on different statistics, including PBS, xpEHH, iHS, CLR, and Tajima's D, as well as two recently developed methods: Graph-aware Retrieval of Selective Sweeps (GRoSS) and Ascertained Sequentially Markovian Coalescent (ASMC). To minimize the risk of false positives, we selected regions that are outliers in all the samples included in the study using more than one method. We identified putative selection signals in 107 regions encompassing 434 genes. Many of the regions overlap with only one gene. The signals observed using microarray-based data are very consistent with our analyses using high-coverage sequencing data, as well as with those identified with a novel coalescence-based method (ASMC). Importantly, at least 24 of these genomic regions have been identified in previous selection scans in South Asian populations or in other population groups. Our study highlights genomic regions that may have played a role in the adaptation of anatomically modern humans to novel environmental conditions after the out-of-Africa migration.
Year: 2022 PMID: 35925921 PMCID: PMC9352006 DOI: 10.1371/journal.pone.0271767
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. Schematic representation of the approach to identify putative selective regions.
We applied six different methods to identify outliers (top 1% results) and selected regions that were observed in all population groups and were outliers for at least two independent methods. Additionally, we performed analyses using a novel coalescence-based method implemented in the program ASMC.
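The filtering logic described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the nested-dictionary input layout, and the convention that a higher score means a stronger signal are all hypothetical (a statistic such as Tajima's D, where low values are of interest, would need its sign flipped first). Windows are treated as precomputed, identical units across methods and populations.

```python
from collections import defaultdict


def top_fraction_cutoff(scores, fraction=0.01):
    """Return the score at or above which a window falls in the top `fraction`.

    Assumes higher scores indicate stronger evidence (hypothetical convention).
    """
    ranked = sorted(scores, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # keep at least one window
    return ranked[k - 1]


def candidate_regions(scores, fraction=0.01, min_methods=2):
    """Select windows that are outliers in every population for several methods.

    scores: {method: {population: {window: score}}} (hypothetical layout).
    Returns {window: sorted list of supporting methods}.
    """
    support = defaultdict(set)
    for method, by_population in scores.items():
        shared = None
        for by_window in by_population.values():
            cutoff = top_fraction_cutoff(by_window.values(), fraction)
            outliers = {w for w, s in by_window.items() if s >= cutoff}
            # A window must be an outlier in all populations for this method.
            shared = outliers if shared is None else shared & outliers
        for window in shared or ():
            support[window].add(method)
    # Keep windows supported by at least `min_methods` independent methods.
    return {w: sorted(m) for w, m in support.items() if len(m) >= min_methods}
```

Calling `candidate_regions(scores, fraction=0.005)` or `fraction=0.001` would correspond to the stricter 0.5% and 0.1% thresholds mentioned for Fig 2; tied scores at the cutoff are all retained in this sketch.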
Fig 2. Overview of our results.
A) Distribution of the number of regions identified for each chromosome; B) Distribution of the number of genes located within putative selective regions identified with two or three methods for each chromosome, for all thresholds (1%, 0.5% and 0.1%).
List of the putative selected genomic regions that have been described in previous studies.
| chr | Start | End | SNPs | Genes | OBS | Shared signal with 1KGP_HC | Shared signal with PopHuman Browser (iHS) |
|---|---|---|---|---|---|---|---|
| 1 | 234663636 | 235491532 | 118 | LOC100506795,TOMM20,SNORA14B,RBM34,ARID4B,MIR4753 | | xpEHH(0.5%), PBS(0.1%) | GIH, ITU |
| 2 | 72356366 | 73053177 | 20 | CYP26B1,EXOC6B,SNORD78 | | xpEHH(0.5%), Tajima's D(0.5%), PBS(0.5%) | GIH, ITU |
| 2 | 96940073 | 98858761 | 56 | SNRNP200,ITPRIPL1,NCAPH,ARID5A,KANSL3,FER1L5,ANKRD39,SEMA4C,FAM178B,FAHD2B,ANKRD36,ANKRD36B,COX5B,ACTR1B,LOC728537,ZAP70,VWA3B | | PBS(0.1%), xpEHH(0.1%) | GIH, ITU |
| 2 | 241662829 | 242033643 | 60 | KIF1A,AGXT,C2orf54,SNED1 | | xpEHH(1%) | GIH, ITU |
| 4 | 39289068 | 39529218 | 26 | RFC1,KLB,RPL9,LIAS,LOC401127,UGDH | | xpEHH(0.5%) | GIH, ITU |
| 6 | 29550028 | 33086926 | 3545 | SNORD32B,OR2H2,GABBR1,MOG,ZFP57,HLA-F,HLA-F-AS1,IER3,AK098012,DDR1,MIR4640,GTF2H4,VARS2,MUC22,HLA-C,HLA-B,HCP5,PMSP,PRRT1,LOC100507547,PPT2,PPT2-EGFL8,EGFL8,AGPAT1,RNF5,AGER,PBX2,GPSM3,NOTCH4,HLA-DMB,HLA-DMA,BRD2,HLA-DOA,HLA-DPA1,HLA-DPB1,HLA-DPB2 | | PBS(0.5%), xpEHH(1%) | GIH, ITU |
| 7 | 111366163 | 111461829 | 12 | DOCK4,BC043243 | | xpEHH(0.5%), PBS(0.5%) | GIH, ITU |
| 7 | 119913721 | 120390387 | 19 | KCND2 | | xpEHH(0.5%), Tajima's D(0.5%), CLR(0.5%) | GIH, ITU |
| 9 | 123714613 | 124095120 | 18 | C5,CNTRL,RAB14,GSN | | xpEHH(0.5%), PBS(0.1%) | GIH, ITU |
| 10 | 320129 | 735608 | 23 | DIP2C | | xpEHH(0.5%), Tajima's D(1%) | GIH, ITU |
| 10 | 118187423 | 118261387 | 9 | PNLIPRP3,JA611286 | | Tajima's D(0.5%) | |
| 11 | 61447904 | 62622555 | 144 | DAGLA,MYRF,DKFZP434K028,BC020196,TMEM258,MIR611,FEN1,FADS1,MIR1908,FADS2,FADS3,RAB3IL1,BEST1,FTH1,BC132896,SNORD27 | | PBS(0.1%), xpEHH(0.1%) | GIH, ITU |
| 11 | 65479472 | 68846261 | 198 | KAT5,RNASEH2C,AP5B1,SNX32,CFL1,MUS81,EFEMP2,CTSW,FIBP,CCDC85B,FOSL1,KLC2,RAB1B,AK125412,CNIH2,YIF1A,TMEM151A,CD248,RIN1,BRMS1,B3GNT1,SLC29A2,AX747485,NPAS4,MRPL11,LOC100130987,POLD4,CLCF1,RAD9A,PPP1CA,TBC1D10C,CARNS1,RPS6KB2,PTPRCAP,CORO1B,GPR152,CABP4,TMEM134,AIP,PITPNM1,CDK2AP2,CABP2,C11orf24,LRP5,MRGPRF,BC039516,TPCN2 | | CLR(0.1%), Tajima's D(0.5%), xpEHH(0.1%), PBS(0.5%) | GIH, ITU |
| 11 | 126293395 | 132206716 | 884 | KIRREL3,DJ031150,NTM | | xpEHH(0.5%), PBS(0.5%) | GIH, ITU |
| 13 | 92050934 | 93519487 | 139 | GPC5 | | xpEHH(0.1%), PBS(1%) | GIH, ITU |
| 14 | 63173944 | 63511955 | 35 | KCNH5 | | xpEHH(0.1%), PBS(0.5%) | GIH, ITU |
| 16 | 29464909 | 32077476 | 122 | BOLA2,KIF22,MAZ,AB209061,AK097472,PRRT2,PAGR1,BC029255,MVP,CDIPT,CDIPT-AS1,SEZ6L2,ASPHD1,KCTD13,TMEM219,TAOK2,HIRIP3,LOC595101,CD2BP2,TBC1D10B,MYLPF,SEPT1,ZNF48,SEPT2,ZNF771,DCTPP1,SEPHS2,ITGAL,MIR4518,ZNF768,ZNF747,AK056973,ZNF764,ZNF688,ZNF785,ZNF689,PRR14,FBRS,LOC730183,SRCAP,SNORA30,LOC100862671,PHKG2,C16orf93,RNF40,ZNF629,BCL7C,MIR4519,BC073928,MIR762,CTF1,FBXL19-AS1,FBXL19,ORAI3,SETD1A,HSD3B7,STX1B,STX4,BC039500,ZNF668,ZNF646,PRSS53,VKORC1,BCKDK,KAT8,PRSS8,PRSS36,FUS,TLS/FUS-ERG,PYCARD,C16orf98,TRIM72,PYDC1,ITGAM,DL489986,ITGAX,IGHV 3–07,IGH | | Tajima's D(0.1%), PBS(0.1%), xpEHH(0.1%) | GIH, ITU |
| 16 | 46760587 | 47735434 | 15 | MYLK3,C16orf87,GPT2,ITFG1,PHKB | | Tajima's D(0.5%), PBS(0.5%) | |
| 16 | 87117167 | 87457487 | 57 | AK125749,C16orf95,FBXO31,MAP1LC3B,ZCCHC14 | | xpEHH(0.5%), PBS(0.1%) | GIH, ITU |
| 17 | 17876126 | 18011299 | 4 | LRRC48,ATPAF2,BC150162,GID4,DRG2 | | PBS(0.5%) | GIH, ITU |
| 17 | 58755212 | 59470192 | 58 | BCAS3 | | Tajima's D(0.1%), CLR(0.1%), PBS(1%) | GIH, ITU |
| 20 | 53092265 | 53267710 | 13 | DOK5 | | xpEHH(0.5%), Tajima's D(0.1%), CLR(0.1%), PBS(0.1%) | |
| 22 | 35462129 | 35483380 | 3 | ISX | | | |
| 22 | 46756730 | 46933067 | 49 | CELSR1 | | xpEHH(0.5%), Tajima's D(0.5%) | |
List of the regions with putative signatures of natural selection in our study that have been described in other studies, with a particular emphasis on studies in South Asian populations or other Asian groups. 1: Metspalu et al. 2011, 2: Suo et al. 2012, 3: Karlsson et al. 2013, 4: Liu et al. 2017, 5: Perdomo-Sabogal and Nowick 2019. In blue, we show regions with results in the top 0.5% of the most significant values for at least one method, and in red, regions with results in the top 0.1% for at least one method. We also highlight the regions that have significant results in the 1KGP high-coverage data (1KGP_HC) and in the PopHuman Browser with iHS.
Fig 3. Ascertained Sequentially Markovian Coalescent (ASMC) results for chromosome 13.
The plot shows, at greater resolution, the region including the gene GPC5, where the highest enrichment in recent coalescence events on this chromosome is concentrated.