Alan Hodgkinson, Ferran Casals, Youssef Idaghdour, Jean-Christophe Grenier, Ryan D Hernandez, Philip Awadalla.
Abstract
BACKGROUND: Regions of the genome that are under evolutionary constraint across multiple species have previously been used to identify functional sequences in the human genome. Furthermore, it is known that there is an inverse relationship between evolutionary constraint and the allele frequency of a mutation segregating in human populations, implying a direct relationship between interspecies divergence and fitness in humans. Here we utilise this relationship to test differences in the accumulation of putatively deleterious mutations both between populations and on the individual level.
Year: 2013 PMID: 23875710 PMCID: PMC3727949 DOI: 10.1186/1471-2164-14-495
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. The relationship between the average GERP score of a gene and the MAF of polymorphisms in the surrounding regions. Genes were split into quartiles based on average GERP score and the average MAF calculated in the sequences surrounding coding regions (A). The correlation between the depth of the depression in minor allele frequency and the average GERP score of genes in each of the top eight GERP score bins (B).
Figure 2. The relationship between effective population size (Ne) and the MAF of polymorphisms in the regions surrounding the most conserved genes. For genes with the highest GERP scores (top 10%), the average MAF scores surrounding genes in each population, with population codes shown in the corresponding colour to the right of each line (A). The correlation between Ne and the depth of depression in MAF around the most highly conserved genes for Old World populations for which we have Ne data (B). Population codes are as follows: Utah residents with Northern and Western European ancestry (CEU), British in England and Scotland (GBR), Toscani in Italy (TSI), Finnish from Finland (FIN), Han Chinese in Beijing, China (CHB), Southern Han Chinese (CHS), Japanese in Tokyo, Japan (JPN), Yoruba in Ibadan, Nigeria (YRI), Luhya in Webuye, Kenya (LWK), Americans of African Ancestry in S.W. USA (ASW), Mexican ancestry from Los Angeles, USA (MXL), Puerto Ricans from Puerto Rico (PUR) and Colombians from Medellin, Colombia (CLM).
Figure 3. The relationship between the average GERP score of a non-coding site that is at least 200 kb away from known genes and the MAF of polymorphisms in the surrounding regions.
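Individuals with significantly different distributions of GERP scores within populations for singletons at nonsynonymous sites
| Individual 1 | Individual 2 | Population | Mann–Whitney U P-value | Kolmogorov–Smirnov P-value |
|---|---|---|---|---|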
| HG00244 | HG00253 | GBR | 0.03720 | NS |
| HG01342 | HG01374 | CLM | 0.00036 | 3.29e-05 |
| HG01342 | HG01112 | CLM | NS | 0.00657 |
| HG01342 | HG01494 | CLM | NS | 0.02885 |
| HG01342 | HG01274 | CLM | 0.02758 | NS |
| HG01374 | HG01551 | CLM | 0.00147 | 7.53e-05 |
| HG01374 | HG01550 | CLM | NS | 0.04320 |
| HG01374 | HG01488 | CLM | 0.02787 | 0.02099 |
| HG01551 | HG01274 | CLM | 0.03460 | NS |
| NA19429 | NA19321 | LWK | 0.02957 | NS |
| NA19384 | NA19321 | LWK | 0.04720 | NS |
| NA19660 | NA19741 | MXL | 0.04898 | NS |
| NA19723 | NA19681 | MXL | 0.00993 | 0.00853 |
| NA19723 | NA19783 | MXL | 0.00110 | 0.02363 |
| NA19723 | NA19741 | MXL | 0.00090 | 0.00834 |
| NA19723 | NA19654 | MXL | 0.02773 | NS |
| HG01167 | HG01072 | PUR | 0.00204 | 0.02393 |
| HG01072 | HG01108 | PUR | 0.00015 | NS |
| HG01072 | HG01204 | PUR | 0.00207 | 0.02961 |
| HG01072 | HG01051 | PUR | 0.02411 | NS |
| HG01072 | HG01052 | PUR | 0.00866 | NS |
| HG01082 | HG01108 | PUR | 0.03430 | 0.00019 |
P-values are shown for the Mann–Whitney U and Kolmogorov–Smirnov tests after Bonferroni correction; NS indicates a non-significant result.
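The pairwise comparison summarised in the table can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it covers only the Mann–Whitney U test (normal approximation with tie correction, no continuity correction), and it assumes the Bonferroni factor is the number of pairs compared within a population, which the record does not state. The function names, individual labels and GERP scores are hypothetical.

```python
import math
from itertools import combinations


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2))


def pairwise_bonferroni(scores_by_individual, alpha=0.05):
    """Compare every pair of individuals; report the adjusted p-value or 'NS'.

    Assumption: the correction factor is the number of pairs tested.
    """
    pairs = list(combinations(sorted(scores_by_individual), 2))
    results = []
    for a, b in pairs:
        _, p = mann_whitney_u(scores_by_individual[a], scores_by_individual[b])
        p_adj = min(1.0, p * len(pairs))
        results.append((a, b, p_adj if p_adj < alpha else "NS"))
    return results


# Synthetic GERP scores at singleton nonsynonymous sites (illustrative only).
demo = {
    "ind_A": [0.5, 1.2, 2.8, 3.1, 4.0, 4.4],
    "ind_B": [0.7, 1.0, 2.5, 3.3, 3.9, 4.6],
    "ind_C": [4.8, 5.0, 5.1, 5.3, 5.5, 5.9],
}
for row in pairwise_bonferroni(demo):
    print(row)
```

A Kolmogorov–Smirnov comparison would slot into the same loop; a statistics library routine is usually preferable to a hand-written test for real data.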
Figure 4. The numbers and proportions of mutations that occur at nonsynonymous sites with different GERP scores for individuals in the 1000 Genomes populations. For each individual, the proportion of nonsynonymous sites carrying the minor allele that fall into each GERP score bin was found and the proportions were averaged for individuals within each population in the 1000 Genomes data (A). Similarly, the average distribution was found for each population using the absolute numbers of alleles at heterozygous (B) and homozygous derived allele (inferred from a six-way primate alignment) (C) sites falling in each positive GERP bin. African populations are blue, admixed American populations are orange, European populations are red and Asian populations are green. Error bars denote 95% confidence intervals.
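The per-individual binning and per-population averaging described for panel A could be computed along these lines. This is a sketch under stated assumptions: the bin edges are hypothetical (the caption does not give them), scores outside the bins are ignored, and the function names and input layout are invented for illustration.

```python
from collections import defaultdict


def bin_proportions(scores, edges):
    """Fraction of scores falling in each half-open bin [edges[i], edges[i+1]).

    Scores outside all bins are ignored; proportions are relative to the
    number of binned scores.
    """
    counts = [0] * (len(edges) - 1)
    for s in scores:
        for i in range(len(edges) - 1):
            if edges[i] <= s < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def population_average(individuals, edges):
    """Average the per-individual bin proportions within each population.

    individuals: iterable of (population_code, list_of_GERP_scores).
    """
    sums = defaultdict(lambda: [0.0] * (len(edges) - 1))
    n = defaultdict(int)
    for pop, scores in individuals:
        props = bin_proportions(scores, edges)
        sums[pop] = [a + b for a, b in zip(sums[pop], props)]
        n[pop] += 1
    return {pop: [v / n[pop] for v in sums[pop]] for pop in sums}


# Hypothetical bin edges and synthetic scores, for illustration only.
edges = [0, 2, 4, 6]
individuals = [
    ("POP1", [1, 1, 3, 5]),
    ("POP1", [1, 3]),
    ("POP2", [5, 5, 3, 1]),
]
print(population_average(individuals, edges))
```

Averaging proportions rather than pooling counts gives each individual equal weight regardless of how many sites they carry, which matches the per-individual description in the caption.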