Jesús Murga-Moreno, Marta Coronado-Zamora, Sergi Hervas, Sònia Casillas, Antonio Barbadilla.
Abstract
The McDonald and Kreitman test (MKT) is one of the most powerful and widely used methods to detect and quantify recurrent natural selection using DNA sequence data. Here we present iMKT (integrative McDonald and Kreitman test), a novel web-based service that performs four distinct types of MKT. It allows the detection and estimation of four different selection regimes (adaptive, neutral, strongly deleterious and weakly deleterious) acting on any genomic sequence. iMKT can analyze both users' own population genomic data and pre-loaded Drosophila melanogaster and human sequences of protein-coding genes obtained from the largest population genomic datasets to date. Advanced options on the website allow testing complex hypotheses, such as the application example shown here: do genes located in high recombination regions undergo higher rates of adaptation? We aim for iMKT to become a reference tool for the study of evolutionary adaptation in massive population genomics datasets, especially in Drosophila and humans. iMKT is a free online resource available at https://imkt.uab.cat.
Year: 2019 PMID: 31081014 PMCID: PMC6602517 DOI: 10.1093/nar/gkz372
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Comparison of the four MKT methods implemented in iMKT. (A) The hypothetical derived allele frequency (DAF) spectrum of synonymous and non-synonymous classes for a gene exhibiting an excess of both slightly deleterious and fixed non-synonymous differences, with n = 10 sampled chromosomes. (B) The standard MKT for this gene (P-value = 0.09, 2 × 2 Fisher's exact test). (C) The 2 × 2 table after Fay, Wyckoff and Wu's correction (24), which takes into account only polymorphisms found on more than one chromosome (P-value = 0.045, 2 × 2 Fisher's exact test). (D) Extended MKT (9). The count of segregating non-synonymous sites is partitioned into the number of neutral variants and the number of weakly deleterious variants. PN is replaced by the number of non-synonymous polymorphisms that are neutral (P-value = 0.042, 2 × 2 Fisher's exact test). (E) Asymptotic MKT. Example of an asymptotic MKT result using the D. melanogaster 2R chromosome and D. simulans as outgroup. The two vertical lines show the limits of the x cutoff interval used (in the example, [0, 0.9]). Black dots indicate the binned α values for each DAF category. The solid red curve shows the fitted function α_fit(x). The dashed red line is the final asymptotic α estimate. The dark gray band indicates the 95% CI around the estimate. The blue dashed line shows the α estimated using the standard MKT for comparison. For definitions of the MKT methods, see Appendix 1. Adapted and expanded from (29).
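The tests in panels B and C can be illustrated with a minimal Python sketch. It is not iMKT's implementation: the function names and the toy counts are hypothetical, and α is computed with the commonly used estimator α = 1 − (PN/PS)(DS/DN), which the text above does not spell out. The Fay, Wyckoff and Wu step is reduced here to dropping polymorphisms seen on a single chromosome, as described in the caption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def standard_mkt(pn, ps, dn, ds):
    """Return (alpha, P-value) from polymorphism (P) and divergence (D) counts."""
    alpha = 1 - (pn / ps) * (ds / dn)  # assumed estimator, see lead-in
    return alpha, fisher_exact_two_sided(pn, ps, dn, ds)


def drop_singletons(derived_counts):
    """Count polymorphisms whose derived allele is on more than one chromosome."""
    return sum(1 for count in derived_counts if count > 1)


# Toy data (hypothetical): derived-allele counts per polymorphic site.
nonsyn_counts = [1, 1, 1, 2, 5]
syn_counts = [1, 2, 3, 4]
dn, ds = 8, 4

alpha_std, p_std = standard_mkt(len(nonsyn_counts), len(syn_counts), dn, ds)
alpha_fww, p_fww = standard_mkt(
    drop_singletons(nonsyn_counts), drop_singletons(syn_counts), dn, ds
)
print(alpha_std, p_std)
print(alpha_fww, p_fww)
```

Removing low-frequency variants lowers PN more than PS when slightly deleterious non-synonymous polymorphisms are present, so the corrected α is typically higher than the standard one.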
Figure 2. iMKT graphical output of an application example. Sampling distribution of α values for protein-coding genes located in regions of high recombination (recombination rate >7 cM/Mb) compared to all protein-coding genes in the genome for (A) the D. melanogaster Raleigh (RAL) population (blue) and the D. melanogaster Zambia (ZI) population (yellow) and (B) the human Utah Residents (CEU) population (blue) and the human Yoruba (YRI) population (yellow). The distribution was calculated by randomly sampling 400 genes with replacement 100 times from each of the two gene lists and estimating α in each bin. Polymorphisms with a frequency below 0.05 in the analyzed population were discarded (see main text).
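The resampling scheme described in this caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code behind the figure: the gene tuples, the function names, and the choice to pool counts across the sampled genes before estimating α are all hypothetical, and the 0.05 frequency filter is assumed to have been applied when the per-gene counts were built.

```python
import random


def alpha_from_counts(pn, ps, dn, ds):
    """Standard MKT estimator, alpha = 1 - (PN/PS)(DS/DN)."""
    return 1 - (pn / ps) * (ds / dn)


def bootstrap_alpha(genes, n_genes=400, n_reps=100, seed=0):
    """Resample genes with replacement and estimate alpha for each replicate.

    `genes` is a list of (pn, ps, dn, ds) tuples, one per gene (hypothetical
    layout). Counts are pooled over the sampled genes, which is an assumption.
    """
    rng = random.Random(seed)
    alphas = []
    for _ in range(n_reps):
        sample = rng.choices(genes, k=n_genes)  # sampling with replacement
        pn, ps, dn, ds = (sum(gene[i] for gene in sample) for i in range(4))
        alphas.append(alpha_from_counts(pn, ps, dn, ds))
    return alphas


# Toy gene list (hypothetical counts) standing in for a real gene set.
toy_genes = [(2, 4, 8, 4), (1, 5, 6, 3), (3, 6, 9, 5)]
distribution = bootstrap_alpha(toy_genes)
print(min(distribution), max(distribution))
```

Running the same function on the high-recombination gene list and on the full gene list would give the two sampling distributions that the figure compares.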