Chun-Hsi Chen, Trees-Juen Chuang, Ben-Yang Liao, Feng-Chi Chen.
Abstract
Human-specific small insertions and deletions (HS indels, with lengths <100 bp) are reported to be ubiquitous in the human genome. However, whether these indels contribute to human-specific traits remains unclear. Here we employ a modified McDonald-Kreitman (MK) test and a combinatorial population genetics approach to infer, respectively, the occurrence of positive selection and recent selective sweep events associated with HS indels. We first extract 625,890 HS indels from the human-chimpanzee-macaque-mouse multiple alignments and classify them into nonpolymorphic (41%) and polymorphic (59%) indels with reference to the human indel polymorphism data. The modified MK test is then applied to 100-kb partially overlapped sliding windows across the human genome to scan for the signs of positive selection. After excluding the possibility of biased gene conversion and controlling for false discovery rate, we show that HS indels are potentially positively selected in about 10 Mb of the human genome. Furthermore, the indel-associated positively selected regions overlap with genes more often than expected. However, our result suggests that the potential targets of positive selection are located in noncoding regions. Meanwhile, we also demonstrate that the genomic regions surrounding HS indels are more frequently involved in recent selective sweep than the other regions. In addition, HS indels are associated with distinct recent selective sweep events in different human subpopulations. Our results suggest that HS indels may have been associated with human adaptive changes at both the species level and the subpopulation level.
Keywords: human-specific indels; positive selection; recent selective sweep
Year: 2009 PMID: 20333210 PMCID: PMC2817433 DOI: 10.1093/gbe/evp041
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
The R Values in Different Polymorphism Data Sets
| Data source | Nonpolymorphic Indels | Polymorphic Indels | RID | Nonpolymorphic Substitutions | Polymorphic Substitutions | RNT |
| dbSNP | 119,353 | 161,470 | 0.74 | 962,193 | 1,107,124 | 0.87 |
| Seattle + NIEHS | 3,466 | 6,623 | 0.52 | 22,520 | 50,552 | 0.45 |
| Seattle SNPs | 723 | 2,224 | 0.33 | 5,070 | 12,950 | 0.39 |
| NIEHS SNPs | 2,764 | 4,451 | 0.62 | 17,593 | 38,044 | 0.46 |
NOTE.—Note that some of the analyzed regions of Seattle and National Institute of Environmental Health Sciences (NIEHS) SNPs overlap with each other. Therefore, the numbers in the row of “Seattle + NIEHS” are smaller than the sums of the two individual data sets. In addition, the RID and RNT values of Seattle and NIEHS SNPs are obviously different from those of dbSNP because of the specific purposes of the two data sets. The Seattle SNPs data set includes mainly inflammatory response genes, whereas the NIEHS data set includes environmental response genes.
RID and RNT are the ratios of nonpolymorphic changes to polymorphic changes for indels and nucleotide substitutions, respectively.
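The definition above can be checked directly against the table. The following is a minimal sketch that recomputes RID and RNT from the listed counts; the names `COUNTS` and `r_values` are illustrative and do not come from the paper.

```python
# Counts copied from the table above, in the order:
# (nonpolymorphic indels, polymorphic indels,
#  nonpolymorphic substitutions, polymorphic substitutions)
COUNTS = {
    "dbSNP": (119_353, 161_470, 962_193, 1_107_124),
    "Seattle + NIEHS": (3_466, 6_623, 22_520, 50_552),
    "Seattle SNPs": (723, 2_224, 5_070, 12_950),
    "NIEHS SNPs": (2_764, 4_451, 17_593, 38_044),
}


def r_values(nonpoly_indel, poly_indel, nonpoly_sub, poly_sub):
    """Return (RID, RNT): nonpolymorphic / polymorphic changes
    for indels and for nucleotide substitutions, respectively."""
    return nonpoly_indel / poly_indel, nonpoly_sub / poly_sub


for source, counts in COUNTS.items():
    rid, rnt = r_values(*counts)
    print(f"{source}: RID = {rid:.2f}, RNT = {rnt:.2f}")
```

Rounded to two decimals, the recomputed ratios agree with the RID and RNT columns of the table.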
Results of the Modified MK Test
| Summary | PSW | NSW | Neutral | Total |
| No. of windows (A) | 2,174 | 4,975 | 46,092 | 53,241 |
| No. of gene-overlapping windows (B) | 1,563 | 3,263 | 29,527 | 34,353 |
| Percentage (B/A) | 71.9 | 65.6 | 64.1 | 64.5 |
NOTE.—In the modified MK test, the numbers of nonsynonymous substitutions are replaced by those of HS indels. That is, the test examines whether RID is significantly larger than RNT. See Materials and Methods for more details.
“PSW” and “NSW” represent positively and negatively selected windows, respectively.
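The note above says only that the modified MK test asks whether RID is significantly larger than RNT; the exact statistic is given in the paper's Materials and Methods and is not reproduced here. As a rough illustration, the sketch below assumes a one-sided Fisher's exact test on the 2 × 2 table of one window. The function name `mk_one_sided_p` and the example counts are hypothetical.

```python
from math import comb


def mk_one_sided_p(nonpoly_indel, poly_indel, nonpoly_sub, poly_sub):
    """One-sided Fisher's exact P value for RID > RNT in a single window.

    Assumption: Fisher's exact test stands in for the paper's test,
    whose exact form is not stated in this record.
    """
    n_indel = nonpoly_indel + poly_indel      # all indels in the window
    n_sub = nonpoly_sub + poly_sub            # all substitutions in the window
    n_nonpoly = nonpoly_indel + nonpoly_sub   # all nonpolymorphic changes
    total = n_indel + n_sub

    # Hypergeometric tail: probability of seeing at least `nonpoly_indel`
    # nonpolymorphic indels when the margins of the table are held fixed.
    upper = min(n_indel, n_nonpoly)
    tail = sum(
        comb(n_indel, k) * comb(n_sub, n_nonpoly - k)
        for k in range(nonpoly_indel, upper + 1)
    )
    return tail / comb(total, n_nonpoly)


# Hypothetical counts for one 100-kb window (not taken from the paper).
p = mk_one_sided_p(nonpoly_indel=12, poly_indel=5, nonpoly_sub=60, poly_sub=70)
print(f"one-sided P = {p:.4f}")
```

In a genome-wide scan, the per-window P values would then need a false discovery rate correction, as the abstract describes; that step is omitted from this sketch.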
Results of the DH Test in the EHH Windows with or without HS Indels
| Subpopulation | No. of Windows | #SSRs | Ratio | Q value |
| With HS indels | | | | |
| African | 195,513 | 5 (1) | 2.6 × 10⁻⁵ (7.2 × 10⁻⁶) | 0.720 |
| European | 168,525 | 175 (154) | | 0.122 |
| East Asian | 171,907 | 324 (292) | | 0.098 |
| Without HS indels | | | | |
| African | 498,491 | 6 (2) | 1.2 × 10⁻⁵ (3.6 × 10⁻⁶) | 0.699 |
| European | 352,576 | 256 (224) | 7.3 × 10⁻⁴ (6.3 × 10⁻⁴) | 0.127 |
| East Asian | 369,998 | 440 (394) | 1.2 × 10⁻³ (1.1 × 10⁻³) | 0.105 |
Ratio: number of SSRs divided by number of windows.
Q value (last column): the false discovery rate (Storey 2002).
Values in parentheses: the number (or ratio) of SSRs corrected according to the Q value.
Significantly higher in the regions with HS indels than in those without HS indels (boldfaced, P values < 0.007, χ² test).
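The ratio defined in the footnotes, and the kind of comparison the χ² test makes, can be sketched from the window and SSR counts alone. The example below uses the uncorrected SSR counts and a Pearson χ² test without continuity correction. Both choices are assumptions, since this record does not say which counts or which variant of the test the authors used. The names `WITH_HS`, `WITHOUT_HS`, and `chi2_2x2` are illustrative.

```python
from math import erfc, sqrt

# (number of windows, uncorrected number of SSRs), copied from the table above.
WITH_HS = {
    "African": (195_513, 5),
    "European": (168_525, 175),
    "East Asian": (171_907, 324),
}
WITHOUT_HS = {
    "African": (498_491, 6),
    "European": (352_576, 256),
    "East Asian": (369_998, 440),
}


def chi2_2x2(windows_a, ssr_a, windows_b, ssr_b):
    """Pearson chi-square statistic (1 d.f., no continuity correction)
    and its P value for SSR vs. non-SSR windows in two groups."""
    observed = [[ssr_a, windows_a - ssr_a], [ssr_b, windows_b - ssr_b]]
    row_sums = [windows_a, windows_b]
    col_sums = [ssr_a + ssr_b, (windows_a - ssr_a) + (windows_b - ssr_b)]
    total = windows_a + windows_b
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


for pop in WITH_HS:
    win_a, ssr_a = WITH_HS[pop]
    win_b, ssr_b = WITHOUT_HS[pop]
    stat, p = chi2_2x2(win_a, ssr_a, win_b, ssr_b)
    print(f"{pop}: ratio with = {ssr_a / win_a:.1e}, "
          f"without = {ssr_b / win_b:.1e}, chi2 = {stat:.2f}, P = {p:.2g}")
```

Because the ratio is simply SSRs divided by windows, the same loop also yields a value for any row whose ratio is not listed in the table. Under the assumptions stated above, the European and East Asian comparisons fall below the 0.007 threshold quoted in the footnote, while the African comparison does not.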