Abstract
Most studies of single nucleotide variations focus on substitutions rather than insertions/deletions. In this study, we examined the distribution and characteristics of single nucleotide insertions/deletions (SNindels), using data available from dbSNP for all the human chromosomes. There are almost 300,000 SNindels in the database, of which only 0.8% are validated. They occur at a frequency of 0.887 per 10 kb on average for the whole genome, or approximately 1 for every 11,274 bp. More than half occur in regions with mononucleotide repeats, the longest of which is 47 bases. Overall, the mononucleotide repeats involving C and G are much shorter than those involving A and T. About 12% are surrounded by palindromes. There is a general correlation between chromosome size and the total number of SNindels on each chromosome. Inter-chromosomal variation in density ranges from 0.6 to 21.7 per kilobase. The overall spectrum shows a very high proportion of SNindels of types -/A and -/T, at over 81%. The proportion of -/A and -/T SNindels for each chromosome is correlated with its AT content. Fewer than half of the SNindels are within or near known genes, and even fewer (<0.183%) are in coding regions; more than 1.4% of the -/C and -/G types are in coding regions, compared with 0.2% of the -/A and -/T types. SNindels of -/A and -/T types make up 80% of those found within untranslated regions but less than 40% of those within coding regions. A separate analysis using the subset of 2324 validated SNindels showed a slightly lower AT bias of 74%; SNindels not within mononucleotide repeats showed an even lower AT bias of 58%. The density of validated SNindels is 0.007/10 kb overall, and 90% are found within or near genes. Among all chromosomes, Y has the lowest numbers and densities for all SNindels, validated SNindels, and SNindels not within repeats.
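The density figures quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' method: the function names are hypothetical, and the total of 300,000 is an assumed round number standing in for the abstract's "almost 300,000".

```python
def per_10kb_to_spacing(density_per_10kb: float) -> float:
    """Average number of base pairs per event, given events per 10 kb."""
    return 10_000 / density_per_10kb


def validated_percent(validated: int, total: int) -> float:
    """Percentage of entries that are validated."""
    return 100 * validated / total


# 0.887 SNindels per 10 kb corresponds to about one SNindel per 11,274 bp.
spacing_bp = per_10kb_to_spacing(0.887)
print(round(spacing_bp))

# 2324 validated SNindels out of an assumed total of 300,000
# (the abstract says "almost 300,000") is roughly 0.8%.
print(round(validated_percent(2324, 300_000), 1))
```

Both results agree with the values stated in the abstract (1 per 11,274 bp and 0.8% validated), to the precision reported there.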
Year: 2006 PMID: 16781088 DOI: 10.1016/j.gene.2006.04.009
Source DB: PubMed Journal: Gene ISSN: 0378-1119 Impact factor: 3.688