Stephan Hutter, Albert J Vilella, Julio Rozas.
Abstract
BACKGROUND: Analysis of DNA sequence polymorphisms can provide valuable information on the evolutionary forces shaping nucleotide variation, and offers insight into the functional significance of genomic regions. Ongoing genome projects will radically improve our ability to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis.
Year: 2006 PMID: 16968531 PMCID: PMC1574356 DOI: 10.1186/1471-2105-7-409
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Graphical user interface of VariScan showing the major analysis options.
Figure 2. Wavelet decomposition tree. Multiresolution analysis (MRA) allows a signal to be decomposed into several resolution levels. First, the original signal (whose number of points must be a power of two) is decomposed by two complementary half-band filters (a high-pass and a low-pass filter) that divide the spectrum into high-frequency (detail coefficients; D1) and low-frequency (approximation coefficients; A1) components (bands). For example, the low-pass filter removes the upper half of the frequency band. Only the low-frequency band (A1), which has half as many points, is filtered at the second decomposition level. The resulting A2 is then filtered again for further decomposition.
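The decomposition tree described in this caption can be illustrated with a minimal sketch. It assumes the Haar wavelet as the filter pair, which is an assumption made here for simplicity (the caption does not name the wavelet), and the function names `haar_step` and `haar_mra` are hypothetical, not part of VariScan.

```python
import math


def haar_step(signal):
    """One decomposition level with Haar filters (an assumed wavelet).

    Returns (approximation, detail); each has half as many points as the input.
    """
    scale = 1.0 / math.sqrt(2.0)
    # Low-pass filter: scaled pairwise sums (low-frequency band).
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    # High-pass filter: scaled pairwise differences (high-frequency band).
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_mra(signal, levels):
    """Decompose `signal` into A_levels and the detail bands [D1, ..., D_levels]."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    if levels < 1 or 2 ** levels > n:
        raise ValueError("too many levels for this signal length")
    approx = list(signal)
    details = []
    for _ in range(levels):
        # Only the approximation band is filtered again at the next level.
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Toy signal with 8 points, decomposed to two levels.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a2, (d1, d2) = haar_mra(signal, 2)
print(len(d1), len(d2), len(a2))
```

Each level halves the number of points, so an 8-point signal yields D1 with 4 points and D2 and A2 with 2 points each.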
Figure 3. Visualization on the UCSC browser of the MRA based on θ values from the mouse genome resequencing project data [20]. The UCSC browser shows a 20 Mb region (positions 65,000,001–85,000,000). The first two tracks (custom tracks) represent the signal reconstruction of low-frequency bands with information from MRA levels 9 to 11 (first track) and from MRA levels 12 to 16 (second track).
Figure 4. Application of the MRA to the coalescent-simulated data set. The data set contains 10 sequences of 2,000,000 bp each and was generated with a per-site value of θ = 0.01. We introduced two kinds of changes into this raw data set: i) two wide reductions in nucleotide diversity levels (g1: α = 1/3, β = 500,000; g2: α = 1/2, β = 500,000); and ii) 11 local valleys of reduced variability (v1: α = 1/4, β = 20,000; v2: α = 1/4, β = 15,000; v3: α = 1/4, β = 10,000; v4: α = 1/4, β = 5,000; v5: α = 1/4, β = 2,000; v6: α = 1/3, β = 20,000; v7: α = 1/3, β = 10,000; v8: α = 1/3, β = 5,000; v9: α = 1/2, β = 10,000; v10: α = 1/2, β = 5,000; v11: α = 1/2, β = 2,000). (a) Nucleotide diversity profile obtained by SW using non-overlapping windows of 50 bp. (b) Signal reconstruction of low-frequency bands with information from MRA levels 7 to 8, showing the location (arrows) of 5 depleted-variation regions (v4–5, v8, v10–11; β ≤ 5,000). (c) Signal reconstruction from MRA levels 9 to 12, showing the location (arrows) of 9 depleted-variation regions (v1–4, v6–10; 5,000 ≤ β ≤ 20,000). (d) Signal reconstruction from MRA levels 13 to 15, showing the location (arrows) of the two broad areas with reduced levels of variation (g1–2; β = 500,000). The nucleotide sequence positions (X axis) are given in kb.
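The layout of reduced-variation regions in this caption can be sketched as a deterministic profile of expected diversity per window. This is not a coalescent simulation and not the authors' procedure. It assumes that α is a multiplicative factor applied to θ and that β is the region width in bp. The region start positions and the function name `reduced_profile` are hypothetical, since the caption gives no coordinates.

```python
def reduced_profile(length_bp, window_bp, theta, regions):
    """Expected per-site diversity for each non-overlapping window.

    regions: list of (start_bp, beta, alpha) tuples, where beta is the region
    width in bp and alpha the assumed multiplicative reduction of theta.
    """
    n_windows = length_bp // window_bp
    profile = [theta] * n_windows
    for start_bp, beta, alpha in regions:
        first = start_bp // window_bp
        last = min(n_windows, (start_bp + beta) // window_bp)
        for w in range(first, last):
            # Multiply so that nested regions would compound their reductions.
            profile[w] *= alpha
    return profile


# Hypothetical placement: one broad region (like g1) and one local valley (like v1).
regions = [
    (200_000, 500_000, 1 / 3),   # broad reduction: alpha = 1/3, beta = 500,000
    (1_000_000, 20_000, 1 / 4),  # local valley: alpha = 1/4, beta = 20,000
]
profile = reduced_profile(2_000_000, 50, 0.01, regions)
print(len(profile), profile[0], profile[4_000], profile[20_000])
```

With 50 bp windows, a 2,000,000 bp sequence gives 40,000 windows, and a valley with β = 20,000 spans 400 of them.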