Reidar Andreson, Tarmo Puurand, Maido Remm.
Abstract
SNPmasker is a comprehensive web interface for masking large eukaryotic genomes. The program is designed to mask SNPs from a recent release of the dbSNP database and to mask repeats with either of two alternative programs. In addition to SNP masking, we also offer population-specific substitution of SNP alleles in the genomic sequence according to SNP frequencies in HapMap Phase II data. The input to SNPmasker can be defined by chromosomal coordinates or inserted as a sequence. The sequences masked by our web server are most useful as a preliminary step for different primer and probe design tasks. The service is available at http://bioinfo.ebc.ee/snpmasker/ and is free for all users.
Year: 2006 PMID: 16845091 PMCID: PMC1538889 DOI: 10.1093/nar/gkl125
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Comparison of different web pages for masking SNPs and repeats
| Web page | SNP masking types | Repeat-masking programs | Region defined by coordinates | Region found by sequence homology search |
|---|---|---|---|---|
| Genome browser UCSC | lower-case, by color, bold/italic | RepeatMasker | Yes | No |
| SNP Research Facility Washington University in St. Louis | IUPAC | RepeatMasker | Yes | No |
| GSF Munich, Germany | ‘N’ | RepeatMasker | Yes | No |
| SNP BLAST NCBI | IUPAC | RepeatMasker | No | Yes |
| SNPmasker University of Tartu | any character, IUPAC, lower-case, by HapMap frequency | RepeatMasker, GenomeMasker | Yes | Yes |
Figure 1. Web interface for SNPmasker input.
Figure 2. Examples of different masking styles. Masked repeats are shown in boldface and SNPs are highlighted in red for visualization in this figure. (A) Original sequence from the human genome, assembly NCBI35.1. (B) The same sequence masked for PCR primer design with GenomeMasker using the parameter ‘target’. Asymmetrical masking is used: on the left side of the target the upper strand is masked, and on the right side of the target the lower strand is masked. The middle part around the third SNP (shown in italics) is the target region chosen for amplification. (C) The sequence masked with RepeatMasker. (D) Population-specific masking of SNPs. The original nucleotides in the genome sequence have been substituted with population-specific (lower-case) nucleotides using HapMap frequency information.
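The SNP masking styles named in the comparison table and in Figure 2 (any character such as ‘N’, IUPAC codes, lower-case, and substitution by HapMap frequency) can be illustrated with a minimal Python sketch. This is not the SNPmasker implementation: the function `mask_snps`, its arguments, the 0-based positions, and the example frequencies are all hypothetical, and the frequency style is assumed here to substitute the most frequent allele in lower case.

```python
# IUPAC ambiguity codes for sets of two or more nucleotides.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def mask_snps(seq, snps, style="N", freqs=None):
    """Return seq with every SNP position rewritten according to style.

    seq   -- upper-case nucleotide string
    snps  -- dict: 0-based position -> string of observed alleles (hypothetical format)
    style -- "iupac", "lower", "freq", or any single masking character such as "N"
    freqs -- only for "freq": dict position -> {allele: frequency} (hypothetical format)
    """
    out = list(seq)
    for pos, alleles in snps.items():
        if style == "iupac":
            # Fall back to "N" if the allele set has no ambiguity code.
            out[pos] = IUPAC.get(frozenset(alleles.upper()), "N")
        elif style == "lower":
            out[pos] = out[pos].lower()
        elif style == "freq":
            # Assumption: substitute the most frequent allele, in lower case.
            major = max(freqs[pos], key=freqs[pos].get)
            out[pos] = major.lower()
        else:
            out[pos] = style  # mask with the given character
    return "".join(out)


# Example data (invented for illustration only).
seq = "ACGTACGT"
snps = {1: "CT", 5: "CG"}
freqs = {1: {"C": 0.2, "T": 0.8}, 5: {"C": 0.9, "G": 0.1}}

masked_n = mask_snps(seq, snps, "N")
masked_iupac = mask_snps(seq, snps, "iupac")
masked_lower = mask_snps(seq, snps, "lower")
masked_freq = mask_snps(seq, snps, "freq", freqs)
```

With the invented inputs above, the four calls produce one masked copy of the same sequence per style, which makes it easy to compare how each style would affect a downstream primer design tool that treats ‘N’, ambiguity codes, or lower-case letters differently.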
The performance of SNPmasker for different tasks
| Job | 1 kb | 10 kb | 100 kb |
|---|---|---|---|
| Sequence from FASTA file | | | |
| No repeat-masking, SNPs masked with ‘N’ | 32 s | 35 s | 142 s |
| No repeat-masking, SNPs masked using HapMap allele frequency | 32 s | 35 s | 146 s |
| GenomeMasker, SNPs masked with ‘N’ | 38 s | 40 s | 148 s |
| RepeatMasker, SNPs masked with ‘N’ | 32 s | 35 s | 142 s |
| Sequence defined by chromosomal coordinates | | | |
| No repeat-masking, SNPs masked with ‘N’ | 14 s | 14 s | 14 s |
| No repeat-masking, SNPs masked using HapMap allele frequency | 14 s | 14 s | 17 s |
| GenomeMasker, SNPs masked with ‘N’ | 15 s | 15 s | 18 s |
| RepeatMasker, SNPs masked with ‘N’ | 14 s | 14 s | 14 s |