Abstract
BACKGROUND: The massive amount of SNP data stored at public internet sites provides unprecedented access to human genetic variation. Selecting target SNPs for disease-gene association studies is currently done more or less randomly, as decision rules for selecting functionally relevant SNPs are not available.
Year: 2004 PMID: 15248903 PMCID: PMC487897 DOI: 10.1186/1471-2105-5-92
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The SNP context view consists of 7 panes with a flexible number of sequence lanes in pane "E". The sequence pane may be exploded by the interlinear display of splice variants or conserved sequences among species.
SNP annotation of human chromosome 6p21
| Total genes examined | 396 |
| Total SNP number (mean 49/gene, range 1–463) | 19,495 |
| SNPs inside of a motif | 2,562 |
| Total motifs touched | 5,547 |
| SNPs inside of transcription factor site | 4,893 |
| SNPs that deleted a binding site | 3,053 |
| SNPs that inserted a binding site | 1,816 |
| SNPs inside of exon-intron boundary of all known transcripts | 44 |
| SNPs leading to amino acid exchange | 310 |
| - Ala>Thr | 15 |
| - Glu>Lys | 11 |
| - Arg>Gln | 10 |
| SNPs inside of exon splice enhancer motif | 299 |
| - 3C | 80 |
| - 5C3D | 73 |
| - 5B3A | 39 |
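The counts in the table above can be put in proportion with a few lines of arithmetic. The following is a minimal sketch that only reuses the numbers from the table; the variable names are illustrative, and the categories may overlap (one SNP can fall into several), so the shares should not be summed.

```python
# Counts copied from the chromosome 6p21 annotation table.
total_genes = 396
total_snps = 19_495
counts = {
    "inside a motif": 2_562,
    "inside a transcription factor site": 4_893,
    "leading to amino acid exchange": 310,
    "inside an exon splice enhancer motif": 299,
}

# Mean number of SNPs per gene; the table rounds this to 49/gene.
mean_per_gene = total_snps / total_genes
print(f"mean SNPs per gene: {mean_per_gene:.1f}")

# Share of all annotated SNPs in each category (categories may overlap).
shares = {label: n / total_snps for label, n in counts.items()}
for label, share in shares.items():
    print(f"SNPs {label}: {share:.1%}")
```

The computed mean agrees with the "mean 49/gene" stated in the table.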
Figure 2. Allele frequencies of 1,633 SNPs in the Caucasian population (a random subset of the 19,495 annotated SNPs on chromosome 6p21) by type of functional change. In the two bottom tertiles of the frequency distribution, SNP allele frequencies were on average 4% lower (P = 0.004) for SNPs that insert a new transcription factor binding site (red line) than for SNPs that destroyed a binding site or were not found in any motif (black line).
Ten questions that help during the SNP genotyping triage process
| [1] | Is the SNP density 1/800 bp or higher? | The gene of interest may need to be resequenced. |
| [2] | Is the SNP located inside of a DNA repeat? | This SNP could be a sequencing artefact. |
| [3] | Has a particular SNP been submitted by at least two reliable sources? | The genomic sequence may not be polymorphic. |
| [4] | Is the SNP seen jointly in different public (dbSNP, HGBASE) and private (ALLSNPS, REALSNP, CELERA) databases? | The SNP may not exist. |
| [5] | Are allele frequencies available in the target population? | The SNP may not exist in the target population. |
| [6] | Is the SNP located in a known functional motif? | This will increase the likelihood of a causative mutation. |
| [7] | Is the SNP situated in a region that is conserved in other species? | Highly conserved regions have an increased likelihood of being functionally important. |
| [8] | Are there more SNPs in the neighborhood? | These SNPs may interfere with primer design. |
| [9] | Is the SNP also seen in paralogous genomic regions ("multiple genome hitters")? | These SNPs could interfere with genotyping by creating artificial alleles. |
| [10] | Does the SNP tag any specific haplotype? | Tagging SNPs can help to capture effects of neighboring SNPs in linkage disequilibrium. |
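The ten questions above lend themselves to a simple rule-based check. The sketch below is one possible reading, not the authors' implementation: the `SnpRecord` fields and the `triage_notes` function are hypothetical names, and which answer (yes or no) triggers each note is an assumption inferred from the wording of the table. Only the 1/800 bp density threshold is taken from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SnpRecord:
    """Hypothetical per-SNP annotation; all field names are illustrative."""
    gene_snp_count: int            # SNPs known in the gene of interest
    gene_length_bp: int            # length of that gene in base pairs
    in_repeat: bool                # SNP lies inside a DNA repeat
    n_submitters: int              # independent reliable submissions
    in_public_db: bool             # e.g. dbSNP, HGBASE
    in_private_db: bool            # e.g. ALLSNPS, REALSNP, CELERA
    has_population_frequency: bool # allele frequency known in target population
    in_functional_motif: bool
    in_conserved_region: bool
    n_neighbor_snps: int           # other SNPs close to this one
    n_genome_hits: int             # genomic locations matching the SNP context
    tags_haplotype: bool


def triage_notes(snp: SnpRecord) -> List[str]:
    """Return one note per question whose assumed trigger condition is met."""
    notes = []
    # [1] Density below 1 SNP per 800 bp (threshold from the table).
    if snp.gene_snp_count * 800 < snp.gene_length_bp:
        notes.append("[1] low SNP density: gene may need to be resequenced")
    if snp.in_repeat:
        notes.append("[2] inside a repeat: possible sequencing artefact")
    if snp.n_submitters < 2:
        notes.append("[3] fewer than two sources: may not be polymorphic")
    if not (snp.in_public_db and snp.in_private_db):
        notes.append("[4] not seen in both public and private databases")
    if not snp.has_population_frequency:
        notes.append("[5] no allele frequency in the target population")
    if snp.in_functional_motif:
        notes.append("[6] in a functional motif: more likely causative")
    if snp.in_conserved_region:
        notes.append("[7] in a conserved region: more likely functional")
    if snp.n_neighbor_snps > 0:
        notes.append("[8] neighboring SNPs may interfere with primer design")
    if snp.n_genome_hits > 1:
        notes.append("[9] multiple genome hitter: risk of artificial alleles")
    if snp.tags_haplotype:
        notes.append("[10] tagging SNP: captures neighbors in LD")
    return notes


# Made-up example record, for illustration only.
example = SnpRecord(
    gene_snp_count=10, gene_length_bp=16_000, in_repeat=True,
    n_submitters=1, in_public_db=True, in_private_db=False,
    has_population_frequency=True, in_functional_motif=True,
    in_conserved_region=False, n_neighbor_snps=0, n_genome_hits=1,
    tags_haplotype=False,
)
for note in triage_notes(example):
    print(note)
```

Returning notes rather than a single pass/fail verdict keeps the sketch close to the table, which lists considerations for triage and does not define a scoring scheme.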