
SNP-Flankplus: SNP ID-centric retrieval for SNP flanking sequences.

Cheng-Hong Yang, Yu-Huei Cheng, Li-Yeh Chuang, Hsueh-Wei Chang.

Abstract

The flanking sequences provided by NCBI dbSNP are usually short and of fixed length, with no option for extension, which makes it difficult to design appropriate PCR primers. Here, we introduce a tool named "SNP-Flankplus", which provides a web environment for retrieving SNP flanking sequences from both the dbSNP and the Nucleotide databases of NCBI. Two SNP ID types, rs# and ss#, are accepted as queries, and flanking sequences of adjustable length can be retrieved for at least sixteen organisms. AVAILABILITY: This software is freely available at http://bio.kuas.edu.tw/snp-flankplus/


Keywords:  PCR; SNP; flanking sequences; primer design

Year:  2008        PMID: 19238236      PMCID: PMC2637961          DOI: 10.6026/97320630003147

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Single nucleotide polymorphisms (SNPs) are the most commonly encountered genetic variants. Many primer design tools, such as Primer3 [1], provide suitable polymerase chain reaction (PCR) primers for PCR-based SNP genotyping methods. A longer template sequence gives more room for optimal primer design; however, the SNP flanking sequences provided in NCBI dbSNP [2] are not always long enough for regular primer design. FESD [3] offered a "SNPflank" function that identified flanking sequences of customizable length for human SNPs from rs# input alone, but it is currently inaccessible. To offer longer template sequences for the desired SNPs in genotyping experiments, such as TaqMan real-time PCR [4], PCR-RFLP [5], and PCR-CTPP [6], we introduce SNP-Flankplus for on-line retrieval of flanking sequences of target SNPs in sixteen organism genomes.

Methodology

The system design, algorithm and database of the program are described below.

Algorithm

The program uses the sequence of the accession number corresponding to each SNP, together with the SNP contig position, to obtain a flanking sequence of the specified length. To save memory while reading the accession sequence, the system employs a "block location" approach, which splits the accession sequence into multiple blocks. Only the block that contains the required sequence (the block hit) is loaded into memory; it is located by algorithm 1 (under supplementary material). When the flanking length extends beyond a single block, nearby blocks are also used, i.e. (block hit - d) or (block hit + d), where d is the size of the block extension and is calculated by algorithm 2 (under supplementary material).
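The supplementary algorithms are not reproduced here, so the following is only a minimal sketch of how such a block lookup might work. It assumes fixed-size blocks, 0-based positions, and a symmetric flank; the function names and the way d is derived are illustrative assumptions, not the authors' algorithm 1 or algorithm 2.

```python
from typing import List, Tuple


def split_into_blocks(sequence: str, block_size: int) -> List[str]:
    """Split an accession sequence into consecutive fixed-size blocks."""
    return [sequence[i:i + block_size] for i in range(0, len(sequence), block_size)]


def locate_blocks(snp_pos: int, flank_len: int, block_size: int, seq_len: int) -> Tuple[int, int]:
    """Return (hit, d): the block holding the SNP and the number of
    neighbouring blocks needed on the wider side (assumed definition of d)."""
    if not 0 <= snp_pos < seq_len:
        raise ValueError("SNP position lies outside the sequence")
    hit = snp_pos // block_size
    start = max(0, snp_pos - flank_len)           # first base of the window
    end = min(seq_len - 1, snp_pos + flank_len)   # last base of the window
    d = max(hit - start // block_size, end // block_size - hit)
    return hit, d


def flank_from_blocks(blocks: List[str], block_size: int, seq_len: int,
                      snp_pos: int, flank_len: int) -> str:
    """Assemble the SNP plus its flanks from blocks hit - d .. hit + d only."""
    hit, d = locate_blocks(snp_pos, flank_len, block_size, seq_len)
    first = max(0, hit - d)
    last = min(len(blocks) - 1, hit + d)
    window = "".join(blocks[first:last + 1])      # only these blocks are touched
    offset = first * block_size                   # sequence index of window[0]
    start = max(0, snp_pos - flank_len)
    end = min(seq_len, snp_pos + flank_len + 1)   # exclusive end
    return window[start - offset:end - offset]
```

In a real deployment the blocks would presumably be read from disk on demand rather than held in a list; the list here only keeps the sketch self-contained.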

Database

The source data are retrieved on-line from NCBI dbSNP and Nucleotide [4] and are constantly updated.

Result

Input

The four main input interfaces in SNP-Flankplus are as follows: (1) single Reference cluster ID (rs#) input; (2) single NCBI Assay ID (ss#) input; (3) multiple SNP ID (rs# and ss#) input by pasting; and (4) multiple rs# and ss# input by uploading a file (Figure 1a). Users can enter one or more SNP IDs (rs# or ss#) for sixteen organisms when querying SNP information. For ss# input, the system first looks up the corresponding rs# and then retrieves the SNP information related to that rs#. The SNP information contains allele information, submitted SNPs and other data for the RefSNP cluster. Users can set the desired flanking length for the design of feasible primer sets. Two flanking length options are available: the length can be set to one of the default values in the range of 300 to 1000 bp, or it can be set to the maximum length allowed by the corresponding contig accession (Figure 1b).

Figure 1. A web snapshot. (a) Four input interfaces. (b) SNP information and adjustable flanking length. (c) File or text output.
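As a rough illustration of the multiple-ID input, pasted or uploaded text could be scanned for rs# and ss# identifiers as sketched below. The function name and the tolerance for mixed separators and letter case are assumptions made for this example, not documented behavior of SNP-Flankplus.

```python
import re
from typing import List, Tuple

# "rs" or "ss" followed by digits, matched as a whole token.
_SNP_ID = re.compile(r"\b(rs|ss)(\d+)\b", re.IGNORECASE)


def parse_snp_ids(text: str) -> List[Tuple[str, int]]:
    """Extract (id_type, number) pairs in order of appearance.

    id_type is normalised to lower case; anything that does not look
    like an rs# or ss# identifier is ignored.
    """
    return [(m.group(1).lower(), int(m.group(2))) for m in _SNP_ID.finditer(text)]
```

Each ss# entry found this way would then need to be mapped to its rs# before the SNP information is retrieved, as described above.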

Output

The flanking sequence output is given in FASTA format and can be viewed on-line or exported as a file and/or text. It contains the SNP ID (rs#), allele name, chromosome position of the SNP, contig position of the SNP, organism source, contig accession and the corresponding sequence position, SNP type, sequence type, and case sensitivity. These fields are separated by the "|" symbol. The maximum flanking length is limited by the corresponding contig accession. Three properties of the flanking sequence can be adjusted in real time: (1) SNP type: general nucleotides, alleles, or IUPAC format; (2) sequence type: original, reverse, complementary, or antisense; and (3) case: upper or lower case (Figure 1c).
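The output options can be illustrated with a short sketch. It assumes that "antisense" means the reverse complement and that the allele format is written as "[A/G]"; both are assumptions, and the function names are hypothetical. The IUPAC ambiguity codes themselves are standard.

```python
# Complement table for unambiguous DNA bases (case is preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard IUPAC ambiguity codes for sets of two or more bases.
_IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def convert_sequence(seq: str, seq_type: str) -> str:
    """Return the sequence in one of the four sequence types."""
    if seq_type == "original":
        return seq
    if seq_type == "reverse":
        return seq[::-1]
    if seq_type == "complementary":
        return seq.translate(_COMPLEMENT)
    if seq_type == "antisense":  # assumed here to be the reverse complement
        return seq.translate(_COMPLEMENT)[::-1]
    raise ValueError(f"unknown sequence type: {seq_type}")


def render_snp(alleles: str, snp_type: str) -> str:
    """Render substitution alleles such as "A/G" as "[A/G]" or as an IUPAC code."""
    parts = alleles.upper().split("/")
    if snp_type == "alleles":
        return "[" + "/".join(parts) + "]"
    if snp_type == "iupac":
        return _IUPAC[frozenset(parts)]  # KeyError for indels or single alleles
    raise ValueError(f"unknown SNP type: {snp_type}")


def apply_case(seq: str, upper: bool = True) -> str:
    """Convert the sequence to upper or lower case."""
    return seq.upper() if upper else seq.lower()
```

For example, `convert_sequence("AACG", "antisense")` yields the reverse complement of the input, and `render_snp("A/G", "iupac")` yields the single-letter code for a purine.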

Conclusion

SNP-Flankplus employs a real-time update mechanism, and two SNP ID types (rs# and ss#) for sixteen organisms can be entered to obtain the latest SNP information and sequences. The maximum flanking length that can be retrieved depends on the corresponding contig accession.

References

1.  Primer3 on the WWW for general users and for biologist programmers.

Authors:  S Rozen; H Skaletsky
Journal:  Methods Mol Biol       Date:  2000

2.  dbSNP: the NCBI database of genetic variation.

Authors:  S T Sherry; M H Ward; M Kholodov; J Baker; L Phan; E M Smigielski; K Sirotkin
Journal:  Nucleic Acids Res       Date:  2001-01-01

3.  FESD: a Functional Element SNPs Database in human.

Authors:  Hyo Jin Kang; Kyoung Oak Choi; Byung-Dong Kim; Sangsoo Kim; Young Joo Kim
Journal:  Nucleic Acids Res       Date:  2005-01-01

4.  Assessment of two flexible and compatible SNP genotyping platforms: TaqMan SNP Genotyping Assays and the SNPlex Genotyping System.

Authors:  Francisco M De la Vega; Katherine D Lazaruk; Michael D Rhodes; Michael H Wenz
Journal:  Mutat Res       Date:  2005-06-03

5.  SNP-RFLPing: restriction enzyme mining for SNPs in genomes.

Authors:  Hsueh-Wei Chang; Cheng-Hong Yang; Phei-Lang Chang; Yu-Huei Cheng; Li-Yeh Chuang
Journal:  BMC Genomics       Date:  2006-02-17

6.  Competitive amplification and unspecific amplification in polymerase chain reaction with confronting two-pair primers.

Authors:  Nobuyuki Hamajima; Toshiko Saito; Keitaro Matsuo; Kazuo Tajima
Journal:  J Mol Diagn       Date:  2002-05
