Chao Li, Yuhua Li, Xiangmin Zhang, Phillip Stafford, Valentin Dinu.
Abstract
BACKGROUND: Restriction enzymes can produce easily definable segments from DNA sequences by using a variety of cut patterns. There are, however, no software tools that can aid in gene building -- that is, modifying wild-type DNA sequences to express the same wild-type amino acid sequences but with enhanced codons, specific cut sites, unique post-translational modifications, and other engineered-in components for recombinant applications. A fast DNA pattern design algorithm, ICRPfinder, is provided in this paper and applied to find or create potential recognition sites in target coding sequences.
Year: 2009 PMID: 19747395 PMCID: PMC2746817 DOI: 10.1186/1471-2105-10-286
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Creating a DNA pattern using silent mutations. Wild-type TP53 and modified TP53 with a NotI restriction enzyme recognition site (GCGGCCGC), shown in the green rectangle. Base pairs in red are silent mutations. The number 1000 represents the location of the base pair in the TP53 DNA sequence.
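The idea behind Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ICRPfinder's actual algorithm: the function names `place_site` and `candidate_positions` are hypothetical, the coding sequence is assumed to be in frame with a length that is a multiple of 3, and the standard genetic code is assumed. Rather than the full TP53 sequence, it uses a short 15-bp coding sequence, ATCCGTGGGCGTGAG (encoding IRGRE), and asks where the NotI site GCGGCCGC could be written using only synonymous codon changes.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame coding sequence (length divisible by 3)."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def fits(codon, start, site, pos):
    """True if `codon`, placed at index `start`, agrees with `site` placed at `pos`."""
    for k in range(3):
        j = start + k - pos  # index into the site, if this base overlaps it
        if 0 <= j < len(site) and codon[k] != site[j]:
            return False
    return True


def place_site(cds, site, pos):
    """Return `cds` with `site` at 0-based `pos` using only silent mutations, else None."""
    if pos < 0 or pos + len(site) > len(cds):
        return None
    out = list(cds)
    # Only the codons overlapped by the site need to be examined.
    for c in range(pos // 3, (pos + len(site) - 1) // 3 + 1):
        start = 3 * c
        old = cds[start:start + 3]
        options = [cand for cand, aa in CODON_TO_AA.items()
                   if aa == CODON_TO_AA[old] and fits(cand, start, site, pos)]
        if not options:
            return None  # no synonymous codon is compatible with the site here
        # Keep the original codon when it already fits, to minimise changes.
        out[start:start + 3] = old if old in options else options[0]
    return "".join(out)


def candidate_positions(cds, site):
    """All 0-based positions where `site` can be created without changing the protein."""
    return [p for p in range(len(cds) - len(site) + 1)
            if place_site(cds, site, p) is not None]


wild_type = "ATCCGTGGGCGTGAG"   # encodes IRGRE
not_i = "GCGGCCGC"              # NotI recognition site
positions = candidate_positions(wild_type, not_i)
modified = place_site(wild_type, not_i, positions[0])
print(positions, modified, translate(modified) == translate(wild_type))
```

Because each candidate position only inspects the few codons the site overlaps, the work per position stays constant regardless of how long the coding sequence is.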
Possible DNA sequences that must be looked up in the brute-force approach
| I | R | G | R | E |
| ATT | CGT | GGT | CGT | GAA |
| ATC | CGC | GGC | CGC | GAG |
| ATA | CGA | GGA | CGA | |
| | CGG | GGG | CGG | |
| | AGA | | AGA | |
| | AGG | | AGG | |
There are a total of 864 (3 × 6 × 4 × 6 × 2 = 864) possible DNA sequences, created by combining one codon from each position. Any combination, for example ATTCGTGGTCGTGAA or the original sequence ATCCGTGGGCGTGAG, translates to the same amino acid sequence, IRGRE. The brute-force approach is not practical because the number of possible sequences grows exponentially with the length of the coding sequence.
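The count above can be checked with a short brute-force enumeration. The following is a minimal sketch, assuming the standard genetic code; the helper names `synonymous_sequences` and `count_synonymous` are illustrative and do not come from ICRPfinder.

```python
from itertools import product
from math import prod

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_synonymous(protein):
    """Number of DNA sequences encoding `protein`: product of per-residue codon counts."""
    return prod(len(SYNONYMS[aa]) for aa in protein)


def synonymous_sequences(protein):
    """Yield every DNA sequence that translates to `protein` (brute force)."""
    for combo in product(*(SYNONYMS[aa] for aa in protein)):
        yield "".join(combo)


all_seqs = list(synonymous_sequences("IRGRE"))
print(len(all_seqs), count_synonymous("IRGRE"))

# The count multiplies with every added residue, which is why full enumeration
# does not scale to whole genes.
print(count_synonymous("IRGRE" * 10))
```

Enumeration is feasible for five residues, but `count_synonymous` shows how quickly the search space grows: repeating the same five residues ten times already yields 864 to the power of 10 candidate sequences.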
Figure 2. The GUI of ICRPfinder for specific pattern analysis. The target DNA sequence is accepted in plain or FASTA format. Restriction enzymes can be chosen, and user-defined pattern sequences are also accepted.
Figure 3. The results of finding possible recognition sites for a specific restriction enzyme. BamHI recognition sites are displayed. Lines of different colors are used for better visualization, especially when results are close to each other. The sequence and position of each recognition site pop up when the mouse hovers over the sequence.
Figure 4. The GUI of ICRPfinder for specific position analysis. A specific position is accepted as input. All patterns, including restriction enzymes and user-defined patterns, are analyzed.
Figure 5. The results of creating unique recognition sites around a specific position. All enzymes that recognise only one site in the target TP53 DNA sequence are displayed. Lines of different colors are used for better visualization, especially when results are close to each other. The sequence and position of each recognition site pop up when the mouse hovers over the sequence.
Performance benchmark for ICRPfinder, NEBCutter and WatCut
| ICRPfinder 1 | Wild type, Silent mutation | 1 | Whole coding sequence | 0.03 | 0.21 |
| ICRPfinder 2 | Wild type, Silent mutation | 2406 | Specific region | 0.53 | 0.52 |
| NEBCutter | Wild type | 1 | Whole coding sequence | 3-6 | 6-12 |
| WatCut | Wild type, Silent mutation | 246 (generic enzymes) | Whole coding sequence | 2-3* | |
The performance of ICRPfinder, NEBCutter and WatCut is compared on the same data. In the first column, ICRPfinder 1 denotes the function that finds all potential positions for a specific pattern, and ICRPfinder 2 denotes the function that finds all unique patterns in a given region.
*Only 100 bp were used to run WatCut, since the application limits the length of the input coding sequence to 100 bp.