Ravi Gupta, Divya Sarthi, Ankush Mittal, Kuldip Singh.
Abstract
The identification and analysis of repetitive patterns are active areas of biological and computational research. Tandem repeats in telomeres play a role in cancer, and hypervariable trinucleotide tandem repeats are linked to over a dozen major neurodegenerative genetic disorders. In this paper, we present an algorithm that identifies exact and inexact repeat patterns in DNA sequences, based on the orthogonal exactly periodic subspace decomposition technique. Using a new measure, our algorithm resolves problems such as whether a repeat pattern has period P or a multiple of it (i.e., 2P, 3P, etc.), along with several other problems present in previous signal-processing-based algorithms. The algorithm identifies repeats in O(N L(w) log L(w)) time, where N is the length of the DNA sequence and L(w) is the window length. It operates in two stages. In the first stage, each nucleotide is analyzed separately for periodicity. In the second stage, the periodicity information of the individual nucleotides is combined to identify the tandem repeats. Experiments were run on datasets containing exact and inexact repeats, and the results show the effectiveness of the approach.
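The two-stage idea can be illustrated with a minimal sketch. This is not the authors' implementation, and it does not reproduce their measure or their O(N L(w) log L(w)) procedure. It only shows, for a single window, how per-nucleotide indicator sequences can be projected onto periodic subspaces and how the energy already explained by proper divisors of a period can be subtracted, so that a period-P repeat is not reported as 2P or 3P. The function names (`indicator_sequences`, `periodic_energy`, `exact_energy`, `best_period`) are hypothetical, and the sketch assumes the window length is a multiple of every candidate period.

```python
import numpy as np

def indicator_sequences(dna):
    # Stage 1 input: one binary sequence per nucleotide (assumed encoding).
    return {b: np.array([1.0 if c == b else 0.0 for c in dna]) for b in "ACGT"}

def periodic_energy(x, p):
    # Energy of the orthogonal projection of x onto all p-periodic signals.
    # Assumes len(x) is a multiple of p.
    folded = x.reshape(-1, p)          # each row is one period of length p
    mean = folded.mean(axis=0)         # best-fitting p-periodic template
    return (len(x) / p) * float(np.sum(mean ** 2))

def exact_energy(x, p):
    # Energy attributable to period p alone: subtract the energy already
    # explained by every proper divisor of p (including the constant, d = 1).
    e = periodic_energy(x, p)
    for d in range(1, p):
        if p % d == 0:
            e -= exact_energy(x, d)
    return e

def best_period(window, periods):
    # Stage 2: combine the per-nucleotide energies and pick the strongest period.
    seqs = indicator_sequences(window)
    scores = {p: sum(exact_energy(x, p) for x in seqs.values()) for p in periods}
    return max(scores, key=scores.get), scores

period, scores = best_period("ACGACGACGACG", [2, 3, 4, 6])
print(period, scores)
```

For the window `ACGACGACGACG`, the combined score peaks at period 3, while the score for period 6 is essentially zero because all of its periodic energy is already accounted for by periods 1 and 3.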
Year: 2007 PMID: 17713591 PMCID: PMC3171338 DOI: 10.1155/2007/43596
Source DB: PubMed Journal: EURASIP J Bioinform Syst Biol ISSN: 1687-4145