Atheer A Matroud, M D Hendy, C P Tuffley.
Abstract
We introduce the software tool NTRFinder to search for a complex repetitive structure in DNA we call a nested tandem repeat (NTR). An NTR is a recurrence of two or more distinct tandem motifs interspersed with each other. We propose that NTRs can be used as phylogenetic and population markers. We have tested our algorithm on both real and simulated data, and present some real NTRs of interest. NTRFinder can be downloaded from http://www.maths.otago.ac.nz/~aamatroud/.
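To make the idea of interspersed tandem motifs concrete, the following is a minimal sketch that builds a synthetic sequence in which tandem runs of one motif are separated by single copies of a second motif. The layout (runs of `x` joined by one copy of `X`), the function name `build_ntr` and the example motifs are assumptions chosen for illustration. They are not taken from NTRFinder, and exact copies are used here although real repeats are typically approximate.

```python
def build_ntr(x, X, run_lengths):
    """Build a synthetic sequence: x repeated run_lengths[0] times, then X,
    then x repeated run_lengths[1] times, then X, and so on.

    Assumed layout for illustration only; copies are exact.
    """
    runs = [x * s for s in run_lengths]  # tandem runs of motif x
    return X.join(runs)                  # one copy of X between adjacent runs


# Hypothetical motifs and run lengths.
seq = build_ntr("AC", "GGT", (3, 1, 2))
print(seq)
```

A run length of 0 yields an empty run, so two copies of `X` end up adjacent in the output.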
Year: 2011 PMID: 22121222 PMCID: PMC3273788 DOI: 10.1093/nar/gkr1070
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Flowchart of the NTRFinder algorithm.
Figure 2. Percentage of NTRs found in the synthetic sequences.
NTRs found in some IGS sequences retrieved from GenBank and in an additional unpublished sequence (C. esculenta)
| Species and accession number | NTR structure | Motif lengths (\|x\| first) | Start index | End index |
|---|---|---|---|---|
| | (0,1,2,3,4,2,3,2,3,3,3,5,4,3,2,3,3,1,2,5,3,5,4,2,2,4,3) | 10, 13 | 960 | 2111 |
| | (9,12,2,6,2,5,4,1) | 21, 30 | 1403 | 2605 |
| | (1,1,1,3,3,1,3,2,1,1,2,1,3,1,1,2,1,2,1,1,2) | 30, 44 | 1031 | 2902 |
| | (1,1,1,3,3,1,3,1,3,2,1,1,2,1,3,1,1,2,1,2,1,1,2) | 30, 44 | 1036 | 3133 |
| | (1,2,2,3,3,2,2,2,3) | 12, 45 | 385 | 1337 |
| | (6,4,4,7,4,4,4,3,1) | 21, 51 | 1558 | 2580 |
| | (5,3,1,6,10,5,10,9,13,14,15,4) | 11, 48 | 725 | 2384 |
| | (1,1,2,2,1,2,4,2,1) | 20, 46 | 1016 | 1969 |
| | (3,2,1,1) | 13, 17 | 32 189 | 32 365 |
| | (2,2,1) | 19, 52 | 2984 | 3113 |
| | (3,1,3,6,5,6,4,3,4) | 75, 11 | 961 | 3743 |
| | (3,1,1,1,1,0) | 107, 91 | 6363 | 7642 |
NTRs found in the Human Y chromosome
| NTR structure | Motif lengths (\|x\| first) | Start index | End index |
|---|---|---|---|
| (1,2,2,1,2,1,2,1,1,2,2,2,2,1) | 12, 56 | 143 865 | 144 880 |
| (7,22,23,12,14,4) | 2, 88 | 234 183 | 234 767 |
| (1,1,2,1,1,1,1,,1,1,1,1,1,1,1,1,1,1,1,1) | 15, 14 | 465 369 | 466 397 |
| (1,1,1,2,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) | 11, 16 | 647 659 | 649 721 |
| (17,15,31,28,72,62) | 2, 49 | 901 237 | 902 037 |
| (3,6,8,11,7,6,4,4,5,4,11) | 12, 32 | 1 279 754 | 1 280 875 |
| (26,27,25,25,25,20,17,13,26) | 1, 48 | 1 397 128 | 1 397 735 |
| (1,2,1,2,1,2,2,2,2,2,1,2,1,1,1,2,2,2,1,2,2,1,1) | 16, 22 | 1 516 157 | 1 517 560 |
| (1,1,2,6,2,2,2,1,2) | 19, 35 | 1 626 578 | 1 627 258 |
| (1,1,1,0,2,1) | 19, 56 | 2 102 194 | 2 102 594 |
| (2,2,2,1,2,1,1,1,1,2,6) | 21, 15 | 2 164 541 | 2 165 091 |
Figure 3. Running time of NTRFinder (on a Pentium Dual core T4300 2.1 GHz) plotted against segment length on a log–log scale. The search was performed on segments of different lengths, with the minimum and maximum TR lengths set to 8 and 50, respectively. The distribution of points suggests that the running time is approximately linear in the sequence length.
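On a log-log plot, a power-law relationship t = c·n^k appears as a straight line with slope k, so approximately linear running time corresponds to a slope close to 1. The sketch below estimates that slope by ordinary least squares on the log-transformed values. The function name `loglog_slope` and the timing values are hypothetical placeholders, generated here from an exactly linear model. They are not measurements from NTRFinder.

```python
import math


def loglog_slope(lengths, times):
    """Least-squares slope of log10(time) against log10(length)."""
    xs = [math.log10(n) for n in lengths]
    ys = [math.log10(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: times generated from t = 2e-6 * n (exactly linear).
lengths = [10_000, 100_000, 1_000_000, 10_000_000]
times = [2e-6 * n for n in lengths]

slope = loglog_slope(lengths, times)
print(f"estimated log-log slope: {slope:.3f}")
```

With real timings, a slope noticeably above 1 would indicate super-linear growth over the measured range.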