Gábor Tóth, Gábor Deák, Endre Barta, György B Kiss.
Abstract
Identification of dispersed or interspersed repeats, most of which are derived from transposons, retrotransposons or retrovirus-like elements, is an important step in genome annotation. Software tools that compare genomic sequences with precompiled repeat reference libraries using sensitive similarity-based methods provide reliable means of finding the positions of fragments homologous to known repeats. However, their output is often incomplete and fragmented owing to mutations (nucleotide substitutions, deletions or insertions) that can result in considerable divergence from the reference sequence. Merging these fragments to identify the whole region that represents an ancient copy of a mobile element is challenging, particularly if the element is large and has suffered multiple deletions or insertions. Here we report PLOTREP, a tool designed to post-process results obtained by sequence similarity search and merge fragments belonging to the same copy of a repeat. The software allows rapid visual inspection of the results using a dot-plot like graphical output. The web implementation of PLOTREP is available at http://bioinformatics.abc.hu/PLOTREP/.
Year: 2006 PMID: 16845104 PMCID: PMC1538846 DOI: 10.1093/nar/gkl263
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The diagram explains the terms used in the description of the algorithm and helps to interpret the 2D plot. Matching fragments are shown as red diagonals (1–8). Fragments 1–5 are in positive orientation, while fragments 6–8 are in negative orientation. The absolute offset of the diagonals is calculated as indicated for fragment 3 as an example. The offset differences separating fragments 1, 2 and 3 are small; therefore, they can be grouped together as the initial step of merging. Similarly, fragments 4 and 5 can also be grouped. A gap is a discontinuity with a small offset difference between flanking fragments, such as 6 and 7. Insertions and deletions are defined with respect to the reference sequence on the vertical axis, and their presence is examined after the groups of ‘same-diagonal’ fragments are formed. An insertion or a deletion is characterized by a large offset difference and adjacency of the flanking fragments in the reference sequence or in the query, respectively. Depending on the parameters, fragments belonging together can be merged as shown by the black lines 9 and 10.
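The grouping and classification rules described in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not PLOTREP's actual implementation: the `Fragment` class, the function names, the thresholds `max_offset_diff` and `max_adjacent`, and the offset formulas (query start minus reference start for positive orientation, query start plus reference end for negative orientation) are all assumptions made for illustration, since the exact calculation is shown only in the figure itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    """One matching fragment (hypothetical record; coordinates stored with start < end)."""
    q_start: int  # start in the query (horizontal axis)
    q_end: int    # end in the query
    r_start: int  # start in the reference (vertical axis)
    r_end: int    # end in the reference
    strand: str = "+"

    @property
    def offset(self) -> int:
        # Assumed definition: a value that stays constant along one diagonal.
        if self.strand == "+":
            return self.q_start - self.r_start
        return self.q_start + self.r_end  # anti-diagonal for negative orientation


def group_same_diagonal(fragments: List[Fragment],
                        max_offset_diff: int = 20) -> List[List[Fragment]]:
    """Group consecutive fragments (ordered along the query) with similar offsets."""
    groups: List[List[Fragment]] = []
    for strand in ("+", "-"):
        ordered = sorted((f for f in fragments if f.strand == strand),
                         key=lambda f: f.q_start)
        current: List[Fragment] = []
        for frag in ordered:
            # Start a new group when the offset jumps by more than the threshold.
            if current and abs(frag.offset - current[-1].offset) > max_offset_diff:
                groups.append(current)
                current = []
            current.append(frag)
        if current:
            groups.append(current)
    return groups


def classify_junction(left: Fragment, right: Fragment,
                      max_offset_diff: int = 20,
                      max_adjacent: int = 20) -> str:
    """Classify the discontinuity between two positive-orientation fragments."""
    if abs(right.offset - left.offset) <= max_offset_diff:
        return "gap"
    # Large offset difference: check which axis the flanks are adjacent on.
    if abs(right.r_start - left.r_end) <= max_adjacent:
        return "insertion"  # adjacent in the reference, far apart in the query
    if abs(right.q_start - left.q_end) <= max_adjacent:
        return "deletion"   # adjacent in the query, far apart in the reference
    return "unrelated"


if __name__ == "__main__":
    a = Fragment(100, 200, 1, 101)
    b = Fragment(210, 300, 112, 202)
    c = Fragment(800, 900, 205, 305)
    d = Fragment(905, 1000, 600, 695)
    print([len(g) for g in group_same_diagonal([a, b, c, d])])
    print(classify_junction(a, b), classify_junction(b, c), classify_junction(c, d))
```

In this toy input, fragments `a` and `b` lie on nearly the same diagonal, `b` and `c` flank extra query sequence, and `c` and `d` flank missing reference sequence. Whether flanking groups are then merged into one region would depend on user-set parameters, as the caption notes.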
Figure 2. An example of PLOTREP results. A genomic query sequence was searched against a small user-supplied library containing LTR and internal sequences of an LTR retrotransposon. (A) A diagram summarizing all matching hits and those merged by PLOTREP, providing an overall picture of repeat positions. (B) A table listing positions of merged regions along with the positions of insertions, deletions and virtual gaps within these regions. (C) A 2D dot-plot like diagram displaying the comparison between the query (on the horizontal axis) and a library reference sequence (here in combined LTR–internal–LTR structure on the vertical axis). All matching fragments are shown as red lines and the merged regions are depicted as black lines. (D) A dot-plot like diagram displaying the query sequence compared with itself. (E) The sequence of a merged region or a region covered by an insertion can be downloaded by clicking on the ‘S’ button. (F) Local alignments generated by CENSOR can be viewed by clicking on the ‘A’ button in the table listing the raw CENSOR output of fragment positions (this alternatively displayed table is not shown here).