Abstract
SUMMARY: Repetitive elements comprise a large proportion of many genomes and affect both genome evolution and regulation. Their classification and the study of their evolutionary history are a major emerging field. Various software tools exist to date to classify and map repeats across genomes. The major unresolved drawback, however, is the fragmented nature of many identified repeat loci, which makes the classification of novel repeats and their evolutionary analysis difficult. To address this, we developed a pipeline (RepeatCraft) that integrates results from several repeat element classification tools based on both sequence similarity and structural features. The pipeline defragments closely spaced repeat loci in the genome, reconstructing longer copies and thus allowing better annotation and sequence comparison. The pipeline also includes a user interface that runs in a web browser, allowing easy access to and exploration of the repeat data.
Year: 2019 PMID: 30165587 PMCID: PMC6419915 DOI: 10.1093/bioinformatics/bty745
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. (A) Schematic diagram showing how RepeatCraft groups repeat ‘fragments’ based on their coverage of the consensus sequence (blue track) and the distance between consecutive repeats. By default, RepeatCraft merges only consecutive repeats (strict merging). The ‘loose’ merge also considers non-consecutive, closely spaced repeats and retains the annotation of other short repeats (e.g. simple repeats) lying between the fragments. (B) The range plot panel of the web application provides a track-based visualization of the RepeatCraft result, similar to a genome browser. The first track shows the annotation from RepeatMasker and the second track shows the improved annotation from RepeatCraft. The remaining tracks display the annotations from other tools (e.g. LTR_FINDER).
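The strict and loose merging modes described in the caption can be illustrated with a minimal Python sketch. This is not RepeatCraft's actual code: the `Repeat` record, the `merge_fragments` function, and the `max_gap` and `max_cons_overlap` thresholds (with their default values) are hypothetical names and numbers chosen for illustration. The sketch assumes two fragments belong together when they share a family, lie within `max_gap` bases of each other, and cover successive parts of the consensus.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Repeat:
    chrom: str
    start: int        # genomic start
    end: int          # genomic end
    family: str       # repeat family name
    cons_start: int   # start of the match in the consensus
    cons_end: int     # end of the match in the consensus


def can_merge(a: Repeat, b: Repeat, max_gap: int, max_cons_overlap: int) -> bool:
    """True if b plausibly continues a (same family, close by, later in consensus)."""
    return (
        a.chrom == b.chrom
        and a.family == b.family
        and b.start - a.end <= max_gap
        and b.cons_start >= a.cons_end - max_cons_overlap
    )


def join(a: Repeat, b: Repeat) -> Repeat:
    """Combine two fragments into one longer copy."""
    return Repeat(a.chrom, a.start, max(a.end, b.end), a.family,
                  min(a.cons_start, b.cons_start), max(a.cons_end, b.cons_end))


def merge_fragments(repeats: List[Repeat], max_gap: int = 150,
                    max_cons_overlap: int = 10, loose: bool = False) -> List[Repeat]:
    """Strict mode joins only consecutive repeats; loose mode may skip over
    nearby intervening repeats, which are kept in the output unchanged."""
    out: List[Repeat] = []
    for r in sorted(repeats, key=lambda x: (x.chrom, x.start)):
        merged = False
        for i in range(len(out) - 1, -1, -1):
            prev = out[i]
            if can_merge(prev, r, max_gap, max_cons_overlap):
                out[i] = join(prev, r)
                merged = True
                break
            # Strict mode looks only at the immediately preceding repeat;
            # loose mode keeps looking back while still within max_gap.
            if not loose or prev.chrom != r.chrom or r.start - prev.end > max_gap:
                break
        if not merged:
            out.append(r)
    return out


frags = [
    Repeat("chr1", 100, 200, "L1", 1, 100),
    Repeat("chr1", 210, 230, "(CA)n", 1, 20),   # simple repeat in between
    Repeat("chr1", 240, 340, "L1", 101, 200),
]
strict_result = merge_fragments(frags)
loose_result = merge_fragments(frags, loose=True)
```

With this toy input, strict merging leaves all three annotations separate because the two L1 fragments are not consecutive, whereas loose merging joins them into one copy spanning 100 to 340 and keeps the intervening simple repeat.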