Eric Disdero, Jonathan Filée.
Abstract
BACKGROUND: Population genomic analysis of transposable elements (TEs) has greatly benefited from recent advances in sequencing technologies. However, the short length of the reads and the propensity of transposable elements to nest in highly repeated regions of genomes limit the efficiency of bioinformatic tools when Illumina or 454 technologies are used. Fortunately, long-read sequencing technologies that generate reads able to span the entire length of full transposons are now available. However, existing TE population genomics software was not designed to handle long reads, and new dedicated tools are needed.
Keywords: Long read sequence; Population genomic; Structural variation; Transposable element
Year: 2017 PMID: 28405230 PMCID: PMC5385071 DOI: 10.1186/s13100-017-0088-x
Source DB: PubMed Journal: Mob DNA
Fig. 1 Simplified workflow of the Presence/Absence module. Green and red bars indicate different flanking sequences; large black arrows represent TEs
Fig. 2 Simplified workflow of the New insertion module. Green, red, yellow and purple bars indicate different flanking sequences; large black and blue arrows represent TEs
Fig. 3 Performance test of LoRTE according to the PacBio read coverage. a Percentage of the TEs annotated in the Drosophila melanogaster genome that were recovered by the program. b Percentage of the insertions/deletions artificially introduced into the synthetic reads that were identified. c Numbers of new TE deletions and insertions found in the genuine reads and absent from the reference genome. d Numbers of polymorphic TE deletions and insertions found in the real PacBio reads and absent from the reference genome
Fig. 4 Family distribution of the total number of new TE insertions and deletions found in the Drosophila melanogaster PacBio reads, regardless of read coverage, and absent from the reference genome. Polymorphic/heterozygous events are included