Mariko Nakagome, Elena Solovieva, Akira Takahashi, Hiroshi Yasue, Hirohiko Hirochika, Akio Miyao.
Abstract
BACKGROUND: Detecting transposition events of transposable elements (TEs) in a genome from next-generation sequencing (NGS) short reads has been difficult because the nucleotide sequence of a TE is itself repetitive, which makes it hard for NGS alignment programs to locate its insertions. We have developed a program with a new algorithm to detect transpositions from NGS data.
Year: 2014 PMID: 24629057 PMCID: PMC4004357 DOI: 10.1186/1471-2105-15-71
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Schematic presentation of Tos17 insertion and the target site duplication (TSD), and the principle of the TIF algorithm. A. During Tos17 insertion, a 5 bp sequence flanking Tos17 (shown in red letters) is duplicated. B. NGS short reads containing an end of the Tos17 sequence (shown in blue letters) are searched for and then grouped by TSD.
Figure 2. Basic TIF algorithm. The input is short-read sequences in FASTQ format from a sequencer such as the Illumina HiSeq2000. The output is written in FASTA format.
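The principle described in Figures 1 and 2 can be illustrated with a minimal Python sketch. It is not the TIF program itself: the function names, the `min_flank` threshold, and the head and tail sequences used in any call are hypothetical placeholders, and the sketch assumes that a read carrying the TE head yields an upstream flank ending in the TSD, while a read carrying the TE tail yields a downstream flank starting with the same TSD.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fastq_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()


def find_flanks(reads, head, tail, tsd_len=5, min_flank=10):
    """Group flanking sequences by candidate TSD (illustrative only).

    head / tail: the first and last bases of the TE (placeholders here).
    Returns {tsd: (upstream_flanks, downstream_flanks)} for TSDs that are
    supported by reads on both sides of the insertion.
    """
    upstream = defaultdict(set)    # flanks that end right before the TE head
    downstream = defaultdict(set)  # flanks that start right after the TE tail
    for read in reads:
        # A read may come from either strand, so check both orientations.
        for seq in (read, revcomp(read)):
            i = seq.find(head)
            if i >= min_flank:
                flank = seq[:i]
                upstream[flank[-tsd_len:]].add(flank)
            j = seq.find(tail)
            if j != -1:
                flank = seq[j + len(tail):]
                if len(flank) >= min_flank:
                    downstream[flank[:tsd_len]].add(flank)
    return {
        tsd: (sorted(upstream[tsd]), sorted(downstream[tsd]))
        for tsd in upstream
        if tsd in downstream
    }


def to_fasta(groups):
    """Format grouped flanks as FASTA text, one record per flank."""
    records = []
    for tsd, (ups, downs) in sorted(groups.items()):
        for n, flank in enumerate(ups, 1):
            records.append(f">{tsd}_head_{n}\n{flank}")
        for n, flank in enumerate(downs, 1):
            records.append(f">{tsd}_tail_{n}\n{flank}")
    return "\n".join(records)
```

Requiring support from both the head side and the tail side is one simple way to discard flanks whose terminal bases match only by chance; the real program may apply different or additional filters.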
Run time (Real) for basic TIF and RelocaTE
| Sample | | Basic TIF | RelocaTE |
|---|---|---|---|
| ttm2 | 292 333 698 | 20 m 3.811 s | 103 m 40.315 s |
| ttm5 | 294 687 288 | 18 m 0.557 s | 105 m 16.471 s |
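To make the run-time comparison concrete, the following sketch converts the reported times to seconds and computes their ratio. It assumes, following the order in the table title, that the first time in each row belongs to basic TIF and the second to RelocaTE; the helper names are illustrative.

```python
import re


def to_seconds(text):
    """Convert a string such as '20 m 3.811 s' to seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*m\s*([\d.]+)\s*s\s*", text)
    if match is None:
        raise ValueError(f"unrecognized time: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Times copied from the table; column assignment is an assumption (see above).
RUN_TIMES = {
    "ttm2": ("20 m 3.811 s", "103 m 40.315 s"),
    "ttm5": ("18 m 0.557 s", "105 m 16.471 s"),
}


def speedup(sample):
    """Return the ratio of the second (RelocaTE) time to the first (TIF) time."""
    tif, relocate = RUN_TIMES[sample]
    return to_seconds(relocate) / to_seconds(tif)


for name in RUN_TIMES:
    print(f"{name}: {speedup(name):.2f}x")
```

Under that assumption, the ratio is a little over 5 for ttm2 and close to 6 for ttm5.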
Figure 3. TIF output of short reads for ttm2. The FASTQ files of ttm2 were passed directly to the TIF program. TSDs are shown in red letters.
Sensitivity and specificity of TIF
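| Head/tail sequence length | TSD length | ttm2: flanking sequences (a) | ttm2: loci (b), BLAST 2.2.26 (c) | ttm2: loci (b), BLAST 2.2.29+ (d) | ttm5: flanking sequences (a) | ttm5: loci (b), BLAST 2.2.26 (c) | ttm5: loci (b), BLAST 2.2.29+ (d) |
|---|---|---|---|---|---|---|---|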
| 21 | 5 | 28 | 4 | 3 | 52 | 11 | 10 |
| 20 | 5 | 28 | 4 | 3 | 56 | 12 | 11 |
| 19 | 5 | 29 | 4 | 3 | 56 | 12 | 11 |
| 18 | 5 | 29 | 4 | 3 | 57 | 12 | 11 |
| 17 | 5 | 34 | 4 | 3 | 63 | 12 | 11 |
| 16 | 5 | 43 | 4 | 3 | 74 | 12 | 11 |
| 15 | 5 | 93 | 3 | 2 | 122 | 12 | 11 |
| 14 | 5 | 167 | 3 | 2 | 190 | 12 | 11 |
| 13 | 5 | 424 | 3 | 2 | 433 | 12 | 11 |
| 12 | 5 | 1616 | 3 | 2 | 1677 | 12 | 11 |
| 11 | 5 | 4038 | 3 | 2 | 4142 | 12 | 11 |
(a) Numbers of flanking sequences detected with head or tail sequences. (b) Numbers of loci detected on the reference genome by (c) BLAST 2.2.26 and (d) BLAST 2.2.29+.
Specificity of TSD length by basic TIF
| Head/tail sequence length | TSD length | ttm2 | | ttm5 | |
|---|---|---|---|---|---|
| 17 | 3 | 4 | 0 | 12 | 2 |
| 17 | 4 | 2 | 0 | 2 | 0 |
| 17 | 5 | 12 | 4 | 24 | 11 |
| 17 | 6 | 0 | 0 | 0 | 0 |
| 17 | 7 | 0 | 0 | 0 | 0 |
Assignment of TSDs flanking transposed Tos17 in ttm2 and ttm5 to the genome of the original line, Nipponbare
| | | Line | Chromosome | Position (b) | Position (b) | TSD length (bp) | TSD | TSD | Direction | |
|---|---|---|---|---|---|---|---|---|---|---|
| TIF | RelocaTE | ttm2 | chr04 | 30 259 052 | 30 259 056 | 5 | GTTTC | GTTTC | Forward | Yes |
| TIF | RelocaTE | ttm2 | chr05 | 1 925 905 | 1 925 909 | 5 | CTATC | CTATC | Forward | Yes |
| TIF | RelocaTE | ttm2 | chr10 | 22 134 718 | 22 134 714 | 5 | CTTGC | CTTGC | Reverse | Yes |
| TIF | RelocaTE | ttm2 | chr10 | 22 531 003 | 22 531 007 | 5 | ACTTT | ACTTT | Forward | Yes |
| TIF | | ttm5 | chr01 | 34 453 645 | 34 453 641 | 5 | CTTTG | CTTTG | Reverse | Yes |
| TIF | RelocaTE | ttm5 | chr02 | 1 004 769 | 1 004 765 | 5 | ATACC | ATACC | Reverse | Yes |
| TIF | RelocaTE | ttm5 | chr02 | 31 596 628 | 31 596 632 | 5 | CTAAT | CTAAT | Forward | Yes |
| TIF | RelocaTE | ttm5 | chr03 | 741 226 | 741 222 | 5 | GCTGC | GCTGC | Reverse | Yes |
| TIF | RelocaTE | ttm5 | chr03 | 8 304 678 | 8 304 674 | 5 | GAATA | GAATA | Reverse | Yes |
| TIF | | ttm5 | chr06 | 24 967 881 | 24 967 877 | 5 | TGCAT | TGCAT | Reverse | Yes |
| TIF | | ttm5 | chr07 | 20 064 391 | 20 064 395 | 5 | CTTAT | CTTAT | Forward | Yes (c) |
| | | | | 20 080 552 | 20 080 556 | | | | | |
| TIF | RelocaTE | ttm5 | chr09 | 12 970 618 | 12 970 614 | 5 | CATGC | CATGC | Reverse | Yes |
| | RelocaTE | ttm5 | chr10 | 14 739 090 | 14 739 094 | 5 | | GAACT | Forward | No |
| TIF | | ttm5 | chr10 | 19 069 885 | 19 069 889 | 5 | ACTTG | ACTTG | Forward | Yes |
| TIF | RelocaTE | ttm5 | chr10 | 21 583 054 | 21 583 058 | 5 | CTTAT | CTTAT | Forward | Yes |
| TIF | RelocaTE | ttm5 | chr12 | 2 155 899 | 2 155 895 | 5 | GGAAC | GGAAC | Reverse | Yes |
(a) The original Tos17 copies are located from 26 694 799 to 26 698 904 on chromosome 7 and from 15 415 378 to 15 419 573 on chromosome 10.
(b) Positions of the first base of the flanking sequences adjacent to the tail and head sequences of Tos17.
(c) The TSD on chromosome 7 in ttm5 is located within one of two very similar sequences; thus, it was difficult to determine its position conclusively from the short read sequences.
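In the rows above, the two coordinates of each locus differ by 4, which is consistent with a 5 bp TSD, and rows whose second coordinate is larger are the ones labeled Forward. A small sketch that checks this relationship on any row follows; the function names are illustrative, and reading the two numeric columns as the two ends of the TSD is an assumption drawn from the table values rather than from the authors' code.

```python
def parse_position(text):
    """Convert a space-grouped coordinate such as '30 259 052' to an int."""
    return int(text.replace(" ", ""))


def describe_tsd(first, second):
    """Return (TSD length, direction) implied by two coordinate strings."""
    start, end = parse_position(first), parse_position(second)
    length = abs(end - start) + 1
    direction = "Forward" if end > start else "Reverse"
    return length, direction


# Two rows copied from the table: ttm2 chr04 and ttm2 chr10.
print(describe_tsd("30 259 052", "30 259 056"))
print(describe_tsd("22 134 718", "22 134 714"))
```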
Figure 4. Confirmation of insertions detected by TIF and RelocaTE programs with PCR/electrophoresis. Genomic DNAs of ttm2 and ttm5 were subjected to PCR using the triple-primer method (see Methods), and the PCR products were electrophoresed in 1.5% agarose gels with molecular weight markers, followed by detection of amplified fragments with ethidium bromide. P represents the molecular weight marker φX174/HincII (Toyobo, Osaka, Japan); N, Nipponbare; 2, ttm2; 5, ttm5; W, wild-type; M, homozygous Tos17 insertion; H, heterozygous Tos17 insertion. Tos17 transposed loci are indicated by chromosome number and the start position of the TSDs on the genome sequence.