Juan Paolo A Sicat, Paul Visendi, Steven O Sewe, Sophie Bouvaine, Susan E Seal.
Abstract
BACKGROUND: Whiteflies are agricultural pests that reduce crop yields globally, at times causing severe economic losses and food insecurity. The Bemisia tabaci whitefly species complex is the most damaging in terms of its broad crop host range and its ability to serve as a vector for over 400 plant viruses. Genomes of whiteflies belonging to this species complex have provided valuable genomic data; however, transposable elements (TEs) within these genomes remain unexplored. This study provides the first accurate characterization of TE content within the B. tabaci species complex.
Keywords: Bemisia tabaci; Bioinformatics; DNA transposons; TE annotation; Transposable elements; Whitefly
Year: 2022 PMID: 35440097 PMCID: PMC9017028 DOI: 10.1186/s13100-022-00270-6
Source DB: PubMed Journal: Mob DNA
Repetitive elements identified in the three whitefly genomes
| 29.25 | 18.07 | 25.28 | 15.66 | 16.48 | 23.42 | 25.94 | 12.92 | 19.86 | |
| 0.86 | 2.6 | 0.61 | 2.65 | 0.42 | 1.72 | ||||
| 0.96 | 0.61 | 1.25 | 3.18 | 0.57 | 0.96 | 0.44 | 0.38 | 0.94 | |
| 0.16 | 0.04 | 0.17 | 0.96 | 0.04 | 0.18 | 0.16 | 0.04 | 0.08 | |
| 0.49 | 0.21 | 1.19 | 18.5 | 0.19 | 1.51 | 0.08 | 0.07 | 0.7 | |
| 12.96 | 0 | 16.26 | 1.99 | 0 | 14.81 | 11.9 | 0 | 15.22 | |
| 43.82 | 18.92 | 44.14 | 40.29 | 17.28 | 40.88 | 38.52 | 13.41 | 36.8 | |
Results of the identification of TEs as reported by the respective genome studies, using the last publicly available RepBase library (RepBase RepeatMasker-edition20180826), and using the custom-built repeat library built with the workflow described in the study
RepeatMasker output using the RepeatMasker library and the species-specific custom-built library for the Drosophila melanogaster genome (release 6 [50])
| DNA | 1.79 | 1.21 |
| LINE | 4.93 | 4.50 |
| SINE | < 0.001 | 0.00 |
| LTR | 10.68 | 10.22 |
| Unclassified | 0.04 | 0.34 |
| Total Interspersed Repeats | 17.44 | 16.88 |
Comparison of the results of the identification of TEs in the D. melanogaster genome using the RepeatMasker RepBase library and the species-specific repeat library. The custom-built repeat library was built using the workflow described in the study
Fig. 1 Distribution of transposable elements in each respective genome. Stacked bar chart illustrating the length of each genome and the length occupied by the TEs in each genome. RepeatMasker with the species-specific repeat library was used to identify the TE content of each genome
Fig. 2 Percent proportion of transposable elements and assembly size. Each genome was plotted according to its TE proportion and assembly size. TE proportion in the six genomes is positively correlated with the size of the genome assembly (p = 0.006). The grey shaded area represents the 95% confidence interval, while the blue line is the regression line (r = 0.93)
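As a rough consistency check on the correlation reported for Fig. 2, the sketch below computes a Pearson coefficient and the two-sided p-value implied by a given r and sample size. It assumes a standard t-test on the correlation coefficient with n - 2 degrees of freedom; the source does not state which test was used. The function names (`pearson_r`, `two_sided_p`) are hypothetical, and r is taken as approximately 0.93 with n = 6 genomes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def _t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(r, n, steps=2000):
    """Two-sided p-value for H0: no correlation (assumed t-test, df = n - 2)."""
    if abs(r) >= 1:
        return 0.0
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the density between 0 and t
    # (steps must be even).
    h = t / steps
    total = _t_pdf(0.0, df) + _t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)

# Assumed inputs: r read as 0.93, six genome assemblies.
p = two_sided_p(0.93, 6)
```

Under these assumptions the p-value comes out below 0.01, the same order of magnitude as the reported p = 0.006; the small gap is what rounding r to two decimals would produce.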
Fig. 3 Percent proportion of each order of transposable elements in the non-whitefly and whitefly groups. Box plots comparing the percent genome proportion of each order of TEs between the non-whitefly and the whitefly groups. The box represents the interquartile range (25th to 75th percentile), and the line in the middle of the box represents the median (50th percentile). The upper whisker represents the values 1.5 times larger than the 75th percentile and the lower whisker represents the values smaller than the 25th percentile. (A) Overview of the distribution of TE classes between the non-whitefly and whitefly groups. The majority of the TEs identified were DNA transposons, which were most abundant in the whitefly genomes. (B) Comparison of the distribution of retrotransposons between the non-whitefly and whitefly groups. The SINE distribution varies significantly across the non-whitefly group: the D. citri genome assembly has the highest proportion at 3%, while none was detected in the M. persicae genome assembly
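The box plot elements described for Fig. 3 can be made concrete with a small sketch. It assumes the common Tukey convention, in which whiskers reach the most extreme data points lying within 1.5 times the interquartile range of the box; whether the figure used exactly this convention is an assumption, as are the function name `box_stats` and the sample values, which are illustrative and not taken from the study.

```python
import statistics

def box_stats(values, k=1.5):
    """Summary statistics behind a Tukey-style box plot."""
    data = sorted(values)
    # Quartiles with linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest value within the fences
        "whisker_high": inside[-1],  # largest value within the fences
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Illustrative percentages only (hypothetical, not data from the paper).
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

With groups as small as three genomes each, the quartiles are heavily interpolated, so the boxes are best read as a visual summary of the spread and not as precise estimates.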
Repeat Superfamilies identified within the genomes
| MEAM1 | 48 | 20 | 9 | 5 | 82 |
| MED/Q | 49 | 20 | 6 | 4 | 79 |
| SSA-ECA | 44 | 18 | 9 | 4 | 75 |
| | 43 | 18 | 5 | 1 | 67 |
| | 30 | 23 | 6 | 7 | 66 |
| | 36 | 18 | 7 | 0 | 61 |
The table summarizes the number of superfamilies found in each class of TEs in each of the genomes. DNA: DNA transposons; LINE: long interspersed nuclear elements; SINE: short interspersed nuclear elements; LTR: long terminal repeats
Number of clusters shared across the three B. tabaci genomes
| ALL | 216 | 174 | 120 | 2 | 220 | 1 | 733 |
| MEAM1 and MED/Q | 273 | 212 | 179 | 4 | 318 | 1 | 987 |
| MEAM1 and SSA-ECA | 133 | 74 | 56 | 0 | 283 | 0 | 546 |
| MED/Q and SSA-ECA | 126 | 74 | 56 | 0 | 183 | 2 | 441 |
| Total | 748 | 534 | 411 | 6 | 1004 | 4 | 2707 |
The table summarizes the number of clusters identified as shared across the B. tabaci genomes. The clusters were created from the TE consensus sequences from the six genomes included in the study. DNA: DNA transposons; LINE: long interspersed nuclear elements; SINE: short interspersed nuclear elements; LTR: long terminal repeats
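One way such a table of shared clusters could be tallied is sketched below. It assumes each cluster is recorded with its TE class and the set of genomes contributing consensus sequences, and that the sharing categories are mutually exclusive (all three B. tabaci genomes, or exactly two), which is inferred from the rows summing to the Total row. The data structure, the function names and the example clusters are hypothetical.

```python
from collections import Counter

B_TABACI = ("MEAM1", "MED/Q", "SSA-ECA")

def sharing_label(genomes):
    """Return the sharing category of a cluster, or None if not shared."""
    present = [g for g in B_TABACI if g in genomes]
    if len(present) == 3:
        return "ALL"
    if len(present) == 2:
        return " and ".join(present)
    return None  # zero or one B. tabaci genome: not a shared cluster

def tally_shared(clusters):
    """Count shared clusters per (sharing category, TE class)."""
    counts = Counter()
    for te_class, genomes in clusters.values():
        label = sharing_label(genomes)
        if label is not None:
            counts[(label, te_class)] += 1
    return counts

# Hypothetical clusters: id -> (TE class, genomes with a member sequence).
example_clusters = {
    "c1": ("DNA", {"MEAM1", "MED/Q", "SSA-ECA"}),
    "c2": ("DNA", {"MEAM1", "MED/Q", "D. citri"}),
    "c3": ("LINE", {"MEAM1", "SSA-ECA"}),
    "c4": ("LTR", {"MED/Q", "M. persicae"}),
}
counts = tally_shared(example_clusters)
```

Members from non-whitefly genomes do not affect the category in this sketch; only the B. tabaci genomes represented in a cluster determine its row.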
Fig. 4 Repeat landscapes of the B. tabaci genomes. The repeat landscapes illustrate the activity of the different classes of transposable elements found in the three B. tabaci genomes. Sequence divergence was measured using Kimura distance, shown on the x-axis; the percent coverage of each element in the genome is shown on the y-axis. Elements with low sequence divergence represent more recent transposable element activity, while elements with higher sequence divergence represent older transposition events
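For readers unfamiliar with the x-axis of Fig. 4, the following is a minimal sketch of the plain Kimura two-parameter distance between a TE copy and its consensus, K = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. It assumes the two sequences are already aligned and omits any CpG adjustment that repeat-landscape tools may apply; the function name `kimura_2p` is hypothetical and this is not presented as the pipeline used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def kimura_2p(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))

# One transition in ten aligned positions (illustrative sequences).
d = kimura_2p("AAAAAAAAAA", "GAAAAAAAAA")
```

A copy identical to its consensus has distance 0 and sits at the left edge of a landscape, which is why peaks near zero are read as recent activity.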