Renè Massimiliano Marsano, Daniela Leronni, Pietro D'Addabbo, Luigi Viggiano, Eustachio Tarasco, Ruggiero Caizzi.
Abstract
A set of 67 novel LTR-retrotransposons has been identified by in silico analyses of the Culex quinquefasciatus genome using the LTR_STRUC program. Phylogenetic analysis shows that the 29 novel and putatively functional LTR-retrotransposons detected belong to the Ty3/gypsy group. Considering only families that contain potentially autonomous LTR-retrotransposons, our results show that these elements account for about 1% of the genome of C. quinquefasciatus. Previous studies have estimated that 29% of the genome of C. quinquefasciatus is occupied by mobile genetic elements. The potential role of retrotransposon insertions strictly associated with host genes is described and discussed, along with the possible origin of a retrotransposon with a peculiar Primer Binding Site region. Finally, we report the presence of a group of 38 retrotransposons carrying tandemly repeated sequences but lacking coding potential, and apparently lacking "master copy" elements from which they could have originated. The features of the repetitive sequences found in these non-autonomous LTR retrotransposons are described, and their possible role is discussed. These results integrate the existing data on the genomics of an important vector of virus-borne diseases.
Year: 2012 PMID: 22383973 PMCID: PMC3286476 DOI: 10.1371/journal.pone.0030770
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Evolutionary relationships of C. quinquefasciatus LTR-retrotransposons.
Phylogenetic relationships of the LTR retrotransposons based on the amino acid alignment of the conserved RT, RNase H and INT domains. The clades containing retrotransposons detected in this paper are indicated with different colors, and the most common tRNA complementary to the PBS is indicated for each homogeneous group. Elements from this study are indicated as “cpgypsy_” followed by a number. AAGYPSY# elements are LTR retrotransposons identified in previous analyses [17]. The N-J bootstrap values supporting the internal branches are indicated at the nodes. Only bootstrap values greater than 50% are reported. Bel-like elements were used as outgroup. Note that, for families composed of two or more copies (see table 1), representative elements (see file S1) were used for the phylogenetic analyses.
Structural features of the C. quinquefasciatus LTR-retrotransposons detected.
| Lineage | Family | copies | Element | length | LTRs | %LNI | ORFs | PBS | TSD | supercont |
| Mag | cqgypsy_8 | 3 | cqgypsy_8.1 | 4993 | 179 | 100 | 1 | Leu | gtcac | 3.1653 |
| Mag | cqgypsy_11 | 11 | cqgypsy_11.1 | 5129 | 169/181 | 99 | 2 | Ser | ataa | 3.429 |
| Mag | cqgypsy_21 | 2 | cqgypsy_21.1 | 6184 | 287 | 99 | 2 | Ser | tcctt | 3.770 |
| Mag | cqgypsy_25 | 6 | cqgypsy_25.1 | 7859 | 198 | 99 | 2 | Arg | ggaag | 3.176 |
| Mag | cqgypsy_32 | 3 | cqgypsy_32.1 | 4918 | 190 | 99.5 | 1 | Leu | ggaat | 3.540 |
| | | | cqgypsy_32.2 | 4779 | 182 | 97.8 | 3 | Leu | attac | 3.1290 |
| Mag | cqgypsy_38 | 5 | cqgypsy_38.1 | 4472 | 164 | 97.6 | 2 | Ser | cctgg | 3.723 |
| | | | cqgypsy_38.2 | 4534 | 164 | 97.6 | 2 | Ser | ttaat | 3.1068 |
| | | | cqgypsy_38.3 | 3248 | 119 | 97.3 | 2 | Ser | attcc | 3.1314 |
| Mag | cqgypsy_51 | 5 | cqgypsy_51.1 | 4904 | 179 | 100 | 2 | Ser | acctg | 3.1151 |
| | | | cqgypsy_51.2 | 6291 | 179 | 98.9 | 2 | Ser | gacac | 3.243 |
| | | | cqgypsy_51.3 | 4575 | 188 | 100 | frag | Ser | aacac | 3.1291 |
| Gypsy | cqgypsy_13 | 7 | cqgypsy_13.1 | 7249 | 302 | 100 | 3 | Thr | tatata | 3.734 |
| Mdg3 | cqgypsy_29 | 4 | cqgypsy_29.1 | 5316 | 264 | 100 | 2 | Leu | gttg | 3.462 |
| | | | cqgypsy_29.2 | 5343 | 263 | 99.6 | 2 | Leu | atag | 3.168 |
| Osvaldo | cqgypsy_1 | 3 | cqgypsy_1.1 | 11926 | 2137 | 99 | 2 | Lys | ggtt | 3.62 |
| Osvaldo | cqgypsy_3 | 7 | cqgypsy_3.1 | 10049 | 1591/1596 | 99 | 2 | Lys | aagt | 3.349 |
| Osvaldo | cqgypsy_7 | 5 | cqgypsy_7.1 | 6914 | 997 | 99 | 1 | Lys | aagt | 3.38 |
| Osvaldo | cqgypsy_56 | 14 | cqgypsy_56.1 | 9473 | 1384 | 100 | 2 | Lys | aaat | 3.568 |
| | | | cqgypsy_56.2 | 9393 | 1369/1368 | 95.1 | 2 | Lys | ttat | 3.83 |
| Osvaldo | cqgypsy_60 | 48 | cqgypsy_60.1 | 10172 | 1308 | 99.3 | 2 | Lys | acaac | 3.133 |
| | | | cqgypsy_60.2 | 10164 | 1310 | 99.9 | frag | Lys | actt | 3.784 |
| | | | cqgypsy_60.3 | 9985 | 1224 | 99.6 | 2 | Lys | cagg | 3.215 |
| Osvaldo | cqgypsy_65 | 20 | cqgypsy_65.1 | 10447 | 1565 | 99 | 2 | Lys | aacc | 3.254 |
“Lineage” indicates the major lineage to which each family belongs; “copies” gives the estimated copy number detected by BLAST analysis; the copies listed in column “Element” are those identified by the LTR_STRUC program; “length” indicates the overall element length; “LTRs” indicates the LTR length; “%LNI” is the percent nucleotide identity between the two LTRs; “ORFs” indicates the number of ORFs detected in each element; “PBS” indicates the tRNA complementary to the Primer Binding Site; “TSD” shows the target sequence duplicated upon insertion; “supercont” indicates the supercontig where a given element was identified.
Note that two values are reported in the LTRs column if the two LTRs of an element differ in size. “frag” indicates fragmented coding regions.
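The %LNI column can be illustrated with a minimal sketch. The record does not say how the identity between the two LTRs was computed, so the following assumes the simplest definition: the two LTRs are already aligned (equal length, `-` for gaps), and identity is the share of aligned columns carrying the same base. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(ltr5: str, ltr3: str) -> float:
    """Percent of aligned columns in which both LTRs carry the same base.

    Assumption: the inputs are pre-aligned sequences of equal length,
    with '-' marking gaps. Columns where both sequences have a gap are
    ignored; a gap opposite a base counts as a mismatch.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a.upper(), b.upper())
        for a, b in zip(ltr5, ltr3)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 10-column alignment with a single mismatch.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

Under this definition, two LTRs of different length (for example the 169/181 pair of cqgypsy_11.1) can never reach 100, because every unpaired base is a mismatch column.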
Figure 2. Evolutionary relationships of Osvaldo-like elements of C. quinquefasciatus LTR-retrotransposons.
Phylogenetic relationships of the Osvaldo-like retrotransposons based on the amino acid alignment of the conserved RT, RNase H and INT domains. CPGYPSY5 and AAGYPSY# are LTR retrotransposons identified in previous analyses [17]. Elements “gypsy ELE ###” were retrieved from the TEfam database. The N-J bootstrap values supporting the internal branches are indicated at the nodes. Only bootstrap values greater than 50% are reported. Bel-like elements were used as outgroup.
Figure 3. Organization of the LTR-PBS region of cqgypsy_1.
A) The tRNA sequences inserted into the 5′LTR of the cqgypsy_1 element. The LTR sequence is colored in red, while the PBS sequence is colored in blue. The red bar indicates the duplicated sequence surrounding the putative Twin element. Each of the tRNA halves of the putative Twin is highlighted in turquoise (tRNA-Lys) or in yellow (tRNA-Glu). B) tRNAscan output showing the secondary structure of the two halves of the insertion as a cloverleaf structure. C) Local alignment results of cqgypsy_1 with gypsy_Ele180 and gypsy_Ele185. The aligned region corresponds to the 5′LTR (red)/PBS (black) boundary.
Features of the non-autonomous LTR retrotransposons identified in this paper.
| Element | supercont | length | LTR | PBS | TSD | Rep Position | Period | Copies | Entropy | % |
| cqUNK_3 | 3.322 | 14602 | 315/337 | Leu | attcc | 3252–3314 | 2 | 31.5 | 1.10 | 0.9 |
| | | | | | | 13294–13367 | 33 | 2.2 | 1.63 | |
| cqUNK_5 | 3.49 | 6237 | 931 | Asn | nd | 2701–2763 | 12 | 5.3 | 1.28 | 14.2 |
| | | | | | | 2933–3723 | 164 | 4.8 | 1.97 | |
| | | | | | | 5366–5398 | 17 | 1.9 | 1.94 | |
| cqUNK_7 | 3.506 | 2267 | 182 | Arg | ggtgc | 608–1551 | 117 | 8.1 | 1.96 | 41.6 |
| cqUNK_10 | 3.176 | 3540 | 194 | Ser | agaag | 1278–1780 | 144 | 3.5 | 1.91 | 14.2 |
| cqUNK_12 | 3.654 | 5450 | 280 | Leu | acaag | 3411–4140 | 88 | 8.3 | 1.94 | 20.1 |
| | | | | | | 4176–4541 | 50 | 7.3 | 1.95 | |
| cqUNK_14 | 3.563 | 4105 | 176 | Arg | ggcta | 1564–3072 | 93 | 16.7 | 1.97 | 36.5 |
| cqUNK_16 | 3.54 | 6178 | 334 | Ser | nd | 2066–2417 | 159 | 2.2 | 2.00 | 5.7 |
| cqUNK_18 | 3.688 | 4616 | 205 | Asp | acaga | 2097–2230 | 72 | 1.9 | 1.91 | 14.2 |
| | | | | | | 3114–3334 | 51 | 4.3 | 1.96 | |
| | | | | | | 3345–3646 | 48 | 6.3 | 1.92 | |
| cqUNK_20 | 3.707 | 6208 | 519 | Ser | atctg | 2130–2843 | 39 | 18.2 | 1.97 | 11.5 |
| cqUNK_22 | 3.589 | 8814 | 182 | Ser | tactc | 1373–3671 | 164 | 14.0 | 1.97 | 40.1 |
| | | | | | | 5792–6779 | 164 | 6.0 | 1.97 | |
| | | | | | | 7247–7500 | 27 | 9.7 | 1.91 | |
| | | | | | | 7526–7852 | 160 | 2.0 | 1.80 | |
| cqUNK_24 | 3.393 | 4023 | 185 | Ser | ttcat | 401–444 | 9 | 5.1 | 1.40 | 9.2 |
| | | | | | | 3370–3696 | 41 | 8.0 | 1.84 | |
| cqUNK_26 | 3.220 | 3444 | 337 | Ser | cagcc | 593–2143 | 150 | 10.3 | 1.89 | 51.4 |
| | | | | | | 2332–2552 | 73 | 3.0 | 1.64 | |
| cqUNK_28 | 3.537 | 5520 | 197 | Arg | caccc | 942–1803 | 72 | 12.0 | 1.99 | 15.6 |
| | | | | | | 3221–3349 | 65 | 2.0 | 1.69 | |
| cqUNK_31 | 3.1198 | 3124 | 371 | Ser | gtcca | 1001–1586 | 159 | 3.7 | 1.93 | 31.4 |
| | | | | | | 1633–2047 | 40 | 11.3 | 1.76 | |
| cqUNK_33 | 3.1048 | 5047 | 694 | Tyr | nd | 915–1217 | 167 | 1.8 | 2.00 | 6.0 |
| cqUNK_37 | 3.343 | 5389 | 189 | Met | actgg | 2792–3519 | 179 | 4.1 | 1.99 | 13.5 |
| cqUNK_39 | 3.820 | 5080 | 573/581 | Tyr | tgatg | 2829–2899 | 35 | 2.0 | 1.95 | 1.4 |
| cqUNK_42 | 3.7 | 7544 | 208 | Ser | ctatt | 691–1175 | 127 | 3.9 | 1.90 | 6.4 |
| cqUNK_45 | 3.590 | 3246 | 315 | Gln | nd | 2382–2901 | 278 | 1.9 | 1.99 | 16.0 |
For each non-autonomous element, the table reports the supercontig in which a representative element can be found, the overall length, the LTR size, and the tRNA complementary to the PBS. The position, period and copy number of the repeated DNA contained in each element are also indicated. The entropy value gives an estimate of the complexity of the repeats (see main text). The portion of the element occupied by repeats, as a percentage of its total size, is also indicated (column %).
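The two derived columns of this table, % and Entropy, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: repeat positions are taken as 1-based inclusive coordinates and as non-overlapping, and entropy is assumed to be the Shannon entropy of the nucleotide composition in bits (maximum 2.0 for four equally frequent bases), which is consistent with the 0 to 2.00 range in the table but is not stated in this record. The function names are hypothetical.

```python
import math
from collections import Counter


def repeat_coverage_percent(element_length, repeat_spans):
    """Percent of the element covered by repeats.

    Assumption: spans are (start, end) pairs, 1-based inclusive,
    and do not overlap each other.
    """
    covered = sum(end - start + 1 for start, end in repeat_spans)
    return 100.0 * covered / element_length


def nucleotide_entropy(seq):
    """Shannon entropy (bits) of the base composition of seq.

    Assumption: this is the kind of measure behind the Entropy column;
    it ranges from 0 (one base only) to 2 (A, C, G, T equally frequent).
    """
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# cqUNK_7: one repeat at 608-1551 in a 2267-long element.
print(round(repeat_coverage_percent(2267, [(608, 1551)]), 1))
# cqUNK_5: three repeats in a 6237-long element.
print(round(repeat_coverage_percent(6237, [(2701, 2763), (2933, 3723), (5366, 5398)]), 1))
```

With these assumptions the sketch reproduces the tabulated % for cqUNK_7 and cqUNK_5; other rows may differ if the published figures were computed with a different convention.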
The contribution of LTR-retrotransposons to C. quinquefasciatus gene organization.
| Element | Interaction | Description | GENE ID | Supercont:position |
| Cqgypsy_2 | Within intron | Dual specificity tyrosine-phosphorylation-regulated kinase | CPIJ004687 | 3.72: 556,787–583,090 |
| | Exon-Intron junction | 5′-3′ exoribonuclease, putative | CPIJ016423 | 3.746: 154,787–166,602 |
| | 1–2 Kbp upstream | fimbrin/plastin | CPIJ004008 | 3.57: 387,682–393,342 |
| | 2–3 Kbp downstream | allergen, putative | CPIJ018993 | 3.1504: 58,993–71,374 |
| | 0–1 Kbp downstream | Adenylyltransferase and sulfurtransferase MOCS3 | CPIJ001621 | 3.19: 246,047–247,568 |
| Cqgypsy_5 | 1–2 Kbp downstream | chaperonin | CPIJ013429 | 3.475: 239,574–242,207 |
| | Exon-Intron junction | 40S ribosomal protein S2 | CPIJ012693 | 3.480: 10,936–19,727 |
| | 4–5 Kbp upstream | serine threonine-protein kinase | CPIJ018896 | 3.1443: 17,857–21,694 |
| Cqgypsy_8 | 1–2 Kbp upstream | suppressor of ty3 | CPIJ014381 | 3.539: 307,272–309,528 |
| | 1–2 Kbp downstream | suppressor of ty3 | CPIJ014381 | 3.539: 307,272–309,528 |
| | intron | suppressor of ty3 | CPIJ014381 | 3.539: 307,272–309,528 |
| Cqgypsy_15 | Within intron | dystrophin major muscle isoform | CPIJ013032 | 3.423: 50,593–185,416 |
| Cqgypsy_21 | 0–1 Kbp upstream | flotillin-2 | CPIJ007626 | 3.148: 169,798–180,783 |
| Cqgypsy_29 | 3–4 Kbp downstream | protein phosphatase-1 | CPIJ008212 | 3.168: 687,712–708,602 |
| | 0–1 Kbp downstream | helicase | CPIJ019431 | 3.1585: 41,788–51,610 |
| Cqgypsy_47 | 2–3 Kbp downstream | sodium/iodide cotransporter | CPIJ002364 | 3.33: 789,851–792,666 |
| | 4–5 Kbp upstream | serine protease inhibitor, serpin | CPIJ012013 | 3.346: 385,437–397,533 |
| | 3–4 Kbp downstream | pre-mrna splicing factor prp17 | CPIJ011807 | 3.365: 424,340–426,137 |
| | 1–2 Kbp upstream | uridine cytidine kinase i | CPIJ016204 | 3.736: 44,070–45,560 |
| | 1–2 Kbp downstream | uridine cytidine kinase i | CPIJ016204 | 3.736: 44,070–45,560 |
| | 1–2 Kbp downstream | coatomer | CPIJ014834 | 3.606: 201,807–202,272 |
| | 1–2 Kbp downstream | poly a polymerase | CPIJ014835 | 3.606: 205,341–209,830 |
| Cqgypsy_53 | 3–4 Kbp downstream | esterase B1 precursor | CPIJ016336 | 3.777: 170,027–172,021 |
| | 4–5 Kbp upstream | ATP synthase D chain, mitochondrial | CPIJ011691 | 3.328: 179,978–180,887 |
| | 2–3 Kbp downstream | Eftud2 protein, putative | CPIJ000064 | 3.1: 1,221,757–1,228,225 |
| | 1–2 Kbp upstream | cell division protein kinase 5 | CPIJ000065 | 3.1: 1,233,232–1,234,222 |
| | Exon-Intron junction | sarcolemmal associated protein-2, putative | CPIJ011313 | 3.310: 56,964–69,006 |
| Cqgypsy_59 | 1 Kbp upstream | negative elongation factor E | CPIJ000025 | 3.1: 727,264–728,238 |
| | 4–5 Kbp downstream | semaphorin | CPIJ000027 | 3.1: 744,915–761,147 |
| | 4–5 Kbp downstream | superoxide dismutase, putative | CPIJ005173 | 3.91: 822,776–823,755 |
| | Within intron | enhancer of polycomb | CPIJ018246 | 3.1131: 70,175–80,884 |
| Cqgypsy_61 | Overlaps last exon | cytochrome c oxidase subunit I | CPIJ016836 | 3.816: 17,796–33,791 |
| Cqgypsy_64 | Exon-Intron junction | arsenite inducible RNA associated protein aip-1 | CPIJ005006 | 3.82: 520,914–524,059 |
| | Exon-Intron junction | nk homeobox protein | CPIJ019260 | 3.1511: 40,639–44,852 |
| | 2–3 Kbp downstream | 60S ribosomal protein L7 | CPIJ017548 | 3.961: 25,320–28,131 |
For each insertion detected within a gene or in its proximity (+/− 5 Kbp), the table reports the kind of interaction (upstream, downstream, exon, intron), the Vectorbase identifier of the gene, its description and its position in the supercontig.
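The distance labels in the “Interaction” column (for example “1–2 Kbp upstream”) can be illustrated with a minimal sketch. It assumes 1-kbp bins within the ±5 Kbp window described above, measured from the nearest gene boundary, with upstream and downstream defined relative to the gene strand. The function `classify_insertion`, the strand argument and the insertion coordinate in the usage line are hypothetical; the record gives neither strands nor insertion coordinates, and the sketch does not distinguish exons from introns.

```python
def classify_insertion(gene_start, gene_end, strand, insertion_pos, window=5000):
    """Label an insertion relative to a gene on the same supercontig.

    Assumptions: coordinates are 1-based with gene_start <= gene_end;
    strand is '+' or '-'; distances are binned in 1-kbp steps up to
    `window` bp. Returns None when the insertion lies outside the window.
    """
    if gene_start <= insertion_pos <= gene_end:
        return "Within gene"
    if insertion_pos < gene_start:
        distance = gene_start - insertion_pos
        side = "upstream" if strand == "+" else "downstream"
    else:
        distance = insertion_pos - gene_end
        side = "downstream" if strand == "+" else "upstream"
    if distance > window:
        return None
    lower = (distance - 1) // 1000  # 1..1000 bp falls in bin 0, and so on
    return f"{lower}–{lower + 1} Kbp {side}"


# Gene span of CPIJ004008 from the table; strand and insertion
# position are made up for illustration only.
print(classify_insertion(387682, 393342, "+", 386500))
```

Binning on `distance - 1` keeps an insertion exactly 1000 bp away in the 0–1 Kbp bin, so the five bins cover the full 5 Kbp window without gaps.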
Contribution of the non-autonomous elements identified in this paper to the formation of mature mRNAs of C. quinquefasciatus genes.
| Element | Description | gene ID | Supercontig:position |
| CqUNK_3 | ATP-dependent RNA helicase DHX8 | CPIJ011263 | 3.322: 64334–68170 |
| | cell cycle control protein cwf8 | CPIJ011261 | 3.322: 54937–56710 |
| | tRNA methyltransferase | CPIJ011262 | 3.322: 62963–64258 |
| | carboxylesterase-6 | CPIJ006908 | 3.137: 13520–18988 |
| | bombesin receptor subtype-3 | CPIJ017637 | 3.980: 85049–99690 |
| | N-acetylgalactosaminyltransferase 7 | CPIJ014647 | 3.660: 5557–7406 |
| | saposin | CPIJ014133 | 3.545: 130173–138797 |
| CqUNK_9 | dopamine beta hydroxylase | CPIJ019622 | 3.1797: 2222–17524 |
| CqUNK_19 | malate dehydrogenase | CPIJ015299 | 3.611: 78722–91118 |
| | igf2 mRNA binding protein, putative | CPIJ011349 | 3.312: 412154–436319 |
| CqUNK_23 | midasin | CPIJ010145 | 3.251: 518823–535901 |
| | centaurin-alpha 2 | CPIJ019112 | 3.1516: 3934–9235 |
| | sterol desaturase, putative | CPIJ009637 | 3.227: 524825–527033 |
| CqUNK_32 | choline O-acetyltransferase | CPIJ001609 | 3.19: 128455–132041 |
| CqUNK_42 | Transcription factor Ken 2 | CPIJ012629 | 3.427: 256203–270029 |
| | laminin gamma-3 chain | CPIJ005194 | 3.96: 903006–924273 |
| | trypsin | CPIJ004641 | 3.70: 338878–341919 |
| | f-actin capping protein alpha | CPIJ011271 | 3.322: 247301–257388 |
| | malate dehydrogenase | CPIJ008123 | 3.169: 118442–121449 |
| CqUNK_45 | monocarboxylate transporter | CPIJ008119 | 3.184: 607328–610272 |
| | pol-like protein | CPIJ018514 | 3.1248: 73743–87450 |
| | elongation factor 1 alpha | CPIJ009557 | 3.231: 370372–372795 |
| | olfactory receptor, putative | CPIJ013754 | 3.526: 317670–324857 |