Guangjie Han, Nan Zhang, Heng Jiang, Xiangkun Meng, Kun Qian, Yang Zheng, Jian Xu, Jianjun Wang.
Abstract
BACKGROUND: Short interspersed nuclear elements (SINEs) are non-long terminal repeat (non-LTR) retrotransposons whose mobilization depends on counterpart long interspersed nuclear elements (LINEs). Although 234 SINEs have been identified so far, only 23 are from insect species (SINEbase: http://sines.eimb.ru/).
Keywords: Horizontal transfer; Long interspersed nuclear elements (LINEs); Plutella xylostella; Retrotransposon; Short interspersed nuclear element (SINE)
Year: 2021 PMID: 33789582 PMCID: PMC8010984 DOI: 10.1186/s12864-021-07543-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Schematic representation of the structure of PxSE1, PxSE2, PxSE3, PxSE4 and PxSE5 in P. xylostella. The A, B, IE and C in the tRNAArg or 5S rRNA region represent the A box, B box, intermediate element and C box, respectively. PxSE1, PxSE2 and PxSE3 are tRNA-derived SINEs; PxSE4 and PxSE5 are 5S rRNA-derived SINEs
Fig. 2 Consensus sequences of PxSE1, PxSE2, PxSE3, PxSE4 and PxSE5. BmSEm, SINE2-1_PXu and HaSE3 sequences were obtained from the Repbase database. The tRNA reference is the D. melanogaster tRNAArg sequence (Accession number: V00243) and the 5S rRNA reference is from B. mori (Accession number: K03316). (a) PxSE1 and PxSE2 consensus sequences aligned with the tRNA sequence and BmSE. Nucleotides shaded in black are conserved across sequences. The underlined A box and B box sequences are the RNA pol III promoter sequences. (b) PxSE3 consensus sequence aligned with the tRNA-related region and the conserved central domain of SINE2-1_PXu. (c) PxSE4 and PxSE5 consensus sequences aligned with 5S rRNA and the 3′ region of PxLINE1.1. PxLINE1.1 is a new LINE transposon in P. xylostella
Novel SINE elements identified in this study
| SINE Family | Species | RNA Origin | Consensus Length | Tail | Copy Number^a | Divergence | Resource |
|---|---|---|---|---|---|---|---|
| PxSE1 | | tRNA | 263 | (GT)n | 6208 (68) | 0.035 | WGS |
| MsSE1 | | tRNA | 267 | (GT)n | 7513 (133) | 0.091 | WGS |
| SfSE1 | | tRNA | 298 | (ATGT)n | 11,117 (79) | 0.130 | WGS |
| SlNPVSE1 | | tRNA | 260 | (TGTTA)n | 1 (1) | ND | Nr/nt |
| SlituSE1 | | tRNA | 259 | (ATGTT)n | ND (8) | ND | EST |
| SlittSE1 | | tRNA | 270 | (ATGTT)n | ND (15) | ND | EST |
| CfSE1 | | tRNA | 252 | (ATTT)n | ND (10) | ND | EST |
| PxSE2 | | tRNA | 263 | (ATTT)n | 5056 (33) | 0.071 | WGS |
| ObSE1 | | tRNA | 275 | (TATT)n | 4521 (120) | 0.066 | WGS |
| CsSE1 | | tRNA | 287 | (TATT)n | 533 (125) | 0.036 | WGS |
| PxSE3 | | tRNA | 339 | (GAATA)n | 5158 (50) | 0.089 | WGS |
| MsSE2 | | tRNA | 333 | (TAT)n | 16,157 (126) | 0.090 | WGS |
| PgSE1 | | tRNA | 303 | (TAT)n | 1751 (66) | 0.097 | WGS |
| PmSE1 | | tRNA | 310 | (GAT)n | 5740 (138) | 0.098 | WGS |
| LaSE1 | | tRNA | 319 | (GAT)n | 3832 (52) | 0.107 | WGS |
| PzSE1 | | tRNA | 293 | (GAT)n | ND (15) | ND | TSA |
| EpSE1 | | tRNA | 300 | ND | ND (8) | ND | TSA |
| SeSE1 | | tRNA | 291 | (GAT)n | ND (43) | ND | TSA |
| PxSE4 | | 5S rRNA | 389 | (TGA)n | 4415 (50) | 0.078 | WGS |
| LaSE2 | | 5S rRNA | 348 | ND | 1214 (45) | 0.101 | WGS |
| CsSE2 | | 5S rRNA | 294 | (TGA)n | 532 (169) | 0.021 | WGS |
| ObSE2 | | tRNA | 255 | (CGAAA)n | 863 (65) | 0.012 | WGS |
| PxSE5 | | 5S rRNA | 389 | (ATGT)n | 1952 (23) | 0.132 | WGS |
ND: not determined
a: the number in brackets is the number of copies used to reconstruct the consensus sequence
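
The copy-number cells in the table above pack two values into one string: the genome-wide copy number (or ND) and, in brackets, the number of copies used for the consensus. A minimal Python sketch of how such cells could be split for downstream analysis follows; the function name `parse_copy_number` is hypothetical and not part of the study.

```python
import re
from typing import Optional, Tuple


def parse_copy_number(cell: str) -> Tuple[Optional[int], int]:
    """Split a cell such as '11,117 (79)' or 'ND(8)' into
    (genome copy number or None if ND, copies used for the consensus)."""
    match = re.fullmatch(r"\s*(ND|[\d,]+)\s*\((\d+)\)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised copy-number cell: {cell!r}")
    total_text, used_text = match.groups()
    # Thousands separators appear in some cells (e.g. '16,157').
    total = None if total_text == "ND" else int(total_text.replace(",", ""))
    return total, int(used_text)


if __name__ == "__main__":
    # Cells copied from the table; both 'ND(8)' and 'ND (8)' spellings parse.
    for cell in ["6208 (68)", "11,117 (79)", "ND(8)", "1 (1)"]:
        print(cell, "->", parse_copy_number(cell))
```

Returning `None` for ND keeps undetermined copy numbers distinct from a true count of zero.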
Fig. 3 Nucleotide sequence and conceptual translation of PxLINE1.1, the partner LINE element of PxSE5. Flanking direct repeats are shown in lowercase. The TSD nucleotides are marked with a wavy line. The nucleotides of the 3′ tail sequence are marked with a straight line
Copies with high identity to PxLINE1.1 in P. xylostella
| Subject No. | Identity % | Length | Start | End | E-value | Bit score |
|---|---|---|---|---|---|---|
| AHIO01036046.1 | 99.6 | 3203 | 31,532 | 34,758 | 0 | 5838 |
| AHIO01028688.1 | 99.5 | 3203 | 1034 | 4236 | 0 | 5819 |
| AHIO01028673.1 | 99.4 | 3203 | 10,866 | 7665 | 0 | 5810 |
| AHIO01016682.1 | 99.3 | 3203 | 4885 | 8087 | 0 | 5797 |
| AHIO01031207.1 | 99.2 | 3207 | 51 | 3252 | 0 | 5784 |
| AHIO01033561.1 | 99 | 3209 | 7965 | 4750 | 0 | 5749 |
| AHIO01003557.1 | 96.1 | 3214 | 43,696 | 40,488 | 0 | 5243 |
| AHIO01028576.1 | 99.7 | 4408 | 11,375 | 15,783 | 0 | 5067 |
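
Several hits in the table above have Start greater than End, which in BLAST-style tabular output conventionally marks a hit on the minus strand. The sketch below shows how strand and coordinate span could be derived from those two columns. It assumes that Start and End are subject (contig) coordinates, which the table does not state explicitly, and the function names are hypothetical.

```python
from typing import Tuple


def parse_coordinate(text: str) -> int:
    """Convert a coordinate such as '31,532' to an integer."""
    return int(text.replace(",", ""))


def hit_orientation(start: int, end: int) -> Tuple[str, int]:
    """Return (strand, span in bp) for one hit.

    Assumption: start > end means the hit lies on the minus strand.
    The span is computed from the coordinates only and is not checked
    against the Length column.
    """
    strand = "+" if start <= end else "-"
    span = abs(end - start) + 1
    return strand, span


if __name__ == "__main__":
    # Three rows copied from the table: (subject, start, end).
    rows = [
        ("AHIO01028688.1", "1034", "4236"),
        ("AHIO01028673.1", "10,866", "7665"),
        ("AHIO01003557.1", "43,696", "40,488"),
    ]
    for subject, start_text, end_text in rows:
        strand, span = hit_orientation(
            parse_coordinate(start_text), parse_coordinate(end_text)
        )
        print(subject, strand, span)
```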
Fig. 4 Examples of the relative age distribution of SINE families in P. xylostella, M. sexta, C. suppressalis, L. accius and O. brumata. The abscissa shows the identity between each consensus sequence and its copies. The ordinate shows the number of copies with that identity. Copies of the same SINE family are shown in the same color
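
The distribution described in this caption is a histogram of copy-to-consensus identity. The sketch below illustrates the idea under stated assumptions: each copy is already aligned to the consensus at equal length with `-` for gaps, and identity is the fraction of matching columns. The function names, the binning scheme and the toy sequences are invented for illustration and are not the authors' pipeline.

```python
from collections import Counter
from typing import Dict, Iterable


def percent_identity(consensus: str, copy: str) -> float:
    """Percent of aligned columns where the copy matches the consensus.

    Columns that are gaps in both sequences are ignored; a gap opposite
    a base counts as a mismatch.
    """
    if len(consensus) != len(copy):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [
        (a, b)
        for a, b in zip(consensus.upper(), copy.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def identity_histogram(
    consensus: str, copies: Iterable[str], bin_width: int = 5
) -> Dict[int, int]:
    """Count copies per identity bin, keyed by the bin's lower bound."""
    counts: Counter = Counter()
    for copy in copies:
        lower = int(percent_identity(consensus, copy) // bin_width) * bin_width
        # Put exact 100% matches into the top bin instead of a bin of their own.
        counts[min(lower, 100 - bin_width)] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    toy_consensus = "ACGTACGTAC"  # invented toy data
    toy_copies = ["ACGTACGTAC", "ACGTACGTAA", "ACGTTCGTAA", "ACGTACGTAG"]
    print(identity_histogram(toy_consensus, toy_copies, bin_width=10))
```

Under this reading, a family whose copies cluster at high identity has diverged little from its consensus, which is usually interpreted as more recent amplification.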
Fig. 5 Gene association of SINEs in P. xylostella. (a) Overall proportions of SINEs in the P. xylostella genome, represented as pie charts. (b) Integration of a PxSE2 element within the CDS of a gene encoding a nitrogen permease regulator 3-like protein. Sequence highlighted in yellow is the exon region of LOC105380419; sequence in lowercase is PxSE2.2, a copy of PxSE2
Annotation of SINE copies integrated into CDS and untranslated regions (UTRs) in P. xylostella
| Copies | Location | GeneID | Gene mapping | COG class annotation | Swissprot annotation | Nr annotation |
|---|---|---|---|---|---|---|
| PxSE1.1 | 436,420–436,680 | LOC105383591 | CDS | Signal transduction mechanisms | Cyclic nucleotide-gated cation channel subunit A | uncharacterized protein |
| PxSE1.2 | 46,181–46,441 | LOC105394666 | CDS | Signal transduction mechanisms | Cyclic nucleotide-gated cation channel subunit A | uncharacterized protein |
| PxSE2.1 | 800,671–800,788 | LOC105384210 | 3′ UTR | ND | Transcription factor 25 homolog | transcription factor 25 |
| PxSE2.2 | 2,273,356–2,273,095 | LOC105380419 | CDS | ND | Nitrogen permease regulator 3-like protein | nitrogen permease regulator 3-like protein |
| PxSE2.3 | 35,405–35,544 | LOC105390425 | 3′ UTR | ND | Serine/threonine-protein kinase | serine/threonine-protein kinase grp-like |
| PxSE2.4 | 250,511–250,756 | LOC105381765 | CDS | Replication, recombination and repair | DNA topoisomerase 3-alpha | uncharacterized protein |
| PxSE2.5 | 266,277–266,424 | LOC105391817 | 3′ UTR | ND | Leucine-rich repeat serine/threonine-protein kinase 1 | uncharacterized protein |
| PxSE3.1 | 274,387–274,287 | LOC105388973 | 3′ UTR | ND | Gamete and mating-type specific protein A | uncharacterized protein |
| PxSE3.2 | 699,462–699,145 | LOC105386775 | CDS | ND | Uncharacterized protein | serine/arginine repetitive matrix protein 1-like |
| PxSE3.3 | 490,234–490,329 | LOC105381296 | 3′ UTR | ND | Venom acid phosphatase Acph-1 (Precursor) | prostatic acid phosphatase |
| PxSE3.4 | 149,472–149,747 | LOC105389005 | 3′ UTR | General function prediction only | Protein suppressor of hairy wing | zinc finger protein 26-like |
| PxSE3.5 | 165,558–165,422 | LOC105383366 | CDS | ND | Mediator of RNA polymerase II transcription subunit 12 | uncharacterized protein |
| PxSE4.1 | 1,260,998–1,260,894 | LOC105380733 | 3′ UTR | General function prediction only | Ras-related protein Rab-24 | ras-related protein Rab-24-like |
| PxSE4.2 | 63,533–63,636 | LOC105391976 | 3′ UTR | General function prediction only | Protein fem-1 homolog B | protein fem-1 homolog B |
| PxSE4.3 | 31,764–31,620 | LOC105393953 | 3′ UTR | General function prediction only | Ras-related protein Rab-24 | ras-related protein Rab-24-like |
| PxSE4.4 | 787,511–787,311 | LOC105382052 | 3′ UTR | RNA processing and modification | WW domain-containing protein ZK1098.1 | transcription elongation regulator 1-like isoform X1 |
| PxSE4.5 | 160,917–160,809 | LOC105388357 | CDS | ND | Serine/threonine kinase SAD-1 | PAS domain-containing serine/threonine-protein kinase |
| PxSE4.6 | 1,160,410–1,160,304 | LOC105398290 | 3′ UTR | General function prediction only | Ras-related protein Rab-24 | ras-related protein Rab-24-like |
| PxSE4.7 | 819,669–819,771 | LOC105385258 | CDS | ND | Probable 4-coumarate-CoA ligase 2 | probable 4-coumarate-CoA ligase 3 |
| PxSE4.8 | 217,956–218,125 | LOC105383715 | CDS | ND | Integrin alpha-PS3 light chain (Precursor) | integrin alpha-PS5-like |
| PxSE5.1 | 306,942–307,066 | LOC105398401 | 5′ UTR | ND | ND | CLK4-associating serine/arginine rich protein-like |
| PxSE5.2 | 101,368–101,480 | LOC105393324 | CDS | Coenzyme transport and metabolism | Molybdopterin synthase catalytic subunit | molybdopterin synthase catalytic subunit-like |
| PxSE5.3 | 993,048–992,940 | LOC105383883 | CDS | ND | Salivary glue protein Sgs-3 (Precursor) | uncharacterized protein |
| PxSE5.4 | 338,901–338,769 | LOC105387600 | 3′ UTR | ND | Alpha-(1,3)-fucosyltransferase C | alpha-(1,3)-fucosyltransferase C-like |
| PxSE5.5 | 499,158–499,032 | LOC105388075 | 3′ UTR | ND | Cuticle collagen 1 (Precursor) | breast cancer metastasis-suppressor 1-like protein isoform X1 |
ND not determined
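
The Gene mapping column of the table above can be summarized per region and per SINE family. The sketch below tallies the 25 rows as transcribed from that column; the variable and function names are hypothetical, and the ASCII apostrophe in `3' UTR` stands in for the prime symbol used in the table.

```python
from collections import Counter

# (copy name, gene mapping) pairs transcribed from the table above.
ROWS = [
    ("PxSE1.1", "CDS"), ("PxSE1.2", "CDS"),
    ("PxSE2.1", "3' UTR"), ("PxSE2.2", "CDS"), ("PxSE2.3", "3' UTR"),
    ("PxSE2.4", "CDS"), ("PxSE2.5", "3' UTR"),
    ("PxSE3.1", "3' UTR"), ("PxSE3.2", "CDS"), ("PxSE3.3", "3' UTR"),
    ("PxSE3.4", "3' UTR"), ("PxSE3.5", "CDS"),
    ("PxSE4.1", "3' UTR"), ("PxSE4.2", "3' UTR"), ("PxSE4.3", "3' UTR"),
    ("PxSE4.4", "3' UTR"), ("PxSE4.5", "CDS"), ("PxSE4.6", "3' UTR"),
    ("PxSE4.7", "CDS"), ("PxSE4.8", "CDS"),
    ("PxSE5.1", "5' UTR"), ("PxSE5.2", "CDS"), ("PxSE5.3", "CDS"),
    ("PxSE5.4", "3' UTR"), ("PxSE5.5", "3' UTR"),
]


def family_of(copy_name: str) -> str:
    """'PxSE2.4' -> 'PxSE2' (the family is the part before the dot)."""
    return copy_name.split(".")[0]


# Number of listed copies per gene region.
overall = Counter(mapping for _, mapping in ROWS)
# Number of listed copies per (family, gene region) pair.
by_family = Counter((family_of(name), mapping) for name, mapping in ROWS)

if __name__ == "__main__":
    print(dict(overall))
    for (family, mapping), count in sorted(by_family.items()):
        print(family, mapping, count)
```

Of the 25 listed copies, this tally gives 13 in 3′ UTRs, 11 in CDS and 1 in a 5′ UTR.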
Fig. 6 Evolutionary tree of the 23 novel SINEs identified in this study (a) and taxonomy tree of the lepidopteran insects harboring PxSE3-like SINEs (b)
Fig. 7 Evidence of HTT from Lepidoptera to baculovirus. Multiple sequence alignment of SlNPVSE1 with its flanking sequences and the orthologous sequences. Se-WH-S is a host sequence from the S. exigua genome (WNNL01000005.1:248783–248238); SlNPV-II is the baculovirus sequence from S. litura nucleopolyhedrovirus II (Accession number: EU780426.1:29774–31088) that contains the SINE copy; SeNPV-251 and ScNPV-vpn72 are orthologous sequences of SlNPV-II from S. eridania nucleopolyhedrovirus isolate 251 (Accession number: MH320559.1:31479–31679) and S. cosmioides nucleopolyhedrovirus isolate VPN72 (Accession number: MK419955.1:32601–32796), respectively