Alexander Belyayev, Jiřina Josefiová, Michaela Jandová, Václav Mahelka, Karol Krak, Bohumil Mandák.
Abstract
Extensive and complex links exist between transposable elements (TEs) and satellite DNA (satDNA), the two largest fractions of eukaryotic genomes. These relationships have a crucial effect on genome structure, function and evolution. Here, we report a novel case of a mutual relationship between TEs and satDNA. In the genomes of Chenopodium s. str. species, deletion derivatives of the tnp2 conserved domain of the newly discovered CACTA-like TE Jozin are involved in generating monomers of the most abundant satDNA family of the Chenopodium satellitome. Analysis of the relative positions of satDNA and different TEs using assembled Illumina reads revealed several associations between satDNA arrays and the transposases of putative CACTA-like elements, in which an ~40 bp fragment of tnp2 served as the start monomer of the satDNA array. The high degree of identity between the consensus satDNA monomers of the investigated species and the tnp2 fragment (from 82.1 to 94.9%) provides evidence that CficCl-61-40 satDNA family monomers originated from analogous regions of their respective parental elements. The results were confirmed via molecular genetic methods and Oxford Nanopore sequencing. The discovered phenomenon leads to the continuous replenishment of species genomes with new identical satDNA monomers, which in turn may increase the similarity of species satellitomes.
Keywords: CACTA transposons; Chenopodium; Next generation sequencing; Oxford Nanopore sequencing; Satellite DNA; Transposase
Year: 2020 PMID: 32607133 PMCID: PMC7320549 DOI: 10.1186/s13100-020-00219-7
Source DB: PubMed Journal: Mob DNA
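The abstract reports the identity between consensus satDNA monomers and the tnp2 fragment as a percentage (82.1 to 94.9%). As a minimal sketch of how such a figure can be obtained, the Python below aligns two short sequences globally and counts matching columns. The scoring values, the function names and both toy sequences are assumptions made for illustration; they are not the authors' pipeline or their data.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical columns as a percentage of the alignment length."""
    al_a, al_b = global_align(a, b)
    matches = sum(x == y and x != "-" for x, y in zip(al_a, al_b))
    return 100.0 * matches / len(al_a)


# Toy 40 bp sequences (hypothetical, not taken from the paper).
monomer = "ACGTTGCAAGGCTTAACCGGTATCGATCGGATTACAGCTA"
fragment = monomer[:5] + "A" + monomer[6:]  # one substitution
print(percent_identity(monomer, fragment))
```

Identity values depend on the alignment method and on whether gaps count toward the denominator, so figures from a sketch like this are only comparable with each other, not with the published values.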
Chenopodium species used for the study, their genome composition, ploidy, genome size and geographical origin
| Species (accession number) | Genome composition | Locality | Coordinates | Chr. No | Genome size |
|---|---|---|---|---|---|
| | B + D | Russian Federation, Velsk | N 61.066704 E 42.095002 | 2n = 4x = 36 | 2570 |
| | D | China, Xinjiang, Altaj, Burqin | N 47.815500 E 87.080028 | 2n = 2x = 18 | 960 |
| | B + C + D | Czech Republic, Hrádek | N 48.781583 E 16.261528 | 2n = 6x = 54 | 3808 |
| | A | Russian Federation, Primorski Krai, Nakhodka city district | N 42.88775 E 132.722361 | 2n = 2x = 18 | 2608 |
| | B | Czech Republic, Slatina | N 50.226389 E 14.210528 | 2n = 2x = 18 | 1785 |
| | E | China, Xinjiang, Altaj, Hoboksar | N 46.541472 E 85.358083 | 2n = 2x = 18 | 1144 |
| | B + E | Russian Federation, Verkhnekolymsky raion, Popovka river mouth | N 64.646833 E 151.640306 | 2n = 4x = 36 | 2935 |
| | B + E | China, Xinjiang, Tumuxiukezhen | N 41.667306 E 79.693528 | 2n = 4x = 36 | 2929 |
| | A + C + D | Russian Federation, Primorski Krai, Nakhodka city district | N 42.88775 E 132.722361 | 2n = 6x = 54 | 3247 |
| | C + D | China, Xinjiang, Tumuxiukezhen | N 41.667306 E 79.693528 | 2n = 4x = 36 | 1192 |
| | B + C + F | Iran, Kurdistan, Marivan | N 35.498461 E 46.166946 | 2n = 6x = 54 | 4421 |
| | E | Tajikistan, Gorno-Badakhshan autonomous region, Murghob district | N 37.821667 E 73.566667 | 2n = 2x = 18 | 1154 |
| | A + G | Iran, west Azerbaijan, Siah Cheshmeh (Chaldoran) | N 39.065972 E 44.386170 | 2n = 4x = 36 | 2177 |
| | C + D | Czech Republic, Mělník | N 50.349528 E 14.497444 | 2n = 4x = 36 | 2029 |
| | C + D | Czech Republic, Prague | N 50.115964 E 14.433326 | 2n = 4x = 36 | 2022 |
| | B | Czech Republic, Švermov | N 50.176806 E 14.105472 | 2n = 2x = 18 | 1775 |
| | H | Iran, Ardabil, Meshgin Shahr | N 38.405556 E 47.694722 | 2n = 2x = 18 | 924 |
Fig. 1. The tnp2 transposase and the CficCl-61-40 satDNA family. a Schematic representation of contig 22 of the assembled C. acuminatum genome (the first 4000 bp) at different zoom levels. Red squares are the fragments of tnp2. The green line indicates the length of the CficCl-61-40 satDNA array. Blue triangles/squares are conserved motifs of the basic monomer. The green triangle is a similar conserved motif within tnp2 (parental monomer). The red frame indicates the homologous protein sequence of the start monomer and the similar fragment from the other plant species (Cdd: pfam 02992). The positions of PCR primers used for validation of the physical existence of the association of tnp2B with CficCl-61-40 satDNA family arrays are shown with yellow rectangles (see also Fig. 2a). A diagram of the domain organization of the complete CACTA-like TE Jozin is shown at the bottom of a (see also Additional file 3). The 3′ position of the parent for the CficCl-61-40 satDNA array start monomer is shown with an arrow (for further explanation see the text). b Phylogenetic relationships of conserved protein domains of the tnp2 transposase family. Tnp2A in the genomes of the species of the C. album aggregate is highlighted in red. Tnp2B in the genomes of the species of the C. album aggregate is highlighted in blue. GenBank accession numbers follow the plant species name. c Phylogenetic relationships of CficCl-61-40 satDNA family monomers and corresponding fragments of tnp2B (the latter are highlighted in blue). A graphical representation of the conservation of CficCl-61-40 satDNA family monomers by sequence logo is shown at the bottom of c (Additional file 2.1)
Fig. 2. Experimental validation of the computationally identified structures. a PCR screening for the association of tnp2B with CficCl-61-40 satDNA family arrays. b (1) Fiber-FISH analysis of DNA strands of C. acuminatum. The CficCl-61-40 satDNA family probe (red signal) is associated with arrays (three examples). The bar represents 1 μm. (2) The distribution of CficCl-61-40 satDNA family sequences in the chromosomes of C. album s. str. CficCl-61-40 is labeled with Cy-3 (red signal); chromosomes are stained with DAPI (blue signal). The bar represents 5 μm. c Self-to-self comparisons of the three Oxford Nanopore ultralong reads from the C. acuminatum genome (Additional file 4) displayed as dot plots (YASS program output). Parallel lines indicate tandem repeats (the distance between the diagonals equals the lengths of the motifs). Histograms at the axes indicate the regions with tandem repeats. (1) Read #17 of 49,142 bp; (2) Read #131 of 34,531 bp; (3) Read #313 of 30,368 bp. The positions of the tnp2B parental fragments are indicated with red arrows, and the associated CficCl-61-40 satDNA family arrays (squares) are indicated with blue arrows
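The caption of Fig. 2c notes that, in a self-to-self dot plot, the distance between parallel diagonals equals the motif length. The sketch below illustrates that relationship with exact k-mer matches rather than the YASS program used for the figure: every pair of identical k-mers lies on a diagonal at offset `j - i`, and the most populated offset approximates the monomer length. The function names, the word size `k`, the offset cutoff and the toy read are all hypothetical choices for illustration.

```python
from collections import Counter, defaultdict


def diagonal_offsets(seq, k=8, max_offset=500):
    """Count exact k-mer self-matches per diagonal offset (j - i, with j > i)."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    offsets = Counter()
    for pos in positions.values():
        for a in range(len(pos)):
            for b in range(a + 1, len(pos)):
                d = pos[b] - pos[a]
                if d > max_offset:
                    break  # positions are sorted, so later offsets only grow
                offsets[d] += 1
    return offsets


def estimate_period(seq, k=8, max_offset=500):
    """Smallest offset with the highest match count, or None if no repeats."""
    offsets = diagonal_offsets(seq, k, max_offset)
    if not offsets:
        return None
    best = max(offsets.values())
    return min(d for d, count in offsets.items() if count == best)


# Toy read (hypothetical): a 40 bp monomer repeated 10 times between two flanks.
monomer = "ACGTTGCAAGGCTTAACCGGTATCGATCGGATTACAGCTA"
read = "GATTTGCCATAGGCTCAAGT" + monomer * 10 + "CCTGAATGCTTGACGTAGCA"
print(estimate_period(read))
```

Exact matching suits this error-free toy input; real Nanopore reads carry sequencing errors and diverged monomers, which is why an approximate-matching tool is the practical choice there.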
Presence and association of tnp2B and CficCl-61-40 satDNA family arrays (see also Additional file 1)
| Species | Genomes | | | Contig No | Array | | Similarity |
|---|---|---|---|---|---|---|---|
| | B + D | yes | no | – | – | – | – |
| | B + C + D | yes | yes | 1990 | 579 bp | 12.0 | 89.7% |
| | A | no | no | – | – | – | – |
| | B | yes | no | – | – | – | – |
| | E | no | no | – | – | – | – |
| | B + E | no | no | – | – | – | – |
| | B + E | no | no | – | – | – | – |
| | A + C + D | yes | no | – | – | – | – |
| | C + D | yes | no | – | – | – | – |
| | B + C + F | no | no | – | – | – | – |
| | E | no | no | – | – | – | – |
| | A + G | no | no | – | – | – | – |
| | C + D | yes | yes | 541 | 134 bp | 54.2 | 89.8% |
| | | | | 28,391 | 371 bp | 22.2 | 94.9% |
| | C + D | yes | yes | 700 | 371 bp | 27.9 | 90.0% |
| | | | | 10,973 | 386 bp | 17.4 | 94.9% |
| | | | | 11,346 | 341 bp | 27.1 | 80.0% |
| | B | no | no | – | – | – | – |
| | H | no | no | – | – | – | – |