E V Evtushenko, V G Levitsky, E A Elisafenko, K V Gunbin, A I Belousov, J Šafář, J Doležel, A V Vershinin.
Abstract
BACKGROUND: A prominent and distinctive feature of rye (Secale cereale) chromosomes is the presence of massive blocks of subtelomeric heterochromatin, the size of which is correlated with the copy number of tandem arrays. The rapidity with which these regions have formed over the period of speciation remains unexplained.
Keywords: 1RS BAC library; 454 sequences; DNA motifs; Rye; Secale cereale; Subtelomeric heterochromatin; TE–tandem junctions; Tandem repeats; Transposable elements
Year: 2016 PMID: 27146967 PMCID: PMC4857426 DOI: 10.1186/s12864-016-2667-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1. Multiple arrays of tandemly repeated families are present on the short arm of rye chromosome 1R (1RS). (a) FISH image of early meiotic prophase chromosomes of CS/1RS, showing the location of pSc200 (fluorescing green) and pSc250 (red) sequences. (b) Southern hybridization profiles of CS/1RS probed with pSc250. The size of hybridization fragments is shown in kb
Fig. 2. PFGE separation of PstI-restricted BAC inserts harboring pSc200 arrays. (a) Ethidium bromide-stained gel; (b) Southern hybridization probed with pSc200. Lane 1: BAC clone 126C20; lane 2: 119C15; lane 3: 119M22; lane 4: 114I10
Percent identity of pSc200 monomers present in BAC clone 119C15
| Sequence of the pSc200 monomers in contig | fr47-1 | fr47-2 | exo2-1 | T79-1 | T79-2 | exo6-1 | exo7-1 | T713-1 | exo9-1 | exo9-2 | T7-6 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| fr47-1 | 100 | 89.36 | 95.77 | 88.56 | 95.23 | 89.04 | 98.41 | 88.86 | 95.49 | 89.36 | 95.77 |
| fr47-2 | | 100 | 90.16 | 96.55 | 89.33 | 97.60 | 90.43 | 99.47 | 89.36 | 96.82 | 89.92 |
| exo2-1 | | | 100 | 90.16 | 98.41 | 89.84 | 96.83 | 89.66 | 98.41 | 90.69 | 98.94 |
| T79-1 | | | | 100 | 89.33 | 96.27 | 89.63 | 96.02 | 89.10 | 97.61 | 89.92 |
| T79-2 | | | | | 100 | 89.01 | 96.82 | 88.83 | 98.94 | 89.87 | 99.47 |
| exo6-1 | | | | | | 100 | 90.11 | 97.07 | 89.07 | 97.07 | 89.60 |
| exo7-1 | | | | | | | 100 | 89.92 | 97.08 | 90.43 | 97.35 |
| T713-1 | | | | | | | | 100 | 88.86 | 96.29 | 89.39 |
| exo9-1 | | | | | | | | | 100 | 89.63 | 99.73 |
| exo9-2 | | | | | | | | | | 100 | 90.45 |
| T7-6 | | | | | | | | | | | 100 |
A fragment of the pSc200 array from BAC 119C15 DNA was subcloned into the plasmid vector pGEM-5Zf(+). A series of deletion clones was then obtained according to [24], and their inserts were sequenced and assembled into a contig encompassing 11 full-length pSc200 monomers
Fig. 3. Phylogeny of (a) pSc200 and (b) pSc250 monomers present on each of the seven rye chromosomes. The chromosome-specific 454 libraries obtained by Martis et al. (2013) were used to reconstruct phylogenetic networks and to assign pSc200 and pSc250 monomers to each of the seven rye chromosomes (1R and 1RS chromosomes, red: ERX140512 and ERX140519 libraries; 2R chromosome, orange: ERX140513 library; 3R and 3RS chromosomes, yellow: ERX140514 and ERX140520 libraries; 4R chromosome, green: ERX140515 library; 5R chromosome, cyan: ERX140516 library; 6R chromosome, blue: ERX140517 library; 7R and 7RS chromosomes, violet: ERX140518 and ERX140521 libraries). The trees shown are galled phylogenetic networks generated by Dendroscope v3.2.8 from trees obtained by the maximum likelihood method. Black circles at branch ends mark the sequenced pSc200 monomers present in BAC clone 119C15 (see Table 1). The scale bar corresponds to the weighted evolutionary distance (GTR nucleotide substitution model) and indicates the weighted number of substitutions per alignment site. Two histograms show the distribution of pairwise distances: the x-axis plots the pairwise sequence distance (= 100 - % sequence identity), and the y-axis plots the frequency of occurrence
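The distance used on the histogram x-axis (100 - % sequence identity) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the input is the fr47-1 row of the percent identity table for BAC clone 119C15, not the chromosome-specific 454 data behind Fig. 3, and the function names and the 2-unit bin width are hypothetical choices rather than the authors' settings.

```python
from collections import Counter

# Percent identities of monomer fr47-1 to the other ten monomers
# (first row of the percent identity table for BAC clone 119C15).
FR47_1_IDENTITIES = [89.36, 95.77, 88.56, 95.23, 89.04,
                     98.41, 88.86, 95.49, 89.36, 95.77]


def to_distance(percent_identity):
    """Pairwise distance as defined in the caption: 100 - % identity."""
    return 100.0 - percent_identity


def distance_histogram(identities, bin_width=2.0):
    """Map the left edge of each occupied bin to its relative frequency.

    bin_width is an assumed value chosen for illustration only.
    """
    distances = [to_distance(p) for p in identities]
    counts = Counter(int(d // bin_width) for d in distances)
    return {b * bin_width: counts[b] / len(distances) for b in sorted(counts)}


if __name__ == "__main__":
    for left_edge, freq in distance_histogram(FR47_1_IDENTITIES).items():
        print(f"distance {left_edge:4.1f}-{left_edge + 2.0:4.1f}: frequency {freq:.2f}")
```

With this row, the distances fall into separate groups rather than one continuous range, which is the kind of structure a pairwise-distance histogram makes visible.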
Fig. 4. The structure of BAC clones (a) 122F3 and (b) 84C15. White rectangles: non-array sequence; black rectangles: vector DNA
Sequence composition of genome-wide 454 reads and of the sequences adjacent to pSc200 and pSc250 arrays
| Type of sequence | All reads: cumulative length, bp | All reads: proportion of cumulative length, % | Reads with junctions of pSc200: cumulative length of non-tandem DNA, bp | Reads with junctions of pSc200: proportion, % | Reads with junctions of pSc250: cumulative length of non-tandem DNA, bp | Reads with junctions of pSc250: proportion, % |
|---|---|---|---|---|---|---|
| Class I TE | | | | | | |
| Ty3/Gypsy-like | 4 142 315 628 | 50.84 | 36542 | 35.68 | 65786 | 40.62 |
| Ty1/Copia-like | 799 492 074 | 9.81 | 14322 | 13.99 | 14677 | 9.06 |
| solo-LTR | 60 362 185 | 0.74 | 13119 | 12.81 | 37289 | 23.02 |
| LINE | 57 651 280 | 0.71 | 483 | 0.47 | 1212 | 0.75 |
| SINE | 1 201 224 | 0.02 | 0 | 0.00 | 0 | 0.00 |
| Class II TE | | | | | | |
| CACTA | 454 679 975 | 5.58 | 5544 | 5.41 | 5621 | 3.47 |
| EnSpm | 18 661 838 | 0.23 | 984 | 0.96 | 23 | 0.01 |
| Harbinger | 20 519 563 | 0.25 | 0 | 0.00 | 0 | 0.00 |
| Mariner | 24 069 442 | 0.30 | 370 | 0.36 | 147 | 0.09 |
| hAT | 12 376 913 | 0.15 | 0 | 0.00 | 268 | 0.17 |
| Helitron | 3 779 908 | 0.05 | 0 | 0.00 | 0 | 0.00 |
| Others | 37 550 591 | 0.46 | 124 | 0.12 | 307 | 0.19 |
| Simple repeats, low complexity | 27 883 202 | 0.34 | 259 | 0.25 | 383 | 0.24 |
| rDNA | 8 782 174 | 0.11 | 0 | 0.00 | 396 | 0.24 |
| Tandem repeats | 16 953 966 | 0.21 | 185 | 0.18 | 2420 | 1.49 |
| Unclassified (unknown) | 48 627 218 | 0.60 | 2348 | 2.29 | 2996 | 1.85 |
We computed the DNA composition of all reads and compared it with that of the non-tandem DNA adjacent to the pSc200 and pSc250 tandem arrays. The length of each repeat was taken from the annotations in the RepeatMasker output files (see “Methods”). The “Proportion” columns denote the ratio of the cumulative length of the given non-tandem DNA to the cumulative length of all reads
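To make the comparison in the table concrete, the sketch below divides the junction proportion of a sequence type by its genome-wide proportion. This simple fold-change is an assumption made for illustration; it is not the statistical procedure described in “Methods”, and the names `COMPOSITION` and `fold_change` are hypothetical. Only a few rows of the table are copied in.

```python
# Proportions (%) copied from the composition table:
# (all reads, reads with pSc200 junctions, reads with pSc250 junctions).
COMPOSITION = {
    "Ty3/Gypsy-like": (50.84, 35.68, 40.62),
    "Ty1/Copia-like": (9.81, 13.99, 9.06),
    "solo-LTR": (0.74, 12.81, 23.02),
    "CACTA": (5.58, 5.41, 3.47),
}


def fold_change(junction_pct, genome_pct):
    """Ratio of the junction proportion to the genome-wide proportion.

    A plain ratio of two table columns, used here only as a rough
    indicator of over- or under-representation near the arrays.
    """
    if genome_pct == 0:
        raise ValueError("genome-wide proportion is zero; ratio undefined")
    return junction_pct / genome_pct


if __name__ == "__main__":
    for name, (genome, j200, j250) in COMPOSITION.items():
        print(f"{name:15s} pSc200: {fold_change(j200, genome):6.2f}x   "
              f"pSc250: {fold_change(j250, genome):6.2f}x")
```

A ratio above 1 means the sequence type takes up a larger share of the array-flanking DNA than of the reads as a whole, and a ratio below 1 means a smaller share.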
Fig. 5. The genome-wide and array-flanking (junction site) abundance of various TE families. Only TE families exhibiting a large disparity in abundance are shown.
Enrichment estimates (t-test) for top-scoring motifs
A. Top-scoring motifs in TE-pSc200 junctions (motif logos not reproduced)

| Motif | Olivia, t-test | Daniela, t-test |
|---|---|---|
| 1 | 1.8E-04 | |
| 2 | 1.4E-09 | 1.3E-03 |
| 3 | 1.4E-09 | 7.6E-04 |
| 4 | 2.0E-07 | 3.6E-03 |
| 5 | 5.2E-13 | |
| 6 | | |
| 7 | 9.5E-04 | |
| 8 | | |
| 9 | 1.8E-09 | 4.8E-06 |
| 10 | 7.8E-06 | |
| 11 | 1.7E-08 | 1.4E-04 |
| 12 | 1.7E-05 | |

B. Top-scoring motifs in TE-pSc250 junctions (motif logos not reproduced)

| Motif | Laura, t-test | Xalas, t-test |
|---|---|---|
| 1 | | |
| 2 | 3.8E-05 | |
| 3 | 1.2E-04 | |
| 4 | | |
| 5 | | |
| 6 | 1.3E-04 | |
| 7 | 1.9E-03 | 1.4E-03 |
| 8 | 1.0E-07 | 4.0E-03 |
| 9 | 8.8E-04 | 4.0E-03 |
| 10 | 4.0E-05 | 1.1E-09 |
| 11 | 2.6E-06 | |
| 12 | 1.4E-09 | 1.6E-03 |

Top-scoring motifs (logos) present in (A) TE/pSc200 junctions and (B) TE/pSc250 junctions. *Significance of enrichment was estimated by Fisher’s t-test as described in “Methods”; only statistically significant values are shown
Fig. 6. The most highly enriched TE families localizing in the vicinity of pSc200 and pSc250 arrays. The x-axis plots the percentage of TE/array junctions harboring a particular TE family, while the y-axis plots TE enrichment at the junction site relative to the genome average. Dashed lines delimit the areas of the plot used for TE family selection. Selected TE families are labeled, followed by their x and y values. TEs located in the vicinity of (a) pSc200 and (b) pSc250 arrays