Alexander Belyayev, Jiřina Josefiová, Michaela Jandová, Ruslan Kalendar, Karol Krak, Bohumil Mandák.
Abstract
Satellite DNA (satDNA) is the most variable fraction of the eukaryotic genome. Related species share a common ancestral satDNA library, and a change in any library component in a particular lineage results in interspecific differences. Although the general developmental trend is clear, our knowledge of the origin and dynamics of satDNAs is still fragmentary. Here, we explore whole genome shotgun Illumina reads using the RepeatExplorer (RE) pipeline to infer satDNA family life stories in the genomes of Chenopodium species. The seven diploids studied represent separate lineages and provide an example of a species complex typical for angiosperms. Similarity searches within the RE pipeline made it possible to identify a satDNA family with a basic monomer of ~40 bp and to trace its transformation from the reconstructed ancestral sequence to the species-specific sequences. As a result, three types of satDNA family evolutionary development were distinguished: (i) concerted evolution with mutation and recombination events; (ii) concerted evolution with a trend toward increased complexity and length of the satellite monomer; and (iii) non-concerted evolution, with low levels of homogenization and multidirectional trends. The third type is an example of entire repeatome transformation, producing a novel set of satDNA families, and genomes showing non-concerted evolution are proposed as a significant source of genomic diversity.
Keywords: genome evolution; high order repeats; next-generation sequencing; plants; satellite DNA
Year: 2019 PMID: 30857296 PMCID: PMC6429384 DOI: 10.3390/ijms20051201
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Phylogenetic tree of the C. album aggregate, calculated using Bayesian inference from the concatenated dataset of three chloroplast DNA spacers (adapted from [27]). Major evolutionary lineages (A–H) are marked by grey rectangles. The numbers above branches correspond to the ages of the particular clades (in millions of years) as inferred by the analysis in BEAST2. Positions of the explored diploid species are shown in red. Polyploid species are shown in blue. The schematic stratigraphic time scale (Miocene–Holocene) is shown at the bottom of the figure.
The accessions and geographic origin of Chenopodium diploid species used for satDNA cluster analysis (NGS), probe preparation (cloning) and fluorescent in situ hybridization (FISH).
| Species | Clade | ID Number | Locality |
|---|---|---|---|
| | D | 429/3 | China, Burquin |
| | A | 742/4 | Russian Federation, Nakhodka |
| | B | 330/2 | Czech Republic, Slatina |
| | E | 433/9 | China, Hoboksar |
| | E | 830/3C | Tajikistan, Gorno-Badakhshan |
| | B | 328/10 | Czech Republic, Švermov |
| | H | 771/1 | Iran, Shahr |
Summary of chromosome parameters, genome size, RE clusters and percentage of CficCl-61-40 satDNA family in the genomes of C. album aggregate diploid species.
| Species | Chromosome number | Chromosome size | Genome size | RE clusters | RE singlets | CficCl-61-40 (%) |
|---|---|---|---|---|---|---|
| | 18 | 0.8–1.5 | 960 | 393251 | 34269 | 3.80 |
| | 18 | 0.7–0.9 | 1200 | 307778 | 38905 | 2.25 |
| | 18 | 1.5–4.5 | 1785 | 369861 | 20661 | 0.31 |
| | 18 | 1.7–3.3 | 1144 | 327760 | 82679 | 0.42 |
| | 18 | 1.2–2.5 | 1154 | 249599 | 42427 | 0.25 |
| | 18 | 2.5–5.0 | 1775 | 369583 | 72167 | 0.27 |
| | 18 | 1.5–2.0 | 924 | 542674 | 93278 | 0.79 |
Figure 2. RepeatExplorer (RE) analysis of next-generation sequencing (NGS) data in Chenopodium diploids. (A) Cluster 61 of C. ficifolium shows a graph layout typical of tandem repeats; nodes represent sequence reads and edges between nodes correspond to similarity hits. (B) Self-to-self comparison of contig 25 of cluster 61 displayed as a dot plot (output of the YASS genomic similarity search tool); parallel lines indicate tandem repeats, and the distance between the diagonals equals the motif length (~40 bp). (C) Agarose gel electrophoresis of PCR products obtained with primers designed from the consensus monomer sequence of C. ficifolium (cluster 61), showing the ladder pattern typical of a tandem array.
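The dot-plot reasoning in panel (B) can be illustrated with a minimal sketch: in a tandem array, shifting the sequence by one monomer length makes it match itself, so the smallest shift with the highest self-match fraction approximates the motif length. This is not YASS and not the authors' procedure; the 40 bp monomer below is a made-up toy sequence, and the function names `self_match_fraction` and `estimate_period` are hypothetical.

```python
def self_match_fraction(seq, offset):
    """Fraction of positions i where seq[i] equals seq[i + offset]."""
    pairs = len(seq) - offset
    matches = sum(1 for i in range(pairs) if seq[i] == seq[i + offset])
    return matches / pairs


def estimate_period(seq, min_offset=2, max_offset=None):
    """Return the smallest offset with the highest self-match fraction.

    In a dot plot of a sequence against itself, this offset corresponds to
    the distance between the main diagonal and the first parallel line.
    """
    if max_offset is None:
        max_offset = len(seq) // 2
    # Ties (e.g. multiples of the true period) are broken toward the
    # smaller offset by using -d as the secondary sort key.
    return max(
        range(min_offset, max_offset + 1),
        key=lambda d: (self_match_fraction(seq, d), -d),
    )


# Hypothetical 40 bp monomer (an assumption, NOT a sequence from the paper).
toy_monomer = "ACGTTGCAAGGCTTAACCGGATCGATTACGCCATGGTACA"
toy_array = toy_monomer * 5  # idealized tandem array without mutations

print(estimate_period(toy_array))
```

Real arrays carry mutations, so the best self-match fraction will be below 1.0 and a dedicated tool remains the appropriate choice; the sketch only shows why evenly spaced diagonals imply a fixed monomer length.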
Figure 3. Phylogenetic relationships of the CficCl-61-40 satDNA family sequences. Phylogenetic tree based on k-mer analysis.
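As a rough illustration of what a k-mer comparison between satellite monomers involves, the sketch below computes a Jaccard distance between the k-mer sets of two sequences. The caption does not specify the distance measure, the value of k, or the tree-building method the authors used, so the choice of Jaccard distance, `k=4`, the function names, and all sequences here are assumptions for illustration only.

```python
def kmer_set(seq, k):
    """Return the set of all overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=4):
    """1 - |shared k-mers| / |all k-mers|; 0.0 means identical k-mer sets."""
    kmers_a, kmers_b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    union = kmers_a | kmers_b
    if not union:  # both sequences shorter than k
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


# Hypothetical monomers (assumptions, NOT sequences from the paper).
ancestral = "ACGTTGCAAGGCTTAACCGGATCGATTACGCCATGGTACA"
variant = "ACGTTTCAAGGCTTAACCGGATCGATTACGCCATGGTACA"  # one substitution

print(jaccard_distance(ancestral, ancestral))
print(jaccard_distance(ancestral, variant))
```

A matrix of such pairwise distances could then be passed to a clustering or tree-building method; which one suits satDNA monomers is outside the scope of this sketch.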
Figure 4. Agarose gel electrophoresis of PCR products obtained with primers designed from the consensus monomer sequences of proposed high order repeat (HOR) units, used to determine their physical counterparts. Cloned DNA fragments are marked by asterisks. The far-right lane is an example of negative amplification of a computer-generated proposed HOR unit.
Figure 5. Chromosomal distribution of CficCl-61-40 satDNA family sequences. CficCl-61-40 is labelled red; the C. acuminatum-specific HOR unit CacuCl-1-117 is labelled green. Bar represents 5 μm.