Radka Symonová, W Mike Howell.
Abstract
To understand the cytogenomic evolution of vertebrates, we must first unravel the complex genomes of fishes, which were the first vertebrates to evolve and were ancestors to all other vertebrates. We must not forget the immense time span during which the fish genomes had to evolve. Fish cytogenomics is endowed with unique features which offer irreplaceable insights into the evolution of the vertebrate genome. Due to the general DNA base compositional homogeneity of fish genomes, fish cytogenomics is largely based on mapping DNA repeats, which still represent serious obstacles in genome sequencing and assembly, even in model species. Localization of repeats on chromosomes of hundreds of fish species and populations originating from diversified environments has revealed the biological importance of this genomic fraction. Ribosomal genes (rDNA) are among the most informative repeats, and in fish they are subject to a more relaxed regulation than in higher vertebrates. This can result in the formation of a literal 'rDNAome' consisting of more than 20,000 copies, a high proportion of which are employed in extra-coding functions. Because rDNA has high rates of transcription and recombination, it contributes to genome diversification and can form reproductive barriers. Our overall knowledge of fish cytogenomics is growing rapidly thanks to a continuously increasing number of sequenced fish genomes and to novel sequencing methods that improve genome assembly. The recently revealed exceptional compositional heterogeneity in an ancient fish lineage (gars) sheds new light on compositional genome evolution in vertebrates generally. We highlight the power of the synergy of cytogenetics and genomics in fish cytogenomics and its potential for understanding the complexity of genome evolution in vertebrates, which is also linked to clinical applications and the chromosomal backgrounds of speciation. We also summarize the current knowledge on fish cytogenomics and outline its main future avenues.
Keywords: AT/GC compositional evolution; fish cytogenomics; genome evolution; quantitative cytogenomics; rDNAome; repetitive sequences
Year: 2018 PMID: 29443947 PMCID: PMC5852592 DOI: 10.3390/genes9020096
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Main cytogenomic traits in fish-like chordates.
| Group | 2n | Microchromosomes | CG Heterogeneity | WGD after the First Two Basal Vertebrates' WGDs | C-value/Haploid DNA Content (pg) [ | Specific Features in the Genome History and Chromosomal Evolution |
|---|---|---|---|---|---|---|
| Myxiniformes (hagfishes) | 14–48 | NO | unknown | not observed | Myxinidae 2.5–4.59 | chromatin diminution, programmed genome rearrangement [ |
| Petromyzontiformes (lampreys) | 76–178 | NO | GC-rich DNA repeats | not observed | Petromyzontidae 1.29–2.5 | programmed genome rearrangement [ |
| Chondrichthyes (cartilaginous fishes) | 54–102 | YES | observed, presumably satellite DNA | not observed | Chimaeriformes 1.5–2 Selachimorpha ~3–17 Rajimorphii 2.7–17 | AT/GC heterogeneity positively correlated with genome size [ |
| Ceratodontiformes (lungfishes) | 34–68 | Only | unknown | not observed | | “genomic obesity” without WGD documented [ |
| Coelacanthiformes (lobe-finned fishes) | 48 | YES | unknown | not observed | | chromosomes similar to ancient frogs [ |
| Acipenseriformes (sturgeons, paddlefish) | ~ 120–240–360 | YES ~ 50% | NORs and GC-rich microchromosomes [ | multiple in sturgeons, one in paddlefish | | multiple WGD, ploidy diversity |
| Lepisosteiformes (gars) | 56–58 | small sized chromosomes | in both genera | not observed | | regionally high recombination rate |
| Amiiformes (bowfin) | 46 | NO | only NORs | not observed | | convergent evolution with teleosts? |
| Polypteriformes (bichirs) | 36–38 biarmed, extremely large | NO | only NORs | not observed | | not investigated |
| Teleostei | ~ 50 (exceptions up to 100–150 or more) | Micro B-chromosomes | only NORs | TGD and lineage specific WGDs | mostly 0.4- ~ 1.0 | from genome compaction to lineage specific WGD |
WGD: Whole-genome duplication; NOR: Nucleolar organizer region; pg: picograms.
Figure 1. Three levels of ribosomal DNA (rDNA) functionality, generally and specifically in fishes.
Figure 2. (A) GC-profiles across three linkage groups (LGs) not yet assigned to their corresponding chromosome pairs, showing fluctuations in GC-percentage, produced using the chromoplot tool. The three LGs are arranged according to their numbers in Ensembl [103] and are separated by vertical lines above the profiles and by alternating gray and black colors (x-axis: genome position; y-axis: GC%); (B) Partial karyotype stained with 4',6-diamidino-2-phenylindole/Chromomycin A3 (DAPI/CMA3) showing six pairs of larger chromosomes of the spotted gar with alternating GC-rich regions in red and AT-rich regions in green. Reproduced with permission from [7].
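A GC-profile of the kind described in panel (A) can be approximated with a simple sliding window. The following is a minimal sketch only, not the chromoplot tool itself; the function name `gc_profile`, the window and step sizes, and the handling of ambiguous bases are all assumptions made for illustration.

```python
def gc_profile(seq, window=1000, step=1000):
    """Return a list of (start, gc_percent) tuples, one per full window.

    Hypothetical helper for illustration: GC% is computed over unambiguous
    bases (A, C, G, T) only; a window with no such bases yields None.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        # Avoid dividing by zero when the window is all N or gaps.
        profile.append((start, 100.0 * gc / acgt if acgt else None))
    return profile


if __name__ == "__main__":
    # Toy sequence: a GC-only block followed by an AT-only block.
    for start, gc in gc_profile("GGCCGGCCAATTAATT", window=8, step=8):
        print(start, gc)
```

Plotting the returned percentages against the window start positions gives the x-axis (genome position) and y-axis (GC%) layout used in the figure.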
Figure 3. Summary of an integrative cytogenomic study on the rDNAome in European pikes (Esox lucius and E. cisalpinus). (A) Fluorescence in situ hybridization (FISH) with 5S rDNA (red) and 45S rDNA (green); (B) Distribution of single-nucleotide polymorphisms (SNPs) along the E. lucius 5S rDNA unit obtained from Illumina reads, showing the absence of SNPs in the internal controlling region composed of Box-A, IE and Box-C elements; (C) Distribution of variants in intergenic spacer regions (IGS) in three PacBio reads (a–c). Slanted lines indicate tandemly arranged units visualized through the alignment of reads (x-axis) with a 5S gene (y-axis); (D) Higher-order organization of 5S rDNA arrays in E. lucius. Self-to-self comparison of long PacBio molecules representing three groups; (E) 5S rRNA domain reconstruction of E. lucius indicating its potential functionality; (F) Methylation analysis of 5S rDNA by the methylation-sensitive HpaII (H) restriction enzyme and its methylation-insensitive MspI (M) isoschizomer. Reproduced from [8].
Detailed overview of fish species with a sequenced genome. Diverse levels of genome assemblies (draft, contig, scaffold, fully assembled genomes to the chromosome level) and numbers of assembly versions are listed together with basic cytogenetic traits (2n, C-value, GC%).
| Order a | Family a | Size (Mb) | C-val b | n/2n c | Assembly/Type d | GC% | Notes e | |
|---|---|---|---|---|---|---|---|---|
| Ovalentaria | Pomacentridae | 991.585 | 0.94 | — | 1/S | 41.5 | ||
| Cichliformes | Cichlasomat., Heroini | 844.903 | — | —/48 | 1/S | 41.4 | ||
| Anguilliformes | Anguillidae | 1018.7 | 1.11–1.67 | —/38 | 1/S | 42.9 | ||
| Anguilliformes | Anguillidae | 1288.6 | 1.09 | —/38 | 1/S | 38.8 | ||
| Anguilliformes | Anguillidae | 1413.03 | 1.01–1.66 | —/38 | 1/S | 41.0 | ||
| Perciform × Scorpaeniform | Anoplopomatidae | 699.326 | 0.71–0.84 | — | 1/C | 40.3 | ||
| Characiformes | Characidae | 1335.24 | — | 25/50 | 1/Ch | 38.4 | ||
| Cyprinodontiformes | Rivulidae | 866.963 | — | — | 1/S | 41.1 | ||
| Gobiiformes | Gobiid., Oxudercinae | 955.752 | — | —/46 | 1/S | 40.1 | ||
| Cephalochordata | Branchiostomidae | 426.124 | — | —/36 | 2/S | 40.15 | ||
| Cephalochordata | Branchiostomidae | 521.895 | — | — | 1/S | 41.8 | ||
| Chimaeriformes | Callorhinchidae | 974.499 | 1.94 | — | 1/S | 42.6 | ||
| Perciformes | Channidae | 615.3 | 0.63–0.77 | —/48 | 1/D | — | ||
| Clupeiformes | Clupeidae | 807.712 | — | —/50–52 | 1/S | 44.5 | ||
| Perciformes | Cottidae | 563.609 | — | — | 1/S | 36.8 | ||
| Pleuronectiformes | Cynoglossidae | 470.199 | 0.62 | 22 | 1/Ch | 41.27 | ||
| Cyprinodontiformes | Cyprinodontidae | 1011.85 | — | —/48 | 1/S | 39.0 | ||
| Cyprinodontiformes | Cyprinodontidae | 1035.18 | 1.6 | —/48 | 1/S | 39.5 | ||
| Cypriniformes | Cyprinidae | 1713.66 | 1.6–2.0 | 50/100 | 2/S, Ch | med 37.1 | ||
| Cypriniformes | Cyprinidae | 1679.2 | 1.6–2.3 | 25/48 | 4/3S, 1 Chr | med 36.7 | ||
| Moroniformes | Moronidae | 675.917 | 0.78 | — | 2/S | 40.4 | ||
| Myxiniformes | Myxinidae/Eptatretinae | 2608.38 | 2.98 | /36 | 1/S | — | ||
| Esociformes | Esocidae | 904.497 | 0.8–1.4 | 25/50 | 3/Ch | 42.2 | PB | |
| Cyprinodontiformes | Fundulidae | 1021.9 | 1.3–1.5 | -/48 | 1/S | 41.2 | ||
| Gadiformes | Gadidae | 824.311 | 0.4–0.9 | -/46 | 2/S | 46.3 | 454, I, PB [ | |
| Perciformes | Cottoidei, Gasterosteid. | 446.611 | 0.6–0.7 | —/42 | 2/S, Ch | 44.6 | LM [ | |
| Perciformes × Cichliformes | Cichlidae (Afr.) | 831.412 | 0.97 | —/40 | 1/S | 41.9 | ||
| Syngnathiformes | Syngnathidae | 493.776 | — | — | 1/S | 43.7 | ||
| Siluriformes | Ictaluridae | 783.275 | 1.0 | 29/58 | 1/Ch | 39.8 | PB, I [ | |
| Cyprinodontiformes | Rivulidae | 680.367 | — | —/48 | 2/S | med 39.5 | ||
| Perciformes × Cichliformes | Cichlidae (Afr.) | 70.8584 | — | —/44 | 1/S | 42.1 | ||
| Labriformes | Labridae | 805.481 | — | — | 1/S | 40.9 | ||
| Acanthuriformes | Sciaenidae | 678.938 | — | — | 2/S | med 41.4 | ||
| Perciformes | Centropomidae | 668.481 | 0.7 | —/48 | 2/S | med 40.6 | ||
| Coelacanthiformes | Coelacanthidae | 2860.59 | 2.8–6.6 | —/48 | 2/S | med 42.5 | ||
| Lepisosteiformes | Lepisosteidae | 945.878 | 1.4 | 29/58 | 1/Ch | 40.4 | ||
| Petromyzontiformes | Petromyzontidae | 1030.66 | ~1.4 | —/144–162 | 1/S | 48.1 | ||
| Cypriniformes | Cyprinidae | 752.539 | congeners 1.2–1.5 | —/50 | 1/S | 37.4 | ||
| Rajiformes | Rajidae | 1555.46 | 3.5–4.6 | — | 1/C | 40.3 | ||
| Perciformes × | Percichthyidae | 633.241 | 0.83 | — | 1/S | — | ||
| Perciformes × Cichliformes | Cichlidae (Afr.) | 859.842 | — | — | 2/S | 41.4 | ||
| Perciformes × Cichliformes | Cichlidae (Afr.) | 73.4256 | — | — | 1/S | 41.8 | ||
| Cypriniformes | Cyprinidae, Cultrinae | 1116.0 | 1.12–1.35 | — | 1/D | 37.3 | [ | |
| Perciformes × Cichliformes | Cichlidae (Afr.) | 68.2386 | — | —/46 | 1/S | 41.5 | ||
| Acanthuriformes | Sciaenidae | 619.301 | — | — | 1/S | 39.3 | ||
| Tetraodontiformes | Molidae | 639.452 | 0.8–0.9 | —/46 | 1/S | 41.2 | [ | |
| Synbranchiformes | Synbranchidae | 684.144 | 0.6–0.9 | —/24 | 1/S | 41.5 | ||
| Moroniformes | Moronidae | 585.167 | 0.9 | —/48 | 1/S | 40.0 | ||
| Perciformes × Cichliformes | Cichlidae (Afr.) | 847.91 | — | - | 1/S | 42 | ||
| Cyprinodontiformes | Nothobranchiidae | 1132.74 | 1.56 | 19/38 | 4/S, Chr | 43.8 | ||
| Cyprinodontiformes | Nothobranchiidae | 5.23461 | — | —/38 | 1/S | 44.8 | ||
| Perciformes | Nototheniidae | 636.614 | — | — | 1/S | 40.8 | ||
| Salmoniformes | Salmonidae | 2369.93 | 2.6–3.0 | 30/60 | 1/Ch | 43.6 | ||
| Salmoniformes | Salmonidae | 2179 | 1.9–2.9 | 29/60 | 2/Ch | med 43.7 | ||
| Perciformes × Cichliformes | Cichlidae (Afr.) | 1009.86 | 0.9–1.2 | 23/44 | 2/Ch | med 39.9 | PB, I [ | |
| Beloniformes | Adrianichthyidae | 869.818 | 0.9–1.1 | 24/48 | 5/Ch | med 40.8 | ||
| Scombriformes | Stromateidae | 350.449 | — | — | 1/S | 38.2 | ||
| Pleuronectiformes | Paralichthyidae | 643.911 | 0.7 | 24/46–48 | 2/S, Ch | med 42.4 | ||
| Gobiiformes | Gobiid., Oxudercinae | 679.761 | 0.96 | — | 1/S | 40.2 | ||
| Gobiiformes | Gobiid., Oxudercinae | 701.697 | — | — | 1/S | 40 | ||
| Petromyzontiformes | Petromyzontidae | 885.535 | 1.6–2.4 | —/168 | 1/S | 46.8 | ||
| Cypriniformes | Cyprinidae | 1219.33 | 1.1 | —/50 | 2/S | med 40.6 | ||
| Cyprinodontiformes | Poeciliidae | 748.923 | 0.75–0.97 | —/46 | 1/S | 39.6 | ||
| Cyprinodontiformes | Poeciliidae | 815.145 | 0.9–1.0 | —/46 | 1/S | 40.8 | ||
| Cyprinodontiformes | Poeciliidae | 801.711 | 0.7–1.38 | —/46 | 1/S | 40.7 | ||
| Cyprinodontiformes | Poeciliidae | 731.622 | 0.77–1.0 | 23/46 | 1/Ch | 40.3 | ||
| Osmeriformes | Salangidae | 525 | — | — | 1/D | — | ||
| Pleuronectiformes | Pleuronectidae | 547.831 | 0.67 | —/48 | 1/C | 42 | ||
| Perciformes × Cichliformes | Cichlidae (Afr.) | 830.133 | — | — | 1/S | 41.9 | ||
| Characiformes | Serrasalmidae | 1285.35 | — | — | 1/S | 40.6 | ||
| Perciformes × Cichliformes | Cichlidae (Afr.) | 71.2951 | — | — | 1/S | 42.3 | ||
| Orectolobiformes | Rhincodontidae | 2931.6 | — | — | 1/S | 41.8 | ||
| Salmoniformes | Salmonidae | 2966.89 | 3.0–3.3 | 29/60 | 1/Ch | 43.9 | PB, I, S [ | |
| Gobiiformes | Gobiid., Oxudercinae | 695.009 | — | — | 1/S | 39.1 | ||
| Osteoglossiformes | Osteoglossidae | 777.359 | — | — | 4/S, Chr | med 43.9 | ||
| Scorpaeniformes | Sebastidae | 899.65 | — | — | 1/S | 40.9 | ||
| Scorpaeniformes | Sebastidae | 681.653 | — | — | 1/S | 40.8 | ||
| Scorpaeniformes | Sebastidae | 746.045 | — | — | 1/S | 40.8 | ||
| Scorpaeniformes | Sebastidae | 756.297 | — | — | 1/S | 40.7 | ||
| Scorpaeniformes | Sebastidae | 648.011 | — | — | 1/S | 40.7 | ||
| Carangiformes | Carangidae | 677.67 | 0.74 | — | 1/S | 40.9 | ||
| Carangiformes | Carangidae | 685 | — | — | 1/S | - | PB, I | |
| Carangiformes | Carangidae | med 750.45 | 0.83 | — | 2/S | 40.25 | ||
| Cypriniformes | Cyprinidae | 1632.72 | — | — | 1/S | 38 | ||
| Cypriniformes | Cyprinidae | 1750.29 | 2.35 | —/96 | 1/S | 38.7 | ||
| Cypriniformes | Cyprinidae | 1655.79 | — | — | 1/S | 38.1 | ||
| Cypriniformes | Cyprinidae | 48.1393 | 2.4 | —/50 | 1/C | 51.8 | ||
| Ovalentaria | Pomacentridae | 800.492 | — | — | 1/S | 42.1 | ||
| Tetraodontiformes | Tetraodontidae | 378.032 | — | — | 1/S | 45.6 | ||
| Tetraodontiformes | Tetraodontidae | 391.485 | 0.4 | 22/44 | 1/Ch | 45.8 | ||
| Tetraodontiformes | Tetraodontidae | 342.403 | 0.35–0.51 | 21/42 | 7/S, Ch | 46.6 | Genoscope | |
| Scombriformes | Scombridae | 684.497 | — | —/48 | 1/C | 39.7 | ||
| Cyprinodontiformes | Poeciliidae | 708.396 | 0.75 | — | 1/S | 40.9 | ||
| Cyprinodontiformes | Poeciliidae | 733.802 | 0.7–1.0 | —/48 | 1/S | 41.2 | ||
| Cyprinodontiformes | Poeciliidae | 729.664 | 0.8–1.0 | —/48 | 1/S | 39.8 |
a orders and families mostly based on [22]; b c-value, based on [12] database; c n: based on genomic/sequence data, originates from NCBI, 2n based on cytogenetic data [86]; d number of assemblies currently released and level of assembly (C = contig, D = draft, S = scaffold, Ch = chromosomal level); e Sequencing methods (I = Illumina, PB = PacBio, S = Sanger), linkage map (LM) available.
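The table lists assembly size in Mb next to C-values in pg, so the two can be compared only after a unit conversion. The sketch below uses the commonly cited factor of 1 pg of DNA ≈ 978 Mb; this factor and the helper names `pg_to_mb` and `assembly_fraction` are assumptions introduced here for illustration and are not taken from the source.

```python
# Assumption (not from the source): 1 pg of DNA corresponds to about 978 Mb.
MB_PER_PG = 978.0


def pg_to_mb(c_value_pg):
    """Convert a haploid C-value in picograms to an approximate size in Mb."""
    return c_value_pg * MB_PER_PG


def assembly_fraction(assembly_mb, c_value_pg):
    """Fraction of the C-value-based genome size covered by an assembly."""
    return assembly_mb / pg_to_mb(c_value_pg)


if __name__ == "__main__":
    # Values from the Lepisosteiformes row: 945.878 Mb assembled, C-value 1.4 pg.
    expected_mb = pg_to_mb(1.4)
    fraction = assembly_fraction(945.878, 1.4)
    print(f"expected size: {expected_mb:.1f} Mb, assembled fraction: {fraction:.2f}")
```

A fraction well below 1 suggests that part of the genome is missing from the assembly; given the repeat-mapping obstacles noted in the abstract, the missing part is plausibly repeat-rich, although C-value estimates themselves vary between measurements.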