Yude Wang1,2, Huifang Tan1, Minghe Zhang1, Rurong Zhao1, Shi Wang1, Qinbo Qin1, Jing Wang1, Chun Zhang1, Min Tao1, Ming Ma2, Bo Chen2, Shaojun Liu1.
Abstract
Distant hybridization leads to obvious changes in genotypes and phenotypes, giving rise to species with novel capabilities. However, the fusion of distinct genomes also polymerizes the DNA or gene variations that occur during the course of evolution. Knowledge of the early stages of post-hybridization evolution is particularly important. Here, we investigated the full-length (FL) transcriptomes and the sequences resulting from the genome resequencing of the red crucian carp-like homodiploid fish (RCC-L) and goldfish-like homodiploid fish (GF-L) derived from the interspecific hybridization of koi carp (KOC) and blunt snout bream (BSB) to provide molecular evidence for the hybrid origin of the goldfish (GF). We compared the orthologous genes in the transcriptomes of RCC-L and GF-L with those of KOC and BSB. We also mapped the orthologous genes to the common carp (CC) and BSB genomes and classified them into eight gene patterns in three categories (chimaera, mutant, and biparental origin genes). The results showed that 48.20% and 46.50% of the genes were chimaera and that 3.70% and 8.30% of the genes were mutations of orthologous genes in RCC-L and GF-L, respectively. In RCC-L and GF-L, 63.70% and 68.20% of the genetic materials were from KOC, and 12.30% and 11.90% of the genetic materials were from BSB. The sequences from the genome resequencing of RCC-L and GF-L were mapped to the genome sequences of CC and BSB, revealing that the similarities of both RCC-L and GF-L to the CC genome (92.52%, 90.18%) were obviously higher than to the BSB genome (50.33%, 49.18%), supporting the suggestion that the genomes of both RCC-L and GF-L were mainly inherited from KOC but had some DNA fragments from BSB. Overall, our results provide molecular biological evidence for the hybrid origin of red crucian carp (RCC) and GF.
Keywords: chimeric genes; full-length transcriptomes; gene structure; goldfish-like homodiploid fish; resequencing
Year: 2020 PMID: 32194618 PMCID: PMC7063666 DOI: 10.3389/fgene.2020.00122
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. The formation of experimental fish hybrids. (A) KOC; (B) BSB; (C) RCC-L-F1; (D) RCC-L-F2; (E) GF-L-F1; (F) RCC-L-F3; (G) GF-L-F2. Bar = 3 cm.
Summary of the PacBio-based RNA sequencing in this study.
| Item | GF-L | RCC-L | KOC | BSB |
|---|---|---|---|---|
| Number of non-full-length PacBio reads | 65479 | 7003 | 8428 | 7832 |
| Number of full-length non-chimeric PacBio reads | 59274 | 91459 | 93736 | 114649 |
| Average length of full-length non-chimeric PacBio reads (bp) | 1968 | 1893 | 1893 | 1893 |
| Number of non-redundant full-length transcripts after correction | 55807 | 55837 | 61281 | 67473 |
| High-quality full-length non-chimeric PacBio reads (bp) | 8621 | 28705 | 27516 | 31341 |
New gene annotations.
| Species | Metric | KOG | KEGG | NR | Swiss-Prot | GO | Total unigenes | Total annotated |
|---|---|---|---|---|---|---|---|---|
| GF-L | Gene number | 165 | 134 | 289 | 219 | 105 | 682 | 308 |
| | Annotation ratio | 24.19% | 19.65% | 42.38% | 32.11% | 15.40% | – | 45.16% |
| RCC-L | Gene number | 91 | 55 | 185 | 122 | 61 | 489 | 246 |
| | Annotation ratio | 18.61% | 11.25% | 37.83% | 24.95% | 12.47% | – | 50.31% |
| KOC | Gene number | 430 | 267 | 555 | 475 | 326 | 691 | 556 |
| | Annotation ratio | 62.23% | 38.64% | 80.32% | 68.74% | 47.18% | – | 80.16% |
| BSB | Gene number | 315 | 169 | 664 | 448 | 269 | 1356 | 667 |
| | Annotation ratio | 23.23% | 12.46% | 48.97% | 33.04% | 19.84% | – | 49.19% |
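
The annotation ratios in the table above appear to be the number of annotated genes divided by the total number of unigenes for each fish. The following is a minimal sketch of that arithmetic for the GF-L row, assuming this is how the ratios were derived; the function name `annotation_ratio` and the dictionary layout are illustrative and not taken from the study.

```python
# Counts copied from the "New gene annotations" table (GF-L rows only).
GF_L_COUNTS = {
    "KOG": 165,
    "KEGG": 134,
    "NR": 289,
    "Swiss-Prot": 219,
    "GO": 105,
    "Total annotated": 308,
}
GF_L_TOTAL_UNIGENES = 682


def annotation_ratio(gene_number: int, total_unigenes: int) -> float:
    """Return annotated genes as a percentage of all unigenes (2 decimals).

    Assumption: the table's ratio is gene_number / total_unigenes * 100.
    """
    if total_unigenes <= 0:
        raise ValueError("total_unigenes must be positive")
    return round(100.0 * gene_number / total_unigenes, 2)


if __name__ == "__main__":
    for database, count in GF_L_COUNTS.items():
        ratio = annotation_ratio(count, GF_L_TOTAL_UNIGENES)
        print(f"{database}: {count} / {GF_L_TOTAL_UNIGENES} = {ratio:.2f}%")
```

Running the same function over the other rows is a quick way to cross-check the published percentages against the gene counts.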
Figure 2. GO analysis results of new genes in the (A) KOC, (B) BSB, (C) RCC-L and (D) GF-L genomes.
Figure 3. Analysis of Eukaryotic Orthologous Groups (KOG) functional classification of the new genes.
Figure 4. Fusion genes. (A) PBfusion6; (B) PBfusion49; (C) PBfusion67; (D) PBfusion134; (E) PBfusion172.
The nucleotide identity percentages of fusion gene sequences in the RCC-L and GF-L genomes compared with the genomes of related species.
| Fusion gene ID | Common carp genome: alignment location | Common carp genome: alignment similarity (%) | Blunt snout bream genome: alignment location | Blunt snout bream genome: alignment similarity (%) |
|---|---|---|---|---|
| PB.fusion6 | NC_031698.1 | 84.38 | scaffold33: 2504726-2504911 | 91.94 |
| PB.fusion49 | NC_031698.1: 12756298-12756707 | 85.41 | scaffold33: 2504726-2504911 | 91.94 |
| PB.fusion67 | NC_031698.1: 12756527-12756707 | 93.92 | scaffold33: 2504726-2504906 | 91.71 |
| PB.fusion134 | NC_031698.1: 12756298-12756707 | 84.51 | scaffold33: 2504726-2504911 | 91.94 |
| PB.fusion172 | NC_031698.1: 12756296-12756707 | 85.48 | scaffold33: 2504726-2504911 | 92.47 |
Figure 5. The gene chimeric model. Schematic diagrams of gene patterns for the offspring arising from the hybridization of KOC (K) and BSB (B). Blue bars marked K variation denote offspring fragments with the KOC-specific variants; green bars marked B variation show BSB-specific variants, and red bars marked F variation show offspring-specific variants. Genes were classified into three categories. The first category includes patterns 1–3 (A–C, respectively) in which chimeric genes had single or multiple chimeric fragments consisting of successive alternating variations from parent-specific variants, either with or without offspring-specific mutations. The second category includes patterns 4–6 (D–F, respectively), consisting of genes derived from either or both progenitors but with mutations unique to the offspring. The third category includes patterns 7–8 (G–H, respectively), in which genes are derived exclusively from one parent.
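The three categories described in the Figure 5 caption can be illustrated with a small rule-based sketch. This is a simplification under stated assumptions, not the authors' pipeline: each gene is assumed to be summarized as a list of variant labels (`"K"` for KOC-specific, `"B"` for BSB-specific, `"F"` for offspring-specific), and the function name and category strings are hypothetical. It assigns only the three categories, not the eight individual patterns.

```python
from typing import Sequence

VALID_LABELS = {"K", "B", "F"}  # KOC-specific, BSB-specific, offspring-specific


def classify_gene(variant_labels: Sequence[str]) -> str:
    """Assign a gene to one of the three categories from the caption.

    Assumed rules (a simplification of the caption):
      - both K and B variants present      -> "chimeric" (patterns 1-3)
      - F variants present, not K and B    -> "mutant"   (patterns 4-6)
      - only K or only B variants present  -> "parental" (patterns 7-8)
      - no informative variants            -> "unclassified"
    """
    labels = set(variant_labels)
    unknown = labels - VALID_LABELS
    if unknown:
        raise ValueError(f"unexpected variant labels: {sorted(unknown)}")
    if {"K", "B"} <= labels:
        return "chimeric"
    if "F" in labels:
        return "mutant"
    if labels:
        return "parental"
    return "unclassified"


if __name__ == "__main__":
    examples = {
        "alternating K/B fragments": ["K", "B", "K"],
        "K/B fragments plus offspring mutation": ["K", "B", "F"],
        "KOC-derived with offspring mutation": ["K", "F", "K"],
        "BSB-derived only": ["B", "B"],
    }
    for description, labels in examples.items():
        print(f"{description}: {classify_gene(labels)}")
```

Checking for the chimeric case first reflects the caption's statement that chimeric genes may occur with or without offspring-specific mutations.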
Figure 6. Distribution of the Ka/Ks ratios in 5,621 putative orthologous gene pairs between GF-L and KOC.
The resequenced RCC-L and GF-L genomes compared with the genomes of related species.
| Species | Common carp genome: number of aligned reads | Common carp genome: alignment similarity (%) | Blunt snout bream genome: number of aligned reads | Blunt snout bream genome: alignment similarity (%) | Red crucian carp genome: number of aligned reads | Red crucian carp genome: alignment similarity (%) |
|---|---|---|---|---|---|---|
| RCC-L | 694868799 | 92.52% | 437080629 | 57.33% | 761625845 | 99.66% |
| GF-L | 366968429 | 90.18% | 230436483 | 49.18% | 376491759 | 99.28% |
Summary of SNPs in the RCC-L and GF-L genomes.
| Item | RCC-L: mapping to CC | RCC-L: ratio | GF-L: mapping to CC | GF-L: ratio |
|---|---|---|---|---|
| SNP number | 29376810 | 1 | 39260783 | 1 |
| CDS | 2148286 | 0.073128634 | 2653293 | 0.067581255 |
| Synonymous | 1002681 | 0.034131718 | 1217248 | 0.03100417 |
| Missense | 754025 | 0.025667355 | 937279 | 0.023873161 |
| Stopgain | 7076 | 0.00024087 | 9772 | 0.0002489 |
| Stoploss | 1241 | 4.22442E-05 | 1709 | 4.35294E-05 |
| Other | 383263 | 0.013046447 | 487285 | 0.012411495 |
| Intronic | 10300931 | 0.350648386 | 13299414 | 0.338745511 |
| Upstream | 4332007 | 0.147463492 | 5909459 | 0.150518114 |
| Downstream | 3216301 | 0.109484352 | 4383919 | 0.111661527 |
| Intergenic | 8145279 | 0.277269009 | 11548961 | 0.294160231 |
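
The Ratio columns in the SNP summary appear to be each category's SNP count divided by the total SNP number for that fish. The sketch below reproduces that calculation for RCC-L and also checks that the coding-sequence subcategories add up to the CDS count. The variable and function names are illustrative, and the interpretation of the ratio is an assumption based on the table values.

```python
# RCC-L SNP counts copied from the "Summary of SNPs" table.
RCC_L_TOTAL_SNPS = 29376810
RCC_L_CDS = 2148286
RCC_L_CDS_SUBCATEGORIES = {
    "Synonymous": 1002681,
    "Missense": 754025,
    "Stopgain": 7076,
    "Stoploss": 1241,
    "Other": 383263,
}


def snp_ratio(count: int, total: int) -> float:
    """Return the fraction of all SNPs that fall in one category.

    Assumption: the table's ratio is count / total SNP number.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


def cds_subcategories_consistent(subcategories: dict, cds_total: int) -> bool:
    """Check that the CDS subcategory counts sum to the CDS count."""
    return sum(subcategories.values()) == cds_total


if __name__ == "__main__":
    print(f"CDS: {snp_ratio(RCC_L_CDS, RCC_L_TOTAL_SNPS):.9f}")
    for name, count in RCC_L_CDS_SUBCATEGORIES.items():
        print(f"{name}: {snp_ratio(count, RCC_L_TOTAL_SNPS):.9f}")
    print("CDS subcategories sum to CDS:",
          cds_subcategories_consistent(RCC_L_CDS_SUBCATEGORIES, RCC_L_CDS))
```

The same two functions can be applied to the GF-L column by substituting its counts.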