
Rapid genomic DNA variation in newly hybridized carp lineages derived from Cyprinus carpio (♀) × Megalobrama amblycephala (♂).

Kaikun Luo1,2, Shi Wang1,2,3, Yeqing Fu1,2, Pei Zhou1,2, Xuexue Huang1,2, Qianhong Gu1,2, Wuhui Li1,2,4, Yude Wang1,2,3, Fangzhou Hu1,2,3, Shaojun Liu5,6.   

Abstract

BACKGROUND: Distant hybridization can generate changes in phenotypes and genotypes that lead to the formation of new hybrid lineages with genetic variation. In this study, the establishment of two bisexual fertile carp lineages, including the improved diploid common carp (IDC) lineage and the improved diploid scattered mirror carp (IDMC) lineage, from the interspecific hybridization of common carp (Cyprinus carpio, 2n = 100) (♀) × blunt snout bream (Megalobrama amblycephala, 2n = 48) (♂), provided a good platform to investigate the genetic relationship between the parents and their hybrid progenies.
RESULTS: We investigated the genetic variation of 12 Hox genes in the two types of improved carp lineages derived from common carp (♀) × blunt snout bream (♂). Hox gene clusters were abundant in the first generation of IDC, but most were not stably inherited in the second generation. In contrast, we did not find obvious mutations in Hox genes in the first generation of IDMC, and almost all the Hox gene clusters were stably inherited from the first generation to the second generation of IDMC. Interestingly, we found obvious recombinant clusters of Hox genes in both improved carp lineages, and some of these recombinant clusters were stably inherited from the first generation to the second generation in both types of improved carp lineages. On the other hand, some Hox genes were gradually becoming pseudogenes, and some genes were completely pseudogenised in IDC or IDMC.
CONCLUSIONS: Our results provided important evidence that distant hybridization produces rapid genomic DNA changes that may or may not be stably inherited, providing novel insights into the function of hybridization in the establishment of improved lineages used as new fish resources for aquaculture.


Keywords:  Distant hybridization; Hox gene; Lineage; Pseudogene; Recombinant cluster


Year:  2019        PMID: 31779581      PMCID: PMC6883602          DOI: 10.1186/s12863-019-0784-2

Source DB:  PubMed          Journal:  BMC Genet        ISSN: 1471-2156            Impact factor:   2.797


Background

Hybridization may cause interactions involving a wide range of types and levels of genetic divergence between the parental forms [1]. In nature, hybridization among species is reasonably common on a per-species basis, even though it is usually very rare on a per-individual basis. On a per-individual basis, isolation mechanisms (e.g., reproductive barriers) prevent high-frequency hybridization events among individuals of different species. Although hybrids are rare in populations, a few hybrids can provide a bridge that allows a trickle of alleles to pass between species. Thus, if species that hybridize are common, even low rates of hybridization per individual can have important evolutionary consequences in a high fraction of species. It was found that approximately 10–30% of multicellular animal and plant species hybridize regularly [2]. Hybridization among species can act as an additional, perhaps more abundant, source of adaptive genetic variation than mutation (very rare, approximately 10⁻⁸ to 10⁻⁹ per generation per base pair) [3-7]. For example, in Darwin’s finches, ‘New additive genetic variance introduced by hybridization is estimated to be two to three orders of magnitude greater than that introduced by mutation’ [3]. In both plants and animals, distant hybridization appears to facilitate speciation and adaptive radiation [8]. Hybridization has played a key role in recombining the adaptive traits of two species and generating novel phenotypes [9]. For example, common wheat (Triticum aestivum), which originated from hybridization between T. turgidum and Aegilops tauschii, has a significantly increased grain yield and harvest index [10]; another plant hybrid, derived from the interspecific hybridization between Vigna umbellata (♀) and V. exilis (♂), is tolerant to drought and flowers early [11].
In Cyprinidae, the autotetraploid hybrids that originated from the hybridization of red crucian carp (Carassius auratus red var., ♀) × blunt snout bream (Megalobrama amblycephala, ♂) reach sexual maturity at a significantly younger age than their allotetraploid parents [12]; the hybrids derived from blunt snout bream (♀) × Bleeker’s yellow tail (Xenocypris davidi Bleeker, ♂) showed a significantly higher growth rate than their parents [13]. Hybridization can lead to rapid genomic changes, including chromosomal rearrangements, genome expansion, genomic DNA variation, differential gene expression, and gene silencing [14]. One such example is that of Brassica hybrids, in which multiple genome rearrangements and segment deletions occurred within five generations [15]. In addition, Rieseberg et al. found extensive genomic reorganization in Helianthus hybrids, indicating the occurrence of rapid karyotypic evolution [16]. In Cyprinidae, chimeric genes (9.67–11.06%) and mutation events (1.02–1.16%) occurred in different generations of nascent allotetraploid hybrids [17]; Liu et al. reported chimeric genes in 19.04% and 4.17%, and mutations in 6.90% and 5.05%, of orthologous genes in the F1 and F2 of diploid hybrids, respectively [18]. Distant hybridization can generate changes in phenotypes and genotypes, leading to the formation of new hybrid lineages with genetic variation and providing a good experimental model for tracing genetic and epigenetic changes in the early stage of distant hybridization. Moreover, these newly established bisexual fertile diploid and tetraploid lineages provide new germplasm resources, which are used to produce improved diploid and triploid varieties, respectively, by crossing with diploid species [19-22].
In our previous study, we successfully obtained two types of improved carp offspring from common carp (2n = 100, abbreviated COC) (♀) × blunt snout bream (2n = 48, abbreviated BSB) (♂); one is the improved diploid common carp (2n = 100, IDC-F1), and the other is the improved diploid scattered mirror carp (2n = 100, IDMC-F1) [23]. In this study, we carried out self-crossing of these two types of improved carp offspring (IDC-F1 and IDMC-F1), respectively. Interestingly, the self-crossed offspring of IDC-F1 showed two phenotypes: one was consistent with that of their parents (abbreviated IDC-F2-C), and the other was very similar to that of IDMC-F1 (abbreviated IDC-F2-M) (Fig. 1a). In contrast, the self-crossed offspring of IDMC-F1 showed only one phenotype, that is, a scattered mirror carp-like appearance (abbreviated IDMC-F2) (Fig. 1a). To further explore the relationship of the genetic evolution of COC, IDC and IDMC, we studied the Hox gene structures in the genomic DNA of the different generations of the IDC and IDMC lineages. Determination of the genotypes of these lineages is very useful for understanding the processes associated with the genomic DNA changes that accompany phenotype changes.
Fig. 1

Crossing procedure and appearances of COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2; variable sequence types (including haplotypes and recombinant clusters) in different Hox genes in these species. a Crossing procedure and appearances of COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2. b Variable sequence types (including haplotypes and recombinant clusters) in HoxA4a in these species. c Variable sequence types (including haplotypes and recombinant clusters) in HoxD4a in these species. d Variable sequence types (including haplotypes and recombinant clusters) in HoxD10a in these species

Hox genes, which encode transcription factors, are essential for the development of various morphological features. In vertebrates, Hox genes consist of two exons and the highly conserved homeodomain (60 aa), which is encoded by the second exon [24]. Late evolutionary novelties are generally considered to be associated either with the emergence of particular lineages or with important steps in their unique evolution [25]. Recent studies have shown that the origin and evolution of the Hox genes played a crucial role in genome replication, sequence variation, and selective pressure [25-28]. The search for regulatory elements through comparative genomic approaches using Hox genes promises to be particularly successful because their nucleotide sequences and functions are extremely conserved in all vertebrates; meanwhile, Hox gene clusters provide a good starting point for the study of genetic variation in genomic DNA [29].

Results

Sequence information for COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 clones

We used 12 pairs of degenerate PCR primers (Additional file 1: Table S1) to obtain partial sequence information for 20 putative Hox genes from COC, 12 putative Hox genes from BSB, 42 putative and 15 recombinant Hox genes from IDC-F1, 19 putative and 5 recombinant Hox genes from IDMC-F1, 17 putative and 12 recombinant Hox genes from IDMC-F2, 19 putative and 10 recombinant Hox genes from IDC-F2-C, and 18 putative and 13 recombinant Hox genes from IDC-F2-M. All of these fragments were between 700 and 1600 bp in length and included the exon 1-intron-exon 2 region (Tables 1 and 2). To avoid biased amplification of only one copy of the characterized Hox genes, we selected 30 clones of each gene from IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 and 20 clones of each gene from COC and BSB. All fragments from COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 were confirmed to be Hox gene sequences via the NCBI website (http://www.ncbi.nlm.nih.gov), and each included the conserved homeobox region. All of the sequence information and GenBank accession numbers in this study are detailed in Additional file 1: Table S2.
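The reason for picking many clones per gene can be illustrated with a small calculation. The sketch below is not from the study: it assumes, purely for illustration, that every gene copy is equally likely to be amplified and cloned and that clones are independent draws. The function name `prob_copy_missed` and the copy numbers used are hypothetical.

```python
def prob_copy_missed(n_copies: int, n_clones: int) -> float:
    """Probability that one particular copy never appears among the clones.

    Assumption (not from the paper): each clone is an independent draw and
    all n_copies copies are equally likely to be picked.
    """
    if n_copies < 1 or n_clones < 0:
        raise ValueError("n_copies must be >= 1 and n_clones >= 0")
    return (1.0 - 1.0 / n_copies) ** n_clones


# Hypothetical scenarios: 5 copies with 30 clones, 2 copies with 20 clones.
for copies, clones in [(5, 30), (2, 20)]:
    p = prob_copy_missed(copies, clones)
    print(f"{copies} copies, {clones} clones: P(missing a given copy) = {p:.2e}")
```

Under these simplifying assumptions the chance of overlooking a given copy falls geometrically with the number of clones. Real amplification bias would make rare copies harder to recover than this sketch suggests.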
Table 1

PCR amplification bands (non-recombinant bands) in COC, BSB, IDC-F1, IDMC-F1, IDMC-F2, IDC-F2-C, and IDC-F2-M

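Gene | Species | Locus | Size (bp) | Exon 1 (bp) | Intron (bp) | Exon 2 (bp)
HoxA4a | COC | HoxA4ai | 1177 | 89–500 | 501–970 | 971–1177
HoxA4a | COC | HoxA4aii | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | BSB | HoxA4a-BSB | 1188 | 89–500 | 501–981 | 982–1188
HoxA4a | IDC-F1 | HoxA4ai | 1177 | 89–500 | 501–970 | 971–1177
HoxA4a | IDC-F1 | HoxA4aii | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDC-F1 | HoxA4aiii | 1184 | 89–500 | 501–977 | 978–1184
HoxA4a | IDC-F1 | HoxA4a-1 | 1181 | 89–500 | 501–974 | 975–1181
HoxA4a | IDC-F1 | HoxA4a-BSB | 1188 | 89–500 | 501–981 | 982–1188
HoxA4a | IDMC-F1 | HoxA4ai | 1177 | 89–500 | 501–970 | 971–1177
HoxA4a | IDMC-F1 | HoxA4aii | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDMC-F2 | HoxA4ai | 1177 | 89–500 | 501–970 | 971–1177
HoxA4a | IDMC-F2 | HoxA4aii | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDC-F2-C | HoxA4ai | 1177 | 89–500 | 501–970 | 971–1177
HoxA4a | IDC-F2-C | HoxA4aii | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDC-F2-M | HoxA4ai | 1177 | 89–500 | 501–970 | 971–1177
HoxA4a | IDC-F2-M | HoxA4aii | 1182 | 89–500 | 501–975 | 976–1182
HoxA9a | COC | HoxA9ai | 817 | 1–381 | 382–620 | 621–817
HoxA9a | COC | HoxA9aii | 891 | 1–381 | 382–694 | 695–891
HoxA9a | BSB | HoxA9b | 879 | 1–381 | 382–682 | 683–879
HoxA9a | IDC-F1 | HoxA9ai | 817 | 1–381 | 382–620 | 621–817
HoxA9a | IDC-F1 | HoxA9aii | 867 | 1–381 | 382–670 | 671–867
HoxA9a | IDMC-F1 | HoxA9ai | 817 | 1–381 | 382–620 | 621–817
HoxA9a | IDMC-F2 | HoxA9ai | 817 | 1–381 | 382–620 | 621–817
HoxA9a | IDC-F2-C | HoxA9ai | 817 | 1–381 | 382–620 | 621–817
HoxA9a | IDC-F2-C | HoxA9aii | 891 | 1–381 | 382–694 | 695–891
HoxA9a | IDC-F2-M | HoxA9ai | 817 | 1–381 | 382–620 | 621–817
HoxA2b | COC | HoxA2bi | 1490 | 1–314 | 315–905 | 906–1490
HoxA2b | COC | HoxA2bii | 1475 | 1–314 | 315–890 | 891–1475
HoxA2b | BSB | HoxA2b | 1479 | 1–311 | 312–894 | 895–1479
HoxA2b | IDC-F1 | HoxA2bi | 1490 | 1–314 | 315–905 | 906–1490
HoxA2b | IDC-F1 | HoxA2bii | 1475 | 1–314 | 315–890 | 891–1475
HoxA2b | IDC-F1 | HoxA2biii | 1486 | 1–314 | 315–901 | 902–1486
HoxA2b | IDC-F1 | HoxA2b-1 | 1448 | 1–314 | 315–863 | 864–1448
HoxA2b | IDMC-F1 | HoxA2bi | 1490 | 1–314 | 315–905 | 906–1490
HoxA2b | IDMC-F1 | HoxA2bii | 1475 | 1–314 | 315–890 | 891–1475
HoxA2b | IDC-F2-C | HoxA2bi | 1490 | 1–314 | 315–905 | 906–1490
HoxA2b | IDC-F2-C | HoxA2bii | 1475 | 1–314 | 315–890 | 891–1475
HoxA2b | IDC-F2-M | HoxA2bi | 1490 | 1–314 | 315–905 | 906–1490
HoxA2b | IDC-F2-M | HoxA2bii | 1475 | 1–314 | 315–890 | 891–1475
HoxA2b | IDMC-F2 | HoxA2bi | 1490 | 1–314 | 315–905 | 906–1490
HoxA2b | IDMC-F2 | HoxA2bii | 1475 | 1–314 | 315–890 | 891–1475
HoxA11b | COC | HoxA11bi | 1440 | 3–590 | 591–1342 | 1343–1440
HoxA11b | BSB | HoxA11b-BSB | 1703 | 3–602 | 603–1605 | 1606–1703
HoxA11b | IDC-F1 | HoxA11bi | 1440 | 3–590 | 591–1342 | 1343–1440
HoxA11b | IDC-F1 | HoxA11bii | 1401 | 3–590 | 591–1303 | 1304–1401
HoxA11b | IDMC-F1 | HoxA11bi | 1439 | 3–590 | 591–1342 | 1343–1439
HoxA11b | IDMC-F2 | HoxA11bi | 1439 | 3–590 | 591–1342 | 1343–1439
HoxA11b | IDC-F2-C | HoxA11bi | 1440 | 3–590 | 591–1342 | 1343–1440
HoxA11b | IDC-F2-M | HoxA11bi | 1440 | 3–590 | 591–1342 | 1343–1440
HoxB1a | COC | HoxB1aiψ | 1510 | - | - | -
HoxB1a | COC | HoxB1aii | 1526 | 1–462 | 463–1250 | 1251–1526
HoxB1a | BSB | HoxB1a | 1522 | 1–459 | 460–1246 | 1247–1522
HoxB1a | IDC-F1 | HoxB1aiψ | 1510 | - | - | -
HoxB1a | IDC-F1 | HoxB1aiii | 1484 | 1–450 | 451–1208 | 1209–1484
HoxB1a | IDMC-F1 | HoxB1aiψ | 1510 | - | - | -
HoxB1a | IDMC-F1 | HoxB1aii | 1525 | 1–462 | 463–1249 | 1250–1525
HoxB1a | IDMC-F2 | HoxB1aiψ | 1510 | - | - | -
HoxB1a | IDC-F2-C | HoxB1aiψ | 1510 | - | - | -
HoxB1a | IDC-F2-M | HoxB1aiψ | 1510 | - | - | -
HoxB4aψ | COC | HoxB4aiψ | 1631 | - | - | -
HoxB4aψ | BSB | HoxB4aψ | 1617 | - | - | -
HoxB4aψ | IDC-F1 | HoxB4aiψ | 1630 | - | - | -
HoxB4aψ | IDC-F1 | HoxB4aiiψ | 1613 | - | - | -
HoxB4aψ | IDMC-F1 | HoxB4aiψ | 1630 | - | - | -
HoxB4aψ | IDMC-F2 | HoxB4aiψ | 1630 | - | - | -
HoxB4aψ | IDC-F2-C | HoxB4aiψ | 1630 | - | - | -
HoxB4aψ | IDC-F2-M | HoxB4aiψ | 1630 | - | - | -
HoxB1b | COC | HoxB1bi | 731 | 1–477 | 478–565 | 566–731
HoxB1b | BSB | HoxB1b-BSB | 751 | 1–477 | 478–585 | 586–751
HoxB1b | IDC-F1 | HoxB1bi | 731 | 1–477 | 478–565 | 566–731
HoxB1b | IDC-F1 | HoxB1bii | 733 | 1–477 | 478–567 | 568–733
HoxB1b | IDC-F1 | HoxB1b-BSB | 751 | 1–477 | 478–585 | 586–751
HoxB1b | IDMC-F1 | HoxB1bi | 731 | 1–477 | 478–565 | 566–731
HoxB1b | IDMC-F2 | HoxB1bi | 731 | 1–477 | 478–565 | 566–731
HoxB1b | IDC-F2-C | HoxB1bi | 731 | 1–477 | 478–565 | 566–731
HoxB1b | IDC-F2-M | HoxB1bi | 731 | 1–477 | 478–565 | 566–731
HoxB5b | COC | HoxB5bi | 1191 | 1–561 | 562–985 | 986–1191
HoxB5b | COC | HoxB5bii | 1190 | 1–564 | 565–984 | 985–1190
HoxB5b | BSB | HoxB5b-BSB | 1227 | 1–564 | 565–1021 | 1022–1227
HoxB5b | IDC-F1 | HoxB5bi | 1191 | 1–561 | 562–985 | 986–1191
HoxB5b | IDC-F1 | HoxB5bii | 1190 | 1–564 | 565–984 | 985–1190
HoxB5b | IDC-F1 | HoxB5biii | 1196 | 1–561 | 562–990 | 991–1196
HoxB5b | IDC-F1 | HoxB5b-BSB | 1226 | 1–564 | 565–1020 | 1021–1226
HoxB5b | IDMC-F1 | HoxB5bi | 1191 | 1–561 | 562–985 | 986–1191
HoxB5b | IDMC-F1 | HoxB5bii | 1190 | 1–564 | 565–984 | 985–1190
HoxB5b | IDMC-F2 | HoxB5bi | 1191 | 1–561 | 562–985 | 986–1191
HoxB5b | IDC-F2-C | HoxB5bi | 1190 | 1–561 | 562–984 | 985–1190
HoxB5b | IDC-F2-C | HoxB5bii | 1190 | 1–564 | 565–984 | 985–1190
HoxB5b | IDC-F2-M | HoxB5bi | 1190 | 1–561 | 562–984 | 985–1190
HoxB5b | IDC-F2-M | HoxB5bii | 1190 | 1–564 | 565–984 | 985–1190
HoxC4a | COC | HoxC4ai | 1169 | 1–410 | 411–928 | 929–1169
HoxC4a | COC | HoxC4aii | 1176 | 1–410 | 411–935 | 936–1176
HoxC4a | BSB | HoxC4a | 1125 | 1–410 | 411–933 | 934–1125
HoxC4a | IDC-F1 | HoxC4ai | 1169 | 1–410 | 411–928 | 929–1169
HoxC4a | IDC-F1 | HoxC4aii | 1176 | 1–410 | 411–935 | 936–1176
HoxC4a | IDC-F1 | HoxC4aiii | 1173 | 1–410 | 411–932 | 933–1173
HoxC4a | IDC-F1 | HoxC4a-1 | 1179 | 1–410 | 411–938 | 939–1179
HoxC4a | IDMC-F1 | HoxC4ai | 1169 | 1–410 | 411–928 | 929–1169
HoxC4a | IDMC-F1 | HoxC4aii | 1176 | 1–410 | 411–935 | 936–1176
HoxC4a | IDMC-F2 | HoxC4ai | 1168 | 1–410 | 411–928 | 929–1168
HoxC4a | IDMC-F2 | HoxC4aii | 1174 | 1–410 | 411–934 | 935–1174
HoxC4a | IDC-F2-C | HoxC4ai | 1169 | 1–410 | 411–928 | 929–1169
HoxC4a | IDC-F2-C | HoxC4aii | 1176 | 1–410 | 411–935 | 936–1176
HoxC4a | IDC-F2-M | HoxC4ai | 1169 | 1–410 | 411–928 | 929–1169
HoxC4a | IDC-F2-M | HoxC4aii | 1175 | 1–410 | 411–934 | 935–1175
HoxC6b | COC | HoxC6bi | 942 | 2–392 | 393–763 | 764–942
HoxC6b | BSB | HoxC6b-BSB | 922 | 2–392 | 393–737 | 738–922
HoxC6b | IDC-F1 | HoxC6bi | 949 | 2–392 | 393–763 | 764–949
HoxC6b | IDC-F1 | HoxC6bii | 964 | 2–392 | 393–778 | 779–964
HoxC6b | IDC-F1 | HoxC6b-BSB | 923 | 2–392 | 393–737 | 738–923
HoxC6b | IDMC-F1 | HoxC6bi | 949 | 2–392 | 393–763 | 764–949
HoxC6b | IDMC-F2 | HoxC6bi | 949 | 2–392 | 393–763 | 764–949
HoxC6b | IDC-F2-C | HoxC6bi | 949 | 2–392 | 393–763 | 764–949
HoxC6b | IDC-F2-M | HoxC6bi | 949 | 2–392 | 393–763 | 764–949
HoxD4a | COC | HoxD4ai | 942 | 1–315 | 316–717 | 718–942
HoxD4a | COC | HoxD4aii | 944 | 1–315 | 316–719 | 720–944
HoxD4a | BSB | HoxD4a-BSB | 911 | 1–306 | 307–686 | 687–911
HoxD4a | IDC-F1 | HoxD4ai | 942 | 1–315 | 316–717 | 718–942
HoxD4a | IDC-F1 | HoxD4aii | 944 | 1–315 | 316–719 | 720–944
HoxD4a | IDC-F1 | HoxD4aiii | 952 | 1–315 | 316–727 | 728–952
HoxD4a | IDC-F1 | HoxD4a-1 | 937 | 1–315 | 316–712 | 713–937
HoxD4a | IDC-F1 | HoxD4a-2 | 960 | 1–315 | 316–735 | 736–960
HoxD4a | IDC-F1 | HoxD4a-BSB | 911 | 1–306 | 307–686 | 687–911
HoxD4a | IDMC-F1 | HoxD4ai | 942 | 1–315 | 316–717 | 718–942
HoxD4a | IDMC-F1 | HoxD4aii | 944 | 1–315 | 316–719 | 720–944
HoxD4a | IDMC-F2 | HoxD4ai | 942 | 1–315 | 316–717 | 718–942
HoxD4a | IDMC-F2 | HoxD4aii | 944 | 1–315 | 316–719 | 720–944
HoxD4a | IDC-F2-C | HoxD4ai | 942 | 1–315 | 316–717 | 718–942
HoxD4a | IDC-F2-C | HoxD4aii | 944 | 1–315 | 316–719 | 720–944
HoxD4a | IDC-F2-M | HoxD4ai | 942 | 1–315 | 316–717 | 718–942
HoxD4a | IDC-F2-M | HoxD4aii | 944 | 1–315 | 316–719 | 720–944
HoxD10a | COC | HoxD10ai | 1551 | 1–589 | 590–1321 | 1322–1551
HoxD10a | COC | HoxD10aii | 1546 | 1–592 | 593–1316 | 1317–1546
HoxD10a | BSB | HoxD10a-BSB | 1574 | 1–592 | 593–1344 | 1345–1574
HoxD10a | IDC-F1 | HoxD10ai | 1554 | 1–589 | 590–1324 | 1325–1554
HoxD10a | IDC-F1 | HoxD10aii | 1546 | 1–592 | 593–1316 | 1317–1546
HoxD10a | IDC-F1 | HoxD10aiii | 1495 | 1–592 | 593–1265 | 1266–1495
HoxD10a | IDC-F1 | HoxD10a-1 | 1480 | 1–592 | 593–1250 | 1251–1480
HoxD10a | IDC-F1 | HoxD10a-BSB | 1574 | 1–592 | 593–1344 | 1345–1574
HoxD10a | IDMC-F1 | HoxD10ai | 1554 | 1–589 | 590–1324 | 1325–1554
HoxD10a | IDMC-F1 | HoxD10aii | 1546 | 1–592 | 593–1316 | 1317–1546
HoxD10a | IDMC-F2 | HoxD10ai | 1554 | 1–589 | 590–1324 | 1325–1554
HoxD10a | IDMC-F2 | HoxD10aii | 1546 | 1–592 | 593–1316 | 1317–1546
HoxD10a | IDC-F2-C | HoxD10ai | 1554 | 1–589 | 590–1324 | 1325–1554
HoxD10a | IDC-F2-C | HoxD10aii | 1546 | 1–592 | 593–1316 | 1317–1546
HoxD10a | IDC-F2-M | HoxD10ai | 1554 | 1–589 | 590–1324 | 1325–1554
HoxD10a | IDC-F2-M | HoxD10aii | 1544 | 1–592 | 593–1314 | 1315–1544

ψ denotes a pseudogene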
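The coordinates in Table 1 follow a simple pattern: exon 1, the intron, and exon 2 are contiguous, and exon 2 ends at the fragment size. A minimal sketch of checking that pattern is given below. The three rows are copied from Table 1, coordinates are treated as 1-based and inclusive (an assumption, since the table does not state the convention), and the helper names `is_consistent` and `region_length` are hypothetical.

```python
# (locus, size, exon 1, intron, exon 2) copied from Table 1.
# Coordinates are assumed to be 1-based and inclusive.
ROWS = [
    ("HoxA4ai", 1177, (89, 500), (501, 970), (971, 1177)),
    ("HoxA9aii", 891, (1, 381), (382, 694), (695, 891)),
    ("HoxD10a-BSB", 1574, (1, 592), (593, 1344), (1345, 1574)),
]


def is_consistent(size, exon1, intron, exon2):
    """True if the regions are contiguous and exon 2 ends at the fragment size."""
    return (
        exon1[1] + 1 == intron[0]
        and intron[1] + 1 == exon2[0]
        and exon2[1] == size
    )


def region_length(region):
    """Length in bp of an inclusive (start, end) interval."""
    start, end = region
    return end - start + 1


for locus, size, exon1, intron, exon2 in ROWS:
    ok = is_consistent(size, exon1, intron, exon2)
    print(f"{locus}: consistent={ok}, intron length={region_length(intron)} bp")
```

In these rows the length differences between loci of the same gene sit in the intron, which is why the intron length is printed alongside the check.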

Table 2

PCR amplification bands (recombinant bands) in COC, BSB, IDC-F1, IDMC-F1, IDMC-F2, IDC-F2-C, and IDC-F2-M

Gene | Species | Locus | Size (bp) | Exon 1 (bp) | Intron (bp) | Exon 2 (bp)
HoxA4a | IDC-F1 | HoxA4ai + HoxA4aii | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDC-F1 | HoxA4aii + HoxA4a-1 + HoxA4ai | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDC-F1 | HoxA4ai + HoxA4aiii | 1184 | 89–500 | 501–977 | 978–1184
HoxA4a | IDC-F1 | HoxA4ai + HoxA4a-BSB | 1188 | 89–500 | 501–981 | 982–1188
HoxA4a | IDC-F1 | HoxA4ai + HoxA4a-BSB + HoxA4ai | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDC-F1 | HoxA4a-BSB + HoxA4ai | 1182/1188 | 89–500 | 501–975/981 | 976/982–1182/1188
HoxA4a | IDC-F1 | HoxA4a-BSB + HoxA4aii | 1188 | 89–500 | 501–981 | 982–1188
HoxA4a | IDMC-F1 | HoxA4aii + HoxA4ai | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDMC-F2 | HoxA4aii + HoxA4ai | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDMC-F2 | HoxA4ai + HoxA4aii | 1182 | 89–500 | 501–975 | 976–1182
HoxA4a | IDC-F2-C | HoxA4aii + HoxA4ai | 1177/1182 | 89–500 | 501–970/975 | 971/976–1177/1182
HoxA4a | IDC-F2-C | HoxA4ai + HoxA4aii | 1177/1182 | 89–500 | 501–970/975 | 971/976–1177/1182
HoxA4a | IDC-F2-M | HoxA4ai + HoxA4aii | 1177/1182 | 89–500 | 501–970/975 | 971/976–1177/1182
HoxA9a | IDMC-F2 | HoxA9ai + HoxA9aii | 891 | 1–381 | 382–694 | 695–891
HoxA2b | IDC-F1 | HoxA2bi + HoxA2bii | 1490 | 1–314 | 315–905 | 906–1490
HoxA2b | IDMC-F2 | HoxA2bi + HoxA2bii | 1490 | 1–314 | 315–905 | 906–1490
HoxA2b | IDC-F2-C | HoxA2bii + HoxA2bi | 1475 | 1–314 | 315–890 | 891–1475
HoxA2b | IDC-F2-M | HoxA2bii + HoxA2bi | 1475 | 1–314 | 315–890 | 891–1475
HoxA11b | IDMC-F2 | HoxA11bi + HoxA11b-BSB + HoxA11bi | 1439/1455 | 3–590/605 | 591/606–1342/1358 | 1343/1359–1439/1455
HoxA11b | IDC-F2-M | HoxA11bi + HoxA11b-BSB + HoxA11bi | 1455 | 3–605 | 606–1357 | 1358–1455
HoxB1a | IDC-F2-M | HoxB1ai + HoxB1aiiψ | 1510 | 1–462 | 463–1234 | 1235–1510
HoxB1a | IDC-F2-M | HoxB1aii + HoxB1aiψ | 1503 | 1–462 | 463–1227 | 1228–1503
HoxB5b | IDC-F1 | HoxB5bi + HoxB5bii + HoxB5bi | 1190 | 1–564 | 565–984 | 985–1190
HoxB5b | IDC-F1 | HoxB5bi + HoxB5bii | 1190 | 1–564 | 565–984 | 985–1190
HoxB5b | IDMC-F1 | HoxB5bii + HoxB5bi | 1194 | 1–564 | 565–988 | 989–1194
HoxB5b | IDMC-F2 | HoxB5bi + HoxB5bii | 1187 | 1–561 | 562–981 | 982–1187
HoxB5b | IDMC-F2 | HoxB5bii + HoxB5bi + HoxB5bii | 1194 | 1–564 | 565–988 | 989–1194
HoxB5b | IDC-F2-C | HoxB5bii + HoxB5bi | 1190 | 1–561 | 562–984 | 985–1190
HoxB5b | IDC-F2-M | HoxB5bii + HoxB5bi | 1190 | 1–564 | 565–984 | 985–1190
HoxB5b | IDC-F2-M | HoxB5bi + HoxB5bii | 1188 | 1–561 | 562–982 | 983–1188
HoxC4a | IDC-F1 | HoxC4aii + HoxC4ai | 1174 | 1–410 | 411–933 | 934–1174
HoxC4a | IDC-F1 | HoxC4aii + HoxC4aiii | 1173 | 1–410 | 411–932 | 933–1173
HoxC4a | IDMC-F1 | HoxC4aii + HoxC4aiψ | 1169 | 1–410 | 411–928 | 929–1169
HoxC4a | IDMC-F2 | HoxC4ai + HoxC4aii | 1168/1175 | 1–410 | 411–928/935 | 929/936–1168/1175
HoxC4a | IDMC-F2 | HoxC4aii + HoxC4ai | 1168 | 1–410 | 411–928 | 929–1168
HoxC4a | IDC-F2-C | HoxC4ai + HoxC4aii | 1169/1175 | 1–410 | 411–928/934 | 929/935–1169/1175
HoxC4a | IDC-F2-C | HoxC4aii + HoxC4ai | 1169 | 1–410 | 411–928 | 929–1169
HoxC4a | IDC-F2-C | HoxC4ai + HoxC4aii + HoxC4ai | 1175 | 1–410 | 411–934 | 935–1175
HoxC4a | IDC-F2-M | HoxC4ai + HoxC4aii | 1175 | 1–410 | 411–934 | 935–1175
HoxC4a | IDC-F2-M | HoxC4aii + HoxC4ai | 1175 | 1–410 | 411–934 | 935–1175
HoxD4a | IDC-F1 | HoxD4aiii + HoxD4a-1 | 937 | 1–315 | 316–712 | 713–937
HoxD4a | IDMC-F1 | HoxD4ai + HoxD4aii | 944 | 1–315 | 316–719 | 720–944
HoxD4a | IDMC-F2 | HoxD4ai + HoxD4aii | 944 | 1–315 | 316–719 | 720–944
HoxD4a | IDMC-F2 | HoxD4aii + HoxD4ai | 942 | 1–315 | 316–717 | 718–942
HoxD4a | IDC-F2-C | HoxD4ai + HoxD4aii + HoxD4ai | 936 | 1–314 | 315–711 | 712–936
HoxD4a | IDC-F2-M | HoxD4aii + HoxD4ai | 937/944 | 1–315 | 316–712/719 | 713/720–937/944
HoxD4a | IDC-F2-M | HoxD4ai + HoxD4aii + HoxD4ai | 937 | 1–315 | 316–712 | 713–937
HoxD4a | IDC-F2-M | HoxD4aii + HoxD4ai + HoxD4aii + HoxD4ai | 937 | 1–315 | 316–712 | 713–937
HoxD10a | IDC-F1 | HoxD10aii + HoxD10ai | 1554 | 1–589 | 590–1324 | 1325–1554
HoxD10a | IDC-F1 | HoxD10ai + HoxD10a-1 | 1494 | 1–588 | 589–1264 | 1265–1494
HoxD10a | IDMC-F1 | HoxD10aii + HoxD10ai | 1545 | 1–592 | 593–1315 | 1316–1545
HoxD10a | IDMC-F2 | HoxD10aii + HoxD10ai | 1542/1558 | 1–592 | 593–1312/1329 | 1313/1330–1542/1558
HoxD10a | IDC-F2-C | HoxD10aii + HoxD10ai | 1556 | 1–592 | 593–1326 | 1327–1556
HoxD10a | IDC-F2-C | HoxD10ai + HoxD10aii + HoxD10ai | 1545 | 1–592 | 593–1315 | 1316–1545
HoxD10a | IDC-F2-M | HoxD10ai + HoxD10aii | 1544/1554 | 1–589 | 590–1314/1324 | 1315/1325–1544/1554

ψ denotes a pseudogene


Molecular organization of the Hox genes sequences

The organization of the Hox clusters in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 is shown in Tables 1 and 2. Figure 1 and Additional file 2: Figures S1-S3 illustrate the genetic variation in the Hox gene clusters of the two types of improved carp lineages. The Hox gene cluster organization showed that IDC-F1, as the first generation of distant hybridization, had undergone extensive mutation; for example, in HoxA4a, IDC-F1 has five putative clusters and seven recombinant clusters (Fig. 1b and Tables 1 and 2); in HoxD4a, IDC-F1 has six putative clusters and one recombinant cluster (Fig. 1c and Tables 1 and 2); in HoxD10a, IDC-F1 has five putative clusters and two recombinant clusters (Fig. 1d and Tables 1 and 2). However, most of the Hox gene clusters in IDC-F1 (a total of 42 putative and 15 recombinant Hox gene clusters) were not stably inherited in the second generation (IDC-F2-C and IDC-F2-M): IDC-F2-C has only 19 putative and 10 recombinant Hox gene clusters, and IDC-F2-M has only 18 putative and 13 recombinant Hox gene clusters (Fig. 1, Additional file 2: Figures S1-S3 and Tables 1 and 2). In contrast, although IDMC-F1 was also the first generation of a distant hybridization, we did not find obvious mutations in its Hox genes; almost all of the Hox gene clusters were derived from the female parent, COC, except for five recombinant clusters (Fig. 1, Additional file 2: Figures S1-S3 and Tables 1 and 2). Almost all of the Hox gene clusters in IDMC-F1 were stably inherited by the second generation (IDMC-F2), but IDMC-F2 had more obvious recombination events (the number of recombinant clusters increased to 12) (Fig. 1, Additional file 2: Figures S1-S3 and Tables 1 and 2). In this study, the self-crossed offspring of IDC-F1 showed two phenotypes.
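The comparison between generations described above amounts to set operations on the loci listed per sample. The sketch below applies this to the non-recombinant HoxA4a loci from Table 1. It is an illustration only, not the authors' analysis pipeline, and the function name `compare_generations` is hypothetical.

```python
# Non-recombinant HoxA4a loci per sample, copied from Table 1.
HOXA4A = {
    "IDC-F1": {"HoxA4ai", "HoxA4aii", "HoxA4aiii", "HoxA4a-1", "HoxA4a-BSB"},
    "IDC-F2-C": {"HoxA4ai", "HoxA4aii"},
    "IDMC-F1": {"HoxA4ai", "HoxA4aii"},
    "IDMC-F2": {"HoxA4ai", "HoxA4aii"},
}


def compare_generations(parent, offspring):
    """Return (retained, lost, gained) loci between two generations."""
    return parent & offspring, parent - offspring, offspring - parent


for f1, f2 in [("IDC-F1", "IDC-F2-C"), ("IDMC-F1", "IDMC-F2")]:
    retained, lost, gained = compare_generations(HOXA4A[f1], HOXA4A[f2])
    print(f"{f1} -> {f2}: retained={sorted(retained)}, "
          f"lost={sorted(lost)}, gained={sorted(gained)}")
```

For HoxA4a this reproduces the pattern reported in the text: IDC-F1 passes on only two of its five putative loci to IDC-F2-C, whereas both IDMC-F1 loci are present in IDMC-F2.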
One of the offspring, IDC-F2-M, had a similar phenotype to that of IDMC-F1, so we searched for similarities and differences among the Hox gene clusters between the offspring IDC-F2-M and IDMC-F1 or IDMC-F2. Notably, we found similarities in the Hox gene clusters of these species (Fig. 1, Additional file 2: Figures S1-S3 and Tables 1 and 2); for example, the type of recombinant cluster including HoxA11b (HoxA11bi + HoxA11b-BSB + HoxA11bi) was found in only IDC-F2-M and IDMC-F2 (Additional file 2: Figure S1 c). In addition, as shown in Fig. 1, Additional file 2: Figures S1-S3 and Tables 1 and 2, IDC-F2-M possessed more abundant Hox gene clusters than IDMC-F1, similar to IDMC-F2, except that the Hox genes of IDMC-F2 were mainly concentrated in recombinant clusters. Among these Hox gene clusters, we found that all copies of HoxB4a in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 were pseudogenes containing a stop codon that prematurely terminates the expression of a full-length functional product (Fig. 2a, b and Tables 1 and 2). We also found that the copies of HoxB1ai in COC, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 were pseudogenes due to stop codons (Fig. 2c, d and Tables 1 and 2). These results revealed that the Hox gene family in cyprinid fishes had undergone rapid evolution, with some genes gradually becoming pseudogenes, and some genes completely pseudogenised. Moreover, we also found pseudogenes in the recombinant clusters; for example, HoxB1ai + HoxB1aii and HoxB1aii + HoxB1ai in IDC-F2-M (Fig. 2c and Tables 1 and 2) and HoxC4aii + HoxC4ai in IDMC-F1 (Fig. 2e and Tables 1 and 2).
Fig. 2

Pseudogene sequences of HoxB1a, HoxB4a, and HoxC4a. a The nucleotide sequences of HoxB4a in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2. b The putative amino acid sequence of HoxB4ai in COC. c The nucleotide sequences of HoxB1a in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2. d The putative amino acid sequence of HoxB1ai in COC. e The nucleotide sequences of HoxC4a in COC and IDMC-F1. The red boxes indicate the stop codon bases. The green boxes indicate that there is no corresponding amino acid site due to the occurrence of the stop codon, and the “*” sign is used instead
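The pseudogene criterion used here, a stop codon that prematurely terminates the reading frame, can be sketched as a scan over in-frame codons. The example below is a minimal illustration under stated assumptions: the input is a spliced coding sequence already in frame, the two toy sequences are hypothetical and not taken from Fig. 2, and `first_premature_stop` is a name chosen for this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    found before the final codon, or None if there is none.

    Assumes `cds` starts in frame; the final codon is skipped because it
    may be the normal terminator.
    """
    cds = cds.upper()
    n_codons = len(cds) // 3
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical toy sequences, for illustration only.
intact = "ATGGCTCTGAAATAA"  # ATG GCT CTG AAA TAA
broken = "ATGGCTTGAAAATAA"  # ATG GCT TGA AAA TAA (stop at the third codon)

print(first_premature_stop(intact))
print(first_premature_stop(broken))
```

A real check would first join exon 1 and exon 2 using coordinates such as those in Tables 1 and 2, since the intron is not translated.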


Phylogenetic relationships

An unrooted phylogenetic tree of 12 Hox genes was constructed using MrBayes based on the alignment results (Fig. 3). The overall phylogenetic tree was divided into twelve well-conserved clades, and each clade contained one zebrafish Hox gene. We also analysed the percentage nucleotide identity and the percentage amino acid identity between duplicated Hox coding regions in COC, BSB, IDC-F1, IDMC-F1, IDMC-F2, IDC-F2-C, and IDC-F2-M (Tables 3 and 4). As shown in Tables 3 and 4, close relationships were observed among IDC-F1, IDC-F2-C, and IDC-F2-M within the IDC lineage; between IDMC-F1 and IDMC-F2 within the IDMC lineage; and among IDC-F1, IDMC-F1, IDMC-F2, IDC-F2-C, and IDC-F2-M across both lineages. To evaluate the speciation of the two types of improved carp lineages, the percentages of nucleotide (amino acid) identity among the 12 Hox gene groups in COC, BSB, and both improved carp lineages were examined (Tables 3 and 4, Fig. 3). The identities of the orthologous Hox genes between the two types of improved carp lineages and COC were much higher than those between the two types of improved carp lineages and BSB, except for the gene clusters inherited from BSB. In some Hox genes, such as HoxA4a, HoxA2b and HoxC4a, both the nucleotide and amino acid sequences of both improved carp lineages had a high degree of identity to COC and BSB. In some Hox genes, such as HoxA11b, HoxC6b, HoxD4a and HoxD10a, the nucleotide sequences of the two types of improved carp lineages had lower identities to COC or BSB, but the amino acid sequences had higher identities, which suggested that most mutations were synonymous. In some Hox genes, such as HoxA9a, both the nucleotide and amino acid sequences of both improved carp lineages had a low degree of identity to COC and BSB (Tables 3 and 4).
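The inference that most mutations were synonymous rests on nucleotide identity being lower than amino acid identity. A minimal sketch of that comparison follows. The text does not state which tool produced the values in Tables 3 and 4, so this is only an illustration: identity is taken as matching positions over alignment length for two gap-free sequences of equal length, the standard genetic code is used, and the two short sequences are hypothetical.

```python
# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(cds):
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical toy sequences differing at two third-codon positions.
seq_a = "ATGGCTCTGAAA"  # M A L K
seq_b = "ATGGCCCTTAAA"  # M A L K (GCT->GCC and CTG->CTT are synonymous)

nt_identity = percent_identity(seq_a, seq_b)
aa_identity = percent_identity(translate(seq_a), translate(seq_b))
print(f"nucleotide identity: {nt_identity:.1f}%")
print(f"amino acid identity: {aa_identity:.1f}%")
```

Here two of twelve bases differ, yet the translated peptides are identical, which is the pattern the text describes for HoxA11b, HoxC6b, HoxD4a and HoxD10a.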
Fig. 3

Phylogenetic analyses of the amino acid sequences of 12 Hox genes (HoxA4a, HoxA9a, HoxA2b, HoxA11b, HoxB1a, HoxB4a, HoxB1b, HoxB5b, HoxC4a, HoxC6b, HoxD4a, and HoxD10a) in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, IDMC-F2 and zebrafish (Danio rerio). Phylogenetic tree constructed using MrBayes with the HKY + I + G model (−lnL = 15,356.5967); MCMC = 2 million generations. The phylogenetic tree for each Hox gene is marked by a separate colour, as shown in the figure

Table 3

Percentage nucleotide identity (on the left) and percentage amino acid identity (on the right) between duplicated Hox coding regions in COC, BSB, IDC-F1, IDMC-F1, IDMC-F2, IDC-F2-C, and IDC-F2-M (following the format below)

HoxA4a (%)HoxA9a (%)HoxA2b (%)HoxA11b (%)HoxB1a (%)HoxB4a (%)HoxB1b (%)HoxB5b (%)HoxC4a (%)HoxC6b (%)HoxD4a (%)HoxD10a (%)
IDC-F1 i: IDC-F1 ii94.5/98.581.8/88.593.7/97.688.9/96.092.3/P94.8/95.794.1/96.498.0/99.092.1/98.490.7/96.688.0/97.0
: IDC-F1 iii93.0/98.596.1/97.991.5/P94.8/96.098.5/98.686.9/94.985.8/96.3
: IDC-F1 (1)96.4/98.090.1/95.997.5/99.098.5/99.490.3/97.4
: IDC-F1 (2)89.1/95.5
: IDC-F1 (BSB)92.3/98.586.2/91.189.4/97.283.4/97.387.8/94.981.3/96.7
: IDMC-F1 i100.0/100.099.8/100.099.5/98.999.8/99.899.6/P99.7/P100.0/100.0100.0/100.0100.0/100.0100.0/100.099.8/99.499.9/100.0
: IDMC-F1 ii94.6/98.593.8/97.388.1/P94.3/96.897.9/98.690.8/97.288.3/97.4
: IDMC-F2 i100.0/100.0100.0/100.099.6/98.999.7/99.799.5/P99.6/P100.0/100.0100.0/100.099.9/100.099.8/100.099.8/99.499.6/99.6
: IDMC-F2 ii94.5/98.593.7/97.397.7/98.690.9/97.288.4/97.4
: IDC-F2-C i99.7/99.099.8/100.099.6/98.699.9/100.099.4/P99.7/P100.0/100.099.8/100.0100.0/100.099.8/100.099.7/98.8100.0/100.0
: IDC-F2-C ii94.5/98.580.1/86.993.7/97.694.3/96.898.0/99.090.9/97.288.5/97.4
: IDC-F2-M i99.9/100.099.8/100.099.7/99.3100.0/100.099.5/P99.7/P100.0/100.099.8/100.099.9/100.099.8/100.099.7/99.499.9/100.0
: IDC-F2-M ii94.2/98.593.7/97.694.3/96.497.9/99.090.8/96.688.2/97.0
: COC i99.8/99.0100.0/100.099.6/99.399.8/99.799.5/P99.8/P99.5/100.099.9/100.099.9/100.099.1/98.999.6/98.899.2/98.5
: COC ii94.5/98.080.2/88.593.8/97.688.2/P94.1/96.498.0/99.090.8/97.288.3/97.4
: BSB92.3/98.577.3/85.494.9/96.974.2/94.586.7/P85.9/P86.2/91.189.9/97.693.4/91.683.3/97.387.8/94.981.4/97.0
IDC-F1 ii: IDC-F1 iii96.2/99.592.6/97.693.2/95.798.4/99.592.7/96.087.9/95.6
: IDC-F1 (1)93.5/97.592.3/96.398.7/100.090.8/97.284.0/96.7
: IDC-F1 (2)86.0/94.4
: IDC-F1 (BSB)92.2/98.086.4/89.790.0/96.081.3/96.386.2/93.881.0/96.3
: IDMC-F1 i94.5/98.581.9/88.593.7/97.988.8/95.992.3/P94.8/95.794.1/96.498.0/99.092.1/98.490.8/97.288.1/97.0
: IDMC-F1 ii99.7/100.099.7/99.699.1/99.699.9/99.599.4/99.499.2/98.9
: IDMC-F2 i94.5/98.581.8/88.593.7/97.988.9/96.092.3/P94.8/95.794.1/96.497.9/99.092.0/98.490.8/97.287.8/97.4
: IDMC-F2 ii99.8/100.099.5/99.699.6/99.599.5/99.499.4/99.6
: IDC-F2-C i94.4/97.581.9/88.593.7/97.688.9/96.092.3/P94.8/95.793.9/96.498.0/99.092.0/98.490.7/96.688.0/97.0
: IDC-F2-C ii99.8/100.092.6/95.899.5/100.099.3/99.6100.0/100.099.5/99.499.4/99.6
: IDC-F2-M i94.6/98.581.9/88.593.9/98.388.9/96.092.3/P94.8/95.794.1/96.498.1/99.092.0/98.490.7/97.288.0/97.0
: IDC-F2-M ii99.5/100.099.5/100.099.1/99.299.8/100.099.4/98.899.2/99.2
: COC i94.5/98.581.8/88.593.9/98.388.9/95.992.2/P94.6/95.794.0/96.498.1/99.091.5/97.390.8/96.687.6/96.3
: COC ii99.6/99.592.1/95.399.7/100.099.1/99.2100.0/100.099.3/99.499.5/99.6
: BSB92.2/98.081.0/90.194.9/97.371.6/91.685.4/P86.4/89.790.4/96.493.3/92.581.4/96.386.2/93.881.1/96.7
IDC-F1 iii: IDC-F1 (1)92.0/97.089.8/96.398.3/99.586.9/95.584.6/98.1
: IDC-F1 (2)86.0/96.0
: IDC-F1 (BSB)90.8/97.588.7/95.784.1/92.778.4/95.2
: IDMC-F1 i93.0/98.596.1/98.391.2/P94.8/96.098.5/98.687.0/95.585.7/96.3
: IDMC-F1 ii96.2/99.592.6/97.384.7/92.693.4/96.098.3/99.092.8/96.688.2/95.9
: IDMC-F2 i93.0/98.596.2/98.391.2/P94.8/96.098.4/98.687.0/95.585.7/96.7
: IDMC-F2 ii96.2/99.592.5/97.398.2/99.092.9/96.688.2/95.9
: IDC-F2-C i92.7/97.596.2/97.991.1/P94.8/96.098.5/98.686.9/94.985.8/96.3
: IDC-F2-C ii96.2/99.592.5/97.693.5/96.098.4/99.592.9/96.688.2/95.9
: IDC-F2-M i92.9/98.596.3/98.691.2/P94.6/96.098.6/98.686.9/95.585.8/96.3
: IDC-F2-M ii95.9/99.592.5/97.693.4/95.798.3/99.592.8/96.088.1/95.6
: COC i93.0/98.096.2/98.691.2/P94.7/96.098.6/98.687.0/94.985.4/95.6
: COC ii96.2/99.092.6/97.684.9/92.693.4/95.798.4/99.592.7/96.688.1/95.9
: BSB90.8/97.593.5/97.383.4/94.288.8/96.093.3/92.184.1/92.778.5/95.6
IDC-F1 (1): IDC-F1 (2)88.8/96.0
: IDC-F1 (BSB)91.3/97.587.8/95.578.1/96.3
: IDMC-F1 i96.4/98.090.1/96.397.5/99.098.6/100.090.3/97.4
: IDMC-F1 ii93.5/97.592.4/95.998.6/99.590.9/97.784.2/97.0
: IDMC-F2 i96.4/98.090.2/96.397.4/99.098.6/100.090.2/97.8
: IDMC-F2 ii93.5/97.592.3/95.998.3/99.591.0/97.784.2/97.0
: IDC-F2-C i96.1/97.090.2/95.997.5/99.098.5/99.490.3/97.4
: IDC-F2-C ii93.5/97.592.3/96.398.7/100.091.0/97.784.3/97.0
: IDC-F2-M i96.3/98.090.3/96.697.6/99.098.5/100.090.3/97.4
: IDC-F2-M ii93.2/97.592.3/96.398.5/100.090.9/97.284.1/96.7
: COC i96.4/98.090.2/96.697.6/99.098.6/99.490.1/96.7
: COC ii93.5/97.092.4/96.398.7/100.090.9/97.784.2/97.0
: BSB91.3/97.591.0/96.393.4/92.587.8/95.578.2/96.7
IDC-F1 (2): IDC-F1 (BSB)83.5/94.4
: IDMC-F1 i89.2/96.0
: IDMC-F1 ii86.2/94.9
: IDMC-F2 i89.2/96.0
: IDMC-F2 ii86.3/94.9
: IDC-F2-C i89.1/95.5
: IDC-F2-C ii86.3/94.9
: IDC-F2-M i89.1/96.0
: IDC-F2-M ii86.2/94.4
: COC i89.2/95.5
: COC ii86.1/94.9
: BSB83.5/94.4
IDC-F1 (BSB): IDMC-F1 i  92.3/98.5  86.2/91.1  89.4/97.2  83.4/97.3  87.9/95.5  81.4/96.7
: IDMC-F1 ii  92.3/98.0  90.0/96.4  86.5/94.4  81.3/96.7
: IDMC-F2 i  92.3/98.5  86.2/91.1  89.4/97.2  83.4/97.3  87.9/95.5  81.2/97.0
: IDMC-F2 ii  92.2/98.0  86.6/94.4  81.3/96.7
: IDC-F2-C i  92.0/97.5  86.2/91.1  89.4/97.2  83.5/97.3  87.8/94.9  81.3/96.7
: IDC-F2-C ii  92.2/98.0  90.2/96.4  86.6/94.4  81.3/96.7
: IDC-F2-M i  92.2/98.5  86.2/91.1  89.3/97.2  83.3/97.3  87.8/95.5  81.3/96.7
: IDC-F2-M ii  91.9/98.0  90.0/96.0  86.5/93.8  81.2/96.3
: COC i  92.3/98.5  86.0/91.1  89.4/97.2  82.7/96.3  87.9/94.9  80.9/95.9
: COC ii  92.2/97.5  90.0/96.0  86.5/94.4  81.3/96.7
: BSB  100.0/100.0  100.0/100.0  98.4/99.6  99.5/100.0  100.0/100.0  99.8/99.6

Notes: Values before slashes (/) denote nucleotide identity, and values after slashes denote amino acid identity; P represents one or two amino acid sequences as pseudogene sequences for which the identity cannot be compared

Table 4

Percentage nucleotide identity (on the left) and percentage amino acid identity (on the right) between duplicated Hox coding regions in COC, BSB, IDC-F1, IDMC-F1, IDMC-F2, IDC-F2-C, and IDC-F2-M (following the format above)

HoxA4a (%)  HoxA9a (%)  HoxA2b (%)  HoxA11b (%)  HoxB1a (%)  HoxB4a (%)  HoxB1b (%)  HoxB5b (%)  HoxC4a (%)  HoxC6b (%)  HoxD4a (%)  HoxD10a (%)
IDMC-F1 i: IDMC-F1 ii  94.6/98.5  93.8/97.6  87.7/P  94.3/96.8  97.9/98.6  90.9/97.7  88.4/97.4
: IDMC-F2 i  100.0/100.0  99.8/100.0  99.6/99.3  99.7/99.8  99.6/P  99.6/P  100.0/100.0  100.0/100.0  99.9/100.0  99.8/100.0  100.0/100.0  99.5/99.6
: IDMC-F2 ii  94.5/98.5  93.7/97.6  97.7/98.6  91.0/97.7  88.5/97.4
: IDC-F2-C i  99.7/99.0  100.0/100.0  99.6/98.9  99.7/99.8  99.5/P  99.8/P  100.0/100.0  99.8/100.0  100.0/100.0  99.8/100.0  99.8/99.4  99.9/100.0
: IDC-F2-C ii  94.5/98.5  80.2/86.9  93.8/97.9  94.3/96.8  98.0/99.0  91.0/97.7  88.6/97.4
: IDC-F2-M i  99.9/100.0  100.0/100.0  99.7/99.6  99.8/99.8  99.6/P  99.8/P  100.0/100.0  99.8/100.0  99.9/100.0  99.8/100.0  99.8/100.0  99.8/100.0
: IDC-F2-M ii  94.2/98.5  93.8/97.9  94.3/96.4  97.9/99.0  90.9/97.2  88.3/97.0
: COC i  99.8/99.0  99.8/100.0  99.7/99.6  99.7/99.5  99.5/P  99.6/P  99.5/100.0  99.9/100.0  99.9/100.0  99.1/98.9  99.7/99.4  99.2/98.5
: COC ii  94.5/98.0  80.1/88.5  93.8/97.9  87.8/P  94.1/96.4  98.0/99.0  90.9/97.7  88.3/97.4
: BSB  92.3/98.5  77.2/85.4  95.1/97.3  74.2/94.4  86.3/P  85.8/P  86.2/91.1  89.9/97.6  93.4/91.6  83.3/97.3  87.9/95.5  81.5/97.0
IDMC-F1 ii: IDMC-F2 i  94.6/98.5  93.9/97.6  87.8/P  94.3/96.8  97.8/98.6  90.9/97.7  88.1/97.8
: IDMC-F2 ii  99.9/100.0  99.7/99.3  99.5/99.0  99.8/100.0  99.6/99.2
: IDC-F2-C i  94.5/97.5  93.9/97.3  87.6/P  94.3/96.8  97.9/98.6  90.8/97.2  88.3/97.4
: IDC-F2-C ii  99.9/100.0  99.7/99.6  99.8/100.0  99.9/99.5  99.8/100.0  99.6/99.2
: IDC-F2-M i  94.5/98.5  94.0/97.9  87.8/P  94.3/96.8  98.0/98.6  90.8/97.7  88.3/97.4
: IDC-F2-M ii  99.6/100.0  99.7/99.6  99.8/99.6  99.7/99.5  99.7/99.4  99.2/98.9
: COC i  94.6/98.5  93.9/97.9  87.8/P  94.2/96.8  98.0/98.6  90.9/97.2  88.0/96.7
: COC ii  99.7/99.5  99.8/99.6  99.2/99.1  99.8/99.6  99.9/99.5  99.6/100.0  99.4/99.2
: BSB  92.3/98.0  95.0/96.9  85.0/95.9  90.5/96.8  93.3/92.1  86.5/94.4  81.4/97.0
IDMC-F2 i: IDMC-F2 ii  94.5/98.5  93.7/97.6  97.8/98.6  91.0/97.7  88.2/97.8
: IDC-F2-C i  99.7/99.0  99.8/100.0  99.7/98.9  99.7/99.7  99.8/P  99.6/P  100.0/100.0  99.8/100.0  99.9/100.0  99.7/100.0  99.8/99.4  99.6/99.6
: IDC-F2-C ii  94.5/98.5  80.1/86.9  93.7/97.9  94.3/96.8  97.9/99.0  91.0/97.7  88.3/97.8
: IDC-F2-M i  99.9/100.0  99.8/100.0  99.8/99.6  99.7/99.7  100.0/P  99.6/P  100.0/100.0  99.8/100.0  99.8/100.0  99.7/100.0  99.8/100.0  99.6/99.6
: IDC-F2-M ii  94.2/98.5  93.7/97.9  94.3/96.4  97.8/99.0  90.9/97.2  87.9/97.4
: COC i  99.8/99.0  100.0/100.0  99.7/99.6  99.7/99.5  99.8/P  99.6/P  99.5/100.0  99.9/100.0  99.8/100.0  99.0/98.9  99.7/99.4  99.1/98.9
: COC ii  94.5/98.0  80.2/88.5  93.9/97.9  87.8/P  94.1/96.4  97.9/99.0  90.9/97.7  88.0/97.8
: BSB  92.3/98.5  77.3/85.4  95.0/97.3  74.0/94.2  86.4/P  85.9/P  86.2/91.1  89.9/97.6  93.5/91.6  83.3/97.3  87.9/95.5  81.3/97.4
IDMC-F2 ii: IDC-F2-C i  94.4/97.5  93.7/97.3  97.7/98.6  90.9/97.2  88.4/97.4
: IDC-F2-C ii  100.0/100.0  99.7/99.6  99.6/99.5  100.0/100.0  99.8/100.0
: IDC-F2-M i  94.6/98.5  93.9/97.9  97.8/98.6  90.9/97.7  88.3/97.4
: IDC-F2-M ii  99.7/100.0  99.7/99.6  99.8/99.5  99.8/99.4  99.4/99.6
: COC i  94.5/98.5  93.7/97.9  97.8/98.6  91.0/97.2  88.0/96.7
: COC ii  99.6/99.5  99.7/99.6  99.6/99.5  99.7/100.0  99.6/100.0
: BSB  92.2/98.0  94.9/96.9  93.2/92.1  86.6/94.4  81.4/97.0
IDC-F2-C i: IDC-F2-C ii  94.4/97.5  80.2/86.9  93.7/97.6  94.3/96.8  98.0/99.0  90.9/97.2  88.5/97.4
: IDC-F2-M i  99.6/99.0  100.0/100.0  99.8/99.3  99.9/100.0  99.8/P  99.8/P  100.0/100.0  99.8/100.0  99.9/100.0  99.7/100.0  99.7/99.4  99.9/100.0
: IDC-F2-M ii  94.1/97.5  93.7/97.6  94.3/96.4  97.9/99.0  90.8/96.6  88.2/97.0
: COC i  99.5/98.0  99.8/100.0  99.7/99.3  99.7/99.7  99.7/P  99.6/P  99.5/100.0  99.7/100.0  99.9/100.0  99.1/98.9  99.6/98.8  99.2/98.5
: COC ii  94.4/97.0  80.1/88.5  93.9/97.6  87.7/P  94.1/96.4  98.0/99.0  90.8/97.2  88.3/97.4
: BSB  92.0/97.5  77.2/85.4  95.0/96.9  74.2/94.5  86.3/P  85.8/P  86.2/91.1  89.7/97.6  93.4/91.6  83.2/97.3  87.8/94.9  81.4/97.0
IDC-F2-C ii: IDC-F2-M i  94.6/98.5  80.2/86.9  93.9/98.3  94.3/96.8  98.1/99.0  90.9/97.7  88.5/97.4
: IDC-F2-M ii  99.7/100.0  100.0/100.0  99.8/99.6  99.8/100.0  99.8/99.4  99.4/99.6
: COC i  94.5/98.5  80.1/86.9  93.9/98.3  94.2/96.8  98.1/99.0  91.0/97.2  88.2/96.7
: COC ii  99.6/99.5  99.2/98.4  99.7/100.0  99.8/99.6  100.0/100.0  99.7/100.0  99.6/100.0
: BSB  92.2/98.0  80.3/88.5  95.0/97.3  90.5/96.8  93.3/92.5  86.6/94.4  81.4/97.0
IDC-F2-M i: IDC-F2-M ii  94.3/98.5  93.9/98.3  94.3/96.4  98.0/99.0  90.8/97.2  88.1/97.0
: COC i  99.7/99.0  99.8/100.0  99.8/100.0  99.8/99.7  99.8/P  99.6/P  99.5/100.0  99.7/100.0  100.0/100.0  99.0/98.9  99.6/99.4  99.2/98.5
: COC ii  94.4/98.0  80.1/88.5  94.0/98.3  87.8/P  94.1/96.4  98.1/99.0  90.8/97.7  88.2/97.4
: BSB  92.2/98.5  77.2/85.4  95.1/97.6  74.2/94.5  86.4/P  85.8/P  86.2/91.1  89.7/97.6  93.5/91.6  83.2/97.3  87.8/95.5  81.4/97.0
IDC-F2-M ii: COC i  94.2/98.5  93.9/98.3  94.2/96.4  98.0/99.0  90.9/96.6  87.8/96.3
: COC ii  99.4/99.5  99.7/100.0  99.6/99.2  99.8/100.0  99.6/99.4  99.4/99.6
: BSB  91.9/98.0  95.0/97.3  90.5/96.4  93.3/92.5  86.5/93.8  81.3/96.7
COC i: COC ii  94.5/98.0  80.2/88.5  93.9/98.3  87.8/P  94.0/96.4  98.1/99.0  90.9/97.2  87.9/96.7
: BSB  92.3/98.5  77.3/85.4  95.1/97.6  74.2/94.5  86.4/P  86.0/P  86.0/91.1  89.8/97.6  93.5/91.6  82.9/96.3  87.9/94.9  81.0/96.3
COC ii: BSB  92.2/97.5  80.3/89.0  95.0/97.3  85.2/95.9  90.3/96.4  93.3/92.5  86.5/94.4  81.4/97.0

Notes: Values before slashes (/) denote nucleotide identity, and values after slashes denote amino acid identity; P represents one or two amino acid sequences as pseudogene sequences for which the identity cannot be compared
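
The cell format described in the notes (nucleotide identity, a slash, then amino acid identity or P) can be read programmatically. The following is a minimal sketch, not part of the original analysis; the names `IdentityCell` and `parse_cells` are hypothetical, and the regular expression assumes every identity value is printed with exactly one decimal place, as in the rows above. It works whether or not the cells in a row are separated by whitespace.

```python
import re
from typing import List, NamedTuple, Optional

# One table cell: nucleotide identity, "/", then amino acid identity or "P".
# Assumption: every value has two or three integer digits and one decimal digit.
CELL = re.compile(r"(\d{2,3}\.\d)/(\d{2,3}\.\d|P)")


class IdentityCell(NamedTuple):
    nucleotide: float
    amino_acid: Optional[float]  # None when the cell is marked "P" (pseudogene)


def parse_cells(row_values: str) -> List[IdentityCell]:
    """Split a row of identity values into (nucleotide, amino acid) pairs."""
    cells = []
    for nt, aa in CELL.findall(row_values):
        cells.append(IdentityCell(float(nt), None if aa == "P" else float(aa)))
    return cells


if __name__ == "__main__":
    # First three cells of the "IDMC-F1 i: IDMC-F1 ii" row of Table 4.
    for cell in parse_cells("94.6/98.593.8/97.687.7/P"):
        print(cell)
```

Because empty cells are not marked in the rows, a parser of this kind recovers the values but not the Hox gene column each value belongs to, except for rows that contain all 12 cells.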

Phylogenetic analyses of the amino acid sequences of 12 Hox genes (HoxA4a, HoxA9a, HoxA2b, HoxA11b, HoxB1a, HoxB4a, HoxB1b, HoxB5b, HoxC4a, HoxC6b, HoxD4a, and HoxD10a) in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, IDMC-F2 and zebrafish (Danio rerio). Phylogenetic tree constructed using MrBayes with the HKY + I + G model (−lnL = 15,356.5967); MCMC = 2 million generations. The phylogenetic tree for each Hox gene is marked by a separate colour, as shown in the figure

Discussion

Hybridization offers a means by which diversity may be increased because, unlike mutation, it provides genetic variation at hundreds or thousands of genes in a single generation [4]. Our results provide a good model for genetic variation by showing obvious genotypic differences in the IDC-F1 fish derived from the distant hybridization of COC (♀) × BSB (♂). The Hox gene clusters in IDC-F1 were approximately twice as large as those in COC, except for the recombinant clusters. The topology of the phylogenetic tree of 12 Hox genes (Fig. 3) further suggested that some of the Hox genes orthologous to zebrafish genes were present as two copies in COC (except for HoxA11b, HoxB1b, HoxB4a, and HoxC6b), one copy in BSB, and two to six copies (not counting recombinant clusters) in IDC-F1. Such a rich diversity in gene copy number further suggests that distant hybridization acts as a catalyst that accelerates the formation of species [8].

One of the highlights of this study is the development of IDMC-F1 derived from the distant hybridization of COC (♀) × BSB (♂), which differs significantly in phenotype from its parents; even in the self-crossed offspring of IDC-F1, two distinct phenotypes were differentiated: the phenotype of IDC-F2-C was consistent with that of IDC-F1, and the phenotype of IDC-F2-M was very similar to that of IDMC-F1 (Fig. 1). Determining the mechanisms that cause these new phenotypes to appear will help us to understand the impact of hybridization on speciation processes. At present, three possible mechanisms are considered. Firstly, alleles of additive effect may not all be fixed in the same direction between diverging populations; under this mechanism, some hybrid genotypes fall outside the parental distribution (+ + + − × − − − + can generate + + + + or − − − −) [30]. Secondly, these new phenotypes derived from hybridization may result from interactions (dominance or epistasis) between alleles fixed independently in different populations. Thirdly, research in recent years has begun to reveal a wider variety of genetic mechanisms underlying new hybrid phenotypes, e.g., genome restructuring, duplication/deletion [31], alterations in the timing and levels of gene expression, transposon activation and epigenetic effects [32-35].

We speculated that the third mechanism was the possible reason for the differentiation of the mirror carp-type offspring (IDMC-F1 and IDC-F2-M). Under this mechanism, the genomes of hybrid progeny contain a rich variety of genetic variants, which change rapidly in the early generations after hybridization, and most of the variant types cannot be stably inherited by the next generation. In fact, most of the Hox gene copies in IDC-F1 were not stably inherited in the second generation (IDC-F2-C and IDC-F2-M). These results support this possible mechanism for the differentiation of the mirror carp-type offspring. In contrast, although IDMC-F1 is also a first generation of distant hybridization, we did not find obvious mutations in Hox genes in the first generation of IDMC; almost all of the Hox gene clusters were derived from the female parent, COC, except for the recombinant clusters. Almost all of the Hox gene clusters in IDMC-F1 were stably inherited in IDMC-F2, but at the same time, IDMC-F2 contained more obvious recombination events. The gene types that were stably inherited from a single parent in the offspring have experienced long-term evolutionary testing and have become essential for the evolution of species.

The functions of Hox genes have become increasingly clear in recent years, but questions about the evolution of Hox genes remain unresolved. Gene duplication and mutation are the basis for understanding Hox gene evolution, and mutations in coding sequences may produce new functional proteins. Hox gene clusters in fish are more variable in gene content than expected, and each cluster has its own characteristics in terms of absolute length and content of conserved non-coding sequences [36]. This study supports this argument; for example, among these Hox genes, two to six copies (not counting recombinant clusters) were found in IDC-F1. Furthermore, Hox cluster degeneration may be ongoing, at least in fish, because HoxB4a is active in zebrafish but its orthologues are pseudogenes in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2. Similarly, HoxB1ai is active in zebrafish, but its orthologues are pseudogenes in COC, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2. These results suggest that the Hox gene clusters are undergoing continuous degeneration in the cyprinid fishes, with some genes gradually becoming pseudogenes and some genes already completely pseudogenised.

One of the most important findings of this study is the discovery of Hox gene recombinant clusters, which may be the first reported in Hox genes of cyprinid fishes or even vertebrates. In the two types of improved carp lineages derived from COC (♀) × BSB (♂), these recombinant clusters come from the recombination of different types of gene copies, most of which cannot be stably inherited by the next generation. Moreover, for HoxA11b, we found the recombinant cluster type (HoxA11bi + HoxA11b-BSB + HoxA11bi) only in IDC-F2-M and IDMC-F2, indicating that it might be necessary for development of the morphological features of mirror carp-like species.

In this study, we investigated the genetic variation in 12 Hox genes in the two types of improved carp lineages derived from COC (♀) × BSB (♂). We revealed for the first time the abundant gene clusters in IDC-F1 and found a wide variety of recombinant clusters in the two types of improved carp lineages. In summary, our results provided important evidence that distant hybridization produced rapid genomic DNA changes that may or may not be stably inherited, providing novel insight into the function of hybridization in the establishment of the improved lineages used as new fish resources for aquaculture. The genetic evolution of the Hox gene family provides clues for revealing the gene regulatory mechanisms underlying biological evolution and cell differentiation.
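
The first of the three mechanisms considered above, in which additive alleles are not all fixed in the same direction in the two parents, can be illustrated with a small enumeration. This is only a sketch under simplifying assumptions that are not taken from the study: four unlinked loci, each allele contributing +1 or −1 to a trait score, and recombinant genotypes that carry one parental allele per locus. The parental genotypes are the + + + − and − − − + example from the text; the function name `score` is hypothetical.

```python
from itertools import product

# Parental genotypes from the example in the text: + + + -  x  - - - +
parent_a = "+++-"
parent_b = "---+"


def score(genotype: str) -> int:
    """Additive trait score: each '+' allele adds 1, each '-' allele subtracts 1."""
    return sum(1 if allele == "+" else -1 for allele in genotype)


# Every recombinant genotype takes, at each locus, the allele of one parent or the other.
recombinants = {"".join(alleles) for alleles in product(*zip(parent_a, parent_b))}

low, high = sorted((score(parent_a), score(parent_b)))
transgressive = sorted(g for g in recombinants if not low <= score(g) <= high)

print("parental scores:", score(parent_a), score(parent_b))
print("recombinants outside the parental range:", transgressive)
```

Under these assumptions the parents score +2 and −2, while the recombinants + + + + and − − − − score +4 and −4, which is the sense in which some hybrid genotypes fall outside the parental distribution.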

Conclusions

Based on the establishment of the two types of improved carp lineages derived from common carp (♀) × blunt snout bream (♂), our results provided important evidence that distant hybridization produced rapid genomic DNA changes that may or may not be stably inherited, providing novel insight into the function of hybridization in the establishment of the improved lineages used as new fish resources for aquaculture.

Methods

Ethics statement

The guidelines established by the Administration of Affairs Concerning Animal Experimentation state that approval from the Science and Technology Bureau of China and the Department of Wildlife Administration is not necessary when the fish in question are neither rare nor near extinction (first- or second-class state protection level). Therefore, approval was not required for the experiments conducted in this study.

Animals and crossing procedure

All of the natural materials, such as common carp (Cyprinus carpio, 2n = 100, abbreviated as COC) and blunt snout bream (Megalobrama amblycephala, 2n = 48, abbreviated as BSB) were obtained from the Center for Polyploidy Fish Genetics Breeding of Hunan Province located at Hunan Normal University, Changsha, Hunan, China. The protocols for crossing and culturing were described previously [23]. The two types of improved carp offspring from COC (♀) × BSB (♂) were the improved diploid common carp (2n = 100, IDC-F1) and the improved diploid scattered mirror carp (2n = 100, IDMC-F1); the phenotype of the latter has changed significantly from that of the female parent, COC. The self-crossed offspring of IDC-F1 showed two phenotypes: one was consistent with that of their parents (abbreviated IDC-F2-C), and the other was very similar to that of IDMC-F1 (abbreviated IDC-F2-M). In contrast, the self-crossed offspring of IDMC-F1 showed only one phenotype, that is, a scattered mirror carp-like appearance (abbreviated IDMC-F2). The IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 fish were cultured in ponds at the Center for Polyploidy Fish Genetics Breeding of Hunan Province located at Hunan Normal University, Changsha, Hunan, China, and fed artificial feed. All fishes were deeply anaesthetized with 100 mg/L MS-222 (Sigma-Aldrich, St. Louis, MO, USA) prior to dissection.

DNA extraction, PCR amplification, cloning and sequencing of Hox genes

Total genomic DNA was extracted from the peripheral blood cells of COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 by routine approaches [37], and the DNA from each was used separately as a template. Several combinations of degenerate PCR primers (Additional file 1: Table S1) [38, 39] were used to amplify up to 12 Hox gene sequences (HoxA4a, HoxA9a, HoxA2b, HoxA11b, HoxB1a, HoxB4a, HoxB1b, HoxB5b, HoxC4a, HoxC6b, HoxD4a, and HoxD10a) in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2. The PCRs were performed in a volume of 50 μL using Taq DNA polymerase (TaKaRa, Dalian, China). Thermal gradient PCR was used when the cycling conditions were first established. The thermal cycling program generally consisted of an initial denaturation step at 94 °C for 5 min, followed by 35 cycles of 94 °C for 35 s, 50–60 °C for 60 s, and 72 °C for 60–150 s, and a final extension step at 72 °C for 10 min. The PCR products were cloned into the pMD18-T vector (TaKaRa, Dalian, China). The plasmids were transformed into E. coli DH5α, purified and sequenced with vector-specific primers using the primer walking method on an ABI 3730XL automatic sequencer (ABI PRISM 3730, Applied Biosystems, CA, USA). The sequences were BLAST searched against the non-redundant protein database maintained at the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov) to determine their identity.
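
The general thermal cycling program given above can be written down as a small data structure, which also makes the total programmed hold time easy to check. This is an illustrative sketch only: the class `Step`, the function `hold_time_range` and the constant names are hypothetical, ranges are stored as (minimum, maximum) pairs, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: Tuple[int, int]   # (min, max) temperature in degrees Celsius
    seconds: Tuple[int, int]  # (min, max) hold time in seconds


# Values taken from the general program described in the text.
INITIAL = Step("initial denaturation", (94, 94), (300, 300))
CYCLE = [
    Step("denaturation", (94, 94), (35, 35)),
    Step("annealing", (50, 60), (60, 60)),
    Step("extension", (72, 72), (60, 150)),
]
FINAL = Step("final extension", (72, 72), (600, 600))
N_CYCLES = 35


def hold_time_range(initial: Step, cycle: List[Step], final: Step,
                    n_cycles: int) -> Tuple[int, int]:
    """Shortest and longest total hold time in seconds (ramp times not included)."""
    shortest = initial.seconds[0] + n_cycles * sum(s.seconds[0] for s in cycle) + final.seconds[0]
    longest = initial.seconds[1] + n_cycles * sum(s.seconds[1] for s in cycle) + final.seconds[1]
    return shortest, longest


if __name__ == "__main__":
    shortest, longest = hold_time_range(INITIAL, CYCLE, FINAL, N_CYCLES)
    print(f"total hold time: {shortest / 60:.1f} to {longest / 60:.1f} min")
```

With the stated values, the programmed hold time ranges from 6325 s to 9475 s, depending on the extension time chosen for a given primer combination.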

Sequence comparison and analysis

All of the sequence information and GenBank accession numbers in this study are detailed in Additional file 1: Table S2. The sequence homology and variation among the fragments amplified from COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 were analysed using BioEdit [40] and the DNAStar 5.0 software package (DNAStar Inc.). To increase the probability of detecting duplicated paralogs and to circumvent errors from PCR, we sequenced 20–30 clones for each gene from each of COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2. The obtained sequences were screened for Hox gene fragments using the BLAST (http://www.ncbi.nlm.nih.gov), ClustalW (http://www.ebi.ac.uk/) [41] and MEGA 4.0 [42] programs to determine identity. Then, we evaluated the organization of the Hox clusters in IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 in comparison with COC and BSB to characterize the Hox genes.
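
A percentage identity of the kind reported in the identity tables can be computed from two aligned sequences as follows. This is a minimal sketch and not the procedure implemented in BioEdit or DNAStar, whose exact definitions are not stated here; it assumes the two sequences are already aligned to equal length and that columns containing a gap in either sequence are skipped. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: columns with a gap in either sequence are not counted.
    Works for nucleotide or amino acid sequences alike.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return round(100.0 * matches / compared, 1)


if __name__ == "__main__":
    # Toy aligned fragments, not sequences from this study.
    print(percent_identity("ATGGCC-A", "ATGACCTA"))
```

Applying the same function to the aligned nucleotide sequences and to their aligned translations gives the two values that appear before and after the slash in each table cell.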

Phylogenetic analysis - unconstrained Bayesian analysis

The derived amino acid sequences of 12 Hox genes (HoxA4a, HoxA9a, HoxA2b, HoxA11b, HoxB1a, HoxB4a, HoxB1b, HoxB5b, HoxC4a, HoxC6b, HoxD4a, and HoxD10a) in COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 were aligned with the Hox genes of zebrafish retrieved from GenBank using Clustal X 1.81 [43]. Regions of zebrafish Hox gene sequence that were difficult to align were removed from the alignment. Gaps were also removed from the alignment. An unrooted phylogenetic tree of all amino acid sequences of the 12 Hox genes (the pseudogenes found in this study were not excluded from the phylogenetic analysis) was constructed in MrBayes version 3.1.2 [44, 45]. We also tested the Hox genes for saturation using DAMBE v6.4.41 [46], and the results revealed that the Hox genes were suitable for phylogenetic analysis. The best-fitting substitution model for each gene fragment was determined by Modeltest 3.7 [47], and the HKY + I + G model was chosen for the Hox genes by using the Bayesian information criterion. MrBayes was run for 2 million generations with two runs and four chains in parallel and a burn-in of 25%, and the analysis was terminated after the average standard deviation of the split frequencies fell under 0.01. The final trees were visualized in FigTree 1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/, 2018).

Additional file 1: Table S1. The 12 combinations of degenerate PCR primers designed based on the alignment and identification of consensus sequences of orthologous Hox genes from zebrafish (Danio rerio), medaka (Oryzias latipes), rainbow trout (Oncorhynchus mykiss), pufferfish (Fugu rubripes), mouse (Mus musculus), and humans (Homo sapiens). Table S2. Sequence information and GenBank accession numbers for COC, BSB, IDC-F1, IDMC-F1, IDC-F2-C, IDC-F2-M, and IDMC-F2 clones, and GenBank accession numbers of zebrafish (Danio rerio) used in this study; the tick symbol indicates the sequences used in this analysis.

Additional file 2: Figure S1. Variable sequence types (including haplotypes and recombinant clusters) in different Hox genes in these species. Figure S2. Variable sequence types (including haplotypes and recombinant clusters) in different Hox genes in these species. Figure S3. Variable sequence types (including haplotypes and recombinant clusters) in HoxB5b in these species.