
Comparative Genomics Studies on the dmrt Gene Family in Fish.

Junjian Dong1, Jia Li2, Jie Hu1, Chengfei Sun1, Yuanyuan Tian1, Wuhui Li1, Ningning Yan1, Chengxi Sun3, Xihui Sheng4, Song Yang5, Qiong Shi2, Xing Ye1.   

Abstract

Doublesex and mab-3-related transcription factor (dmrt) genes are widely distributed across diverse animal groups and play critical roles in sex determination and neural development. Here, we applied bioinformatics methods to examine cross-species changes in dmrt family members and the evolutionary relationships of dmrt genes based on the genomes of 17 fish species. All examined fish species have dmrt1-5, whereas only five species contain dmrt6. Most fish harbor two dmrt2 paralogs (dmrt2a and dmrt2b), with dmrt2b being unique to fish. In the phylogenetic tree, 147 DMRT sequences are categorized into eight groups (DMRT1-DMRT8), which cluster into three main groups. Selective pressure analysis indicated purifying selection on the dmrt1-3 genes and the dmrt1-dmrt3-dmrt2(2a) gene cluster. Similar genomic conservation patterns of the dmrt1-dmrt3-dmrt2(2a) gene cluster with 20-kb upstream/downstream regions were observed in fish with various sex-determination systems, except for three regions with remarkable diversity. Synteny analysis revealed that dmrt1, dmrt2a, dmrt2b, and dmrt3-5 were relatively conserved in fish during evolution, whereas dmrt6 was lost in most species. The high conservation of the dmrt1-dmrt3-dmrt2(2a) gene cluster across fish genomes suggests crucial biological functions, whereas the variation in dmrt family members and sequences across fish species suggests divergent biological roles during evolution. This study provides a molecular basis for fish dmrt functional analysis and may serve as a reference for in-depth phylogenomics.
Copyright © 2020 Dong, Li, Hu, Sun, Tian, Li, Yan, Sun, Sheng, Yang, Shi and Ye.

Keywords:  comparative genomics studies; dmrt genes; fish; phylogenetic evolution; synteny analysis

Year:  2020        PMID: 33281869      PMCID: PMC7689362          DOI: 10.3389/fgene.2020.563947

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


Introduction

Doublesex and Mab-3-related transcription factor (dmrt) genes are homologous to Doublesex (Dsx) in Drosophila melanogaster and Male abnormal 3 (Mab-3) in Caenorhabditis elegans, both of which play important roles in sex determination (Burtis and Baker, 1989; Zhu et al., 2000; Zarkower, 2001). In recent years, a large number of dmrt family genes have been identified in organisms ranging from lower invertebrates to higher vertebrates, including corals, nematodes, fruit flies, frogs, fish, birds, and mammals, some of which have been confirmed to be related to sex differentiation (Hodgkin, 2002). Currently, in addition to Dsx and Mab-3, the dmrt family in vertebrates includes nine dmrt genes (dmrt1–8 and dmrt2b) that share common characteristics with Dsx and Mab-3. Almost all of the encoded polypeptide chains contain a conserved DNA-binding motif, known as the Doublesex and Mab-3 (DM) domain, which is composed of six conserved cysteines and two histidines (locus 1 of CCHC and locus 2 of HCCC). The two loci form two highly intertwined zinc-finger-like DNA-binding regions that can bind to the minor groove of DNA. Notably, this domain is highly conserved among evolutionarily distant organisms (Erdman and Burtis, 1993; Zhu et al., 2000). Fish dmrt genes were first discovered in Nile tilapia (Oreochromis niloticus) and rainbow trout (Oncorhynchus mykiss) (Guan et al., 2000; Marchand et al., 2000). Genes of the dmrt family have now been identified in more than 30 fish species. Seven dmrt genes have been found in fish, namely dmrt1–6 and dmrt2b. DMRT1 plays an important role in sex differentiation and testicular development (Matson and Zarkower, 2012); an exception is the DM-W gene, a W-linked DMRT1 paralog in Xenopus laevis, which plays the opposite role in primary ovary development (Yoshimoto et al., 2010).
DMRT1 is specifically expressed only in the embryonic genital ridge and adult testes of human males, and is related to the expression of sex-determining genes and differentiation of primordial germ cells (Raymond et al., 1998; Moniot et al., 2000; Matson et al., 2011). Similarly, studies on more than 20 fish species have determined that fish dmrt1 expression is related to male development regardless of the sex determination mechanism (Kobayashi et al., 2004, 2008; Johnsen et al., 2010), indicating that dmrt1 plays a key role in male germ cell self-renewal and differentiation, testicular development, and spermatogenesis in fish (Herpin and Schartl, 2011; Lin et al., 2017). Furthermore, in the medaka Oryzias latipes, the Y-specific gene dmy, a copy of autosomal dmrt1, is the master sex-determining gene that induces male development. The genomes of amphibians, reptiles, birds, and mammals contain only a single dmrt2 gene, whereas fish harbor two dmrt2 genes (dmrt2a and dmrt2b) (Liu et al., 2009; Su et al., 2015; Lyu et al., 2019). DMRT2 is widely distributed in the tissues of mammals and fish, and is expressed in both testes and ovaries (Kim et al., 2003; Winkler et al., 2004; El-Mogharbel et al., 2007). However, the function of DMRT2 has not been conserved during the evolution of species (Meng et al., 1999; Seo et al., 2006). For example, mouse DMRT2 is mainly involved in somite differentiation, in particular the patterning of the axial skeleton (Lourenco et al., 2010). In contrast, both zebrafish dmrt2a and dmrt2b are involved in somite development: dmrt2a is necessary for symmetric somite formation and fast muscle differentiation (Saude et al., 2005; Lu et al., 2017), whereas dmrt2b regulates asymmetric organ positioning via the Hedgehog signaling pathway and is therefore related to branchial arch and slow muscle development (Zhou et al., 2008; Li et al., 2018).
This indicates that differences exist in the expression and functionality of dmrt2a and dmrt2b in fish. Mammalian DMRT3 is highly expressed in the testis but not in the ovary; hence, it may be related to testicular differentiation and development (Hong et al., 2007). In mice, DMRT3 is also expressed in numerous non-gonadal tissues such as the embryonic forebrain and olfactory placode, in addition to spinal cord neurons, and thus it may be involved in neuronal specification (Smith et al., 2002; Kim et al., 2003; Andersson et al., 2012). Fish Dmrt3 is highly expressed in the testis and nervous system, and has accordingly been speculated to play a role in the developmental processes of the nerves and germ cells (Yamaguchi et al., 2006; Li et al., 2008; Dong et al., 2010). The mouse dmrt4 gene is expressed in the testis and ovary, in addition to other various tissues (Kim et al., 2003). It can regulate the formation and development of ovarian follicles (Balciuniene et al., 2006). Alternatively, Xenopus DMRT4 is involved in the regulation of neurogenesis in the olfactory system (Huang et al., 2005b). In some fish species, the expression of dmrt4 in the ovary is significantly higher than that in the testis (Guan et al., 2000; Su et al., 2013; Wang, 2013); in other species, its expression is significantly higher in the testis than the ovary (Kondo et al., 2002; Dong and Chen, 2013; Sheng et al., 2014), whereas yet other species show high expression in both organs (Yamaguchi et al., 2006). In addition, dmrt4 is also expressed in the spleen (Yamaguchi et al., 2006; Sheng et al., 2014), kidney (Kondo et al., 2002; Sheng et al., 2014), gills (Kondo et al., 2002; Wang, 2013), and brain (Dong and Chen, 2013) in fish. Hence, it has been speculated to be related to immune and nervous system development. Mouse DMRT5 is mainly expressed in brain tissue and is necessary for the early embryonic development of the cerebral cortex (Veith et al., 2006a; Konno et al., 2012). 
As a novel neurogenic factor, DMRT5, together with DMRT3, jointly controls hippocampal development and neocortical area map formation (Muralidharan et al., 2017; De Clercq et al., 2018). Fish dmrt5 is highly expressed primarily in the brain but can also be found in the gonads, eyes, and pituitary gland (Guo et al., 2004; Veith et al., 2006a; Yamaguchi et al., 2006; Gu et al., 2019). Furthermore, dmrt5 plays a key role in zebrafish neurogenesis in the telencephalon (Yoshizawa et al., 2011) and can regulate corticotrope and gonadotrope differentiation in the pituitary (Graf et al., 2015), in addition to spermatogenesis (Xu et al., 2013). Mammalian DMRT6 is mainly expressed in gonadal intermediate cells and differentiating spermatogonia. It plays a crucial role in coordinating the transition of primordial germ cells from the mitotic to meiotic developmental programs during spermatogenesis (Zhang X. et al., 2014) and is also expressed in the embryonic brain of mice (Kim et al., 2003). Early studies have suggested that the dmrt6 gene is missing in fish (Veith et al., 2006b). However, recent studies have found that certain fish, such as coelacanth, tilapia, and Southern catfish also carry the dmrt6 gene, and that tilapia dmrt6 is involved in spermatogenesis (Forconi et al., 2013; Zhang X. et al., 2014). However, DMRT7 and DMRT8 are only present in mammals. The two genes are very similar, although DMRT8 does not have a complete DM domain. DMRT7 is specifically expressed in the male and female gonads and is related to mouse gonadal development and spermatogenesis (Kawamata and Nishimori, 2006; Hong et al., 2007). In comparison, DMRT8 is highly expressed in the male gonads and may have evolved from DMRT7 (Ottolenghi et al., 2002; Veith et al., 2006a). 
Currently, reports are only available regarding the phylogenetic analysis of pan-arthropod and pan-metazoan DMRT family members (Volff et al., 2003; Wexler et al., 2014; Panara et al., 2019); to our knowledge, no studies have yet been published on the phylogeny of the fish dmrt family. However, fish comprise a wide variety of species, and previous reports have shown that members of the fish dmrt family have unique features, such as two paralogs of dmrt2 (dmrt2a and dmrt2b) and diverse tissue expression of the same family member in different fish (e.g., dmrt4), suggesting remarkable differences in function. As the sequences of DMRT family members are highly variable, with only the DM domain [∼49 amino acids (aa)] exhibiting high sequence homology (Volff et al., 2003), it is difficult to accurately determine the evolutionary relationships among the family members based on such short sequences, which in turn has limited our understanding of the history of DMRT functional development. Nevertheless, in recent years the whole-genome sequencing of many fish species has significantly facilitated in-depth and systematic analysis of the evolutionary relationships among gene family members. In this study, we therefore employed the fine genomic map of largemouth bass recently obtained by our team using third-generation sequencing and collected the dmrt sequences of 16 fish species with different taxonomic positions from published whole-genome sequences, in order to analyze the sequence structure, phylogenetic relationships, sequence conservation, and synteny of members of the fish dmrt family. These findings will lay a solid foundation for a more systematic understanding of the structural characteristics of the members of the fish dmrt family, and for further investigations into their different functions in sex determination or differentiation along with the underlying mechanisms.

Materials and Methods

Sequence Collection

In the present study, we employed two strategies to collect nucleotide or deduced amino acid sequences for dmrt family members in various vertebrates (Supplementary Table S1). For species with publicly available sequences, such as human (Homo sapiens) and mouse (Mus musculus), we downloaded the sequences from NCBI or Ensembl (Supplementary Table S2). Other dmrt sequences were extracted from the corresponding genome databases using BLAST (Altschul et al., 1990) and Genewise (Birney et al., 2004). In brief, we used zebrafish (Danio rerio), Japanese medaka (Oryzias latipes), and mouse DMRT protein sequences from NCBI as the references, and mapped them onto the examined genomes using tBLASTn with an E-value <1e-5 and an alignment rate >0.6. Solar v0.9.6 was applied to connect high-scoring segment pairs. Subsequently, we discarded low-quality results with an alignment rate <0.6 and mapping identity <0.5. Finally, each gene sequence was predicted on the target genomic region using Exonerate v2.2.0 (Slater and Birney, 2005), with the region extended by 5 kb in the upstream and downstream directions to obtain the integrated gene model.
A total of 147 dmrt sequences were derived from 23 representative vertebrate species, including 2 mammals (human and mouse), 2 birds (chicken Gallus gallus and zebra finch Taeniopygia guttata), 1 reptile (green Anole Anolis carolinensis), 1 amphibian (Western clawed frog Xenopus tropicalis), and 17 fish species belonging to two classes (Actinopterygii and Sarcopterygii), and ten superorders (Percomorpha: largemouth bass, Asian sea bass Lates calcarifer, European sea bass Dicentrarchus labrax, Japanese pufferfish Takifugu rubripes, Chinese tongue sole Cynoglossus semilaevis, and threespine stickleback Gasterosteus aculeatus; Atherinomorpha: Japanese medaka and southern platyfish Xiphophorus maculatus; Protacanthopterygii: Atlantic salmon Salmo salar; Paracanthopterygii: Atlantic cod Gadus morhua; Ostariophysi: channel catfish Ictalurus punctatus and electric eel Electrophorus electricus; Clupeomorpha: Atlantic herring Clupea harengus; Elopomorpha: Japanese eel Anguilla japonica; Osteoglossomorpha: Asian arowana Scleropages formosus; Holostei: spotted gar Lepisosteus oculatus; Coelacanthiformes: African coelacanth Latimeria chalumnae).
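
The hit-filtering step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the `Hit` record and its fields are hypothetical, "alignment rate" is assumed to mean aligned query length divided by total query length, and a hit is assumed to be discarded if it fails any one of the three thresholds.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical summary of one tBLASTn hit (field names are assumptions)."""
    query: str          # reference protein name
    query_len: int      # length of the reference protein (aa)
    aligned_len: int    # number of query residues covered by the alignment
    identity: float     # fraction of identical residues, 0..1
    evalue: float


def alignment_rate(hit: Hit) -> float:
    # Assumed definition: fraction of the query covered by the alignment.
    return hit.aligned_len / hit.query_len


def keep_hit(hit: Hit, max_evalue: float = 1e-5,
             min_rate: float = 0.6, min_identity: float = 0.5) -> bool:
    # Thresholds taken from the text; combining them with AND is an assumption.
    return (hit.evalue < max_evalue
            and alignment_rate(hit) >= min_rate
            and hit.identity >= min_identity)


# Toy hits with made-up numbers, for illustration only.
hits = [
    Hit("dmrt1", 300, 270, 0.82, 1e-40),   # passes all thresholds
    Hit("dmrt2a", 400, 200, 0.90, 1e-30),  # alignment rate 0.5, dropped
    Hit("dmrt3", 350, 300, 0.45, 1e-20),   # identity below 0.5, dropped
    Hit("dmrt5", 420, 380, 0.75, 1e-3),    # E-value too high, dropped
]
kept = [h.query for h in hits if keep_hit(h)]
print(kept)
```

Keeping the thresholds as function parameters makes it easy to check how sensitive the recovered gene set is to each cutoff.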

Sequence Alignment and Phylogenetic Analysis

We performed phylogenetic analysis on the collected dmrt sequences. MAFFT v7.273 (Katoh et al., 2002) was employed to align the sequences. Gblocks was used to find conserved fragments with the following parameter settings: minimum number of sequences for a conserved/flank position (75/75), maximum number of contiguous non-conserved positions (50), minimum length of a block (50), allowed gap positions (all). ProtTest v3.42 was used to determine the best-fit model of amino acid replacement (Darriba et al., 2011). Based on the Akaike Information Criterion (AIC), we selected “JTT+I+G+F” as the best-fit model. Finally, we utilized PhyML 3.0, MrBayes v3.24, and MEGA v7.0 to analyze the sequences; MrBayes was run with Ngen set to 1,000,000 generations and Samplefreq set to 100 (Ronquist et al., 2012). Branch support values were calculated using Bayesian posterior probabilities. Evolview (He et al., 2016) was applied to edit the constructed phylogenetic trees.
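
To make the AIC-based model choice concrete, the following minimal sketch computes AIC = 2k − 2 ln L for a few candidate models and picks the one with the lowest score. The log-likelihoods and parameter counts are invented for illustration and are not ProtTest output from this study.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: lower values indicate a better trade-off
    between goodness of fit and model complexity."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, free-parameter count) pairs, not real results.
candidates = {
    "JTT": (-10250.0, 10),
    "JTT+G": (-10100.0, 11),
    "JTT+I+G+F": (-10020.0, 31),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores[best])
```

With the settings quoted in the text, a MrBayes run of 1,000,000 generations sampled every 100 generations would record 10,000 samples per chain before any burn-in is discarded.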

Identification of Conserved Synteny for the dmrt1–dmrt3–dmrt2(2a) Gene Cluster (Synteny Analysis)

To evaluate the conservation of the dmrt1–dmrt3–dmrt2(2a) gene cluster, we explored conserved genes in the 20-kb upstream and downstream regions within the genomes of 19 examined species, using the zebrafish genomic sequence as the reference because the zebrafish genome is currently the highest-quality fish genome assembly with the most complete annotation. The examined genome assemblies were searched using tBLASTn (Altschul et al., 1990); the best-fit results were selected using a Perl script, and the results were drawn with Adobe Illustrator.

Substitution Rate Estimation and Comparison (Ka/Ks Analysis)

We calculated the average non-synonymous substitutions (Ka), synonymous substitutions (Ks), and Ka/Ks for dmrt1, dmrt2(2a), dmrt3, and the dmrt1–dmrt3–dmrt2(2a) gene cluster to test the selective pressure at the codon level among various species. First, we aligned the dmrt gene sequences from each species to those of spotted gar (L. oculatus; used as the reference sequence) using Prank v100802 with the “-codon” option (Loytynoja and Goldman, 2005). Subsequently, we calculated the Ka, Ks, and Ka/Ks values of each pair using Ka/Ks Calculator v2.0 with four different algorithms: gMYN (Wang D.P. et al., 2009), gYN (Wang D. et al., 2009), MYN (Zhang et al., 2006), and YN (Yang and Nielsen, 2000).

Analysis of Regulatory Regions and Cross-Species Comparisons of the dmrt1–dmrt3–dmrt2(2a) Gene Cluster

Complete genomic sequences of the dmrt1–dmrt3–dmrt2(2a) gene cluster, together with the 20-kb upstream and downstream regions, were extracted from various species. We applied mVISTA (Frazer et al., 2004) to align these genomic sequences. This tool aligns and compares long sequences using window-based measures of sequence conservation. Repetitive elements were annotated using RepeatMasker v4.06 (Chen, 2004), and the zebrafish genomic sequence was used as the reference. Pair-wise sequence comparisons were performed with a threshold of 70% identity in each 50-bp window. In addition, five typical regulatory elements (BRE, CAAT box, E box, GC box, and TATA box) were predicted in each sequence using a Perl script (the motif function in Primer 5.0 and Genomatix MatInspector). Finally, Adobe Illustrator and R were applied to produce graphs from the information obtained.
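
The window-based conservation criterion (70% identity in each 50-bp window) can be illustrated with a small sketch. It assumes the two sequences are already aligned to equal length with `-` as the gap character, and it is a simplification for illustration, not a reimplementation of mVISTA; the function names are hypothetical.

```python
def window_matches(ref: str, other: str, window: int = 50):
    """Return (start, matching_positions) for every full sliding window
    of a pre-computed pairwise alignment (equal-length strings)."""
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for start in range(len(ref) - window + 1):
        pairs = zip(ref[start:start + window], other[start:start + window])
        # Gap columns never count as matches.
        matches = sum(1 for a, b in pairs if a == b and a != "-")
        result.append((start, matches))
    return result


def conserved_windows(ref: str, other: str, window: int = 50,
                      min_percent: int = 70):
    """Start positions of windows whose identity is at least min_percent."""
    # Integer comparison avoids floating-point edge cases at the threshold.
    return [start for start, matches in window_matches(ref, other, window)
            if matches * 100 >= min_percent * window]


# Toy alignment: the first 30 columns mismatch, the remaining 70 match.
ref_seq = "A" * 100
other_seq = "C" * 30 + "A" * 70
starts = conserved_windows(ref_seq, other_seq)
print(starts[0], len(starts))
```

In the toy alignment a window first reaches 70% identity once it contains at most 15 mismatched columns, which is why the first conserved window starts at position 15.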

Results

Cross-Species Changes in dmrt Family Members and Copy Numbers

A total of 147 dmrt sequences were derived from 23 representative vertebrate species (Table 1 and Supplementary Tables S1, S2). Among them, 128 dmrt sequences for 20 species were downloaded from the NCBI/Ensembl databases (asterisks in Table 1 and accession numbers in Supplementary Table S2). The remaining 19 dmrt sequences for three species were extracted from genomes using the method described in the section “Sequence Collection.” These nucleotide sequences and the corresponding deduced protein sequences were used for further data analysis.
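TABLE 1

Identification of the dmrt family genes in the examined vertebrates. Values are gene numbers; asterisks mark sequences downloaded from NCBI/Ensembl.

| Class | Superorder | Species | Common name | dmrt1 | dmrt2 (dmrt2a/b) | dmrt3 | dmrt4 (dmrtA1) | dmrt5 (dmrtA2) | dmrt6 (dmrtB1) | dmrt7 (dmrtC2) | dmrt8 (dmrtC1) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mammalian | | H. sapiens | Human | 1* | 1* | 1* | 1* | 1* | 1* | 1* | 1* |
| Mammalian | | M. musculus | Mouse | 1* | 1* | 1* | 1* | 1* | 1* | 1* | 1* |
| Birds | | G. gallus | Chicken | 1* | 1* | 1* | - | 1* | 1* | - | - |
| Birds | | T. guttata | Zebra finch | 1* | 1* | 1* | - | 1* | 1* | - | - |
| Reptilia | | A. carolinensis | Anole lizard | 1* | 1* | 1* | 1* | 1* | 1* | - | - |
| Amphibia | | X. tropicalis | Clawed frog | 1* | 1* | 1* | 1* | 1* | 1* | - | - |
| Fish (Actinopterygii) | Percomorpha | M. salmoides | Largemouth bass | 1 | 2 | 1 | 1 | 1 | 1 | - | - |
| Fish (Actinopterygii) | Percomorpha | L. calcarifer | Asian seabass | 1* | 2* | 1* | 1* | 1* | 1* | - | - |
| Fish (Actinopterygii) | Percomorpha | D. labrax | European seabass | 1 | 2 | 1 | 1 | 1 | - | - | - |
| Fish (Actinopterygii) | Percomorpha | T. rubripes | Japanese pufferfish | 1* | 2* | 1* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Percomorpha | C. semilaevis | Tongue sole | 1* | 2* | 1* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Percomorpha | G. aculeatus | Stickleback | 1* | 2* | 1* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Atherinomorpha | O. latipes | Japanese medaka | 1* | 2* | 1* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Atherinomorpha | X. maculatus | Southern platyfish | 1* | 2* | 1* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Protacanthopterygii | S. salar | Atlantic salmon | 1* | 4* | 2* | 1* | 2* | - | - | - |
| Fish (Actinopterygii) | Paracanthopterygii | G. morhua | Atlantic cod | 1* | 1* | 1* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Ostariophysi | D. rerio | Zebrafish | 1* | 2* | 1* | - | 1* | - | - | - |
| Fish (Actinopterygii) | Ostariophysi | I. punctatus | Channel catfish | 1* | 2* | 1* | 1* | 1* | 1* | - | - |
| Fish (Actinopterygii) | Ostariophysi | E. electricus | Electric eel | 1 | 2 | 1 | 1 | 1 | - | - | - |
| Fish (Actinopterygii) | Clupeomorpha | C. harengus | Atlantic herring | 1* | 2* | 1* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Elopomorpha | A. japonica | Japanese eel | 1* | 1* | 2* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Osteoglossomorpha | S. formosus | Asian arowana | 1* | 2* | 1* | 1* | 1* | - | - | - |
| Fish (Actinopterygii) | Holostei | L. oculatus | Spotted gar | 1* | 2* | 1* | 1* | 1* | 1* | - | - |
| Fish (Sarcopterygii) | Coelacanthiformes | L. chalumnae | Coelacanth | 1* | 1* | 1* | 1* | 1* | 1* | - | - |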
In mammals, eight dmrt genes (dmrt1–dmrt8) were identified. In the other species, however, dmrt7 and dmrt8 were lost. In addition, dmrt4 was also lost in birds. In the fish dmrt gene family, dmrt1–dmrt5 showed relatively high conservation. Among these, dmrt2 consisted of two paralogs (dmrt2a and dmrt2b) in most fish species, with only three species (Atlantic cod, Japanese eel, and coelacanth) carrying a single paralog. dmrt6 was found in only five fish species, i.e., largemouth bass, Asian sea bass, channel catfish, spotted gar, and African coelacanth. In addition, some of the dmrt genes were duplicated in Atlantic salmon (dmrt2, 3, 5) and Japanese eel (dmrt3; see Table 1).
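
The counts quoted above can be re-derived from Table 1 with a short tally. The sketch below transcribes only the dmrt2 and dmrt6 columns for the fish rows of Table 1; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# (dmrt2 copies, dmrt6 copies) per fish species, transcribed from Table 1.
fish_copies = {
    "largemouth bass": (2, 1),
    "Asian seabass": (2, 1),
    "European seabass": (2, 0),
    "Japanese pufferfish": (2, 0),
    "tongue sole": (2, 0),
    "stickleback": (2, 0),
    "Japanese medaka": (2, 0),
    "southern platyfish": (2, 0),
    "Atlantic salmon": (4, 0),
    "Atlantic cod": (1, 0),
    "zebrafish": (2, 0),
    "channel catfish": (2, 1),
    "electric eel": (2, 0),
    "Atlantic herring": (2, 0),
    "Japanese eel": (1, 0),
    "Asian arowana": (2, 0),
    "spotted gar": (2, 1),
    "coelacanth": (1, 1),
}

# Species carrying dmrt6, and species with a single dmrt2 gene.
with_dmrt6 = [sp for sp, (_, d6) in fish_copies.items() if d6 > 0]
single_dmrt2 = [sp for sp, (d2, _) in fish_copies.items() if d2 == 1]

print(len(with_dmrt6), with_dmrt6)
print(len(single_dmrt2), single_dmrt2)
```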

Structural Characterization and Evolutionary Analysis of the dmrt Family Genes

The dmrt1 gene is composed of five exons in all examined species except for Atlantic salmon (Table 2 and Figure 1A), and a highly conserved DM domain (49 aa in total) is located in the DMRT1 protein. In comparison, dmrt2 contains three exons, and dmrt3 and dmrt4 contain two exons in most examined species. dmrt5 consists of 2 to 4 exons in higher vertebrates but only two in all examined fish species except for stickleback (Table 2). dmrt6 contains four exons in higher vertebrates, whereas the number of exons in fish varies greatly (from 2 to 4). dmrt7 and dmrt8 can only be identified in mammals, and both contain a large number of exons (8 for dmrt7, 6–7 for dmrt8). Except for DMRT8, all DMRT proteins (DMRT1–7) contain a conserved DM domain, which is often located in the first exon of each gene (Figure 1A).
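TABLE 2

Exon numbers in dmrt family genes of the examined vertebrates. Where a species has more than one copy of a gene, the exon numbers of the copies are separated by slashes.

| Class | Superorder | Species | Common name | dmrt1 | dmrt2 (dmrt2a/b) | dmrt3 | dmrt4 (dmrtA1) | dmrt5 (dmrtA2) | dmrt6 (dmrtB1) | dmrt7 (dmrtC2) | dmrt8 (dmrtC1) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mammalian | | H. sapiens | Human | 5 | 3 | 2 | 2 | 3 | 4 | 8 | 6 |
| Mammalian | | M. musculus | Mouse | 5 | 3 | 2 | 2 | 3 | 4 | 8 | 7 |
| Birds | | G. gallus | Chicken | 5 | 3 | 2 | - | 2 | 4 | - | - |
| Birds | | T. guttata | Zebra finch | 5 | 3 | 2 | - | 2 | 4 | - | - |
| Reptilia | | A. carolinensis | Anole lizard | 5 | 3 | 2 | 2 | 4 | 4 | - | - |
| Amphibia | | X. tropicalis | Clawed frog | 5 | 3 | 2 | 2 | 3 | 4 | - | - |
| Fish (Actinopterygii) | Percomorpha | M. salmoides | Largemouth bass | 5 | 3/3 | 2 | 2 | 2 | 3 | - | - |
| Fish (Actinopterygii) | Percomorpha | L. calcarifer | Asian seabass | 5 | 3/3 | 2 | 2 | 2 | 3 | - | - |
| Fish (Actinopterygii) | Percomorpha | D. labrax | European seabass | 5 | 3/3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Percomorpha | T. rubripes | Japanese pufferfish | 5 | 3/3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Percomorpha | C. semilaevis | Tongue sole | 5 | 4/3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Percomorpha | G. aculeatus | Stickleback | 5 | 4/3 | 5 | 3 | 3 | - | - | - |
| Fish (Actinopterygii) | Atherinomorpha | O. latipes | Japanese medaka | 5 | 4/3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Atherinomorpha | X. maculatus | Southern platyfish | 5 | 4/3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Protacanthopterygii | S. salar | Atlantic salmon | 3 | 4/4/3/3 | 2/1 | 2 | 2/2 | - | - | - |
| Fish (Actinopterygii) | Paracanthopterygii | G. morhua | Atlantic cod | 5 | 3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Ostariophysi | D. rerio | Zebrafish | 5 | 3/3 | 2 | - | 2 | - | - | - |
| Fish (Actinopterygii) | Ostariophysi | I. punctatus | Channel catfish | 5 | 3/3 | 2 | 2 | 2 | 4 | - | - |
| Fish (Actinopterygii) | Ostariophysi | E. electricus | Electric eel | 5 | 3/3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Clupeomorpha | C. harengus | Atlantic herring | 5 | 3/3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Elopomorpha | A. japonica | Japanese eel | 5 | 3 | 2/2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Osteoglossomorpha | S. formosus | Asian arowana | 5 | 4/3 | 2 | 2 | 2 | - | - | - |
| Fish (Actinopterygii) | Holostei | L. oculatus | Spotted gar | 5 | 3/3 | 2 | 2 | 2 | 2 | - | - |
| Fish (Sarcopterygii) | Coelacanthiformes | L. chalumnae | Coelacanth | 5 | 3 | 2 | 2 | 2 | 4 | - | - |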
FIGURE 1

Characterization and phylogenetic analysis of the dmrt family in vertebrates. (A) Various structures of dmrt genes. The genomic structures of dmrt1–dmrt5 are based on the data of Japanese pufferfish, whereas those of dmrt6–dmrt8 are derived from the relevant data of mouse. (B) A Bayesian phylogenetic tree of 147 dmrt sequences. The phylogenetic analysis was performed using MrBayes v3.2.6. Amino acid replacement model selection was calculated using ProtTest with the best-fit model of JTT+I+G+F. The tree is rooted with the N. vectensis dmrtA.

Using the DMRTA protein sequence of the sea anemone (Nematostella vectensis) as the out-group, we constructed a protein-based phylogenetic tree (Figure 1B), in which the DMRT family is distinctly categorized into eight groups (DMRT1–DMRT8). All DMRT proteins are distributed in the following three main groups. Group 1 includes five subfamilies, i.e., DMRT1, 2, 6, 7, and 8. DMRT2 was placed as the sister of DMRT7/8 and DMRT1 as the sister of DMRT6, suggesting a closer evolutionary relationship among these subfamilies. The subfamilies DMRT2, 7, and 8 were together placed as a sister group to the DMRT1 and 6 subfamilies. Group 2 includes DMRT4 and DMRT5. Group 3 contains only one subfamily, DMRT3 (see more details in Figure 1B).

The dmrt1-dmrt3-dmrt2(2a) Cluster in Fish Genomes

Syntenic relationships of the dmrt1-dmrt3-dmrt2(2a) gene cluster were analyzed in 17 fish species, using zebrafish (Cypriniformes, Ostariophysi) as the reference. The dmrt1-dmrt3-dmrt2(2a) gene cluster is relatively conserved in various fish species; consistent with this, fish within the same superorder clustered together (Figure 2A). A total of 11 genes (gas1a, dapk1, ctsla, fbp2, fbp1a, kank1a, smarca2, adamts3, npffr2a, gc, and slc4a4a) neighbor the zebrafish dmrt1-dmrt3-dmrt2(2a) gene cluster (Supplementary Table S3). All these genes were also found to neighbor the dmrt1-dmrt3-dmrt2(2a) gene cluster in channel catfish and Atlantic herring, whereas some genes were lost around this gene cluster in the remaining fish species (with frequent loss of gc).
FIGURE 2

Phylogenetic tree and synteny conservation of the fish dmrt1-dmrt3-dmrt2(2a) gene clusters. (A) The rectangular Bayesian phylogenetic tree, rooted with human and mouse sequences. Numbers besides the nodes are Bayesian posterior probabilities (colored). (B) The synteny of examined dmrt1-dmrt3-dmrt2(2a) gene clusters. Full names of relevant genes are provided in Supplementary Table S3.

Mammals have lost a larger number of the genes next to this cluster, as have two fish species (Atlantic salmon and coelacanth). Furthermore, in some fish species, such as largemouth bass, the dapk1 gene underwent duplication, generating four tandem duplicated copies. A similar phenomenon was observed in the southern platyfish, which harbors two copies of the ctsla gene in its genome. Moreover, in both largemouth bass and southern platyfish, the gas1a gene also experienced a translocation and inversion event (see more details in Figure 2B).

Substitution Rates (Ka/Ks) of the dmrt1–dmrt3–dmrt2(2a) Cluster in Fish Genomes

Ka/Ks is the ratio of non-synonymous substitutions (Ka) to synonymous substitutions (Ks). This ratio can be used to assess the selective pressure on a given protein-coding gene. Synonymous mutations are generally assumed not to be subject to natural selection, whereas non-synonymous mutations are. Ka/Ks > 1 implies positive selection; Ka/Ks = 1 suggests neutral evolution; and Ka/Ks < 1 indicates purifying selection. Comparing the Ka and Ka/Ks values obtained using four different methods (gMYN, gYN, MYN, and YN), we found conserved substitution rates in dmrt1, dmrt2(2a), dmrt3, and the dmrt1–dmrt3–dmrt2(2a) gene cluster. In detail, all mean Ka/Ks values were less than 1 (Figure 3), indicating purifying selection on these genes. However, dmrt1 showed a higher average Ka/Ks value (0.0900) than dmrt2 (0.0576), dmrt3 (0.0645), and the dmrt1–dmrt3–dmrt2(2a) gene cluster (0.0606), suggesting that dmrt1 might be less conserved during evolution and may thereby provide more variants for selection.
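
The interpretation rule for Ka/Ks can be written as a small helper and applied to the mean values reported above. This is a simplified sketch: the function name and the tolerance used to treat a ratio as equal to 1 are assumptions, and in practice deviations from 1 would be assessed statistically rather than by a fixed cutoff.

```python
import math


def selection_regime(ka_ks: float, rel_tol: float = 1e-9) -> str:
    """Classify a Ka/Ks ratio using the thresholds stated in the text."""
    if math.isclose(ka_ks, 1.0, rel_tol=rel_tol):
        return "neutral"
    return "positive" if ka_ks > 1.0 else "purifying"


# Mean Ka/Ks values as reported in the text.
mean_ka_ks = {
    "dmrt1": 0.0900,
    "dmrt2": 0.0576,
    "dmrt3": 0.0645,
    "cluster": 0.0606,
}

regimes = {gene: selection_regime(value) for gene, value in mean_ka_ks.items()}
least_constrained = max(mean_ka_ks, key=mean_ka_ks.get)
print(regimes)
print(least_constrained)
```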
FIGURE 3

Ka (A) and Ka/Ks (B) values for the target coding sequences. Paralog genes in the spotted gar were used as the references. **p < 0.01; ***p < 0.001; and ****p < 0.0001 (based on a Wilcoxon test). Four methods, including gMYN, gYN, MYN, and YN, were used.


Conserved Sequences and Regulatory Elements in the Fish dmrt1–dmrt3–dmrt2(2a) Gene Clusters

To visualize genomic conservation, the mVISTA tool was employed to generate a VISTA plot (Figure 4A). To evaluate regulatory elements and genomic sequence changes in the dmrt1–dmrt3–dmrt2(2a) gene cluster with 20-kb upstream/downstream regions across various sex-determination systems, we used Chinese tongue sole (ZZ/ZW) (Chen et al., 2014), Japanese medaka (XX/XY) (Kasahara et al., 2007), southern platyfish (males are XY or YY, females are WX, WY, or XX) (Volff and Schartl, 2001), European sea bass (polygenic sex determination system, PSD system) (Vandeputte et al., 2007; Mei and Gui, 2015), Asian sea bass (protandrous hermaphrodite) (Wang et al., 2018), and largemouth bass, which was previously reported as WZ/ZZ (Glennon et al., 2012), whereas our recent analysis combining a genomic map, ddRAD-Seq, and sex-reversal experiments suggests an XX/XY sex determination system (data unpublished).
FIGURE 4

Species variances of the regulatory regions in the dmrt1-dmrt3-dmrt2(2a) gene cluster. (A) A VISTA plot of the dmrt1-dmrt3-dmrt2(2a) gene clusters among six examined fish species. Peaks show similarity in pair-wise sequence alignments of zebrafish (D. rerio) with Chinese tongue sole (C. semilaevis), Japanese medaka (O. latipes), southern platyfish (X. maculatus), Asian sea bass (L. calcarifer), European sea bass (D. labrax), and largemouth bass (M. salmoides). Blue peaks represent coding exons and pink peaks denote non-coding sequences. The horizontal axis shows relative positions in the zebrafish genomic sequence, whereas the vertical axis indicates percentage of identity (50–100%). Three distinct regions, named Region 1, Region 2, and Region 3, were identified. (B) Protein sequence alignments of Regions 2 and 3 in 17 examined fish species. (C) Numbers of different regulatory elements in six representative fish species. Additional details regarding the sequences of each regulatory element are provided in Supplementary Table S4.

Overall, a similar conservation pattern in both coding and non-coding sequences was observed. Comparisons of these six fish species with zebrafish showed considerable homology within and between these dmrt genes. We also identified three regions with remarkable diversity among these fish (lower panels in Figure 4A). Region 1 covers 207 bp located 11 kb upstream of the dmrt1 gene and contains nine TATA boxes (63 bp); it exists only in Chinese tongue sole. Region 2 is located in the third exon of dmrt1, with 18 bp missing in Chinese tongue sole. Comparing the protein sequences of tongue sole and other fish species, we determined that six amino acids (-P/S-A/S/T/P-YY-S/G/N-N-) were missing (Figure 4B). Region 3, located in the second exon of dmrt3, shows a 21-nucleotide (nt) deletion in Japanese medaka and a 15-nt deletion in southern platyfish (Figure 4B).
In the six examined species, the TATA box is the main regulatory element. TATA boxes are much more frequent in Chinese tongue sole than in the other fish, whereas markedly fewer E boxes, GC boxes, and B recognition elements (BREs) are present in Japanese medaka than in the other species (Figure 4C).
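Counting regulatory elements of this kind can be sketched as a motif scan. The example below is a minimal illustration, not the pipeline used in the study: the consensus patterns in `MOTIFS` are commonly cited forms chosen here as assumptions, only the forward strand is scanned, and the input sequence is a toy string.

```python
import re

# Assumed consensus motifs written as regular expressions (not taken from the paper).
MOTIFS = {
    "TATA box": "TATA[AT]A[AT]",
    "E box": "CA[ACGT]{2}TG",
    "GC box": "GGGCGG",
    "BRE": "[GC][GC][AG]CGCC",
}

def count_motifs(seq):
    """Count motif occurrences on the forward strand, allowing overlapping matches."""
    seq = seq.upper()
    # A lookahead makes re.findall report matches that overlap each other.
    return {name: len(re.findall(f"(?=({pat}))", seq)) for name, pat in MOTIFS.items()}

toy_sequence = "TATAAAAGGGCGGCACGTGTATATAT"  # hypothetical sequence
counts = count_motifs(toy_sequence)
print(counts)
```

A real analysis would also scan the reverse complement and would use the element definitions listed in Supplementary Table S4 rather than these placeholder patterns.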

Synteny of Other dmrt Genes [Excluding dmrt1, dmrt2(2a), and dmrt3] in Fish Genomes

Based on the whole-genome sequences of largemouth bass and seven other representative vertebrate species (O. niloticus, T. rubripes, O. latipes, I. punctatus, L. oculatus, A. carolinensis, and H. sapiens) obtained from NCBI, we performed a synteny analysis of four dmrt genes: dmrt2b, dmrt4, dmrt5, and dmrt6. The results (Figure 5) indicated that among these fish species, the KN motif and ankyrin repeat domain-containing protein 4 (kank4) and low-density lipoprotein receptor-related protein 8 (lrp8) genes upstream of dmrt2b were conserved. The ELAV-like protein 2 (elavl2) and caspase activity and apoptosis inhibitor 1 (caap1) genes downstream of dmrt4, and the elavl4 and FAS-associated factor 1 (faf1) genes downstream of dmrt5, were also conserved, which is consistent with the findings in reptiles and humans. This suggests that dmrt2b, dmrt4, and dmrt5 were relatively conserved during the evolutionary process.
FIGURE 5

Synteny of dmrt genes in various fish species. The synteny analyses were performed in eight vertebrate species (M. salmoides, O. niloticus, T. rubripes, O. latipes, I. punctatus, L. oculatus, A. carolinensis, and H. sapiens). Chr., chromosome; LG, linkage group. The most conserved surrounding genes of dmrt genes were shown in green and bold fonts. The most conserved were shown in red fonts.

Although dmrt6 was lost in most fish species, including T. rubripes and O. latipes, lrp8 was present upstream of dmrt6 in L. oculatus and I. punctatus, which is consistent with higher vertebrates, whereas in M. salmoides and O. niloticus, dmrt6 was located between the conserved plectin (plec) and epiplakin-F-box/LRR-repeat protein 6 (eppk1fbxl6) genes (see more details in Figure 5).
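The idea behind this synteny comparison can be sketched as an intersection of flanking-gene sets. The example below is a simplified illustration under stated assumptions: the function names, the window size, and the toy gene orders (including the placeholder genes geneA, geneB, and geneC) are hypothetical, and only the gene names dmrt2b, kank4, and lrp8 come from the text.

```python
def flanking_genes(gene_order, target, window=2):
    """Return the set of genes within `window` positions of `target` on one chromosome."""
    if target not in gene_order:
        return set()
    i = gene_order.index(target)
    lo = max(0, i - window)
    neighbors = gene_order[lo:i] + gene_order[i + 1:i + 1 + window]
    return set(neighbors)

def conserved_neighbors(orders_by_species, target, window=2):
    """Intersect flanking-gene sets across the species that carry `target`."""
    sets = [
        flanking_genes(order, target, window)
        for order in orders_by_species.values()
        if target in order
    ]
    return set.intersection(*sets) if sets else set()

# Toy gene orders; the arrangements are hypothetical.
toy_orders = {
    "species_1": ["kank4", "lrp8", "dmrt2b", "geneA"],
    "species_2": ["geneB", "kank4", "lrp8", "dmrt2b"],
    "species_3": ["lrp8", "kank4", "dmrt2b", "geneC"],
}

shared = conserved_neighbors(toy_orders, "dmrt2b")
print(sorted(shared))
```

This sketch ignores gene orientation and distance in base pairs, both of which a full synteny analysis would normally take into account.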

Discussion

Fish are the oldest and most diverse group among vertebrates, containing about 32,000 species and accounting for more than half of the vertebrate species. Fish have undergone a long history of emergence, development, and evolution. The increasing amount of fish genomic information provides an important resource for studying the evolution, structure, and function of key genes through comparative genomics analysis. Seven dmrt genes have been identified in fish to date, including dmrt1–6 and dmrt2b. dmrt genes have also been reported in more than 30 fish species and a number of functional studies have been performed to reveal that regardless of the sex determination mechanism, the majority of fish dmrt genes (Table 3) are related to sexual development (Li et al., 2008, 2018; Liu et al., 2009; Herpin and Schartl, 2011; Yoshizawa et al., 2011; Xu et al., 2013). However, the phylogenetics of the dmrt gene family in fish have not yet been reported.
TABLE 3

The relevant references for dmrt family genes in fish.

Genes | Number of species | References
dmrt1 | 23 | Acipenser gueldenstaedtii (Fajkowska et al., 2016); A. japonica (Jeng et al., 2019); Acanthopagrus schlegelii (Wu and Chang, 2018); Anoplopoma fimbria (Smith et al., 2013); C. semilaevis (Cui et al., 2017); D. rerio (Lin et al., 2017; Webster et al., 2017); Epinephelus coioides (Lyu et al., 2019); G. morhua (Johnsen and Andersen, 2012); Gobiocypris rarus (Cao et al., 2012); Halichoeres poecilopterus (Miyake et al., 2012); L. chalumnae (Forconi et al., 2013); M. amblycephala (Su et al., 2015); M. salmoides (Yan et al., 2019); O. latipes and X. maculatus (Kondo et al., 2002); O. niloticus (Wei et al., 2019); Odontesthes bonariensis (Fernandino et al., 2008); O. mykiss (Marchand et al., 2000); Plecoglossus altivelis (Wang et al., 2014); Sebastes schlegeli (Ma et al., 2014); Solea senegalensis (Portela-Bens et al., 2017); T. rubripes (Yamaguchi et al., 2006); X. maculatus (Veith et al., 2006b)
dmrt2 (dmrt2, 2a) | 15 | C. semilaevis (Zhu et al., 2019); Carassius auratus (Jiang et al., 2012); Carassius auratus gibelio (Liu and Gui, 2011); D. rerio (Zhou et al., 2008; Lu et al., 2017); G. morhua (Johnsen and Andersen, 2012); E. coioides (Lyu et al., 2019); Labeo rohita (Sahoo et al., 2019); M. albus (Sheng et al., 2014); M. amblycephala (Su et al., 2015); O. latipes and X. maculatus (Kondo et al., 2002); S. senegalensis (Portela-Bens et al., 2017); Scophthalmus maximus (Robledo et al., 2015); T. rubripes (Xu et al., 2013); X. maculatus (Veith et al., 2006b)
dmrt2b | 5 | Carassius auratus (Jiang et al., 2012); Carassius auratus gibelio (Liu and Gui, 2011); D. rerio (Li et al., 2018); E. coioides (Lyu et al., 2019); G. morhua (Johnsen and Andersen, 2012); M. albus (Sheng et al., 2014); M. amblycephala (Su et al., 2015); T. rubripes (Yamaguchi et al., 2006)
dmrt3 | 10 | D. rerio (Li et al., 2008); E. coioides (Lyu et al., 2019); G. morhua (Johnsen and Andersen, 2012); L. chalumnae (Forconi et al., 2013); M. albus (Sheng et al., 2014); M. amblycephala (Su et al., 2015); O. latipes and X. maculatus (Kondo et al., 2002); S. senegalensis (Portela-Bens et al., 2017); T. rubripes (Yamaguchi et al., 2006)
dmrt4 (dmrta1) | 11 | C. semilaevis (Dong and Chen, 2013); G. morhua (Johnsen and Andersen, 2012); M. albus (Sheng et al., 2014); M. amblycephala (Su et al., 2013); O. aureus (Cao et al., 2007); O. latipes and X. maculatus (Kondo et al., 2002); Paralichthys olivaceus (Wen et al., 2009); S. senegalensis (Portela-Bens et al., 2017); T. rubripes (Yamaguchi et al., 2006); X. maculatus (Veith et al., 2006b)
dmrt5 (dmrta2) | 7 | D. rerio (Guo et al., 2004); G. morhua (Johnsen and Andersen, 2012); M. albus (Sheng et al., 2014); O. niloticus (Shirak et al., 2006); P. altivelis (Wang et al., 2014); Scatophagus argus (Gu et al., 2019); T. rubripes (Yamaguchi et al., 2006); X. maculatus (Veith et al., 2006b)
dmrt6 (dmrtb1) | 2 | L. chalumnae (Forconi et al., 2013); O. niloticus (Zhang X. et al., 2014)
To obtain a better understanding of the functional diversification of this gene family, we examined dmrt gene complements from the whole genome sequences of 17 representative fish species from 10 superorders, together with several non-fish outgroups. The evolutionary relationships of the dmrt genes in fish were subsequently examined using both phylogenetic and synteny analyses.

Similarities and Variances of the dmrt Gene Family Members in Various Fish Species

Zhou et al. (2008) showed that, unlike mammals and other groups that harbor only one dmrt2, zebrafish carries a second paralog of dmrt2(2a), dmrt2b, which was subsequently identified in many other fish species (Zhou et al., 2008; Liu et al., 2009; Su et al., 2015; Lyu et al., 2019). The 17 representative fish species analyzed in the present study belong to Actinopterygii, with the exception of the coelacanth L. chalumnae, which belongs to Sarcopterygii. Among the 16 actinopterygians, 14 harbored the two paralogs of dmrt2 (dmrt2a and dmrt2b); however, dmrt6, which is commonly found in mammals and other groups, was identified only in four actinopterygians (including M. salmoides) and in the sarcopterygian L. chalumnae. Among the 17 fish species, only A. japonica and G. morhua carried dmrt2a alone and lacked dmrt6. L. chalumnae had only one dmrt2 (2a) and one dmrt6, similar to higher vertebrates. A search through the database revealed that two other sarcopterygians (Protopterus annectens and Latimeria menadoensis) also carried only one dmrt2a and one dmrt6 (see more details in Supplementary Table S2; Forconi et al., 2013; Biscotti et al., 2018). Actinopterygii and Sarcopterygii are two relatively independent evolutionary branches of fish. Sarcopterygii is a side-branch in the evolution of fish, from which tetrapods evolved (Nelson et al., 2016). Therefore, the characteristics of the dmrt family genes in Sarcopterygii are more similar to those of higher vertebrates. Based on the cross-species comparisons of dmrt family genes and copy numbers, we found that some of the dmrt genes were duplicated in S. salar and A. japonica (S. salar: dmrt2, 3, 5; A. japonica: dmrt1–3; see Table 1). Lien et al. (2016) suggested that S. salar is a typical tetraploid teleost that experienced a salmonid-specific genome duplication. Copies of dmrt genes were duplicated in its genome, whereas one copy each of dmrt1 and dmrt4 was lost.
Loss of the duplicated gene possibly occurred owing to the salmonid-specific genome duplication event, which may lead to rearrangements of genome sequences, as S. salar has lost numerous syntenic genes in comparison with other teleosts. Similar dmrt duplication and loss were also found in four other fish species (e.g., brown trout Salmo trutta and sockeye salmon Oncorhynchus nerka) that belong to the same superorder as S. salar (i.e., Protacanthopterygii; Supplementary Table S5). In addition, the copy number of dmrt1 and dmrt3 is doubled in A. japonica, which is considered to reflect the uncommon ploidy (2n = 38) of this teleost (Nomura et al., 2004). The conservation of fish dmrt1 and dmrt3–5 sequences is relatively high; all contain the highly conserved DM domain and a stable number of exons (the majority of dmrt1 genes contain 5 exons and most dmrt3–5 genes have 2 exons). Phylogenetic analysis showed that dmrt4 and dmrt5 clustered into a major branch, indicating that these genes appear to have originated from a common ancestral dmrt. To date, the dmrt7 and dmrt8 genes have been found only in mammals and not in fish. In fact, they exist in all mammals, from the lower Monotremata in Prototheria (platypus) (Tsend-Ayush et al., 2009) to Marsupialia in Metatheria (wombat), and to the higher Euarchonta in Eutheria (mouse) (Veith et al., 2006a), thus indicating that both genes formed only after the evolutionary divergence of mammals from other vertebrates including fish (Veith et al., 2006a). Apart from dmrt5, which is highly expressed in brain tissue, the other members of the fish dmrt family are highly expressed in the gonads.
Specifically, dmrt1, 3, and 6 are highly expressed in the testis [dmrt1: (Guan et al., 2000; Marchand et al., 2000; He et al., 2003; Guo et al., 2005; Huang et al., 2005a; Veith et al., 2006b; Yamaguchi et al., 2006; Johnsen et al., 2010; Su et al., 2015); dmrt3: (Yamaguchi et al., 2006; Dong et al., 2010; Sheng et al., 2014; Su et al., 2015); dmrt6: (Forconi et al., 2013; Zhang X. et al., 2014)], whereas dmrt2a, 2b, and dmrt4 are expressed in both male and female gonads, with different fish species showing different expression profiles [dmrt2a/b: (Yamaguchi et al., 2006; Zhou et al., 2008; Liu and Gui, 2011; Sheng et al., 2014; Su et al., 2015); dmrt4: (Kondo et al., 2002; Veith et al., 2006b; Yamaguchi et al., 2006; Cao et al., 2007; Wen et al., 2009; Dong and Chen, 2013; Sheng et al., 2014; Jiang et al., 2019)]. Current studies have shown that fish dmrt family members may mainly be involved in embryonic sex differentiation, gonadal development, and gametogenesis (Herpin and Schartl, 2011; Xu et al., 2013; Zhang X. et al., 2014; Graf et al., 2015), in addition to other functions such as neural development (Li et al., 2008; Lourenco et al., 2010; Yoshizawa et al., 2011).

Similarities and Variances of the dmrt1–dmrt3–dmrt2(2a) Gene Cluster in Various Fish Genomes

In vertebrate genomes, the dmrt1, dmrt2(2a), and dmrt3 genes are arranged in tandem in the order dmrt1–dmrt3–dmrt2(2a) (Johnsen and Andersen, 2012). Our phylogenetic analysis based on this dmrt1–dmrt3–dmrt2(2a) cluster confirmed the clustering of fish within the same superorder, thus indicating that the dmrt1–dmrt3–dmrt2(2a) gene cluster is highly conserved in various fish species (Figure 2). Further analysis of the conserved genes flanking this cluster revealed that D. rerio carried 11 neighboring genes, as did I. punctatus and C. harengus. However, other fish species showed partial loss (such as the gc gene), duplication (dapk1 in M. salmoides and ctsla in X. maculatus), and transversion (gas1a in M. salmoides and X. maculatus). This may have been caused by genomic polyploidization events during the evolutionary process of fish (Braasch and Postlethwait, 2012). Despite the large variations in the flanking genes among different fish species, the number and location of the dmrt1–dmrt3–dmrt2(2a) genes have been stable. Thus, the high conservation of the dmrt1–dmrt3–dmrt2(2a) gene cluster in various fish genomes suggests crucial biological functions in fish. Among the fish genomes analyzed in this study, the Ka/Ks ratios of the dmrt1–dmrt3–dmrt2(2a) gene cluster and the three dmrt genes were less than 0.2, implying that after the examined actinopterygians diverged from L. oculatus, the cluster was subjected to relatively strong purifying selection, whereas any positive selection may have occurred prior to the divergence from L. oculatus. These low Ka/Ks ratios across various fish species indicate that the dmrt1–dmrt3–dmrt2(2a) genes have been highly conserved during evolution.
Occurrence of a non-synonymous substitution could alter the conformation and function of the corresponding protein, thereby affecting an individual's sex differentiation, which in turn would affect the inheritance of the mutation site by its offspring (Wang D. et al., 2009). Therefore, the high conservation of the dmrt1/2/3 genes across fish suggests their key role in sex differentiation. Analysis of the conserved sequences and regulatory elements was performed on the dmrt1–dmrt3–dmrt2(2a) gene cluster of three representative fish genomes with different sex determination systems [i.e., C. semilaevis (ZW) (Chen et al., 2012), O. latipes (XY) (Otake et al., 2006), X. maculatus (WXY) (Schultheis et al., 2009)]. Three distinct regions (Regions 1–3) were identified (Figure 4A). In C. semilaevis, the 207-bp Region 1, which contained nine TATA boxes, was present only in this species, and an 18-bp deletion was found in Region 2. O. latipes and X. maculatus showed 21- and 15-bp deletions, respectively, in Region 3. Analysis of the regulatory elements for this gene cluster indicated that the number of TATA boxes in C. semilaevis was higher than that in other fish species (twice that of O. latipes), whereas O. latipes had significantly more E box, GC box, and BRE elements than other fish species. Fish with various sex-determination systems showed significant differences in their conserved sequences and regulatory elements, suggesting that the dmrt1–dmrt3–dmrt2(2a) gene cluster may be related to the sex-determination systems in fish. Our recent study revealed that M. salmoides has an XX/XY sex-determination system (Sun et al., 2020). In the conserved sequence analysis of the fish dmrt1–dmrt3–dmrt2(2a) gene clusters, M. salmoides differed markedly from C. semilaevis (ZW/ZZ) in Regions 1–3. Region 2 of M. salmoides was more similar to that of O. latipes (XX/XY), and Region 3 of M. salmoides was similar to that of D. labrax (PSD) and L. calcarifer (hermaphrodite) (Wang et al., 2018). Therefore, these comparisons are consistent with an XX/XY sex-determination system in M. salmoides.
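The interpretation of Ka/Ks ratios used in this section can be sketched as a simple classification: a ratio well below 1 is usually read as purifying selection, a ratio near 1 as neutral evolution, and a ratio above 1 as positive selection. The example below is a minimal illustration only. The function name, the `tolerance` band around 1, and the input values are assumptions, and the code does not estimate Ka or Ks from alignments.

```python
def classify_selection(ka, ks, tolerance=0.05):
    """Return (Ka/Ks, label) using the conventional reading of the ratio.

    `tolerance` is an assumed band around 1 that is treated as neutral.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive")
    ratio = ka / ks
    if ratio < 1 - tolerance:
        label = "purifying"
    elif ratio > 1 + tolerance:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label

# Hypothetical substitution rates, chosen so the ratio falls below 0.2.
ratio, label = classify_selection(ka=0.03, ks=0.25)
print(f"Ka/Ks = {ratio:.2f} ({label} selection)")
```

In practice, Ka and Ks are estimated from codon alignments with dedicated software, and a single gene-wide ratio can mask individual sites that are under positive selection.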

Conserved Synteny of the dmrt Genes in Fish Genomes

The synteny analysis performed in this study showed that, apart from dmrt6, the other six dmrt genes in the fish dmrt gene family (dmrt1, dmrt2a, dmrt2b, and dmrt3–5) were relatively conserved. Fish dmrt1–dmrt3–dmrt2(2a) clusters are located in tandem in genomes, which is consistent with higher vertebrates. Fish dmrt4 is usually located on a different chromosome from the cluster, and the downstream elavl2 gene is conserved. In contrast, dmrt4 is located on the same chromosome as the dmrt1–dmrt3–dmrt2(2a) cluster in higher vertebrates, but the downstream elavl2 gene is also conserved. The genomic context of fish dmrt5 is the same as that in higher vertebrates, with the neighboring elavl4 and faf1 genes also conserved. The dmrt2b gene can be found in most fish species, and the upstream kank4 and lrp8 genes are conserved. The dmrt6 gene is lost in most fish genomes. However, in L. oculatus and I. punctatus, dmrt6 is conserved with downstream lrp8, which is consistent with higher vertebrates, whereas in M. salmoides and O. niloticus, the dmrt6 gene is located between the conserved plec and eppk1fbxl6 genes (see Figure 5). Kondo et al. (2002) first reported a conserved synteny analysis of the dmrt1–4 genes between fish and human. That study demonstrated that the dmrt1, 2, and 3 genes formed a cluster in fish and were part of a larger group of genes exhibiting conserved synteny between human and fish. Johnsen and Andersen (2012) performed chromosomal synteny analysis on dmrt2a and dmrt2b, and proposed that these genes originated from the second round (2R) of whole genome duplication of the ancestral dmrt2. In turn, Mawaribuchi et al. (2019) performed phylogenetic cluster analysis of lower bilaterian and higher animal dmrt genes, based on which they speculated that the dmrt3 gene emerged by genome duplication (1R), and that dmrt1 and dmrt6 emerged after the 2R genome duplication; they also proposed an evolutionary history for the dmrt family genes in Bilateria. Therefore, based on our data and the relevant literature, we hypothesized an evolutionary history of the dmrt genes in fish (Figure 6).
FIGURE 6

Hypothetical evolutionary history of the dmrt genes in fish through fish-specific (3R) genome duplications. This figure was constructed based on figures in this study. The common ancestor of chordate might possess four ancestral genes, including dmrt4/5, dmrt2a/2b, dmrt93B, and dmrt1/6. A common ancestor of vertebrata may have possessed four dmrt family genes, dmrt1/6, dmrt2a/2b, dmrt3, and dmrt4/5. The syntenies of kank1-dmrt1-dmrt3-dmrt2a, dmrt4-elavl2-caap1, kank4-lrp8-dmrt2b, and dmrt5- elavl4 are conserved after three rounds of whole genome duplication in the ancestral vertebrates. dmrt6 is lost in most fish species.

First, we propose that dmrt genes exist in various fish species through the fish-specific (3R) genome duplications (Johnsen and Andersen, 2012). Second, we propose that dmrt3 might have originated from dmrt93B and emerged by genome duplication (1R). According to our phylogenetic tree of the dmrt family in vertebrates (Figure 1B), together with the conserved genes flanking dmrt1 and dmrt6 (kank1/4) (Figure 5), we propose that dmrt1 and dmrt6 might have originated from the same ancestral dmrt (labeled with dmrt1/6? in Figure 6) and emerged after genome duplication (2R). Third, (1) we propose that the common ancestor of chordates might have had four ancestral dmrt genes (dmrt4/5, dmrt2a/2b, dmrt93B, and dmrt1/6) and two ancestral surrounding genes (kank1/4 and elavl2/4). (2) A common ancestor of Vertebrata might have possessed four dmrt family genes, including dmrt1/6, dmrt2a/2b, dmrt3, and dmrt4/5. The four dmrt family genes and their conserved surrounding genes are distributed in tandem on two pairs of chromosomes, specifically kank1/4–dmrt1/6–dmrt3–dmrt2a/2b–dmrt4/5–elavl2/4 and kank1/4–dmrt1/6–dmrt2a/2b–dmrt4/5–elavl2/4. (3) After three rounds (3R) of whole genome duplication, the syntenies of kank1–dmrt1–dmrt3–dmrt2a, dmrt4–elavl2–caap1, kank4–lrp8–dmrt2b, and dmrt5–elavl4 are conserved in fish.
(4) dmrt6 is lost in most fish species, although some retained it with downstream kank4–lrp8, or it was relocated to other chromosomes at a position between plec and eppk1.

We should note that our present study has several limitations. First, although fish are the most numerous vertebrates on earth, whole genome sequences are currently available for only a small fraction of fish. In this study, 17 representative fish species from 10 superorders were selected for analysis; however, the number and coverage of species were still insufficient and may have limited the generalizability of our results. Second, this study was based on fish species with known genome sequences, which may have affected the accuracy of data analysis because genome assembly techniques and quality vary significantly among species. For example, since coelacanth genomes are not assembled to the chromosomal level, our syntenic analysis of their genes is affected. Third, it is expected that as sequencing coverage and quality increase for fish genomes, future studies will be able to confirm and expand the findings and generalizability of the present study.

In this study, we applied bioinformatics methods to perform phylogenetic and synteny analyses on dmrt genes in 17 fish species. (1) All the examined fish species have dmrt1–5 and most fish species harbored two dmrt2 paralogs (dmrt2a and dmrt2b). Phyletic evolution and structure of the dmrt1–5 and dmrt2b genes were relatively conserved in most fish. The dmrt6 gene is lost in most fish genomes and is less conserved. (2) Purifying selection on dmrt1, dmrt2(2a), dmrt3, and the dmrt1–dmrt3–dmrt2(2a) gene cluster was observed. (3) Fish with various sex-determination systems have similar genomic conservation patterns of the dmrt1–dmrt3–dmrt2(2a) gene cluster. dmrt2b, dmrt4, and dmrt5 were also relatively conserved during the evolutionary process.
The high conservation of the dmrt1–dmrt3–dmrt2(2a) gene cluster in various fish genomes suggests crucial biological functions, while the variation in dmrt family members and sequences across fish species suggests different biological roles during evolution. Furthermore, we hypothesized the evolutionary history of the dmrt genes in fish after fish-specific genome duplication(s). Moreover, a series of new questions arose during the course of our data analysis. For example, in terms of evolutionary analysis, is dmrt2b homologous and functionally similar to a specific dmrt in higher animals, and does fish dmrt6 have functions similar to those of its mammalian counterpart? We anticipate that these gene trees will help to place current dmrt research in a proper phylogenomic context. Our present study provides a molecular basis for functional research on the fish dmrt family and may in particular serve as a genetic reference for in-depth phylogenomics studies.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Author Contributions

XY and QS conceived and designed the project and revised the manuscript. JD and JL performed the genomic investigations and wrote the manuscript. JH, CFS, YT, NY, CXS, XS, and SY participated in discussion and figure preparation. All authors read and approved the final manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The reviewers JM and ZS declared a shared affiliation, with no collaboration, with one of the authors CFS, to the handling editor at the time of review.
  107 in total

1.  Gene structure, multiple alternative splicing, and expression in gonads of zebrafish Dmrt1.

Authors:  Yiqing Guo; Hanhua Cheng; Xiao Huang; Shang Gao; Hongshi Yu; Rongjia Zhou
Journal:  Biochem Biophys Res Commun       Date:  2005-05-13       Impact factor: 3.575

2.  Sex dimorphic expression of five dmrt genes identified in the Atlantic cod genome. The fish-specific dmrt2b diverged from dmrt2a before the fish whole-genome duplication.

Authors:  Hanne Johnsen; Øivind Andersen
Journal:  Gene       Date:  2012-06-27       Impact factor: 3.688

3.  Using RepeatMasker to identify repetitive elements in genomic sequences.

Authors:  Nansheng Chen
Journal:  Curr Protoc Bioinformatics       Date:  2004-05

4.  Sexually dimorphic expression of dmrt1 in immature and mature Atlantic cod (Gadus morhua L.).

Authors:  Hanne Johnsen; Marit Seppola; Jacob S Torgersen; Madjid Delghandi; Øivind Andersen
Journal:  Comp Biochem Physiol B Biochem Mol Biol       Date:  2010-04-02       Impact factor: 2.231

5.  ProtTest 3: fast selection of best-fit models of protein evolution.

Authors:  Diego Darriba; Guillermo L Taboada; Ramón Doallo; David Posada
Journal:  Bioinformatics       Date:  2011-02-17       Impact factor: 6.937

6.  Zebrafish dmrta2 regulates the expression of cdkn2c in spermatogenesis in the adult testis.

Authors:  Shan Xu; Wei Xia; Yonathan Zohar; Jian-Fang Gui
Journal:  Biol Reprod       Date:  2013-01-17       Impact factor: 4.285

7.  VISTA: computational tools for comparative genomics.

Authors:  Kelly A Frazer; Lior Pachter; Alexander Poliakov; Edward M Rubin; Inna Dubchak
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

8.  Developmentally regulated and non-sex-specific expression of autosomal dmrt genes in embryos of the Medaka fish (Oryzias latipes).

Authors:  Christoph Winkler; Ute Hornung; Mariko Kondo; Cordula Neuner; Jutta Duschl; Akihiro Shima; Manfred Schartl
Journal:  Mech Dev       Date:  2004-07       Impact factor: 1.882

9.  Independent evolution for sex determination and differentiation in the DMRT family in animals.

Authors:  Shuuji Mawaribuchi; Yuzuru Ito; Michihiko Ito
Journal:  Biol Open       Date:  2019-08-09       Impact factor: 2.422

10.  The Drosophila doublesex proteins share a novel zinc finger related DNA binding domain.

Authors:  S E Erdman; K C Burtis
Journal:  EMBO J       Date:  1993-02       Impact factor: 11.598
